linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
@ 2022-06-17 11:18 Shuai Xue
  2022-06-17 11:18 ` [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
                   ` (22 more replies)
  0 siblings, 23 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-17 11:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song

This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
which custom-built by Alibaba Group's chip development business, T-Head.

Shuai Xue (3):
  docs: perf: Add description for Alibaba's T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
    SoC
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver

 .../admin-guide/perf/alibaba_pmu.rst          |  94 +++
 Documentation/admin-guide/perf/index.rst      |   1 +
 MAINTAINERS                                   |   8 +
 drivers/perf/Kconfig                          |   8 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c         | 771 ++++++++++++++++++
 6 files changed, 883 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-06-17 11:18 ` Shuai Xue
  2022-06-20 11:49   ` Jonathan Cameron
  2022-07-04  3:53   ` kernel test robot
  2022-06-17 11:18 ` [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (21 subsequent siblings)
  22 siblings, 2 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-17 11:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song

Alibaba's T-Head SoC implements uncore PMU for performance and functional
debugging to facilitate system maintenance. Document it to provide guidance
on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 .../admin-guide/perf/alibaba_pmu.rst          | 94 +++++++++++++++++++
 Documentation/admin-guide/perf/index.rst      |  1 +
 2 files changed, 95 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst

diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
new file mode 100644
index 000000000000..337f4f1d4c54
--- /dev/null
+++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
@@ -0,0 +1,94 @@
+=============================================================
+Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
+=============================================================
+
+The Yitian 710, custom-built by Alibaba Group's chip development business,
+T-Head, implements uncore PMU for performance and functional debugging to
+facilitate system maintenance.
+
+DDR Sub-System Driveway (DRW) PMU Driver
+=========================================
+
+The Yitian 710 SoC supports the most advanced DDR5/4 DRAM to provide
+tremendous memory bandwidth for cloud computing and HPC. The Driveway is a
+module which is a bridge between a router of mesh network and memory
+controller. It provides various functions like secure control, address map
+and so on.
+
+Yitian 710 employs eight DDR5/4 channels, four on each die. Each channel is
+independent of others to service system memory requests. And one DDR5
+channel is split into two independent sub-channels. The DDR Sub-System
+Driveway implements separate PMUs for each sub-channel to monitor various
+performance metrics.
+
+The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
+For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
+sub-channels of the same channel in die 0. And the PMU device of die 1 is
+prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
+
+Each sub-channel has 36 PMU counters in total, which is classified into
+four groups:
+
+- Group 0: PMU Cycle Counter. This group has one pair of counters
+  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
+  based on DDRC core clock.
+
+- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
+  to count the total access number of either the eight bank groups in a
+  selected rank, or four ranks separately in the first 4 counters. The base
+  transfer unit is 64B.
+
+- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
+  count the total retry number of each type of uncorrectable error.
+
+- Group 3: PMU Common Counters. This group has 16 counters, that are used
+  to count the common events.
+
+For now, the Driveway PMU driver only uses counters in group 0 and group 3.
+
+Example usage of counting memory data bandwidth::
+
+  perf stat \
+    -e ali_drw_21000/perf_hif_wr/ \
+    -e ali_drw_21000/perf_hif_rd/ \
+    -e ali_drw_21000/perf_hif_rmw/ \
+    -e ali_drw_21000/perf_cycle/ \
+    -e ali_drw_21080/perf_hif_wr/ \
+    -e ali_drw_21080/perf_hif_rd/ \
+    -e ali_drw_21080/perf_hif_rmw/ \
+    -e ali_drw_21080/perf_cycle/ \
+    -e ali_drw_23000/perf_hif_wr/ \
+    -e ali_drw_23000/perf_hif_rd/ \
+    -e ali_drw_23000/perf_hif_rmw/ \
+    -e ali_drw_23000/perf_cycle/ \
+    -e ali_drw_23080/perf_hif_wr/ \
+    -e ali_drw_23080/perf_hif_rd/ \
+    -e ali_drw_23080/perf_hif_rmw/ \
+    -e ali_drw_23080/perf_cycle/ \
+    -e ali_drw_25000/perf_hif_wr/ \
+    -e ali_drw_25000/perf_hif_rd/ \
+    -e ali_drw_25000/perf_hif_rmw/ \
+    -e ali_drw_25000/perf_cycle/ \
+    -e ali_drw_25080/perf_hif_wr/ \
+    -e ali_drw_25080/perf_hif_rd/ \
+    -e ali_drw_25080/perf_hif_rmw/ \
+    -e ali_drw_25080/perf_cycle/ \
+    -e ali_drw_27000/perf_hif_wr/ \
+    -e ali_drw_27000/perf_hif_rd/ \
+    -e ali_drw_27000/perf_hif_rmw/ \
+    -e ali_drw_27000/perf_cycle/ \
+    -e ali_drw_27080/perf_hif_wr/ \
+    -e ali_drw_27080/perf_hif_rd/ \
+    -e ali_drw_27080/perf_hif_rmw/ \
+    -e ali_drw_27080/perf_cycle/ -- sleep 10
+
+The average DRAM bandwidth can be calculated as follows:
+
+- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+
+Here, DDRC_WIDTH = 64 bytes.
+
+The current driver does not support sampling. So "perf record" is
+unsupported.  Also attach to a task is unsupported as the events are all
+uncore.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 69b23f087c05..bf466ae91c6c 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -17,3 +17,4 @@ Performance monitor support
    xgene-pmu
    arm_dsu_pmu
    thunderx2-pmu
+   thead_pmu
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
  2022-06-17 11:18 ` [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-06-17 11:18 ` Shuai Xue
  2022-06-17 19:59   ` kernel test robot
  2022-06-20 13:50   ` Jonathan Cameron
  2022-06-17 11:18 ` [PATCH v1 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
                   ` (20 subsequent siblings)
  22 siblings, 2 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-17 11:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song

Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports the most
advanced DDR5/4 DRAM to provide tremendous memory bandwidth for cloud
computing and HPC.

Each PMU is registered as a device in /sys/bus/event_source/devices, and
users can select event to monitor in each sub-channel, independently. For
example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
sub-channels of the same channel in die 0. And the PMU device of die 1 is
prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
---
 drivers/perf/Kconfig                  |   8 +
 drivers/perf/Makefile                 |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c | 771 ++++++++++++++++++++++++++
 3 files changed, 780 insertions(+)
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 1e2d69453771..dfafba4cb066 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU
 	  Provides support for the non-architectural CPU PMUs present on
 	  the Apple M1 SoCs and derivatives.
 
+config ALIBABA_UNCORE_DRW_PMU
+	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
+	depends on (ARM64 && ACPI)
+	default m
+	help
+	  Support for Driveway PMU events monitoring on Yitian 710 DDR
+	  Sub-system.
+
 source "drivers/perf/hisilicon/Kconfig"
 
 config MARVELL_CN10K_DDR_PMU
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 57a279c61df5..050d04ee19dd 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
+obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
new file mode 100644
index 000000000000..9228ff672276
--- /dev/null
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -0,0 +1,771 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Alibaba DDR Sub-System Driveway PMU driver
+ *
+ * Copyright (C) 2022 Alibaba Inc
+ */
+
+#define ALI_DRW_PMUNAME		"ali_drw"
+#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
+#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
+
+#include <linux/acpi.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
+#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
+
+#define ALI_DRW_PMU_PA_SHIFT			12
+#define ALI_DRW_PMU_CNT_INIT			0x00000000
+#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
+#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
+
+#define ALI_DRW_PMU_CNT_CTRL			0xC00
+#define ALI_DRW_PMU_CNT_RST			BIT(2)
+#define ALI_DRW_PMU_CNT_STOP			BIT(1)
+#define ALI_DRW_PMU_CNT_START			BIT(0)
+
+#define ALI_DRW_PMU_CNT_STATE			0xC04
+#define ALI_DRW_PMU_TEST_CTRL			0xC08
+#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
+
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
+#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
+
+/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_EVENT_SEL0			0xC68
+/* counter 0-3 use sel0, counter 4-7 use sel1...*/
+#define ALI_DRW_PMU_EVENT_SELn(n) \
+	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
+#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
+	(8 * (n % 4))
+/* counter 0: bit 7, counter 1: bit 15, counter 2: bit 23... */
+#define ALI_DRW_PMCOM_CNT_EN_OFFSET(n) \
+	(7 + 8 * (n % 4))
+
+/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
+#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
+	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
+
+#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
+#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
+#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
+#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
+#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
+#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
+#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
+#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
+
+static int ali_drw_cpuhp_state_num;
+
+static LIST_HEAD(ali_drw_pmu_irqs);
+static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);
+
+struct ali_drw_pmu_irq {
+	struct hlist_node node;
+	struct list_head irqs_node;
+	struct list_head pmus_node;
+	int irq_num;
+	int cpu;
+	refcount_t refcount;
+};
+
+struct ali_drw_pmu {
+	void __iomem *cfg_base;
+	struct device *dev;
+
+	struct list_head pmus_node;
+	struct ali_drw_pmu_irq *irq;
+	int irq_num;
+	int cpu;
+	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);
+	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+
+	struct pmu pmu;
+};
+
+#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu))
+
+#define DRW_CONFIG_EVENTID		GENMASK(7, 0)
+#define GET_DRW_EVENTID(event)	FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr.config)
+
+ssize_t ali_drw_pmu_format_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+ssize_t ali_drw_pmu_event_show(struct device *dev,
+				struct device_attribute *attr, char *page)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+#define ALI_DRW_PMU_ATTR(_name, _func, _config)                            \
+		(&((struct dev_ext_attribute[]) {                               \
+				{ __ATTR(_name, 0444, _func, NULL), (void *)_config }   \
+		})[0].attr.attr)
+
+#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config)            \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config)
+#define ALI_DRW_PMU_EVENT_ATTR(_name, _config)             \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config)
+
+static struct attribute *ali_drw_pmu_events_attrs[] = {
+	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd_or_wr,		0x0),
+	ALI_DRW_PMU_EVENT_ATTR(perf_hif_wr,			0x1),
+	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd,			0x2),
+	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rmw,			0x3),
+	ALI_DRW_PMU_EVENT_ATTR(perf_hif_hi_pri_rd,		0x4),
+	ALI_DRW_PMU_EVENT_ATTR(perf_dfi_wr_data_cycles,	0x7),
+	ALI_DRW_PMU_EVENT_ATTR(perf_dfi_rd_data_cycles,	0x8),
+	ALI_DRW_PMU_EVENT_ATTR(perf_hpr_xact_when_critical,	0x9),
+	ALI_DRW_PMU_EVENT_ATTR(perf_lpr_xact_when_critical,	0xA),
+	ALI_DRW_PMU_EVENT_ATTR(perf_wpr_xact_when_critical,	0xB),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_activate,		0xC),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_or_wr,		0xD),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_activate,		0xE),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd,			0xF),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_wr,			0x10),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_mwr,			0x11),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_precharge,		0x12),
+	ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_rdwr,	0x13),
+	ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_other,	0x14),
+	ALI_DRW_PMU_EVENT_ATTR(perf_rdwr_transitions,		0x15),
+	ALI_DRW_PMU_EVENT_ATTR(perf_write_combine,		0x16),
+	ALI_DRW_PMU_EVENT_ATTR(perf_war_hazard,		0x17),
+	ALI_DRW_PMU_EVENT_ATTR(perf_raw_hazard,		0x18),
+	ALI_DRW_PMU_EVENT_ATTR(perf_waw_hazard,		0x19),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk0,	0x1A),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk1,	0x1B),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk2,	0x1C),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk3,	0x1D),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk0,	0x1E),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk1,	0x1F),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk2,	0x20),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk3,	0x21),
+	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk0,		0x26),
+	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk1,		0x27),
+	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk2,		0x28),
+	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk3,		0x29),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_refresh,		0x2A),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_crit_ref,		0x2B),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_load_mode,		0x2D),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqcl,		0x2E),
+	ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_rd, 0x30),
+	ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_wr, 0x31),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mpc,		0x34),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mrr,		0x35),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_tcr_mrr,		0x36),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqstart,		0x37),
+	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqlatch,		0x38),
+	ALI_DRW_PMU_EVENT_ATTR(perf_chi_txreq,			0x39),
+	ALI_DRW_PMU_EVENT_ATTR(perf_chi_txdat,			0x3A),
+	ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxdat,			0x3B),
+	ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxrsp,			0x3C),
+	ALI_DRW_PMU_EVENT_ATTR(perf_tsz_vio,			0x3D),
+	ALI_DRW_PMU_EVENT_ATTR(perf_cycle,			0x80),
+	NULL,
+};
+
+static struct attribute_group ali_drw_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = ali_drw_pmu_events_attrs,
+};
+
+static struct attribute *ali_drw_pmu_format_attr[] = {
+	ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_format_group = {
+	.name = "format",
+	.attrs = ali_drw_pmu_format_attr,
+};
+
+static ssize_t ali_drw_pmu_cpumask_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu));
+}
+
+static struct device_attribute ali_drw_pmu_cpumask_attr =
+		__ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL);
+
+static struct attribute *ali_drw_pmu_cpumask_attrs[] = {
+	&ali_drw_pmu_cpumask_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_cpumask_attr_group = {
+	.attrs = ali_drw_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *ali_drw_pmu_attr_groups[] = {
+	&ali_drw_pmu_events_attr_group,
+	&ali_drw_pmu_cpumask_attr_group,
+	&ali_drw_pmu_format_group,
+	NULL,
+};
+
+/* find a counter for event, then in add func, hw.idx will equal to counter */
+static int ali_drw_get_counter_idx(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int idx;
+
+	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
+		if (!test_and_set_bit(idx, drw_pmu->used_mask))
+			return idx;
+	}
+
+	/* The counters are all in use. */
+	return -EBUSY;
+}
+
+static u64 ali_drw_pmu_read_counter(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	u64 cycle_high, cycle_low;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		cycle_high = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH);
+		cycle_high &= ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK;
+		cycle_low = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW);
+		cycle_low &= ALI_DRW_PMU_CYCLE_CNT_LOW_MASK;
+		return (cycle_high << 32 | cycle_low);
+	}
+
+	return readl(drw_pmu->cfg_base +
+		     ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx));
+}
+
+static void ali_drw_pmu_event_update(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 delta, prev, now;
+
+	do {
+		prev = local64_read(&hwc->prev_count);
+		now = ali_drw_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
+
+	/* handle overflow. */
+	delta = now - prev;
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID)
+		delta &= ALI_DRW_PMU_OV_INTR_MASK;
+	else
+		delta &= ALI_DRW_CNT_MAX_PERIOD;
+	local64_add(delta, &event->count);
+}
+
+static void ali_drw_pmu_event_set_period(struct perf_event *event)
+{
+	u64 pre_val;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	/* set a preload counter for test purpose */
+	writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+
+	/* set conunter initial value */
+	pre_val = ALI_DRW_PMU_CNT_INIT;
+	writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	local64_set(&event->hw.prev_count, pre_val);
+
+	/* set sel mode to zero to start test */
+	writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+}
+
+static void ali_drw_pmu_enable_counter(struct perf_event *event)
+{
+	u32 val, reg;
+	int counter = event->hw.idx;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	/* enable the common counter n */
+	val |= (1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter));
+	/* decides the event which will be counted by the common counter n */
+	val |= ((drw_pmu->evtids[counter] & 0x3F) << ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter));
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static void ali_drw_pmu_disable_counter(struct perf_event *event)
+{
+	u32 val, reg;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int counter = event->hw.idx;
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	val &= ~(1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter));
+	val &= ~(0x3F << ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter));
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
+{
+	struct ali_drw_pmu_irq *irq = data;
+	struct ali_drw_pmu *drw_pmu;
+	irqreturn_t ret = IRQ_NONE;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {
+		unsigned long status;
+		struct perf_event *event;
+		unsigned int idx;
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			ali_drw_pmu_disable_counter(event);
+		}
+
+		/* common counter intr status */
+		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
+		status = (status >> 8) & 0xFFFF;
+		if (status) {
+			for_each_set_bit(idx, &status,
+					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+				event = drw_pmu->events[idx];
+				if (WARN_ON_ONCE(!event))
+					continue;
+				ali_drw_pmu_event_update(event);
+				ali_drw_pmu_event_set_period(event);
+			}
+
+			/* clear common counter intr status */
+			writel((status & 0xFFFF) << 8,
+				drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+		}
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			if (!(event->hw.state & PERF_HES_STOPPED))
+				ali_drw_pmu_enable_counter(event);
+		}
+
+		ret = IRQ_HANDLED;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
+						      *pdev, int irq_num)
+{
+	int ret;
+	struct ali_drw_pmu_irq *irq;
+
+	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
+		if (irq->irq_num == irq_num
+		    && refcount_inc_not_zero(&irq->refcount))
+			return irq;
+	}
+
+	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
+	if (!irq)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&irq->pmus_node);
+
+	/* Pick one CPU to be the preferred one to use */
+	irq->cpu = smp_processor_id();
+	refcount_set(&irq->refcount, 1);
+
+	/*
+	 * FIXME: DDRSS Driveway PMU overflow interrupt shares the same irq
+	 * number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
+	 * successfully, add IRQF_SHARED flag.
+	 */
+	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
+			       IRQF_SHARED, dev_name(&pdev->dev), irq);
+	if (ret < 0) {
+		dev_err(&pdev->dev,
+			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
+		goto out_free;
+	}
+
+	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
+	if (ret)
+		goto out_free;
+
+	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, &irq->node);
+	if (ret)
+		goto out_free;
+
+	irq->irq_num = irq_num;
+	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
+
+	return irq;
+
+out_free:
+	kfree(irq);
+	return ERR_PTR(ret);
+}
+
+static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
+				struct platform_device *pdev)
+{
+	int irq_num;
+	struct ali_drw_pmu_irq *irq;
+
+	/* Read and init IRQ */
+	irq_num = platform_get_irq(pdev, 0);
+	if (irq_num < 0)
+		return irq_num;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	if (IS_ERR(irq))
+		return PTR_ERR(irq);
+
+	drw_pmu->irq = irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	return 0;
+}
+
+static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
+{
+	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_del_rcu(&drw_pmu->pmus_node);
+
+	if (!refcount_dec_and_test(&irq->refcount)) {
+		mutex_unlock(&ali_drw_pmu_irqs_lock);
+		return;
+	}
+
+	list_del(&irq->irqs_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
+	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
+					    &irq->node);
+	kfree(irq);
+}
+
+static int ali_drw_pmu_event_init(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event *sibling;
+	struct device *dev = drw_pmu->pmu.dev;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (is_sampling_event(event)) {
+		dev_err(dev, "Sampling not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->attach_state & PERF_ATTACH_TASK) {
+		dev_err(dev, "Per-task counter cannot allocate!\n");
+		return -EOPNOTSUPP;
+	}
+
+	event->cpu = drw_pmu->cpu;
+	if (event->cpu < 0) {
+		dev_err(dev, "Per-task mode not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->group_leader != event &&
+	    !is_software_event(event->group_leader)) {
+		dev_err(dev, "driveway only allow one event!\n");
+		return -EINVAL;
+	}
+
+	for_each_sibling_event(sibling, event->group_leader) {
+		if (sibling != event && !is_software_event(sibling)) {
+			dev_err(dev, "driveway event not allowed!\n");
+			return -EINVAL;
+		}
+	}
+
+	/* reset all the pmu counters */
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	hwc->idx = -1;
+
+	return 0;
+}
+
+static void ali_drw_pmu_start(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	event->hw.state = 0;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		writel(ALI_DRW_PMU_CNT_START,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+		return;
+	}
+
+	ali_drw_pmu_event_set_period(event);
+	if (flags & PERF_EF_RELOAD) {
+		unsigned long prev_raw_count =
+		    local64_read(&event->hw.prev_count);
+		writel(prev_raw_count,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	}
+
+	ali_drw_pmu_enable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+}
+
+static void ali_drw_pmu_stop(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	if (event->hw.state & PERF_HES_STOPPED)
+		return;
+
+	if (GET_DRW_EVENTID(event) != ALI_DRW_PMU_CYCLE_EVT_ID)
+		ali_drw_pmu_disable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	ali_drw_pmu_event_update(event);
+	event->hw.state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int ali_drw_pmu_add(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = -1;
+	int evtid;
+
+	evtid = GET_DRW_EVENTID(event);
+
+	if (evtid != ALI_DRW_PMU_CYCLE_EVT_ID) {
+		idx = ali_drw_get_counter_idx(event);
+		if (idx < 0)
+			return idx;
+		drw_pmu->events[idx] = event;
+		drw_pmu->evtids[idx] = evtid;
+	}
+	hwc->idx = idx;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		ali_drw_pmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate our changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void ali_drw_pmu_del(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	ali_drw_pmu_stop(event, PERF_EF_UPDATE);
+
+	if (idx >= 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+		drw_pmu->events[idx] = NULL;
+		drw_pmu->evtids[idx] = 0;
+		clear_bit(idx, drw_pmu->used_mask);
+	}
+
+	perf_event_update_userpage(event);
+}
+
+static void ali_drw_pmu_read(struct perf_event *event)
+{
+	ali_drw_pmu_event_update(event);
+}
+
+static int ali_drw_pmu_probe(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu;
+	struct resource *res;
+	char *name;
+	int ret;
+
+	drw_pmu = devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL);
+	if (!drw_pmu)
+		return -ENOMEM;
+
+	drw_pmu->dev = &pdev->dev;
+	platform_set_drvdata(pdev, drw_pmu);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	drw_pmu->cfg_base = devm_ioremap_resource(&pdev->dev, res);
+	if (!drw_pmu->cfg_base)
+		return -ENOMEM;
+
+	name = devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx",
+			      (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT));
+	if (!name)
+		return -ENOMEM;
+
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	/* enable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL);
+
+	/* clearing interrupt status */
+	writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+
+	drw_pmu->cpu = smp_processor_id();
+
+	ret = ali_drw_pmu_init_irq(drw_pmu, pdev);
+	if (ret)
+		return ret;
+
+	drw_pmu->pmu = (struct pmu) {
+		.module		= THIS_MODULE,
+		.task_ctx_nr	= perf_invalid_context,
+		.event_init	= ali_drw_pmu_event_init,
+		.add		= ali_drw_pmu_add,
+		.del		= ali_drw_pmu_del,
+		.start		= ali_drw_pmu_start,
+		.stop		= ali_drw_pmu_stop,
+		.read		= ali_drw_pmu_read,
+		.attr_groups	= ali_drw_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
+	};
+
+	ret = perf_pmu_register(&drw_pmu->pmu, name, -1);
+	if (ret) {
+		dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n");
+		ali_drw_pmu_uninit_irq(drw_pmu);
+	}
+
+	return ret;
+}
+
+static int ali_drw_pmu_remove(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu = platform_get_drvdata(pdev);
+
+	/* disable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL);
+
+	ali_drw_pmu_uninit_irq(drw_pmu);
+	perf_pmu_unregister(&drw_pmu->pmu);
+
+	return 0;
+}
+
+static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct ali_drw_pmu_irq *irq;
+	struct ali_drw_pmu *drw_pmu;
+	unsigned int target;
+
+	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
+	if (cpu != irq->cpu)
+		return 0;
+
+	target = cpumask_any_but(cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	/* We're only reading, but this isn't the place to be involving RCU */
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
+		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
+	irq->cpu = target;
+
+	return 0;
+}
+
+static const struct acpi_device_id ali_drw_acpi_match[] = {
+	{"ARMHD700", 0},
+	{}
+};
+
+MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
+
+static struct platform_driver ali_drw_pmu_driver = {
+	.driver = {
+		   .name = "ali_drw_pmu",
+		   .acpi_match_table = ACPI_PTR(ali_drw_acpi_match),
+		   },
+	.probe = ali_drw_pmu_probe,
+	.remove = ali_drw_pmu_remove,
+};
+
+static int __init ali_drw_pmu_init(void)
+{
+	int ret;
+
+	ali_drw_cpuhp_state_num = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+							  "ali_drw_pmu:online",
+							  NULL,
+							  ali_drw_pmu_offline_cpu);
+
+	if (ali_drw_cpuhp_state_num < 0) {
+		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
+		       ali_drw_cpuhp_state_num);
+		return ali_drw_cpuhp_state_num;
+	}
+
+	ret = platform_driver_register(&ali_drw_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+
+	return ret;
+}
+
+static void __exit ali_drw_pmu_exit(void)
+{
+	platform_driver_unregister(&ali_drw_pmu_driver);
+	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+}
+
+module_init(ali_drw_pmu_init);
+module_exit(ali_drw_pmu_exit);
+
+MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
+MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
+MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
+MODULE_DESCRIPTION("DDR Sub-System PMU driver");
+MODULE_LICENSE("GPL v2");
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
  2022-06-17 11:18 ` [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
  2022-06-17 11:18 ` [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-06-17 11:18 ` Shuai Xue
  2022-06-28  3:16 ` [PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-17 11:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song

Add maintainers for Alibaba PMU document and driver

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1fc9ead83d2a..68e02d471758 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -733,6 +733,14 @@ S:	Maintained
 F:	Documentation/i2c/busses/i2c-ali1563.rst
 F:	drivers/i2c/busses/i2c-ali1563.c
 
+ALIBABA PMU DRIVER
+M:	Shuai Xue <xueshuai@linux.alibaba.com>
+M:	Hongbo Yao <yaohongbo@linux.alibaba.com>
+M:	Neng Chen <nengchen@linux.alibaba.com>
+S:	Supported
+F:	Documentation/admin-guide/perf/alibaba_pmu.rst
+F:	drivers/perf/alibaba_uncore_dwr_pmu.c
+
 ALIENWARE WMI DRIVER
 L:	Dell.Client.Kernel@dell.com
 S:	Maintained
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 ` [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-06-17 19:59   ` kernel test robot
  2022-06-20 13:50   ` Jonathan Cameron
  1 sibling, 0 replies; 41+ messages in thread
From: kernel test robot @ 2022-06-17 19:59 UTC (permalink / raw)
  To: Shuai Xue, linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: kbuild-all, baolin.wang, yaohongbo, nengchen, zhuo.song

Hi Shuai,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on soc/for-next]
[also build test WARNING on linus/master v5.19-rc2 next-20220617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Shuai-Xue/drivers-perf-add-DDR-Sub-System-Driveway-PMU-driver-for-Yitian-710-SoC/20220617-192123
base:   https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git for-next
config: arm64-allyesconfig (https://download.01.org/0day-ci/archive/20220618/202206180343.eGENefte-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 11.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/641441c43a62293dd0bae407fda7c229e8689617
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Shuai-Xue/drivers-perf-add-DDR-Sub-System-Driveway-PMU-driver-for-Yitian-710-SoC/20220617-192123
        git checkout 641441c43a62293dd0bae407fda7c229e8689617
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/perf/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/perf/alibaba_uncore_drw_pmu.c:97:9: warning: no previous prototype for 'ali_drw_pmu_format_show' [-Wmissing-prototypes]
      97 | ssize_t ali_drw_pmu_format_show(struct device *dev,
         |         ^~~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/alibaba_uncore_drw_pmu.c:110:9: warning: no previous prototype for 'ali_drw_pmu_event_show' [-Wmissing-prototypes]
     110 | ssize_t ali_drw_pmu_event_show(struct device *dev,
         |         ^~~~~~~~~~~~~~~~~~~~~~


vim +/ali_drw_pmu_format_show +97 drivers/perf/alibaba_uncore_drw_pmu.c

    96	
  > 97	ssize_t ali_drw_pmu_format_show(struct device *dev,
    98					struct device_attribute *attr, char *buf)
    99	{
   100		struct dev_ext_attribute *eattr;
   101	
   102		eattr = container_of(attr, struct dev_ext_attribute, attr);
   103	
   104		return sprintf(buf, "%s\n", (char *)eattr->var);
   105	}
   106	
   107	/*
   108	 * PMU event attributes
   109	 */
 > 110	ssize_t ali_drw_pmu_event_show(struct device *dev,
   111					struct device_attribute *attr, char *page)
   112	{
   113		struct dev_ext_attribute *eattr;
   114	
   115		eattr = container_of(attr, struct dev_ext_attribute, attr);
   116	
   117		return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
   118	}
   119	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 ` [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-06-20 11:49   ` Jonathan Cameron
  2022-06-23  8:34     ` Shuai Xue
  2022-07-04  3:53   ` kernel test robot
  1 sibling, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-06-20 11:49 UTC (permalink / raw)
  To: Shuai Xue
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song

On Fri, 17 Jun 2022 19:18:23 +0800
Shuai Xue <xueshuai@linux.alibaba.com> wrote:

> Alibaba's T-Head SoC implements uncore PMU for performance and functional
> debugging to facilitate system maintenance. Document it to provide guidance
> on how to use it.
> 
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>

Hi Shuia Xue,

A few quick comments inline,

Thanks,

Jonathan

> ---
>  .../admin-guide/perf/alibaba_pmu.rst          | 94 +++++++++++++++++++
>  Documentation/admin-guide/perf/index.rst      |  1 +
>  2 files changed, 95 insertions(+)
>  create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
> 
> diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
> new file mode 100644
> index 000000000000..337f4f1d4c54
> --- /dev/null
> +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
> @@ -0,0 +1,94 @@
> +=============================================================
> +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
> +=============================================================
> +
> +The Yitian 710, custom-built by Alibaba Group's chip development business,
> +T-Head, implements uncore PMU for performance and functional debugging to
> +facilitate system maintenance.
> +
> +DDR Sub-System Driveway (DRW) PMU Driver
> +=========================================
> +
> +The Yitian 710 SoC supports the most advanced DDR5/4 DRAM to provide
> +tremendous memory bandwidth for cloud computing and HPC. The Driveway is a
> +module which is a bridge between a router of mesh network and memory
> +controller. It provides various functions like secure control, address map
> +and so on.

I'd focus on the PMU aspect. No point in describing other features in this
document.

> +
> +Yitian 710 employs eight DDR5/4 channels, four on each die. Each channel is
> +independent of others to service system memory requests. And one DDR5

Each DDR5 channel... 

> +channel is split into two independent sub-channels. The DDR Sub-System
> +Driveway implements separate PMUs for each sub-channel to monitor various
> +performance metrics.
> +
> +The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
> +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
> +sub-channels of the same channel in die 0. And the PMU device of die 1 is
> +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
> +
> +Each sub-channel has 36 PMU counters in total, which is classified into
> +four groups:
> +
> +- Group 0: PMU Cycle Counter. This group has one pair of counters
> +  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
> +  based on DDRC core clock.
> +
> +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
> +  to count the total access number of either the eight bank groups in a
> +  selected rank, or four ranks separately in the first 4 counters. The base
> +  transfer unit is 64B.
> +
> +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
> +  count the total retry number of each type of uncorrectable error.
> +
> +- Group 3: PMU Common Counters. This group has 16 counters, that are used
> +  to count the common events.
> +
> +For now, the Driveway PMU driver only uses counters in group 0 and group 3.
> +
> +Example usage of counting memory data bandwidth::
> +
> +  perf stat \
> +    -e ali_drw_21000/perf_hif_wr/ \

What does hif stand for? Also, perf seems redundant for
a perf event.


> +    -e ali_drw_21000/perf_hif_rd/ \
> +    -e ali_drw_21000/perf_hif_rmw/ \
> +    -e ali_drw_21000/perf_cycle/ \
> +    -e ali_drw_21080/perf_hif_wr/ \
> +    -e ali_drw_21080/perf_hif_rd/ \
> +    -e ali_drw_21080/perf_hif_rmw/ \
> +    -e ali_drw_21080/perf_cycle/ \
> +    -e ali_drw_23000/perf_hif_wr/ \
> +    -e ali_drw_23000/perf_hif_rd/ \
> +    -e ali_drw_23000/perf_hif_rmw/ \
> +    -e ali_drw_23000/perf_cycle/ \
> +    -e ali_drw_23080/perf_hif_wr/ \
> +    -e ali_drw_23080/perf_hif_rd/ \
> +    -e ali_drw_23080/perf_hif_rmw/ \
> +    -e ali_drw_23080/perf_cycle/ \
> +    -e ali_drw_25000/perf_hif_wr/ \
> +    -e ali_drw_25000/perf_hif_rd/ \
> +    -e ali_drw_25000/perf_hif_rmw/ \
> +    -e ali_drw_25000/perf_cycle/ \
> +    -e ali_drw_25080/perf_hif_wr/ \
> +    -e ali_drw_25080/perf_hif_rd/ \
> +    -e ali_drw_25080/perf_hif_rmw/ \
> +    -e ali_drw_25080/perf_cycle/ \
> +    -e ali_drw_27000/perf_hif_wr/ \
> +    -e ali_drw_27000/perf_hif_rd/ \
> +    -e ali_drw_27000/perf_hif_rmw/ \
> +    -e ali_drw_27000/perf_cycle/ \
> +    -e ali_drw_27080/perf_hif_wr/ \
> +    -e ali_drw_27080/perf_hif_rd/ \
> +    -e ali_drw_27080/perf_hif_rmw/ \
> +    -e ali_drw_27080/perf_cycle/ -- sleep 10
> +
> +The average DRAM bandwidth can be calculated as follows:
> +
> +- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
> +- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
> +
> +Here, DDRC_WIDTH = 64 bytes.
> +
> +The current driver does not support sampling. So "perf record" is
> +unsupported.  Also attach to a task is unsupported as the events are all
> +uncore.
> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
> index 69b23f087c05..bf466ae91c6c 100644
> --- a/Documentation/admin-guide/perf/index.rst
> +++ b/Documentation/admin-guide/perf/index.rst
> @@ -17,3 +17,4 @@ Performance monitor support
>     xgene-pmu
>     arm_dsu_pmu
>     thunderx2-pmu
> +   thead_pmu


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 ` [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
  2022-06-17 19:59   ` kernel test robot
@ 2022-06-20 13:50   ` Jonathan Cameron
  2022-06-24  7:04     ` Shuai Xue
  1 sibling, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-06-20 13:50 UTC (permalink / raw)
  To: Shuai Xue
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song

On Fri, 17 Jun 2022 19:18:24 +0800
Shuai Xue <xueshuai@linux.alibaba.com> wrote:

> Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
> support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports the most
> advanced DDR5/4 DRAM to provide tremendous memory bandwidth for cloud
> computing and HPC.

A little marketing heavy for a patch description but I suppose that's harmless!
Maybe the shorter option of something like

"Yitian supports DDR5/4 DRAM and targets cloud computing and HPC."
> 
> Each PMU is registered as a device in /sys/bus/event_source/devices, and
> users can select event to monitor in each sub-channel, independently. For
> example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
> sub-channels of the same channel in die 0. And the PMU device of die 1 is
> prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
> 
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
> Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>

Unusual to have such a deep sign off list for a patch positing.  Is
Co-developed appropriate here?

A few comments inline, but I only took a fairly superficial look.

Thanks,

Jonathan


> ---
>  drivers/perf/Kconfig                  |   8 +
>  drivers/perf/Makefile                 |   1 +
>  drivers/perf/alibaba_uncore_drw_pmu.c | 771 ++++++++++++++++++++++++++
>  3 files changed, 780 insertions(+)
>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
> 

> diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
> new file mode 100644
> index 000000000000..9228ff672276
> --- /dev/null
> +++ b/drivers/perf/alibaba_uncore_drw_pmu.c
> @@ -0,0 +1,771 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Alibaba DDR Sub-System Driveway PMU driver
> + *
> + * Copyright (C) 2022 Alibaba Inc
> + */
> +
> +#define ALI_DRW_PMUNAME		"ali_drw"
> +#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
> +#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
> +
> +#include <linux/acpi.h>
> +#include <linux/perf_event.h>
> +#include <linux/platform_device.h>
> +

...

> +static struct attribute *ali_drw_pmu_events_attrs[] = {
> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd_or_wr,		0x0),

As mentioned in review of patch 1, why perf_ prefix?
Also, what's hif, dfi, hpr, lpr wpr etc?

> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_wr,			0x1),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd,			0x2),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rmw,			0x3),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_hi_pri_rd,		0x4),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_dfi_wr_data_cycles,	0x7),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_dfi_rd_data_cycles,	0x8),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_hpr_xact_when_critical,	0x9),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_lpr_xact_when_critical,	0xA),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_wpr_xact_when_critical,	0xB),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_activate,		0xC),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_or_wr,		0xD),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_activate,		0xE),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd,			0xF),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_wr,			0x10),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_mwr,			0x11),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_precharge,		0x12),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_rdwr,	0x13),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_other,	0x14),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_rdwr_transitions,		0x15),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_write_combine,		0x16),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_war_hazard,		0x17),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_raw_hazard,		0x18),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_waw_hazard,		0x19),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk0,	0x1A),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk1,	0x1B),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk2,	0x1C),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk3,	0x1D),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk0,	0x1E),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk1,	0x1F),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk2,	0x20),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk3,	0x21),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk0,		0x26),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk1,		0x27),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk2,		0x28),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk3,		0x29),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_refresh,		0x2A),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_crit_ref,		0x2B),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_load_mode,		0x2D),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqcl,		0x2E),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_rd, 0x30),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_wr, 0x31),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mpc,		0x34),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mrr,		0x35),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_tcr_mrr,		0x36),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqstart,		0x37),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqlatch,		0x38),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_txreq,			0x39),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_txdat,			0x3A),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxdat,			0x3B),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxrsp,			0x3C),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_tsz_vio,			0x3D),
> +	ALI_DRW_PMU_EVENT_ATTR(perf_cycle,			0x80),
> +	NULL,
> +};



> +static void ali_drw_pmu_enable_counter(struct perf_event *event)
> +{
> +	u32 val, reg;
> +	int counter = event->hw.idx;
> +	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
> +
> +	reg = ALI_DRW_PMU_EVENT_SELn(counter);
> +	val = readl(drw_pmu->cfg_base + reg);
> +	/* enable the common counter n */
> +	val |= (1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter));
> +	/* decides the event which will be counted by the common counter n */
> +	val |= ((drw_pmu->evtids[counter] & 0x3F) << ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter));

You could use FIELD_PREP() to fill the 8 bit sub register then shift it based
on counter.  Would probably be more readable. Something like

	val = readl(drw_pmu->cfg_base + reg);
	subval = FIELD_PREP(ALI_DRAW_PMCOM_CNT_EN_MSK, 1) |
        	 FIELD_PREP(ALI_DRAW_PMCOM_CNT_EVENT_MSK, drw_pmu->evtids);

	shift = 8 * (counter % 4);
	val &= ~(GENMASK(7, 0) << shift);
	val |= subval << shift;


> +	writel(val, drw_pmu->cfg_base + reg);
> +}
> +
> +static void ali_drw_pmu_disable_counter(struct perf_event *event)
> +{
> +	u32 val, reg;
> +	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
> +	int counter = event->hw.idx;
> +
> +	reg = ALI_DRW_PMU_EVENT_SELn(counter);
> +	val = readl(drw_pmu->cfg_base + reg);
> +	val &= ~(1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter));
> +	val &= ~(0x3F << ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter));

Use GENMASK() for that mask and maybe #define it got give it a name.

> +
> +	writel(val, drw_pmu->cfg_base + reg);
> +}
> +
> +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
> +{
> +	struct ali_drw_pmu_irq *irq = data;
> +	struct ali_drw_pmu *drw_pmu;
> +	irqreturn_t ret = IRQ_NONE;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {

As mentioned below, I'm not clear this is actually worthwhile vs
just relying on IRQF_SHARED and registering each PMU separately.

> +		unsigned long status;
> +		struct perf_event *event;
> +		unsigned int idx;
> +
> +		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
> +			event = drw_pmu->events[idx];
> +			if (!event)
> +				continue;
> +			ali_drw_pmu_disable_counter(event);
> +		}
> +
> +		/* common counter intr status */
> +		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
> +		status = (status >> 8) & 0xFFFF;

Nice to use a MASK define and FIELD_GET()

> +		if (status) {
> +			for_each_set_bit(idx, &status,
> +					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
> +				event = drw_pmu->events[idx];
> +				if (WARN_ON_ONCE(!event))
> +					continue;
> +				ali_drw_pmu_event_update(event);
> +				ali_drw_pmu_event_set_period(event);
> +			}
> +
> +			/* clear common counter intr status */
> +			writel((status & 0xFFFF) << 8,
> +				drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
> +		}
> +
> +		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
> +			event = drw_pmu->events[idx];
> +			if (!event)
> +				continue;
> +			if (!(event->hw.state & PERF_HES_STOPPED))
> +				ali_drw_pmu_enable_counter(event);
> +		}
> +
> +		ret = IRQ_HANDLED;

If no bits were set in intr status, then should not set this.

> +	}
> +	rcu_read_unlock();
> +	return ret;
> +}
> +

This IRQ handling is nasty.  It would be good for the patch description to
explain what the constraints are.  What hardware shares an IRQ etc.
Seems like the MPAM driver is going to be fighting with this to migrate the
interrupt if needed.

I'm also not clear how bad it would be to just have each PMU register
shared IRQ and HP handler. That would simplify this code a fair bit
and the excess HP handler calls shouldn't matter (cpu_num wouldn't
match once first one has migrated it).

> +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
> +						      *pdev, int irq_num)
> +{
> +	int ret;
> +	struct ali_drw_pmu_irq *irq;
> +
> +	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
> +		if (irq->irq_num == irq_num
> +		    && refcount_inc_not_zero(&irq->refcount))
> +			return irq;
> +	}
> +
> +	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
> +	if (!irq)
> +		return ERR_PTR(-ENOMEM);
> +
> +	INIT_LIST_HEAD(&irq->pmus_node);
> +
> +	/* Pick one CPU to be the preferred one to use */
> +	irq->cpu = smp_processor_id();
> +	refcount_set(&irq->refcount, 1);
> +
> +	/*
> +	 * FIXME: DDRSS Driveway PMU overflow interrupt shares the same irq
> +	 * number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
> +	 * successfully, add IRQF_SHARED flag.

You have it already for this driver, so what does this FIXME refer to?

> +	 */
> +	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
> +			       IRQF_SHARED, dev_name(&pdev->dev), irq);
> +	if (ret < 0) {
> +		dev_err(&pdev->dev,
> +			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
> +		goto out_free;
> +	}
> +
> +	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
> +	if (ret)
> +		goto out_free;
> +
> +	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, &irq->node);
> +	if (ret)
> +		goto out_free;
> +
> +	irq->irq_num = irq_num;
> +	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
> +
> +	return irq;
> +
> +out_free:
> +	kfree(irq);
> +	return ERR_PTR(ret);
> +}
> +
> +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
> +				struct platform_device *pdev)
> +{
> +	int irq_num;
> +	struct ali_drw_pmu_irq *irq;
> +
> +	/* Read and init IRQ */
> +	irq_num = platform_get_irq(pdev, 0);
> +	if (irq_num < 0)
> +		return irq_num;
> +
> +	mutex_lock(&ali_drw_pmu_irqs_lock);
> +	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
> +
> +	if (IS_ERR(irq))
> +		return PTR_ERR(irq);
> +
> +	drw_pmu->irq = irq;
> +
> +	mutex_lock(&ali_drw_pmu_irqs_lock);
> +	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
> +
> +	return 0;
> +}
> +
> +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
> +{
> +	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
> +
> +	mutex_lock(&ali_drw_pmu_irqs_lock);
> +	list_del_rcu(&drw_pmu->pmus_node);
> +
> +	if (!refcount_dec_and_test(&irq->refcount)) {
> +		mutex_unlock(&ali_drw_pmu_irqs_lock);
> +		return;
> +	}
> +
> +	list_del(&irq->irqs_node);
> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
> +
> +	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
> +	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
> +					    &irq->node);
> +	kfree(irq);
> +}
> +

> +
> +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct ali_drw_pmu_irq *irq;
> +	struct ali_drw_pmu *drw_pmu;
> +	unsigned int target;
> +
> +	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
> +	if (cpu != irq->cpu)
> +		return 0;
> +
> +	target = cpumask_any_but(cpu_online_mask, cpu);

Preference for something on the same die if available?  I'm guessing that'll
reduce overheads for access.  Obviously you'll then want an online callback
to move back to the die if 1st cpu is onlined on it.

> +	if (target >= nr_cpu_ids)
> +		return 0;
> +
> +	/* We're only reading, but this isn't the place to be involving RCU */
> +	mutex_lock(&ali_drw_pmu_irqs_lock);
> +	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
> +		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
> +
> +	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
> +	irq->cpu = target;
> +
> +	return 0;
> +}
> +
> +static const struct acpi_device_id ali_drw_acpi_match[] = {
> +	{"ARMHD700", 0},

THEAD IP but with an ARM HID?  That needs at very least to have
a comment.  If this is a generic IP, then it should be documented
as such and the driver written with that in mind.

> +	{}
> +};
> +
> +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
> +
> +static struct platform_driver ali_drw_pmu_driver = {
> +	.driver = {
> +		   .name = "ali_drw_pmu",
> +		   .acpi_match_table = ACPI_PTR(ali_drw_acpi_match),
> +		   },
> +	.probe = ali_drw_pmu_probe,
> +	.remove = ali_drw_pmu_remove,
> +};
> +
> +static int __init ali_drw_pmu_init(void)
> +{
> +	int ret;
> +
> +	ali_drw_cpuhp_state_num = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> +							  "ali_drw_pmu:online",
> +							  NULL,
> +							  ali_drw_pmu_offline_cpu);
> +
> +	if (ali_drw_cpuhp_state_num < 0) {
> +		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
> +		       ali_drw_cpuhp_state_num);
> +		return ali_drw_cpuhp_state_num;
> +	}

Slightly more readable as

	ret = cpuhp_setup_state...
	if (ret < 0) {
		pr_err(..);
		return ret;
	}

	ali_drw_cpuhp_state_num = ret;
...

		
> +
> +	ret = platform_driver_register(&ali_drw_pmu_driver);
> +	if (ret)
> +		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
> +
> +	return ret;
> +}
> +
> +static void __exit ali_drw_pmu_exit(void)
> +{
> +	platform_driver_unregister(&ali_drw_pmu_driver);
> +	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
> +}
> +
> +module_init(ali_drw_pmu_init);
> +module_exit(ali_drw_pmu_exit);
> +
> +MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
> +MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
> +MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
> +MODULE_DESCRIPTION("DDR Sub-System PMU driver");
> +MODULE_LICENSE("GPL v2");


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-20 11:49   ` Jonathan Cameron
@ 2022-06-23  8:34     ` Shuai Xue
  0 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-23  8:34 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song


在 2022/6/20 PM7:49, Jonathan Cameron 写道:
> On Fri, 17 Jun 2022 19:18:23 +0800
> Shuai Xue <xueshuai@linux.alibaba.com> wrote:
> 
>> Alibaba's T-Head SoC implements uncore PMU for performance and functional
>> debugging to facilitate system maintenance. Document it to provide guidance
>> on how to use it.
>>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> 
> Hi Shuia Xue,
> 
> A few quick comments inline,
> 
> Thanks,
> 
> Jonathan

Hi, Jonathan,

Thank you very much for your quick comments.

> 
>> ---
>>  .../admin-guide/perf/alibaba_pmu.rst          | 94 +++++++++++++++++++
>>  Documentation/admin-guide/perf/index.rst      |  1 +
>>  2 files changed, 95 insertions(+)
>>  create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
>>
>> diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
>> new file mode 100644
>> index 000000000000..337f4f1d4c54
>> --- /dev/null
>> +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
>> @@ -0,0 +1,94 @@
>> +=============================================================
>> +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
>> +=============================================================
>> +
>> +The Yitian 710, custom-built by Alibaba Group's chip development business,
>> +T-Head, implements uncore PMU for performance and functional debugging to
>> +facilitate system maintenance.
>> +
>> +DDR Sub-System Driveway (DRW) PMU Driver
>> +=========================================
>> +
>> +The Yitian 710 SoC supports the most advanced DDR5/4 DRAM to provide
>> +tremendous memory bandwidth for cloud computing and HPC. The Driveway is a
>> +module which is a bridge between a router of mesh network and memory
>> +controller. It provides various functions like secure control, address map
>> +and so on.
> 
> I'd focus on the PMU aspect. No point in describing other features in this
> document.

Got it. I will delete this in next version.

>> +
>> +Yitian 710 employs eight DDR5/4 channels, four on each die. Each channel is
>> +independent of others to service system memory requests. And one DDR5
> 
> Each DDR5 channel... 

Good catch, I will declare as DDR5 channel.

> 
>> +channel is split into two independent sub-channels. The DDR Sub-System
>> +Driveway implements separate PMUs for each sub-channel to monitor various
>> +performance metrics.
>> +
>> +The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
>> +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
>> +sub-channels of the same channel in die 0. And the PMU device of die 1 is
>> +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
>> +
>> +Each sub-channel has 36 PMU counters in total, which is classified into
>> +four groups:
>> +
>> +- Group 0: PMU Cycle Counter. This group has one pair of counters
>> +  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
>> +  based on DDRC core clock.
>> +
>> +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
>> +  to count the total access number of either the eight bank groups in a
>> +  selected rank, or four ranks separately in the first 4 counters. The base
>> +  transfer unit is 64B.
>> +
>> +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
>> +  count the total retry number of each type of uncorrectable error.
>> +
>> +- Group 3: PMU Common Counters. This group has 16 counters, that are used
>> +  to count the common events.
>> +
>> +For now, the Driveway PMU driver only uses counters in group 0 and group 3.
>> +
>> +Example usage of counting memory data bandwidth::
>> +
>> +  perf stat \
>> +    -e ali_drw_21000/perf_hif_wr/ \
> 
> What does hif stand for? Also, perf seems redundant for
> a perf event.

HIF stands for Host Interface which is a Synopsys’ custom-defined interface.
On the host side, there is a choice of an AMBA 4 AXI interface or a
custom Host Interface (HIF). The AMBA interface supports single or multi-port
configurations, while the HIF supports a single port only. Read, write, and
RMW requests as well as write data are received through the HIF. I will doc
this in next version.

The `perf` prefix means Performance Logging Signals in DDRC, I will remove it
in next version.

>> +    -e ali_drw_21000/perf_hif_rd/ \
>> +    -e ali_drw_21000/perf_hif_rmw/ \
>> +    -e ali_drw_21000/perf_cycle/ \
>> +    -e ali_drw_21080/perf_hif_wr/ \
>> +    -e ali_drw_21080/perf_hif_rd/ \
>> +    -e ali_drw_21080/perf_hif_rmw/ \
>> +    -e ali_drw_21080/perf_cycle/ \
>> +    -e ali_drw_23000/perf_hif_wr/ \
>> +    -e ali_drw_23000/perf_hif_rd/ \
>> +    -e ali_drw_23000/perf_hif_rmw/ \
>> +    -e ali_drw_23000/perf_cycle/ \
>> +    -e ali_drw_23080/perf_hif_wr/ \
>> +    -e ali_drw_23080/perf_hif_rd/ \
>> +    -e ali_drw_23080/perf_hif_rmw/ \
>> +    -e ali_drw_23080/perf_cycle/ \
>> +    -e ali_drw_25000/perf_hif_wr/ \
>> +    -e ali_drw_25000/perf_hif_rd/ \
>> +    -e ali_drw_25000/perf_hif_rmw/ \
>> +    -e ali_drw_25000/perf_cycle/ \
>> +    -e ali_drw_25080/perf_hif_wr/ \
>> +    -e ali_drw_25080/perf_hif_rd/ \
>> +    -e ali_drw_25080/perf_hif_rmw/ \
>> +    -e ali_drw_25080/perf_cycle/ \
>> +    -e ali_drw_27000/perf_hif_wr/ \
>> +    -e ali_drw_27000/perf_hif_rd/ \
>> +    -e ali_drw_27000/perf_hif_rmw/ \
>> +    -e ali_drw_27000/perf_cycle/ \
>> +    -e ali_drw_27080/perf_hif_wr/ \
>> +    -e ali_drw_27080/perf_hif_rd/ \
>> +    -e ali_drw_27080/perf_hif_rmw/ \
>> +    -e ali_drw_27080/perf_cycle/ -- sleep 10
>> +
>> +The average DRAM bandwidth can be calculated as follows:
>> +
>> +- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
>> +- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
>> +
>> +Here, DDRC_WIDTH = 64 bytes.
>> +
>> +The current driver does not support sampling. So "perf record" is
>> +unsupported.  Also attach to a task is unsupported as the events are all
>> +uncore.
>> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
>> index 69b23f087c05..bf466ae91c6c 100644
>> --- a/Documentation/admin-guide/perf/index.rst
>> +++ b/Documentation/admin-guide/perf/index.rst
>> @@ -17,3 +17,4 @@ Performance monitor support
>>     xgene-pmu
>>     arm_dsu_pmu
>>     thunderx2-pmu
>> +   thead_pmu

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-20 13:50   ` Jonathan Cameron
@ 2022-06-24  7:04     ` Shuai Xue
  0 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-24  7:04 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song



在 2022/6/20 PM9:50, Jonathan Cameron 写道:
> On Fri, 17 Jun 2022 19:18:24 +0800
> Shuai Xue <xueshuai@linux.alibaba.com> wrote:
> 
>> Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
>> support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports the most
>> advanced DDR5/4 DRAM to provide tremendous memory bandwidth for cloud
>> computing and HPC.
> 
> A little marketing heavy for a patch description but I suppose that's harmless!
> Maybe the shorter option of something like
> 
> "Yitian supports DDR5/4 DRAM and targets cloud computing and HPC."

Aha, thank you for your paraphrase. I will rewrite the patch description.

>> Each PMU is registered as a device in /sys/bus/event_source/devices, and
>> users can select event to monitor in each sub-channel, independently. For
>> example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
>> sub-channels of the same channel in die 0. And the PMU device of die 1 is
>> prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
>>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
>> Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
> 
> Unusual to have such a deep sign off list for a patch positing.  Is
> Co-developed appropriate here?

Yep, I will change SOB to Co-developed.

> A few comments inline, but I only took a fairly superficial look.
> 
> Thanks,
> 
> Jonathan

Thank you very much for your valuable comments.

Best Regards,
Shuai

> 
>> ---
>>  drivers/perf/Kconfig                  |   8 +
>>  drivers/perf/Makefile                 |   1 +
>>  drivers/perf/alibaba_uncore_drw_pmu.c | 771 ++++++++++++++++++++++++++
>>  3 files changed, 780 insertions(+)
>>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
>>
> 
>> diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
>> new file mode 100644
>> index 000000000000..9228ff672276
>> --- /dev/null
>> +++ b/drivers/perf/alibaba_uncore_drw_pmu.c
>> @@ -0,0 +1,771 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Alibaba DDR Sub-System Driveway PMU driver
>> + *
>> + * Copyright (C) 2022 Alibaba Inc
>> + */
>> +
>> +#define ALI_DRW_PMUNAME		"ali_drw"
>> +#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
>> +#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/perf_event.h>
>> +#include <linux/platform_device.h>
>> +
> 
> ...
> 
>> +static struct attribute *ali_drw_pmu_events_attrs[] = {
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd_or_wr,		0x0),
> 
> As mentioned in review of patch 1, why perf_ prefix?
> Also, what's hif, dfi, hpr, lpr wpr etc?

The `perf` prefix means Performance Logging Signals in DDRC, and the hardware
PMU counts the signals.

DFI stands for DDR PHY Interface[1]. The DDRCTL controller in Yitan SoC
contains a DFI MC (Memory Controller) interface, which is used to connect
to a DFI-compliant PHY, and to transfer address, control, and data to the PHY.

hpr and lpr means High Priority Read and Low Priority, respectively. The
hardware PMU count High Priority Read transaction that is scheduled when
the High Priority Queue is in Critical state.

Event perf_hpr_xact_when_critical counts for every High Priority Read
transaction that is scheduled when the High Priority Queue is in Critical
state.

Sorry, event perf_wpr_xact_when_critical is a mistake and should be corrected
to perf_wr_xact_when_critical. I will double check other events. Thank you.

> 
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_wr,			0x1),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd,			0x2),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_rmw,			0x3),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_hif_hi_pri_rd,		0x4),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_dfi_wr_data_cycles,	0x7),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_dfi_rd_data_cycles,	0x8),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_hpr_xact_when_critical,	0x9),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_lpr_xact_when_critical,	0xA),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_wpr_xact_when_critical,	0xB),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_activate,		0xC),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_or_wr,		0xD),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_activate,		0xE),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd,			0xF),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_wr,			0x10),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_mwr,			0x11),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_precharge,		0x12),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_rdwr,	0x13),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_other,	0x14),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_rdwr_transitions,		0x15),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_write_combine,		0x16),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_war_hazard,		0x17),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_raw_hazard,		0x18),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_waw_hazard,		0x19),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk0,	0x1A),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk1,	0x1B),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk2,	0x1C),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk3,	0x1D),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk0,	0x1E),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk1,	0x1F),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk2,	0x20),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk3,	0x21),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk0,		0x26),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk1,		0x27),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk2,		0x28),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk3,		0x29),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_refresh,		0x2A),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_crit_ref,		0x2B),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_load_mode,		0x2D),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqcl,		0x2E),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_rd, 0x30),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_wr, 0x31),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mpc,		0x34),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mrr,		0x35),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_tcr_mrr,		0x36),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqstart,		0x37),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqlatch,		0x38),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_txreq,			0x39),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_txdat,			0x3A),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxdat,			0x3B),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxrsp,			0x3C),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_tsz_vio,			0x3D),
>> +	ALI_DRW_PMU_EVENT_ATTR(perf_cycle,			0x80),
>> +	NULL,
>> +};
> 
> 
> 
>> +static void ali_drw_pmu_enable_counter(struct perf_event *event)
>> +{
>> +	u32 val, reg;
>> +	int counter = event->hw.idx;
>> +	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
>> +
>> +	reg = ALI_DRW_PMU_EVENT_SELn(counter);
>> +	val = readl(drw_pmu->cfg_base + reg);
>> +	/* enable the common counter n */
>> +	val |= (1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter));
>> +	/* decides the event which will be counted by the common counter n */
>> +	val |= ((drw_pmu->evtids[counter] & 0x3F) << ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter));
> 
> You could use FIELD_PREP() to fill the 8 bit sub register then shift it based
> on counter.  Would probably be more readable. Something like
> 
> 	val = readl(drw_pmu->cfg_base + reg);
> 	subval = FIELD_PREP(ALI_DRAW_PMCOM_CNT_EN_MSK, 1) |
>         	 FIELD_PREP(ALI_DRAW_PMCOM_CNT_EVENT_MSK, drw_pmu->evtids);
> 
> 	shift = 8 * (counter % 4);
> 	val &= ~(GENMASK(7, 0) << shift);
> 	val |= subval << shift;

Thank you for your suggestion, I will use FIELD_PREP() instead in next version.

> 
>> +	writel(val, drw_pmu->cfg_base + reg);
>> +}
>> +
>> +static void ali_drw_pmu_disable_counter(struct perf_event *event)
>> +{
>> +	u32 val, reg;
>> +	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
>> +	int counter = event->hw.idx;
>> +
>> +	reg = ALI_DRW_PMU_EVENT_SELn(counter);
>> +	val = readl(drw_pmu->cfg_base + reg);
>> +	val &= ~(1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter));
>> +	val &= ~(0x3F << ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter));
> 
> Use GENMASK() for that mask and maybe #define it got give it a name.

Will fix in next version. Thank you.

>> +
>> +	writel(val, drw_pmu->cfg_base + reg);
>> +}
>> +
>> +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
>> +{
>> +	struct ali_drw_pmu_irq *irq = data;
>> +	struct ali_drw_pmu *drw_pmu;
>> +	irqreturn_t ret = IRQ_NONE;
>> +
>> +	rcu_read_lock();
>> +	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {
> 
> As mentioned below, I'm not clear this is actually worthwhile vs
> just relying on IRQF_SHARED and registering each PMU separately.

This implementation is similar to dmc620_pmu_handle_irq. In my opioion,
its OK :)

> 
>> +		unsigned long status;
>> +		struct perf_event *event;
>> +		unsigned int idx;
>> +
>> +		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
>> +			event = drw_pmu->events[idx];
>> +			if (!event)
>> +				continue;
>> +			ali_drw_pmu_disable_counter(event);
>> +		}
>> +
>> +		/* common counter intr status */
>> +		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
>> +		status = (status >> 8) & 0xFFFF;
> 
> Nice to use a MASK define and FIELD_GET()

Thank you, I will change in next version.

> 
>> +		if (status) {
>> +			for_each_set_bit(idx, &status,
>> +					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
>> +				event = drw_pmu->events[idx];
>> +				if (WARN_ON_ONCE(!event))
>> +					continue;
>> +				ali_drw_pmu_event_update(event);
>> +				ali_drw_pmu_event_set_period(event);
>> +			}
>> +
>> +			/* clear common counter intr status */
>> +			writel((status & 0xFFFF) << 8,
>> +				drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
>> +		}
>> +
>> +		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
>> +			event = drw_pmu->events[idx];
>> +			if (!event)
>> +				continue;
>> +			if (!(event->hw.state & PERF_HES_STOPPED))
>> +				ali_drw_pmu_enable_counter(event);
>> +		}
>> +
>> +		ret = IRQ_HANDLED;
> 
> If no bits were set in intr status, then should not set this.

I see, will add a condition in next version. Thank you!

> 
>> +	}
>> +	rcu_read_unlock();
>> +	return ret;
>> +}
>> +
> 
> This IRQ handling is nasty.  It would be good for the patch description to
> explain what the constraints are.  What hardware shares an IRQ etc.
> Seems like the MPAM driver is going to be fighting with this to migrate the
> interrupt if needed.

I will add it in patch description.

> I'm also not clear how bad it would be to just have each PMU register
> shared IRQ and HP handler. That would simplify this code a fair bit
> and the excess HP handler calls shouldn't matter (cpu_num wouldn't
> match once first one has migrated it).

The interrupt of each PMU are independent of each other. Unfortunately, one of
them is shared with MPAM. And reference to dmc620_pmu_handle_irq, I think its OK
to leave as it is.

>> +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
>> +						      *pdev, int irq_num)
>> +{
>> +	int ret;
>> +	struct ali_drw_pmu_irq *irq;
>> +
>> +	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
>> +		if (irq->irq_num == irq_num
>> +		    && refcount_inc_not_zero(&irq->refcount))
>> +			return irq;
>> +	}
>> +
>> +	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
>> +	if (!irq)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	INIT_LIST_HEAD(&irq->pmus_node);
>> +
>> +	/* Pick one CPU to be the preferred one to use */
>> +	irq->cpu = smp_processor_id();
>> +	refcount_set(&irq->refcount, 1);
>> +
>> +	/*
>> +	 * FIXME: DDRSS Driveway PMU overflow interrupt shares the same irq
>> +	 * number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
>> +	 * successfully, add IRQF_SHARED flag.
> 
> You have it already for this driver, so what does this FIXME refer to?

As discussed in [1], I don't think PMU interrupt should be shared. I think the next
generation of Yitian will fix it. Then we should remove the IRQF_SHARED flag.

[1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/1584491381-31492-1-git-send-email-tuanphan@os.amperecomputing.com/

> 
>> +	 */
>> +	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
>> +			       IRQF_SHARED, dev_name(&pdev->dev), irq);
>> +	if (ret < 0) {
>> +		dev_err(&pdev->dev,
>> +			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
>> +		goto out_free;
>> +	}
>> +
>> +	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
>> +	if (ret)
>> +		goto out_free;
>> +
>> +	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, &irq->node);
>> +	if (ret)
>> +		goto out_free;
>> +
>> +	irq->irq_num = irq_num;
>> +	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
>> +
>> +	return irq;
>> +
>> +out_free:
>> +	kfree(irq);
>> +	return ERR_PTR(ret);
>> +}
>> +
>> +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
>> +				struct platform_device *pdev)
>> +{
>> +	int irq_num;
>> +	struct ali_drw_pmu_irq *irq;
>> +
>> +	/* Read and init IRQ */
>> +	irq_num = platform_get_irq(pdev, 0);
>> +	if (irq_num < 0)
>> +		return irq_num;
>> +
>> +	mutex_lock(&ali_drw_pmu_irqs_lock);
>> +	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
>> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
>> +
>> +	if (IS_ERR(irq))
>> +		return PTR_ERR(irq);
>> +
>> +	drw_pmu->irq = irq;
>> +
>> +	mutex_lock(&ali_drw_pmu_irqs_lock);
>> +	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
>> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
>> +
>> +	return 0;
>> +}
>> +
>> +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
>> +{
>> +	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
>> +
>> +	mutex_lock(&ali_drw_pmu_irqs_lock);
>> +	list_del_rcu(&drw_pmu->pmus_node);
>> +
>> +	if (!refcount_dec_and_test(&irq->refcount)) {
>> +		mutex_unlock(&ali_drw_pmu_irqs_lock);
>> +		return;
>> +	}
>> +
>> +	list_del(&irq->irqs_node);
>> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
>> +
>> +	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
>> +	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
>> +					    &irq->node);
>> +	kfree(irq);
>> +}
>> +
> 
>> +
>> +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct ali_drw_pmu_irq *irq;
>> +	struct ali_drw_pmu *drw_pmu;
>> +	unsigned int target;
>> +
>> +	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
>> +	if (cpu != irq->cpu)
>> +		return 0;
>> +
>> +	target = cpumask_any_but(cpu_online_mask, cpu);
> 
> Preference for something on the same die if available?  I'm guessing that'll
> reduce overheads for access.  Obviously you'll then want an online callback
> to move back to the die if 1st cpu is onlined on it.

Agree, a cpu on the same die is more lightwight. But sorry, I don't find which
mask could be used in Arm platfrom. I only know topology_die_cpumask in X86.

> 
>> +	if (target >= nr_cpu_ids)
>> +		return 0;
>> +
>> +	/* We're only reading, but this isn't the place to be involving RCU */
>> +	mutex_lock(&ali_drw_pmu_irqs_lock);
>> +	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
>> +		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
>> +	mutex_unlock(&ali_drw_pmu_irqs_lock);
>> +
>> +	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
>> +	irq->cpu = target;
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct acpi_device_id ali_drw_acpi_match[] = {
>> +	{"ARMHD700", 0},
> 
> THEAD IP but with an ARM HID?  That needs at very least to have
> a comment.  If this is a generic IP, then it should be documented
> as such and the driver written with that in mind.

For historical reason, we used ARM HID. I will change to BABA5000 in next version,
keep ARMHD700 as CID for compatibility, and add comment.

>> +	{}
>> +};
>> +
>> +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
>> +
>> +static struct platform_driver ali_drw_pmu_driver = {
>> +	.driver = {
>> +		   .name = "ali_drw_pmu",
>> +		   .acpi_match_table = ACPI_PTR(ali_drw_acpi_match),
>> +		   },
>> +	.probe = ali_drw_pmu_probe,
>> +	.remove = ali_drw_pmu_remove,
>> +};
>> +
>> +static int __init ali_drw_pmu_init(void)
>> +{
>> +	int ret;
>> +
>> +	ali_drw_cpuhp_state_num = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>> +							  "ali_drw_pmu:online",
>> +							  NULL,
>> +							  ali_drw_pmu_offline_cpu);
>> +
>> +	if (ali_drw_cpuhp_state_num < 0) {
>> +		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
>> +		       ali_drw_cpuhp_state_num);
>> +		return ali_drw_cpuhp_state_num;
>> +	}
> 
> Slightly more readable as
> 
> 	ret = cpuhp_setup_state...
> 	if (ret < 0) {
> 		pr_err(..);
> 		return ret;
> 	}
> 
> 	ali_drw_cpuhp_state_num = ret;
> ...

Thank you, will fix in next version.

> 		
>> +
>> +	ret = platform_driver_register(&ali_drw_pmu_driver);
>> +	if (ret)
>> +		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
>> +
>> +	return ret;
>> +}
>> +
>> +static void __exit ali_drw_pmu_exit(void)
>> +{
>> +	platform_driver_unregister(&ali_drw_pmu_driver);
>> +	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
>> +}
>> +
>> +module_init(ali_drw_pmu_init);
>> +module_exit(ali_drw_pmu_exit);
>> +
>> +MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
>> +MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
>> +MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
>> +MODULE_DESCRIPTION("DDR Sub-System PMU driver");
>> +MODULE_LICENSE("GPL v2");

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (2 preceding siblings ...)
  2022-06-17 11:18 ` [PATCH v1 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
@ 2022-06-28  3:16 ` Shuai Xue
  2022-06-28  3:16 ` [PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-28  3:16 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

From: root <root@localhost.localdomain>

This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
which custom-built by Alibaba Group's chip development business, T-Head.

Changes since v1:
- add high level workflow about DDRC so that user cloud better understand the
  PMU hardware mechanism
- rewrite patch description and add interrupt sharing constraints
- delete event perf prefix
- add a condition to fix bug in ali_drw_pmu_isr
- perfer CPU in the same Node when migrating irq
- use FIELD_PREP and FIELD_GET to make code more readable
- add T-Head HID and leave ARMHD700 as CID for compatibility
- Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/

Shuai Xue (3):
  docs: perf: Add description for Alibaba's T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
    SoC
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver

 .../admin-guide/perf/alibaba_pmu.rst          |  97 +++
 Documentation/admin-guide/perf/index.rst      |   1 +
 MAINTAINERS                                   |   8 +
 drivers/perf/Kconfig                          |   8 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c         | 793 ++++++++++++++++++
 6 files changed, 908 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (3 preceding siblings ...)
  2022-06-28  3:16 ` [PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-06-28  3:16 ` Shuai Xue
  2022-06-28  3:16 ` [PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-28  3:16 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Alibaba's T-Head SoC implements uncore PMU for performance and functional
debugging to facilitate system maintenance. Document it to provide guidance
on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 .../admin-guide/perf/alibaba_pmu.rst          | 97 +++++++++++++++++++
 Documentation/admin-guide/perf/index.rst      |  1 +
 2 files changed, 98 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst

diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
new file mode 100644
index 000000000000..784885ce0dbb
--- /dev/null
+++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
@@ -0,0 +1,97 @@
+=============================================================
+Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
+=============================================================
+
+The Yitian 710, custom-built by Alibaba Group's chip development business,
+T-Head, implements uncore PMU for performance and functional debugging to
+facilitate system maintenance.
+
+DDR Sub-System Driveway (DRW) PMU Driver
+=========================================
+
+Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
+is independent of others to service system memory requests. And one DDR5
+channel is split into two independent sub-channels. The DDR Sub-System Driveway
+implements separate PMUs for each sub-channel to monitor various performance
+metrics.
+
+The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
+For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
+sub-channels of the same channel in die 0. And the PMU device of die 1 is
+prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
+
+Each sub-channel has 36 PMU counters in total, which is classified into
+four groups:
+
+- Group 0: PMU Cycle Counter. This group has one pair of counters
+  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
+  based on DDRC core clock.
+
+- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
+  to count the total access number of either the eight bank groups in a
+  selected rank, or four ranks separately in the first 4 counters. The base
+  transfer unit is 64B.
+
+- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
+  count the total retry number of each type of uncorrectable error.
+
+- Group 3: PMU Common Counters. This group has 16 counters, that are used
+  to count the common events.
+
+For now, the Driveway PMU driver only uses counters in group 0 and group 3.
+
+The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
+for connecting an SoC application bus to DDR memory devices. The DDRCTL
+receives transactions Host Interface (HIF) which is custom-defined by Synopsys.
+These transactions are queued internally and scheduled for access while
+satisfying the SDRAM protocol timing requirements, transaction priorities, and
+dependencies between the transactions. The DDRCTL in turn issues commands on
+the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
+to and from the SDRAM. The driveway PMUs have hardware logic to gather statistics and performance logging signals on HIF, DFI, etc.
+
+By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF interface, we could calculate the bandwidth. Example usage of counting memory data bandwidth::
+
+  perf stat \
+    -e ali_drw_21000/hif_wr/ \
+    -e ali_drw_21000/hif_rd/ \
+    -e ali_drw_21000/hif_rmw/ \
+    -e ali_drw_21000/cycle/ \
+    -e ali_drw_21080/hif_wr/ \
+    -e ali_drw_21080/hif_rd/ \
+    -e ali_drw_21080/hif_rmw/ \
+    -e ali_drw_21080/cycle/ \
+    -e ali_drw_23000/hif_wr/ \
+    -e ali_drw_23000/hif_rd/ \
+    -e ali_drw_23000/hif_rmw/ \
+    -e ali_drw_23000/cycle/ \
+    -e ali_drw_23080/hif_wr/ \
+    -e ali_drw_23080/hif_rd/ \
+    -e ali_drw_23080/hif_rmw/ \
+    -e ali_drw_23080/cycle/ \
+    -e ali_drw_25000/hif_wr/ \
+    -e ali_drw_25000/hif_rd/ \
+    -e ali_drw_25000/hif_rmw/ \
+    -e ali_drw_25000/cycle/ \
+    -e ali_drw_25080/hif_wr/ \
+    -e ali_drw_25080/hif_rd/ \
+    -e ali_drw_25080/hif_rmw/ \
+    -e ali_drw_25080/cycle/ \
+    -e ali_drw_27000/hif_wr/ \
+    -e ali_drw_27000/hif_rd/ \
+    -e ali_drw_27000/hif_rmw/ \
+    -e ali_drw_27000/cycle/ \
+    -e ali_drw_27080/hif_wr/ \
+    -e ali_drw_27080/hif_rd/ \
+    -e ali_drw_27080/hif_rmw/ \
+    -e ali_drw_27080/cycle/ -- sleep 10
+
+The average DRAM bandwidth can be calculated as follows:
+
+- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+
+Here, DDRC_WIDTH = 64 bytes.
+
+The current driver does not support sampling. So "perf record" is
+unsupported.  Also attach to a task is unsupported as the events are all
+uncore.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 69b23f087c05..bf466ae91c6c 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -17,3 +17,4 @@ Performance monitor support
    xgene-pmu
    arm_dsu_pmu
    thunderx2-pmu
+   thead_pmu
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (4 preceding siblings ...)
  2022-06-28  3:16 ` [PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-06-28  3:16 ` Shuai Xue
  2022-06-28  3:16 ` [PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-28  3:16 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
DRAM and targets cloud computing and HPC.

Each PMU is registered as a device in /sys/bus/event_source/devices, and
users can select event to monitor in each sub-channel, independently. For
example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
sub-channels of the same channel in die 0. And the PMU device of die 1 is
prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.

Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
MPAM drivers successfully, add IRQF_SHARED flag.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
---
 drivers/perf/Kconfig                  |   8 +
 drivers/perf/Makefile                 |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++
 3 files changed, 802 insertions(+)
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 1e2d69453771..dfafba4cb066 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU
 	  Provides support for the non-architectural CPU PMUs present on
 	  the Apple M1 SoCs and derivatives.
 
+config ALIBABA_UNCORE_DRW_PMU
+	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
+	depends on (ARM64 && ACPI)
+	default m
+	help
+	  Support for Driveway PMU events monitoring on Yitian 710 DDR
+	  Sub-system.
+
 source "drivers/perf/hisilicon/Kconfig"
 
 config MARVELL_CN10K_DDR_PMU
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 57a279c61df5..050d04ee19dd 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
+obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
new file mode 100644
index 000000000000..62c1aeeac2bd
--- /dev/null
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -0,0 +1,793 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Alibaba DDR Sub-System Driveway PMU driver
+ *
+ * Copyright (C) 2022 Alibaba Inc
+ */
+
+#define ALI_DRW_PMUNAME		"ali_drw"
+#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
+#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
+
+#include <linux/acpi.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
+#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
+
+#define ALI_DRW_PMU_PA_SHIFT			12
+#define ALI_DRW_PMU_CNT_INIT			0x00000000
+#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
+#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
+
+#define ALI_DRW_PMU_CNT_CTRL			0xC00
+#define ALI_DRW_PMU_CNT_RST			BIT(2)
+#define ALI_DRW_PMU_CNT_STOP			BIT(1)
+#define ALI_DRW_PMU_CNT_START			BIT(0)
+
+#define ALI_DRW_PMU_CNT_STATE			0xC04
+#define ALI_DRW_PMU_TEST_CTRL			0xC08
+#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
+
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
+#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
+
+/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_EVENT_SEL0			0xC68
+/* counter 0-3 use sel0, counter 4-7 use sel1...*/
+#define ALI_DRW_PMU_EVENT_SELn(n) \
+	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
+#define ALI_DRW_PMCOM_CNT_EN			BIT(7)
+#define ALI_DRW_PMCOM_CNT_EVENT_MASK		GENMASK(5, 0)
+#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
+	(8 * (n % 4))
+
+/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
+#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
+	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
+
+#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
+#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
+#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
+#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
+#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
+#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
+#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
+#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
+
+static int ali_drw_cpuhp_state_num;
+
+static LIST_HEAD(ali_drw_pmu_irqs);
+static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);
+
+struct ali_drw_pmu_irq {
+	struct hlist_node node;
+	struct list_head irqs_node;
+	struct list_head pmus_node;
+	int irq_num;
+	int cpu;
+	refcount_t refcount;
+};
+
+struct ali_drw_pmu {
+	void __iomem *cfg_base;
+	struct device *dev;
+
+	struct list_head pmus_node;
+	struct ali_drw_pmu_irq *irq;
+	int irq_num;
+	int cpu;
+	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);
+	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+
+	struct pmu pmu;
+};
+
+#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu))
+
+#define DRW_CONFIG_EVENTID		GENMASK(7, 0)
+#define GET_DRW_EVENTID(event)	FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr.config)
+
+static ssize_t ali_drw_pmu_format_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+statuc ssize_t ali_drw_pmu_event_show(struct device *dev,
+			       struct device_attribute *attr, char *page)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+#define ALI_DRW_PMU_ATTR(_name, _func, _config)                            \
+		(&((struct dev_ext_attribute[]) {                               \
+				{ __ATTR(_name, 0444, _func, NULL), (void *)_config }   \
+		})[0].attr.attr)
+
+#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config)            \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config)
+#define ALI_DRW_PMU_EVENT_ATTR(_name, _config)             \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config)
+
+static struct attribute *ali_drw_pmu_events_attrs[] = {
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr,			0x0),
+	ALI_DRW_PMU_EVENT_ATTR(hif_wr,				0x1),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd,				0x2),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rmw,				0x3),
+	ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd,			0x4),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles,		0x7),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles,		0x8),
+	ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical,		0x9),
+	ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical,		0xA),
+	ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical,		0xB),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_activate,			0xC),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr,			0xD),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate,		0xE),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd,			0xF),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_wr,			0x10),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_mwr,			0x11),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_precharge,			0x12),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr,		0x13),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_other,		0x14),
+	ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions,		0x15),
+	ALI_DRW_PMU_EVENT_ATTR(write_combine,			0x16),
+	ALI_DRW_PMU_EVENT_ATTR(war_hazard,			0x17),
+	ALI_DRW_PMU_EVENT_ATTR(raw_hazard,			0x18),
+	ALI_DRW_PMU_EVENT_ATTR(waw_hazard,			0x19),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0,		0x1A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1,		0x1B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2,		0x1C),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3,		0x1D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0,	0x1E),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1,	0x1F),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2,	0x20),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3,	0x21),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0,		0x26),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1,		0x27),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2,		0x28),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3,		0x29),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_refresh,			0x2A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref,			0x2B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode,			0x2D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl,			0x2E),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc,		0x34),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr,		0x35),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr,			0x36),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart,			0x37),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch,			0x38),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txreq,			0x39),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txdat,			0x3A),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxdat,			0x3B),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp,			0x3C),
+	ALI_DRW_PMU_EVENT_ATTR(tsz_vio,				0x3D),
+	ALI_DRW_PMU_EVENT_ATTR(cycle,				0x80),
+	NULL,
+};
+
+static struct attribute_group ali_drw_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = ali_drw_pmu_events_attrs,
+};
+
+static struct attribute *ali_drw_pmu_format_attr[] = {
+	ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_format_group = {
+	.name = "format",
+	.attrs = ali_drw_pmu_format_attr,
+};
+
+static ssize_t ali_drw_pmu_cpumask_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu));
+}
+
+static struct device_attribute ali_drw_pmu_cpumask_attr =
+		__ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL);
+
+static struct attribute *ali_drw_pmu_cpumask_attrs[] = {
+	&ali_drw_pmu_cpumask_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_cpumask_attr_group = {
+	.attrs = ali_drw_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *ali_drw_pmu_attr_groups[] = {
+	&ali_drw_pmu_events_attr_group,
+	&ali_drw_pmu_cpumask_attr_group,
+	&ali_drw_pmu_format_group,
+	NULL,
+};
+
+/* find a counter for event, then in add func, hw.idx will equal to counter */
+static int ali_drw_get_counter_idx(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int idx;
+
+	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
+		if (!test_and_set_bit(idx, drw_pmu->used_mask))
+			return idx;
+	}
+
+	/* The counters are all in use. */
+	return -EBUSY;
+}
+
+static u64 ali_drw_pmu_read_counter(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	u64 cycle_high, cycle_low;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		cycle_high = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH);
+		cycle_high &= ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK;
+		cycle_low = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW);
+		cycle_low &= ALI_DRW_PMU_CYCLE_CNT_LOW_MASK;
+		return (cycle_high << 32 | cycle_low);
+	}
+
+	return readl(drw_pmu->cfg_base +
+		     ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx));
+}
+
+static void ali_drw_pmu_event_update(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 delta, prev, now;
+
+	do {
+		prev = local64_read(&hwc->prev_count);
+		now = ali_drw_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
+
+	/* handle overflow. */
+	delta = now - prev;
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID)
+		delta &= ALI_DRW_PMU_OV_INTR_MASK;
+	else
+		delta &= ALI_DRW_CNT_MAX_PERIOD;
+	local64_add(delta, &event->count);
+}
+
+static void ali_drw_pmu_event_set_period(struct perf_event *event)
+{
+	u64 pre_val;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	/* set a preload counter for test purpose */
+	writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+
+	/* set conunter initial value */
+	pre_val = ALI_DRW_PMU_CNT_INIT;
+	writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	local64_set(&event->hw.prev_count, pre_val);
+
+	/* set sel mode to zero to start test */
+	writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+}
+
+static void ali_drw_pmu_enable_counter(struct perf_event *event)
+{
+	u32 val, subval, reg, shift;
+	int counter = event->hw.idx;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static void ali_drw_pmu_disable_counter(struct perf_event *event)
+{
+	u32 val, reg, subval, shift;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int counter = event->hw.idx;
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
+{
+	struct ali_drw_pmu_irq *irq = data;
+	struct ali_drw_pmu *drw_pmu;
+	irqreturn_t ret = IRQ_NONE;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {
+		unsigned long status, clr_status;
+		struct perf_event *event;
+		unsigned int idx;
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			ali_drw_pmu_disable_counter(event);
+		}
+
+		/* common counter intr status */
+		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
+		status = FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status);
+		if (status) {
+			for_each_set_bit(idx, &status,
+					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+				event = drw_pmu->events[idx];
+				if (WARN_ON_ONCE(!event))
+					continue;
+				ali_drw_pmu_event_update(event);
+				ali_drw_pmu_event_set_period(event);
+			}
+
+			/* clear common counter intr status */
+			clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1);
+			writel(clr_status,
+			       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+		}
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			if (!(event->hw.state & PERF_HES_STOPPED))
+				ali_drw_pmu_enable_counter(event);
+		}
+		if (status)
+			ret = IRQ_HANDLED;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
+						      *pdev, int irq_num)
+{
+	int ret;
+	struct ali_drw_pmu_irq *irq;
+
+	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
+		if (irq->irq_num == irq_num
+		    && refcount_inc_not_zero(&irq->refcount))
+			return irq;
+	}
+
+	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
+	if (!irq)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&irq->pmus_node);
+
+	/* Pick one CPU to be the preferred one to use */
+	irq->cpu = smp_processor_id();
+	refcount_set(&irq->refcount, 1);
+
+	/*
+	 * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same
+	 * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
+	 * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not
+	 * share with other component.
+	 */
+	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
+			       IRQF_SHARED, dev_name(&pdev->dev), irq);
+	if (ret < 0) {
+		dev_err(&pdev->dev,
+			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
+		goto out_free;
+	}
+
+	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
+	if (ret)
+		goto out_free;
+
+	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num,
+					     &irq->node);
+	if (ret)
+		goto out_free;
+
+	irq->irq_num = irq_num;
+	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
+
+	return irq;
+
+out_free:
+	kfree(irq);
+	return ERR_PTR(ret);
+}
+
+static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
+				struct platform_device *pdev)
+{
+	int irq_num;
+	struct ali_drw_pmu_irq *irq;
+
+	/* Read and init IRQ */
+	irq_num = platform_get_irq(pdev, 0);
+	if (irq_num < 0)
+		return irq_num;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	if (IS_ERR(irq))
+		return PTR_ERR(irq);
+
+	drw_pmu->irq = irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	return 0;
+}
+
+static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
+{
+	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_del_rcu(&drw_pmu->pmus_node);
+
+	if (!refcount_dec_and_test(&irq->refcount)) {
+		mutex_unlock(&ali_drw_pmu_irqs_lock);
+		return;
+	}
+
+	list_del(&irq->irqs_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
+	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
+					    &irq->node);
+	kfree(irq);
+}
+
+static int ali_drw_pmu_event_init(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event *sibling;
+	struct device *dev = drw_pmu->pmu.dev;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (is_sampling_event(event)) {
+		dev_err(dev, "Sampling not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->attach_state & PERF_ATTACH_TASK) {
+		dev_err(dev, "Per-task counter cannot allocate!\n");
+		return -EOPNOTSUPP;
+	}
+
+	event->cpu = drw_pmu->cpu;
+	if (event->cpu < 0) {
+		dev_err(dev, "Per-task mode not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->group_leader != event &&
+	    !is_software_event(event->group_leader)) {
+		dev_err(dev, "driveway only allow one event!\n");
+		return -EINVAL;
+	}
+
+	for_each_sibling_event(sibling, event->group_leader) {
+		if (sibling != event && !is_software_event(sibling)) {
+			dev_err(dev, "driveway event not allowed!\n");
+			return -EINVAL;
+		}
+	}
+
+	/* reset all the pmu counters */
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	hwc->idx = -1;
+
+	return 0;
+}
+
+static void ali_drw_pmu_start(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	event->hw.state = 0;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		writel(ALI_DRW_PMU_CNT_START,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+		return;
+	}
+
+	ali_drw_pmu_event_set_period(event);
+	if (flags & PERF_EF_RELOAD) {
+		unsigned long prev_raw_count =
+		    local64_read(&event->hw.prev_count);
+		writel(prev_raw_count,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	}
+
+	ali_drw_pmu_enable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+}
+
+static void ali_drw_pmu_stop(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	if (event->hw.state & PERF_HES_STOPPED)
+		return;
+
+	if (GET_DRW_EVENTID(event) != ALI_DRW_PMU_CYCLE_EVT_ID)
+		ali_drw_pmu_disable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	ali_drw_pmu_event_update(event);
+	event->hw.state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int ali_drw_pmu_add(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = -1;
+	int evtid;
+
+	evtid = GET_DRW_EVENTID(event);
+
+	if (evtid != ALI_DRW_PMU_CYCLE_EVT_ID) {
+		idx = ali_drw_get_counter_idx(event);
+		if (idx < 0)
+			return idx;
+		drw_pmu->events[idx] = event;
+		drw_pmu->evtids[idx] = evtid;
+	}
+	hwc->idx = idx;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		ali_drw_pmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate our changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void ali_drw_pmu_del(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	ali_drw_pmu_stop(event, PERF_EF_UPDATE);
+
+	if (idx >= 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+		drw_pmu->events[idx] = NULL;
+		drw_pmu->evtids[idx] = 0;
+		clear_bit(idx, drw_pmu->used_mask);
+	}
+
+	perf_event_update_userpage(event);
+}
+
+static void ali_drw_pmu_read(struct perf_event *event)
+{
+	ali_drw_pmu_event_update(event);
+}
+
+static int ali_drw_pmu_probe(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu;
+	struct resource *res;
+	char *name;
+	int ret;
+
+	drw_pmu = devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL);
+	if (!drw_pmu)
+		return -ENOMEM;
+
+	drw_pmu->dev = &pdev->dev;
+	platform_set_drvdata(pdev, drw_pmu);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	drw_pmu->cfg_base = devm_ioremap_resource(&pdev->dev, res);
+	if (!drw_pmu->cfg_base)
+		return -ENOMEM;
+
+	name = devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx",
+			      (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT));
+	if (!name)
+		return -ENOMEM;
+
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	/* enable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL);
+
+	/* clearing interrupt status */
+	writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+
+	drw_pmu->cpu = smp_processor_id();
+
+	ret = ali_drw_pmu_init_irq(drw_pmu, pdev);
+	if (ret)
+		return ret;
+
+	drw_pmu->pmu = (struct pmu) {
+		.module		= THIS_MODULE,
+		.task_ctx_nr	= perf_invalid_context,
+		.event_init	= ali_drw_pmu_event_init,
+		.add		= ali_drw_pmu_add,
+		.del		= ali_drw_pmu_del,
+		.start		= ali_drw_pmu_start,
+		.stop		= ali_drw_pmu_stop,
+		.read		= ali_drw_pmu_read,
+		.attr_groups	= ali_drw_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
+	};
+
+	ret = perf_pmu_register(&drw_pmu->pmu, name, -1);
+	if (ret) {
+		dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n");
+		ali_drw_pmu_uninit_irq(drw_pmu);
+	}
+
+	return ret;
+}
+
+static int ali_drw_pmu_remove(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu = platform_get_drvdata(pdev);
+
+	/* disable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL);
+
+	ali_drw_pmu_uninit_irq(drw_pmu);
+	perf_pmu_unregister(&drw_pmu->pmu);
+
+	return 0;
+}
+
+static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct ali_drw_pmu_irq *irq;
+	struct ali_drw_pmu *drw_pmu;
+	unsigned int target;
+	int ret;
+	cpumask_t node_online_cpus;
+
+	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
+	if (cpu != irq->cpu)
+		return 0;
+
+	ret = cpumask_and(&node_online_cpus,
+			  cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
+	if (ret)
+		target = cpumask_any_but(&node_online_cpus, cpu);
+	else
+		target = cpumask_any_but(cpu_online_mask, cpu);
+
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	/* We're only reading, but this isn't the place to be involving RCU */
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
+		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
+	irq->cpu = target;
+
+	return 0;
+}
+
+/*
+ * Due to historical reasons, the HID used in the production environment is
+ * ARMHD700, so we leave ARMHD700 for compatibility.
+ */
+static const struct acpi_device_id ali_drw_acpi_match[] = {
+	{"BABA5000", 0},
+	{"ARMHD700", 0},
+	{}
+};
+
+MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
+
+static struct platform_driver ali_drw_pmu_driver = {
+	.driver = {
+		   .name = "ali_drw_pmu",
+		   .acpi_match_table = ACPI_PTR(ali_drw_acpi_match),
+		   },
+	.probe = ali_drw_pmu_probe,
+	.remove = ali_drw_pmu_remove,
+};
+
+static int __init ali_drw_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "ali_drw_pmu:online",
+				      NULL, ali_drw_pmu_offline_cpu);
+
+	if (ret < 0) {
+		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
+		       ret);
+		return ret;
+	}
+	ali_drw_cpuhp_state_num = ret;
+
+	ret = platform_driver_register(&ali_drw_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+
+	return ret;
+}
+
+static void __exit ali_drw_pmu_exit(void)
+{
+	platform_driver_unregister(&ali_drw_pmu_driver);
+	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+}
+
+module_init(ali_drw_pmu_init);
+module_exit(ali_drw_pmu_exit);
+
+MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
+MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
+MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
+MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver");
+MODULE_LICENSE("GPL v2");
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (5 preceding siblings ...)
  2022-06-28  3:16 ` [PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-06-28  3:16 ` Shuai Xue
  2022-07-15 15:13 ` [RESEND PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-06-28  3:16 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Add maintainers for Alibaba PMU document and driver

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 55114e73de26..ccddb59c253c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -733,6 +733,14 @@ S:	Maintained
 F:	Documentation/i2c/busses/i2c-ali1563.rst
 F:	drivers/i2c/busses/i2c-ali1563.c
 
+ALIBABA PMU DRIVER
+M:	Shuai Xue <xueshuai@linux.alibaba.com>
+M:	Hongbo Yao <yaohongbo@linux.alibaba.com>
+M:	Neng Chen <nengchen@linux.alibaba.com>
+S:	Supported
+F:	Documentation/admin-guide/perf/alibaba_pmu.rst
+F:	drivers/perf/alibaba_uncore_dwr_pmu.c
+
 ALIENWARE WMI DRIVER
 L:	Dell.Client.Kernel@dell.com
 S:	Maintained
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 ` [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
  2022-06-20 11:49   ` Jonathan Cameron
@ 2022-07-04  3:53   ` kernel test robot
  1 sibling, 0 replies; 41+ messages in thread
From: kernel test robot @ 2022-07-04  3:53 UTC (permalink / raw)
  To: Shuai Xue, linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: kbuild-all, baolin.wang, yaohongbo, nengchen, zhuo.song

[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]

Hi Shuai,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on soc/for-next]
[also build test WARNING on linus/master v5.19-rc5 next-20220701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Shuai-Xue/drivers-perf-add-DDR-Sub-System-Driveway-PMU-driver-for-Yitian-710-SoC/20220617-192123
base:   https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git for-next
reproduce: make htmldocs

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> Documentation/admin-guide/perf/index.rst:7: WARNING: toctree contains reference to nonexisting document 'admin-guide/perf/thead_pmu'

vim +7 Documentation/admin-guide/perf/index.rst

6baec31591cee0 Documentation/perf/index.rst Mauro Carvalho Chehab 2019-04-18  6  
6baec31591cee0 Documentation/perf/index.rst Mauro Carvalho Chehab 2019-04-18 @7  .. toctree::

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

[-- Attachment #2: config --]
[-- Type: text/plain, Size: 31470 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/i386 5.19.0-rc2 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="gcc-11 (Debian 11.3.0-3) 11.3.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=110300
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=23800
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23800
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_PAHOLE_VERSION=123
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
# CONFIG_WERROR is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_HAVE_KERNEL_ZSTD=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
# CONFIG_KERNEL_ZSTD is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
# CONFIG_SYSVIPC is not set
# CONFIG_WATCH_QUEUE is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
# CONFIG_USELIB is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# end of IRQ subsystem

CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y

#
# Timers subsystem
#
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US=100
# end of Timers subsystem

CONFIG_HAVE_EBPF_JIT=y

#
# BPF subsystem
#
# CONFIG_BPF_SYSCALL is not set
# end of BPF subsystem

CONFIG_PREEMPT_NONE_BUILD=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_DYNAMIC is not set

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_PSI is not set
# end of CPU/Task time and stats accounting

#
# RCU Subsystem
#
CONFIG_TINY_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TINY_SRCU=y
# end of RCU Subsystem

# CONFIG_IKCONFIG is not set
# CONFIG_IKHEADERS is not set
CONFIG_LOG_BUF_SHIFT=17
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y

#
# Scheduler features
#
# end of Scheduler features

CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
CONFIG_GCC12_NO_ARRAY_BOUNDS=y
# CONFIG_CGROUPS is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_TIME_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_CHECKPOINT_RESTORE is not set
# CONFIG_SCHED_AUTOGROUP is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_BOOT_CONFIG is not set
# CONFIG_INITRAMFS_PRESERVE_MTIME is not set
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_LD_ORPHAN_WARN=y
CONFIG_SYSCTL=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
# CONFIG_EXPERT is not set
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_IO_URING=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_RSEQ=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# end of Kernel Performance Events And Counters

# CONFIG_PROFILING is not set
# end of General setup

CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_BITS_MAX=16
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_NR_GPIO=512
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=2
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

#
# Processor type and features
#
# CONFIG_SMP is not set
CONFIG_X86_FEATURE_NAMES=y
# CONFIG_GOLDFISH is not set
# CONFIG_RETPOLINE is not set
CONFIG_CC_HAS_SLS=y
# CONFIG_X86_CPU_RESCTRL is not set
# CONFIG_X86_EXTENDED_PLATFORM is not set
# CONFIG_X86_32_IRIS is not set
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
# CONFIG_HYPERVISOR_GUEST is not set
# CONFIG_M486SX is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
CONFIG_M686=y
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MELAN is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_INTERNODE_CACHE_SHIFT=5
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=6
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_IA32_FEAT_CTL=y
CONFIG_X86_VMX_FEATURE_NAMES=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_HYGON=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_CPU_SUP_TRANSMETA_32=y
CONFIG_CPU_SUP_ZHAOXIN=y
CONFIG_CPU_SUP_VORTEX_32=y
# CONFIG_HPET_TIMER is not set
CONFIG_DMI=y
CONFIG_NR_CPUS_RANGE_BEGIN=1
CONFIG_NR_CPUS_RANGE_END=1
CONFIG_NR_CPUS_DEFAULT=1
CONFIG_NR_CPUS=1
# CONFIG_X86_UP_APIC is not set
# CONFIG_X86_MCE is not set

#
# Performance monitoring
#
# CONFIG_PERF_EVENTS_AMD_POWER is not set
# CONFIG_PERF_EVENTS_AMD_UNCORE is not set
# CONFIG_PERF_EVENTS_AMD_BRS is not set
# end of Performance monitoring

# CONFIG_X86_LEGACY_VM86 is not set
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX32=y
# CONFIG_X86_IOPL_IOPERM is not set
# CONFIG_TOSHIBA is not set
# CONFIG_X86_REBOOTFIXUPS is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ILLEGAL_POINTER_VALUE=0
# CONFIG_HIGHPTE is not set
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_MTRR=y
# CONFIG_MTRR_SANITIZER is not set
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_ARCH_RANDOM=y
CONFIG_X86_UMIP=y
CONFIG_CC_HAS_IBT=y
CONFIG_X86_INTEL_TSX_MODE_OFF=y
# CONFIG_X86_INTEL_TSX_MODE_ON is not set
# CONFIG_X86_INTEL_TSX_MODE_AUTO is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x1000000
# CONFIG_RELOCATABLE is not set
CONFIG_PHYSICAL_ALIGN=0x200000
# CONFIG_COMPAT_VDSO is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
# CONFIG_STRICT_SIGALTSTACK_SIZE is not set
# end of Processor type and features

CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y

#
# Power management and ACPI options
#
# CONFIG_SUSPEND is not set
# CONFIG_PM is not set
CONFIG_ARCH_SUPPORTS_ACPI=y
# CONFIG_ACPI is not set

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
# end of CPU Frequency scaling

#
# CPU Idle
#
# CONFIG_CPU_IDLE is not set
# end of CPU Idle
# end of Power management and ACPI options

#
# Bus options (PCI etc.)
#
CONFIG_ISA_DMA_API=y
# CONFIG_ISA is not set
# CONFIG_SCx200 is not set
# CONFIG_OLPC is not set
# CONFIG_ALIX is not set
# CONFIG_NET5501 is not set
# CONFIG_GEOS is not set
# end of Bus options (PCI etc.)

#
# Binary Emulations
#
CONFIG_COMPAT_32=y
# end of Binary Emulations

CONFIG_HAVE_ATOMIC_IOMAP=y
CONFIG_HAVE_KVM=y
# CONFIG_VIRTUALIZATION is not set
CONFIG_AS_AVX512=y
CONFIG_AS_SHA1_NI=y
CONFIG_AS_SHA256_NI=y
CONFIG_AS_TPAUSE=y

#
# General architecture-dependent options
#
CONFIG_GENERIC_ENTRY=y
# CONFIG_JUMP_LABEL is not set
# CONFIG_STATIC_CALL_SELFTEST is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_ARCH_WANTS_NO_INSTR=y
CONFIG_ARCH_32BIT_OFF_T=y
CONFIG_HAVE_ASM_MODVERSIONS=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_RSEQ=y
CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_IPC_PARSE_VERSION=y
CONFIG_HAVE_ARCH_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
# CONFIG_SECCOMP is not set
CONFIG_HAVE_ARCH_STACKLEAK=y
CONFIG_HAVE_STACKPROTECTOR=y
# CONFIG_STACKPROTECTOR is not set
CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
CONFIG_LTO_NONE=y
CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_MOVE_PUD=y
CONFIG_HAVE_MOVE_PMD=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_REL=y
CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=8
CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
CONFIG_CLONE_BACKWARDS=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_OLD_SIGACTION=y
# CONFIG_COMPAT_32BIT_TIME is not set
CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
CONFIG_RANDOMIZE_KSTACK_OFFSET=y
# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
CONFIG_ARCH_HAS_MEM_ENCRYPT=y
CONFIG_HAVE_STATIC_CALL=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SPLIT_ARG64=y
CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y
CONFIG_DYNAMIC_SIGFRAME=y

#
# GCOV-based kernel profiling
#
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# end of GCOV-based kernel profiling

CONFIG_HAVE_GCC_PLUGINS=y
# CONFIG_GCC_PLUGINS is not set
# end of General architecture-dependent options

CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
# CONFIG_MODULES is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
# CONFIG_BLOCK_LEGACY_AUTOLOAD is not set
# CONFIG_BLK_DEV_BSGLIB is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
# CONFIG_BLK_DEV_ZONED is not set
# CONFIG_BLK_WBT is not set
# CONFIG_BLK_SED_OPAL is not set
# CONFIG_BLK_INLINE_ENCRYPTION is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_EFI_PARTITION=y
# end of Partition Types

#
# IO Schedulers
#
# CONFIG_MQ_IOSCHED_DEADLINE is not set
# CONFIG_MQ_IOSCHED_KYBER is not set
# CONFIG_IOSCHED_BFQ is not set
# end of IO Schedulers

CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y

#
# Executable file formats
#
# CONFIG_BINFMT_ELF is not set
# CONFIG_BINFMT_SCRIPT is not set
# CONFIG_BINFMT_MISC is not set
CONFIG_COREDUMP=y
# end of Executable file formats

#
# Memory Management options
#
# CONFIG_SWAP is not set

#
# SLAB allocator options
#
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLAB_MERGE_DEFAULT is not set
# CONFIG_SLAB_FREELIST_RANDOM is not set
# CONFIG_SLAB_FREELIST_HARDENED is not set
# CONFIG_SLUB_STATS is not set
# end of SLAB allocator options

# CONFIG_SHUFFLE_PAGE_ALLOCATOR is not set
# CONFIG_COMPAT_BRK is not set
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_HAVE_FAST_GUP=y
CONFIG_EXCLUSIVE_SYSTEM_RAM=y
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_COMPACTION is not set
# CONFIG_PAGE_REPORTING is not set
# CONFIG_BOUNCE is not set
CONFIG_VIRT_TO_BUS=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
# CONFIG_TRANSPARENT_HUGEPAGE is not set
CONFIG_NEED_PER_CPU_KM=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
# CONFIG_CMA is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
CONFIG_ARCH_HAS_VM_GET_PAGE_PROT=y
CONFIG_ZONE_DMA=y
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_PERCPU_STATS is not set

#
# GUP_TEST needs to have DEBUG_FS enabled
#
CONFIG_ARCH_HAS_PTE_SPECIAL=y
CONFIG_KMAP_LOCAL=y
CONFIG_SECRETMEM=y
# CONFIG_ANON_VMA_NAME is not set
# CONFIG_USERFAULTFD is not set

#
# Data Access Monitoring
#
# CONFIG_DAMON is not set
# end of Data Access Monitoring
# end of Memory Management options

# CONFIG_NET is not set

#
# Device Drivers
#
CONFIG_HAVE_EISA=y
# CONFIG_EISA is not set
CONFIG_HAVE_PCI=y
# CONFIG_PCI is not set
# CONFIG_PCCARD is not set

#
# Generic Driver Options
#
# CONFIG_UEVENT_HELPER is not set
# CONFIG_DEVTMPFS is not set
# CONFIG_STANDALONE is not set
# CONFIG_PREVENT_FIRMWARE_BUILD is not set

#
# Firmware loader
#
CONFIG_FW_LOADER=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER is not set
# CONFIG_FW_LOADER_COMPRESS is not set
# CONFIG_FW_UPLOAD is not set
# end of Firmware loader

CONFIG_ALLOW_DEV_COREDUMP=y
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
# end of Generic Driver Options

#
# Bus devices
#
# CONFIG_MHI_BUS is not set
# CONFIG_MHI_BUS_EP is not set
# end of Bus devices

#
# Firmware Drivers
#

#
# ARM System Control and Management Interface Protocol
#
# end of ARM System Control and Management Interface Protocol

# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
# CONFIG_DMIID is not set
# CONFIG_DMI_SYSFS is not set
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
# CONFIG_FW_CFG_SYSFS is not set
# CONFIG_SYSFB_SIMPLEFB is not set
# CONFIG_GOOGLE_FIRMWARE is not set

#
# Tegra firmware driver
#
# end of Tegra firmware driver
# end of Firmware Drivers

# CONFIG_GNSS is not set
# CONFIG_MTD is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
# CONFIG_BLK_DEV is not set

#
# NVME Support
#
# CONFIG_NVME_FC is not set
# end of NVME Support

#
# Misc devices
#
# CONFIG_DUMMY_IRQ is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_SRAM is not set
# CONFIG_XILINX_SDFEC is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_93CX6 is not set
# end of EEPROM support

#
# Texas Instruments shared transport line discipline
#
# end of Texas Instruments shared transport line discipline

#
# Altera FPGA firmware download module (requires I2C)
#
# CONFIG_ECHO is not set
# CONFIG_PVPANIC is not set
# end of Misc devices

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
# CONFIG_SCSI is not set
# end of SCSI device support

# CONFIG_ATA is not set
# CONFIG_MD is not set
# CONFIG_TARGET_CORE is not set
# CONFIG_MACINTOSH_DRIVERS is not set

#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
# CONFIG_INPUT_SPARSEKMAP is not set
# CONFIG_INPUT_MATRIXKMAP is not set

#
# Userland interfaces
#
# CONFIG_INPUT_MOUSEDEV is not set
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
# CONFIG_RMI4_CORE is not set

#
# Hardware I/O ports
#
# CONFIG_SERIO is not set
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
# CONFIG_GAMEPORT is not set
# end of Hardware I/O ports
# end of Input device support

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
# CONFIG_LDISC_AUTOLOAD is not set

#
# Serial drivers
#
# CONFIG_SERIAL_8250 is not set

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_UARTLITE is not set
# CONFIG_SERIAL_LANTIQ is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_TIMBERDALE is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_FSL_LINFLEXUART is not set
# end of Serial drivers

# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_NULL_TTY is not set
# CONFIG_SERIAL_DEV_BUS is not set
# CONFIG_VIRTIO_CONSOLE is not set
# CONFIG_IPMI_HANDLER is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
# CONFIG_NSC_GPIO is not set
# CONFIG_DEVMEM is not set
# CONFIG_NVRAM is not set
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
# CONFIG_RANDOM_TRUST_CPU is not set
# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
# end of Character devices

#
# I2C support
#
# CONFIG_I2C is not set
# end of I2C support

# CONFIG_I3C is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
# CONFIG_PPS is not set

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK_OPTIONAL=y

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
# end of PTP clock support

# CONFIG_PINCTRL is not set
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
# CONFIG_POWER_RESET is not set
# CONFIG_POWER_SUPPLY is not set
# CONFIG_HWMON is not set
# CONFIG_THERMAL is not set
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y
# CONFIG_BCMA is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_MADERA is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_TQMX86 is not set
# end of Multifunction device drivers

# CONFIG_REGULATOR is not set
# CONFIG_RC_CORE is not set

#
# CEC support
#
# CONFIG_MEDIA_CEC_SUPPORT is not set
# end of CEC support

# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
# CONFIG_DRM is not set

#
# ARM devices
#
# end of ARM devices

#
# Frame buffer Devices
#
# CONFIG_FB is not set
# end of Frame buffer Devices

#
# Backlight & LCD device support
#
# CONFIG_LCD_CLASS_DEVICE is not set
# CONFIG_BACKLIGHT_CLASS_DEVICE is not set
# end of Backlight & LCD device support

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
# end of Console display driver support
# end of Graphics support

# CONFIG_SOUND is not set

#
# HID support
#
# CONFIG_HID is not set
# end of HID support

CONFIG_USB_OHCI_LITTLE_ENDIAN=y
# CONFIG_USB_SUPPORT is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_RTC_LIB=y
CONFIG_RTC_MC146818_LIB=y
# CONFIG_RTC_CLASS is not set
# CONFIG_DMADEVICES is not set

#
# DMABUF options
#
# CONFIG_SYNC_FILE is not set
# CONFIG_DMABUF_HEAPS is not set
# end of DMABUF options

# CONFIG_AUXDISPLAY is not set
# CONFIG_UIO is not set
# CONFIG_VFIO is not set
# CONFIG_VIRT_DRIVERS is not set
# CONFIG_VIRTIO_MENU is not set
# CONFIG_VHOST_MENU is not set

#
# Microsoft Hyper-V guest support
#
# end of Microsoft Hyper-V guest support

# CONFIG_GREYBUS is not set
# CONFIG_COMEDI is not set
# CONFIG_STAGING is not set
# CONFIG_X86_PLATFORM_DEVICES is not set
# CONFIG_CHROME_PLATFORMS is not set
# CONFIG_MELLANOX_PLATFORM is not set
# CONFIG_SURFACE_PLATFORMS is not set
# CONFIG_COMMON_CLK is not set
# CONFIG_HWSPINLOCK is not set

#
# Clock Source drivers
#
CONFIG_CLKSRC_I8253=y
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# end of Clock Source drivers

# CONFIG_MAILBOX is not set
# CONFIG_IOMMU_SUPPORT is not set

#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set
# end of Remoteproc drivers

#
# Rpmsg drivers
#
# CONFIG_RPMSG_VIRTIO is not set
# end of Rpmsg drivers

#
# SOC (System On Chip) specific Drivers
#

#
# Amlogic SoC drivers
#
# end of Amlogic SoC drivers

#
# Broadcom SoC drivers
#
# end of Broadcom SoC drivers

#
# NXP/Freescale QorIQ SoC drivers
#
# end of NXP/Freescale QorIQ SoC drivers

#
# i.MX SoC drivers
#
# end of i.MX SoC drivers

#
# Enable LiteX SoC Builder specific drivers
#
# end of Enable LiteX SoC Builder specific drivers

#
# Qualcomm SoC drivers
#
# end of Qualcomm SoC drivers

# CONFIG_SOC_TI is not set

#
# Xilinx SoC drivers
#
# end of Xilinx SoC drivers
# end of SOC (System On Chip) specific Drivers

# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
# CONFIG_PWM is not set

#
# IRQ chip support
#
# end of IRQ chip support

# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set

#
# PHY Subsystem
#
# CONFIG_GENERIC_PHY is not set
# CONFIG_PHY_CAN_TRANSCEIVER is not set

#
# PHY drivers for Broadcom platforms
#
# CONFIG_BCM_KONA_USB2_PHY is not set
# end of PHY drivers for Broadcom platforms

# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_PHY_INTEL_LGM_EMMC is not set
# end of PHY Subsystem

# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set

#
# Performance monitor support
#
# end of Performance monitor support

# CONFIG_RAS is not set

#
# Android
#
# CONFIG_ANDROID is not set
# end of Android

# CONFIG_DAX is not set
# CONFIG_NVMEM is not set

#
# HW tracing support
#
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
# end of HW tracing support

# CONFIG_FPGA is not set
# CONFIG_TEE is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set
# CONFIG_INTERCONNECT is not set
# CONFIG_COUNTER is not set
# CONFIG_PECI is not set
# CONFIG_HTE is not set
# end of Device Drivers

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
# CONFIG_VALIDATE_FS_PARSER is not set
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
# CONFIG_EXT4_FS is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
CONFIG_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
# CONFIG_FS_VERITY is not set
# CONFIG_DNOTIFY is not set
# CONFIG_INOTIFY_USER is not set
# CONFIG_FANOTIFY is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_FUSE_FS is not set
# CONFIG_OVERLAY_FS is not set

#
# Caches
#
# CONFIG_FSCACHE is not set
# end of Caches

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set
# end of CD-ROM/DVD Filesystems

#
# DOS/FAT/EXFAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_EXFAT_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_NTFS3_FS is not set
# end of DOS/FAT/EXFAT/NT Filesystems

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_PROC_PID_ARCH_STATUS=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
# CONFIG_TMPFS is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_CONFIGFS_FS is not set
# end of Pseudo filesystems

# CONFIG_MISC_FILESYSTEMS is not set
# CONFIG_NLS is not set
# CONFIG_UNICODE is not set
CONFIG_IO_WQ=y
# end of File systems

#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
# CONFIG_HARDENED_USERCOPY is not set
# CONFIG_FORTIFY_SOURCE is not set
# CONFIG_STATIC_USERMODEHELPER is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,bpf"

#
# Kernel hardening options
#

#
# Memory initialization
#
CONFIG_INIT_STACK_NONE=y
# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
CONFIG_CC_HAS_ZERO_CALL_USED_REGS=y
# CONFIG_ZERO_CALL_USED_REGS is not set
# end of Memory initialization

CONFIG_RANDSTRUCT_NONE=y
# end of Kernel hardening options
# end of Security options

# CONFIG_CRYPTO is not set

#
# Library routines
#
# CONFIG_PACKING is not set
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
# CONFIG_CORDIC is not set
# CONFIG_PRIME_NUMBERS is not set
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_ARCH_USE_SYM_ANNOTATIONS=y

#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
# CONFIG_CRYPTO_LIB_CURVE25519 is not set
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=1
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# end of Crypto library routines

# CONFIG_CRC_CCITT is not set
# CONFIG_CRC16 is not set
# CONFIG_CRC_T10DIF is not set
# CONFIG_CRC64_ROCKSOFT is not set
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC64 is not set
# CONFIG_CRC4 is not set
# CONFIG_CRC7 is not set
# CONFIG_LIBCRC32C is not set
# CONFIG_CRC8 is not set
# CONFIG_RANDOM32_SELFTEST is not set
# CONFIG_XZ_DEC is not set
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_NEED_SG_DMA_LENGTH=y
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_IRQ_POLL is not set
CONFIG_HAVE_GENERIC_VDSO=y
CONFIG_GENERIC_GETTIMEOFDAY=y
CONFIG_GENERIC_VDSO_32=y
CONFIG_GENERIC_VDSO_TIME_NS=y
CONFIG_ARCH_STACKWALK=y
CONFIG_STACKDEPOT=y
CONFIG_STACK_HASH_ORDER=20
CONFIG_SBITMAP=y
# end of Library routines

#
# Kernel hacking
#

#
# printk and dmesg options
#
# CONFIG_PRINTK_TIME is not set
# CONFIG_PRINTK_CALLER is not set
# CONFIG_STACKTRACE_BUILD_ID is not set
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DYNAMIC_DEBUG_CORE is not set
# CONFIG_SYMBOLIC_ERRNAME is not set
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options

# CONFIG_DEBUG_KERNEL is not set

#
# Compile-time checks and compiler options
#
CONFIG_FRAME_WARN=1024
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_HEADERS_INSTALL is not set
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_FRAME_POINTER=y
# end of Compile-time checks and compiler options

#
# Generic Kernel Debugging Instruments
#
# CONFIG_MAGIC_SYSRQ is not set
# CONFIG_DEBUG_FS is not set
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN is not set
CONFIG_HAVE_KCSAN_COMPILER=y
# end of Generic Kernel Debugging Instruments

#
# Networking Debugging
#
# end of Networking Debugging

#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
CONFIG_SLUB_DEBUG=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_ARCH_HAS_DEBUG_WX=y
# CONFIG_DEBUG_WX is not set
CONFIG_GENERIC_PTDUMP=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
# CONFIG_DEBUG_VM_PGTABLE is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
CONFIG_HAVE_ARCH_KFENCE=y
# CONFIG_KFENCE is not set
# end of Memory Debugging

#
# Debug Oops, Lockups and Hangs
#
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
# end of Debug Oops, Lockups and Hangs

#
# Scheduler Debugging
#
# end of Scheduler Debugging

# CONFIG_DEBUG_TIMEKEEPING is not set

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
# CONFIG_WW_MUTEX_SELFTEST is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)

# CONFIG_DEBUG_IRQFLAGS is not set
CONFIG_STACKTRACE=y
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set

#
# Debug kernel data structures
#
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
# end of Debug kernel data structures

#
# RCU Debugging
#
# end of RCU Debugging

CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_HAVE_RETHOOK=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_HAVE_BUILDTIME_MCOUNT_SORT=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set
# CONFIG_SAMPLES is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y

#
# x86 Debugging
#
CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
# CONFIG_X86_VERBOSE_BOOTUP is not set
CONFIG_EARLY_PRINTK=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_UNWINDER_FRAME_POINTER=y
# end of x86 Debugging

#
# Kernel Testing and Coverage
#
# CONFIG_KUNIT is not set
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_ARCH_USE_MEMTEST=y
# CONFIG_MEMTEST is not set
# end of Kernel Testing and Coverage
# end of Kernel hacking

[-- Attachment #3: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RESEND PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (6 preceding siblings ...)
  2022-06-28  3:16 ` [PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
@ 2022-07-15 15:13 ` Shuai Xue
  2022-07-15 15:13 ` [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-15 15:13 UTC (permalink / raw)
  To: Jonathan.Cameron, linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
which custom-built by Alibaba Group's chip development business, T-Head.

Changes since v1:
- add high level workflow about DDRC so that user cloud better understand the
  PMU hardware mechanism
- rewrite patch description and add interrupt sharing constraints
- delete event perf prefix
- add a condition to fix bug in ali_drw_pmu_isr
- perfer CPU in the same Node when migrating irq
- use FIELD_PREP and FIELD_GET to make code more readable
- add T-Head HID and leave ARMHD700 as CID for compatibility
- Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/

Shuai Xue (3):
  docs: perf: Add description for Alibaba's T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
    SoC
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver

 .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
 Documentation/admin-guide/perf/index.rst      |   1 +
 MAINTAINERS                                   |   8 +
 drivers/perf/Kconfig                          |   8 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c         | 793 ++++++++++++++++++
 6 files changed, 911 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (7 preceding siblings ...)
  2022-07-15 15:13 ` [RESEND PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-07-15 15:13 ` Shuai Xue
  2022-07-19 12:35   ` Jonathan Cameron
  2022-07-15 15:13 ` [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 41+ messages in thread
From: Shuai Xue @ 2022-07-15 15:13 UTC (permalink / raw)
  To: Jonathan.Cameron, linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Alibaba's T-Head SoC implements uncore PMU for performance and functional
debugging to facilitate system maintenance. Document it to provide guidance
on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 .../admin-guide/perf/alibaba_pmu.rst          | 100 ++++++++++++++++++
 Documentation/admin-guide/perf/index.rst      |   1 +
 2 files changed, 101 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst

diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
new file mode 100644
index 000000000000..11de998bb480
--- /dev/null
+++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
@@ -0,0 +1,100 @@
+=============================================================
+Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
+=============================================================
+
+The Yitian 710, custom-built by Alibaba Group's chip development business,
+T-Head, implements uncore PMU for performance and functional debugging to
+facilitate system maintenance.
+
+DDR Sub-System Driveway (DRW) PMU Driver
+=========================================
+
+Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
+is independent of others to service system memory requests. And one DDR5
+channel is split into two independent sub-channels. The DDR Sub-System Driveway
+implements separate PMUs for each sub-channel to monitor various performance
+metrics.
+
+The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
+For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
+sub-channels of the same channel in die 0. And the PMU device of die 1 is
+prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
+
+Each sub-channel has 36 PMU counters in total, which is classified into
+four groups:
+
+- Group 0: PMU Cycle Counter. This group has one pair of counters
+  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
+  based on DDRC core clock.
+
+- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
+  to count the total access number of either the eight bank groups in a
+  selected rank, or four ranks separately in the first 4 counters. The base
+  transfer unit is 64B.
+
+- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
+  count the total retry number of each type of uncorrectable error.
+
+- Group 3: PMU Common Counters. This group has 16 counters, that are used
+  to count the common events.
+
+For now, the Driveway PMU driver only uses counters in group 0 and group 3.
+
+The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
+for connecting an SoC application bus to DDR memory devices. The DDRCTL
+receives transactions Host Interface (HIF) which is custom-defined by Synopsys.
+These transactions are queued internally and scheduled for access while
+satisfying the SDRAM protocol timing requirements, transaction priorities, and
+dependencies between the transactions. The DDRCTL in turn issues commands on
+the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
+to and from the SDRAM. The driveway PMUs have hardware logic to gather
+statistics and performance logging signals on HIF, DFI, etc.
+
+By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
+interface, we could calculate the bandwidth. Example usage of counting memory
+data bandwidth::
+
+  perf stat \
+    -e ali_drw_21000/hif_wr/ \
+    -e ali_drw_21000/hif_rd/ \
+    -e ali_drw_21000/hif_rmw/ \
+    -e ali_drw_21000/cycle/ \
+    -e ali_drw_21080/hif_wr/ \
+    -e ali_drw_21080/hif_rd/ \
+    -e ali_drw_21080/hif_rmw/ \
+    -e ali_drw_21080/cycle/ \
+    -e ali_drw_23000/hif_wr/ \
+    -e ali_drw_23000/hif_rd/ \
+    -e ali_drw_23000/hif_rmw/ \
+    -e ali_drw_23000/cycle/ \
+    -e ali_drw_23080/hif_wr/ \
+    -e ali_drw_23080/hif_rd/ \
+    -e ali_drw_23080/hif_rmw/ \
+    -e ali_drw_23080/cycle/ \
+    -e ali_drw_25000/hif_wr/ \
+    -e ali_drw_25000/hif_rd/ \
+    -e ali_drw_25000/hif_rmw/ \
+    -e ali_drw_25000/cycle/ \
+    -e ali_drw_25080/hif_wr/ \
+    -e ali_drw_25080/hif_rd/ \
+    -e ali_drw_25080/hif_rmw/ \
+    -e ali_drw_25080/cycle/ \
+    -e ali_drw_27000/hif_wr/ \
+    -e ali_drw_27000/hif_rd/ \
+    -e ali_drw_27000/hif_rmw/ \
+    -e ali_drw_27000/cycle/ \
+    -e ali_drw_27080/hif_wr/ \
+    -e ali_drw_27080/hif_rd/ \
+    -e ali_drw_27080/hif_rmw/ \
+    -e ali_drw_27080/cycle/ -- sleep 10
+
+The average DRAM bandwidth can be calculated as follows:
+
+- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+
+Here, DDRC_WIDTH = 64 bytes.
+
+The current driver does not support sampling. So "perf record" is
+unsupported.  Also attach to a task is unsupported as the events are all
+uncore.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 69b23f087c05..823db08863db 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -17,3 +17,4 @@ Performance monitor support
    xgene-pmu
    arm_dsu_pmu
    thunderx2-pmu
+   alibaba_pmu
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (8 preceding siblings ...)
  2022-07-15 15:13 ` [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-07-15 15:13 ` Shuai Xue
  2022-07-15 19:05   ` Randy Dunlap
  2022-07-19 13:19   ` Jonathan Cameron
  2022-07-15 15:13 ` [RESEND PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
                   ` (12 subsequent siblings)
  22 siblings, 2 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-15 15:13 UTC (permalink / raw)
  To: Jonathan.Cameron, linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
DRAM and targets cloud computing and HPC.

Each PMU is registered as a device in /sys/bus/event_source/devices, and
users can select event to monitor in each sub-channel, independently. For
example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
sub-channels of the same channel in die 0. And the PMU device of die 1 is
prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.

Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
MPAM drivers successfully, add IRQF_SHARED flag.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
---
 drivers/perf/Kconfig                  |   8 +
 drivers/perf/Makefile                 |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++
 3 files changed, 802 insertions(+)
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 1e2d69453771..dfafba4cb066 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU
 	  Provides support for the non-architectural CPU PMUs present on
 	  the Apple M1 SoCs and derivatives.
 
+config ALIBABA_UNCORE_DRW_PMU
+	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
+	depends on (ARM64 && ACPI)
+	default m
+	help
+	  Support for Driveway PMU events monitoring on Yitian 710 DDR
+	  Sub-system.
+
 source "drivers/perf/hisilicon/Kconfig"
 
 config MARVELL_CN10K_DDR_PMU
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 57a279c61df5..050d04ee19dd 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
+obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
new file mode 100644
index 000000000000..4022bc50ffae
--- /dev/null
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -0,0 +1,793 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Alibaba DDR Sub-System Driveway PMU driver
+ *
+ * Copyright (C) 2022 Alibaba Inc
+ */
+
+#define ALI_DRW_PMUNAME		"ali_drw"
+#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
+#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
+
+#include <linux/acpi.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
+#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
+
+#define ALI_DRW_PMU_PA_SHIFT			12
+#define ALI_DRW_PMU_CNT_INIT			0x00000000
+#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
+#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
+
+#define ALI_DRW_PMU_CNT_CTRL			0xC00
+#define ALI_DRW_PMU_CNT_RST			BIT(2)
+#define ALI_DRW_PMU_CNT_STOP			BIT(1)
+#define ALI_DRW_PMU_CNT_START			BIT(0)
+
+#define ALI_DRW_PMU_CNT_STATE			0xC04
+#define ALI_DRW_PMU_TEST_CTRL			0xC08
+#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
+
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
+#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
+
+/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_EVENT_SEL0			0xC68
+/* counter 0-3 use sel0, counter 4-7 use sel1...*/
+#define ALI_DRW_PMU_EVENT_SELn(n) \
+	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
+#define ALI_DRW_PMCOM_CNT_EN			BIT(7)
+#define ALI_DRW_PMCOM_CNT_EVENT_MASK		GENMASK(5, 0)
+#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
+	(8 * (n % 4))
+
+/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
+#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
+	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
+
+#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
+#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
+#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
+#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
+#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
+#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
+#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
+#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
+
+static int ali_drw_cpuhp_state_num;
+
+static LIST_HEAD(ali_drw_pmu_irqs);
+static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);
+
+struct ali_drw_pmu_irq {
+	struct hlist_node node;
+	struct list_head irqs_node;
+	struct list_head pmus_node;
+	int irq_num;
+	int cpu;
+	refcount_t refcount;
+};
+
+struct ali_drw_pmu {
+	void __iomem *cfg_base;
+	struct device *dev;
+
+	struct list_head pmus_node;
+	struct ali_drw_pmu_irq *irq;
+	int irq_num;
+	int cpu;
+	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);
+	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+
+	struct pmu pmu;
+};
+
+#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu))
+
+#define DRW_CONFIG_EVENTID		GENMASK(7, 0)
+#define GET_DRW_EVENTID(event)	FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr.config)
+
+static ssize_t ali_drw_pmu_format_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+static ssize_t ali_drw_pmu_event_show(struct device *dev,
+			       struct device_attribute *attr, char *page)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+#define ALI_DRW_PMU_ATTR(_name, _func, _config)                            \
+		(&((struct dev_ext_attribute[]) {                               \
+				{ __ATTR(_name, 0444, _func, NULL), (void *)_config }   \
+		})[0].attr.attr)
+
+#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config)            \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config)
+#define ALI_DRW_PMU_EVENT_ATTR(_name, _config)             \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config)
+
+static struct attribute *ali_drw_pmu_events_attrs[] = {
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr,			0x0),
+	ALI_DRW_PMU_EVENT_ATTR(hif_wr,				0x1),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd,				0x2),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rmw,				0x3),
+	ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd,			0x4),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles,		0x7),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles,		0x8),
+	ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical,		0x9),
+	ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical,		0xA),
+	ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical,		0xB),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_activate,			0xC),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr,			0xD),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate,		0xE),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd,			0xF),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_wr,			0x10),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_mwr,			0x11),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_precharge,			0x12),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr,		0x13),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_other,		0x14),
+	ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions,		0x15),
+	ALI_DRW_PMU_EVENT_ATTR(write_combine,			0x16),
+	ALI_DRW_PMU_EVENT_ATTR(war_hazard,			0x17),
+	ALI_DRW_PMU_EVENT_ATTR(raw_hazard,			0x18),
+	ALI_DRW_PMU_EVENT_ATTR(waw_hazard,			0x19),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0,		0x1A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1,		0x1B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2,		0x1C),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3,		0x1D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0,	0x1E),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1,	0x1F),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2,	0x20),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3,	0x21),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0,		0x26),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1,		0x27),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2,		0x28),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3,		0x29),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_refresh,			0x2A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref,			0x2B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode,			0x2D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl,			0x2E),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc,		0x34),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr,		0x35),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr,			0x36),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart,			0x37),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch,			0x38),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txreq,			0x39),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txdat,			0x3A),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxdat,			0x3B),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp,			0x3C),
+	ALI_DRW_PMU_EVENT_ATTR(tsz_vio,				0x3D),
+	ALI_DRW_PMU_EVENT_ATTR(cycle,				0x80),
+	NULL,
+};
+
+static struct attribute_group ali_drw_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = ali_drw_pmu_events_attrs,
+};
+
+static struct attribute *ali_drw_pmu_format_attr[] = {
+	ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_format_group = {
+	.name = "format",
+	.attrs = ali_drw_pmu_format_attr,
+};
+
+static ssize_t ali_drw_pmu_cpumask_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu));
+}
+
+static struct device_attribute ali_drw_pmu_cpumask_attr =
+		__ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL);
+
+static struct attribute *ali_drw_pmu_cpumask_attrs[] = {
+	&ali_drw_pmu_cpumask_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_cpumask_attr_group = {
+	.attrs = ali_drw_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *ali_drw_pmu_attr_groups[] = {
+	&ali_drw_pmu_events_attr_group,
+	&ali_drw_pmu_cpumask_attr_group,
+	&ali_drw_pmu_format_group,
+	NULL,
+};
+
+/* find a counter for event, then in add func, hw.idx will equal to counter */
+static int ali_drw_get_counter_idx(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int idx;
+
+	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
+		if (!test_and_set_bit(idx, drw_pmu->used_mask))
+			return idx;
+	}
+
+	/* The counters are all in use. */
+	return -EBUSY;
+}
+
+static u64 ali_drw_pmu_read_counter(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	u64 cycle_high, cycle_low;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		cycle_high = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH);
+		cycle_high &= ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK;
+		cycle_low = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW);
+		cycle_low &= ALI_DRW_PMU_CYCLE_CNT_LOW_MASK;
+		return (cycle_high << 32 | cycle_low);
+	}
+
+	return readl(drw_pmu->cfg_base +
+		     ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx));
+}
+
+static void ali_drw_pmu_event_update(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 delta, prev, now;
+
+	do {
+		prev = local64_read(&hwc->prev_count);
+		now = ali_drw_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
+
+	/* handle overflow. */
+	delta = now - prev;
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID)
+		delta &= ALI_DRW_PMU_OV_INTR_MASK;
+	else
+		delta &= ALI_DRW_CNT_MAX_PERIOD;
+	local64_add(delta, &event->count);
+}
+
+static void ali_drw_pmu_event_set_period(struct perf_event *event)
+{
+	u64 pre_val;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	/* set a preload counter for test purpose */
+	writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+
+	/* set conunter initial value */
+	pre_val = ALI_DRW_PMU_CNT_INIT;
+	writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	local64_set(&event->hw.prev_count, pre_val);
+
+	/* set sel mode to zero to start test */
+	writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+}
+
+static void ali_drw_pmu_enable_counter(struct perf_event *event)
+{
+	u32 val, subval, reg, shift;
+	int counter = event->hw.idx;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static void ali_drw_pmu_disable_counter(struct perf_event *event)
+{
+	u32 val, reg, subval, shift;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int counter = event->hw.idx;
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
+{
+	struct ali_drw_pmu_irq *irq = data;
+	struct ali_drw_pmu *drw_pmu;
+	irqreturn_t ret = IRQ_NONE;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {
+		unsigned long status, clr_status;
+		struct perf_event *event;
+		unsigned int idx;
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			ali_drw_pmu_disable_counter(event);
+		}
+
+		/* common counter intr status */
+		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
+		status = FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status);
+		if (status) {
+			for_each_set_bit(idx, &status,
+					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+				event = drw_pmu->events[idx];
+				if (WARN_ON_ONCE(!event))
+					continue;
+				ali_drw_pmu_event_update(event);
+				ali_drw_pmu_event_set_period(event);
+			}
+
+			/* clear common counter intr status */
+			clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1);
+			writel(clr_status,
+			       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+		}
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			if (!(event->hw.state & PERF_HES_STOPPED))
+				ali_drw_pmu_enable_counter(event);
+		}
+		if (status)
+			ret = IRQ_HANDLED;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
+						      *pdev, int irq_num)
+{
+	int ret;
+	struct ali_drw_pmu_irq *irq;
+
+	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
+		if (irq->irq_num == irq_num
+		    && refcount_inc_not_zero(&irq->refcount))
+			return irq;
+	}
+
+	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
+	if (!irq)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&irq->pmus_node);
+
+	/* Pick one CPU to be the preferred one to use */
+	irq->cpu = smp_processor_id();
+	refcount_set(&irq->refcount, 1);
+
+	/*
+	 * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same
+	 * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
+	 * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not
+	 * share with other component.
+	 */
+	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
+			       IRQF_SHARED, dev_name(&pdev->dev), irq);
+	if (ret < 0) {
+		dev_err(&pdev->dev,
+			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
+		goto out_free;
+	}
+
+	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
+	if (ret)
+		goto out_free;
+
+	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num,
+					     &irq->node);
+	if (ret)
+		goto out_free;
+
+	irq->irq_num = irq_num;
+	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
+
+	return irq;
+
+out_free:
+	kfree(irq);
+	return ERR_PTR(ret);
+}
+
+static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
+				struct platform_device *pdev)
+{
+	int irq_num;
+	struct ali_drw_pmu_irq *irq;
+
+	/* Read and init IRQ */
+	irq_num = platform_get_irq(pdev, 0);
+	if (irq_num < 0)
+		return irq_num;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	if (IS_ERR(irq))
+		return PTR_ERR(irq);
+
+	drw_pmu->irq = irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	return 0;
+}
+
+static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
+{
+	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_del_rcu(&drw_pmu->pmus_node);
+
+	if (!refcount_dec_and_test(&irq->refcount)) {
+		mutex_unlock(&ali_drw_pmu_irqs_lock);
+		return;
+	}
+
+	list_del(&irq->irqs_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
+	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
+					    &irq->node);
+	kfree(irq);
+}
+
+static int ali_drw_pmu_event_init(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event *sibling;
+	struct device *dev = drw_pmu->pmu.dev;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (is_sampling_event(event)) {
+		dev_err(dev, "Sampling not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->attach_state & PERF_ATTACH_TASK) {
+		dev_err(dev, "Per-task counter cannot allocate!\n");
+		return -EOPNOTSUPP;
+	}
+
+	event->cpu = drw_pmu->cpu;
+	if (event->cpu < 0) {
+		dev_err(dev, "Per-task mode not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->group_leader != event &&
+	    !is_software_event(event->group_leader)) {
+		dev_err(dev, "driveway only allow one event!\n");
+		return -EINVAL;
+	}
+
+	for_each_sibling_event(sibling, event->group_leader) {
+		if (sibling != event && !is_software_event(sibling)) {
+			dev_err(dev, "driveway event not allowed!\n");
+			return -EINVAL;
+		}
+	}
+
+	/* reset all the pmu counters */
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	hwc->idx = -1;
+
+	return 0;
+}
+
+static void ali_drw_pmu_start(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	event->hw.state = 0;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		writel(ALI_DRW_PMU_CNT_START,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+		return;
+	}
+
+	ali_drw_pmu_event_set_period(event);
+	if (flags & PERF_EF_RELOAD) {
+		unsigned long prev_raw_count =
+		    local64_read(&event->hw.prev_count);
+		writel(prev_raw_count,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	}
+
+	ali_drw_pmu_enable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+}
+
+static void ali_drw_pmu_stop(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	if (event->hw.state & PERF_HES_STOPPED)
+		return;
+
+	if (GET_DRW_EVENTID(event) != ALI_DRW_PMU_CYCLE_EVT_ID)
+		ali_drw_pmu_disable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	ali_drw_pmu_event_update(event);
+	event->hw.state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int ali_drw_pmu_add(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = -1;
+	int evtid;
+
+	evtid = GET_DRW_EVENTID(event);
+
+	if (evtid != ALI_DRW_PMU_CYCLE_EVT_ID) {
+		idx = ali_drw_get_counter_idx(event);
+		if (idx < 0)
+			return idx;
+		drw_pmu->events[idx] = event;
+		drw_pmu->evtids[idx] = evtid;
+	}
+	hwc->idx = idx;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		ali_drw_pmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate our changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void ali_drw_pmu_del(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	ali_drw_pmu_stop(event, PERF_EF_UPDATE);
+
+	if (idx >= 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+		drw_pmu->events[idx] = NULL;
+		drw_pmu->evtids[idx] = 0;
+		clear_bit(idx, drw_pmu->used_mask);
+	}
+
+	perf_event_update_userpage(event);
+}
+
+static void ali_drw_pmu_read(struct perf_event *event)
+{
+	ali_drw_pmu_event_update(event);
+}
+
+static int ali_drw_pmu_probe(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu;
+	struct resource *res;
+	char *name;
+	int ret;
+
+	drw_pmu = devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL);
+	if (!drw_pmu)
+		return -ENOMEM;
+
+	drw_pmu->dev = &pdev->dev;
+	platform_set_drvdata(pdev, drw_pmu);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	drw_pmu->cfg_base = devm_ioremap_resource(&pdev->dev, res);
+	if (!drw_pmu->cfg_base)
+		return -ENOMEM;
+
+	name = devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx",
+			      (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT));
+	if (!name)
+		return -ENOMEM;
+
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	/* enable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL);
+
+	/* clearing interrupt status */
+	writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+
+	drw_pmu->cpu = smp_processor_id();
+
+	ret = ali_drw_pmu_init_irq(drw_pmu, pdev);
+	if (ret)
+		return ret;
+
+	drw_pmu->pmu = (struct pmu) {
+		.module		= THIS_MODULE,
+		.task_ctx_nr	= perf_invalid_context,
+		.event_init	= ali_drw_pmu_event_init,
+		.add		= ali_drw_pmu_add,
+		.del		= ali_drw_pmu_del,
+		.start		= ali_drw_pmu_start,
+		.stop		= ali_drw_pmu_stop,
+		.read		= ali_drw_pmu_read,
+		.attr_groups	= ali_drw_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
+	};
+
+	ret = perf_pmu_register(&drw_pmu->pmu, name, -1);
+	if (ret) {
+		dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n");
+		ali_drw_pmu_uninit_irq(drw_pmu);
+	}
+
+	return ret;
+}
+
+static int ali_drw_pmu_remove(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu = platform_get_drvdata(pdev);
+
+	/* disable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL);
+
+	ali_drw_pmu_uninit_irq(drw_pmu);
+	perf_pmu_unregister(&drw_pmu->pmu);
+
+	return 0;
+}
+
+static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct ali_drw_pmu_irq *irq;
+	struct ali_drw_pmu *drw_pmu;
+	unsigned int target;
+	int ret;
+	cpumask_t node_online_cpus;
+
+	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
+	if (cpu != irq->cpu)
+		return 0;
+
+	ret = cpumask_and(&node_online_cpus,
+			  cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
+	if (ret)
+		target = cpumask_any_but(&node_online_cpus, cpu);
+	else
+		target = cpumask_any_but(cpu_online_mask, cpu);
+
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	/* We're only reading, but this isn't the place to be involving RCU */
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
+		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
+	irq->cpu = target;
+
+	return 0;
+}
+
+/*
+ * Due to historical reasons, the HID used in the production environment is
+ * ARMHD700, so we leave ARMHD700 as Compatible ID.
+ */
+static const struct acpi_device_id ali_drw_acpi_match[] = {
+	{"BABA5000", 0},
+	{"ARMHD700", 0},
+	{}
+};
+
+MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
+
+static struct platform_driver ali_drw_pmu_driver = {
+	.driver = {
+		   .name = "ali_drw_pmu",
+		   .acpi_match_table = ACPI_PTR(ali_drw_acpi_match),
+		   },
+	.probe = ali_drw_pmu_probe,
+	.remove = ali_drw_pmu_remove,
+};
+
+static int __init ali_drw_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "ali_drw_pmu:online",
+				      NULL, ali_drw_pmu_offline_cpu);
+
+	if (ret < 0) {
+		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
+		       ret);
+		return ret;
+	}
+	ali_drw_cpuhp_state_num = ret;
+
+	ret = platform_driver_register(&ali_drw_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+
+	return ret;
+}
+
+static void __exit ali_drw_pmu_exit(void)
+{
+	platform_driver_unregister(&ali_drw_pmu_driver);
+	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+}
+
+module_init(ali_drw_pmu_init);
+module_exit(ali_drw_pmu_exit);
+
+MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
+MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
+MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
+MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver");
+MODULE_LICENSE("GPL v2");
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [RESEND PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (9 preceding siblings ...)
  2022-07-15 15:13 ` [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-07-15 15:13 ` Shuai Xue
  2022-07-20  6:58 ` [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-15 15:13 UTC (permalink / raw)
  To: Jonathan.Cameron, linux-arm-kernel, linux-kernel, will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Add maintainers for Alibaba PMU document and driver

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 55114e73de26..ccddb59c253c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -733,6 +733,14 @@ S:	Maintained
 F:	Documentation/i2c/busses/i2c-ali1563.rst
 F:	drivers/i2c/busses/i2c-ali1563.c
 
+ALIBABA PMU DRIVER
+M:	Shuai Xue <xueshuai@linux.alibaba.com>
+M:	Hongbo Yao <yaohongbo@linux.alibaba.com>
+M:	Neng Chen <nengchen@linux.alibaba.com>
+S:	Supported
+F:	Documentation/admin-guide/perf/alibaba_pmu.rst
+F:	drivers/perf/alibaba_uncore_dwr_pmu.c
+
 ALIENWARE WMI DRIVER
 L:	Dell.Client.Kernel@dell.com
 S:	Maintained
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-07-15 15:13 ` [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-07-15 19:05   ` Randy Dunlap
  2022-07-17 13:02     ` Shuai Xue
  2022-07-19 13:19   ` Jonathan Cameron
  1 sibling, 1 reply; 41+ messages in thread
From: Randy Dunlap @ 2022-07-15 19:05 UTC (permalink / raw)
  To: Shuai Xue, Jonathan.Cameron, linux-arm-kernel, linux-kernel,
	will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song

Hi--

On 7/15/22 08:13, Shuai Xue wrote:
> Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
> support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
> DRAM and targets cloud computing and HPC.
> 
> Each PMU is registered as a device in /sys/bus/event_source/devices, and
> users can select event to monitor in each sub-channel, independently. For
> example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
> sub-channels of the same channel in die 0. And the PMU device of die 1 is
> prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
> 
> Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
> shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
> MPAM drivers successfully, add IRQF_SHARED flag.
> 
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
> Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
> Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
> Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
> ---
>  drivers/perf/Kconfig                  |   8 +
>  drivers/perf/Makefile                 |   1 +
>  drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++
>  3 files changed, 802 insertions(+)
>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
> 
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index 1e2d69453771..dfafba4cb066 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU
>  	  Provides support for the non-architectural CPU PMUs present on
>  	  the Apple M1 SoCs and derivatives.
>  
> +config ALIBABA_UNCORE_DRW_PMU
> +	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
> +	depends on (ARM64 && ACPI)
> +	default m

Why should this driver be automatically built?
I.e., why is the "default m" here at all?

Thanks.

> +	help
> +	  Support for Driveway PMU events monitoring on Yitian 710 DDR
> +	  Sub-system.


-- 
~Randy

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-07-15 19:05   ` Randy Dunlap
@ 2022-07-17 13:02     ` Shuai Xue
  0 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-17 13:02 UTC (permalink / raw)
  To: Randy Dunlap, Jonathan.Cameron, linux-arm-kernel, linux-kernel,
	will, mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song

Hi, Randy,

Thank you for your quick reply.

在 2022/7/16 AM3:05, Randy Dunlap 写道:
> Hi--
>
> On 7/15/22 08:13, Shuai Xue wrote:
>> Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
>> support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
>> DRAM and targets cloud computing and HPC.
>>
>> Each PMU is registered as a device in /sys/bus/event_source/devices, and
>> users can select event to monitor in each sub-channel, independently. For
>> example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
>> sub-channels of the same channel in die 0. And the PMU device of die 1 is
>> prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
>>
>> Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
>> shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
>> MPAM drivers successfully, add IRQF_SHARED flag.
>>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
>> Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
>> Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
>> Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
>> ---
>>  drivers/perf/Kconfig                  |   8 +
>>  drivers/perf/Makefile                 |   1 +
>>  drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++
>>  3 files changed, 802 insertions(+)
>>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
>>
>> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
>> index 1e2d69453771..dfafba4cb066 100644
>> --- a/drivers/perf/Kconfig
>> +++ b/drivers/perf/Kconfig
>> @@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU
>>  	  Provides support for the non-architectural CPU PMUs present on
>>  	  the Apple M1 SoCs and derivatives.
>>  
>> +config ALIBABA_UNCORE_DRW_PMU
>> +	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
>> +	depends on (ARM64 && ACPI)
>> +	default m
> 
> Why should this driver be automatically built?
> I.e., why is the "default m" here at all?

Sorry, we automatically build this driver internally. I forgot to delete it.
I will delete "default m" in next version.

Best regards,
Shuai

> 
> Thanks.
> 
>> +	help
>> +	  Support for Driveway PMU events monitoring on Yitian 710 DDR
>> +	  Sub-system.
> 
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-07-15 15:13 ` [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-07-19 12:35   ` Jonathan Cameron
  2022-07-20  1:41     ` Shuai Xue
  0 siblings, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-07-19 12:35 UTC (permalink / raw)
  To: Shuai Xue
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song

On Fri, 15 Jul 2022 23:13:08 +0800
Shuai Xue <xueshuai@linux.alibaba.com> wrote:

> Alibaba's T-Head SoC implements uncore PMU for performance and functional
> debugging to facilitate system maintenance. Document it to provide guidance
> on how to use it.
> 
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
I'm far from an expert on this, but looks good to me.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  .../admin-guide/perf/alibaba_pmu.rst          | 100 ++++++++++++++++++
>  Documentation/admin-guide/perf/index.rst      |   1 +
>  2 files changed, 101 insertions(+)
>  create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
> 
> diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
> new file mode 100644
> index 000000000000..11de998bb480
> --- /dev/null
> +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
> @@ -0,0 +1,100 @@
> +=============================================================
> +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
> +=============================================================
> +
> +The Yitian 710, custom-built by Alibaba Group's chip development business,
> +T-Head, implements uncore PMU for performance and functional debugging to
> +facilitate system maintenance.
> +
> +DDR Sub-System Driveway (DRW) PMU Driver
> +=========================================
> +
> +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
> +is independent of others to service system memory requests. And one DDR5
> +channel is split into two independent sub-channels. The DDR Sub-System Driveway
> +implements separate PMUs for each sub-channel to monitor various performance
> +metrics.
> +
> +The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
> +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
> +sub-channels of the same channel in die 0. And the PMU device of die 1 is
> +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
> +
> +Each sub-channel has 36 PMU counters in total, which is classified into
> +four groups:
> +
> +- Group 0: PMU Cycle Counter. This group has one pair of counters
> +  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
> +  based on DDRC core clock.
> +
> +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
> +  to count the total access number of either the eight bank groups in a
> +  selected rank, or four ranks separately in the first 4 counters. The base
> +  transfer unit is 64B.
> +
> +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
> +  count the total retry number of each type of uncorrectable error.
> +
> +- Group 3: PMU Common Counters. This group has 16 counters, that are used
> +  to count the common events.
> +
> +For now, the Driveway PMU driver only uses counters in group 0 and group 3.
> +
> +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
> +for connecting an SoC application bus to DDR memory devices. The DDRCTL
> +receives transactions Host Interface (HIF) which is custom-defined by Synopsys.
> +These transactions are queued internally and scheduled for access while
> +satisfying the SDRAM protocol timing requirements, transaction priorities, and
> +dependencies between the transactions. The DDRCTL in turn issues commands on
> +the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
> +to and from the SDRAM. The driveway PMUs have hardware logic to gather
> +statistics and performance logging signals on HIF, DFI, etc.
> +
> +By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
> +interface, we could calculate the bandwidth. Example usage of counting memory
> +data bandwidth::
> +
> +  perf stat \
> +    -e ali_drw_21000/hif_wr/ \
> +    -e ali_drw_21000/hif_rd/ \
> +    -e ali_drw_21000/hif_rmw/ \
> +    -e ali_drw_21000/cycle/ \
> +    -e ali_drw_21080/hif_wr/ \
> +    -e ali_drw_21080/hif_rd/ \
> +    -e ali_drw_21080/hif_rmw/ \
> +    -e ali_drw_21080/cycle/ \
> +    -e ali_drw_23000/hif_wr/ \
> +    -e ali_drw_23000/hif_rd/ \
> +    -e ali_drw_23000/hif_rmw/ \
> +    -e ali_drw_23000/cycle/ \
> +    -e ali_drw_23080/hif_wr/ \
> +    -e ali_drw_23080/hif_rd/ \
> +    -e ali_drw_23080/hif_rmw/ \
> +    -e ali_drw_23080/cycle/ \
> +    -e ali_drw_25000/hif_wr/ \
> +    -e ali_drw_25000/hif_rd/ \
> +    -e ali_drw_25000/hif_rmw/ \
> +    -e ali_drw_25000/cycle/ \
> +    -e ali_drw_25080/hif_wr/ \
> +    -e ali_drw_25080/hif_rd/ \
> +    -e ali_drw_25080/hif_rmw/ \
> +    -e ali_drw_25080/cycle/ \
> +    -e ali_drw_27000/hif_wr/ \
> +    -e ali_drw_27000/hif_rd/ \
> +    -e ali_drw_27000/hif_rmw/ \
> +    -e ali_drw_27000/cycle/ \
> +    -e ali_drw_27080/hif_wr/ \
> +    -e ali_drw_27080/hif_rd/ \
> +    -e ali_drw_27080/hif_rmw/ \
> +    -e ali_drw_27080/cycle/ -- sleep 10
> +
> +The average DRAM bandwidth can be calculated as follows:
> +
> +- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
> +- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
> +
> +Here, DDRC_WIDTH = 64 bytes.
> +
> +The current driver does not support sampling. So "perf record" is
> +unsupported.  Also attach to a task is unsupported as the events are all
> +uncore.
> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
> index 69b23f087c05..823db08863db 100644
> --- a/Documentation/admin-guide/perf/index.rst
> +++ b/Documentation/admin-guide/perf/index.rst
> @@ -17,3 +17,4 @@ Performance monitor support
>     xgene-pmu
>     arm_dsu_pmu
>     thunderx2-pmu
> +   alibaba_pmu


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-07-15 15:13 ` [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
  2022-07-15 19:05   ` Randy Dunlap
@ 2022-07-19 13:19   ` Jonathan Cameron
  2022-07-20  6:11     ` Shuai Xue
  1 sibling, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-07-19 13:19 UTC (permalink / raw)
  To: Shuai Xue
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song

On Fri, 15 Jul 2022 23:13:09 +0800
Shuai Xue <xueshuai@linux.alibaba.com> wrote:

> Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
> support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
> DRAM and targets cloud computing and HPC.
> 
> Each PMU is registered as a device in /sys/bus/event_source/devices, and
> users can select event to monitor in each sub-channel, independently. For
> example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
> sub-channels of the same channel in die 0. And the PMU device of die 1 is
> prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
> 
> Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
> shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
> MPAM drivers successfully, add IRQF_SHARED flag.
> 
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
> Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
> Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
> Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>

Hi.

I'd like to see the build constraints relaxed so this gets much more build
coverage from the automated systems.  COMPILE_TEST can keep the happy
balance of not showing the driver where it isn't supported, but still ensuring
it gets hit on lots of random config builds.

Only other significant thing is you are relying on a lot of 'implicit' includes.
Don't do that as it just makes it harder to refactor the core headers (which is
an active topic currently as reducing cross includes helps a lot with kernel build
time). Otherwise LGTM.  So with those addressed.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks,

Jonathan

> ---
>  drivers/perf/Kconfig                  |   8 +
>  drivers/perf/Makefile                 |   1 +
>  drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++
>  3 files changed, 802 insertions(+)
>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
> 
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index 1e2d69453771..dfafba4cb066 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU
>  	  Provides support for the non-architectural CPU PMUs present on
>  	  the Apple M1 SoCs and derivatives.
>  
> +config ALIBABA_UNCORE_DRW_PMU
> +	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
> +	depends on (ARM64 && ACPI)

Try to avoid limiting the ability to build the driver. I'm not seeing anything
in here that needs either ARM64 or ACPI to build.  If you want to keep those
(I wouldn't) then || COMPILE_TEST
That way you'll get a lot more builds / random configs run against your code
and typically see any issues that occur due to changes elsewhere in the tree
faster than you will if you heavily restrict when your driver is built.

> +	default m

You've addressed this in response to Randy's review.

> +	help
> +	  Support for Driveway PMU events monitoring on Yitian 710 DDR
> +	  Sub-system.
> +
>  source "drivers/perf/hisilicon/Kconfig"
>  
>  config MARVELL_CN10K_DDR_PMU
> diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> index 57a279c61df5..050d04ee19dd 100644
> --- a/drivers/perf/Makefile
> +++ b/drivers/perf/Makefile
> @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
>  obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
>  obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
>  obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
> +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
> diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
> new file mode 100644
> index 000000000000..4022bc50ffae
> --- /dev/null
> +++ b/drivers/perf/alibaba_uncore_drw_pmu.c
> @@ -0,0 +1,793 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Alibaba DDR Sub-System Driveway PMU driver
> + *
> + * Copyright (C) 2022 Alibaba Inc
> + */
> +
> +#define ALI_DRW_PMUNAME		"ali_drw"
> +#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
> +#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
> +
> +#include <linux/acpi.h>
> +#include <linux/perf_event.h>
> +#include <linux/platform_device.h>
> +

See below for examples, but you should look closer at the includes to make
sure they are appropriate.  Mostly you should not rely on one linux header
including another, but rather provide everything used in a given file.
There are some very generic headers where it is fine to ignore that...

> +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
> +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
> +
> +#define ALI_DRW_PMU_PA_SHIFT			12
> +#define ALI_DRW_PMU_CNT_INIT			0x00000000
> +#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
> +#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
> +
> +#define ALI_DRW_PMU_CNT_CTRL			0xC00
> +#define ALI_DRW_PMU_CNT_RST			BIT(2)
> +#define ALI_DRW_PMU_CNT_STOP			BIT(1)
> +#define ALI_DRW_PMU_CNT_START			BIT(0)
> +
> +#define ALI_DRW_PMU_CNT_STATE			0xC04
> +#define ALI_DRW_PMU_TEST_CTRL			0xC08
> +#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
> +
> +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
> +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
> +#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
> +#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
> +
> +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
> +#define ALI_DRW_PMU_EVENT_SEL0			0xC68
> +/* counter 0-3 use sel0, counter 4-7 use sel1...*/
> +#define ALI_DRW_PMU_EVENT_SELn(n) \
> +	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
> +#define ALI_DRW_PMCOM_CNT_EN			BIT(7)
> +#define ALI_DRW_PMCOM_CNT_EVENT_MASK		GENMASK(5, 0)
> +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
> +	(8 * (n % 4))
> +
> +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
> +#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
> +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
> +	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
> +
> +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
> +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
> +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
> +#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
> +#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
> +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
> +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
> +#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
> +
> +static int ali_drw_cpuhp_state_num;
> +
> +static LIST_HEAD(ali_drw_pmu_irqs);
> +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);

include linux/mutex.h

> +
> +struct ali_drw_pmu_irq {
> +	struct hlist_node node;
> +	struct list_head irqs_node;
> +	struct list_head pmus_node;
Include appropriate headers for list_head, hlist_node etc directly rather than
relying on possibly unstable includes from within other headers.

> +	int irq_num;
> +	int cpu;
> +	refcount_t refcount;
> +};
> +
> +struct ali_drw_pmu {
> +	void __iomem *cfg_base;
> +	struct device *dev;
> +
> +	struct list_head pmus_node;
> +	struct ali_drw_pmu_irq *irq;
> +	int irq_num;
> +	int cpu;
> +	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);

bitmap.h

> +	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
> +	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
> +
> +	struct pmu pmu;
> +};
> +

> +
> +/* find a counter for event, then in add func, hw.idx will equal to counter */
> +static int ali_drw_get_counter_idx(struct perf_event *event)
> +{
> +	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
> +	int idx;
> +
> +	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
> +		if (!test_and_set_bit(idx, drw_pmu->used_mask))
> +			return idx;
> +	}
> +
> +	/* The counters are all in use. */
> +	return -EBUSY;
> +}
>


> +
> +/*
> + * Due to historical reasons, the HID used in the production environment is
> + * ARMHD700, so we leave ARMHD700 as Compatible ID.
> + */
> +static const struct acpi_device_id ali_drw_acpi_match[] = {

If you build without ACPI support and drop the requirement for it as suggested
above, this will give an unused warning.
Either drop the ACPI_PTR() - that's very rarely worth bothering or
add a __maybe_unused marking to this array.


> +	{"BABA5000", 0},
> +	{"ARMHD700", 0},
> +	{}
> +};
> +
> +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
> +
> +static struct platform_driver ali_drw_pmu_driver = {
> +	.driver = {
> +		   .name = "ali_drw_pmu",
> +		   .acpi_match_table = ACPI_PTR(ali_drw_acpi_match),
> +		   },
> +	.probe = ali_drw_pmu_probe,
> +	.remove = ali_drw_pmu_remove,
> +};
> +


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-07-19 12:35   ` Jonathan Cameron
@ 2022-07-20  1:41     ` Shuai Xue
  0 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-20  1:41 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song


在 2022/7/19 PM8:35, Jonathan Cameron 写道:
> On Fri, 15 Jul 2022 23:13:08 +0800
> Shuai Xue <xueshuai@linux.alibaba.com> wrote:
> 
>> Alibaba's T-Head SoC implements uncore PMU for performance and functional
>> debugging to facilitate system maintenance. Document it to provide guidance
>> on how to use it.
>>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> I'm far from an expert on this, but looks good to me.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks for the review.

Cheers,
Shuai


>> ---
>>  .../admin-guide/perf/alibaba_pmu.rst          | 100 ++++++++++++++++++
>>  Documentation/admin-guide/perf/index.rst      |   1 +
>>  2 files changed, 101 insertions(+)
>>  create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
>>
>> diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
>> new file mode 100644
>> index 000000000000..11de998bb480
>> --- /dev/null
>> +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
>> @@ -0,0 +1,100 @@
>> +=============================================================
>> +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
>> +=============================================================
>> +
>> +The Yitian 710, custom-built by Alibaba Group's chip development business,
>> +T-Head, implements uncore PMU for performance and functional debugging to
>> +facilitate system maintenance.
>> +
>> +DDR Sub-System Driveway (DRW) PMU Driver
>> +=========================================
>> +
>> +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
>> +is independent of others to service system memory requests. And one DDR5
>> +channel is split into two independent sub-channels. The DDR Sub-System Driveway
>> +implements separate PMUs for each sub-channel to monitor various performance
>> +metrics.
>> +
>> +The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
>> +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
>> +sub-channels of the same channel in die 0. And the PMU device of die 1 is
>> +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
>> +
>> +Each sub-channel has 36 PMU counters in total, which is classified into
>> +four groups:
>> +
>> +- Group 0: PMU Cycle Counter. This group has one pair of counters
>> +  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
>> +  based on DDRC core clock.
>> +
>> +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
>> +  to count the total access number of either the eight bank groups in a
>> +  selected rank, or four ranks separately in the first 4 counters. The base
>> +  transfer unit is 64B.
>> +
>> +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
>> +  count the total retry number of each type of uncorrectable error.
>> +
>> +- Group 3: PMU Common Counters. This group has 16 counters, that are used
>> +  to count the common events.
>> +
>> +For now, the Driveway PMU driver only uses counters in group 0 and group 3.
>> +
>> +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
>> +for connecting an SoC application bus to DDR memory devices. The DDRCTL
>> +receives transactions Host Interface (HIF) which is custom-defined by Synopsys.
>> +These transactions are queued internally and scheduled for access while
>> +satisfying the SDRAM protocol timing requirements, transaction priorities, and
>> +dependencies between the transactions. The DDRCTL in turn issues commands on
>> +the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
>> +to and from the SDRAM. The driveway PMUs have hardware logic to gather
>> +statistics and performance logging signals on HIF, DFI, etc.
>> +
>> +By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
>> +interface, we could calculate the bandwidth. Example usage of counting memory
>> +data bandwidth::
>> +
>> +  perf stat \
>> +    -e ali_drw_21000/hif_wr/ \
>> +    -e ali_drw_21000/hif_rd/ \
>> +    -e ali_drw_21000/hif_rmw/ \
>> +    -e ali_drw_21000/cycle/ \
>> +    -e ali_drw_21080/hif_wr/ \
>> +    -e ali_drw_21080/hif_rd/ \
>> +    -e ali_drw_21080/hif_rmw/ \
>> +    -e ali_drw_21080/cycle/ \
>> +    -e ali_drw_23000/hif_wr/ \
>> +    -e ali_drw_23000/hif_rd/ \
>> +    -e ali_drw_23000/hif_rmw/ \
>> +    -e ali_drw_23000/cycle/ \
>> +    -e ali_drw_23080/hif_wr/ \
>> +    -e ali_drw_23080/hif_rd/ \
>> +    -e ali_drw_23080/hif_rmw/ \
>> +    -e ali_drw_23080/cycle/ \
>> +    -e ali_drw_25000/hif_wr/ \
>> +    -e ali_drw_25000/hif_rd/ \
>> +    -e ali_drw_25000/hif_rmw/ \
>> +    -e ali_drw_25000/cycle/ \
>> +    -e ali_drw_25080/hif_wr/ \
>> +    -e ali_drw_25080/hif_rd/ \
>> +    -e ali_drw_25080/hif_rmw/ \
>> +    -e ali_drw_25080/cycle/ \
>> +    -e ali_drw_27000/hif_wr/ \
>> +    -e ali_drw_27000/hif_rd/ \
>> +    -e ali_drw_27000/hif_rmw/ \
>> +    -e ali_drw_27000/cycle/ \
>> +    -e ali_drw_27080/hif_wr/ \
>> +    -e ali_drw_27080/hif_rd/ \
>> +    -e ali_drw_27080/hif_rmw/ \
>> +    -e ali_drw_27080/cycle/ -- sleep 10
>> +
>> +The average DRAM bandwidth can be calculated as follows:
>> +
>> +- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
>> +- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
>> +
>> +Here, DDRC_WIDTH = 64 bytes.
>> +
>> +The current driver does not support sampling. So "perf record" is
>> +unsupported.  Also attach to a task is unsupported as the events are all
>> +uncore.
>> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
>> index 69b23f087c05..823db08863db 100644
>> --- a/Documentation/admin-guide/perf/index.rst
>> +++ b/Documentation/admin-guide/perf/index.rst
>> @@ -17,3 +17,4 @@ Performance monitor support
>>     xgene-pmu
>>     arm_dsu_pmu
>>     thunderx2-pmu
>> +   alibaba_pmu

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-07-19 13:19   ` Jonathan Cameron
@ 2022-07-20  6:11     ` Shuai Xue
  0 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-20  6:11 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-arm-kernel, linux-kernel, will, mark.rutland, baolin.wang,
	yaohongbo, nengchen, zhuo.song


在 2022/7/19 PM9:19, Jonathan Cameron 写道:
> On Fri, 15 Jul 2022 23:13:09 +0800
> Shuai Xue <xueshuai@linux.alibaba.com> wrote:
> 
>> Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
>> support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
>> DRAM and targets cloud computing and HPC.
>>
>> Each PMU is registered as a device in /sys/bus/event_source/devices, and
>> users can select event to monitor in each sub-channel, independently. For
>> example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
>> sub-channels of the same channel in die 0. And the PMU device of die 1 is
>> prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
>>
>> Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
>> shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
>> MPAM drivers successfully, add IRQF_SHARED flag.
>>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
>> Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
>> Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
>> Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
> 
> Hi.
> 
> I'd like to see the build constraints relaxed so this gets much more build
> coverage from the automated systems.  COMPILE_TEST can keep the happy
> balance of not showing the driver where it isn't supported, but still ensuring
> it gets hit on lots of random config builds.

I see your concern. I will add COMPILE_TEST in Kconfig in next version.
Thank you.

> Only other significant thing is you are relying on a lot of 'implicit' includes.
> Don't do that as it just makes it harder to refactor the core headers (which is
> an active topic currently as reducing cross includes helps a lot with kernel build
> time). Otherwise LGTM.  So with those addressed.>
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> Thanks,
> 
> Jonathan

Thank you, Jonathan. I will explicitly include dependent header file in
next version.

Best Regards,
Shuai

> 
>> ---
>>  drivers/perf/Kconfig                  |   8 +
>>  drivers/perf/Makefile                 |   1 +
>>  drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++
>>  3 files changed, 802 insertions(+)
>>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
>>
>> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
>> index 1e2d69453771..dfafba4cb066 100644
>> --- a/drivers/perf/Kconfig
>> +++ b/drivers/perf/Kconfig
>> @@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU
>>  	  Provides support for the non-architectural CPU PMUs present on
>>  	  the Apple M1 SoCs and derivatives.
>>  
>> +config ALIBABA_UNCORE_DRW_PMU
>> +	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
>> +	depends on (ARM64 && ACPI)
> 
> Try to avoid limiting the ability to build the driver. I'm not seeing anything
> in here that needs either ARM64 or ACPI to build.  If you want to keep those
> (I wouldn't) then || COMPILE_TEST
> That way you'll get a lot more builds / random configs run against your code
> and typically see any issues that occur due to changes elsewhere in the tree
> faster than you will if you heavily restrict when your driver is built.

Got it. Agree to remove ACPI dependency, widen the compile-test coverage.
I prefer leaving ARM64 as ARM_CMN does.

	config ARM_CMN
		tristate "Arm CMN-600 PMU support"
		depends on ARM64 || COMPILE_TEST
		help
	  	Support for PMU events monitoring on the Arm CMN-600 Coherent Mesh
	 	Network interconnect.

>> +	default m
> 
> You've addressed this in response to Randy's review.

Yep, will address it in next version. Thank you.

> 
>> +	help
>> +	  Support for Driveway PMU events monitoring on Yitian 710 DDR
>> +	  Sub-system.
>> +
>>  source "drivers/perf/hisilicon/Kconfig"
>>  
>>  config MARVELL_CN10K_DDR_PMU
>> diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
>> index 57a279c61df5..050d04ee19dd 100644
>> --- a/drivers/perf/Makefile
>> +++ b/drivers/perf/Makefile
>> @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
>>  obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
>>  obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
>>  obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
>> +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
>> diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
>> new file mode 100644
>> index 000000000000..4022bc50ffae
>> --- /dev/null
>> +++ b/drivers/perf/alibaba_uncore_drw_pmu.c
>> @@ -0,0 +1,793 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Alibaba DDR Sub-System Driveway PMU driver
>> + *
>> + * Copyright (C) 2022 Alibaba Inc
>> + */
>> +
>> +#define ALI_DRW_PMUNAME		"ali_drw"
>> +#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
>> +#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/perf_event.h>
>> +#include <linux/platform_device.h>
>> +
> 
> See below for examples, but you should look closer at the includes to make
> sure they are appropriate.  Mostly you should not rely on one linux header
> including another, but rather provide everything used in a given file.
> There are some very generic headers where it is fine to ignore that...

Got it, good point. Thank you.

>> +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
>> +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
>> +
>> +#define ALI_DRW_PMU_PA_SHIFT			12
>> +#define ALI_DRW_PMU_CNT_INIT			0x00000000
>> +#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
>> +#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
>> +
>> +#define ALI_DRW_PMU_CNT_CTRL			0xC00
>> +#define ALI_DRW_PMU_CNT_RST			BIT(2)
>> +#define ALI_DRW_PMU_CNT_STOP			BIT(1)
>> +#define ALI_DRW_PMU_CNT_START			BIT(0)
>> +
>> +#define ALI_DRW_PMU_CNT_STATE			0xC04
>> +#define ALI_DRW_PMU_TEST_CTRL			0xC08
>> +#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
>> +
>> +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
>> +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
>> +#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
>> +#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
>> +
>> +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
>> +#define ALI_DRW_PMU_EVENT_SEL0			0xC68
>> +/* counter 0-3 use sel0, counter 4-7 use sel1...*/
>> +#define ALI_DRW_PMU_EVENT_SELn(n) \
>> +	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
>> +#define ALI_DRW_PMCOM_CNT_EN			BIT(7)
>> +#define ALI_DRW_PMCOM_CNT_EVENT_MASK		GENMASK(5, 0)
>> +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
>> +	(8 * (n % 4))
>> +
>> +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
>> +#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
>> +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
>> +	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
>> +
>> +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
>> +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
>> +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
>> +#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
>> +#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
>> +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
>> +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
>> +#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
>> +
>> +static int ali_drw_cpuhp_state_num;
>> +
>> +static LIST_HEAD(ali_drw_pmu_irqs);
>> +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);
> 
> include linux/mutex.h

Will add it in next version, thanks.

>> +
>> +struct ali_drw_pmu_irq {
>> +	struct hlist_node node;
>> +	struct list_head irqs_node;
>> +	struct list_head pmus_node;
> Include appropriate headers for list_head, hlist_node etc directly rather than
> relying on possibly unstable includes from within other headers.

Will add them in next version, thanks.

>> +	int irq_num;
>> +	int cpu;
>> +	refcount_t refcount;
>> +};
>> +
>> +struct ali_drw_pmu {
>> +	void __iomem *cfg_base;
>> +	struct device *dev;
>> +
>> +	struct list_head pmus_node;
>> +	struct ali_drw_pmu_irq *irq;
>> +	int irq_num;
>> +	int cpu;
>> +	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);
> 
> bitmap.h

Will add it in next version, thanks.

> 
>> +	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
>> +	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
>> +
>> +	struct pmu pmu;
>> +};
>> +
> 
>> +
>> +/* find a counter for event, then in add func, hw.idx will equal to counter */
>> +static int ali_drw_get_counter_idx(struct perf_event *event)
>> +{
>> +	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
>> +	int idx;
>> +
>> +	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
>> +		if (!test_and_set_bit(idx, drw_pmu->used_mask))
>> +			return idx;
>> +	}
>> +
>> +	/* The counters are all in use. */
>> +	return -EBUSY;
>> +}
>>
> 
> 
>> +
>> +/*
>> + * Due to historical reasons, the HID used in the production environment is
>> + * ARMHD700, so we leave ARMHD700 as Compatible ID.
>> + */
>> +static const struct acpi_device_id ali_drw_acpi_match[] = {
> 
> If you build without ACPI support and drop the requirement for it as suggested
> above, this will give an unused warning.
> Either drop the ACPI_PTR() - that's very rarely worth bothering or
> add a __maybe_unused marking to this array.

Got it. I will remove ACPI dependency and drop ACPI_PTR(). Thank you very much.
> 
>> +	{"BABA5000", 0},
>> +	{"ARMHD700", 0},
>> +	{}
>> +};
>> +
>> +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
>> +
>> +static struct platform_driver ali_drw_pmu_driver = {
>> +	.driver = {
>> +		   .name = "ali_drw_pmu",
>> +		   .acpi_match_table = ACPI_PTR(ali_drw_acpi_match),
>> +		   },
>> +	.probe = ali_drw_pmu_probe,
>> +	.remove = ali_drw_pmu_remove,
>> +};
>> +

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (10 preceding siblings ...)
  2022-07-15 15:13 ` [RESEND PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
@ 2022-07-20  6:58 ` Shuai Xue
  2022-08-05  9:07   ` Shuai Xue
  2022-08-17  8:15   ` Baolin Wang
  2022-07-20  6:58 ` [PATCH v3 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
                   ` (10 subsequent siblings)
  22 siblings, 2 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-20  6:58 UTC (permalink / raw)
  To: Jonathan.Cameron, rdunlap, linux-arm-kernel, linux-kernel, will,
	mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
which custom-built by Alibaba Group's chip development business, T-Head.

Changes since v2:
- relaxe build constraints and add COMPILE_TEST
- explicitly include dependent headers
- add Reviewed-by, thanks Jonathan Cameron and Randy Dunlap for their valuable review and comments
- Link: https://lore.kernel.org/linux-arm-kernel/20220715151310.90091-4-xueshuai@linux.alibaba.com/T/#m1116abc4b0bda1943ab436a45d95359f9bbe0858

Changes since v1:
- add high level workflow about DDRC so that user cloud better understand the
  PMU hardware mechanism
- rewrite patch description and add interrupt sharing constraints
- delete event perf prefix
- add a condition to fix bug in ali_drw_pmu_isr
- perfer CPU in the same Node when migrating irq
- use FIELD_PREP and FIELD_GET to make code more readable
- add T-Head HID and leave ARMHD700 as CID for compatibility
- Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/

Shuai Xue (3):
  docs: perf: Add description for Alibaba's T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
    SoC
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver

 .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
 Documentation/admin-guide/perf/index.rst      |   1 +
 MAINTAINERS                                   |   7 +
 drivers/perf/Kconfig                          |   7 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c         | 810 ++++++++++++++++++
 6 files changed, 926 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (11 preceding siblings ...)
  2022-07-20  6:58 ` [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-07-20  6:58 ` Shuai Xue
  2022-07-20  6:58 ` [PATCH v3 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-20  6:58 UTC (permalink / raw)
  To: Jonathan.Cameron, rdunlap, linux-arm-kernel, linux-kernel, will,
	mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Alibaba's T-Head SoC implements uncore PMU for performance and functional
debugging to facilitate system maintenance. Document it to provide guidance
on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 .../admin-guide/perf/alibaba_pmu.rst          | 100 ++++++++++++++++++
 Documentation/admin-guide/perf/index.rst      |   1 +
 2 files changed, 101 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst

diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
new file mode 100644
index 000000000000..11de998bb480
--- /dev/null
+++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
@@ -0,0 +1,100 @@
+=============================================================
+Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
+=============================================================
+
+The Yitian 710, custom-built by Alibaba Group's chip development business,
+T-Head, implements uncore PMU for performance and functional debugging to
+facilitate system maintenance.
+
+DDR Sub-System Driveway (DRW) PMU Driver
+=========================================
+
+Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
+is independent of others to service system memory requests. And one DDR5
+channel is split into two independent sub-channels. The DDR Sub-System Driveway
+implements separate PMUs for each sub-channel to monitor various performance
+metrics.
+
+The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
+For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
+sub-channels of the same channel in die 0. And the PMU device of die 1 is
+prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
+
+Each sub-channel has 36 PMU counters in total, which is classified into
+four groups:
+
+- Group 0: PMU Cycle Counter. This group has one pair of counters
+  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
+  based on DDRC core clock.
+
+- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
+  to count the total access number of either the eight bank groups in a
+  selected rank, or four ranks separately in the first 4 counters. The base
+  transfer unit is 64B.
+
+- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
+  count the total retry number of each type of uncorrectable error.
+
+- Group 3: PMU Common Counters. This group has 16 counters, that are used
+  to count the common events.
+
+For now, the Driveway PMU driver only uses counters in group 0 and group 3.
+
+The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
+for connecting an SoC application bus to DDR memory devices. The DDRCTL
+receives transactions Host Interface (HIF) which is custom-defined by Synopsys.
+These transactions are queued internally and scheduled for access while
+satisfying the SDRAM protocol timing requirements, transaction priorities, and
+dependencies between the transactions. The DDRCTL in turn issues commands on
+the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
+to and from the SDRAM. The driveway PMUs have hardware logic to gather
+statistics and performance logging signals on HIF, DFI, etc.
+
+By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
+interface, we could calculate the bandwidth. Example usage of counting memory
+data bandwidth::
+
+  perf stat \
+    -e ali_drw_21000/hif_wr/ \
+    -e ali_drw_21000/hif_rd/ \
+    -e ali_drw_21000/hif_rmw/ \
+    -e ali_drw_21000/cycle/ \
+    -e ali_drw_21080/hif_wr/ \
+    -e ali_drw_21080/hif_rd/ \
+    -e ali_drw_21080/hif_rmw/ \
+    -e ali_drw_21080/cycle/ \
+    -e ali_drw_23000/hif_wr/ \
+    -e ali_drw_23000/hif_rd/ \
+    -e ali_drw_23000/hif_rmw/ \
+    -e ali_drw_23000/cycle/ \
+    -e ali_drw_23080/hif_wr/ \
+    -e ali_drw_23080/hif_rd/ \
+    -e ali_drw_23080/hif_rmw/ \
+    -e ali_drw_23080/cycle/ \
+    -e ali_drw_25000/hif_wr/ \
+    -e ali_drw_25000/hif_rd/ \
+    -e ali_drw_25000/hif_rmw/ \
+    -e ali_drw_25000/cycle/ \
+    -e ali_drw_25080/hif_wr/ \
+    -e ali_drw_25080/hif_rd/ \
+    -e ali_drw_25080/hif_rmw/ \
+    -e ali_drw_25080/cycle/ \
+    -e ali_drw_27000/hif_wr/ \
+    -e ali_drw_27000/hif_rd/ \
+    -e ali_drw_27000/hif_rmw/ \
+    -e ali_drw_27000/cycle/ \
+    -e ali_drw_27080/hif_wr/ \
+    -e ali_drw_27080/hif_rd/ \
+    -e ali_drw_27080/hif_rmw/ \
+    -e ali_drw_27080/cycle/ -- sleep 10
+
+The average DRAM bandwidth can be calculated as follows:
+
+- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+
+Here, DDRC_WIDTH = 64 bytes.
+
+The current driver does not support sampling. So "perf record" is
+unsupported.  Also attach to a task is unsupported as the events are all
+uncore.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 69b23f087c05..823db08863db 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -17,3 +17,4 @@ Performance monitor support
    xgene-pmu
    arm_dsu_pmu
    thunderx2-pmu
+   alibaba_pmu
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v3 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (12 preceding siblings ...)
  2022-07-20  6:58 ` [PATCH v3 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-07-20  6:58 ` Shuai Xue
  2022-07-20  6:58 ` [PATCH v3 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-20  6:58 UTC (permalink / raw)
  To: Jonathan.Cameron, rdunlap, linux-arm-kernel, linux-kernel, will,
	mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
DRAM and targets cloud computing and HPC.

Each PMU is registered as a device in /sys/bus/event_source/devices, and
users can select event to monitor in each sub-channel, independently. For
example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
sub-channels of the same channel in die 0. And the PMU device of die 1 is
prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.

Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
MPAM drivers successfully, add IRQF_SHARED flag.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/perf/Kconfig                  |   7 +
 drivers/perf/Makefile                 |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c | 810 ++++++++++++++++++++++++++
 3 files changed, 818 insertions(+)
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 1e2d69453771..44c07ea487f4 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -183,6 +183,13 @@ config APPLE_M1_CPU_PMU
 	  Provides support for the non-architectural CPU PMUs present on
 	  the Apple M1 SoCs and derivatives.
 
+config ALIBABA_UNCORE_DRW_PMU
+	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
+	depends on ARM64 || COMPILE_TEST
+	help
+	  Support for Driveway PMU events monitoring on Yitian 710 DDR
+	  Sub-system.
+
 source "drivers/perf/hisilicon/Kconfig"
 
 config MARVELL_CN10K_DDR_PMU
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 57a279c61df5..050d04ee19dd 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
+obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
new file mode 100644
index 000000000000..82729b874f09
--- /dev/null
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -0,0 +1,810 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Alibaba DDR Sub-System Driveway PMU driver
+ *
+ * Copyright (C) 2022 Alibaba Inc
+ */
+
+#define ALI_DRW_PMUNAME		"ali_drw"
+#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
+#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
+
+#include <linux/acpi.h>
+#include <linux/bitfield.h>
+#include <linux/bitmap.h>
+#include <linux/bitops.h>
+#include <linux/cpuhotplug.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+#include <linux/rculist.h>
+#include <linux/refcount.h>
+
+
+#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
+#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
+
+#define ALI_DRW_PMU_PA_SHIFT			12
+#define ALI_DRW_PMU_CNT_INIT			0x00000000
+#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
+#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
+
+#define ALI_DRW_PMU_CNT_CTRL			0xC00
+#define ALI_DRW_PMU_CNT_RST			BIT(2)
+#define ALI_DRW_PMU_CNT_STOP			BIT(1)
+#define ALI_DRW_PMU_CNT_START			BIT(0)
+
+#define ALI_DRW_PMU_CNT_STATE			0xC04
+#define ALI_DRW_PMU_TEST_CTRL			0xC08
+#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
+
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
+#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
+
+/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_EVENT_SEL0			0xC68
+/* counter 0-3 use sel0, counter 4-7 use sel1...*/
+#define ALI_DRW_PMU_EVENT_SELn(n) \
+	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
+#define ALI_DRW_PMCOM_CNT_EN			BIT(7)
+#define ALI_DRW_PMCOM_CNT_EVENT_MASK		GENMASK(5, 0)
+#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
+	(8 * (n % 4))
+
+/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
+#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
+	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
+
+#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
+#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
+#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
+#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
+#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
+#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
+#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
+#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
+
+static int ali_drw_cpuhp_state_num;
+
+static LIST_HEAD(ali_drw_pmu_irqs);
+static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);
+
+struct ali_drw_pmu_irq {
+	struct hlist_node node;
+	struct list_head irqs_node;
+	struct list_head pmus_node;
+	int irq_num;
+	int cpu;
+	refcount_t refcount;
+};
+
+struct ali_drw_pmu {
+	void __iomem *cfg_base;
+	struct device *dev;
+
+	struct list_head pmus_node;
+	struct ali_drw_pmu_irq *irq;
+	int irq_num;
+	int cpu;
+	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);
+	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+
+	struct pmu pmu;
+};
+
+#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu))
+
+#define DRW_CONFIG_EVENTID		GENMASK(7, 0)
+#define GET_DRW_EVENTID(event)	FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr.config)
+
+static ssize_t ali_drw_pmu_format_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+static ssize_t ali_drw_pmu_event_show(struct device *dev,
+			       struct device_attribute *attr, char *page)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+#define ALI_DRW_PMU_ATTR(_name, _func, _config)                            \
+		(&((struct dev_ext_attribute[]) {                               \
+				{ __ATTR(_name, 0444, _func, NULL), (void *)_config }   \
+		})[0].attr.attr)
+
+#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config)            \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config)
+#define ALI_DRW_PMU_EVENT_ATTR(_name, _config)             \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config)
+
+static struct attribute *ali_drw_pmu_events_attrs[] = {
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr,			0x0),
+	ALI_DRW_PMU_EVENT_ATTR(hif_wr,				0x1),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd,				0x2),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rmw,				0x3),
+	ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd,			0x4),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles,		0x7),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles,		0x8),
+	ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical,		0x9),
+	ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical,		0xA),
+	ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical,		0xB),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_activate,			0xC),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr,			0xD),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate,		0xE),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd,			0xF),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_wr,			0x10),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_mwr,			0x11),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_precharge,			0x12),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr,		0x13),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_other,		0x14),
+	ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions,		0x15),
+	ALI_DRW_PMU_EVENT_ATTR(write_combine,			0x16),
+	ALI_DRW_PMU_EVENT_ATTR(war_hazard,			0x17),
+	ALI_DRW_PMU_EVENT_ATTR(raw_hazard,			0x18),
+	ALI_DRW_PMU_EVENT_ATTR(waw_hazard,			0x19),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0,		0x1A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1,		0x1B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2,		0x1C),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3,		0x1D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0,	0x1E),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1,	0x1F),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2,	0x20),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3,	0x21),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0,		0x26),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1,		0x27),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2,		0x28),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3,		0x29),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_refresh,			0x2A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref,			0x2B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode,			0x2D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl,			0x2E),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc,		0x34),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr,		0x35),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr,			0x36),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart,			0x37),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch,			0x38),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txreq,			0x39),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txdat,			0x3A),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxdat,			0x3B),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp,			0x3C),
+	ALI_DRW_PMU_EVENT_ATTR(tsz_vio,				0x3D),
+	ALI_DRW_PMU_EVENT_ATTR(cycle,				0x80),
+	NULL,
+};
+
+static struct attribute_group ali_drw_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = ali_drw_pmu_events_attrs,
+};
+
+static struct attribute *ali_drw_pmu_format_attr[] = {
+	ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_format_group = {
+	.name = "format",
+	.attrs = ali_drw_pmu_format_attr,
+};
+
+static ssize_t ali_drw_pmu_cpumask_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu));
+}
+
+static struct device_attribute ali_drw_pmu_cpumask_attr =
+		__ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL);
+
+static struct attribute *ali_drw_pmu_cpumask_attrs[] = {
+	&ali_drw_pmu_cpumask_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_cpumask_attr_group = {
+	.attrs = ali_drw_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *ali_drw_pmu_attr_groups[] = {
+	&ali_drw_pmu_events_attr_group,
+	&ali_drw_pmu_cpumask_attr_group,
+	&ali_drw_pmu_format_group,
+	NULL,
+};
+
+/* find a counter for event, then in add func, hw.idx will equal to counter */
+static int ali_drw_get_counter_idx(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int idx;
+
+	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
+		if (!test_and_set_bit(idx, drw_pmu->used_mask))
+			return idx;
+	}
+
+	/* The counters are all in use. */
+	return -EBUSY;
+}
+
+static u64 ali_drw_pmu_read_counter(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	u64 cycle_high, cycle_low;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		cycle_high = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH);
+		cycle_high &= ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK;
+		cycle_low = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW);
+		cycle_low &= ALI_DRW_PMU_CYCLE_CNT_LOW_MASK;
+		return (cycle_high << 32 | cycle_low);
+	}
+
+	return readl(drw_pmu->cfg_base +
+		     ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx));
+}
+
+static void ali_drw_pmu_event_update(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 delta, prev, now;
+
+	do {
+		prev = local64_read(&hwc->prev_count);
+		now = ali_drw_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
+
+	/* handle overflow. */
+	delta = now - prev;
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID)
+		delta &= ALI_DRW_PMU_OV_INTR_MASK;
+	else
+		delta &= ALI_DRW_CNT_MAX_PERIOD;
+	local64_add(delta, &event->count);
+}
+
+static void ali_drw_pmu_event_set_period(struct perf_event *event)
+{
+	u64 pre_val;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	/* set a preload counter for test purpose */
+	writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+
+	/* set conunter initial value */
+	pre_val = ALI_DRW_PMU_CNT_INIT;
+	writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	local64_set(&event->hw.prev_count, pre_val);
+
+	/* set sel mode to zero to start test */
+	writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+}
+
+static void ali_drw_pmu_enable_counter(struct perf_event *event)
+{
+	u32 val, subval, reg, shift;
+	int counter = event->hw.idx;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static void ali_drw_pmu_disable_counter(struct perf_event *event)
+{
+	u32 val, reg, subval, shift;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int counter = event->hw.idx;
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
+{
+	struct ali_drw_pmu_irq *irq = data;
+	struct ali_drw_pmu *drw_pmu;
+	irqreturn_t ret = IRQ_NONE;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {
+		unsigned long status, clr_status;
+		struct perf_event *event;
+		unsigned int idx;
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			ali_drw_pmu_disable_counter(event);
+		}
+
+		/* common counter intr status */
+		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
+		status = FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status);
+		if (status) {
+			for_each_set_bit(idx, &status,
+					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+				event = drw_pmu->events[idx];
+				if (WARN_ON_ONCE(!event))
+					continue;
+				ali_drw_pmu_event_update(event);
+				ali_drw_pmu_event_set_period(event);
+			}
+
+			/* clear common counter intr status */
+			clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1);
+			writel(clr_status,
+			       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+		}
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			if (!(event->hw.state & PERF_HES_STOPPED))
+				ali_drw_pmu_enable_counter(event);
+		}
+		if (status)
+			ret = IRQ_HANDLED;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
+						      *pdev, int irq_num)
+{
+	int ret;
+	struct ali_drw_pmu_irq *irq;
+
+	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
+		if (irq->irq_num == irq_num
+		    && refcount_inc_not_zero(&irq->refcount))
+			return irq;
+	}
+
+	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
+	if (!irq)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&irq->pmus_node);
+
+	/* Pick one CPU to be the preferred one to use */
+	irq->cpu = smp_processor_id();
+	refcount_set(&irq->refcount, 1);
+
+	/*
+	 * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same
+	 * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
+	 * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not
+	 * share with other component.
+	 */
+	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
+			       IRQF_SHARED, dev_name(&pdev->dev), irq);
+	if (ret < 0) {
+		dev_err(&pdev->dev,
+			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
+		goto out_free;
+	}
+
+	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
+	if (ret)
+		goto out_free;
+
+	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num,
+					     &irq->node);
+	if (ret)
+		goto out_free;
+
+	irq->irq_num = irq_num;
+	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
+
+	return irq;
+
+out_free:
+	kfree(irq);
+	return ERR_PTR(ret);
+}
+
+static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
+				struct platform_device *pdev)
+{
+	int irq_num;
+	struct ali_drw_pmu_irq *irq;
+
+	/* Read and init IRQ */
+	irq_num = platform_get_irq(pdev, 0);
+	if (irq_num < 0)
+		return irq_num;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	if (IS_ERR(irq))
+		return PTR_ERR(irq);
+
+	drw_pmu->irq = irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	return 0;
+}
+
+static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
+{
+	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_del_rcu(&drw_pmu->pmus_node);
+
+	if (!refcount_dec_and_test(&irq->refcount)) {
+		mutex_unlock(&ali_drw_pmu_irqs_lock);
+		return;
+	}
+
+	list_del(&irq->irqs_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
+	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
+					    &irq->node);
+	kfree(irq);
+}
+
+static int ali_drw_pmu_event_init(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event *sibling;
+	struct device *dev = drw_pmu->pmu.dev;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (is_sampling_event(event)) {
+		dev_err(dev, "Sampling not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->attach_state & PERF_ATTACH_TASK) {
+		dev_err(dev, "Per-task counter cannot allocate!\n");
+		return -EOPNOTSUPP;
+	}
+
+	event->cpu = drw_pmu->cpu;
+	if (event->cpu < 0) {
+		dev_err(dev, "Per-task mode not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->group_leader != event &&
+	    !is_software_event(event->group_leader)) {
+		dev_err(dev, "driveway only allow one event!\n");
+		return -EINVAL;
+	}
+
+	for_each_sibling_event(sibling, event->group_leader) {
+		if (sibling != event && !is_software_event(sibling)) {
+			dev_err(dev, "driveway event not allowed!\n");
+			return -EINVAL;
+		}
+	}
+
+	/* reset all the pmu counters */
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	hwc->idx = -1;
+
+	return 0;
+}
+
+static void ali_drw_pmu_start(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	event->hw.state = 0;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		writel(ALI_DRW_PMU_CNT_START,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+		return;
+	}
+
+	ali_drw_pmu_event_set_period(event);
+	if (flags & PERF_EF_RELOAD) {
+		unsigned long prev_raw_count =
+		    local64_read(&event->hw.prev_count);
+		writel(prev_raw_count,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	}
+
+	ali_drw_pmu_enable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+}
+
+static void ali_drw_pmu_stop(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	if (event->hw.state & PERF_HES_STOPPED)
+		return;
+
+	if (GET_DRW_EVENTID(event) != ALI_DRW_PMU_CYCLE_EVT_ID)
+		ali_drw_pmu_disable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	ali_drw_pmu_event_update(event);
+	event->hw.state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int ali_drw_pmu_add(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = -1;
+	int evtid;
+
+	evtid = GET_DRW_EVENTID(event);
+
+	if (evtid != ALI_DRW_PMU_CYCLE_EVT_ID) {
+		idx = ali_drw_get_counter_idx(event);
+		if (idx < 0)
+			return idx;
+		drw_pmu->events[idx] = event;
+		drw_pmu->evtids[idx] = evtid;
+	}
+	hwc->idx = idx;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		ali_drw_pmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate our changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void ali_drw_pmu_del(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	ali_drw_pmu_stop(event, PERF_EF_UPDATE);
+
+	if (idx >= 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+		drw_pmu->events[idx] = NULL;
+		drw_pmu->evtids[idx] = 0;
+		clear_bit(idx, drw_pmu->used_mask);
+	}
+
+	perf_event_update_userpage(event);
+}
+
+static void ali_drw_pmu_read(struct perf_event *event)
+{
+	ali_drw_pmu_event_update(event);
+}
+
+static int ali_drw_pmu_probe(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu;
+	struct resource *res;
+	char *name;
+	int ret;
+
+	drw_pmu = devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL);
+	if (!drw_pmu)
+		return -ENOMEM;
+
+	drw_pmu->dev = &pdev->dev;
+	platform_set_drvdata(pdev, drw_pmu);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	drw_pmu->cfg_base = devm_ioremap_resource(&pdev->dev, res);
+	if (!drw_pmu->cfg_base)
+		return -ENOMEM;
+
+	name = devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx",
+			      (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT));
+	if (!name)
+		return -ENOMEM;
+
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	/* enable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL);
+
+	/* clearing interrupt status */
+	writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+
+	drw_pmu->cpu = smp_processor_id();
+
+	ret = ali_drw_pmu_init_irq(drw_pmu, pdev);
+	if (ret)
+		return ret;
+
+	drw_pmu->pmu = (struct pmu) {
+		.module		= THIS_MODULE,
+		.task_ctx_nr	= perf_invalid_context,
+		.event_init	= ali_drw_pmu_event_init,
+		.add		= ali_drw_pmu_add,
+		.del		= ali_drw_pmu_del,
+		.start		= ali_drw_pmu_start,
+		.stop		= ali_drw_pmu_stop,
+		.read		= ali_drw_pmu_read,
+		.attr_groups	= ali_drw_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
+	};
+
+	ret = perf_pmu_register(&drw_pmu->pmu, name, -1);
+	if (ret) {
+		dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n");
+		ali_drw_pmu_uninit_irq(drw_pmu);
+	}
+
+	return ret;
+}
+
+static int ali_drw_pmu_remove(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu = platform_get_drvdata(pdev);
+
+	/* disable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL);
+
+	ali_drw_pmu_uninit_irq(drw_pmu);
+	perf_pmu_unregister(&drw_pmu->pmu);
+
+	return 0;
+}
+
+static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct ali_drw_pmu_irq *irq;
+	struct ali_drw_pmu *drw_pmu;
+	unsigned int target;
+	int ret;
+	cpumask_t node_online_cpus;
+
+	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
+	if (cpu != irq->cpu)
+		return 0;
+
+	ret = cpumask_and(&node_online_cpus,
+			  cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
+	if (ret)
+		target = cpumask_any_but(&node_online_cpus, cpu);
+	else
+		target = cpumask_any_but(cpu_online_mask, cpu);
+
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	/* We're only reading, but this isn't the place to be involving RCU */
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
+		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
+	irq->cpu = target;
+
+	return 0;
+}
+
+/*
+ * Due to historical reasons, the HID used in the production environment is
+ * ARMHD700, so we leave ARMHD700 as Compatible ID.
+ */
+static const struct acpi_device_id ali_drw_acpi_match[] = {
+	{"BABA5000", 0},
+	{"ARMHD700", 0},
+	{}
+};
+
+MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
+
+static struct platform_driver ali_drw_pmu_driver = {
+	.driver = {
+		   .name = "ali_drw_pmu",
+		   .acpi_match_table = ali_drw_acpi_match,
+		   },
+	.probe = ali_drw_pmu_probe,
+	.remove = ali_drw_pmu_remove,
+};
+
+static int __init ali_drw_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "ali_drw_pmu:online",
+				      NULL, ali_drw_pmu_offline_cpu);
+
+	if (ret < 0) {
+		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
+		       ret);
+		return ret;
+	}
+	ali_drw_cpuhp_state_num = ret;
+
+	ret = platform_driver_register(&ali_drw_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+
+	return ret;
+}
+
+static void __exit ali_drw_pmu_exit(void)
+{
+	platform_driver_unregister(&ali_drw_pmu_driver);
+	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+}
+
+module_init(ali_drw_pmu_init);
+module_exit(ali_drw_pmu_exit);
+
+MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
+MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
+MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
+MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver");
+MODULE_LICENSE("GPL v2");
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v3 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (13 preceding siblings ...)
  2022-07-20  6:58 ` [PATCH v3 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-07-20  6:58 ` Shuai Xue
  2022-08-18  3:18 ` [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-07-20  6:58 UTC (permalink / raw)
  To: Jonathan.Cameron, rdunlap, linux-arm-kernel, linux-kernel, will,
	mark.rutland
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, xueshuai

Add maintainers for Alibaba PMU document and driver

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 MAINTAINERS | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 55114e73de26..8c05de35828d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -733,6 +733,13 @@ S:	Maintained
 F:	Documentation/i2c/busses/i2c-ali1563.rst
 F:	drivers/i2c/busses/i2c-ali1563.c
 
+ALIBABA PMU DRIVER
+M:	Shuai Xue <xueshuai@linux.alibaba.com>
+M:	Hongbo Yao <yaohongbo@linux.alibaba.com>
+S:	Supported
+F:	Documentation/admin-guide/perf/alibaba_pmu.rst
+F:	drivers/perf/alibaba_uncore_dwr_pmu.c
+
 ALIENWARE WMI DRIVER
 L:	Dell.Client.Kernel@dell.com
 S:	Maintained
-- 
2.20.1.9.gb50a0d7


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-07-20  6:58 ` [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-08-05  9:07   ` Shuai Xue
  2022-08-17  8:15   ` Baolin Wang
  1 sibling, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-08-05  9:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: baolin.wang, yaohongbo, nengchen, zhuo.song, Jonathan.Cameron,
	rdunlap, linux-arm-kernel, linux-kernel, will, mark.rutland

Hi, Will,

I was wondering that do you have any comments to this patch set?

Thank you.

Best Regards,
Shuai

在 2022/7/20 PM2:58, Shuai Xue 写道:
> This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
> which custom-built by Alibaba Group's chip development business, T-Head.
> 
> Changes since v2:
> - relaxe build constraints and add COMPILE_TEST
> - explicitly include dependent headers
> - add Reviewed-by, thanks Jonathan Cameron and Randy Dunlap for their valuable review and comments
> - Link: https://lore.kernel.org/linux-arm-kernel/20220715151310.90091-4-xueshuai@linux.alibaba.com/T/#m1116abc4b0bda1943ab436a45d95359f9bbe0858
> 
> Changes since v1:
> - add high level workflow about DDRC so that user cloud better understand the
>   PMU hardware mechanism
> - rewrite patch description and add interrupt sharing constraints
> - delete event perf prefix
> - add a condition to fix bug in ali_drw_pmu_isr
> - perfer CPU in the same Node when migrating irq
> - use FIELD_PREP and FIELD_GET to make code more readable
> - add T-Head HID and leave ARMHD700 as CID for compatibility
> - Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/
> 
> Shuai Xue (3):
>   docs: perf: Add description for Alibaba's T-Head PMU driver
>   drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
>     SoC
>   MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
> 
>  .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
>  Documentation/admin-guide/perf/index.rst      |   1 +
>  MAINTAINERS                                   |   7 +
>  drivers/perf/Kconfig                          |   7 +
>  drivers/perf/Makefile                         |   1 +
>  drivers/perf/alibaba_uncore_drw_pmu.c         | 810 ++++++++++++++++++
>  6 files changed, 926 insertions(+)
>  create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-07-20  6:58 ` [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
  2022-08-05  9:07   ` Shuai Xue
@ 2022-08-17  8:15   ` Baolin Wang
  2022-08-17  8:19     ` Shuai Xue
  1 sibling, 1 reply; 41+ messages in thread
From: Baolin Wang @ 2022-08-17  8:15 UTC (permalink / raw)
  To: Shuai Xue, Jonathan.Cameron, rdunlap, linux-arm-kernel,
	linux-kernel, will, mark.rutland
  Cc: yaohongbo, nengchen, zhuo.song



On 7/20/2022 2:58 PM, Shuai Xue wrote:
> This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
> which custom-built by Alibaba Group's chip development business, T-Head.
> 
> Changes since v2:
> - relaxe build constraints and add COMPILE_TEST
> - explicitly include dependent headers
> - add Reviewed-by, thanks Jonathan Cameron and Randy Dunlap for their valuable review and comments
> - Link: https://lore.kernel.org/linux-arm-kernel/20220715151310.90091-4-xueshuai@linux.alibaba.com/T/#m1116abc4b0bda1943ab436a45d95359f9bbe0858
> 
> Changes since v1:
> - add high level workflow about DDRC so that user cloud better understand the
>    PMU hardware mechanism
> - rewrite patch description and add interrupt sharing constraints
> - delete event perf prefix
> - add a condition to fix bug in ali_drw_pmu_isr
> - perfer CPU in the same Node when migrating irq
> - use FIELD_PREP and FIELD_GET to make code more readable
> - add T-Head HID and leave ARMHD700 as CID for compatibility
> - Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/
> 

As I've reviewed the patch set internally before, please feel free to add:
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

> Shuai Xue (3):
>    docs: perf: Add description for Alibaba's T-Head PMU driver
>    drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
>      SoC
>    MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
> 
>   .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
>   Documentation/admin-guide/perf/index.rst      |   1 +
>   MAINTAINERS                                   |   7 +
>   drivers/perf/Kconfig                          |   7 +
>   drivers/perf/Makefile                         |   1 +
>   drivers/perf/alibaba_uncore_drw_pmu.c         | 810 ++++++++++++++++++
>   6 files changed, 926 insertions(+)
>   create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
>   create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-08-17  8:15   ` Baolin Wang
@ 2022-08-17  8:19     ` Shuai Xue
  0 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-08-17  8:19 UTC (permalink / raw)
  To: Baolin Wang, Jonathan.Cameron, rdunlap, linux-arm-kernel,
	linux-kernel, will, mark.rutland
  Cc: yaohongbo, nengchen, zhuo.song



在 2022/8/17 PM4:15, Baolin Wang 写道:
> 
> 
> On 7/20/2022 2:58 PM, Shuai Xue wrote:
>> This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
>> which custom-built by Alibaba Group's chip development business, T-Head.
>>
>> Changes since v2:
>> - relaxe build constraints and add COMPILE_TEST
>> - explicitly include dependent headers
>> - add Reviewed-by, thanks Jonathan Cameron and Randy Dunlap for their valuable review and comments
>> - Link: https://lore.kernel.org/linux-arm-kernel/20220715151310.90091-4-xueshuai@linux.alibaba.com/T/#m1116abc4b0bda1943ab436a45d95359f9bbe0858
>>
>> Changes since v1:
>> - add high level workflow about DDRC so that user cloud better understand the
>>    PMU hardware mechanism
>> - rewrite patch description and add interrupt sharing constraints
>> - delete event perf prefix
>> - add a condition to fix bug in ali_drw_pmu_isr
>> - perfer CPU in the same Node when migrating irq
>> - use FIELD_PREP and FIELD_GET to make code more readable
>> - add T-Head HID and leave ARMHD700 as CID for compatibility
>> - Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/
>>
> 
> As I've reviewed the patch set internally before, please feel free to add:
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Thank you. :) I will add your Reviewed-by in next version.

Best Regards,
Shuai

> 
>> Shuai Xue (3):
>>    docs: perf: Add description for Alibaba's T-Head PMU driver
>>    drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
>>      SoC
>>    MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
>>
>>   .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
>>   Documentation/admin-guide/perf/index.rst      |   1 +
>>   MAINTAINERS                                   |   7 +
>>   drivers/perf/Kconfig                          |   7 +
>>   drivers/perf/Makefile                         |   1 +
>>   drivers/perf/alibaba_uncore_drw_pmu.c         | 810 ++++++++++++++++++
>>   6 files changed, 926 insertions(+)
>>   create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
>>   create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
>>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (14 preceding siblings ...)
  2022-07-20  6:58 ` [PATCH v3 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
@ 2022-08-18  3:18 ` Shuai Xue
  2022-09-06  4:58   ` Shuai Xue
  2022-09-22 20:33   ` Will Deacon
  2022-08-18  3:18 ` [PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
                   ` (6 subsequent siblings)
  22 siblings, 2 replies; 41+ messages in thread
From: Shuai Xue @ 2022-08-18  3:18 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Hi, Will,

I am wondering that do you have any comments to this patch set? If/when you're
happy with them, cloud you please queue them up?

Thank you :)

Cheers,
Shuai.

Changes since v3:
- add Reviewed-by of Baolin
- Rebase on Linux v6.0 rc1

Changes since v2:
- relaxe build constraints and add COMPILE_TEST
- explicitly include dependent headers
- add Reviewed-by, thanks Jonathan Cameron and Randy Dunlap for their
  valuable review and comments
- Link: https://lore.kernel.org/linux-arm-kernel/20220715151310.90091-4-xueshuai@linux.alibaba.com/T/#m1116abc4b0bda1943ab436a45d95359f9bbe0858

Changes since v1:
- add high level workflow about DDRC so that user cloud better understand
  the PMU hardware mechanism
- rewrite patch description and add interrupt sharing constraints
- delete event perf prefix
- add a condition to fix bug in ali_drw_pmu_isr
- perfer CPU in the same Node when migrating irq
- use FIELD_PREP and FIELD_GET to make code more readable
- add T-Head HID and leave ARMHD700 as CID for compatibility
- Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/

This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
which custom-built by Alibaba Group's chip development business, T-Head.

Shuai Xue (3):
  docs: perf: Add description for Alibaba's T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
    SoC
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver

 .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
 Documentation/admin-guide/perf/index.rst      |   1 +
 MAINTAINERS                                   |   6 +
 drivers/perf/Kconfig                          |   7 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c         | 810 ++++++++++++++++++
 6 files changed, 925 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (15 preceding siblings ...)
  2022-08-18  3:18 ` [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-08-18  3:18 ` Shuai Xue
  2022-08-18  3:18 ` [PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-08-18  3:18 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Alibaba's T-Head SoC implements uncore PMU for performance and functional
debugging to facilitate system maintenance. Document it to provide guidance
on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 .../admin-guide/perf/alibaba_pmu.rst          | 100 ++++++++++++++++++
 Documentation/admin-guide/perf/index.rst      |   1 +
 2 files changed, 101 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst

diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
new file mode 100644
index 000000000000..11de998bb480
--- /dev/null
+++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
@@ -0,0 +1,100 @@
+=============================================================
+Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
+=============================================================
+
+The Yitian 710, custom-built by Alibaba Group's chip development business,
+T-Head, implements uncore PMU for performance and functional debugging to
+facilitate system maintenance.
+
+DDR Sub-System Driveway (DRW) PMU Driver
+=========================================
+
+Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
+is independent of others to service system memory requests. And one DDR5
+channel is split into two independent sub-channels. The DDR Sub-System Driveway
+implements separate PMUs for each sub-channel to monitor various performance
+metrics.
+
+The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
+For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
+sub-channels of the same channel in die 0. And the PMU device of die 1 is
+prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
+
+Each sub-channel has 36 PMU counters in total, which is classified into
+four groups:
+
+- Group 0: PMU Cycle Counter. This group has one pair of counters
+  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
+  based on DDRC core clock.
+
+- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
+  to count the total access number of either the eight bank groups in a
+  selected rank, or four ranks separately in the first 4 counters. The base
+  transfer unit is 64B.
+
+- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
+  count the total retry number of each type of uncorrectable error.
+
+- Group 3: PMU Common Counters. This group has 16 counters, that are used
+  to count the common events.
+
+For now, the Driveway PMU driver only uses counters in group 0 and group 3.
+
+The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
+for connecting an SoC application bus to DDR memory devices. The DDRCTL
+receives transactions Host Interface (HIF) which is custom-defined by Synopsys.
+These transactions are queued internally and scheduled for access while
+satisfying the SDRAM protocol timing requirements, transaction priorities, and
+dependencies between the transactions. The DDRCTL in turn issues commands on
+the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
+to and from the SDRAM. The driveway PMUs have hardware logic to gather
+statistics and performance logging signals on HIF, DFI, etc.
+
+By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
+interface, we could calculate the bandwidth. Example usage of counting memory
+data bandwidth::
+
+  perf stat \
+    -e ali_drw_21000/hif_wr/ \
+    -e ali_drw_21000/hif_rd/ \
+    -e ali_drw_21000/hif_rmw/ \
+    -e ali_drw_21000/cycle/ \
+    -e ali_drw_21080/hif_wr/ \
+    -e ali_drw_21080/hif_rd/ \
+    -e ali_drw_21080/hif_rmw/ \
+    -e ali_drw_21080/cycle/ \
+    -e ali_drw_23000/hif_wr/ \
+    -e ali_drw_23000/hif_rd/ \
+    -e ali_drw_23000/hif_rmw/ \
+    -e ali_drw_23000/cycle/ \
+    -e ali_drw_23080/hif_wr/ \
+    -e ali_drw_23080/hif_rd/ \
+    -e ali_drw_23080/hif_rmw/ \
+    -e ali_drw_23080/cycle/ \
+    -e ali_drw_25000/hif_wr/ \
+    -e ali_drw_25000/hif_rd/ \
+    -e ali_drw_25000/hif_rmw/ \
+    -e ali_drw_25000/cycle/ \
+    -e ali_drw_25080/hif_wr/ \
+    -e ali_drw_25080/hif_rd/ \
+    -e ali_drw_25080/hif_rmw/ \
+    -e ali_drw_25080/cycle/ \
+    -e ali_drw_27000/hif_wr/ \
+    -e ali_drw_27000/hif_rd/ \
+    -e ali_drw_27000/hif_rmw/ \
+    -e ali_drw_27000/cycle/ \
+    -e ali_drw_27080/hif_wr/ \
+    -e ali_drw_27080/hif_rd/ \
+    -e ali_drw_27080/hif_rmw/ \
+    -e ali_drw_27080/cycle/ -- sleep 10
+
+The average DRAM bandwidth can be calculated as follows:
+
+- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+
+Here, DDRC_WIDTH = 64 bytes.
+
+The current driver does not support sampling. So "perf record" is
+unsupported.  Also attach to a task is unsupported as the events are all
+uncore.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 9c9ece88ce53..793e1970bc05 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -18,3 +18,4 @@ Performance monitor support
    xgene-pmu
    arm_dsu_pmu
    thunderx2-pmu
+   alibaba_pmu
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (16 preceding siblings ...)
  2022-08-18  3:18 ` [PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-08-18  3:18 ` Shuai Xue
  2022-08-18  3:18 ` [PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-08-18  3:18 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
DRAM and targets cloud computing and HPC.

Each PMU is registered as a device in /sys/bus/event_source/devices, and
users can select event to monitor in each sub-channel, independently. For
example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
sub-channels of the same channel in die 0. And the PMU device of die 1 is
prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.

Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
MPAM drivers successfully, add IRQF_SHARED flag.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 drivers/perf/Kconfig                  |   7 +
 drivers/perf/Makefile                 |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c | 810 ++++++++++++++++++++++++++
 3 files changed, 818 insertions(+)
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 1e2d69453771..44c07ea487f4 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -183,6 +183,13 @@ config APPLE_M1_CPU_PMU
 	  Provides support for the non-architectural CPU PMUs present on
 	  the Apple M1 SoCs and derivatives.
 
+config ALIBABA_UNCORE_DRW_PMU
+	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
+	depends on ARM64 || COMPILE_TEST
+	help
+	  Support for Driveway PMU events monitoring on Yitian 710 DDR
+	  Sub-system.
+
 source "drivers/perf/hisilicon/Kconfig"
 
 config MARVELL_CN10K_DDR_PMU
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 57a279c61df5..050d04ee19dd 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
+obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
new file mode 100644
index 000000000000..82729b874f09
--- /dev/null
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -0,0 +1,810 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Alibaba DDR Sub-System Driveway PMU driver
+ *
+ * Copyright (C) 2022 Alibaba Inc
+ */
+
+#define ALI_DRW_PMUNAME		"ali_drw"
+#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
+#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
+
+#include <linux/acpi.h>
+#include <linux/bitfield.h>
+#include <linux/bitmap.h>
+#include <linux/bitops.h>
+#include <linux/cpuhotplug.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+#include <linux/rculist.h>
+#include <linux/refcount.h>
+
+
+#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
+#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
+
+#define ALI_DRW_PMU_PA_SHIFT			12
+#define ALI_DRW_PMU_CNT_INIT			0x00000000
+#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
+#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
+
+#define ALI_DRW_PMU_CNT_CTRL			0xC00
+#define ALI_DRW_PMU_CNT_RST			BIT(2)
+#define ALI_DRW_PMU_CNT_STOP			BIT(1)
+#define ALI_DRW_PMU_CNT_START			BIT(0)
+
+#define ALI_DRW_PMU_CNT_STATE			0xC04
+#define ALI_DRW_PMU_TEST_CTRL			0xC08
+#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
+
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
+#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
+
+/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_EVENT_SEL0			0xC68
+/* counter 0-3 use sel0, counter 4-7 use sel1...*/
+#define ALI_DRW_PMU_EVENT_SELn(n) \
+	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
+#define ALI_DRW_PMCOM_CNT_EN			BIT(7)
+#define ALI_DRW_PMCOM_CNT_EVENT_MASK		GENMASK(5, 0)
+#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
+	(8 * (n % 4))
+
+/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
+#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
+	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
+
+#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
+#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
+#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
+#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
+#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
+#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
+#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
+#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
+
+static int ali_drw_cpuhp_state_num;
+
+static LIST_HEAD(ali_drw_pmu_irqs);
+static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);
+
+struct ali_drw_pmu_irq {
+	struct hlist_node node;
+	struct list_head irqs_node;
+	struct list_head pmus_node;
+	int irq_num;
+	int cpu;
+	refcount_t refcount;
+};
+
+struct ali_drw_pmu {
+	void __iomem *cfg_base;
+	struct device *dev;
+
+	struct list_head pmus_node;
+	struct ali_drw_pmu_irq *irq;
+	int irq_num;
+	int cpu;
+	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);
+	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+
+	struct pmu pmu;
+};
+
+#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu))
+
+#define DRW_CONFIG_EVENTID		GENMASK(7, 0)
+#define GET_DRW_EVENTID(event)	FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr.config)
+
+static ssize_t ali_drw_pmu_format_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+static ssize_t ali_drw_pmu_event_show(struct device *dev,
+			       struct device_attribute *attr, char *page)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+#define ALI_DRW_PMU_ATTR(_name, _func, _config)                            \
+		(&((struct dev_ext_attribute[]) {                               \
+				{ __ATTR(_name, 0444, _func, NULL), (void *)_config }   \
+		})[0].attr.attr)
+
+#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config)            \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config)
+#define ALI_DRW_PMU_EVENT_ATTR(_name, _config)             \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config)
+
+static struct attribute *ali_drw_pmu_events_attrs[] = {
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr,			0x0),
+	ALI_DRW_PMU_EVENT_ATTR(hif_wr,				0x1),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd,				0x2),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rmw,				0x3),
+	ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd,			0x4),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles,		0x7),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles,		0x8),
+	ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical,		0x9),
+	ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical,		0xA),
+	ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical,		0xB),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_activate,			0xC),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr,			0xD),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate,		0xE),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd,			0xF),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_wr,			0x10),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_mwr,			0x11),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_precharge,			0x12),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr,		0x13),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_other,		0x14),
+	ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions,		0x15),
+	ALI_DRW_PMU_EVENT_ATTR(write_combine,			0x16),
+	ALI_DRW_PMU_EVENT_ATTR(war_hazard,			0x17),
+	ALI_DRW_PMU_EVENT_ATTR(raw_hazard,			0x18),
+	ALI_DRW_PMU_EVENT_ATTR(waw_hazard,			0x19),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0,		0x1A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1,		0x1B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2,		0x1C),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3,		0x1D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0,	0x1E),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1,	0x1F),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2,	0x20),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3,	0x21),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0,		0x26),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1,		0x27),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2,		0x28),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3,		0x29),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_refresh,			0x2A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref,			0x2B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode,			0x2D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl,			0x2E),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc,		0x34),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr,		0x35),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr,			0x36),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart,			0x37),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch,			0x38),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txreq,			0x39),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txdat,			0x3A),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxdat,			0x3B),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp,			0x3C),
+	ALI_DRW_PMU_EVENT_ATTR(tsz_vio,				0x3D),
+	ALI_DRW_PMU_EVENT_ATTR(cycle,				0x80),
+	NULL,
+};
+
+static struct attribute_group ali_drw_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = ali_drw_pmu_events_attrs,
+};
+
+static struct attribute *ali_drw_pmu_format_attr[] = {
+	ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_format_group = {
+	.name = "format",
+	.attrs = ali_drw_pmu_format_attr,
+};
+
+static ssize_t ali_drw_pmu_cpumask_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu));
+}
+
+static struct device_attribute ali_drw_pmu_cpumask_attr =
+		__ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL);
+
+static struct attribute *ali_drw_pmu_cpumask_attrs[] = {
+	&ali_drw_pmu_cpumask_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_cpumask_attr_group = {
+	.attrs = ali_drw_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *ali_drw_pmu_attr_groups[] = {
+	&ali_drw_pmu_events_attr_group,
+	&ali_drw_pmu_cpumask_attr_group,
+	&ali_drw_pmu_format_group,
+	NULL,
+};
+
+/* find a counter for event, then in add func, hw.idx will equal to counter */
+static int ali_drw_get_counter_idx(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int idx;
+
+	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
+		if (!test_and_set_bit(idx, drw_pmu->used_mask))
+			return idx;
+	}
+
+	/* The counters are all in use. */
+	return -EBUSY;
+}
+
+static u64 ali_drw_pmu_read_counter(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	u64 cycle_high, cycle_low;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		cycle_high = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH);
+		cycle_high &= ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK;
+		cycle_low = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW);
+		cycle_low &= ALI_DRW_PMU_CYCLE_CNT_LOW_MASK;
+		return (cycle_high << 32 | cycle_low);
+	}
+
+	return readl(drw_pmu->cfg_base +
+		     ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx));
+}
+
+static void ali_drw_pmu_event_update(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 delta, prev, now;
+
+	do {
+		prev = local64_read(&hwc->prev_count);
+		now = ali_drw_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
+
+	/* handle overflow. */
+	delta = now - prev;
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID)
+		delta &= ALI_DRW_PMU_OV_INTR_MASK;
+	else
+		delta &= ALI_DRW_CNT_MAX_PERIOD;
+	local64_add(delta, &event->count);
+}
+
+static void ali_drw_pmu_event_set_period(struct perf_event *event)
+{
+	u64 pre_val;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	/* set a preload counter for test purpose */
+	writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+
+	/* set conunter initial value */
+	pre_val = ALI_DRW_PMU_CNT_INIT;
+	writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	local64_set(&event->hw.prev_count, pre_val);
+
+	/* set sel mode to zero to start test */
+	writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+}
+
+static void ali_drw_pmu_enable_counter(struct perf_event *event)
+{
+	u32 val, subval, reg, shift;
+	int counter = event->hw.idx;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static void ali_drw_pmu_disable_counter(struct perf_event *event)
+{
+	u32 val, reg, subval, shift;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int counter = event->hw.idx;
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
+{
+	struct ali_drw_pmu_irq *irq = data;
+	struct ali_drw_pmu *drw_pmu;
+	irqreturn_t ret = IRQ_NONE;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {
+		unsigned long status, clr_status;
+		struct perf_event *event;
+		unsigned int idx;
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			ali_drw_pmu_disable_counter(event);
+		}
+
+		/* common counter intr status */
+		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
+		status = FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status);
+		if (status) {
+			for_each_set_bit(idx, &status,
+					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+				event = drw_pmu->events[idx];
+				if (WARN_ON_ONCE(!event))
+					continue;
+				ali_drw_pmu_event_update(event);
+				ali_drw_pmu_event_set_period(event);
+			}
+
+			/* clear common counter intr status */
+			clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1);
+			writel(clr_status,
+			       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+		}
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			if (!(event->hw.state & PERF_HES_STOPPED))
+				ali_drw_pmu_enable_counter(event);
+		}
+		if (status)
+			ret = IRQ_HANDLED;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
+						      *pdev, int irq_num)
+{
+	int ret;
+	struct ali_drw_pmu_irq *irq;
+
+	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
+		if (irq->irq_num == irq_num
+		    && refcount_inc_not_zero(&irq->refcount))
+			return irq;
+	}
+
+	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
+	if (!irq)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&irq->pmus_node);
+
+	/* Pick one CPU to be the preferred one to use */
+	irq->cpu = smp_processor_id();
+	refcount_set(&irq->refcount, 1);
+
+	/*
+	 * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same
+	 * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
+	 * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not
+	 * share with other component.
+	 */
+	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
+			       IRQF_SHARED, dev_name(&pdev->dev), irq);
+	if (ret < 0) {
+		dev_err(&pdev->dev,
+			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
+		goto out_free;
+	}
+
+	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
+	if (ret)
+		goto out_free;
+
+	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num,
+					     &irq->node);
+	if (ret)
+		goto out_free;
+
+	irq->irq_num = irq_num;
+	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
+
+	return irq;
+
+out_free:
+	kfree(irq);
+	return ERR_PTR(ret);
+}
+
+static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
+				struct platform_device *pdev)
+{
+	int irq_num;
+	struct ali_drw_pmu_irq *irq;
+
+	/* Read and init IRQ */
+	irq_num = platform_get_irq(pdev, 0);
+	if (irq_num < 0)
+		return irq_num;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	if (IS_ERR(irq))
+		return PTR_ERR(irq);
+
+	drw_pmu->irq = irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	return 0;
+}
+
+static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
+{
+	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_del_rcu(&drw_pmu->pmus_node);
+
+	if (!refcount_dec_and_test(&irq->refcount)) {
+		mutex_unlock(&ali_drw_pmu_irqs_lock);
+		return;
+	}
+
+	list_del(&irq->irqs_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
+	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
+					    &irq->node);
+	kfree(irq);
+}
+
+static int ali_drw_pmu_event_init(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event *sibling;
+	struct device *dev = drw_pmu->pmu.dev;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (is_sampling_event(event)) {
+		dev_err(dev, "Sampling not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->attach_state & PERF_ATTACH_TASK) {
+		dev_err(dev, "Per-task counter cannot allocate!\n");
+		return -EOPNOTSUPP;
+	}
+
+	event->cpu = drw_pmu->cpu;
+	if (event->cpu < 0) {
+		dev_err(dev, "Per-task mode not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->group_leader != event &&
+	    !is_software_event(event->group_leader)) {
+		dev_err(dev, "driveway only allow one event!\n");
+		return -EINVAL;
+	}
+
+	for_each_sibling_event(sibling, event->group_leader) {
+		if (sibling != event && !is_software_event(sibling)) {
+			dev_err(dev, "driveway event not allowed!\n");
+			return -EINVAL;
+		}
+	}
+
+	/* reset all the pmu counters */
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	hwc->idx = -1;
+
+	return 0;
+}
+
+static void ali_drw_pmu_start(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	event->hw.state = 0;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		writel(ALI_DRW_PMU_CNT_START,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+		return;
+	}
+
+	ali_drw_pmu_event_set_period(event);
+	if (flags & PERF_EF_RELOAD) {
+		unsigned long prev_raw_count =
+		    local64_read(&event->hw.prev_count);
+		writel(prev_raw_count,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	}
+
+	ali_drw_pmu_enable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+}
+
+static void ali_drw_pmu_stop(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	if (event->hw.state & PERF_HES_STOPPED)
+		return;
+
+	if (GET_DRW_EVENTID(event) != ALI_DRW_PMU_CYCLE_EVT_ID)
+		ali_drw_pmu_disable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	ali_drw_pmu_event_update(event);
+	event->hw.state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int ali_drw_pmu_add(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = -1;
+	int evtid;
+
+	evtid = GET_DRW_EVENTID(event);
+
+	if (evtid != ALI_DRW_PMU_CYCLE_EVT_ID) {
+		idx = ali_drw_get_counter_idx(event);
+		if (idx < 0)
+			return idx;
+		drw_pmu->events[idx] = event;
+		drw_pmu->evtids[idx] = evtid;
+	}
+	hwc->idx = idx;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		ali_drw_pmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate our changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void ali_drw_pmu_del(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	ali_drw_pmu_stop(event, PERF_EF_UPDATE);
+
+	if (idx >= 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+		drw_pmu->events[idx] = NULL;
+		drw_pmu->evtids[idx] = 0;
+		clear_bit(idx, drw_pmu->used_mask);
+	}
+
+	perf_event_update_userpage(event);
+}
+
+static void ali_drw_pmu_read(struct perf_event *event)
+{
+	ali_drw_pmu_event_update(event);
+}
+
+static int ali_drw_pmu_probe(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu;
+	struct resource *res;
+	char *name;
+	int ret;
+
+	drw_pmu = devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL);
+	if (!drw_pmu)
+		return -ENOMEM;
+
+	drw_pmu->dev = &pdev->dev;
+	platform_set_drvdata(pdev, drw_pmu);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	drw_pmu->cfg_base = devm_ioremap_resource(&pdev->dev, res);
+	if (!drw_pmu->cfg_base)
+		return -ENOMEM;
+
+	name = devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx",
+			      (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT));
+	if (!name)
+		return -ENOMEM;
+
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	/* enable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL);
+
+	/* clearing interrupt status */
+	writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+
+	drw_pmu->cpu = smp_processor_id();
+
+	ret = ali_drw_pmu_init_irq(drw_pmu, pdev);
+	if (ret)
+		return ret;
+
+	drw_pmu->pmu = (struct pmu) {
+		.module		= THIS_MODULE,
+		.task_ctx_nr	= perf_invalid_context,
+		.event_init	= ali_drw_pmu_event_init,
+		.add		= ali_drw_pmu_add,
+		.del		= ali_drw_pmu_del,
+		.start		= ali_drw_pmu_start,
+		.stop		= ali_drw_pmu_stop,
+		.read		= ali_drw_pmu_read,
+		.attr_groups	= ali_drw_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
+	};
+
+	ret = perf_pmu_register(&drw_pmu->pmu, name, -1);
+	if (ret) {
+		dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n");
+		ali_drw_pmu_uninit_irq(drw_pmu);
+	}
+
+	return ret;
+}
+
+static int ali_drw_pmu_remove(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu = platform_get_drvdata(pdev);
+
+	/* disable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL);
+
+	ali_drw_pmu_uninit_irq(drw_pmu);
+	perf_pmu_unregister(&drw_pmu->pmu);
+
+	return 0;
+}
+
+static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct ali_drw_pmu_irq *irq;
+	struct ali_drw_pmu *drw_pmu;
+	unsigned int target;
+	int ret;
+	cpumask_t node_online_cpus;
+
+	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
+	if (cpu != irq->cpu)
+		return 0;
+
+	ret = cpumask_and(&node_online_cpus,
+			  cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
+	if (ret)
+		target = cpumask_any_but(&node_online_cpus, cpu);
+	else
+		target = cpumask_any_but(cpu_online_mask, cpu);
+
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	/* We're only reading, but this isn't the place to be involving RCU */
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
+		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
+	irq->cpu = target;
+
+	return 0;
+}
+
+/*
+ * Due to historical reasons, the HID used in the production environment is
+ * ARMHD700, so we leave ARMHD700 as Compatible ID.
+ */
+static const struct acpi_device_id ali_drw_acpi_match[] = {
+	{"BABA5000", 0},
+	{"ARMHD700", 0},
+	{}
+};
+
+MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
+
+static struct platform_driver ali_drw_pmu_driver = {
+	.driver = {
+		   .name = "ali_drw_pmu",
+		   .acpi_match_table = ali_drw_acpi_match,
+		   },
+	.probe = ali_drw_pmu_probe,
+	.remove = ali_drw_pmu_remove,
+};
+
+static int __init ali_drw_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "ali_drw_pmu:online",
+				      NULL, ali_drw_pmu_offline_cpu);
+
+	if (ret < 0) {
+		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
+		       ret);
+		return ret;
+	}
+	ali_drw_cpuhp_state_num = ret;
+
+	ret = platform_driver_register(&ali_drw_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+
+	return ret;
+}
+
+static void __exit ali_drw_pmu_exit(void)
+{
+	platform_driver_unregister(&ali_drw_pmu_driver);
+	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+}
+
+module_init(ali_drw_pmu_init);
+module_exit(ali_drw_pmu_exit);
+
+MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
+MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
+MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
+MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver");
+MODULE_LICENSE("GPL v2");
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (17 preceding siblings ...)
  2022-08-18  3:18 ` [PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-08-18  3:18 ` Shuai Xue
  2022-09-14  2:23 ` [RESEND PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-08-18  3:18 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Add maintainers for Alibaba PMU document and driver

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 MAINTAINERS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f512b430c7cb..64d95db9179a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -749,6 +749,12 @@ S:	Supported
 F:	drivers/infiniband/hw/erdma
 F:	include/uapi/rdma/erdma-abi.h
 
+ALIBABA PMU DRIVER
+M:	Shuai Xue <xueshuai@linux.alibaba.com>
+S:	Supported
+F:	Documentation/admin-guide/perf/alibaba_pmu.rst
+F:	drivers/perf/alibaba_uncore_dwr_pmu.c
+
 ALIENWARE WMI DRIVER
 L:	Dell.Client.Kernel@dell.com
 S:	Maintained
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-08-18  3:18 ` [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-09-06  4:58   ` Shuai Xue
  2022-09-22 20:33   ` Will Deacon
  1 sibling, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-09-06  4:58 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song


在 2022/8/18 AM11:18, Shuai Xue 写道:
> Hi, Will,
> 
> I am wondering that do you have any comments to this patch set? If/when you're
> happy with them, cloud you please queue them up?
> 
> Thank you :)
> 
> Cheers,
> Shuai.

Gentle ping. Any comment or suggestion is appreciated.

Best Regards,
Shuai

> 
> Changes since v3:
> - add Reviewed-by of Baolin
> - Rebase on Linux v6.0 rc1
> 
> Changes since v2:
> - relaxe build constraints and add COMPILE_TEST
> - explicitly include dependent headers
> - add Reviewed-by, thanks Jonathan Cameron and Randy Dunlap for their
>   valuable review and comments
> - Link: https://lore.kernel.org/linux-arm-kernel/20220715151310.90091-4-xueshuai@linux.alibaba.com/T/#m1116abc4b0bda1943ab436a45d95359f9bbe0858
> 
> Changes since v1:
> - add high level workflow about DDRC so that user cloud better understand
>   the PMU hardware mechanism
> - rewrite patch description and add interrupt sharing constraints
> - delete event perf prefix
> - add a condition to fix bug in ali_drw_pmu_isr
> - perfer CPU in the same Node when migrating irq
> - use FIELD_PREP and FIELD_GET to make code more readable
> - add T-Head HID and leave ARMHD700 as CID for compatibility
> - Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/
> 
> This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
> which custom-built by Alibaba Group's chip development business, T-Head.
> 
> Shuai Xue (3):
>   docs: perf: Add description for Alibaba's T-Head PMU driver
>   drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
>     SoC
>   MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
> 
>  .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
>  Documentation/admin-guide/perf/index.rst      |   1 +
>  MAINTAINERS                                   |   6 +
>  drivers/perf/Kconfig                          |   7 +
>  drivers/perf/Makefile                         |   1 +
>  drivers/perf/alibaba_uncore_drw_pmu.c         | 810 ++++++++++++++++++
>  6 files changed, 925 insertions(+)
>  create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
>  create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RESEND PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (18 preceding siblings ...)
  2022-08-18  3:18 ` [PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
@ 2022-09-14  2:23 ` Shuai Xue
  2022-09-14  2:23 ` [RESEND PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-09-14  2:23 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Hi, Will,

I am wondering that do you have any comments to this patch set? If/when you're
happy with them, cloud you please queue them up?

Thank you :)

Cheers,
Shuai.

Changes since v3:
- add Reviewed-by of Baolin
- Rebase on Linux v6.0 rc1

Changes since v2:
- relaxe build constraints and add COMPILE_TEST
- explicitly include dependent headers
- add Reviewed-by, thanks Jonathan Cameron and Randy Dunlap for their
  valuable review and comments
- Link: https://lore.kernel.org/linux-arm-kernel/20220715151310.90091-4-xueshuai@linux.alibaba.com/T/

Changes since v1:
- add high level workflow about DDRC so that user cloud better understand
  the PMU hardware mechanism
- rewrite patch description and add interrupt sharing constraints
- delete event perf prefix
- add a condition to fix bug in ali_drw_pmu_isr
- perfer CPU in the same Node when migrating irq
- use FIELD_PREP and FIELD_GET to make code more readable
- add T-Head HID and leave ARMHD700 as CID for compatibility
- Link: https://lore.kernel.org/linux-arm-kernel/eb50310d-d4a0-c7ff-7f1c-b4ffd919b10c@linux.alibaba.com/T/

This patchset adds support for Yitian 710 DDR Sub-System Driveway PMU driver,
which custom-built by Alibaba Group's chip development business, T-Head.

Shuai Xue (3):
  docs: perf: Add description for Alibaba's T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710
    SoC
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver

 .../admin-guide/perf/alibaba_pmu.rst          | 100 +++
 Documentation/admin-guide/perf/index.rst      |   1 +
 MAINTAINERS                                   |   6 +
 drivers/perf/Kconfig                          |   7 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c         | 810 ++++++++++++++++++
 6 files changed, 925 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RESEND PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (19 preceding siblings ...)
  2022-09-14  2:23 ` [RESEND PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-09-14  2:23 ` Shuai Xue
  2022-09-14  2:23 ` [RESEND PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
  2022-09-14  2:23 ` [RESEND PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-09-14  2:23 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Alibaba's T-Head SoC implements uncore PMU for performance and functional
debugging to facilitate system maintenance. Document it to provide guidance
on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 .../admin-guide/perf/alibaba_pmu.rst          | 100 ++++++++++++++++++
 Documentation/admin-guide/perf/index.rst      |   1 +
 2 files changed, 101 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst

diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
new file mode 100644
index 000000000000..11de998bb480
--- /dev/null
+++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
@@ -0,0 +1,100 @@
+=============================================================
+Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
+=============================================================
+
+The Yitian 710, custom-built by Alibaba Group's chip development business,
+T-Head, implements uncore PMU for performance and functional debugging to
+facilitate system maintenance.
+
+DDR Sub-System Driveway (DRW) PMU Driver
+=========================================
+
+Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
+is independent of others to service system memory requests. And one DDR5
+channel is split into two independent sub-channels. The DDR Sub-System Driveway
+implements separate PMUs for each sub-channel to monitor various performance
+metrics.
+
+The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
+For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
+sub-channels of the same channel in die 0. And the PMU device of die 1 is
+prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
+
+Each sub-channel has 36 PMU counters in total, which is classified into
+four groups:
+
+- Group 0: PMU Cycle Counter. This group has one pair of counters
+  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
+  based on DDRC core clock.
+
+- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
+  to count the total access number of either the eight bank groups in a
+  selected rank, or four ranks separately in the first 4 counters. The base
+  transfer unit is 64B.
+
+- Group 2: PMU Retry Counters. This group has 10 counters, that intend to
+  count the total retry number of each type of uncorrectable error.
+
+- Group 3: PMU Common Counters. This group has 16 counters, that are used
+  to count the common events.
+
+For now, the Driveway PMU driver only uses counters in group 0 and group 3.
+
+The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
+for connecting an SoC application bus to DDR memory devices. The DDRCTL
+receives transactions Host Interface (HIF) which is custom-defined by Synopsys.
+These transactions are queued internally and scheduled for access while
+satisfying the SDRAM protocol timing requirements, transaction priorities, and
+dependencies between the transactions. The DDRCTL in turn issues commands on
+the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
+to and from the SDRAM. The driveway PMUs have hardware logic to gather
+statistics and performance logging signals on HIF, DFI, etc.
+
+By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
+interface, we could calculate the bandwidth. Example usage of counting memory
+data bandwidth::
+
+  perf stat \
+    -e ali_drw_21000/hif_wr/ \
+    -e ali_drw_21000/hif_rd/ \
+    -e ali_drw_21000/hif_rmw/ \
+    -e ali_drw_21000/cycle/ \
+    -e ali_drw_21080/hif_wr/ \
+    -e ali_drw_21080/hif_rd/ \
+    -e ali_drw_21080/hif_rmw/ \
+    -e ali_drw_21080/cycle/ \
+    -e ali_drw_23000/hif_wr/ \
+    -e ali_drw_23000/hif_rd/ \
+    -e ali_drw_23000/hif_rmw/ \
+    -e ali_drw_23000/cycle/ \
+    -e ali_drw_23080/hif_wr/ \
+    -e ali_drw_23080/hif_rd/ \
+    -e ali_drw_23080/hif_rmw/ \
+    -e ali_drw_23080/cycle/ \
+    -e ali_drw_25000/hif_wr/ \
+    -e ali_drw_25000/hif_rd/ \
+    -e ali_drw_25000/hif_rmw/ \
+    -e ali_drw_25000/cycle/ \
+    -e ali_drw_25080/hif_wr/ \
+    -e ali_drw_25080/hif_rd/ \
+    -e ali_drw_25080/hif_rmw/ \
+    -e ali_drw_25080/cycle/ \
+    -e ali_drw_27000/hif_wr/ \
+    -e ali_drw_27000/hif_rd/ \
+    -e ali_drw_27000/hif_rmw/ \
+    -e ali_drw_27000/cycle/ \
+    -e ali_drw_27080/hif_wr/ \
+    -e ali_drw_27080/hif_rd/ \
+    -e ali_drw_27080/hif_rmw/ \
+    -e ali_drw_27080/cycle/ -- sleep 10
+
+The average DRAM bandwidth can be calculated as follows:
+
+- Read Bandwidth =  perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+
+Here, DDRC_WIDTH = 64 bytes.
+
+The current driver does not support sampling. So "perf record" is
+unsupported.  Also attach to a task is unsupported as the events are all
+uncore.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 9c9ece88ce53..793e1970bc05 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -18,3 +18,4 @@ Performance monitor support
    xgene-pmu
    arm_dsu_pmu
    thunderx2-pmu
+   alibaba_pmu
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [RESEND PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (20 preceding siblings ...)
  2022-09-14  2:23 ` [RESEND PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
@ 2022-09-14  2:23 ` Shuai Xue
  2022-09-14  2:23 ` [RESEND PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-09-14  2:23 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
DRAM and targets cloud computing and HPC.

Each PMU is registered as a device in /sys/bus/event_source/devices, and
users can select event to monitor in each sub-channel, independently. For
example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
sub-channels of the same channel in die 0. And the PMU device of die 1 is
prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.

Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
MPAM drivers successfully, add IRQF_SHARED flag.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 drivers/perf/Kconfig                  |   7 +
 drivers/perf/Makefile                 |   1 +
 drivers/perf/alibaba_uncore_drw_pmu.c | 810 ++++++++++++++++++++++++++
 3 files changed, 818 insertions(+)
 create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 1e2d69453771..44c07ea487f4 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -183,6 +183,13 @@ config APPLE_M1_CPU_PMU
 	  Provides support for the non-architectural CPU PMUs present on
 	  the Apple M1 SoCs and derivatives.
 
+config ALIBABA_UNCORE_DRW_PMU
+	tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver"
+	depends on ARM64 || COMPILE_TEST
+	help
+	  Support for Driveway PMU events monitoring on Yitian 710 DDR
+	  Sub-system.
+
 source "drivers/perf/hisilicon/Kconfig"
 
 config MARVELL_CN10K_DDR_PMU
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 57a279c61df5..050d04ee19dd 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
+obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
new file mode 100644
index 000000000000..82729b874f09
--- /dev/null
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -0,0 +1,810 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Alibaba DDR Sub-System Driveway PMU driver
+ *
+ * Copyright (C) 2022 Alibaba Inc
+ */
+
+#define ALI_DRW_PMUNAME		"ali_drw"
+#define ALI_DRW_DRVNAME		ALI_DRW_PMUNAME "_pmu"
+#define pr_fmt(fmt)		ALI_DRW_DRVNAME ": " fmt
+
+#include <linux/acpi.h>
+#include <linux/bitfield.h>
+#include <linux/bitmap.h>
+#include <linux/bitops.h>
+#include <linux/cpuhotplug.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+#include <linux/rculist.h>
+#include <linux/refcount.h>
+
+
+#define ALI_DRW_PMU_COMMON_MAX_COUNTERS			16
+#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE	19
+
+#define ALI_DRW_PMU_PA_SHIFT			12
+#define ALI_DRW_PMU_CNT_INIT			0x00000000
+#define ALI_DRW_CNT_MAX_PERIOD			0xffffffff
+#define ALI_DRW_PMU_CYCLE_EVT_ID		0x80
+
+#define ALI_DRW_PMU_CNT_CTRL			0xC00
+#define ALI_DRW_PMU_CNT_RST			BIT(2)
+#define ALI_DRW_PMU_CNT_STOP			BIT(1)
+#define ALI_DRW_PMU_CNT_START			BIT(0)
+
+#define ALI_DRW_PMU_CNT_STATE			0xC04
+#define ALI_DRW_PMU_TEST_CTRL			0xC08
+#define ALI_DRW_PMU_CNT_PRELOAD			0xC0C
+
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK		GENMASK(23, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK		GENMASK(31, 0)
+#define ALI_DRW_PMU_CYCLE_CNT_HIGH		0xC10
+#define ALI_DRW_PMU_CYCLE_CNT_LOW		0xC14
+
+/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_EVENT_SEL0			0xC68
+/* counter 0-3 use sel0, counter 4-7 use sel1...*/
+#define ALI_DRW_PMU_EVENT_SELn(n) \
+	(ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4)
+#define ALI_DRW_PMCOM_CNT_EN			BIT(7)
+#define ALI_DRW_PMCOM_CNT_EVENT_MASK		GENMASK(5, 0)
+#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \
+	(8 * (n % 4))
+
+/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte stride */
+#define ALI_DRW_PMU_COMMON_COUNTER0		0xC78
+#define ALI_DRW_PMU_COMMON_COUNTERn(n) \
+	(ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n))
+
+#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL		0xCB8
+#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL		0xCBC
+#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS	0xCC0
+#define ALI_DRW_PMU_OV_INTR_CLR			0xCC4
+#define ALI_DRW_PMU_OV_INTR_STATUS		0xCC8
+#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK		GENMASK(23, 8)
+#define ALI_DRW_PMBW_CNT_OV_INTR_MASK		GENMASK(7, 0)
+#define ALI_DRW_PMU_OV_INTR_MASK		GENMASK_ULL(63, 0)
+
+static int ali_drw_cpuhp_state_num;
+
+static LIST_HEAD(ali_drw_pmu_irqs);
+static DEFINE_MUTEX(ali_drw_pmu_irqs_lock);
+
+struct ali_drw_pmu_irq {
+	struct hlist_node node;
+	struct list_head irqs_node;
+	struct list_head pmus_node;
+	int irq_num;
+	int cpu;
+	refcount_t refcount;
+};
+
+struct ali_drw_pmu {
+	void __iomem *cfg_base;
+	struct device *dev;
+
+	struct list_head pmus_node;
+	struct ali_drw_pmu_irq *irq;
+	int irq_num;
+	int cpu;
+	DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS);
+	struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+	int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS];
+
+	struct pmu pmu;
+};
+
+#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu))
+
+#define DRW_CONFIG_EVENTID		GENMASK(7, 0)
+#define GET_DRW_EVENTID(event)	FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr.config)
+
+static ssize_t ali_drw_pmu_format_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+/*
+ * PMU event attributes
+ */
+static ssize_t ali_drw_pmu_event_show(struct device *dev,
+			       struct device_attribute *attr, char *page)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+
+	return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+#define ALI_DRW_PMU_ATTR(_name, _func, _config)                            \
+		(&((struct dev_ext_attribute[]) {                               \
+				{ __ATTR(_name, 0444, _func, NULL), (void *)_config }   \
+		})[0].attr.attr)
+
+#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config)            \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config)
+#define ALI_DRW_PMU_EVENT_ATTR(_name, _config)             \
+	ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config)
+
+static struct attribute *ali_drw_pmu_events_attrs[] = {
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr,			0x0),
+	ALI_DRW_PMU_EVENT_ATTR(hif_wr,				0x1),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rd,				0x2),
+	ALI_DRW_PMU_EVENT_ATTR(hif_rmw,				0x3),
+	ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd,			0x4),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles,		0x7),
+	ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles,		0x8),
+	ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical,		0x9),
+	ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical,		0xA),
+	ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical,		0xB),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_activate,			0xC),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr,			0xD),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate,		0xE),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_rd,			0xF),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_wr,			0x10),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_mwr,			0x11),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_precharge,			0x12),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr,		0x13),
+	ALI_DRW_PMU_EVENT_ATTR(precharge_for_other,		0x14),
+	ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions,		0x15),
+	ALI_DRW_PMU_EVENT_ATTR(write_combine,			0x16),
+	ALI_DRW_PMU_EVENT_ATTR(war_hazard,			0x17),
+	ALI_DRW_PMU_EVENT_ATTR(raw_hazard,			0x18),
+	ALI_DRW_PMU_EVENT_ATTR(waw_hazard,			0x19),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0,		0x1A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1,		0x1B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2,		0x1C),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3,		0x1D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0,	0x1E),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1,	0x1F),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2,	0x20),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3,	0x21),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0,		0x26),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1,		0x27),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2,		0x28),
+	ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3,		0x29),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_refresh,			0x2A),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref,			0x2B),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode,			0x2D),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl,			0x2E),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30),
+	ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc,		0x34),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr,		0x35),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr,			0x36),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart,			0x37),
+	ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch,			0x38),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txreq,			0x39),
+	ALI_DRW_PMU_EVENT_ATTR(chi_txdat,			0x3A),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxdat,			0x3B),
+	ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp,			0x3C),
+	ALI_DRW_PMU_EVENT_ATTR(tsz_vio,				0x3D),
+	ALI_DRW_PMU_EVENT_ATTR(cycle,				0x80),
+	NULL,
+};
+
+static struct attribute_group ali_drw_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = ali_drw_pmu_events_attrs,
+};
+
+static struct attribute *ali_drw_pmu_format_attr[] = {
+	ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_format_group = {
+	.name = "format",
+	.attrs = ali_drw_pmu_format_attr,
+};
+
+static ssize_t ali_drw_pmu_cpumask_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu));
+}
+
+static struct device_attribute ali_drw_pmu_cpumask_attr =
+		__ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL);
+
+static struct attribute *ali_drw_pmu_cpumask_attrs[] = {
+	&ali_drw_pmu_cpumask_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group ali_drw_pmu_cpumask_attr_group = {
+	.attrs = ali_drw_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *ali_drw_pmu_attr_groups[] = {
+	&ali_drw_pmu_events_attr_group,
+	&ali_drw_pmu_cpumask_attr_group,
+	&ali_drw_pmu_format_group,
+	NULL,
+};
+
+/* find a counter for event, then in add func, hw.idx will equal to counter */
+static int ali_drw_get_counter_idx(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int idx;
+
+	for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) {
+		if (!test_and_set_bit(idx, drw_pmu->used_mask))
+			return idx;
+	}
+
+	/* The counters are all in use. */
+	return -EBUSY;
+}
+
+static u64 ali_drw_pmu_read_counter(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	u64 cycle_high, cycle_low;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		cycle_high = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH);
+		cycle_high &= ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK;
+		cycle_low = readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW);
+		cycle_low &= ALI_DRW_PMU_CYCLE_CNT_LOW_MASK;
+		return (cycle_high << 32 | cycle_low);
+	}
+
+	return readl(drw_pmu->cfg_base +
+		     ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx));
+}
+
+static void ali_drw_pmu_event_update(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 delta, prev, now;
+
+	do {
+		prev = local64_read(&hwc->prev_count);
+		now = ali_drw_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
+
+	/* handle overflow. */
+	delta = now - prev;
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID)
+		delta &= ALI_DRW_PMU_OV_INTR_MASK;
+	else
+		delta &= ALI_DRW_CNT_MAX_PERIOD;
+	local64_add(delta, &event->count);
+}
+
+static void ali_drw_pmu_event_set_period(struct perf_event *event)
+{
+	u64 pre_val;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	/* set a preload counter for test purpose */
+	writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+
+	/* set conunter initial value */
+	pre_val = ALI_DRW_PMU_CNT_INIT;
+	writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	local64_set(&event->hw.prev_count, pre_val);
+
+	/* set sel mode to zero to start test */
+	writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL);
+}
+
+static void ali_drw_pmu_enable_counter(struct perf_event *event)
+{
+	u32 val, subval, reg, shift;
+	int counter = event->hw.idx;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static void ali_drw_pmu_disable_counter(struct perf_event *event)
+{
+	u32 val, reg, subval, shift;
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	int counter = event->hw.idx;
+
+	reg = ALI_DRW_PMU_EVENT_SELn(counter);
+	val = readl(drw_pmu->cfg_base + reg);
+	subval = FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) |
+		 FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0);
+
+	shift = ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter);
+	val &= ~(GENMASK(7, 0) << shift);
+	val |= subval << shift;
+
+	writel(val, drw_pmu->cfg_base + reg);
+}
+
+static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
+{
+	struct ali_drw_pmu_irq *irq = data;
+	struct ali_drw_pmu *drw_pmu;
+	irqreturn_t ret = IRQ_NONE;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) {
+		unsigned long status, clr_status;
+		struct perf_event *event;
+		unsigned int idx;
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			ali_drw_pmu_disable_counter(event);
+		}
+
+		/* common counter intr status */
+		status = readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS);
+		status = FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status);
+		if (status) {
+			for_each_set_bit(idx, &status,
+					 ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+				event = drw_pmu->events[idx];
+				if (WARN_ON_ONCE(!event))
+					continue;
+				ali_drw_pmu_event_update(event);
+				ali_drw_pmu_event_set_period(event);
+			}
+
+			/* clear common counter intr status */
+			clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1);
+			writel(clr_status,
+			       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+		}
+
+		for (idx = 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) {
+			event = drw_pmu->events[idx];
+			if (!event)
+				continue;
+			if (!(event->hw.state & PERF_HES_STOPPED))
+				ali_drw_pmu_enable_counter(event);
+		}
+		if (status)
+			ret = IRQ_HANDLED;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_device
+						      *pdev, int irq_num)
+{
+	int ret;
+	struct ali_drw_pmu_irq *irq;
+
+	list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) {
+		if (irq->irq_num == irq_num
+		    && refcount_inc_not_zero(&irq->refcount))
+			return irq;
+	}
+
+	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
+	if (!irq)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&irq->pmus_node);
+
+	/* Pick one CPU to be the preferred one to use */
+	irq->cpu = smp_processor_id();
+	refcount_set(&irq->refcount, 1);
+
+	/*
+	 * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same
+	 * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers
+	 * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not
+	 * share with other component.
+	 */
+	ret = devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr,
+			       IRQF_SHARED, dev_name(&pdev->dev), irq);
+	if (ret < 0) {
+		dev_err(&pdev->dev,
+			"Fail to request IRQ:%d ret:%d\n", irq_num, ret);
+		goto out_free;
+	}
+
+	ret = irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu));
+	if (ret)
+		goto out_free;
+
+	ret = cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num,
+					     &irq->node);
+	if (ret)
+		goto out_free;
+
+	irq->irq_num = irq_num;
+	list_add(&irq->irqs_node, &ali_drw_pmu_irqs);
+
+	return irq;
+
+out_free:
+	kfree(irq);
+	return ERR_PTR(ret);
+}
+
+static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu,
+				struct platform_device *pdev)
+{
+	int irq_num;
+	struct ali_drw_pmu_irq *irq;
+
+	/* Read and init IRQ */
+	irq_num = platform_get_irq(pdev, 0);
+	if (irq_num < 0)
+		return irq_num;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	irq = __ali_drw_pmu_init_irq(pdev, irq_num);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	if (IS_ERR(irq))
+		return PTR_ERR(irq);
+
+	drw_pmu->irq = irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	return 0;
+}
+
+static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu)
+{
+	struct ali_drw_pmu_irq *irq = drw_pmu->irq;
+
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_del_rcu(&drw_pmu->pmus_node);
+
+	if (!refcount_dec_and_test(&irq->refcount)) {
+		mutex_unlock(&ali_drw_pmu_irqs_lock);
+		return;
+	}
+
+	list_del(&irq->irqs_node);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL));
+	cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num,
+					    &irq->node);
+	kfree(irq);
+}
+
+static int ali_drw_pmu_event_init(struct perf_event *event)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event *sibling;
+	struct device *dev = drw_pmu->pmu.dev;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (is_sampling_event(event)) {
+		dev_err(dev, "Sampling not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->attach_state & PERF_ATTACH_TASK) {
+		dev_err(dev, "Per-task counter cannot allocate!\n");
+		return -EOPNOTSUPP;
+	}
+
+	event->cpu = drw_pmu->cpu;
+	if (event->cpu < 0) {
+		dev_err(dev, "Per-task mode not supported!\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->group_leader != event &&
+	    !is_software_event(event->group_leader)) {
+		dev_err(dev, "driveway only allow one event!\n");
+		return -EINVAL;
+	}
+
+	for_each_sibling_event(sibling, event->group_leader) {
+		if (sibling != event && !is_software_event(sibling)) {
+			dev_err(dev, "driveway event not allowed!\n");
+			return -EINVAL;
+		}
+	}
+
+	/* reset all the pmu counters */
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	hwc->idx = -1;
+
+	return 0;
+}
+
+static void ali_drw_pmu_start(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	event->hw.state = 0;
+
+	if (GET_DRW_EVENTID(event) == ALI_DRW_PMU_CYCLE_EVT_ID) {
+		writel(ALI_DRW_PMU_CNT_START,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+		return;
+	}
+
+	ali_drw_pmu_event_set_period(event);
+	if (flags & PERF_EF_RELOAD) {
+		unsigned long prev_raw_count =
+		    local64_read(&event->hw.prev_count);
+		writel(prev_raw_count,
+		       drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD);
+	}
+
+	ali_drw_pmu_enable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+}
+
+static void ali_drw_pmu_stop(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+
+	if (event->hw.state & PERF_HES_STOPPED)
+		return;
+
+	if (GET_DRW_EVENTID(event) != ALI_DRW_PMU_CYCLE_EVT_ID)
+		ali_drw_pmu_disable_counter(event);
+
+	writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	ali_drw_pmu_event_update(event);
+	event->hw.state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int ali_drw_pmu_add(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = -1;
+	int evtid;
+
+	evtid = GET_DRW_EVENTID(event);
+
+	if (evtid != ALI_DRW_PMU_CYCLE_EVT_ID) {
+		idx = ali_drw_get_counter_idx(event);
+		if (idx < 0)
+			return idx;
+		drw_pmu->events[idx] = event;
+		drw_pmu->evtids[idx] = evtid;
+	}
+	hwc->idx = idx;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		ali_drw_pmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate our changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void ali_drw_pmu_del(struct perf_event *event, int flags)
+{
+	struct ali_drw_pmu *drw_pmu = to_ali_drw_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	ali_drw_pmu_stop(event, PERF_EF_UPDATE);
+
+	if (idx >= 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) {
+		drw_pmu->events[idx] = NULL;
+		drw_pmu->evtids[idx] = 0;
+		clear_bit(idx, drw_pmu->used_mask);
+	}
+
+	perf_event_update_userpage(event);
+}
+
+static void ali_drw_pmu_read(struct perf_event *event)
+{
+	ali_drw_pmu_event_update(event);
+}
+
+static int ali_drw_pmu_probe(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu;
+	struct resource *res;
+	char *name;
+	int ret;
+
+	drw_pmu = devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL);
+	if (!drw_pmu)
+		return -ENOMEM;
+
+	drw_pmu->dev = &pdev->dev;
+	platform_set_drvdata(pdev, drw_pmu);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	drw_pmu->cfg_base = devm_ioremap_resource(&pdev->dev, res);
+	if (!drw_pmu->cfg_base)
+		return -ENOMEM;
+
+	name = devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx",
+			      (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT));
+	if (!name)
+		return -ENOMEM;
+
+	writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL);
+
+	/* enable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL);
+
+	/* clearing interrupt status */
+	writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
+
+	drw_pmu->cpu = smp_processor_id();
+
+	ret = ali_drw_pmu_init_irq(drw_pmu, pdev);
+	if (ret)
+		return ret;
+
+	drw_pmu->pmu = (struct pmu) {
+		.module		= THIS_MODULE,
+		.task_ctx_nr	= perf_invalid_context,
+		.event_init	= ali_drw_pmu_event_init,
+		.add		= ali_drw_pmu_add,
+		.del		= ali_drw_pmu_del,
+		.start		= ali_drw_pmu_start,
+		.stop		= ali_drw_pmu_stop,
+		.read		= ali_drw_pmu_read,
+		.attr_groups	= ali_drw_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
+	};
+
+	ret = perf_pmu_register(&drw_pmu->pmu, name, -1);
+	if (ret) {
+		dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n");
+		ali_drw_pmu_uninit_irq(drw_pmu);
+	}
+
+	return ret;
+}
+
+static int ali_drw_pmu_remove(struct platform_device *pdev)
+{
+	struct ali_drw_pmu *drw_pmu = platform_get_drvdata(pdev);
+
+	/* disable the generation of interrupt by all common counters */
+	writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK,
+	       drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL);
+
+	ali_drw_pmu_uninit_irq(drw_pmu);
+	perf_pmu_unregister(&drw_pmu->pmu);
+
+	return 0;
+}
+
+static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct ali_drw_pmu_irq *irq;
+	struct ali_drw_pmu *drw_pmu;
+	unsigned int target;
+	int ret;
+	cpumask_t node_online_cpus;
+
+	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
+	if (cpu != irq->cpu)
+		return 0;
+
+	ret = cpumask_and(&node_online_cpus,
+			  cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
+	if (ret)
+		target = cpumask_any_but(&node_online_cpus, cpu);
+	else
+		target = cpumask_any_but(cpu_online_mask, cpu);
+
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	/* We're only reading, but this isn't the place to be involving RCU */
+	mutex_lock(&ali_drw_pmu_irqs_lock);
+	list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node)
+		perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target);
+	mutex_unlock(&ali_drw_pmu_irqs_lock);
+
+	WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target)));
+	irq->cpu = target;
+
+	return 0;
+}
+
+/*
+ * Due to historical reasons, the HID used in the production environment is
+ * ARMHD700, so we leave ARMHD700 as Compatible ID.
+ */
+static const struct acpi_device_id ali_drw_acpi_match[] = {
+	{"BABA5000", 0},
+	{"ARMHD700", 0},
+	{}
+};
+
+MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match);
+
+static struct platform_driver ali_drw_pmu_driver = {
+	.driver = {
+		   .name = "ali_drw_pmu",
+		   .acpi_match_table = ali_drw_acpi_match,
+		   },
+	.probe = ali_drw_pmu_probe,
+	.remove = ali_drw_pmu_remove,
+};
+
+static int __init ali_drw_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "ali_drw_pmu:online",
+				      NULL, ali_drw_pmu_offline_cpu);
+
+	if (ret < 0) {
+		pr_err("DRW Driveway PMU: setup hotplug failed, ret = %d\n",
+		       ret);
+		return ret;
+	}
+	ali_drw_cpuhp_state_num = ret;
+
+	ret = platform_driver_register(&ali_drw_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+
+	return ret;
+}
+
+static void __exit ali_drw_pmu_exit(void)
+{
+	platform_driver_unregister(&ali_drw_pmu_driver);
+	cpuhp_remove_multi_state(ali_drw_cpuhp_state_num);
+}
+
+module_init(ali_drw_pmu_init);
+module_exit(ali_drw_pmu_exit);
+
+MODULE_AUTHOR("Hongbo Yao <yaohongbo@linux.alibaba.com>");
+MODULE_AUTHOR("Neng Chen <nengchen@linux.alibaba.com>");
+MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>");
+MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver");
+MODULE_LICENSE("GPL v2");
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [RESEND PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
                   ` (21 preceding siblings ...)
  2022-09-14  2:23 ` [RESEND PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
@ 2022-09-14  2:23 ` Shuai Xue
  22 siblings, 0 replies; 41+ messages in thread
From: Shuai Xue @ 2022-09-14  2:23 UTC (permalink / raw)
  To: will, Jonathan.Cameron, linux-arm-kernel, linux-kernel
  Cc: rdunlap, robin.murphy, mark.rutland, baolin.wang, zhuo.song, xueshuai

Add maintainers for Alibaba PMU document and driver

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 MAINTAINERS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 936490dcc97b..e8f095c2f5ac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -749,6 +749,12 @@ S:	Supported
 F:	drivers/infiniband/hw/erdma
 F:	include/uapi/rdma/erdma-abi.h
 
+ALIBABA PMU DRIVER
+M:	Shuai Xue <xueshuai@linux.alibaba.com>
+S:	Supported
+F:	Documentation/admin-guide/perf/alibaba_pmu.rst
+F:	drivers/perf/alibaba_uncore_dwr_pmu.c
+
 ALIENWARE WMI DRIVER
 L:	Dell.Client.Kernel@dell.com
 S:	Maintained
-- 
2.20.1.12.g72788fdb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  2022-08-18  3:18 ` [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
  2022-09-06  4:58   ` Shuai Xue
@ 2022-09-22 20:33   ` Will Deacon
  1 sibling, 0 replies; 41+ messages in thread
From: Will Deacon @ 2022-09-22 20:33 UTC (permalink / raw)
  To: linux-kernel, Jonathan.Cameron, Shuai Xue, linux-arm-kernel
  Cc: catalin.marinas, kernel-team, Will Deacon, robin.murphy,
	mark.rutland, rdunlap, zhuo.song, baolin.wang

On Thu, 18 Aug 2022 11:18:19 +0800, Shuai Xue wrote:
> I am wondering that do you have any comments to this patch set? If/when you're
> happy with them, cloud you please queue them up?
> 
> Thank you :)
> 
> Cheers,
> Shuai.
> 
> [...]

Applied to will (for-next/perf), thanks!

[1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
      https://git.kernel.org/will/c/a6f92909d6bb
[2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
      https://git.kernel.org/will/c/cf7b61073e45
[3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
      https://git.kernel.org/will/c/d813a19e7d2e

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2022-09-22 20:34 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-17 11:18 [PATCH v1 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-06-17 11:18 ` [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
2022-06-20 11:49   ` Jonathan Cameron
2022-06-23  8:34     ` Shuai Xue
2022-07-04  3:53   ` kernel test robot
2022-06-17 11:18 ` [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-06-17 19:59   ` kernel test robot
2022-06-20 13:50   ` Jonathan Cameron
2022-06-24  7:04     ` Shuai Xue
2022-06-17 11:18 ` [PATCH v1 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
2022-06-28  3:16 ` [PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-06-28  3:16 ` [PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
2022-06-28  3:16 ` [PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-06-28  3:16 ` [PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
2022-07-15 15:13 ` [RESEND PATCH v2 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-07-15 15:13 ` [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
2022-07-19 12:35   ` Jonathan Cameron
2022-07-20  1:41     ` Shuai Xue
2022-07-15 15:13 ` [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-07-15 19:05   ` Randy Dunlap
2022-07-17 13:02     ` Shuai Xue
2022-07-19 13:19   ` Jonathan Cameron
2022-07-20  6:11     ` Shuai Xue
2022-07-15 15:13 ` [RESEND PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
2022-07-20  6:58 ` [PATCH v3 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-08-05  9:07   ` Shuai Xue
2022-08-17  8:15   ` Baolin Wang
2022-08-17  8:19     ` Shuai Xue
2022-07-20  6:58 ` [PATCH v3 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
2022-07-20  6:58 ` [PATCH v3 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-07-20  6:58 ` [PATCH v3 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
2022-08-18  3:18 ` [PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-09-06  4:58   ` Shuai Xue
2022-09-22 20:33   ` Will Deacon
2022-08-18  3:18 ` [PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
2022-08-18  3:18 ` [PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-08-18  3:18 ` [PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue
2022-09-14  2:23 ` [RESEND PATCH v4 0/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-09-14  2:23 ` [RESEND PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Shuai Xue
2022-09-14  2:23 ` [RESEND PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Shuai Xue
2022-09-14  2:23 ` [RESEND PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Shuai Xue

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).