* [RFC V3 0/3] arm_pmu: acpi: variant support and QCOM Falkor extensions
@ 2018-06-22 19:46 ` Agustin Vega-Frias
  0 siblings, 0 replies; 10+ messages in thread
From: Agustin Vega-Frias @ 2018-06-22 19:46 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, linux-acpi, Will Deacon,
	Mark Rutland, Jeremy Linton, Catalin Marinas, Marc Zyngier,
	Lorenzo Pieralisi, Rafael J. Wysocki
  Cc: timur, agustinv

This series is a complete redesign of V1 of the QCOM Falkor extensions [1].
It introduces a probe table based on the HID of a device nested under the CPU
device to allow variant detection and arm_pmu customization.

The first patch adds an additional section at the end of each ACPI probe table.
This allows probe tables to be sentinel-delimited and better accommodate some
APIs that require such tables.

The second patch adds the PMUv3 ACPI probe table and plumbing to allow drivers
to plug into the ACPI PMUv3 probe sequence.

The third patch adds the QCOM Falkor extensions using the new probe table.

If this is found to be a reasonable extension approach, other patches will
be added to the series to build on the base QCOM extensions.

[1] https://lkml.org/lkml/2017/3/1/540

Changes since V2:
- Address V2 comments, which resulted in removing all uses of the PMU lock.

Changes since V1:
- Redesign as a separate module by adding variant detection support.

Agustin Vega-Frias (3):
  ACPI: add support for sentinel-delimited probe tables
  arm_pmu: acpi: add support for CPU PMU variant detection
  perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

 drivers/perf/Makefile             |   2 +-
 drivers/perf/arm_pmu_acpi.c       |  27 ++++
 drivers/perf/qcom_arm_pmu.c       | 310 ++++++++++++++++++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h |   4 +-
 include/linux/acpi.h              |  11 ++
 include/linux/perf/arm_pmu.h      |   1 +
 6 files changed, 353 insertions(+), 2 deletions(-)
 create mode 100644 drivers/perf/qcom_arm_pmu.c

--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

* [RFC V2 1/3] ACPI: add support for sentinel-delimited probe tables
  2018-06-22 19:46 ` Agustin Vega-Frias
@ 2018-06-22 19:46   ` Agustin Vega-Frias
  -1 siblings, 0 replies; 10+ messages in thread
From: Agustin Vega-Frias @ 2018-06-22 19:46 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, linux-acpi, Will Deacon,
	Mark Rutland, Jeremy Linton, Catalin Marinas, Marc Zyngier,
	Lorenzo Pieralisi, Rafael J. Wysocki
  Cc: timur, agustinv

Tables declared with the ACPI_PROBE_TABLE linker macro are typically
traversed by using the start and end symbols created by the linker
script. However, there are some APIs that use sentinel-delimited
tables (e.g. acpi_match_device). To better support these APIs an
additional section is added at the end of the probe table. This
section can be used to add a sentinel for tables that require it.

Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
---
 include/asm-generic/vmlinux.lds.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index af24057..5894049 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -219,7 +219,8 @@
 	. = ALIGN(8);							\
 	VMLINUX_SYMBOL(__##name##_acpi_probe_table) = .;		\
 	KEEP(*(__##name##_acpi_probe_table))				\
-	VMLINUX_SYMBOL(__##name##_acpi_probe_table_end) = .;
+	VMLINUX_SYMBOL(__##name##_acpi_probe_table_end) = .;		\
+	KEEP(*(__##name##_acpi_probe_table_end))
 #else
 #define ACPI_PROBE_TABLE(name)
 #endif
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
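
Editorial sketch for patch 1/3: the commit message contrasts tables walked
between linker-provided start and end symbols with tables that end in a
sentinel entry. The user-space C below illustrates only that difference. The
names demo_id, match_bounded, match_sentinel and demo_table, and the HID
"DEMO0001", are hypothetical and are not taken from the kernel or the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct acpi_device_id; not the kernel type. */
struct demo_id {
	const char *id;			/* an empty string marks the sentinel */
	unsigned long driver_data;
};

/* Bounded walk: the caller supplies both ends, as the linker symbols do. */
static const struct demo_id *match_bounded(const struct demo_id *start,
					   const struct demo_id *end,
					   const char *hid)
{
	const struct demo_id *p;

	assert(hid != NULL);
	for (p = start; p < end; p++)
		if (strcmp(p->id, hid) == 0)
			return p;
	return NULL;
}

/* Sentinel walk: only the start is known, so stop at the empty id. */
static const struct demo_id *match_sentinel(const struct demo_id *table,
					    const char *hid)
{
	const struct demo_id *p;

	assert(hid != NULL);
	for (p = table; p->id[0] != '\0'; p++)
		if (strcmp(p->id, hid) == 0)
			return p;
	return NULL;
}

/*
 * Two entries followed by a sentinel. In the patch the sentinel would come
 * from the extra section placed after the table's end symbol; here it is
 * simply the last array element.
 */
static const struct demo_id demo_table[] = {
	{ "QCOM8150", 1 },
	{ "DEMO0001", 2 },	/* made-up HID, for illustration only */
	{ "", 0 },		/* sentinel */
};
```

Calling match_sentinel(demo_table, "QCOM8150") needs no length or end
pointer, which is the property that APIs such as acpi_match_device rely on.
The bounded variant needs the end passed in, for example demo_table + 2.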

* [RFC V2 2/3] arm_pmu: acpi: add support for CPU PMU variant detection
  2018-06-22 19:46 ` Agustin Vega-Frias
@ 2018-06-22 19:46   ` Agustin Vega-Frias
  -1 siblings, 0 replies; 10+ messages in thread
From: Agustin Vega-Frias @ 2018-06-22 19:46 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, linux-acpi, Will Deacon,
	Mark Rutland, Jeremy Linton, Catalin Marinas, Marc Zyngier,
	Lorenzo Pieralisi, Rafael J. Wysocki
  Cc: timur, agustinv

DT allows CPU PMU variant detection via the PMU device compatible
property. ACPI does not have an equivalent mechanism so we introduce
a probe table to allow this via a device nested inside the CPU device
in the DSDT:

Device (CPU0)
{
    Name (_HID, "ACPI0007" /* Processor Device */)
    ...
    Device (PMU0)
    {
        Name (_HID, "QCOM8150") /* Qualcomm Falkor PMU device */

        /*
         * The device might also contain _DSD properties to indicate other
         * IMPLEMENTATION DEFINED PMU features.
         */
        Name (_DSD, Package ()
        {
            ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
            Package ()
            {
                ...
            }
        })
    }
}

With this in place we can declare the variant:

    ACPI_DECLARE_PMU_VARIANT(qcom_falkor, "QCOM8150", falkor_pmu_init);

The init function is called after the default PMU initialization and is
passed a pointer to the arm_pmu structure and a pointer to the PMU device.
The init function can then override arm_pmu callbacks and attributes and
query more properties from the PMU device.

Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
---
 drivers/perf/arm_pmu_acpi.c       | 27 +++++++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h |  1 +
 include/linux/acpi.h              | 11 +++++++++++
 include/linux/perf/arm_pmu.h      |  1 +
 4 files changed, 40 insertions(+)

diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 0f19751..6b0ca71 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -220,6 +220,26 @@ static int arm_pmu_acpi_cpu_starting(unsigned int cpu)
 	return 0;
 }
 
+/*
+ * Check if the given child device of the CPU device matches a PMU variant
+ * device declared with ACPI_DECLARE_PMU_VARIANT. If so, pass the arm_pmu
+ * structure and the matching device for further initialization.
+ */
+static int arm_pmu_variant_init(struct device *dev, void *data)
+{
+	extern struct acpi_device_id ACPI_PROBE_TABLE(pmu);
+	unsigned int cpu = *((unsigned int *)data);
+	const struct acpi_device_id *id;
+
+	id = acpi_match_device(&ACPI_PROBE_TABLE(pmu), dev);
+	if (id) {
+		armpmu_acpi_init_fn fn = (armpmu_acpi_init_fn)id->driver_data;
+
+		return fn(per_cpu(probed_pmus, cpu), dev);
+	}
+	return 0;
+}
+
 int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
 {
 	int pmu_idx = 0;
@@ -240,6 +260,7 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
 	 */
 	for_each_possible_cpu(cpu) {
 		struct arm_pmu *pmu = per_cpu(probed_pmus, cpu);
+		struct device *dev = get_cpu_device(cpu);
 		char *base_name;
 
 		if (!pmu || pmu->name)
@@ -254,6 +275,10 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
 			return ret;
 		}
 
+		ret = device_for_each_child(dev, &cpu, arm_pmu_variant_init);
+		if (ret == -ENODEV)
+			pr_warn("Failed PMU re-init, fallback to plain PMUv3");
+
 		base_name = pmu->name;
 		pmu->name = kasprintf(GFP_KERNEL, "%s_%d", base_name, pmu_idx++);
 		if (!pmu->name) {
@@ -290,3 +315,5 @@ static int arm_pmu_acpi_init(void)
 	return ret;
 }
 subsys_initcall(arm_pmu_acpi_init)
+
+ACPI_DECLARE_PMU_SENTINEL();
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5894049..f1be62a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -600,6 +600,7 @@
 	IRQCHIP_OF_MATCH_TABLE()					\
 	ACPI_PROBE_TABLE(irqchip)					\
 	ACPI_PROBE_TABLE(timer)						\
+	ACPI_PROBE_TABLE(pmu)						\
 	EARLYCON_TABLE()
 
 #define INIT_TEXT							\
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 15bfb15..9c410cf 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1153,6 +1153,17 @@ struct acpi_probe_entry {
 					  (&ACPI_PROBE_TABLE_END(t) -	\
 					   &ACPI_PROBE_TABLE(t)));	\
 	})
+
+#define ACPI_DECLARE_PMU_VARIANT(name, hid, init_fn)			\
+	static const struct acpi_device_id __acpi_probe_##name		\
+		__used __section(__pmu_acpi_probe_table)		\
+		= { .id = hid, .driver_data = (kernel_ulong_t)init_fn }
+
+#define ACPI_DECLARE_PMU_SENTINEL()					\
+	static const struct acpi_device_id __acpi_probe_sentinel	\
+		__used __section(__pmu_acpi_probe_table_end)		\
+		= { .id = "", .driver_data = 0 }
+
 #else
 static inline int acpi_dev_get_property(struct acpi_device *adev,
 					const char *name, acpi_object_type type,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 40036a5..ff43d65 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -123,6 +123,7 @@ int armpmu_map_event(struct perf_event *event,
 		     u32 raw_event_mask);
 
 typedef int (*armpmu_init_fn)(struct arm_pmu *);
+typedef int (*armpmu_acpi_init_fn)(struct arm_pmu *, struct device *);
 
 struct pmu_probe_info {
 	unsigned int cpuid;
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
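
Editorial sketch for patch 2/3: the commit message describes matching a child
device's HID against a probe table and then calling a variant init function
with the PMU and the matching device. The user-space C below models that flow
under simplifying assumptions. All demo_* names are hypothetical. The table
stores a function pointer directly, whereas the patch stores it in
driver_data as an integer and casts it back.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for struct arm_pmu and struct device. */
struct demo_pmu { const char *name; };
struct demo_dev { const char *hid; };

/* Same shape as armpmu_acpi_init_fn in the patch: PMU plus matching device. */
typedef int (*demo_init_fn)(struct demo_pmu *, struct demo_dev *);

struct demo_variant {
	const char *hid;	/* an empty string marks the sentinel */
	demo_init_fn init;
};

/* Variant init: runs after default setup and overrides what it needs. */
static int demo_falkor_init(struct demo_pmu *pmu, struct demo_dev *dev)
{
	(void)dev;		/* a real init could query _DSD properties */
	pmu->name = "qcom_pmuv3";
	return 0;
}

static const struct demo_variant demo_variants[] = {
	{ "QCOM8150", demo_falkor_init },
	{ "", NULL },		/* sentinel */
};

/*
 * Called once per child device of a CPU. Returns the init function's result
 * on a match, or 0 when nothing matches so the plain PMU setup is kept.
 */
static int demo_variant_init(struct demo_pmu *pmu, struct demo_dev *dev)
{
	const struct demo_variant *v;

	assert(pmu != NULL && dev != NULL);
	for (v = demo_variants; v->hid[0] != '\0'; v++)
		if (strcmp(v->hid, dev->hid) == 0)
			return v->init(pmu, dev);
	return 0;
}
```

A child device with an unknown HID leaves the PMU untouched, while one with
"QCOM8150" has its name replaced by the variant init. This mirrors, in
simplified form, the "default init first, variant override second" ordering
that the commit message describes.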

* [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support
  2018-06-22 19:46 ` Agustin Vega-Frias
@ 2018-06-22 19:46   ` Agustin Vega-Frias
  -1 siblings, 0 replies; 10+ messages in thread
From: Agustin Vega-Frias @ 2018-06-22 19:46 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, linux-acpi, Will Deacon,
	Mark Rutland, Jeremy Linton, Catalin Marinas, Marc Zyngier,
	Lorenzo Pieralisi, Rafael J. Wysocki
  Cc: timur, agustinv

Selection of these events can be envisioned as indexing them from
a 3D matrix:
- the first index selects a Region Event Selection Register (PMRESRx_EL0)
- the second index selects a group from which only one event at a time
  can be selected
- the third index selects the event

These events are encoded into perf_event_attr as:
  mbe      [config1:0   ]  (flag that indicates a matrix-based event)
  reg      [config:12-15]  (specifies the PMRESRx_EL0 instance)
  group    [config:0-3  ]  (specifies the event group)
  code     [config:4-11 ]  (specifies the event)

Events with the mbe flag set to zero are treated as common or raw PMUv3
events and are handled by the base PMUv3 driver code.

The first two indexes are set by combining the RESR and group number with
a base number and writing the result into the architected PMXEVTYPER_EL0
register. The third index is set by writing the code into the bits
corresponding to the group in the appropriate IMPLEMENTATION DEFINED
PMRESRx_EL0 register.

Support for this extension is signaled by the presence of the Falkor PMU
device node under each Falkor CPU device node in the DSDT ACPI table. E.g.:

    Device (CPU0)
    {
        Name (_HID, "ACPI0007" /* Processor Device */)
        ...
        Device (PMU0)
        {
            Name (_HID, "QCOM8150") /* Qualcomm Falkor PMU device */
            ...
        }
    }

Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
---
 drivers/perf/Makefile       |   2 +-
 drivers/perf/qcom_arm_pmu.c | 342 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 343 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/qcom_arm_pmu.c

diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index b3902bd..a61afd9 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -3,7 +3,7 @@ obj-$(CONFIG_ARM_CCI_PMU) += arm-cci.o
 obj-$(CONFIG_ARM_CCN) += arm-ccn.o
 obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
-obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o qcom_arm_pmu.o
 obj-$(CONFIG_HISI_PMU) += hisilicon/
 obj-$(CONFIG_QCOM_L2_PMU)	+= qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
diff --git a/drivers/perf/qcom_arm_pmu.c b/drivers/perf/qcom_arm_pmu.c
new file mode 100644
index 0000000..2f5e736
--- /dev/null
+++ b/drivers/perf/qcom_arm_pmu.c
@@ -0,0 +1,342 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+/*
+ * Qualcomm Technologies CPU PMU IMPLEMENTATION DEFINED extensions support
+ *
+ * Current extensions supported:
+ *
+ * - Matrix-based microarchitectural events support
+ *
+ *   Selection of these events can be envisioned as indexing them from
+ *   a 3D matrix:
+ *   - the first index selects a Region Event Selection Register (PMRESRx_EL0)
+ *   - the second index selects a group from which only one event at a time
+ *     can be selected
+ *   - the third index selects the event
+ *
+ *   These events are encoded into perf_event_attr as:
+ *     mbe      [config1:0   ]  (flag that indicates a matrix-based event)
+ *     reg      [config:12-15]  (specifies the PMRESRx_EL0 instance)
+ *     group    [config:0-3  ]  (specifies the event group)
+ *     code     [config:4-11 ]  (specifies the event)
+ *
+ *   Events with the mbe flag set to zero are treated as common or raw PMUv3
+ *   events and are handled by the base PMUv3 driver code.
+ *
+ *   The first two indexes are set by combining the RESR and group number with
+ *   a base number and writing it into the architected PMXEVTYPER_EL0.evtCount.
+ *   The third index is set by writing the code into the bits corresponding
+ *   to the group in the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
+ *   register.
+ */
+
+#include <linux/acpi.h>
+#include <linux/bitops.h>
+#include <linux/device.h>
+#include <linux/perf_event.h>
+#include <linux/printk.h>
+#include <linux/types.h>
+
+#include <asm/barrier.h>
+#include <asm/sysreg.h>
+
+#include <linux/perf/arm_pmu.h>
+
+#define pmresr0_el0         sys_reg(3, 5, 11, 3, 0)
+#define pmresr1_el0         sys_reg(3, 5, 11, 3, 2)
+#define pmresr2_el0         sys_reg(3, 5, 11, 3, 4)
+#define pmxevcntcr_el0      sys_reg(3, 5, 11, 0, 3)
+
+#define QC_EVT_MBE_SHIFT    0
+#define QC_EVT_REG_SHIFT    12
+#define QC_EVT_CODE_SHIFT   4
+#define QC_EVT_GRP_SHIFT    0
+#define QC_EVT_MBE_MASK     GENMASK(QC_EVT_MBE_SHIFT,      QC_EVT_MBE_SHIFT)
+#define QC_EVT_REG_MASK     GENMASK(QC_EVT_REG_SHIFT + 3,  QC_EVT_REG_SHIFT)
+#define QC_EVT_CODE_MASK    GENMASK(QC_EVT_CODE_SHIFT + 7, QC_EVT_CODE_SHIFT)
+#define QC_EVT_GRP_MASK     GENMASK(QC_EVT_GRP_SHIFT + 3,  QC_EVT_GRP_SHIFT)
+#define QC_EVT_RG_MASK      (QC_EVT_REG_MASK | QC_EVT_GRP_MASK)
+#define QC_EVT_RG(event)    ((event)->attr.config & QC_EVT_RG_MASK)
+#define QC_EVT_MBE(event)						\
+	(((event)->attr.config1 & QC_EVT_MBE_MASK) >> QC_EVT_MBE_SHIFT)
+#define QC_EVT_REG(event)						\
+	(((event)->attr.config & QC_EVT_REG_MASK) >> QC_EVT_REG_SHIFT)
+#define QC_EVT_CODE(event)						\
+	(((event)->attr.config & QC_EVT_CODE_MASK) >> QC_EVT_CODE_SHIFT)
+#define QC_EVT_GROUP(event)						\
+	(((event)->attr.config & QC_EVT_GRP_MASK) >> QC_EVT_GRP_SHIFT)
+
+#define QC_MAX_GROUP        7
+#define QC_MAX_RESR         2
+#define QC_BITS_PER_GROUP   8
+#define QC_RESR_ENABLE      BIT_ULL(63)
+#define QC_RESR_EVT_BASE    0xd8
+
+static struct arm_pmu *def_ops;
+
+static inline void falkor_write_pmresr(u64 reg, u64 val)
+{
+	switch (reg) {
+	case 0:
+		write_sysreg_s(val, pmresr0_el0);
+		return;
+	case 1:
+		write_sysreg_s(val, pmresr1_el0);
+		return;
+	default:
+		write_sysreg_s(val, pmresr2_el0);
+		return;
+	}
+}
+
+static inline u64 falkor_read_pmresr(u64 reg)
+{
+	switch (reg) {
+	case 0:
+		return read_sysreg_s(pmresr0_el0);
+	case 1:
+		return read_sysreg_s(pmresr1_el0);
+	default:
+		return read_sysreg_s(pmresr2_el0);
+	}
+}
+
+static void falkor_set_resr(u64 reg, u64 group, u64 code)
+{
+	u64 shift = group * QC_BITS_PER_GROUP;
+	u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
+	u64 val;
+
+	val = falkor_read_pmresr(reg) & ~mask;
+	val |= (code << shift);
+	val |= QC_RESR_ENABLE;
+	falkor_write_pmresr(reg, val);
+}
+
+static void falkor_clear_resr(u64 reg, u64 group)
+{
+	u32 shift = group * QC_BITS_PER_GROUP;
+	u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
+	u64 val = falkor_read_pmresr(reg) & ~mask;
+
+	falkor_write_pmresr(reg, val == QC_RESR_ENABLE ? 0 : val);
+}
+
+/*
+ * Check if e1 and e2 conflict with each other
+ *
+ * e1 is a matrix-based microarchitectural event we are checking against e2.
+ * A conflict exists if the events use the same reg, group, and a different
+ * code.
+ */
+static inline bool events_conflict(struct perf_event *e1, struct perf_event *e2)
+{
+	int type = e2->attr.type;
+	int dynamic = e1->pmu->type;
+
+	/* Same event? */
+	if (e1 == e2)
+		return false;
+
+	/* Other PMU that is not the RAW or this PMU's dynamic type? */
+	if ((e1->pmu != e2->pmu) && ((type != PERF_TYPE_RAW) && (type != dynamic)))
+		return false;
+
+	/* No conflict if using different mbe */
+	if (QC_EVT_MBE(e1) != QC_EVT_MBE(e2))
+		return false;
+
+	/* No conflict if using different reg or group */
+	if (QC_EVT_RG(e1) != QC_EVT_RG(e2))
+		return false;
+
+	/* Same mbe, reg and group is fine so long as code matches */
+	if (QC_EVT_CODE(e1) == QC_EVT_CODE(e2))
+		return false;
+
+	pr_debug_ratelimited("Group exclusion: conflicting events %llx %llx\n",
+			     e1->attr.config,
+			     e2->attr.config);
+	return true;
+}
+
+/*
+ * Check if the given event is valid for the PMU and if so return the value
+ * that can be used in PMXEVTYPER_EL0 to select the event
+ */
+static int falkor_map_event(struct perf_event *event)
+{
+	int type = event->attr.type;
+	int dynamic = event->pmu->type;
+	u64 reg = QC_EVT_REG(event);
+	u64 group = QC_EVT_GROUP(event);
+	struct perf_event *leader;
+	struct perf_event *sibling;
+
+	if (((type != PERF_TYPE_RAW) && (type != dynamic)) || !QC_EVT_MBE(event))
+		/* Common PMUv3 event, forward to the original op */
+		return def_ops->map_event(event);
+
+	/* Is it a valid matrix event? */
+	if ((group > QC_MAX_GROUP) || (reg > QC_MAX_RESR))
+		return -ENOENT;
+
+	/* If part of an event group, check if the event can be put in it */
+
+	leader = event->group_leader;
+	if (events_conflict(event, leader))
+		return -ENOENT;
+
+	for_each_sibling_event(sibling, leader)
+		if (events_conflict(event, sibling))
+			return -ENOENT;
+
+	return QC_RESR_EVT_BASE + reg * 8 + group;
+}
+
+/*
+ * Find a slot for the event on the current CPU
+ */
+static int falkor_get_event_idx(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+	int type = event->attr.type;
+	int dynamic = event->pmu->type;
+	int idx;
+
+	if (((type == PERF_TYPE_RAW) || (type == dynamic)) && QC_EVT_MBE(event))
+		/* Matrix event, check for conflicts with existing events */
+		for_each_set_bit(idx, cpuc->used_mask, ARMPMU_MAX_HWEVENTS)
+			if (cpuc->events[idx] &&
+			    events_conflict(event, cpuc->events[idx]))
+				return -ENOENT;
+
+	/* Let the original op handle the rest */
+	idx = def_ops->get_event_idx(cpuc, event);
+
+	/*
+	 * This is called for actually allocating the events, but also with
+	 * a dummy pmu_hw_events when validating groups, for that case we
+	 * need to ensure that cpuc->events[idx] is NULL so we don't use
+	 * an uninitialized pointer. Conflicts for matrix events in groups
+	 * are checked during event mapping anyway (see falkor_map_event).
+	 */
+	cpuc->events[idx] = NULL;
+
+	return idx;
+}
+
+/*
+ * Reset the PMU
+ */
+static void falkor_reset(void *info)
+{
+	struct arm_pmu *pmu = (struct arm_pmu *)info;
+	u32 i, ctrs = pmu->num_events;
+
+	/* PMRESRx_EL0 regs are unknown at reset, except for the EN field */
+	for (i = 0; i <= QC_MAX_RESR; i++)
+		falkor_write_pmresr(i, 0);
+
+	/* PMXEVCNTCRx_EL0 regs are unknown at reset */
+	for (i = 0; i <= ctrs; i++) {
+		write_sysreg(i, pmselr_el0);
+		isb();
+		write_sysreg_s(0, pmxevcntcr_el0);
+	}
+
+	/* Let the original op handle the rest */
+	def_ops->reset(info);
+}
+
+/*
+ * Enable the given event
+ */
+static void falkor_enable(struct perf_event *event)
+{
+	if (QC_EVT_MBE(event)) {
+		/* Matrix event, program the appropriate PMRESRx_EL0 */
+		u64 reg = QC_EVT_REG(event);
+		u64 code = QC_EVT_CODE(event);
+		u64 group = QC_EVT_GROUP(event);
+
+		falkor_set_resr(reg, group, code);
+	}
+
+	/* Let the original op handle the rest */
+	def_ops->enable(event);
+}
+
+/*
+ * Disable the given event
+ */
+static void falkor_disable(struct perf_event *event)
+{
+	/* Use the original op to disable the counter and interrupt  */
+	def_ops->disable(event);
+
+	if (QC_EVT_MBE(event)) {
+		/* Matrix event, de-program the appropriate PMRESRx_EL0 */
+		u64 reg = QC_EVT_REG(event);
+		u64 group = QC_EVT_GROUP(event);
+
+		falkor_clear_resr(reg, group);
+	}
+}
+
+PMU_FORMAT_ATTR(event, "config:0-15");
+PMU_FORMAT_ATTR(mbe,   "config1:0");
+PMU_FORMAT_ATTR(reg,   "config:12-15");
+PMU_FORMAT_ATTR(code,  "config:4-11");
+PMU_FORMAT_ATTR(group, "config:0-3");
+
+static struct attribute *falkor_pmu_formats[] = {
+	&format_attr_event.attr,
+	&format_attr_mbe.attr,
+	&format_attr_reg.attr,
+	&format_attr_code.attr,
+	&format_attr_group.attr,
+	NULL,
+};
+
+static struct attribute_group falkor_pmu_format_attr_group = {
+	.name = "format",
+	.attrs = falkor_pmu_formats,
+};
+
+static int qcom_falkor_pmu_init(struct arm_pmu *pmu, struct device *dev)
+{
+	/* Save base arm_pmu so we can invoke its ops when appropriate */
+	def_ops = devm_kmemdup(dev, pmu, sizeof(*def_ops), GFP_KERNEL);
+	if (!def_ops) {
+		pr_warn("Failed to allocate arm_pmu for QCOM extensions");
+		return -ENODEV;
+	}
+
+	pmu->name = "qcom_pmuv3";
+
+	/* Override the necessary ops */
+	pmu->map_event     = falkor_map_event;
+	pmu->get_event_idx = falkor_get_event_idx;
+	pmu->reset         = falkor_reset;
+	pmu->enable        = falkor_enable;
+	pmu->disable       = falkor_disable;
+
+	/* Override the necessary attributes */
+	pmu->pmu.attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&falkor_pmu_format_attr_group;
+
+	return 1;
+}
+
+ACPI_DECLARE_PMU_VARIANT(qcom_falkor, "QCOM8150", qcom_falkor_pmu_init);
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
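
Editorial sketch for patch 3/3: the commit message describes how reg, group
and code are packed into perf_event_attr.config, how reg and group yield the
value written to PMXEVTYPER_EL0, and how the code lands in the group's bits
of a PMRESRx_EL0 register. The user-space C below reproduces only that
arithmetic, using the constants from the patch. The demo_* and DEMO_* names
are hypothetical, no system register is touched, and the mbe flag (config1
bit 0) is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_REG_SHIFT		12	/* reg:   config bits 12-15 */
#define DEMO_CODE_SHIFT		4	/* code:  config bits 4-11  */
#define DEMO_GRP_SHIFT		0	/* group: config bits 0-3   */
#define DEMO_MAX_GROUP		7
#define DEMO_MAX_RESR		2
#define DEMO_BITS_PER_GROUP	8
#define DEMO_RESR_ENABLE	(UINT64_C(1) << 63)
#define DEMO_EVT_BASE		0xd8

/* Pack the three matrix indexes into a config value. */
static uint64_t demo_encode(unsigned reg, unsigned group, unsigned code)
{
	assert(reg <= 0xf && group <= 0xf && code <= 0xff);
	return ((uint64_t)reg << DEMO_REG_SHIFT) |
	       ((uint64_t)code << DEMO_CODE_SHIFT) |
	       ((uint64_t)group << DEMO_GRP_SHIFT);
}

static unsigned demo_reg(uint64_t config)
{
	return (config >> DEMO_REG_SHIFT) & 0xf;
}

static unsigned demo_code(uint64_t config)
{
	return (config >> DEMO_CODE_SHIFT) & 0xff;
}

static unsigned demo_group(uint64_t config)
{
	return (config >> DEMO_GRP_SHIFT) & 0xf;
}

/* Event type derived from reg and group, or -1 if out of range. */
static int demo_evtype(uint64_t config)
{
	unsigned reg = demo_reg(config), group = demo_group(config);

	if (group > DEMO_MAX_GROUP || reg > DEMO_MAX_RESR)
		return -1;
	return DEMO_EVT_BASE + reg * 8 + group;
}

/* New PMRESR value after selecting a code in the given group. */
static uint64_t demo_resr_set(uint64_t old, unsigned group, unsigned code)
{
	unsigned shift = group * DEMO_BITS_PER_GROUP;
	uint64_t mask = UINT64_C(0xff) << shift;

	return (old & ~mask) | ((uint64_t)code << shift) | DEMO_RESR_ENABLE;
}

/* Two matrix events conflict when reg and group match but code differs. */
static bool demo_conflict(uint64_t a, uint64_t b)
{
	return demo_reg(a) == demo_reg(b) &&
	       demo_group(a) == demo_group(b) &&
	       demo_code(a) != demo_code(b);
}
```

As a worked example, reg 1, group 2 and code 0x3c encode to config 0x13c2.
The event type is then 0xd8 + 1 * 8 + 2 = 0xe2, and the code is placed in
bits 16-23 of the selected PMRESR value together with the enable bit.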

* [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support
@ 2018-06-22 19:46   ` Agustin Vega-Frias
  0 siblings, 0 replies; 10+ messages in thread
From: Agustin Vega-Frias @ 2018-06-22 19:46 UTC (permalink / raw)
  To: linux-arm-kernel

Selection of these events can be envisioned as indexing them from
a 3D matrix:
- the first index selects a Region Event Selection Register (PMRESRx_EL0)
- the second index selects a group from which only one event at a time
  can be selected
- the third index selects the event

These events are encoded into perf_event_attr as:
  mbe      [config1:0   ]  (flag that indicates a matrix-based event)
  reg      [config:12-15]  (specifies the PMRESRx_EL0 instance)
  group    [config:0-3  ]  (specifies the event group)
  code     [config:4-11 ]  (specifies the event)

Events with the mbe flag set to zero are treated as common or raw PMUv3
events and are handled by the base PMUv3 driver code.
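
As a rough illustration of the encoding above, the following user-space
sketch packs the three fields into config and sets the mbe flag in config1.
The struct and function names are hypothetical and are not part of the patch;
only the bit positions come from the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two perf_event_attr fields involved. */
struct falkor_evt {
	uint64_t config;	/* reg [15:12], code [11:4], group [3:0] */
	uint64_t config1;	/* mbe [0] */
};

/* Pack a matrix-based event; the field widths follow the table above. */
struct falkor_evt falkor_pack_event(unsigned int reg, unsigned int code,
				    unsigned int group)
{
	struct falkor_evt e;

	assert(reg <= 0xf && code <= 0xff && group <= 0xf);

	e.config = ((uint64_t)reg << 12) | ((uint64_t)code << 4) | group;
	e.config1 = 1;	/* mbe = 1 marks this as a matrix-based event */
	return e;
}
```

Assuming the PMU ends up registered as qcom_pmuv3_0 (the driver sets the base
name to qcom_pmuv3 and the ACPI probe code appends an index), the same event
could presumably be requested from the perf tool with something like
qcom_pmuv3_0/mbe=1,reg=1,code=0x2a,group=3/, where the code value is made up
for the example.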

The first two indexes are set by combining the RESR and group number with
a base number and writing it into the architected PMXEVTYPER_EL0 register.
The third index is set by writing the code into the bits corresponding
to the group in the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
register.
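
The arithmetic described in this paragraph can be sketched in plain C as
follows. The constants mirror the ones used in the patch (base 0xd8, 8 bits
per group, enable in bit 63); the function names are hypothetical, and the
sketch works on ordinary integers rather than on the system registers.

```c
#include <assert.h>
#include <stdint.h>

#define RESR_EVT_BASE	0xd8u		/* same value as QC_RESR_EVT_BASE */
#define BITS_PER_GROUP	8u		/* same value as QC_BITS_PER_GROUP */
#define RESR_ENABLE	(1ULL << 63)	/* same value as QC_RESR_ENABLE */

/* Value written to PMXEVTYPER_EL0 for a given RESR and group. */
unsigned int falkor_evtype(unsigned int reg, unsigned int group)
{
	assert(reg <= 2 && group <= 7);
	return RESR_EVT_BASE + reg * 8 + group;
}

/* New PMRESRx_EL0 value after selecting 'code' in 'group'. */
uint64_t falkor_resr_select(uint64_t resr, unsigned int group,
			    unsigned int code)
{
	unsigned int shift = group * BITS_PER_GROUP;
	uint64_t mask = 0xffULL << shift;

	assert(group <= 7 && code <= 0xff);
	return (resr & ~mask) | ((uint64_t)code << shift) | RESR_ENABLE;
}
```

For example, reg 1 and group 3 give an event type of 0xd8 + 8 + 3 = 0xe3, and
selecting a (made-up) code 0x2a in group 3 of an empty register places it in
bits 24-31 alongside the enable bit.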

Support for this extension is signaled by the presence of the Falkor PMU
device node under each Falkor CPU device node in the DSDT ACPI table. E.g.:

    Device (CPU0)
    {
        Name (_HID, "ACPI0007" /* Processor Device */)
        ...
        Device (PMU0)
        {
            Name (_HID, "QCOM8150") /* Qualcomm Falkor PMU device */
            ...
        }
    }

Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
---
 drivers/perf/Makefile       |   2 +-
 drivers/perf/qcom_arm_pmu.c | 342 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 343 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/qcom_arm_pmu.c

diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index b3902bd..a61afd9 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -3,7 +3,7 @@ obj-$(CONFIG_ARM_CCI_PMU) += arm-cci.o
 obj-$(CONFIG_ARM_CCN) += arm-ccn.o
 obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
-obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o qcom_arm_pmu.o
 obj-$(CONFIG_HISI_PMU) += hisilicon/
 obj-$(CONFIG_QCOM_L2_PMU)	+= qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
diff --git a/drivers/perf/qcom_arm_pmu.c b/drivers/perf/qcom_arm_pmu.c
new file mode 100644
index 0000000..2f5e736
--- /dev/null
+++ b/drivers/perf/qcom_arm_pmu.c
@@ -0,0 +1,342 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+/*
+ * Qualcomm Technologies CPU PMU IMPLEMENTATION DEFINED extensions support
+ *
+ * Current extensions supported:
+ *
+ * - Matrix-based microarchitectural events support
+ *
+ *   Selection of these events can be envisioned as indexing them from
+ *   a 3D matrix:
+ *   - the first index selects a Region Event Selection Register (PMRESRx_EL0)
+ *   - the second index selects a group from which only one event at a time
+ *     can be selected
+ *   - the third index selects the event
+ *
+ *   These events are encoded into perf_event_attr as:
+ *     mbe      [config1:0   ]  (flag that indicates a matrix-based event)
+ *     reg      [config:12-15]  (specifies the PMRESRx_EL0 instance)
+ *     group    [config:0-3  ]  (specifies the event group)
+ *     code     [config:4-11 ]  (specifies the event)
+ *
+ *   Events with the mbe flag set to zero are treated as common or raw PMUv3
+ *   events and are handled by the base PMUv3 driver code.
+ *
+ *   The first two indexes are set by combining the RESR and group number with
+ *   a base number and writing it into the architected PMXEVTYPER_EL0.evtCount.
+ *   The third index is set by writing the code into the bits corresponding
+ *   to the group in the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
+ *   register.
+ */
+
+#include <linux/acpi.h>
+#include <linux/bitops.h>
+#include <linux/device.h>
+#include <linux/perf_event.h>
+#include <linux/printk.h>
+#include <linux/types.h>
+
+#include <asm/barrier.h>
+#include <asm/sysreg.h>
+
+#include <linux/perf/arm_pmu.h>
+
+#define pmresr0_el0         sys_reg(3, 5, 11, 3, 0)
+#define pmresr1_el0         sys_reg(3, 5, 11, 3, 2)
+#define pmresr2_el0         sys_reg(3, 5, 11, 3, 4)
+#define pmxevcntcr_el0      sys_reg(3, 5, 11, 0, 3)
+
+#define QC_EVT_MBE_SHIFT    0
+#define QC_EVT_REG_SHIFT    12
+#define QC_EVT_CODE_SHIFT   4
+#define QC_EVT_GRP_SHIFT    0
+#define QC_EVT_MBE_MASK     GENMASK(QC_EVT_MBE_SHIFT,      QC_EVT_MBE_SHIFT)
+#define QC_EVT_REG_MASK     GENMASK(QC_EVT_REG_SHIFT + 3,  QC_EVT_REG_SHIFT)
+#define QC_EVT_CODE_MASK    GENMASK(QC_EVT_CODE_SHIFT + 7, QC_EVT_CODE_SHIFT)
+#define QC_EVT_GRP_MASK     GENMASK(QC_EVT_GRP_SHIFT + 3,  QC_EVT_GRP_SHIFT)
+#define QC_EVT_RG_MASK      (QC_EVT_REG_MASK | QC_EVT_GRP_MASK)
+#define QC_EVT_RG(event)    ((event)->attr.config & QC_EVT_RG_MASK)
+#define QC_EVT_MBE(event)						\
+	(((event)->attr.config1 & QC_EVT_MBE_MASK) >> QC_EVT_MBE_SHIFT)
+#define QC_EVT_REG(event)						\
+	(((event)->attr.config & QC_EVT_REG_MASK) >> QC_EVT_REG_SHIFT)
+#define QC_EVT_CODE(event)						\
+	(((event)->attr.config & QC_EVT_CODE_MASK) >> QC_EVT_CODE_SHIFT)
+#define QC_EVT_GROUP(event)						\
+	(((event)->attr.config & QC_EVT_GRP_MASK) >> QC_EVT_GRP_SHIFT)
+
+#define QC_MAX_GROUP        7
+#define QC_MAX_RESR         2
+#define QC_BITS_PER_GROUP   8
+#define QC_RESR_ENABLE      BIT_ULL(63)
+#define QC_RESR_EVT_BASE    0xd8
+
+static struct arm_pmu *def_ops;
+
+static inline void falkor_write_pmresr(u64 reg, u64 val)
+{
+	switch (reg) {
+	case 0:
+		write_sysreg_s(val, pmresr0_el0);
+		return;
+	case 1:
+		write_sysreg_s(val, pmresr1_el0);
+		return;
+	default:
+		write_sysreg_s(val, pmresr2_el0);
+		return;
+	}
+}
+
+static inline u64 falkor_read_pmresr(u64 reg)
+{
+	switch (reg) {
+	case 0:
+		return read_sysreg_s(pmresr0_el0);
+	case 1:
+		return read_sysreg_s(pmresr1_el0);
+	default:
+		return read_sysreg_s(pmresr2_el0);
+	}
+}
+
+static void falkor_set_resr(u64 reg, u64 group, u64 code)
+{
+	u64 shift = group * QC_BITS_PER_GROUP;
+	u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
+	u64 val;
+
+	val = falkor_read_pmresr(reg) & ~mask;
+	val |= (code << shift);
+	val |= QC_RESR_ENABLE;
+	falkor_write_pmresr(reg, val);
+}
+
+static void falkor_clear_resr(u64 reg, u64 group)
+{
+	u32 shift = group * QC_BITS_PER_GROUP;
+	u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
+	u64 val = falkor_read_pmresr(reg) & ~mask;
+
+	falkor_write_pmresr(reg, val == QC_RESR_ENABLE ? 0 : val);
+}
+
+/*
+ * Check if e1 and e2 conflict with each other
+ *
+ * e1 is a matrix-based microarchitectural event we are checking against e2.
+ * A conflict exists if the events use the same reg, group, and a different
+ * code.
+ */
+static inline bool events_conflict(struct perf_event *e1, struct perf_event *e2)
+{
+	int type = e2->attr.type;
+	int dynamic = e1->pmu->type;
+
+	/* Same event? */
+	if (e1 == e2)
+		return false;
+
+	/* Other PMU that is not the RAW or this PMU's dynamic type? */
+	if ((e1->pmu != e2->pmu) && ((type != PERF_TYPE_RAW) && (type != dynamic)))
+		return false;
+
+	/* No conflict if using different mbe */
+	if (QC_EVT_MBE(e1) != QC_EVT_MBE(e2))
+		return false;
+
+	/* No conflict if using different reg or group */
+	if (QC_EVT_RG(e1) != QC_EVT_RG(e2))
+		return false;
+
+	/* Same mbe, reg and group is fine so long as code matches */
+	if (QC_EVT_CODE(e1) == QC_EVT_CODE(e2))
+		return false;
+
+	pr_debug_ratelimited("Group exclusion: conflicting events %llx %llx\n",
+			     e1->attr.config,
+			     e2->attr.config);
+	return true;
+}
+
+/*
+ * Check if the given event is valid for the PMU and if so return the value
+ * that can be used in PMXEVTYPER_EL0 to select the event
+ */
+static int falkor_map_event(struct perf_event *event)
+{
+	int type = event->attr.type;
+	int dynamic = event->pmu->type;
+	u64 reg = QC_EVT_REG(event);
+	u64 group = QC_EVT_GROUP(event);
+	struct perf_event *leader;
+	struct perf_event *sibling;
+
+	if (((type != PERF_TYPE_RAW) && (type != dynamic)) || !QC_EVT_MBE(event))
+		/* Common PMUv3 event, forward to the original op */
+		return def_ops->map_event(event);
+
+	/* Is it a valid matrix event? */
+	if ((group > QC_MAX_GROUP) || (reg > QC_MAX_RESR))
+		return -ENOENT;
+
+	/* If part of an event group, check if the event can be put in it */
+
+	leader = event->group_leader;
+	if (events_conflict(event, leader))
+		return -ENOENT;
+
+	for_each_sibling_event(sibling, leader)
+		if (events_conflict(event, sibling))
+			return -ENOENT;
+
+	return QC_RESR_EVT_BASE + reg * 8 + group;
+}
+
+/*
+ * Find a slot for the event on the current CPU
+ */
+static int falkor_get_event_idx(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+	int type = event->attr.type;
+	int dynamic = event->pmu->type;
+	int idx;
+
+	if (((type == PERF_TYPE_RAW) || (type == dynamic)) && QC_EVT_MBE(event))
+		/* Matrix event, check for conflicts with existing events */
+		for_each_set_bit(idx, cpuc->used_mask, ARMPMU_MAX_HWEVENTS)
+			if (cpuc->events[idx] &&
+			    events_conflict(event, cpuc->events[idx]))
+				return -ENOENT;
+
+	/* Let the original op handle the rest */
+	idx = def_ops->get_event_idx(cpuc, event);
+
+	/*
+	 * This is called when actually allocating the events, but also with
+	 * a dummy pmu_hw_events when validating groups. For that case we
+	 * need to ensure that cpuc->events[idx] is NULL so we don't use
+	 * an uninitialized pointer. Conflicts for matrix events in groups
+	 * are checked during event mapping anyway (see falkor_map_event).
+	 */
+	if (idx >= 0)
+		cpuc->events[idx] = NULL;
+	return idx;
+}
+
+/*
+ * Reset the PMU
+ */
+static void falkor_reset(void *info)
+{
+	struct arm_pmu *pmu = (struct arm_pmu *)info;
+	u32 i, ctrs = pmu->num_events;
+
+	/* PMRESRx_EL0 regs are unknown at reset, except for the EN field */
+	for (i = 0; i <= QC_MAX_RESR; i++)
+		falkor_write_pmresr(i, 0);
+
+	/* PMXEVCNTCRx_EL0 regs are unknown at reset */
+	for (i = 0; i <= ctrs; i++) {
+		write_sysreg(i, pmselr_el0);
+		isb();
+		write_sysreg_s(0, pmxevcntcr_el0);
+	}
+
+	/* Let the original op handle the rest */
+	def_ops->reset(info);
+}
+
+/*
+ * Enable the given event
+ */
+static void falkor_enable(struct perf_event *event)
+{
+	if (QC_EVT_MBE(event)) {
+		/* Matrix event, program the appropriate PMRESRx_EL0 */
+		u64 reg = QC_EVT_REG(event);
+		u64 code = QC_EVT_CODE(event);
+		u64 group = QC_EVT_GROUP(event);
+
+		falkor_set_resr(reg, group, code);
+	}
+
+	/* Let the original op handle the rest */
+	def_ops->enable(event);
+}
+
+/*
+ * Disable the given event
+ */
+static void falkor_disable(struct perf_event *event)
+{
+	/* Use the original op to disable the counter and interrupt */
+	def_ops->disable(event);
+
+	if (QC_EVT_MBE(event)) {
+		/* Matrix event, de-program the appropriate PMRESRx_EL0 */
+		u64 reg = QC_EVT_REG(event);
+		u64 group = QC_EVT_GROUP(event);
+
+		falkor_clear_resr(reg, group);
+	}
+}
+
+PMU_FORMAT_ATTR(event, "config:0-15");
+PMU_FORMAT_ATTR(mbe,   "config1:0");
+PMU_FORMAT_ATTR(reg,   "config:12-15");
+PMU_FORMAT_ATTR(code,  "config:4-11");
+PMU_FORMAT_ATTR(group, "config:0-3");
+
+static struct attribute *falkor_pmu_formats[] = {
+	&format_attr_event.attr,
+	&format_attr_mbe.attr,
+	&format_attr_reg.attr,
+	&format_attr_code.attr,
+	&format_attr_group.attr,
+	NULL,
+};
+
+static struct attribute_group falkor_pmu_format_attr_group = {
+	.name = "format",
+	.attrs = falkor_pmu_formats,
+};
+
+static int qcom_falkor_pmu_init(struct arm_pmu *pmu, struct device *dev)
+{
+	/* Save base arm_pmu so we can invoke its ops when appropriate */
+	def_ops = devm_kmemdup(dev, pmu, sizeof(*def_ops), GFP_KERNEL);
+	if (!def_ops) {
+		pr_warn("Failed to allocate arm_pmu for QCOM extensions\n");
+		return -ENODEV;
+	}
+
+	pmu->name = "qcom_pmuv3";
+
+	/* Override the necessary ops */
+	pmu->map_event     = falkor_map_event;
+	pmu->get_event_idx = falkor_get_event_idx;
+	pmu->reset         = falkor_reset;
+	pmu->enable        = falkor_enable;
+	pmu->disable       = falkor_disable;
+
+	/* Override the necessary attributes */
+	pmu->pmu.attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&falkor_pmu_format_attr_group;
+
+	return 1;
+}
+
+ACPI_DECLARE_PMU_VARIANT(qcom_falkor, "QCOM8150", qcom_falkor_pmu_init);
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

* [RFC V2 2/3] arm_pmu: acpi: add support for CPU PMU variant detection
  2018-06-07 13:56 [RFC V2 0/3] arm_pmu: acpi: variant support and QCOM Falkor extensions Agustin Vega-Frias
@ 2018-06-07 13:56   ` Agustin Vega-Frias
  0 siblings, 0 replies; 10+ messages in thread
From: Agustin Vega-Frias @ 2018-06-07 13:56 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, Will Deacon, Mark Rutland,
	Jeremy Linton, Catalin Marinas, Marc Zyngier, Lorenzo Pieralisi
  Cc: timur, agustinv

DT allows CPU PMU variant detection via the PMU device compatible
property. ACPI does not have an equivalent mechanism so we introduce
a probe table to allow this via a device nested inside the CPU device
in the DSDT:

Device (CPU0)
{
    Name (_HID, "ACPI0007" /* Processor Device */)
    ...
    Device (PMU0)
    {
        Name (_HID, "QCOM8150") /* Qualcomm Falkor PMU device */

        /*
         * The device might also contain _DSD properties to indicate other
         * IMPLEMENTATION DEFINED PMU features.
         */
        Name (_DSD, Package ()
        {
            ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
            Package ()
            {
                ...
            }
        })
    }
}

With this in place we can declare the variant:

    ACPI_DECLARE_PMU_VARIANT(qcom_falkor, "QCOM8150", falkor_pmu_init);

The init function is called after the default PMU initialization and is
passed a pointer to the arm_pmu structure and a pointer to the PMU device.
The init function can then override arm_pmu callbacks and attributes and
query more properties from the PMU device.
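
The matching step can be pictured with the following self-contained sketch.
It is only a user-space model under stated assumptions: the toy_* and
variant_* names are hypothetical, the table is an ordinary array rather than
a linker section, and a plain string compare stands in for
acpi_match_device(). It keeps the two properties the series relies on: the
table ends with an empty-ID sentinel, and a matching entry's init function
receives the PMU structure so it can override it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct arm_pmu; only the name is modelled. */
struct toy_pmu {
	const char *name;
};

typedef int (*variant_init_fn)(struct toy_pmu *pmu);

/* Toy stand-in for struct acpi_device_id (HID plus driver data). */
struct variant_id {
	const char *hid;
	variant_init_fn init;
};

/* Hypothetical variant init: overrides the name and reports a match. */
int toy_falkor_init(struct toy_pmu *pmu)
{
	pmu->name = "qcom_pmuv3";
	return 1;
}

static const struct variant_id variant_table[] = {
	{ "QCOM8150", toy_falkor_init },
	{ "", NULL },	/* sentinel: an empty HID terminates the table */
};

/* Walk the table up to the sentinel and run the first matching init. */
int variant_init(struct toy_pmu *pmu, const char *hid)
{
	const struct variant_id *id;

	assert(pmu && hid);

	for (id = variant_table; id->hid[0] != '\0'; id++)
		if (strcmp(id->hid, hid) == 0)
			return id->init(pmu);

	return 0;	/* no variant matched, keep the default PMU */
}
```

A child device whose HID is not in the table leaves the PMU untouched, which
corresponds to the plain PMUv3 behaviour.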

Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
---
 drivers/perf/arm_pmu_acpi.c       | 27 +++++++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h |  1 +
 include/linux/acpi.h              | 11 +++++++++++
 include/linux/perf/arm_pmu.h      |  1 +
 4 files changed, 40 insertions(+)

diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 0f19751..6b0ca71 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -220,6 +220,26 @@ static int arm_pmu_acpi_cpu_starting(unsigned int cpu)
 	return 0;
 }
 
+/*
+ * Check if the given child device of the CPU device matches a PMU variant
+ * device declared with ACPI_DECLARE_PMU_VARIANT. If so, pass the arm_pmu
+ * structure and the matching device for further initialization.
+ */
+static int arm_pmu_variant_init(struct device *dev, void *data)
+{
+	extern struct acpi_device_id ACPI_PROBE_TABLE(pmu);
+	unsigned int cpu = *((unsigned int *)data);
+	const struct acpi_device_id *id;
+
+	id = acpi_match_device(&ACPI_PROBE_TABLE(pmu), dev);
+	if (id) {
+		armpmu_acpi_init_fn fn = (armpmu_acpi_init_fn)id->driver_data;
+
+		return fn(per_cpu(probed_pmus, cpu), dev);
+	}
+	return 0;
+}
+
 int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
 {
 	int pmu_idx = 0;
@@ -240,6 +260,7 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
 	 */
 	for_each_possible_cpu(cpu) {
 		struct arm_pmu *pmu = per_cpu(probed_pmus, cpu);
+		struct device *dev = get_cpu_device(cpu);
 		char *base_name;
 
 		if (!pmu || pmu->name)
@@ -254,6 +275,10 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
 			return ret;
 		}
 
+		ret = device_for_each_child(dev, &cpu, arm_pmu_variant_init);
+		if (ret == -ENODEV)
+			pr_warn("Failed PMU re-init, fallback to plain PMUv3\n");
+
 		base_name = pmu->name;
 		pmu->name = kasprintf(GFP_KERNEL, "%s_%d", base_name, pmu_idx++);
 		if (!pmu->name) {
@@ -290,3 +315,5 @@ static int arm_pmu_acpi_init(void)
 	return ret;
 }
 subsys_initcall(arm_pmu_acpi_init)
+
+ACPI_DECLARE_PMU_SENTINEL();
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5894049..f1be62a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -600,6 +600,7 @@
 	IRQCHIP_OF_MATCH_TABLE()					\
 	ACPI_PROBE_TABLE(irqchip)					\
 	ACPI_PROBE_TABLE(timer)						\
+	ACPI_PROBE_TABLE(pmu)						\
 	EARLYCON_TABLE()
 
 #define INIT_TEXT							\
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 15bfb15..9c410cf 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1153,6 +1153,17 @@ struct acpi_probe_entry {
 					  (&ACPI_PROBE_TABLE_END(t) -	\
 					   &ACPI_PROBE_TABLE(t)));	\
 	})
+
+#define ACPI_DECLARE_PMU_VARIANT(name, hid, init_fn)			\
+	static const struct acpi_device_id __acpi_probe_##name		\
+		__used __section(__pmu_acpi_probe_table)		\
+		= { .id = hid, .driver_data = (kernel_ulong_t)init_fn }
+
+#define ACPI_DECLARE_PMU_SENTINEL()					\
+	static const struct acpi_device_id __acpi_probe_sentinel	\
+		__used __section(__pmu_acpi_probe_table_end)		\
+		= { .id = "", .driver_data = 0 }
+
 #else
 static inline int acpi_dev_get_property(struct acpi_device *adev,
 					const char *name, acpi_object_type type,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 40036a5..ff43d65 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -123,6 +123,7 @@ int armpmu_map_event(struct perf_event *event,
 		     u32 raw_event_mask);
 
 typedef int (*armpmu_init_fn)(struct arm_pmu *);
+typedef int (*armpmu_acpi_init_fn)(struct arm_pmu *, struct device *);
 
 struct pmu_probe_info {
 	unsigned int cpuid;
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
