All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] perf: Arm SMMU PMU driver
@ 2022-02-17 14:24 Robin Murphy
  2022-02-17 14:24 ` [PATCH 1/6] iommu/arm-smmu: Account for PMU interrupts Robin Murphy
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Robin Murphy @ 2022-02-17 14:24 UTC (permalink / raw)
  To: will, mark.rutland
  Cc: lorenzo.pieralisi, sudeep.holla, linux-arm-kernel, jean-philippe,
	leo.yan, noda.akio

Hi all,

Since our friends at Linaro had some interest in the area, I've dug up
and finished off my old SMMUv1/v2 PMU driver, now with a slightly less
terrible DT binding to boot. Patch #1 involves a general refactoring of
the IRQ setup code in arm-smmu which also now serves to remove the
general dependency on explicit IRQ resources for of_platform devices.

ACPI support via IORT (patch #2) is pleasingly trivial, but does have a
hard dependency on the previous patch, otherwise SMMU context
interrupts may get messed up.

The driver itself in patch #5 is still largely a time capsule of me
learning the PMU APIs 5 years ago (as a warm-up for arm-cmn). I've
mostly just added IRQ affinity support (which I didn't yet fully
understand at the time) and tweaked a few comments. I'm pretty confident
all the testing I did back then is still valid, so I've just run some
quick sanity checks with patch #6 to verify the new DT binding.

Cheers,
Robin.


Robin Murphy (6):
  iommu/arm-smmu: Account for PMU interrupts
  acpi/iort: Register SMMUv2 PMU interrupts
  iommu/arm-smmu: Add DT PMU support
  iommu/smmu: Create child devices for PMUs
  perf: Add ARM SMMU PMU driver
  arm64: dts: Add SMMU PMUs to Juno

 .../devicetree/bindings/iommu/arm,smmu.yaml   |  19 +-
 arch/arm64/boot/dts/arm/juno-base.dtsi        |  26 +-
 drivers/acpi/arm64/iort.c                     |  18 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c         | 143 ++--
 drivers/iommu/arm/arm-smmu/arm-smmu.h         |   5 +-
 drivers/perf/Kconfig                          |  24 +-
 drivers/perf/Makefile                         |   1 +
 drivers/perf/arm-smmu-pmu.c                   | 732 ++++++++++++++++++
 8 files changed, 896 insertions(+), 72 deletions(-)
 create mode 100644 drivers/perf/arm-smmu-pmu.c

-- 
2.28.0.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: Account for PMU interrupts
  2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
@ 2022-02-17 14:24 ` Robin Murphy
  2022-02-17 14:24 ` [PATCH 2/6] acpi/iort: Register SMMUv2 " Robin Murphy
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2022-02-17 14:24 UTC (permalink / raw)
  To: will, mark.rutland
  Cc: lorenzo.pieralisi, sudeep.holla, linux-arm-kernel, jean-philippe,
	leo.yan, noda.akio, Lad Prabhakar

In preparation for SMMUv2 PMU support, rejig our IRQ setup code to
account for PMU interrupts as additional resources. We can simplify the
whole flow by only storing the context IRQs, since the global IRQs are
devres-managed and we never refer to them beyond the initial request.

CC: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 93 +++++++++++++--------------
 drivers/iommu/arm/arm-smmu/arm-smmu.h |  5 +-
 2 files changed, 46 insertions(+), 52 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 4bc75c4ce402..e50fcf37af58 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -807,7 +807,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	 * Request context fault interrupt. Do this last to avoid the
 	 * handler seeing a half-initialised domain state.
 	 */
-	irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
+	irq = smmu->irqs[cfg->irptndx];
 
 	if (smmu->impl && smmu->impl->context_fault)
 		context_fault = smmu->impl->context_fault;
@@ -858,7 +858,7 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	arm_smmu_write_context_bank(smmu, cfg->cbndx);
 
 	if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
-		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
+		irq = smmu->irqs[cfg->irptndx];
 		devm_free_irq(smmu->dev, irq, domain);
 	}
 
@@ -1951,8 +1951,8 @@ static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
 	return ret;
 }
 
-static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
-				      struct arm_smmu_device *smmu)
+static int arm_smmu_device_acpi_probe(struct arm_smmu_device *smmu,
+				      u32 *global_irqs, u32 *pmu_irqs)
 {
 	struct device *dev = smmu->dev;
 	struct acpi_iort_node *node =
@@ -1968,7 +1968,8 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
 		return ret;
 
 	/* Ignore the configuration access interrupt */
-	smmu->num_global_irqs = 1;
+	*global_irqs = 1;
+	*pmu_irqs = 0;
 
 	if (iort_smmu->flags & ACPI_IORT_SMMU_COHERENT_WALK)
 		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
@@ -1976,25 +1977,24 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
 	return 0;
 }
 #else
-static inline int arm_smmu_device_acpi_probe(struct platform_device *pdev,
-					     struct arm_smmu_device *smmu)
+static inline int arm_smmu_device_acpi_probe(struct arm_smmu_device *smmu,
+					     u32 *global_irqs, u32 *pmu_irqs)
 {
 	return -ENODEV;
 }
 #endif
 
-static int arm_smmu_device_dt_probe(struct platform_device *pdev,
-				    struct arm_smmu_device *smmu)
+static int arm_smmu_device_dt_probe(struct arm_smmu_device *smmu,
+				    u32 *global_irqs, u32 *pmu_irqs)
 {
 	const struct arm_smmu_match_data *data;
-	struct device *dev = &pdev->dev;
+	struct device *dev = smmu->dev;
 	bool legacy_binding;
 
-	if (of_property_read_u32(dev->of_node, "#global-interrupts",
-				 &smmu->num_global_irqs)) {
-		dev_err(dev, "missing #global-interrupts property\n");
-		return -ENODEV;
-	}
+	if (of_property_read_u32(dev->of_node, "#global-interrupts", global_irqs))
+		return dev_err_probe(dev, -ENODEV,
+				     "missing #global-interrupts property\n");
+	*pmu_irqs = 0;
 
 	data = of_device_get_match_data(dev);
 	smmu->version = data->version;
@@ -2073,6 +2073,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	struct arm_smmu_device *smmu;
 	struct device *dev = &pdev->dev;
 	int num_irqs, i, err;
+	u32 global_irqs, pmu_irqs;
 	irqreturn_t (*global_fault)(int irq, void *dev);
 
 	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
@@ -2083,10 +2084,9 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	smmu->dev = dev;
 
 	if (dev->of_node)
-		err = arm_smmu_device_dt_probe(pdev, smmu);
+		err = arm_smmu_device_dt_probe(smmu, &global_irqs, &pmu_irqs);
 	else
-		err = arm_smmu_device_acpi_probe(pdev, smmu);
-
+		err = arm_smmu_device_acpi_probe(smmu, &global_irqs, &pmu_irqs);
 	if (err)
 		return err;
 
@@ -2105,31 +2105,25 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	if (IS_ERR(smmu))
 		return PTR_ERR(smmu);
 
-	num_irqs = 0;
-	while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) {
-		num_irqs++;
-		if (num_irqs > smmu->num_global_irqs)
-			smmu->num_context_irqs++;
-	}
+	num_irqs = platform_irq_count(pdev);
 
-	if (!smmu->num_context_irqs) {
-		dev_err(dev, "found %d interrupts but expected at least %d\n",
-			num_irqs, smmu->num_global_irqs + 1);
-		return -ENODEV;
-	}
+	smmu->num_context_irqs = num_irqs - global_irqs - pmu_irqs;
+	if (smmu->num_context_irqs <= 0)
+		return dev_err_probe(dev, -ENODEV,
+				"found %d interrupts but expected at least %d\n",
+				num_irqs, global_irqs + pmu_irqs + 1);
 
-	smmu->irqs = devm_kcalloc(dev, num_irqs, sizeof(*smmu->irqs),
-				  GFP_KERNEL);
-	if (!smmu->irqs) {
-		dev_err(dev, "failed to allocate %d irqs\n", num_irqs);
-		return -ENOMEM;
-	}
+	smmu->irqs = devm_kcalloc(dev, smmu->num_context_irqs,
+				  sizeof(*smmu->irqs), GFP_KERNEL);
+	if (!smmu->irqs)
+		return dev_err_probe(dev, -ENOMEM, "failed to allocate %d irqs\n",
+				     smmu->num_context_irqs);
 
-	for (i = 0; i < num_irqs; ++i) {
-		int irq = platform_get_irq(pdev, i);
+	for (i = 0; i < smmu->num_context_irqs; i++) {
+		int irq = platform_get_irq(pdev, global_irqs + pmu_irqs + i);
 
 		if (irq < 0)
-			return -ENODEV;
+			return irq;
 		smmu->irqs[i] = irq;
 	}
 
@@ -2165,17 +2159,18 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	else
 		global_fault = arm_smmu_global_fault;
 
-	for (i = 0; i < smmu->num_global_irqs; ++i) {
-		err = devm_request_irq(smmu->dev, smmu->irqs[i],
-				       global_fault,
-				       IRQF_SHARED,
-				       "arm-smmu global fault",
-				       smmu);
-		if (err) {
-			dev_err(dev, "failed to request global IRQ %d (%u)\n",
-				i, smmu->irqs[i]);
-			return err;
-		}
+	for (i = 0; i < global_irqs; i++) {
+		int irq = platform_get_irq(pdev, i);
+
+		if (irq < 0)
+			return irq;
+
+		err = devm_request_irq(dev, irq, global_fault, IRQF_SHARED,
+				       "arm-smmu global fault", smmu);
+		if (err)
+			return dev_err_probe(dev, err,
+					"failed to request global IRQ %d (%u)\n",
+					i, irq);
 	}
 
 	err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 432de2f742c3..2b9b42fb6f30 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -318,11 +318,10 @@ struct arm_smmu_device {
 	unsigned long			pa_size;
 	unsigned long			pgsize_bitmap;
 
-	u32				num_global_irqs;
-	u32				num_context_irqs;
+	int				num_context_irqs;
+	int				num_clks;
 	unsigned int			*irqs;
 	struct clk_bulk_data		*clks;
-	int				num_clks;
 
 	spinlock_t			global_sync_lock;
 
-- 
2.28.0.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/6] acpi/iort: Register SMMUv2 PMU interrupts
  2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
  2022-02-17 14:24 ` [PATCH 1/6] iommu/arm-smmu: Account for PMU interrupts Robin Murphy
@ 2022-02-17 14:24 ` Robin Murphy
  2022-02-17 14:24 ` [PATCH 3/6] iommu/arm-smmu: Add DT PMU support Robin Murphy
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2022-02-17 14:24 UTC (permalink / raw)
  To: will, mark.rutland
  Cc: lorenzo.pieralisi, sudeep.holla, linux-arm-kernel, jean-philippe,
	leo.yan, noda.akio

In preparation for SMMUv2 PMU support, make sure the PMU IRQs are
parsed out of IORT and registered somewhere the SMMU driver can find
them. Without making massively invasive changes there aren't many ways
to achieve this; inserting them into the SMMU's resource list between
the global and context IRQs is easy enough to cope with in the driver,
and offers the path of least resistance for the DT binding too.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/acpi/arm64/iort.c             | 18 ++++++++++++++----
 drivers/iommu/arm/arm-smmu/arm-smmu.c |  2 +-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 3b23fb775ac4..175397913be1 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1277,10 +1277,10 @@ static int __init arm_smmu_count_resources(struct acpi_iort_node *node)
 	 * configuration access interrupt.
 	 *
 	 * MMIO address and global fault interrupt resources are always
-	 * present so add them to the context interrupt count as a static
-	 * value.
+	 * present so add them to the context + PMU interrupt count as a
+	 * static value.
 	 */
-	return smmu->context_interrupt_count + 2;
+	return smmu->pmu_interrupt_count + smmu->context_interrupt_count + 2;
 }
 
 static void __init arm_smmu_init_resources(struct resource *res,
@@ -1288,7 +1288,7 @@ static void __init arm_smmu_init_resources(struct resource *res,
 {
 	struct acpi_iort_smmu *smmu;
 	int i, hw_irq, trigger, num_res = 0;
-	u64 *ctx_irq, *glb_irq;
+	u64 *ctx_irq, *glb_irq, *pmu_irq;
 
 	/* Retrieve SMMU specific data */
 	smmu = (struct acpi_iort_smmu *)node->node_data;
@@ -1306,6 +1306,16 @@ static void __init arm_smmu_init_resources(struct resource *res,
 	acpi_iort_register_irq(hw_irq, "arm-smmu-global", trigger,
 				     &res[num_res++]);
 
+	/* PMU IRQs */
+	pmu_irq = ACPI_ADD_PTR(u64, node, smmu->pmu_interrupt_offset);
+	for (i = 0; i < smmu->pmu_interrupt_count; i++) {
+		hw_irq = IORT_IRQ_MASK(pmu_irq[i]);
+		trigger = IORT_IRQ_TRIGGER_MASK(pmu_irq[i]);
+
+		acpi_iort_register_irq(hw_irq, "arm-smmu-pmu", trigger,
+				       &res[num_res++]);
+	}
+
 	/* Context IRQs */
 	ctx_irq = ACPI_ADD_PTR(u64, node, smmu->context_interrupt_offset);
 	for (i = 0; i < smmu->context_interrupt_count; i++) {
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index e50fcf37af58..cbfe4cc914f0 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1969,7 +1969,7 @@ static int arm_smmu_device_acpi_probe(struct arm_smmu_device *smmu,
 
 	/* Ignore the configuration access interrupt */
 	*global_irqs = 1;
-	*pmu_irqs = 0;
+	*pmu_irqs = iort_smmu->pmu_interrupt_count;
 
 	if (iort_smmu->flags & ACPI_IORT_SMMU_COHERENT_WALK)
 		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
-- 
2.28.0.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/6] iommu/arm-smmu: Add DT PMU support
  2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
  2022-02-17 14:24 ` [PATCH 1/6] iommu/arm-smmu: Account for PMU interrupts Robin Murphy
  2022-02-17 14:24 ` [PATCH 2/6] acpi/iort: Register SMMUv2 " Robin Murphy
@ 2022-02-17 14:24 ` Robin Murphy
  2022-03-02 15:33   ` Rob Herring
  2022-02-17 14:24 ` [PATCH 4/6] iommu/smmu: Create child devices for PMUs Robin Murphy
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Robin Murphy @ 2022-02-17 14:24 UTC (permalink / raw)
  To: will, mark.rutland
  Cc: lorenzo.pieralisi, sudeep.holla, linux-arm-kernel, jean-philippe,
	leo.yan, noda.akio

Since IORT describes PMU interrupts rather inflexibly as an inherent
part of the SMMU, the best way to avoid excessive complexity is to make
the way we handle DT look as similar as possible. Fortunately the
de-facto standard for mentioning PMU interrupts at all under the current
binding has been to include them in the global interrupt count, listing
them after the actual fault interrupt(s), so we can capitalise on that.
It's about 9 years too late to redefine "#global-interrupts" to exclude
anything other than context interrupts without breaking compatibility,
so we're stuck with a slightly convoluted definition of PMU interrupts
as an optional subdivision of the "global" interrupts, but it works.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>

---

Not sure whether the count-backwards-from-the-middle nature of "number
of PMU interrupts" is too ugly and "index of first PMU interrupt" might
be any better.
---
 .../devicetree/bindings/iommu/arm,smmu.yaml   | 19 ++++++++++++++++++-
 drivers/iommu/arm/arm-smmu/arm-smmu.c         |  4 ++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index da5381c8ee11..9d39df42528a 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -87,10 +87,18 @@ properties:
 
   '#global-interrupts':
     description: The number of global interrupts exposed by the device.
+      Includes the count of PMU interrupts, if present.
     $ref: /schemas/types.yaml#/definitions/uint32
     minimum: 0
     maximum: 260   # 2 secure, 2 non-secure, and up to 256 perf counters
 
+  '#pmu-interrupts':
+    description: The number of PMU interrupts. Must be equal to the number of
+      implemented counter groups.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 1
+    maximum: 256
+
   '#iommu-cells':
     enum: [ 1, 2 ]
     description: |
@@ -115,6 +123,10 @@ properties:
       context bank. In the case of a single, combined interrupt, it must be
       listed multiple times.
 
+      If a PMU is present, the global interrupt entries consist of any fault
+      interrupts first, followed by #pmu-interrupts entries, one per implemented
+      counter group, specified in order of their indexing by the SMMU.
+
   dma-coherent:
     description: |
       Present if page table walks made by the SMMU are cache coherent with the
@@ -190,9 +202,14 @@ examples:
     smmu1: iommu@ba5e0000 {
             compatible = "arm,smmu-v1";
             reg = <0xba5e0000 0x10000>;
-            #global-interrupts = <2>;
+            #global-interrupts = <6>; /* 2 fault + 4 PMU */
+            #pmu-interrupts = <4>;
             interrupts = <0 32 4>,
                          <0 33 4>,
+                         <0 94 4>, /* This is the first PMU interrupt */
+                         <0 95 4>,
+                         <0 96 4>,
+                         <0 97 4>,
                          <0 34 4>, /* This is the first context interrupt */
                          <0 35 4>,
                          <0 36 4>,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index cbfe4cc914f0..8d6c8106fc1d 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1995,6 +1995,10 @@ static int arm_smmu_device_dt_probe(struct arm_smmu_device *smmu,
 		return dev_err_probe(dev, -ENODEV,
 				     "missing #global-interrupts property\n");
 	*pmu_irqs = 0;
+	of_property_read_u32(dev->of_node, "#pmu-interrupts", pmu_irqs);
+	if (*pmu_irqs > *global_irqs)
+		return dev_err_probe(dev, -EINVAL, "invalid #pmu_interrupts property");
+	*global_irqs -= *pmu_irqs;
 
 	data = of_device_get_match_data(dev);
 	smmu->version = data->version;
-- 
2.28.0.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/6] iommu/smmu: Create child devices for PMUs
  2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
                   ` (2 preceding siblings ...)
  2022-02-17 14:24 ` [PATCH 3/6] iommu/arm-smmu: Add DT PMU support Robin Murphy
@ 2022-02-17 14:24 ` Robin Murphy
  2022-02-17 14:24 ` [PATCH 5/6] perf: Add ARM SMMU PMU driver Robin Murphy
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2022-02-17 14:24 UTC (permalink / raw)
  To: will, mark.rutland
  Cc: lorenzo.pieralisi, sudeep.holla, linux-arm-kernel, jean-philippe,
	leo.yan, noda.akio

Whilst SMMUv3 PMCGs are somewhat free-form, the SMMUv1/v2 PMU has a
well-defined and closely-coupled relationship with the main SMMU
interface. This makes it logical for the SMMU driver to be in charge of
consuming the firmware description and instantiating an abstract PMU
device directly, obviating the need for independent detection methods.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 46 +++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 8d6c8106fc1d..58daf81adae7 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2070,6 +2070,45 @@ err_reset_platform_ops: __maybe_unused;
 	return err;
 }
 
+static int arm_smmu_pmu_init(struct arm_smmu_device *smmu,
+			     int irq_offset, int num_irqs)
+{
+	struct platform_device *pmu, *smmu_pdev;
+	int i, ret = -ENOMEM;
+
+	pmu = platform_device_alloc("arm-smmu-pmu", PLATFORM_DEVID_AUTO);
+	if (!pmu)
+		return -ENOMEM;
+
+	pmu->dev.parent = smmu->dev;
+	pmu->dev.platform_data = (__force void *)smmu->base + (3 << smmu->pgshift);
+	pmu->num_resources = num_irqs;
+	pmu->resource = kcalloc(num_irqs, sizeof(*pmu->resource), GFP_KERNEL);
+	if (!pmu->resource)
+		goto error;
+
+	smmu_pdev = to_platform_device(smmu->dev);
+	for (i = 0; i < num_irqs; i++) {
+		ret = platform_get_irq(smmu_pdev, irq_offset + i);
+		if (ret < 0)
+			goto error;
+
+		pmu->resource[i].start = ret;
+		pmu->resource[i].end = ret;
+		pmu->resource[i].flags = IORESOURCE_IRQ;
+	}
+
+	ret = platform_device_add(pmu);
+	if (ret)
+		goto error;
+
+	return 0;
+error:
+	pmu->dev.platform_data = NULL;
+	platform_device_put(pmu);
+	return ret;
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
 	struct resource *res;
@@ -2216,6 +2255,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 			goto err_unregister_device;
 	}
 
+	/* Try the PMU last so we don't have to worry about cleaning it up */
+	if (pmu_irqs) {
+		err = arm_smmu_pmu_init(smmu, global_irqs, pmu_irqs);
+		if (err)
+			dev_warn(dev, "Failed to create PMU device: %d", err);
+	}
+
 	return 0;
 
 err_unregister_device:
-- 
2.28.0.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/6] perf: Add ARM SMMU PMU driver
  2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
                   ` (3 preceding siblings ...)
  2022-02-17 14:24 ` [PATCH 4/6] iommu/smmu: Create child devices for PMUs Robin Murphy
@ 2022-02-17 14:24 ` Robin Murphy
  2022-02-17 14:24 ` [PATCH 6/6] arm64: dts: Add SMMU PMUs to Juno Robin Murphy
  2022-03-07 22:03 ` [PATCH 0/6] perf: Arm SMMU PMU driver Will Deacon
  6 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2022-02-17 14:24 UTC (permalink / raw)
  To: will, mark.rutland
  Cc: lorenzo.pieralisi, sudeep.holla, linux-arm-kernel, jean-philippe,
	leo.yan, noda.akio

Add a basic driver for the SMMUv2 Performance Monitors Extension. This
exposes the architecturally-defined events along with a fairly low-level
interface, based on the relevant register fields, for filtering.

The relationship between Stream ID Groups and Counter Groups, i.e. which
subsets of events may be visible to which sets of counters, is entirely
implementation defined and non-discoverable, and attempting to define
firmware bindings to describe the mappings would be considerable work
for very little benefit. Thus we expect the user to have the relevant
implementation knowledge, and explicitly specify the appropriate counter
group for each desired event.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/Kconfig        |  24 +-
 drivers/perf/Makefile       |   1 +
 drivers/perf/arm-smmu-pmu.c | 732 ++++++++++++++++++++++++++++++++++++
 3 files changed, 748 insertions(+), 9 deletions(-)
 create mode 100644 drivers/perf/arm-smmu-pmu.c

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index e1a0c44bc686..6ee07c392bdb 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -48,6 +48,15 @@ config ARM_CMN
 	  Support for PMU events monitoring on the Arm CMN-600 Coherent Mesh
 	  Network interconnect.
 
+config ARM_DSU_PMU
+	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
+	depends on ARM64
+	  help
+	  Provides support for performance monitor unit in ARM DynamIQ Shared
+	  Unit (DSU). The DSU integrates one or more cores with an L3 memory
+	  system, control logic. The PMU allows counting various events related
+	  to DSU.
+
 config ARM_PMU
 	depends on ARM || ARM64
 	bool "ARM PMU framework"
@@ -60,6 +69,12 @@ config ARM_PMU_ACPI
 	depends on ARM_PMU && ACPI
 	def_bool y
 
+config ARM_SMMU_PMU
+	tristate "Arm SMMUv2 PMU support"
+	depends on ARM_SMMU
+	help
+	 Support for ARM SMMUv1/v2 performance monitors.
+
 config ARM_SMMU_V3_PMU
 	 tristate "ARM SMMUv3 Performance Monitors Extension"
 	 depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT)
@@ -70,15 +85,6 @@ config ARM_SMMU_V3_PMU
 	   through the SMMU and allow the resulting information to be filtered
 	   based on the Stream ID of the corresponding master.
 
-config ARM_DSU_PMU
-	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
-	depends on ARM64
-	  help
-	  Provides support for performance monitor unit in ARM DynamIQ Shared
-	  Unit (DSU). The DSU integrates one or more cores with an L3 memory
-	  system, control logic. The PMU allows counting various events related
-	  to DSU.
-
 config FSL_IMX8_DDR_PMU
 	tristate "Freescale i.MX8 DDR perf monitor"
 	depends on ARCH_MXC || COMPILE_TEST
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 2db5418d5b0a..a0a1483a6825 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_ARM_CMN) += arm-cmn.o
 obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
 obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_ARM_SMMU_PMU) += arm-smmu-pmu.o
 obj-$(CONFIG_ARM_SMMU_V3_PMU) += arm_smmuv3_pmu.o
 obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o
 obj-$(CONFIG_HISI_PMU) += hisilicon/
diff --git a/drivers/perf/arm-smmu-pmu.c b/drivers/perf/arm-smmu-pmu.c
new file mode 100644
index 000000000000..da2cf00a5569
--- /dev/null
+++ b/drivers/perf/arm-smmu-pmu.c
@@ -0,0 +1,732 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2017-2022 Arm Ltd.
+// PMU driver for the Arm SMMU Performance Monitors Extension
+
+#define pr_fmt(fmt) "arm-smmu-pmu: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of_irq.h>
+#include <linux/slab.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+#define ARM_SMMU_PMEVCTR(n)		(0x0000 + (n) * 4)
+#define ARM_SMMU_PMEVTYPER(n)		(0x0400 + (n) * 4)
+#define ARM_SMMU_PMCGCR(g)		(0x0800 + (g) * 4)
+#define ARM_SMMU_PMCGSMR(g)		(0x0A00 + (g) * 4)
+#define ARM_SMMU_PMCNTENSET(n)		(0x0C00 + (n) / 32 * 4)
+#define ARM_SMMU_PMCNTENCLR(n)		(0x0C20 + (n) / 32 * 4)
+#define ARM_SMMU_PMINTENSET(n)		(0x0C40 + (n) / 32 * 4)
+#define ARM_SMMU_PMINTENCLR(n)		(0x0C60 + (n) / 32 * 4)
+#define ARM_SMMU_PMOVSCLR(n)		(0x0C80 + (n) / 32 * 4)
+#define ARM_SMMU_PMCFGR			0x0e00
+#define ARM_SMMU_PMCR			0x0e04
+#define ARM_SMMU_PMCEID0		0x0e20
+
+#define ARM_SMMU_EVENT_BIT(n)		BIT((n) % 32)
+
+#define ARM_SMMU_PMEVTYPER_EVENT	GENMASK(15, 0)
+#define ARM_SMMU_PMEVTYPER_NSU		BIT(28)
+#define ARM_SMMU_PMEVTYPER_NSP		BIT(29)
+#define ARM_SMMU_PMEVTYPER_U		BIT(30)
+#define ARM_SMMU_PMEVTYPER_P		BIT(31)
+
+#define ARM_SMMU_PMCGCR_NDX		GENMASK(7, 0)
+#define ARM_SMMU_PMCGCR_TCEFCFG		GENMASK(9, 8)
+#define ARM_SMMU_PMCGCR_E		BIT(11)
+#define ARM_SMMU_PMCGCR_SIDG		GENMASK(22, 16)
+#define ARM_SMMU_PMCGCR_CGNC		GENMASK(27, 24)
+
+#define ARM_SMMU_PMCFGR_N		GENMASK(7, 0)
+#define ARM_SMMU_PMCFGR_NCG		GENMASK(31, 24)
+
+#define ARM_SMMU_PMCGSMR_ID		GENMASK(15, 0)
+#define ARM_SMMU_PMCGSMR_MASK		GENMASK(31, 16)
+
+#define ARM_SMMU_PMCR_E			BIT(0)
+#define ARM_SMMU_PMCR_P			BIT(1)
+
+#define ARM_SMMU_PMU_MAX_COUNTERS	256
+
+/*
+ * For system PMUs where sampling events are irrelevant, the often-copied
+ * vague comments about "interrupt latency" accompanying this counter logic
+ * are misleading nonsense; here's the real deal...
+ *
+ * Without a sample period to worry about overrunning, this boils down to a
+ * trick to make reading counter values easier. Starting an n-bit counter
+ * at (2^(n-1)) effectively gives us an (n-1)-bit counter plus an overflow
+ * bit which can be read atomically together; the real trick at play,
+ * though, is that it puts logical overflow (where we lose information)
+ * half a period out-of-phase with arithmetic overflow (where we get an
+ * interrupt). By virtue of two's complement, reads can always compute the
+ * delta from the full counter value as ((new - prev) % (2^n)) regardless
+ * of wraparound, while the job of the overflow interrupt becomes that of
+ * maintaining the out-of-phase relationship; once it fires, the handler
+ * has the remaining half-period in which to accumulate any residual delta
+ * since the last read and restart the counter from the half-maximum value.
+ * In terms of simply reading counter values, this offers no more tolerance
+ * of latency than simply adding a full period to the count once per period,
+ * but means there is no race if, say, an interrupt is left pending as the
+ * counter is stopped. The result is that we avoid accessing the overflow
+ * register (beyond clearing interrupts), and indeed any explicit overflow
+ * arithmetic at all, at the cost of taking interrupts up to twice as often
+ * as we otherwise might, and writing to the counter register each time.
+ */
+#define ARM_SMMU_PMU_COUNTER_INIT_VAL	0x80000000
+
+static int arm_smmu_hp_state;
+
+struct arm_smmu_pmu_cg {
+	struct arm_smmu_pmu *pmu;
+	int irq;
+	u16 first;
+	u16 last;
+};
+
+struct arm_smmu_pmu {
+	void __iomem *base;
+	u16 num_counters;
+	u16 num_cgs;
+	u32 events;
+	struct perf_event **counters;
+	struct arm_smmu_pmu_cg *cgs;
+	struct pmu pmu;
+	struct hlist_node cpuhp_node;
+	int cpu;
+};
+
+#define to_smmu_pmu(x)	container_of(x, struct arm_smmu_pmu, pmu)
+
+struct arm_smmu_format_attr {
+	struct device_attribute attr;
+	u8 cfg;
+	u8 lsb;
+	u8 msb;
+};
+
+#define ARM_SMMU_FORMAT_ATTR(_name, _cfgoff, _mask)			\
+	(&((struct arm_smmu_format_attr[]) {{				\
+		.attr = __ATTR(_name, 0444, arm_smmu_format_show, NULL),\
+		.cfg = _cfgoff / 64,					\
+		.lsb = __builtin_ctz(_mask) + _cfgoff % 64,		\
+		.msb = (31 - __builtin_clz(_mask)) + _cfgoff % 64,	\
+	}})[0].attr.attr)
+
+static ssize_t arm_smmu_format_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct arm_smmu_format_attr *fattr;
+	int n;
+
+	fattr = container_of(attr, typeof(*fattr), attr);
+
+	n = sysfs_emit(buf, "config");
+	if (fattr->cfg > 0)
+		n += sysfs_emit_at(buf, n, "%u", fattr->cfg);
+	n += sysfs_emit_at(buf, n, ":%u", fattr->lsb);
+	if (fattr->msb > fattr->lsb)
+		n += sysfs_emit_at(buf, n, "-%u", fattr->msb);
+
+	return n + sysfs_emit_at(buf, n, "\n");
+}
+
+#define ARM_SMMU_EVENT_ATTR(_name, _var)				\
+	(&((struct dev_ext_attribute[]) {{				\
+		.attr = __ATTR(_name, 0444, arm_smmu_event_show, NULL),	\
+		.var = (void *)_var,					\
+	}})[0].attr.attr)
+
+static ssize_t arm_smmu_event_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, typeof(*eattr), attr);
+	return sysfs_emit(buf, "event=0x%lx\n", (unsigned long)eattr->var);
+}
+
+static bool arm_smmu_event_supported(struct arm_smmu_pmu *pmu, u16 event)
+{
+	/* We can't validate IMP-DEF events; assume they're OK */
+	if (event >= 128)
+		return true;
+	/* Otherwise, check PMCEID0 (nothing is defined in PMCEID1) */
+	return (event < 32) && (pmu->events & (1U << event));
+}
+
+static umode_t arm_smmu_event_attr_is_visible(struct kobject *kobj,
+					      struct attribute *attr,
+					      int unused)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct arm_smmu_pmu *smmu_pmu = to_smmu_pmu(dev_get_drvdata(dev));
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, typeof(*eattr), attr.attr);
+	if (arm_smmu_event_supported(smmu_pmu, (unsigned long)eattr->var))
+		return attr->mode;
+
+	return 0;
+}
+
+static struct attribute *arm_smmu_event_attrs[] = {
+	ARM_SMMU_EVENT_ATTR(cycles,		0x00),
+	ARM_SMMU_EVENT_ATTR(cycles_div64,	0x01),
+	ARM_SMMU_EVENT_ATTR(tlb_alloc,		0x08),
+	ARM_SMMU_EVENT_ATTR(tlb_alloc_r,	0x09),
+	ARM_SMMU_EVENT_ATTR(tlb_alloc_w,	0x0a),
+	ARM_SMMU_EVENT_ATTR(access,		0x10),
+	ARM_SMMU_EVENT_ATTR(access_r,		0x11),
+	ARM_SMMU_EVENT_ATTR(access_w,		0x12),
+	NULL
+};
+
+static const struct attribute_group arm_smmu_event_attrs_group = {
+	.name = "events",
+	.attrs = arm_smmu_event_attrs,
+	.is_visible = arm_smmu_event_attr_is_visible,
+};
+
+#define ARM_SMMU_FORMAT_CFG0L(name, mask) ARM_SMMU_FORMAT_ATTR(name, 0, mask)
+#define ARM_SMMU_FORMAT_CFG0H(name, mask) ARM_SMMU_FORMAT_ATTR(name, 32, mask)
+#define ARM_SMMU_FORMAT_CFG1L(name, mask) ARM_SMMU_FORMAT_ATTR(name, 64, mask)
+#define ARM_SMMU_FORMAT_CFG1H(name, mask) ARM_SMMU_FORMAT_ATTR(name, 96, mask)
+
+static struct attribute *arm_smmu_format_attrs[] = {
+	/* The lower half of config looks like PMEVTYPER... */
+	ARM_SMMU_FORMAT_CFG0L(event,	ARM_SMMU_PMEVTYPER_EVENT),
+	ARM_SMMU_FORMAT_CFG0L(nsu,	ARM_SMMU_PMEVTYPER_NSU),
+	ARM_SMMU_FORMAT_CFG0L(nsp,	ARM_SMMU_PMEVTYPER_NSP),
+	ARM_SMMU_FORMAT_CFG0L(u,	ARM_SMMU_PMEVTYPER_U),
+	ARM_SMMU_FORMAT_CFG0L(p,	ARM_SMMU_PMEVTYPER_P),
+	/* ...and the upper half is PMCGCR */
+	ARM_SMMU_FORMAT_CFG0H(ndx,	ARM_SMMU_PMCGCR_NDX),
+	ARM_SMMU_FORMAT_CFG0H(tcefcfg,	ARM_SMMU_PMCGCR_TCEFCFG),
+	/* Similarly, PMCGSMR goes in config1... */
+	ARM_SMMU_FORMAT_CFG1L(smr_id,	ARM_SMMU_PMCGSMR_ID),
+	ARM_SMMU_FORMAT_CFG1L(smr_mask,	ARM_SMMU_PMCGSMR_MASK),
+	/* ...with the counter group above it */
+	ARM_SMMU_FORMAT_CFG1H(cg,	0xff),
+	NULL
+};
+
+#define ARM_SMMU_CONFIG_PMEVTYPER(cfg)						\
+	((cfg) & (ARM_SMMU_PMEVTYPER_EVENT | ARM_SMMU_PMEVTYPER_NSU |		\
+	ARM_SMMU_PMEVTYPER_NSP | ARM_SMMU_PMEVTYPER_U |	ARM_SMMU_PMEVTYPER_P))
+#define ARM_SMMU_CONFIG_PMCGCR(cfg) \
+	((cfg) >> 32 & (ARM_SMMU_PMCGCR_TCEFCFG | ARM_SMMU_PMCGCR_NDX))
+#define ARM_SMMU_CONFIG1_PMCGSMR(cfg1)	((u32)(cfg1))
+#define ARM_SMMU_CONFIG1_CGIDX(cfg1)	((cfg1) >> 32 & 0xff)
+
+static const struct attribute_group arm_smmu_format_attrs_group = {
+	.name = "format",
+	.attrs = arm_smmu_format_attrs,
+};
+
+static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr,
+			    char *buf)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(pmu->cpu));
+}
+
+static struct device_attribute arm_smmu_cpumask_attr = __ATTR_RO(cpumask);
+
+static struct attribute *arm_smmu_cpumask_attrs[] = {
+	&arm_smmu_cpumask_attr.attr,
+	NULL,
+};
+
+static struct attribute_group arm_smmu_cpumask_attr_group = {
+	.attrs = arm_smmu_cpumask_attrs,
+};
+
+static const struct attribute_group *arm_smmu_attr_groups[] = {
+	&arm_smmu_event_attrs_group,
+	&arm_smmu_format_attrs_group,
+	&arm_smmu_cpumask_attr_group,
+	NULL
+};
+
+static void arm_smmu_pmu_enable(struct pmu *pmu)
+{
+	struct arm_smmu_pmu *smmu_pmu = to_smmu_pmu(pmu);
+
+	writel_relaxed(ARM_SMMU_PMCR_E, smmu_pmu->base + ARM_SMMU_PMCR);
+}
+
+static void arm_smmu_pmu_disable(struct pmu *pmu)
+{
+	struct arm_smmu_pmu *smmu_pmu = to_smmu_pmu(pmu);
+
+	writel_relaxed(0, smmu_pmu->base + ARM_SMMU_PMCR);
+}
+
+static u32 arm_smmu_read_ctr(struct perf_event *event)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(event->pmu);
+
+	return readl_relaxed(pmu->base + ARM_SMMU_PMEVCTR(event->hw.idx));
+}
+
+static void arm_smmu_write_ctr(struct perf_event *event, u32 val)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(event->pmu);
+
+	writel_relaxed(val, pmu->base + ARM_SMMU_PMEVCTR(event->hw.idx));
+}
+
+static void arm_smmu_init_ctr(struct perf_event *event)
+{
+	local64_set(&event->hw.prev_count, ARM_SMMU_PMU_COUNTER_INIT_VAL);
+	arm_smmu_write_ctr(event, ARM_SMMU_PMU_COUNTER_INIT_VAL);
+}
+
+static void arm_smmu_event_read(struct perf_event *event)
+{
+	local64_t *hw_prev = &event->hw.prev_count;
+	u32 new, prev;
+
+	do {
+		prev = local64_read(hw_prev);
+		new = arm_smmu_read_ctr(event);
+	} while (local64_cmpxchg(hw_prev, prev, new) != prev);
+
+	local64_add(new - prev, &event->count);
+}
+
+static void arm_smmu_event_start(struct perf_event *event, int flags)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	if (flags & PERF_EF_RELOAD)
+		arm_smmu_write_ctr(event, local64_read(&event->hw.prev_count));
+
+	writel_relaxed(ARM_SMMU_EVENT_BIT(idx),
+		       pmu->base + ARM_SMMU_PMCNTENSET(idx));
+}
+
+static void arm_smmu_event_stop(struct perf_event *event, int flags)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	writel_relaxed(ARM_SMMU_EVENT_BIT(idx),
+		       pmu->base + ARM_SMMU_PMCNTENCLR(idx));
+
+	if (flags & PERF_EF_UPDATE)
+		arm_smmu_event_read(event);
+}
+
+static bool arm_smmu_event_compatible(struct perf_event *a,
+				      struct perf_event *b)
+{
+	u32 pmcgcr_a, pmcgcr_b, pmcgsmr_a, pmcgsmr_b, mask;
+	int tcefcfg;
+
+	pmcgcr_a = ARM_SMMU_CONFIG_PMCGCR(a->attr.config);
+	pmcgcr_b = ARM_SMMU_CONFIG_PMCGCR(b->attr.config);
+	tcefcfg = FIELD_GET(ARM_SMMU_PMCGCR_TCEFCFG, pmcgcr_a);
+	if (tcefcfg != FIELD_GET(ARM_SMMU_PMCGCR_TCEFCFG, pmcgcr_b))
+		return false;
+
+	if (tcefcfg == 2 && FIELD_GET(ARM_SMMU_PMCGCR_NDX, pmcgcr_a ^ pmcgcr_b))
+		return false;
+
+	pmcgsmr_a = ARM_SMMU_CONFIG1_PMCGSMR(a->attr.config1);
+	pmcgsmr_b = ARM_SMMU_CONFIG1_PMCGSMR(b->attr.config1);
+	mask = FIELD_GET(ARM_SMMU_PMCGSMR_MASK, pmcgsmr_a);
+	if (tcefcfg == 1 && ((pmcgsmr_a ^ pmcgsmr_b) & ~mask))
+		return false;
+
+	return true;
+}
+
+static bool arm_smmu_group_event_ok(struct perf_event *event,
+				    struct perf_event *group_event,
+				    int *num_counters)
+{
+	if (is_software_event(group_event))
+		return true;
+
+	if (group_event->pmu != event->pmu)
+		return false;
+
+	if (ARM_SMMU_CONFIG1_CGIDX(event->attr.config1) !=
+	    ARM_SMMU_CONFIG1_CGIDX(group_event->attr.config1))
+		return true;
+
+	if (!arm_smmu_event_compatible(event, group_event))
+		return false;
+
+	if (--*num_counters < 0)
+		return false;
+
+	return true;
+}
+
+static bool arm_smmu_group_valid(struct arm_smmu_pmu *pmu,
+				 struct perf_event *event)
+{
+	struct perf_event *leader = event->group_leader;
+	struct perf_event *sibling;
+	int cgidx, free_counters;
+
+	cgidx = ARM_SMMU_CONFIG1_CGIDX(event->attr.config1);
+	free_counters = pmu->cgs[cgidx].last - pmu->cgs[cgidx].first;
+
+	if (!arm_smmu_group_event_ok(event, leader, &free_counters))
+		return false;
+
+	for_each_sibling_event(sibling, leader) {
+		if (!arm_smmu_group_event_ok(event, sibling, &free_counters))
+			return false;
+	}
+	return true;
+}
+
+static bool arm_smmu_event_valid(struct arm_smmu_pmu *pmu,
+				 struct perf_event *event)
+{
+	int cgidx, tcefcfg;
+	u32 pmcgcr;
+
+	cgidx = ARM_SMMU_CONFIG1_CGIDX(event->attr.config1);
+	if (cgidx >= pmu->num_cgs)
+		return false;
+
+	pmcgcr = ARM_SMMU_CONFIG_PMCGCR(event->attr.config);
+	tcefcfg = FIELD_GET(ARM_SMMU_PMCGCR_TCEFCFG, pmcgcr);
+	if (tcefcfg == 3)
+		return false;
+
+	return true;
+}
+
+static int arm_smmu_event_init(struct perf_event *event)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(event->pmu);
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
+		return -EINVAL;
+
+	event->cpu = pmu->cpu;
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	if (!arm_smmu_event_valid(pmu, event))
+		return -EINVAL;
+
+	if (!arm_smmu_event_supported(pmu, event->attr.config))
+		return -EOPNOTSUPP;
+
+	if (event->group_leader != event && !arm_smmu_group_valid(pmu, event))
+		return -EINVAL;
+
+	return 0;
+}
+
+static int arm_smmu_find_counter(struct arm_smmu_pmu *pmu, int cgidx,
+				 struct perf_event *event)
+{
+	struct arm_smmu_pmu_cg *cg = &pmu->cgs[cgidx];
+	int i, alloc_idx = -ENOSPC;
+
+	for (i = cg->first; i <= cg->last; i++) {
+		struct perf_event *tmp = pmu->counters[i];
+
+		if (!tmp)
+			alloc_idx = i;
+		else if (!arm_smmu_event_compatible(event, tmp))
+			return -EINVAL;
+	}
+
+	return alloc_idx;
+}
+
+static int arm_smmu_event_add(struct perf_event *event, int flags)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(event->pmu);
+	struct hw_perf_event *hw = &event->hw;
+	u64 config = event->attr.config;
+	u64 config1 = event->attr.config1;
+	int cgidx, ctridx;
+
+	cgidx = ARM_SMMU_CONFIG1_CGIDX(event->attr.config1);
+	ctridx = arm_smmu_find_counter(pmu, cgidx, event);
+	if (ctridx < 0)
+		return ctridx;
+
+	pmu->counters[ctridx] = event;
+	hw->idx = ctridx;
+	arm_smmu_init_ctr(event);
+
+	writel_relaxed(ARM_SMMU_CONFIG_PMCGCR(config),
+		       pmu->base + ARM_SMMU_PMCGCR(cgidx));
+	writel_relaxed(ARM_SMMU_CONFIG1_PMCGSMR(config1),
+		       pmu->base + ARM_SMMU_PMCGSMR(cgidx));
+	writel_relaxed(ARM_SMMU_CONFIG_PMEVTYPER(config),
+		       pmu->base + ARM_SMMU_PMEVTYPER(ctridx));
+
+	if (flags & PERF_EF_START)
+		arm_smmu_event_start(event, 0);
+
+	return 0;
+}
+
+static void arm_smmu_event_del(struct perf_event *event, int flags)
+{
+	struct arm_smmu_pmu *pmu = to_smmu_pmu(event->pmu);
+
+	arm_smmu_event_stop(event, PERF_EF_UPDATE);
+	pmu->counters[event->hw.idx] = NULL;
+}
+
+static irqreturn_t arm_smmu_pmu_irq(int irq, void *dev)
+{
+	struct arm_smmu_pmu_cg *cg = dev;
+	void __iomem *pmovsclr;
+	u64 set, clear, bit;
+	int i;
+
+	pmovsclr = cg->pmu->base + ARM_SMMU_PMOVSCLR(cg->first);
+	set = readl_relaxed(pmovsclr);
+	clear = 0;
+	/* At worst, the bitmap for a given group may straddle two registers */
+	if ((cg->first ^ cg->last) & 32)
+		set |= (u64)readl_relaxed(pmovsclr + 4) << 32;
+
+	bit = ARM_SMMU_EVENT_BIT(cg->first);
+	for (i = cg->first; i <= cg->last; i++, bit <<= 1) {
+		if (!(set & bit))
+			continue;
+
+		clear |= bit;
+		arm_smmu_event_read(cg->pmu->counters[i]);
+		arm_smmu_init_ctr(cg->pmu->counters[i]);
+	}
+
+	if (!clear)
+		return IRQ_NONE;
+
+	if (lower_32_bits(clear))
+		writel_relaxed(lower_32_bits(clear), pmovsclr);
+	if (upper_32_bits(clear))
+		writel_relaxed(upper_32_bits(clear), pmovsclr + 4);
+
+	return IRQ_HANDLED;
+}
+
+static void arm_smmu_pmu_reset(struct arm_smmu_pmu *pmu, bool irq_set)
+{
+	int i, pminten;
+
+	writel_relaxed(ARM_SMMU_PMCR_P, pmu->base + ARM_SMMU_PMCR);
+	for (i = 0; i < pmu->num_counters; i += 32) {
+		pminten = irq_set ? ARM_SMMU_PMINTENSET(i) :
+				    ARM_SMMU_PMINTENCLR(i);
+		writel_relaxed(~0UL, pmu->base + pminten);
+		writel_relaxed(~0UL, pmu->base + ARM_SMMU_PMCNTENCLR(i));
+		writel_relaxed(~0UL, pmu->base + ARM_SMMU_PMOVSCLR(i));
+	}
+}
+
+static int arm_smmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct arm_smmu_pmu *pmu;
+	unsigned int target;
+	int i;
+
+	pmu = hlist_entry_safe(node, struct arm_smmu_pmu, cpuhp_node);
+	if (cpu != pmu->cpu)
+		return 0;
+
+	target = cpumask_any_but(cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	perf_pmu_migrate_context(&pmu->pmu, cpu, target);
+	for (i = 0; i < pmu->num_cgs; i++)
+		irq_set_affinity(pmu->cgs[i].irq, cpumask_of(target));
+	pmu->cpu = target;
+
+	return 0;
+}
+
+static int arm_smmu_pmu_probe(struct platform_device *pdev)
+{
+	struct arm_smmu_pmu *pmu;
+	struct device *dev = &pdev->dev;
+	void __iomem *base;
+	const char *name;
+	int err, i, counter_offset;
+	u32 reg;
+
+	base = (void __iomem *)dev->platform_data;
+	if (!base)
+		return -EPROBE_DEFER;
+
+	pmu = devm_kzalloc(dev, sizeof(*pmu), GFP_KERNEL);
+	if (!pmu)
+		return -ENOMEM;
+
+	pmu->base = base;
+	pmu->events = readl_relaxed(base + ARM_SMMU_PMCEID0);
+
+	reg = readl_relaxed(base + ARM_SMMU_PMCFGR);
+	pmu->num_cgs = FIELD_GET(ARM_SMMU_PMCFGR_NCG, reg) + 1;
+	pmu->cgs = devm_kcalloc(dev, pmu->num_cgs,
+				sizeof(*pmu->cgs), GFP_KERNEL);
+	if (!pmu->cgs)
+		return -ENOMEM;
+
+	pmu->num_counters = FIELD_GET(ARM_SMMU_PMCFGR_N, reg) + 1;
+	pmu->counters = devm_kcalloc(dev, pmu->num_counters,
+				     sizeof(*pmu->counters), GFP_KERNEL);
+	if (!pmu->counters)
+		return -ENOMEM;
+
+	arm_smmu_pmu_reset(pmu, true);
+
+	counter_offset = 0;
+	for (i = 0; i < pmu->num_cgs; i++) {
+		int irq, cg_num_counters;
+
+		reg = readl_relaxed(base + ARM_SMMU_PMCGCR(i));
+		cg_num_counters = FIELD_GET(ARM_SMMU_PMCGCR_CGNC, reg);
+		dev_dbg(dev, "cg %d, %d counters, sidg %ld\n", i,
+			cg_num_counters, FIELD_GET(ARM_SMMU_PMCGCR_SIDG, reg));
+		pmu->cgs[i].first = counter_offset;
+		pmu->cgs[i].last = counter_offset + cg_num_counters - 1;
+		counter_offset += cg_num_counters;
+
+		irq = platform_get_irq(pdev, i);
+		if (irq < 0)
+			return irq;
+
+		pmu->cgs[i].pmu = pmu;
+		pmu->cgs[i].irq = irq;
+		/*
+		 * IRQF_SHARED here is relying entirely on the expectation that
+		 * at most we're only ever sharing with arm-smmu for the same
+		 * SMMU instance our PMU belongs to, and that that driver will
+		 * not touch the affinity. It's sketchy, but about the best we
+		 * can do given that there most definitely exists hardware
+		 * using a single combined IRQ for everything.
+		 */
+		err = devm_request_irq(dev, irq, arm_smmu_pmu_irq,
+				       IRQF_SHARED | IRQF_NOBALANCING,
+				       "arm-smmu pmu", &pmu->cgs[i]);
+		if (err)
+			return err;
+
+		writel_relaxed(0, base + ARM_SMMU_PMCGCR(i));
+		writel_relaxed(0, base + ARM_SMMU_PMCGSMR(i));
+	}
+	if (WARN_ON(counter_offset != pmu->num_counters))
+		return -ENODEV;
+
+	dev_info(dev, "PMU with %d counters in %d groups\n",
+		 pmu->num_counters, pmu->num_cgs);
+
+	platform_set_drvdata(pdev, pmu);
+
+	pmu->cpu = raw_smp_processor_id();
+	pmu->pmu = (struct pmu) {
+		.attr_groups = arm_smmu_attr_groups,
+		.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
+		.task_ctx_nr = perf_invalid_context,
+		.pmu_enable = arm_smmu_pmu_enable,
+		.pmu_disable = arm_smmu_pmu_disable,
+		.event_init = arm_smmu_event_init,
+		.add = arm_smmu_event_add,
+		.del = arm_smmu_event_del,
+		.start = arm_smmu_event_start,
+		.stop = arm_smmu_event_stop,
+		.read = arm_smmu_event_read,
+	};
+
+	/* It's helpful if the PMU device correlates to the platform device */
+	name = devm_kasprintf(dev, GFP_KERNEL, "arm_smmu_%d", pdev->id);
+	if (!name)
+		return -ENOMEM;
+
+	err = cpuhp_state_add_instance(arm_smmu_hp_state, &pmu->cpuhp_node);
+	if (err)
+		return err;
+
+	err = perf_pmu_register(&pmu->pmu, name, -1);
+	if (err)
+		cpuhp_state_remove_instance_nocalls(arm_smmu_hp_state, &pmu->cpuhp_node);
+	return err;
+}
+
+static int arm_smmu_pmu_remove(struct platform_device *pdev)
+{
+	struct arm_smmu_pmu *pmu = platform_get_drvdata(pdev);
+
+	perf_pmu_unregister(&pmu->pmu);
+
+	/* Make sure it's turned off, and devres will do the rest */
+	arm_smmu_pmu_reset(pmu, false);
+
+	return 0;
+}
+
+static struct platform_device_id arm_smmu_pmu_id[] = {
+	{ .name = "arm-smmu-pmu" },
+	{ }
+};
+MODULE_DEVICE_TABLE(platform, arm_smmu_pmu_id);
+
+static struct platform_driver arm_smmu_pmu_driver = {
+	.driver = { .name = "arm-smmu-pmu" },
+	.id_table = arm_smmu_pmu_id,
+	.probe = arm_smmu_pmu_probe,
+	.remove = arm_smmu_pmu_remove,
+};
+
+static int __init arm_smmu_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "perf/arm/smmu:online",
+				      NULL,
+				      arm_smmu_offline_cpu);
+	if (ret < 0)
+		return ret;
+
+	arm_smmu_hp_state = ret;
+
+	ret = platform_driver_register(&arm_smmu_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(arm_smmu_hp_state);
+	return ret;
+}
+
+static void __exit arm_smmu_pmu_exit(void)
+{
+	platform_driver_unregister(&arm_smmu_pmu_driver);
+	cpuhp_remove_multi_state(arm_smmu_hp_state);
+}
+
+module_init(arm_smmu_pmu_init);
+module_exit(arm_smmu_pmu_exit);
+
+MODULE_DESCRIPTION("PMU driver for Arm SMMU Performance Monitors");
+MODULE_AUTHOR("Robin Murphy <robin.murphy@arm.com>");
+MODULE_LICENSE("GPL v2");
-- 
2.28.0.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/6] arm64: dts: Add SMMU PMUs to Juno
  2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
                   ` (4 preceding siblings ...)
  2022-02-17 14:24 ` [PATCH 5/6] perf: Add ARM SMMU PMU driver Robin Murphy
@ 2022-02-17 14:24 ` Robin Murphy
  2022-03-07 22:03 ` [PATCH 0/6] perf: Arm SMMU PMU driver Will Deacon
  6 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2022-02-17 14:24 UTC (permalink / raw)
  To: will, mark.rutland
  Cc: lorenzo.pieralisi, sudeep.holla, linux-arm-kernel, jean-philippe,
	leo.yan, noda.akio

MMU-401 implements a single counter group, with correspondingly a
single overflow interrupt, which is also muxed into the combined
interrupt output; the integrations in Juno rely on the latter.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 arch/arm64/boot/dts/arm/juno-base.dtsi | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/arm/juno-base.dtsi b/arch/arm64/boot/dts/arm/juno-base.dtsi
index 6288e104a089..9e242ea84871 100644
--- a/arch/arm64/boot/dts/arm/juno-base.dtsi
+++ b/arch/arm64/boot/dts/arm/juno-base.dtsi
@@ -48,20 +48,25 @@ smmu_pcie: iommu@2b500000 {
 		compatible = "arm,mmu-401", "arm,smmu-v1";
 		reg = <0x0 0x2b500000 0x0 0x10000>;
 		interrupts = <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>;
 		#iommu-cells = <1>;
-		#global-interrupts = <1>;
+		#global-interrupts = <2>;
+		#pmu-interrupts = <1>;
 		dma-coherent;
 		status = "disabled";
+
 	};
 
 	smmu_etr: iommu@2b600000 {
 		compatible = "arm,mmu-401", "arm,smmu-v1";
 		reg = <0x0 0x2b600000 0x0 0x10000>;
 		interrupts = <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>;
 		#iommu-cells = <1>;
-		#global-interrupts = <1>;
+		#global-interrupts = <2>;
+		#pmu-interrupts = <1>;
 		dma-coherent;
 		power-domains = <&scpi_devpd 0>;
 	};
@@ -638,9 +643,11 @@ smmu_dma: iommu@7fb00000 {
 		compatible = "arm,mmu-401", "arm,smmu-v1";
 		reg = <0x0 0x7fb00000 0x0 0x10000>;
 		interrupts = <GIC_SPI 95 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 95 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 95 IRQ_TYPE_LEVEL_HIGH>;
 		#iommu-cells = <1>;
-		#global-interrupts = <1>;
+		#global-interrupts = <2>;
+		#pmu-interrupts = <1>;
 		dma-coherent;
 	};
 
@@ -648,27 +655,34 @@ smmu_hdlcd1: iommu@7fb10000 {
 		compatible = "arm,mmu-401", "arm,smmu-v1";
 		reg = <0x0 0x7fb10000 0x0 0x10000>;
 		interrupts = <GIC_SPI 99 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 99 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 99 IRQ_TYPE_LEVEL_HIGH>;
 		#iommu-cells = <1>;
-		#global-interrupts = <1>;
+		#global-interrupts = <2>;
+		#pmu-interrupts = <1>;
+
 	};
 
 	smmu_hdlcd0: iommu@7fb20000 {
 		compatible = "arm,mmu-401", "arm,smmu-v1";
 		reg = <0x0 0x7fb20000 0x0 0x10000>;
 		interrupts = <GIC_SPI 97 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 97 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 97 IRQ_TYPE_LEVEL_HIGH>;
 		#iommu-cells = <1>;
-		#global-interrupts = <1>;
+		#global-interrupts = <2>;
+		#pmu-interrupts = <1>;
 	};
 
 	smmu_usb: iommu@7fb30000 {
 		compatible = "arm,mmu-401", "arm,smmu-v1";
 		reg = <0x0 0x7fb30000 0x0 0x10000>;
 		interrupts = <GIC_SPI 101 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 101 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 101 IRQ_TYPE_LEVEL_HIGH>;
 		#iommu-cells = <1>;
-		#global-interrupts = <1>;
+		#global-interrupts = <2>;
+		#pmu-interrupts = <1>;
 		dma-coherent;
 	};
 
-- 
2.28.0.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/6] iommu/arm-smmu: Add DT PMU support
  2022-02-17 14:24 ` [PATCH 3/6] iommu/arm-smmu: Add DT PMU support Robin Murphy
@ 2022-03-02 15:33   ` Rob Herring
  2022-03-02 17:03     ` Robin Murphy
  0 siblings, 1 reply; 10+ messages in thread
From: Rob Herring @ 2022-03-02 15:33 UTC (permalink / raw)
  To: Robin Murphy
  Cc: will, mark.rutland, lorenzo.pieralisi, sudeep.holla,
	linux-arm-kernel, jean-philippe, leo.yan, noda.akio

Send DT patches to the DT list if you want them reviewed with some 
guarantee that I'll see them. (Except right now as PW stopped getting 
mail. :( )

On Thu, Feb 17, 2022 at 02:24:17PM +0000, Robin Murphy wrote:
> Since IORT describes PMU interrupts rather inflexibly as an inherent

What's IORT? ;)

> part of the SMMU, the best way to avoid excessive complexity is to make
> the way we handle DT look as similar as possible. Fortunately the
> de-facto standard for mentioning PMU interrupts at all under the current
> binding has been to include them in the global interrupt count, listing
> them after the actual fault interrupt(s), so we can capitalise on that.
> It's about 9 years too late to redefine "#global-interrupts" to exclude
> anything other than context interrupts without breaking compatibility,
> so we're stuck with a slightly convoluted definition of PMU interrupts
> as an optional subdivision of the "global" interrupts, but it works.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> 
> ---
> 
> Not sure whether the count-backwards-from-the-middle nature of "number
> of PMU interrupts" is too ugly and "index of first PMU interrupt" might
> be any better.

Can this be solved with 'interrupt-names' instead deprecating 
#global-interrupts along the way?

> ---
>  .../devicetree/bindings/iommu/arm,smmu.yaml   | 19 ++++++++++++++++++-
>  drivers/iommu/arm/arm-smmu/arm-smmu.c         |  4 ++++
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index da5381c8ee11..9d39df42528a 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -87,10 +87,18 @@ properties:
>  
>    '#global-interrupts':
>      description: The number of global interrupts exposed by the device.
> +      Includes the count of PMU interrupts, if present.
>      $ref: /schemas/types.yaml#/definitions/uint32
>      minimum: 0
>      maximum: 260   # 2 secure, 2 non-secure, and up to 256 perf counters
>  
> +  '#pmu-interrupts':

It would have been great if everything that's a count/size used '#', but 
that didn't happen. So I'm not that wild about using '#' randomly at 
all. 

This needs a vendor prefix (arm,).

> +    description: The number of PMU interrupts. Must be equal to the number of
> +      implemented counter groups.
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +    minimum: 1
> +    maximum: 256
> +
>    '#iommu-cells':
>      enum: [ 1, 2 ]
>      description: |
> @@ -115,6 +123,10 @@ properties:
>        context bank. In the case of a single, combined interrupt, it must be
>        listed multiple times.
>  
> +      If a PMU is present, the global interrupt entries consist of any fault
> +      interrupts first, followed by #pmu-interrupts entries, one per implemented
> +      counter group, specified in order of their indexing by the SMMU.
> +
>    dma-coherent:
>      description: |
>        Present if page table walks made by the SMMU are cache coherent with the
> @@ -190,9 +202,14 @@ examples:
>      smmu1: iommu@ba5e0000 {
>              compatible = "arm,smmu-v1";
>              reg = <0xba5e0000 0x10000>;
> -            #global-interrupts = <2>;
> +            #global-interrupts = <6>; /* 2 fault + 4 PMU */
> +            #pmu-interrupts = <4>;
>              interrupts = <0 32 4>,
>                           <0 33 4>,
> +                         <0 94 4>, /* This is the first PMU interrupt */
> +                         <0 95 4>,
> +                         <0 96 4>,
> +                         <0 97 4>,
>                           <0 34 4>, /* This is the first context interrupt */
>                           <0 35 4>,
>                           <0 36 4>,
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index cbfe4cc914f0..8d6c8106fc1d 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -1995,6 +1995,10 @@ static int arm_smmu_device_dt_probe(struct arm_smmu_device *smmu,
>  		return dev_err_probe(dev, -ENODEV,
>  				     "missing #global-interrupts property\n");
>  	*pmu_irqs = 0;
> +	of_property_read_u32(dev->of_node, "#pmu-interrupts", pmu_irqs);
> +	if (*pmu_irqs > *global_irqs)
> +		return dev_err_probe(dev, -EINVAL, "invalid #pmu_interrupts property");
> +	*global_irqs -= *pmu_irqs;
>  
>  	data = of_device_get_match_data(dev);
>  	smmu->version = data->version;
> -- 
> 2.28.0.dirty
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/6] iommu/arm-smmu: Add DT PMU support
  2022-03-02 15:33   ` Rob Herring
@ 2022-03-02 17:03     ` Robin Murphy
  0 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2022-03-02 17:03 UTC (permalink / raw)
  To: Rob Herring
  Cc: will, mark.rutland, lorenzo.pieralisi, sudeep.holla,
	linux-arm-kernel, jean-philippe, leo.yan, noda.akio

On 2022-03-02 15:33, Rob Herring wrote:
> Send DT patches to the DT list if you want them reviewed with some
> guarantee that I'll see them. (Except right now as PW stopped getting
> mail. :( )

Oops, sorry, I think I had some sort of mental block there - I should 
equally have cc'd the IOMMU list too.

> On Thu, Feb 17, 2022 at 02:24:17PM +0000, Robin Murphy wrote:
>> Since IORT describes PMU interrupts rather inflexibly as an inherent
> 
> What's IORT? ;)

See the preceding patch :P

But indeed this particular sentence is still its untouched 5-year-old 
self... today I'd tend to write "ACPI IORT", is that better?

>> part of the SMMU, the best way to avoid excessive complexity is to make
>> the way we handle DT look as similar as possible. Fortunately the
>> de-facto standard for mentioning PMU interrupts at all under the current
>> binding has been to include them in the global interrupt count, listing
>> them after the actual fault interrupt(s), so we can capitalise on that.
>> It's about 9 years too late to redefine "#global-interrupts" to exclude
>> anything other than context interrupts without breaking compatibility,
>> so we're stuck with a slightly convoluted definition of PMU interrupts
>> as an optional subdivision of the "global" interrupts, but it works.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>>
>> ---
>>
>> Not sure whether the count-backwards-from-the-middle nature of "number
>> of PMU interrupts" is too ugly and "index of first PMU interrupt" might
>> be any better.
> 
> Can this be solved with 'interrupt-names' instead deprecating
> #global-interrupts along the way?

My gut feeling is that that's not realistically feasible - we don't have 
a fixed set of IRQ lines with well-defined names, we have an imp-def set 
of "things that might raise one or more of global fault conditions", 
plus some number of context interrupts that work one of two ways, plus 
possibly some number of PMU counter group interrupts, with the 
additional fun that for some implementations those may all be the same 
physical signal (and I've now seen at least one SoC inexplicably wire up 
a combined interrupt *as well* as the individual outputs, and describe 
them all!)

My main concern is not breaking backward-compatibility for a new DT with 
an old OS. From past experience, I'm pretty sure Xen would be up in arms 
about any breakage in that direction, so "#global-interrupts" is sadly 
going nowhere for the foreseeable future. However even just thinking 
about implementing such a thing in the Linux driver puts me off - we'd 
have to take some horrible backwards approach where we look at all the 
names first to figure out which IRQs to get, and such an abuse of 
"interrupt-names" sounds worse than what it's trying to avoid.

The other approach I tried was to have a "pmu" sub-node with its own 
"interrupts" property, but that turned out to cause problems if it was 
behind an "interrupt-map", for reasons which didn't entirely make sense. 
Plus the corresponding DT changes were uglier and more invasive - the 
nice thing about the method here is that PMU support becomes a simple 
one-line addition to many existing DTs.

>> ---
>>   .../devicetree/bindings/iommu/arm,smmu.yaml   | 19 ++++++++++++++++++-
>>   drivers/iommu/arm/arm-smmu/arm-smmu.c         |  4 ++++
>>   2 files changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> index da5381c8ee11..9d39df42528a 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> @@ -87,10 +87,18 @@ properties:
>>   
>>     '#global-interrupts':
>>       description: The number of global interrupts exposed by the device.
>> +      Includes the count of PMU interrupts, if present.
>>       $ref: /schemas/types.yaml#/definitions/uint32
>>       minimum: 0
>>       maximum: 260   # 2 secure, 2 non-secure, and up to 256 perf counters
>>   
>> +  '#pmu-interrupts':
> 
> It would have been great if everything that's a count/size used '#', but
> that didn't happen. So I'm not that wild about using '#' randomly at
> all.
> 
> This needs a vendor prefix (arm,).

Sure - I went for consistency with what's already established, but since 
I'm similarly none too fond of it, swapping for something like 
"arm,pmu-irq-start" is fine by me.

Thanks,
Robin.

>> +    description: The number of PMU interrupts. Must be equal to the number of
>> +      implemented counter groups.
>> +    $ref: /schemas/types.yaml#/definitions/uint32
>> +    minimum: 1
>> +    maximum: 256
>> +
>>     '#iommu-cells':
>>       enum: [ 1, 2 ]
>>       description: |
>> @@ -115,6 +123,10 @@ properties:
>>         context bank. In the case of a single, combined interrupt, it must be
>>         listed multiple times.
>>   
>> +      If a PMU is present, the global interrupt entries consist of any fault
>> +      interrupts first, followed by #pmu-interrupts entries, one per implemented
>> +      counter group, specified in order of their indexing by the SMMU.
>> +
>>     dma-coherent:
>>       description: |
>>         Present if page table walks made by the SMMU are cache coherent with the
>> @@ -190,9 +202,14 @@ examples:
>>       smmu1: iommu@ba5e0000 {
>>               compatible = "arm,smmu-v1";
>>               reg = <0xba5e0000 0x10000>;
>> -            #global-interrupts = <2>;
>> +            #global-interrupts = <6>; /* 2 fault + 4 PMU */
>> +            #pmu-interrupts = <4>;
>>               interrupts = <0 32 4>,
>>                            <0 33 4>,
>> +                         <0 94 4>, /* This is the first PMU interrupt */
>> +                         <0 95 4>,
>> +                         <0 96 4>,
>> +                         <0 97 4>,
>>                            <0 34 4>, /* This is the first context interrupt */
>>                            <0 35 4>,
>>                            <0 36 4>,
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> index cbfe4cc914f0..8d6c8106fc1d 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> @@ -1995,6 +1995,10 @@ static int arm_smmu_device_dt_probe(struct arm_smmu_device *smmu,
>>   		return dev_err_probe(dev, -ENODEV,
>>   				     "missing #global-interrupts property\n");
>>   	*pmu_irqs = 0;
>> +	of_property_read_u32(dev->of_node, "#pmu-interrupts", pmu_irqs);
>> +	if (*pmu_irqs > *global_irqs)
>> +		return dev_err_probe(dev, -EINVAL, "invalid #pmu_interrupts property");
>> +	*global_irqs -= *pmu_irqs;
>>   
>>   	data = of_device_get_match_data(dev);
>>   	smmu->version = data->version;
>> -- 
>> 2.28.0.dirty
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/6] perf: Arm SMMU PMU driver
  2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
                   ` (5 preceding siblings ...)
  2022-02-17 14:24 ` [PATCH 6/6] arm64: dts: Add SMMU PMUs to Juno Robin Murphy
@ 2022-03-07 22:03 ` Will Deacon
  6 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2022-03-07 22:03 UTC (permalink / raw)
  To: Robin Murphy, mark.rutland
  Cc: catalin.marinas, kernel-team, Will Deacon, jean-philippe,
	lorenzo.pieralisi, noda.akio, sudeep.holla, linux-arm-kernel,
	leo.yan

On Thu, 17 Feb 2022 14:24:14 +0000, Robin Murphy wrote:
> Since our friends at Linaro had some interest in the area, I've dug up
> and finished off my old SMMUv1/v2 PMU driver, now with a slightly less
> terrible DT binding to boot. Patch #1 involves a general refactoring of
> the IRQ setup code in arm-smmu which also now serves to remove the
> general dependency on explicit IRQ resources for of_platform devices.
> 
> ACPI support via IORT (patch #2) is pleasingly trivial, but does have a
> hard dependency on the previous patch, otherwise SMMU context
> interrupts may get messed up.
> 
> [...]

Applied first patch to will (for-joerg/arm-smmu/updates), thanks!

[1/6] iommu/arm-smmu: Account for PMU interrupts
      https://git.kernel.org/will/c/97dfad194ca8

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-03-07 22:08 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-17 14:24 [PATCH 0/6] perf: Arm SMMU PMU driver Robin Murphy
2022-02-17 14:24 ` [PATCH 1/6] iommu/arm-smmu: Account for PMU interrupts Robin Murphy
2022-02-17 14:24 ` [PATCH 2/6] acpi/iort: Register SMMUv2 " Robin Murphy
2022-02-17 14:24 ` [PATCH 3/6] iommu/arm-smmu: Add DT PMU support Robin Murphy
2022-03-02 15:33   ` Rob Herring
2022-03-02 17:03     ` Robin Murphy
2022-02-17 14:24 ` [PATCH 4/6] iommu/smmu: Create child devices for PMUs Robin Murphy
2022-02-17 14:24 ` [PATCH 5/6] perf: Add ARM SMMU PMU driver Robin Murphy
2022-02-17 14:24 ` [PATCH 6/6] arm64: dts: Add SMMU PMUs to Juno Robin Murphy
2022-03-07 22:03 ` [PATCH 0/6] perf: Arm SMMU PMU driver Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.