linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2
@ 2022-02-24 13:07 Hector Martin
  2022-02-24 13:07 ` [PATCH v2 1/7] PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form Hector Martin
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

Hi folks,

In the t6000/t6001 (M1 Pro / Max) SoCs, Apple introduced a new version
of their interrupt controller. This is a significant departure from
AICv1 and seems designed to better scale to larger chips. This series
adds support for it to the existing AIC driver.

Gone are CPU affinities; instead there seems to be some kind of
"automagic" dispatch to willing CPU cores, and cores can also opt-out
via an IMP-DEF sysreg (!). Right now the bootloader just sets up all
cores to accept IRQs, and we ignore all this and let the magic
algorithm pick a CPU to accept the IRQ. In the future, we might start
making use of these finer-grained capabilities for e.g. better
real-time guarantees (CPUs running RT threads might opt out of IRQs).

Legacy IPI support is also gone, so this implements Fast IPI support.
Fast IPIs are implemented entirely in the CPU core complexes, using
FIQs and IMP-DEF sysregs. This is also supported on t8103/M1, so we
enable it there too, but we keep the legacy AIC IPI codepath in case
it is useful for backporting to older chips.

This also adds support for multi-die AIC2 controllers. While no
multi-die products exist yet, the AIC2 in t600x is built to support
up to 2 dies, and it's pretty clear how it works, so let's implement
it. If we're lucky, when multi-die products roll around, this will
let us support them with only DT changes. In order to support the
extra die dimension, this introduces a 4-argument IRQ phandle form
(3-argument is always supported and just implies die 0).

All register offsets are computed based on capability register values,
which should allow forward-compatibility with future AIC2 variants...
except for one. For some inexplicable reason, the number of actually
implemented die register sets is nowhere to be found (t600x has 2,
but claims 1 die in use and 8 dies max, neither of which is what we
need), and this is necessary to compute the event register offset,
which is page-aligned after the die register sets. We have no choice
but to stick this offset in the device tree... which is the same thing
Apple do in their ADT.

Changes since v1:
- Split off the DT binding
- Changed fast-ipi codepath selection to use a static key for performance
- Added fix for PCI driver to support the new 4-cell IRQ form
- Minor style / review feedback fixes

Hector Martin (7):
  PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form
  dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2
  irqchip/apple-aic: Add Fast IPI support
  irqchip/apple-aic: Switch to irq_domain_create_tree and sparse hwirqs
  irqchip/apple-aic: Dynamically compute register offsets
  irqchip/apple-aic: Support multiple dies
  irqchip/apple-aic: Add support for AICv2

 .../interrupt-controller/apple,aic2.yaml      |  99 ++++
 MAINTAINERS                                   |   2 +-
 drivers/irqchip/irq-apple-aic.c               | 432 +++++++++++++++---
 drivers/pci/controller/pcie-apple.c           |   2 +-
 4 files changed, 458 insertions(+), 77 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml

-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/7] PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
@ 2022-02-24 13:07 ` Hector Martin
  2022-02-24 13:07 ` [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2 Hector Martin
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

AIC2 changes the IRQ fwspec to add a cell. Always use the second-to-last
cell for the MSI handling, so it will work for both AIC1 and AIC2 devices.

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 drivers/pci/controller/pcie-apple.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pcie-apple.c b/drivers/pci/controller/pcie-apple.c
index 854d95163112..a2c3c207a04b 100644
--- a/drivers/pci/controller/pcie-apple.c
+++ b/drivers/pci/controller/pcie-apple.c
@@ -219,7 +219,7 @@ static int apple_msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
 	if (hwirq < 0)
 		return -ENOSPC;
 
-	fwspec.param[1] += hwirq;
+	fwspec.param[fwspec.param_count - 2] += hwirq;
 
 	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, &fwspec);
 	if (ret)
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
  2022-02-24 13:07 ` [PATCH v2 1/7] PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form Hector Martin
@ 2022-02-24 13:07 ` Hector Martin
  2022-02-25 20:19   ` Rob Herring
  2022-02-24 13:07 ` [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support Hector Martin
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

This new incompatible revision of the AIC peripheral introduces
multi-die support. This binding is based on apple,aic, but
changes interrupt-cells to add a new die argument.

Also adds an apple,event-reg property to specify the offset of the event
register. Inexplicably, the capability registers allow us to compute
other register offsets, but not this one. This allows us to keep
forward-compatibility with future SoCs that will likely implement
different die counts, thus shifting the event register. Apple do the
same thing in their device tree...

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 .../interrupt-controller/apple,aic2.yaml      | 99 +++++++++++++++++++
 MAINTAINERS                                   |  2 +-
 2 files changed, 100 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml b/Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml
new file mode 100644
index 000000000000..311900613b9e
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml
@@ -0,0 +1,99 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/apple,aic2.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple Interrupt Controller 2
+
+maintainers:
+  - Hector Martin <marcan@marcan.st>
+
+description: |
+  The Apple Interrupt Controller 2 is a simple interrupt controller present on
+  Apple ARM SoC platforms starting with t600x (M1 Pro and Max).
+
+  It provides the following features:
+
+  - Level-triggered hardware IRQs wired to SoC blocks
+    - Single mask bit per IRQ
+    - Automatic masking on event delivery (auto-ack)
+    - Software triggering (ORed with hw line)
+  - Automatic prioritization (single event/ack register per CPU, lower IRQs =
+    higher priority)
+  - Automatic masking on ack
+  - Support for multiple dies
+
+  This device also represents the FIQ interrupt sources on platforms using AIC,
+  which do not go through a discrete interrupt controller. It also handles
+  FIQ-based Fast IPIs.
+
+properties:
+  compatible:
+    items:
+      - const: apple,t6000-aic
+      - const: apple,aic2
+
+  interrupt-controller: true
+
+  '#interrupt-cells':
+    const: 4
+    description: |
+      The 1st cell contains the interrupt type:
+        - 0: Hardware IRQ
+        - 1: FIQ
+
+      The 2nd cell contains the die ID.
+
+      The next cell contains the interrupt number.
+        - HW IRQs: interrupt number
+        - FIQs:
+          - 0: physical HV timer
+          - 1: virtual HV timer
+          - 2: physical guest timer
+          - 3: virtual guest timer
+
+      The last cell contains the interrupt flags. This is normally
+      IRQ_TYPE_LEVEL_HIGH (4).
+
+  reg:
+    description: |
+      Specifies base physical address and size of the AIC registers.
+    maxItems: 1
+
+  power-domains:
+    maxItems: 1
+
+  apple,event-reg:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description:
+      Specifies the offset of the event register, which lies after all the
+      implemented die register sets, page aligned. This is not computable from
+      capability register values, so we have to specify it explicitly.
+
+required:
+  - compatible
+  - '#interrupt-cells'
+  - interrupt-controller
+  - reg
+  - apple,event-reg
+
+additionalProperties: false
+
+allOf:
+  - $ref: /schemas/interrupt-controller.yaml#
+
+examples:
+  - |
+    soc {
+        #address-cells = <2>;
+        #size-cells = <2>;
+
+        aic: interrupt-controller@28e100000 {
+            compatible = "apple,t6000-aic", "apple,aic2";
+            #interrupt-cells = <4>;
+            interrupt-controller;
+            reg = <0x2 0x8e100000 0x0 0x10000>;
+            apple,event-reg = <0xc000>;
+        };
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index aa0f6cbb634e..ad887ec7e96b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1806,7 +1806,7 @@ T:	git https://github.com/AsahiLinux/linux.git
 F:	Documentation/devicetree/bindings/arm/apple.yaml
 F:	Documentation/devicetree/bindings/arm/apple/*
 F:	Documentation/devicetree/bindings/i2c/apple,i2c.yaml
-F:	Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
+F:	Documentation/devicetree/bindings/interrupt-controller/apple,*
 F:	Documentation/devicetree/bindings/mailbox/apple,mailbox.yaml
 F:	Documentation/devicetree/bindings/pci/apple,pcie.yaml
 F:	Documentation/devicetree/bindings/pinctrl/apple,pinctrl.yaml
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
  2022-02-24 13:07 ` [PATCH v2 1/7] PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form Hector Martin
  2022-02-24 13:07 ` [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2 Hector Martin
@ 2022-02-24 13:07 ` Hector Martin
  2022-02-25 14:39   ` Marc Zyngier
  2022-02-24 13:07 ` [PATCH v2 4/7] irqchip/apple-aic: Switch to irq_domain_create_tree and sparse hwirqs Hector Martin
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

The newer AICv2 present in t600x SoCs does not have legacy IPI support
at all. Since t8103 also supports Fast IPIs, implement support for this
first. The legacy IPI code is left as a fallback, so it can be
potentially used by older SoCs in the future.

The vIPI code is shared; only the IPI firing/acking bits change for Fast
IPIs.

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 drivers/irqchip/irq-apple-aic.c | 122 ++++++++++++++++++++++++++++----
 1 file changed, 109 insertions(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 38091ebb9403..613e0ebdabdc 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -24,7 +24,7 @@
  * - Default "this CPU" register view and explicit per-CPU views
  *
  * In addition, this driver also handles FIQs, as these are routed to the same
- * IRQ vector. These are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and
+ * IRQ vector. These are used for Fast IPIs, the ARMv8 timer IRQs, and
  * performance counters (TODO).
  *
  * Implementation notes:
@@ -52,9 +52,11 @@
 #include <linux/irqchip.h>
 #include <linux/irqchip/arm-vgic-info.h>
 #include <linux/irqdomain.h>
+#include <linux/jump_label.h>
 #include <linux/limits.h>
 #include <linux/of_address.h>
 #include <linux/slab.h>
+#include <asm/cputype.h>
 #include <asm/exception.h>
 #include <asm/sysreg.h>
 #include <asm/virt.h>
@@ -106,7 +108,6 @@
 
 /*
  * IMP-DEF sysregs that control FIQ sources
- * Note: sysreg-based IPIs are not supported yet.
  */
 
 /* Core PMC control register */
@@ -155,6 +156,10 @@
 #define SYS_IMP_APL_UPMSR_EL1		sys_reg(3, 7, 15, 6, 4)
 #define UPMSR_IACT			BIT(0)
 
+/* MPIDR fields */
+#define MPIDR_CPU(x)			MPIDR_AFFINITY_LEVEL(x, 0)
+#define MPIDR_CLUSTER(x)		MPIDR_AFFINITY_LEVEL(x, 1)
+
 #define AIC_NR_FIQ		4
 #define AIC_NR_SWIPI		32
 
@@ -173,11 +178,44 @@
 #define AIC_TMR_EL02_PHYS	AIC_TMR_GUEST_PHYS
 #define AIC_TMR_EL02_VIRT	AIC_TMR_GUEST_VIRT
 
+DEFINE_STATIC_KEY_TRUE(use_fast_ipi);
+
+struct aic_info {
+	int version;
+
+	/* Features */
+	bool fast_ipi;
+};
+
+static const struct aic_info aic1_info = {
+	.version	= 1,
+};
+
+static const struct aic_info aic1_fipi_info = {
+	.version	= 1,
+
+	.fast_ipi	= true,
+};
+
+static const struct of_device_id aic_info_match[] = {
+	{
+		.compatible = "apple,t8103-aic",
+		.data = &aic1_fipi_info,
+	},
+	{
+		.compatible = "apple,aic",
+		.data = &aic1_info,
+	},
+	{}
+};
+
 struct aic_irq_chip {
 	void __iomem *base;
 	struct irq_domain *hw_domain;
 	struct irq_domain *ipi_domain;
 	int nr_hw;
+
+	struct aic_info info;
 };
 
 static DEFINE_PER_CPU(uint32_t, aic_fiq_unmasked);
@@ -386,8 +424,12 @@ static void __exception_irq_entry aic_handle_fiq(struct pt_regs *regs)
 	 */
 
 	if (read_sysreg_s(SYS_IMP_APL_IPI_SR_EL1) & IPI_SR_PENDING) {
-		pr_err_ratelimited("Fast IPI fired. Acking.\n");
-		write_sysreg_s(IPI_SR_PENDING, SYS_IMP_APL_IPI_SR_EL1);
+		if (static_branch_likely(&use_fast_ipi)) {
+			aic_handle_ipi(regs);
+		} else {
+			pr_err_ratelimited("Fast IPI fired. Acking.\n");
+			write_sysreg_s(IPI_SR_PENDING, SYS_IMP_APL_IPI_SR_EL1);
+		}
 	}
 
 	if (TIMER_FIRING(read_sysreg(cntp_ctl_el0)))
@@ -563,6 +605,22 @@ static const struct irq_domain_ops aic_irq_domain_ops = {
  * IPI irqchip
  */
 
+static void aic_ipi_send_fast(int cpu)
+{
+	u64 mpidr = cpu_logical_map(cpu);
+	u64 my_mpidr = read_cpuid_mpidr();
+	u64 cluster = MPIDR_CLUSTER(mpidr);
+	u64 idx = MPIDR_CPU(mpidr);
+
+	if (MPIDR_CLUSTER(my_mpidr) == cluster)
+		write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx),
+			       SYS_IMP_APL_IPI_RR_LOCAL_EL1);
+	else
+		write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx) | FIELD_PREP(IPI_RR_CLUSTER, cluster),
+			       SYS_IMP_APL_IPI_RR_GLOBAL_EL1);
+	isb();
+}
+
 static void aic_ipi_mask(struct irq_data *d)
 {
 	u32 irq_bit = BIT(irqd_to_hwirq(d));
@@ -588,8 +646,12 @@ static void aic_ipi_unmask(struct irq_data *d)
 	 * If a pending vIPI was unmasked, raise a HW IPI to ourselves.
 	 * No barriers needed here since this is a self-IPI.
 	 */
-	if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit)
-		aic_ic_write(ic, AIC_IPI_SEND, AIC_IPI_SEND_CPU(smp_processor_id()));
+	if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit) {
+		if (static_branch_likely(&use_fast_ipi))
+			aic_ipi_send_fast(smp_processor_id());
+		else
+			aic_ic_write(ic, AIC_IPI_SEND, AIC_IPI_SEND_CPU(smp_processor_id()));
+	}
 }
 
 static void aic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
@@ -617,8 +679,12 @@ static void aic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 		smp_mb__after_atomic();
 
 		if (!(pending & irq_bit) &&
-		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit))
-			send |= AIC_IPI_SEND_CPU(cpu);
+		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit)) {
+			if (static_branch_likely(&use_fast_ipi))
+				aic_ipi_send_fast(cpu);
+			else
+				send |= AIC_IPI_SEND_CPU(cpu);
+		}
 	}
 
 	/*
@@ -650,8 +716,16 @@ static void aic_handle_ipi(struct pt_regs *regs)
 	/*
 	 * Ack the IPI. We need to order this after the AIC event read, but
 	 * that is enforced by normal MMIO ordering guarantees.
+	 *
+	 * For the Fast IPI case, this needs to be ordered before the vIPI
+	 * handling below, so we need to isb();
 	 */
-	aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_OTHER);
+	if (static_branch_likely(&use_fast_ipi)) {
+		write_sysreg_s(IPI_SR_PENDING, SYS_IMP_APL_IPI_SR_EL1);
+		isb();
+	} else {
+		aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_OTHER);
+	}
 
 	/*
 	 * The mask read does not need to be ordered. Only we can change
@@ -679,7 +753,8 @@ static void aic_handle_ipi(struct pt_regs *regs)
 	 * No ordering needed here; at worst this just changes the timing of
 	 * when the next IPI will be delivered.
 	 */
-	aic_ic_write(aic_irqc, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);
+	if (!static_branch_likely(&use_fast_ipi))
+		aic_ic_write(aic_irqc, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);
 }
 
 static int aic_ipi_alloc(struct irq_domain *d, unsigned int virq,
@@ -776,10 +851,15 @@ static int aic_init_cpu(unsigned int cpu)
 	/*
 	 * Always keep IPIs unmasked at the hardware level (except auto-masking
 	 * by AIC during processing). We manage masks at the vIPI level.
+	 * These registers only exist on AICv1, AICv2 always uses fast IPIs.
 	 */
 	aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_SELF | AIC_IPI_OTHER);
-	aic_ic_write(aic_irqc, AIC_IPI_MASK_SET, AIC_IPI_SELF);
-	aic_ic_write(aic_irqc, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);
+	if (static_branch_likely(&use_fast_ipi)) {
+		aic_ic_write(aic_irqc, AIC_IPI_MASK_SET, AIC_IPI_SELF | AIC_IPI_OTHER);
+	} else {
+		aic_ic_write(aic_irqc, AIC_IPI_MASK_SET, AIC_IPI_SELF);
+		aic_ic_write(aic_irqc, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);
+	}
 
 	/* Initialize the local mask state */
 	__this_cpu_write(aic_fiq_unmasked, 0);
@@ -799,6 +879,7 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	void __iomem *regs;
 	u32 info;
 	struct aic_irq_chip *irqc;
+	const struct of_device_id *match;
 
 	regs = of_iomap(node, 0);
 	if (WARN_ON(!regs))
@@ -808,12 +889,24 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	if (!irqc)
 		return -ENOMEM;
 
-	aic_irqc = irqc;
 	irqc->base = regs;
 
+	match = of_match_node(aic_info_match, node);
+	if (!match)
+		return -ENODEV;
+
+	irqc->info = *(struct aic_info *)match->data;
+
+	aic_irqc = irqc;
+
 	info = aic_ic_read(irqc, AIC_INFO);
 	irqc->nr_hw = FIELD_GET(AIC_INFO_NR_HW, info);
 
+	if (irqc->info.fast_ipi)
+		static_branch_enable(&use_fast_ipi);
+	else
+		static_branch_disable(&use_fast_ipi);
+
 	irqc->hw_domain = irq_domain_create_linear(of_node_to_fwnode(node),
 						   irqc->nr_hw + AIC_NR_FIQ,
 						   &aic_irq_domain_ops, irqc);
@@ -845,6 +938,9 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	if (!is_kernel_in_hyp_mode())
 		pr_info("Kernel running in EL1, mapping interrupts");
 
+	if (static_branch_likely(&use_fast_ipi))
+		pr_info("Using Fast IPIs");
+
 	cpuhp_setup_state(CPUHP_AP_IRQ_APPLE_AIC_STARTING,
 			  "irqchip/apple-aic/ipi:starting",
 			  aic_init_cpu, NULL);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/7] irqchip/apple-aic: Switch to irq_domain_create_tree and sparse hwirqs
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
                   ` (2 preceding siblings ...)
  2022-02-24 13:07 ` [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support Hector Martin
@ 2022-02-24 13:07 ` Hector Martin
  2022-02-24 13:07 ` [PATCH v2 5/7] irqchip/apple-aic: Dynamically compute register offsets Hector Martin
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

This allows us to directly use the hardware event number as the hwirq
number. Since IRQ events have bit 16 set (type=1), FIQs now move to
starting at hwirq number 0.

This will become more important once multi-die support is introduced in
a later commit.

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 drivers/irqchip/irq-apple-aic.c | 71 ++++++++++++++++++---------------
 1 file changed, 39 insertions(+), 32 deletions(-)

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 613e0ebdabdc..96480389195d 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -68,7 +68,7 @@
  */
 
 #define AIC_INFO		0x0004
-#define AIC_INFO_NR_HW		GENMASK(15, 0)
+#define AIC_INFO_NR_IRQ		GENMASK(15, 0)
 
 #define AIC_CONFIG		0x0010
 
@@ -77,7 +77,8 @@
 #define AIC_EVENT_TYPE		GENMASK(31, 16)
 #define AIC_EVENT_NUM		GENMASK(15, 0)
 
-#define AIC_EVENT_TYPE_HW	1
+#define AIC_EVENT_TYPE_FIQ	0 /* Software use */
+#define AIC_EVENT_TYPE_IRQ	1
 #define AIC_EVENT_TYPE_IPI	4
 #define AIC_EVENT_IPI_OTHER	1
 #define AIC_EVENT_IPI_SELF	2
@@ -160,6 +161,11 @@
 #define MPIDR_CPU(x)			MPIDR_AFFINITY_LEVEL(x, 0)
 #define MPIDR_CLUSTER(x)		MPIDR_AFFINITY_LEVEL(x, 1)
 
+#define AIC_IRQ_HWIRQ(x)	(FIELD_PREP(AIC_EVENT_TYPE, AIC_EVENT_TYPE_IRQ) | \
+				 FIELD_PREP(AIC_EVENT_NUM, x))
+#define AIC_FIQ_HWIRQ(x)	(FIELD_PREP(AIC_EVENT_TYPE, AIC_EVENT_TYPE_FIQ) | \
+				 FIELD_PREP(AIC_EVENT_NUM, x))
+#define AIC_HWIRQ_IRQ(x)	FIELD_GET(AIC_EVENT_NUM, x)
 #define AIC_NR_FIQ		4
 #define AIC_NR_SWIPI		32
 
@@ -213,7 +219,7 @@ struct aic_irq_chip {
 	void __iomem *base;
 	struct irq_domain *hw_domain;
 	struct irq_domain *ipi_domain;
-	int nr_hw;
+	int nr_irq;
 
 	struct aic_info info;
 };
@@ -243,18 +249,22 @@ static void aic_ic_write(struct aic_irq_chip *ic, u32 reg, u32 val)
 
 static void aic_irq_mask(struct irq_data *d)
 {
+	irq_hw_number_t hwirq = irqd_to_hwirq(d);
 	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
 
-	aic_ic_write(ic, AIC_MASK_SET + MASK_REG(irqd_to_hwirq(d)),
-		     MASK_BIT(irqd_to_hwirq(d)));
+	u32 irq = AIC_HWIRQ_IRQ(hwirq);
+
+	aic_ic_write(ic, AIC_MASK_SET + MASK_REG(irq), MASK_BIT(irq));
 }
 
 static void aic_irq_unmask(struct irq_data *d)
 {
+	irq_hw_number_t hwirq = irqd_to_hwirq(d);
 	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
 
-	aic_ic_write(ic, AIC_MASK_CLR + MASK_REG(d->hwirq),
-		     MASK_BIT(irqd_to_hwirq(d)));
+	u32 irq = AIC_HWIRQ_IRQ(hwirq);
+
+	aic_ic_write(ic, AIC_MASK_CLR + MASK_REG(irq), MASK_BIT(irq));
 }
 
 static void aic_irq_eoi(struct irq_data *d)
@@ -281,8 +291,8 @@ static void __exception_irq_entry aic_handle_irq(struct pt_regs *regs)
 		type = FIELD_GET(AIC_EVENT_TYPE, event);
 		irq = FIELD_GET(AIC_EVENT_NUM, event);
 
-		if (type == AIC_EVENT_TYPE_HW)
-			generic_handle_domain_irq(aic_irqc->hw_domain, irq);
+		if (type == AIC_EVENT_TYPE_IRQ)
+			generic_handle_domain_irq(aic_irqc->hw_domain, event);
 		else if (type == AIC_EVENT_TYPE_IPI && irq == 1)
 			aic_handle_ipi(regs);
 		else if (event != 0)
@@ -314,7 +324,7 @@ static int aic_irq_set_affinity(struct irq_data *d,
 	else
 		cpu = cpumask_any_and(mask_val, cpu_online_mask);
 
-	aic_ic_write(ic, AIC_TARGET_CPU + hwirq * 4, BIT(cpu));
+	aic_ic_write(ic, AIC_TARGET_CPU + AIC_HWIRQ_IRQ(hwirq) * 4, BIT(cpu));
 	irq_data_update_effective_affinity(d, cpumask_of(cpu));
 
 	return IRQ_SET_MASK_OK;
@@ -344,9 +354,7 @@ static struct irq_chip aic_chip = {
 
 static unsigned long aic_fiq_get_idx(struct irq_data *d)
 {
-	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
-
-	return irqd_to_hwirq(d) - ic->nr_hw;
+	return AIC_HWIRQ_IRQ(irqd_to_hwirq(d));
 }
 
 static void aic_fiq_set_mask(struct irq_data *d)
@@ -434,11 +442,11 @@ static void __exception_irq_entry aic_handle_fiq(struct pt_regs *regs)
 
 	if (TIMER_FIRING(read_sysreg(cntp_ctl_el0)))
 		generic_handle_domain_irq(aic_irqc->hw_domain,
-					  aic_irqc->nr_hw + AIC_TMR_EL0_PHYS);
+					  AIC_FIQ_HWIRQ(AIC_TMR_EL0_PHYS));
 
 	if (TIMER_FIRING(read_sysreg(cntv_ctl_el0)))
 		generic_handle_domain_irq(aic_irqc->hw_domain,
-					  aic_irqc->nr_hw + AIC_TMR_EL0_VIRT);
+					  AIC_FIQ_HWIRQ(AIC_TMR_EL0_VIRT));
 
 	if (is_kernel_in_hyp_mode()) {
 		uint64_t enabled = read_sysreg_s(SYS_IMP_APL_VM_TMR_FIQ_ENA_EL2);
@@ -446,12 +454,12 @@ static void __exception_irq_entry aic_handle_fiq(struct pt_regs *regs)
 		if ((enabled & VM_TMR_FIQ_ENABLE_P) &&
 		    TIMER_FIRING(read_sysreg_s(SYS_CNTP_CTL_EL02)))
 			generic_handle_domain_irq(aic_irqc->hw_domain,
-						  aic_irqc->nr_hw + AIC_TMR_EL02_PHYS);
+						  AIC_FIQ_HWIRQ(AIC_TMR_EL02_PHYS));
 
 		if ((enabled & VM_TMR_FIQ_ENABLE_V) &&
 		    TIMER_FIRING(read_sysreg_s(SYS_CNTV_CTL_EL02)))
 			generic_handle_domain_irq(aic_irqc->hw_domain,
-						  aic_irqc->nr_hw + AIC_TMR_EL02_VIRT);
+						  AIC_FIQ_HWIRQ(AIC_TMR_EL02_VIRT));
 	}
 
 	if ((read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & (PMCR0_IMODE | PMCR0_IACT)) ==
@@ -496,9 +504,9 @@ static struct irq_chip fiq_chip = {
 static int aic_irq_domain_map(struct irq_domain *id, unsigned int irq,
 			      irq_hw_number_t hw)
 {
-	struct aic_irq_chip *ic = id->host_data;
+	u32 type = FIELD_GET(AIC_EVENT_TYPE, hw);
 
-	if (hw < ic->nr_hw) {
+	if (type == AIC_EVENT_TYPE_IRQ) {
 		irq_domain_set_info(id, irq, hw, &aic_chip, id->host_data,
 				    handle_fasteoi_irq, NULL, NULL);
 		irqd_set_single_target(irq_desc_get_irq_data(irq_to_desc(irq)));
@@ -523,14 +531,14 @@ static int aic_irq_domain_translate(struct irq_domain *id,
 
 	switch (fwspec->param[0]) {
 	case AIC_IRQ:
-		if (fwspec->param[1] >= ic->nr_hw)
+		if (fwspec->param[1] >= ic->nr_irq)
 			return -EINVAL;
-		*hwirq = fwspec->param[1];
+		*hwirq = AIC_IRQ_HWIRQ(fwspec->param[1]);
 		break;
 	case AIC_FIQ:
 		if (fwspec->param[1] >= AIC_NR_FIQ)
 			return -EINVAL;
-		*hwirq = ic->nr_hw + fwspec->param[1];
+		*hwirq = AIC_FIQ_HWIRQ(fwspec->param[1]);
 
 		/*
 		 * In EL1 the non-redirected registers are the guest's,
@@ -539,10 +547,10 @@ static int aic_irq_domain_translate(struct irq_domain *id,
 		if (!is_kernel_in_hyp_mode()) {
 			switch (fwspec->param[1]) {
 			case AIC_TMR_GUEST_PHYS:
-				*hwirq = ic->nr_hw + AIC_TMR_EL0_PHYS;
+				*hwirq = AIC_FIQ_HWIRQ(AIC_TMR_EL0_PHYS);
 				break;
 			case AIC_TMR_GUEST_VIRT:
-				*hwirq = ic->nr_hw + AIC_TMR_EL0_VIRT;
+				*hwirq = AIC_FIQ_HWIRQ(AIC_TMR_EL0_VIRT);
 				break;
 			case AIC_TMR_HV_PHYS:
 			case AIC_TMR_HV_VIRT:
@@ -900,16 +908,15 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	aic_irqc = irqc;
 
 	info = aic_ic_read(irqc, AIC_INFO);
-	irqc->nr_hw = FIELD_GET(AIC_INFO_NR_HW, info);
+	irqc->nr_irq = FIELD_GET(AIC_INFO_NR_IRQ, info);
 
 	if (irqc->info.fast_ipi)
 		static_branch_enable(&use_fast_ipi);
 	else
 		static_branch_disable(&use_fast_ipi);
 
-	irqc->hw_domain = irq_domain_create_linear(of_node_to_fwnode(node),
-						   irqc->nr_hw + AIC_NR_FIQ,
-						   &aic_irq_domain_ops, irqc);
+	irqc->hw_domain = irq_domain_create_tree(of_node_to_fwnode(node),
+						 &aic_irq_domain_ops, irqc);
 	if (WARN_ON(!irqc->hw_domain)) {
 		iounmap(irqc->base);
 		kfree(irqc);
@@ -928,11 +935,11 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	set_handle_irq(aic_handle_irq);
 	set_handle_fiq(aic_handle_fiq);
 
-	for (i = 0; i < BITS_TO_U32(irqc->nr_hw); i++)
+	for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
 		aic_ic_write(irqc, AIC_MASK_SET + i * 4, U32_MAX);
-	for (i = 0; i < BITS_TO_U32(irqc->nr_hw); i++)
+	for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
 		aic_ic_write(irqc, AIC_SW_CLR + i * 4, U32_MAX);
-	for (i = 0; i < irqc->nr_hw; i++)
+	for (i = 0; i < irqc->nr_irq; i++)
 		aic_ic_write(irqc, AIC_TARGET_CPU + i * 4, 1);
 
 	if (!is_kernel_in_hyp_mode())
@@ -948,7 +955,7 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	vgic_set_kvm_info(&vgic_info);
 
 	pr_info("Initialized with %d IRQs, %d FIQs, %d vIPIs\n",
-		irqc->nr_hw, AIC_NR_FIQ, AIC_NR_SWIPI);
+		irqc->nr_irq, AIC_NR_FIQ, AIC_NR_SWIPI);
 
 	return 0;
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/7] irqchip/apple-aic: Dynamically compute register offsets
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
                   ` (3 preceding siblings ...)
  2022-02-24 13:07 ` [PATCH v2 4/7] irqchip/apple-aic: Switch to irq_domain_create_tree and sparse hwirqs Hector Martin
@ 2022-02-24 13:07 ` Hector Martin
  2022-02-24 13:07 ` [PATCH v2 6/7] irqchip/apple-aic: Support multiple dies Hector Martin
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

This allows us to support AIC variants with different numbers of IRQs
based on capability registers.

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 drivers/irqchip/irq-apple-aic.c | 72 +++++++++++++++++++++++++--------
 1 file changed, 55 insertions(+), 17 deletions(-)

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 96480389195d..4b1ba732f476 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -64,7 +64,7 @@
 #include <dt-bindings/interrupt-controller/apple-aic.h>
 
 /*
- * AIC registers (MMIO)
+ * AIC v1 registers (MMIO)
  */
 
 #define AIC_INFO		0x0004
@@ -94,16 +94,14 @@
 #define AIC_IPI_SELF		BIT(31)
 
 #define AIC_TARGET_CPU		0x3000
-#define AIC_SW_SET		0x4000
-#define AIC_SW_CLR		0x4080
-#define AIC_MASK_SET		0x4100
-#define AIC_MASK_CLR		0x4180
 
 #define AIC_CPU_IPI_SET(cpu)	(0x5008 + ((cpu) << 7))
 #define AIC_CPU_IPI_CLR(cpu)	(0x500c + ((cpu) << 7))
 #define AIC_CPU_IPI_MASK_SET(cpu) (0x5024 + ((cpu) << 7))
 #define AIC_CPU_IPI_MASK_CLR(cpu) (0x5028 + ((cpu) << 7))
 
+#define AIC_MAX_IRQ		0x400
+
 #define MASK_REG(x)		(4 * ((x) >> 5))
 #define MASK_BIT(x)		BIT((x) & GENMASK(4, 0))
 
@@ -189,17 +187,31 @@ DEFINE_STATIC_KEY_TRUE(use_fast_ipi);
 struct aic_info {
 	int version;
 
+	/* Register offsets */
+	u32 event;
+	u32 target_cpu;
+	u32 sw_set;
+	u32 sw_clr;
+	u32 mask_set;
+	u32 mask_clr;
+
 	/* Features */
 	bool fast_ipi;
 };
 
 static const struct aic_info aic1_info = {
 	.version	= 1,
+
+	.event		= AIC_EVENT,
+	.target_cpu	= AIC_TARGET_CPU,
 };
 
 static const struct aic_info aic1_fipi_info = {
 	.version	= 1,
 
+	.event		= AIC_EVENT,
+	.target_cpu	= AIC_TARGET_CPU,
+
 	.fast_ipi	= true,
 };
 
@@ -219,7 +231,9 @@ struct aic_irq_chip {
 	void __iomem *base;
 	struct irq_domain *hw_domain;
 	struct irq_domain *ipi_domain;
+
 	int nr_irq;
+	int max_irq;
 
 	struct aic_info info;
 };
@@ -254,7 +268,7 @@ static void aic_irq_mask(struct irq_data *d)
 
 	u32 irq = AIC_HWIRQ_IRQ(hwirq);
 
-	aic_ic_write(ic, AIC_MASK_SET + MASK_REG(irq), MASK_BIT(irq));
+	aic_ic_write(ic, ic->info.mask_set + MASK_REG(irq), MASK_BIT(irq));
 }
 
 static void aic_irq_unmask(struct irq_data *d)
@@ -264,7 +278,7 @@ static void aic_irq_unmask(struct irq_data *d)
 
 	u32 irq = AIC_HWIRQ_IRQ(hwirq);
 
-	aic_ic_write(ic, AIC_MASK_CLR + MASK_REG(irq), MASK_BIT(irq));
+	aic_ic_write(ic, ic->info.mask_clr + MASK_REG(irq), MASK_BIT(irq));
 }
 
 static void aic_irq_eoi(struct irq_data *d)
@@ -287,7 +301,7 @@ static void __exception_irq_entry aic_handle_irq(struct pt_regs *regs)
 		 * We cannot use a relaxed read here, as reads from DMA buffers
 		 * need to be ordered after the IRQ fires.
 		 */
-		event = readl(ic->base + AIC_EVENT);
+		event = readl(ic->base + ic->info.event);
 		type = FIELD_GET(AIC_EVENT_TYPE, event);
 		irq = FIELD_GET(AIC_EVENT_NUM, event);
 
@@ -319,12 +333,14 @@ static int aic_irq_set_affinity(struct irq_data *d,
 	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
 	int cpu;
 
+	BUG_ON(!ic->info.target_cpu);
+
 	if (force)
 		cpu = cpumask_first(mask_val);
 	else
 		cpu = cpumask_any_and(mask_val, cpu_online_mask);
 
-	aic_ic_write(ic, AIC_TARGET_CPU + AIC_HWIRQ_IRQ(hwirq) * 4, BIT(cpu));
+	aic_ic_write(ic, ic->info.target_cpu + AIC_HWIRQ_IRQ(hwirq) * 4, BIT(cpu));
 	irq_data_update_effective_affinity(d, cpumask_of(cpu));
 
 	return IRQ_SET_MASK_OK;
@@ -884,8 +900,8 @@ static struct gic_kvm_info vgic_info __initdata = {
 static int __init aic_of_ic_init(struct device_node *node, struct device_node *parent)
 {
 	int i;
+	u32 off;
 	void __iomem *regs;
-	u32 info;
 	struct aic_irq_chip *irqc;
 	const struct of_device_id *match;
 
@@ -907,8 +923,30 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 
 	aic_irqc = irqc;
 
-	info = aic_ic_read(irqc, AIC_INFO);
-	irqc->nr_irq = FIELD_GET(AIC_INFO_NR_IRQ, info);
+	switch (irqc->info.version) {
+	case 1: {
+		u32 info;
+
+		info = aic_ic_read(irqc, AIC_INFO);
+		irqc->nr_irq = FIELD_GET(AIC_INFO_NR_IRQ, info);
+		irqc->max_irq = AIC_MAX_IRQ;
+
+		off = irqc->info.target_cpu;
+		off += sizeof(u32) * irqc->max_irq; /* TARGET_CPU */
+
+		break;
+	}
+	}
+
+	irqc->info.sw_set = off;
+	off += sizeof(u32) * (irqc->max_irq >> 5); /* SW_SET */
+	irqc->info.sw_clr = off;
+	off += sizeof(u32) * (irqc->max_irq >> 5); /* SW_CLR */
+	irqc->info.mask_set = off;
+	off += sizeof(u32) * (irqc->max_irq >> 5); /* MASK_SET */
+	irqc->info.mask_clr = off;
+	off += sizeof(u32) * (irqc->max_irq >> 5); /* MASK_CLR */
+	off += sizeof(u32) * (irqc->max_irq >> 5); /* HW_STATE */
 
 	if (irqc->info.fast_ipi)
 		static_branch_enable(&use_fast_ipi);
@@ -936,11 +974,11 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	set_handle_fiq(aic_handle_fiq);
 
 	for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
-		aic_ic_write(irqc, AIC_MASK_SET + i * 4, U32_MAX);
+		aic_ic_write(irqc, irqc->info.mask_set + i * 4, U32_MAX);
 	for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
-		aic_ic_write(irqc, AIC_SW_CLR + i * 4, U32_MAX);
+		aic_ic_write(irqc, irqc->info.sw_clr + i * 4, U32_MAX);
 	for (i = 0; i < irqc->nr_irq; i++)
-		aic_ic_write(irqc, AIC_TARGET_CPU + i * 4, 1);
+		aic_ic_write(irqc, irqc->info.target_cpu + i * 4, 1);
 
 	if (!is_kernel_in_hyp_mode())
 		pr_info("Kernel running in EL1, mapping interrupts");
@@ -954,8 +992,8 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 
 	vgic_set_kvm_info(&vgic_info);
 
-	pr_info("Initialized with %d IRQs, %d FIQs, %d vIPIs\n",
-		irqc->nr_irq, AIC_NR_FIQ, AIC_NR_SWIPI);
+	pr_info("Initialized with %d/%d IRQs, %d FIQs, %d vIPIs",
+		irqc->nr_irq, irqc->max_irq, AIC_NR_FIQ, AIC_NR_SWIPI);
 
 	return 0;
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 6/7] irqchip/apple-aic: Support multiple dies
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
                   ` (4 preceding siblings ...)
  2022-02-24 13:07 ` [PATCH v2 5/7] irqchip/apple-aic: Dynamically compute register offsets Hector Martin
@ 2022-02-24 13:07 ` Hector Martin
  2022-02-24 13:07 ` [PATCH v2 7/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
  2022-02-24 18:26 ` [PATCH v2 0/7] " Mark Rutland
  7 siblings, 0 replies; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

Multi-die support in AICv2 uses several sets of IRQ registers. Introduce
a die count and compute the register group offset based on the die ID
field of the hwirq number, as reported by the hardware.

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 drivers/irqchip/irq-apple-aic.c | 77 +++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 23 deletions(-)

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 4b1ba732f476..93c622435ba2 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -74,7 +74,8 @@
 
 #define AIC_WHOAMI		0x2000
 #define AIC_EVENT		0x2004
-#define AIC_EVENT_TYPE		GENMASK(31, 16)
+#define AIC_EVENT_DIE		GENMASK(31, 24)
+#define AIC_EVENT_TYPE		GENMASK(23, 16)
 #define AIC_EVENT_NUM		GENMASK(15, 0)
 
 #define AIC_EVENT_TYPE_FIQ	0 /* Software use */
@@ -159,11 +160,13 @@
 #define MPIDR_CPU(x)			MPIDR_AFFINITY_LEVEL(x, 0)
 #define MPIDR_CLUSTER(x)		MPIDR_AFFINITY_LEVEL(x, 1)
 
-#define AIC_IRQ_HWIRQ(x)	(FIELD_PREP(AIC_EVENT_TYPE, AIC_EVENT_TYPE_IRQ) | \
-				 FIELD_PREP(AIC_EVENT_NUM, x))
+#define AIC_IRQ_HWIRQ(die, irq)	(FIELD_PREP(AIC_EVENT_DIE, die) | \
+				 FIELD_PREP(AIC_EVENT_TYPE, AIC_EVENT_TYPE_IRQ) | \
+				 FIELD_PREP(AIC_EVENT_NUM, irq))
 #define AIC_FIQ_HWIRQ(x)	(FIELD_PREP(AIC_EVENT_TYPE, AIC_EVENT_TYPE_FIQ) | \
 				 FIELD_PREP(AIC_EVENT_NUM, x))
 #define AIC_HWIRQ_IRQ(x)	FIELD_GET(AIC_EVENT_NUM, x)
+#define AIC_HWIRQ_DIE(x)	FIELD_GET(AIC_EVENT_DIE, x)
 #define AIC_NR_FIQ		4
 #define AIC_NR_SWIPI		32
 
@@ -195,6 +198,8 @@ struct aic_info {
 	u32 mask_set;
 	u32 mask_clr;
 
+	u32 die_stride;
+
 	/* Features */
 	bool fast_ipi;
 };
@@ -234,6 +239,8 @@ struct aic_irq_chip {
 
 	int nr_irq;
 	int max_irq;
+	int nr_die;
+	int max_die;
 
 	struct aic_info info;
 };
@@ -266,9 +273,10 @@ static void aic_irq_mask(struct irq_data *d)
 	irq_hw_number_t hwirq = irqd_to_hwirq(d);
 	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
 
+	u32 off = AIC_HWIRQ_DIE(hwirq) * ic->info.die_stride;
 	u32 irq = AIC_HWIRQ_IRQ(hwirq);
 
-	aic_ic_write(ic, ic->info.mask_set + MASK_REG(irq), MASK_BIT(irq));
+	aic_ic_write(ic, ic->info.mask_set + off + MASK_REG(irq), MASK_BIT(irq));
 }
 
 static void aic_irq_unmask(struct irq_data *d)
@@ -276,9 +284,10 @@ static void aic_irq_unmask(struct irq_data *d)
 	irq_hw_number_t hwirq = irqd_to_hwirq(d);
 	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
 
+	u32 off = AIC_HWIRQ_DIE(hwirq) * ic->info.die_stride;
 	u32 irq = AIC_HWIRQ_IRQ(hwirq);
 
-	aic_ic_write(ic, ic->info.mask_clr + MASK_REG(irq), MASK_BIT(irq));
+	aic_ic_write(ic, ic->info.mask_clr + off + MASK_REG(irq), MASK_BIT(irq));
 }
 
 static void aic_irq_eoi(struct irq_data *d)
@@ -541,27 +550,41 @@ static int aic_irq_domain_translate(struct irq_domain *id,
 				    unsigned int *type)
 {
 	struct aic_irq_chip *ic = id->host_data;
+	u32 *args;
+	u32 die = 0;
 
-	if (fwspec->param_count != 3 || !is_of_node(fwspec->fwnode))
+	if (fwspec->param_count < 3 || fwspec->param_count > 4 ||
+	    !is_of_node(fwspec->fwnode))
 		return -EINVAL;
 
+	args = &fwspec->param[1];
+
+	if (fwspec->param_count == 4) {
+		die = args[0];
+		args++;
+	}
+
 	switch (fwspec->param[0]) {
 	case AIC_IRQ:
-		if (fwspec->param[1] >= ic->nr_irq)
+		if (die >= ic->nr_die)
 			return -EINVAL;
-		*hwirq = AIC_IRQ_HWIRQ(fwspec->param[1]);
+		if (args[0] >= ic->nr_irq)
+			return -EINVAL;
+		*hwirq = AIC_IRQ_HWIRQ(die, args[0]);
 		break;
 	case AIC_FIQ:
-		if (fwspec->param[1] >= AIC_NR_FIQ)
+		if (die != 0)
+			return -EINVAL;
+		if (args[0] >= AIC_NR_FIQ)
 			return -EINVAL;
-		*hwirq = AIC_FIQ_HWIRQ(fwspec->param[1]);
+		*hwirq = AIC_FIQ_HWIRQ(args[0]);
 
 		/*
 		 * In EL1 the non-redirected registers are the guest's,
 		 * not EL2's, so remap the hwirqs to match.
 		 */
 		if (!is_kernel_in_hyp_mode()) {
-			switch (fwspec->param[1]) {
+			switch (args[0]) {
 			case AIC_TMR_GUEST_PHYS:
 				*hwirq = AIC_FIQ_HWIRQ(AIC_TMR_EL0_PHYS);
 				break;
@@ -580,7 +603,7 @@ static int aic_irq_domain_translate(struct irq_domain *id,
 		return -EINVAL;
 	}
 
-	*type = fwspec->param[2] & IRQ_TYPE_SENSE_MASK;
+	*type = args[1] & IRQ_TYPE_SENSE_MASK;
 
 	return 0;
 }
@@ -899,8 +922,8 @@ static struct gic_kvm_info vgic_info __initdata = {
 
 static int __init aic_of_ic_init(struct device_node *node, struct device_node *parent)
 {
-	int i;
-	u32 off;
+	int i, die;
+	u32 off, start_off;
 	void __iomem *regs;
 	struct aic_irq_chip *irqc;
 	const struct of_device_id *match;
@@ -930,8 +953,9 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 		info = aic_ic_read(irqc, AIC_INFO);
 		irqc->nr_irq = FIELD_GET(AIC_INFO_NR_IRQ, info);
 		irqc->max_irq = AIC_MAX_IRQ;
+		irqc->nr_die = irqc->max_die = 1;
 
-		off = irqc->info.target_cpu;
+		off = start_off = irqc->info.target_cpu;
 		off += sizeof(u32) * irqc->max_irq; /* TARGET_CPU */
 
 		break;
@@ -953,6 +977,8 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	else
 		static_branch_disable(&use_fast_ipi);
 
+	irqc->info.die_stride = off - start_off;
+
 	irqc->hw_domain = irq_domain_create_tree(of_node_to_fwnode(node),
 						 &aic_irq_domain_ops, irqc);
 	if (WARN_ON(!irqc->hw_domain)) {
@@ -973,12 +999,17 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	set_handle_irq(aic_handle_irq);
 	set_handle_fiq(aic_handle_fiq);
 
-	for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
-		aic_ic_write(irqc, irqc->info.mask_set + i * 4, U32_MAX);
-	for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
-		aic_ic_write(irqc, irqc->info.sw_clr + i * 4, U32_MAX);
-	for (i = 0; i < irqc->nr_irq; i++)
-		aic_ic_write(irqc, irqc->info.target_cpu + i * 4, 1);
+	off = 0;
+	for (die = 0; die < irqc->nr_die; die++) {
+		for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
+			aic_ic_write(irqc, irqc->info.mask_set + off + i * 4, U32_MAX);
+		for (i = 0; i < BITS_TO_U32(irqc->nr_irq); i++)
+			aic_ic_write(irqc, irqc->info.sw_clr + off + i * 4, U32_MAX);
+		if (irqc->info.target_cpu)
+			for (i = 0; i < irqc->nr_irq; i++)
+				aic_ic_write(irqc, irqc->info.target_cpu + off + i * 4, 1);
+		off += irqc->info.die_stride;
+	}
 
 	if (!is_kernel_in_hyp_mode())
 		pr_info("Kernel running in EL1, mapping interrupts");
@@ -992,8 +1023,8 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 
 	vgic_set_kvm_info(&vgic_info);
 
-	pr_info("Initialized with %d/%d IRQs, %d FIQs, %d vIPIs",
-		irqc->nr_irq, irqc->max_irq, AIC_NR_FIQ, AIC_NR_SWIPI);
+	pr_info("Initialized with %d/%d IRQs * %d/%d die(s), %d FIQs, %d vIPIs",
+		irqc->nr_irq, irqc->max_irq, irqc->nr_die, irqc->max_die, AIC_NR_FIQ, AIC_NR_SWIPI);
 
 	return 0;
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 7/7] irqchip/apple-aic: Add support for AICv2
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
                   ` (5 preceding siblings ...)
  2022-02-24 13:07 ` [PATCH v2 6/7] irqchip/apple-aic: Support multiple dies Hector Martin
@ 2022-02-24 13:07 ` Hector Martin
  2022-02-25 15:27   ` Marc Zyngier
  2022-02-24 18:26 ` [PATCH v2 0/7] " Mark Rutland
  7 siblings, 1 reply; 19+ messages in thread
From: Hector Martin @ 2022-02-24 13:07 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier, Rob Herring
  Cc: Hector Martin, Sven Peter, Alyssa Rosenzweig, Mark Kettenis,
	linux-arm-kernel, linux-kernel, devicetree

Introduce support for the new AICv2 hardware block in t6000/t6001 SoCs.

It seems these blocks are missing the information required to compute
the event register offset in the capability registers, so we specify
that in the DT.

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 drivers/irqchip/irq-apple-aic.c | 148 ++++++++++++++++++++++++++++----
 1 file changed, 129 insertions(+), 19 deletions(-)

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 93c622435ba2..602c8b274170 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -103,6 +103,57 @@
 
 #define AIC_MAX_IRQ		0x400
 
+/*
+ * AIC v2 registers (MMIO)
+ */
+
+#define AIC2_VERSION		0x0000
+#define AIC2_VERSION_VER	GENMASK(7, 0)
+
+#define AIC2_INFO1		0x0004
+#define AIC2_INFO1_NR_IRQ	GENMASK(15, 0)
+#define AIC2_INFO1_LAST_DIE	GENMASK(27, 24)
+
+#define AIC2_INFO2		0x0008
+
+#define AIC2_INFO3		0x000c
+#define AIC2_INFO3_MAX_IRQ	GENMASK(15, 0)
+#define AIC2_INFO3_MAX_DIE	GENMASK(27, 24)
+
+#define AIC2_RESET		0x0010
+#define AIC2_RESET_RESET	BIT(0)
+
+#define AIC2_CONFIG		0x0014
+#define AIC2_CONFIG_ENABLE	BIT(0)
+#define AIC2_CONFIG_PREFER_PCPU	BIT(28)
+
+#define AIC2_TIMEOUT		0x0028
+#define AIC2_CLUSTER_PRIO	0x0030
+#define AIC2_DELAY_GROUPS	0x0100
+
+#define AIC2_IRQ_CFG		0x2000
+
+/*
+ * AIC2 registers are laid out like this, starting at AIC2_IRQ_CFG:
+ *
+ * Repeat for each die:
+ *   IRQ_CFG: u32 * MAX_IRQS
+ *   SW_SET: u32 * (MAX_IRQS / 32)
+ *   SW_CLR: u32 * (MAX_IRQS / 32)
+ *   MASK_SET: u32 * (MAX_IRQS / 32)
+ *   MASK_CLR: u32 * (MAX_IRQS / 32)
+ *   HW_STATE: u32 * (MAX_IRQS / 32)
+ *
+ * This is followed by a set of event registers, each 16K page aligned.
+ * The first one is the AP event register we will use. Unfortunately,
+ * the actual implemented die count is not specified anywhere in the
+ * capability registers, so we have to explicitly specify the event
+ * register offset in the device tree to remain forward-compatible.
+ */
+
+#define AIC2_IRQ_CFG_TARGET	GENMASK(3, 0)
+#define AIC2_IRQ_CFG_DELAY_IDX	GENMASK(7, 5)
+
 #define MASK_REG(x)		(4 * ((x) >> 5))
 #define MASK_BIT(x)		BIT((x) & GENMASK(4, 0))
 
@@ -193,6 +244,7 @@ struct aic_info {
 	/* Register offsets */
 	u32 event;
 	u32 target_cpu;
+	u32 irq_cfg;
 	u32 sw_set;
 	u32 sw_clr;
 	u32 mask_set;
@@ -220,6 +272,14 @@ static const struct aic_info aic1_fipi_info = {
 	.fast_ipi	= true,
 };
 
+static const struct aic_info aic2_info = {
+	.version	= 2,
+
+	.irq_cfg	= AIC2_IRQ_CFG,
+
+	.fast_ipi	= true,
+};
+
 static const struct of_device_id aic_info_match[] = {
 	{
 		.compatible = "apple,t8103-aic",
@@ -229,6 +289,10 @@ static const struct of_device_id aic_info_match[] = {
 		.compatible = "apple,aic",
 		.data = &aic1_info,
 	},
+	{
+		.compatible = "apple,aic2",
+		.data = &aic2_info,
+	},
 	{}
 };
 
@@ -373,6 +437,14 @@ static struct irq_chip aic_chip = {
 	.irq_set_type = aic_irq_set_type,
 };
 
+static struct irq_chip aic2_chip = {
+	.name = "AIC2",
+	.irq_mask = aic_irq_mask,
+	.irq_unmask = aic_irq_unmask,
+	.irq_eoi = aic_irq_eoi,
+	.irq_set_type = aic_irq_set_type,
+};
+
 /*
  * FIQ irqchip
  */
@@ -529,10 +601,15 @@ static struct irq_chip fiq_chip = {
 static int aic_irq_domain_map(struct irq_domain *id, unsigned int irq,
 			      irq_hw_number_t hw)
 {
+	struct aic_irq_chip *ic = id->host_data;
 	u32 type = FIELD_GET(AIC_EVENT_TYPE, hw);
+	struct irq_chip *chip = &aic_chip;
+
+	if (ic->info.version == 2)
+		chip = &aic2_chip;
 
 	if (type == AIC_EVENT_TYPE_IRQ) {
-		irq_domain_set_info(id, irq, hw, &aic_chip, id->host_data,
+		irq_domain_set_info(id, irq, hw, chip, id->host_data,
 				    handle_fasteoi_irq, NULL, NULL);
 		irqd_set_single_target(irq_desc_get_irq_data(irq_to_desc(irq)));
 	} else {
@@ -888,24 +965,26 @@ static int aic_init_cpu(unsigned int cpu)
 	/* Commit all of the above */
 	isb();
 
-	/*
-	 * Make sure the kernel's idea of logical CPU order is the same as AIC's
-	 * If we ever end up with a mismatch here, we will have to introduce
-	 * a mapping table similar to what other irqchip drivers do.
-	 */
-	WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());
+	if (aic_irqc->info.version == 1) {
+		/*
+		 * Make sure the kernel's idea of logical CPU order is the same as AIC's
+		 * If we ever end up with a mismatch here, we will have to introduce
+		 * a mapping table similar to what other irqchip drivers do.
+		 */
+		WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());
 
-	/*
-	 * Always keep IPIs unmasked at the hardware level (except auto-masking
-	 * by AIC during processing). We manage masks at the vIPI level.
-	 * These registers only exist on AICv1, AICv2 always uses fast IPIs.
-	 */
-	aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_SELF | AIC_IPI_OTHER);
-	if (static_branch_likely(&use_fast_ipi)) {
-		aic_ic_write(aic_irqc, AIC_IPI_MASK_SET, AIC_IPI_SELF | AIC_IPI_OTHER);
-	} else {
-		aic_ic_write(aic_irqc, AIC_IPI_MASK_SET, AIC_IPI_SELF);
-		aic_ic_write(aic_irqc, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);
+		/*
+		 * Always keep IPIs unmasked at the hardware level (except auto-masking
+		 * by AIC during processing). We manage masks at the vIPI level.
+		 * These registers only exist on AICv1, AICv2 always uses fast IPIs.
+		 */
+		aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_SELF | AIC_IPI_OTHER);
+		if (static_branch_likely(&use_fast_ipi)) {
+			aic_ic_write(aic_irqc, AIC_IPI_MASK_SET, AIC_IPI_SELF | AIC_IPI_OTHER);
+		} else {
+			aic_ic_write(aic_irqc, AIC_IPI_MASK_SET, AIC_IPI_SELF);
+			aic_ic_write(aic_irqc, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);
+		}
 	}
 
 	/* Initialize the local mask state */
@@ -960,6 +1039,29 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 
 		break;
 	}
+	case 2: {
+		u32 info1, info3;
+
+		info1 = aic_ic_read(irqc, AIC2_INFO1);
+		info3 = aic_ic_read(irqc, AIC2_INFO3);
+
+		irqc->nr_irq = FIELD_GET(AIC2_INFO1_NR_IRQ, info1);
+		irqc->max_irq = FIELD_GET(AIC2_INFO3_MAX_IRQ, info3);
+		irqc->nr_die = FIELD_GET(AIC2_INFO1_LAST_DIE, info1) + 1;
+		irqc->max_die = FIELD_GET(AIC2_INFO3_MAX_DIE, info3);
+
+		off = start_off = irqc->info.irq_cfg;
+		off += sizeof(u32) * irqc->max_irq; /* IRQ_CFG */
+
+		if (of_property_read_u32(node, "apple,event-reg", &irqc->info.event) < 0) {
+			pr_err("Failed to get apple,event-reg property");
+			iounmap(irqc->base);
+			kfree(irqc);
+			return -ENODEV;
+		}
+
+		break;
+	}
 	}
 
 	irqc->info.sw_set = off;
@@ -1011,6 +1113,13 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 		off += irqc->info.die_stride;
 	}
 
+	if (irqc->info.version == 2) {
+		u32 config = aic_ic_read(irqc, AIC2_CONFIG);
+
+		config |= AIC2_CONFIG_ENABLE;
+		aic_ic_write(irqc, AIC2_CONFIG, config);
+	}
+
 	if (!is_kernel_in_hyp_mode())
 		pr_info("Kernel running in EL1, mapping interrupts");
 
@@ -1029,4 +1138,5 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
 	return 0;
 }
 
-IRQCHIP_DECLARE(apple_m1_aic, "apple,aic", aic_of_ic_init);
+IRQCHIP_DECLARE(apple_aic, "apple,aic", aic_of_ic_init);
+IRQCHIP_DECLARE(apple_aic2, "apple,aic2", aic_of_ic_init);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2
  2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
                   ` (6 preceding siblings ...)
  2022-02-24 13:07 ` [PATCH v2 7/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
@ 2022-02-24 18:26 ` Mark Rutland
  2022-02-24 19:06   ` Marc Zyngier
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2022-02-24 18:26 UTC (permalink / raw)
  To: Hector Martin
  Cc: Thomas Gleixner, Marc Zyngier, Rob Herring, Sven Peter,
	Alyssa Rosenzweig, Mark Kettenis, linux-arm-kernel, linux-kernel,
	devicetree

On Thu, Feb 24, 2022 at 10:07:34PM +0900, Hector Martin wrote:
> Hi folks,

Hi Hector,

> In the t6000/t6001 (M1 Pro / Max) SoCs, Apple introduced a new version
> of their interrupt controller. This is a significant departure from
> AICv1 and seems designed to better scale to larger chips. This series
> adds support for it to the existing AIC driver.
> 
> Gone are CPU affinities; instead there seems to be some kind of
> "automagic" dispatch to willing CPU cores, and cores can also opt-out
> via an IMP-DEF sysreg (!). Right now the bootloader just sets up all
> cores to accept IRQs, and we ignore all this and let the magic
> algorithm pick a CPU to accept the IRQ.

Maybe that's ok for the set of peripherals attached, but in general that
violates existing expectations regarding affinity, and I fear there'll
be some subtle brokenness resulting from this automatic target
selection.

For example, in the perf events subsystem there are PMU drivers (even
those for "uncore" or "system" devices which are shared by many/all
CPUs) which rely on a combination of interrupt affinity and local IRQ
masking (without any other locking) to provide exclusion between a PMU's
IRQ handler and any other management operations for that PMU (which are
all handled from the same CPU).

> In the future, we might start making use of these finer-grained
> capabilities for e.g. better real-time guarantees (CPUs running RT
> threads might opt out of IRQs).

What mechanism does the HW have for affinity selection? The wording
above makes it sound like each CPU has to opt-out rather than having a
central affinity selection. Is there a mechanism to select a single
target?

Thanks,
Mark.

> Legacy IPI support is also gone, so this implements Fast IPI support.
> Fast IPIs are implemented entirely in the CPU core complexes, using
> FIQs and IMP-DEF sysregs. This is also supported on t8103/M1, so we
> enable it there too, but we keep the legacy AIC IPI codepath in case
> it is useful for backporting to older chips.
> 
> This also adds support for multi-die AIC2 controllers. While no
> multi-die products exist yet, the AIC2 in t600x is built to support
> up to 2 dies, and it's pretty clear how it works, so let's implement
> it. If we're lucky, when multi-die products roll around, this will
> let us support them with only DT changes. In order to support the
> extra die dimension, this introduces a 4-argument IRQ phandle form
> (3-argument is always supported and just implies die 0).
> 
> All register offsets are computed based on capability register values,
> which should allow forward-compatibility with future AIC2 variants...
> except for one. For some inexplicable reason, the number of actually
> implemented die register sets is nowhere to be found (t600x has 2,
> but claims 1 die in use and 8 dies max, neither of which is what we
> need), and this is necessary to compute the event register offset,
> which is page-aligned after the die register sets. We have no choice
> but to stick this offset in the device tree... which is the same thing
> Apple do in their ADT.
> 
> Changes since v1:
> - Split off the DT binding
> - Changed fast-ipi codepath selection to use a static key for performance
> - Added fix for PCI driver to support the new 4-cell IRQ form
> - Minor style / review feedback fixes
> 
> Hector Martin (7):
>   PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form
>   dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2
>   irqchip/apple-aic: Add Fast IPI support
>   irqchip/apple-aic: Switch to irq_domain_create_tree and sparse hwirqs
>   irqchip/apple-aic: Dynamically compute register offsets
>   irqchip/apple-aic: Support multiple dies
>   irqchip/apple-aic: Add support for AICv2
> 
>  .../interrupt-controller/apple,aic2.yaml      |  99 ++++
>  MAINTAINERS                                   |   2 +-
>  drivers/irqchip/irq-apple-aic.c               | 432 +++++++++++++++---
>  drivers/pci/controller/pcie-apple.c           |   2 +-
>  4 files changed, 458 insertions(+), 77 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml
> 
> -- 
> 2.33.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2
  2022-02-24 18:26 ` [PATCH v2 0/7] " Mark Rutland
@ 2022-02-24 19:06   ` Marc Zyngier
  2022-02-25  4:27     ` Hector Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2022-02-24 19:06 UTC (permalink / raw)
  To: Hector Martin, Mark Rutland
  Cc: Thomas Gleixner, Rob Herring, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On Thu, 24 Feb 2022 18:26:41 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Thu, Feb 24, 2022 at 10:07:34PM +0900, Hector Martin wrote:
> > Hi folks,
> 
> Hi Hector,
> 
> > In the t6000/t6001 (M1 Pro / Max) SoCs, Apple introduced a new version
> > of their interrupt controller. This is a significant departure from
> > AICv1 and seems designed to better scale to larger chips. This series
> > adds support for it to the existing AIC driver.
> > 
> > Gone are CPU affinities; instead there seems to be some kind of
> > "automagic" dispatch to willing CPU cores, and cores can also opt-out
> > via an IMP-DEF sysreg (!). Right now the bootloader just sets up all
> > cores to accept IRQs, and we ignore all this and let the magic
> > algorithm pick a CPU to accept the IRQ.
> 
> Maybe that's ok for the set of peripherals attached, but in general that
> violates existing expectations regarding affinity, and I fear there'll
> be some subtle brokenness resulting from this automatic target
> selection.
> 
> For example, in the perf events subsystem there are PMU drivers (even
> those for "uncore" or "system" devices which are shared by many/all
> CPUs) which rely on a combination of interrupt affinity and local IRQ
> masking (without any other locking) to provide exclusion between a PMU's
> IRQ handler and any other management operations for that PMU (which are
> all handled from the same CPU).

It will definitely break anything that relies on managed interrupts,
where the kernel expects to allocate interrupts that have a strict
affinity. Drivers using this feature can legitimately expect that they
can keep their state in per-CPU pointers, and that obviously breaks.

This may affect any PCIe device with more than a couple of queues.
Maybe users of this HW do not care (yet), but we'll have to find a way
to tell drivers of the limitation.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2
  2022-02-24 19:06   ` Marc Zyngier
@ 2022-02-25  4:27     ` Hector Martin
  0 siblings, 0 replies; 19+ messages in thread
From: Hector Martin @ 2022-02-25  4:27 UTC (permalink / raw)
  To: Marc Zyngier, Mark Rutland
  Cc: Thomas Gleixner, Rob Herring, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On 25/02/2022 04.06, Marc Zyngier wrote:
> On Thu, 24 Feb 2022 18:26:41 +0000,
> Mark Rutland <mark.rutland@arm.com> wrote:
>>
>> On Thu, Feb 24, 2022 at 10:07:34PM +0900, Hector Martin wrote:
>>> Hi folks,
>>
>> Hi Hector,
>>
>>> In the t6000/t6001 (M1 Pro / Max) SoCs, Apple introduced a new version
>>> of their interrupt controller. This is a significant departure from
>>> AICv1 and seems designed to better scale to larger chips. This series
>>> adds support for it to the existing AIC driver.
>>>
>>> Gone are CPU affinities; instead there seems to be some kind of
>>> "automagic" dispatch to willing CPU cores, and cores can also opt-out
>>> via an IMP-DEF sysreg (!). Right now the bootloader just sets up all
>>> cores to accept IRQs, and we ignore all this and let the magic
>>> algorithm pick a CPU to accept the IRQ.
>>
>> Maybe that's ok for the set of peripherals attached, but in general that
>> violates existing expectations regarding affinity, and I fear there'll
>> be some subtle brokenness resulting from this automatic target
>> selection.
>>
>> For example, in the perf events subsystem there are PMU drivers (even
>> those for "uncore" or "system" devices which are shared by many/all
>> CPUs) which rely on a combination of interrupt affinity and local IRQ
>> masking (without any other locking) to provide exclusion between a PMU's
>> IRQ handler and any other management operations for that PMU (which are
>> all handled from the same CPU).
> 
> It will definitely break anything that relies on managed interrupts,
> where the kernel expects to allocate interrupts that have a strict
> affinity. Drivers using this feature can legitimately expect that they
> can keep their state in per-CPU pointers, and that obviously breaks.
> 
> This may affect any PCIe device with more than a couple of queues.
> Maybe users of this HW do not care (yet), but we'll have to find a way
> to tell drivers of the limitation.

Yes, we already had a brief discussion about this in the v1 thread:

https://lore.kernel.org/linux-arm-kernel/4a83dfb1-3188-8b09-fc60-d3083230fb54@marcan.st/

TL;DR there is no explicit per-IRQ affinity control, nor does an unknown
one seem possible, since there just aren't enough bits for it in per-IRQ
registers. AICv1 had that, but AICv2 got rid of it in favor of heuristic
magic and global per-CPU controls.

This hasn't actually been fully tested yet, but current hypothesis is
the mapping goes:

1 IRQ -> group (0-7) -> priority (0-3?) -> 1 CPU (local priority threshold)

This is based on the fact that the per-IRQ group field is 3 bits, and
the per-CPU mask IMP-DEF sysreg is 2 bits. There may or may not be
per-IRQ cluster controls. But that still leaves all IRQs funnelled into,
at most, 3-4 classes per CPU cluster, and 8 groups globally, so there's
no way to implement proper per-IRQ affinity (since we have 10 CPUs on
these platforms).

My guess is Apple has bet on heuristic magic to optimize IRQ delivery to
avoid waking up (deep?-)sleeping CPUs on low-priority events and
optimize for power, and forgone strict per-CPU queues which are how many
drivers optimize for performance. This makes some sense, since these are
largely consumer/prosumer platforms, many of them battery-powered, not
128-CPU datacenter monsters with multiple 10GbE interfaces. They can
probably get away without hard multiqueue stuff.

This won't be an issue for PMU interrupts (including the uncore PMU),
since those do not go through AIC per se but rather the FIQ path (which
is inherently per-CPU), same as the local timers. Marc's PMU support
patch set already takes care of adding support for those FIQ sources.
But it will indeed break some PCIe drivers for devices that users might
have arbitrarily attached through Thunderbolt.

Since we do not support Thunderbolt yet, I suggest we kick this can down
the road until we have test cases for how this breaks and how to fix it :-)

There are also other fun things to be done with the local CPU masking,
e.g. directing low-priority IRQs away from CPUs running real-time
threads. I definitely want to take a look in more detail at the controls
we *do* have, especially since I have a personal interest in RT for
audio production (and these platforms have no SMM/TEE, no latency
spikes, and fast cpufreq, woo!). But for now this works and brings up
the platform, so that yak is probably best shaved in the future. Let me
know if you're interested in having more discussions about RT-centric
features, though. I suspect we'll need some new kernel
mechanisms/interfaces to handle e.g. the CPU IMPDEF mask/prio stuff...

Aside, I wonder how they'll handle multi-die devices... for a single
die, you can probably well get away with no CPU pinning, but for
multi-die, are they going to do NUMA? If so, they'd want at least die
controls to avoid bouncing cache lines between dies too much... though
for some reason, I'm getting the feeling they're just going to
interleave memory and pretend it's UMA. Good chance we find out next
month...

-- 
Hector Martin (marcan@marcan.st)
Public Key: https://mrcn.st/pub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support
  2022-02-24 13:07 ` [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support Hector Martin
@ 2022-02-25 14:39   ` Marc Zyngier
  2022-02-27 15:33     ` Hector Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2022-02-25 14:39 UTC (permalink / raw)
  To: Hector Martin
  Cc: Thomas Gleixner, Rob Herring, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On Thu, 24 Feb 2022 13:07:37 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> The newer AICv2 present in t600x SoCs does not have legacy IPI support
> at all. Since t8103 also supports Fast IPIs, implement support for this
> first. The legacy IPI code is left as a fallback, so it can be
> potentially used by older SoCs in the future.
> 
> The vIPI code is shared; only the IPI firing/acking bits change for Fast
> IPIs.
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/irqchip/irq-apple-aic.c | 122 ++++++++++++++++++++++++++++----
>  1 file changed, 109 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> index 38091ebb9403..613e0ebdabdc 100644
> --- a/drivers/irqchip/irq-apple-aic.c
> +++ b/drivers/irqchip/irq-apple-aic.c
> @@ -24,7 +24,7 @@
>   * - Default "this CPU" register view and explicit per-CPU views
>   *
>   * In addition, this driver also handles FIQs, as these are routed to the same
> - * IRQ vector. These are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and
> + * IRQ vector. These are used for Fast IPIs, the ARMv8 timer IRQs, and
>   * performance counters (TODO).
>   *
>   * Implementation notes:
> @@ -52,9 +52,11 @@
>  #include <linux/irqchip.h>
>  #include <linux/irqchip/arm-vgic-info.h>
>  #include <linux/irqdomain.h>
> +#include <linux/jump_label.h>
>  #include <linux/limits.h>
>  #include <linux/of_address.h>
>  #include <linux/slab.h>
> +#include <asm/cputype.h>
>  #include <asm/exception.h>
>  #include <asm/sysreg.h>
>  #include <asm/virt.h>
> @@ -106,7 +108,6 @@
>  
>  /*
>   * IMP-DEF sysregs that control FIQ sources
> - * Note: sysreg-based IPIs are not supported yet.
>   */
>  
>  /* Core PMC control register */
> @@ -155,6 +156,10 @@
>  #define SYS_IMP_APL_UPMSR_EL1		sys_reg(3, 7, 15, 6, 4)
>  #define UPMSR_IACT			BIT(0)
>  
> +/* MPIDR fields */
> +#define MPIDR_CPU(x)			MPIDR_AFFINITY_LEVEL(x, 0)
> +#define MPIDR_CLUSTER(x)		MPIDR_AFFINITY_LEVEL(x, 1)
> +
>  #define AIC_NR_FIQ		4
>  #define AIC_NR_SWIPI		32
>  
> @@ -173,11 +178,44 @@
>  #define AIC_TMR_EL02_PHYS	AIC_TMR_GUEST_PHYS
>  #define AIC_TMR_EL02_VIRT	AIC_TMR_GUEST_VIRT
>  
> +DEFINE_STATIC_KEY_TRUE(use_fast_ipi);
> +
> +struct aic_info {
> +	int version;
> +
> +	/* Features */
> +	bool fast_ipi;
> +};
> +
> +static const struct aic_info aic1_info = {
> +	.version	= 1,
> +};
> +
> +static const struct aic_info aic1_fipi_info = {
> +	.version	= 1,
> +
> +	.fast_ipi	= true,
> +};
> +
> +static const struct of_device_id aic_info_match[] = {
> +	{
> +		.compatible = "apple,t8103-aic",
> +		.data = &aic1_fipi_info,
> +	},
> +	{
> +		.compatible = "apple,aic",
> +		.data = &aic1_info,
> +	},
> +	{}
> +};
> +
>  struct aic_irq_chip {
>  	void __iomem *base;
>  	struct irq_domain *hw_domain;
>  	struct irq_domain *ipi_domain;
>  	int nr_hw;
> +
> +	struct aic_info info;
>  };
>  
>  static DEFINE_PER_CPU(uint32_t, aic_fiq_unmasked);
> @@ -386,8 +424,12 @@ static void __exception_irq_entry aic_handle_fiq(struct pt_regs *regs)
>  	 */
>  
>  	if (read_sysreg_s(SYS_IMP_APL_IPI_SR_EL1) & IPI_SR_PENDING) {
> -		pr_err_ratelimited("Fast IPI fired. Acking.\n");
> -		write_sysreg_s(IPI_SR_PENDING, SYS_IMP_APL_IPI_SR_EL1);
> +		if (static_branch_likely(&use_fast_ipi)) {
> +			aic_handle_ipi(regs);
> +		} else {
> +			pr_err_ratelimited("Fast IPI fired. Acking.\n");
> +			write_sysreg_s(IPI_SR_PENDING, SYS_IMP_APL_IPI_SR_EL1);
> +		}
>  	}
>  
>  	if (TIMER_FIRING(read_sysreg(cntp_ctl_el0)))
> @@ -563,6 +605,22 @@ static const struct irq_domain_ops aic_irq_domain_ops = {
>   * IPI irqchip
>   */
>  
> +static void aic_ipi_send_fast(int cpu)
> +{
> +	u64 mpidr = cpu_logical_map(cpu);
> +	u64 my_mpidr = read_cpuid_mpidr();
> +	u64 cluster = MPIDR_CLUSTER(mpidr);
> +	u64 idx = MPIDR_CPU(mpidr);
> +
> +	if (MPIDR_CLUSTER(my_mpidr) == cluster)
> +		write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx),
> +			       SYS_IMP_APL_IPI_RR_LOCAL_EL1);
> +	else
> +		write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx) | FIELD_PREP(IPI_RR_CLUSTER, cluster),
> +			       SYS_IMP_APL_IPI_RR_GLOBAL_EL1);
> +	isb();

I think you have an ordering issue with this, see below.

> +}
> +
>  static void aic_ipi_mask(struct irq_data *d)
>  {
>  	u32 irq_bit = BIT(irqd_to_hwirq(d));
> @@ -588,8 +646,12 @@ static void aic_ipi_unmask(struct irq_data *d)
>  	 * If a pending vIPI was unmasked, raise a HW IPI to ourselves.
>  	 * No barriers needed here since this is a self-IPI.
>  	 */
> -	if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit)
> -		aic_ic_write(ic, AIC_IPI_SEND, AIC_IPI_SEND_CPU(smp_processor_id()));
> +	if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit) {
> +		if (static_branch_likely(&use_fast_ipi))
> +			aic_ipi_send_fast(smp_processor_id());
> +		else
> +			aic_ic_write(ic, AIC_IPI_SEND, AIC_IPI_SEND_CPU(smp_processor_id()));
> +	}
>  }
>  
>  static void aic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
> @@ -617,8 +679,12 @@ static void aic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>  		smp_mb__after_atomic();
>  
>  		if (!(pending & irq_bit) &&
> -		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit))
> -			send |= AIC_IPI_SEND_CPU(cpu);
> +		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit)) {
> +			if (static_branch_likely(&use_fast_ipi))
> +				aic_ipi_send_fast(cpu);

OK, this is suffering from the same issue that GICv3 has, which is
that memory barriers don't provide order against sysregs. You need a
DSB for that, which is a pain. Something like this:

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 602c8b274170..579f22b72339 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -736,6 +736,15 @@ static void aic_ipi_send_fast(int cpu)
 	u64 cluster = MPIDR_CLUSTER(mpidr);
 	u64 idx = MPIDR_CPU(mpidr);
 
+	/*
+	 * A DSB is required here in to order prior writes to memory
+	 * with system register accesses having a side effect
+	 * (matching the behaviour of the IPI registers). Make sure we
+	 * only order stores with in the IS domain, keeping as light
+	 * as possible.
+	 */
+	dsb(ishst);
+
 	if (MPIDR_CLUSTER(my_mpidr) == cluster)
 		write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx),
 			       SYS_IMP_APL_IPI_RR_LOCAL_EL1);

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 7/7] irqchip/apple-aic: Add support for AICv2
  2022-02-24 13:07 ` [PATCH v2 7/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
@ 2022-02-25 15:27   ` Marc Zyngier
  2022-02-25 22:05     ` Hector Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2022-02-25 15:27 UTC (permalink / raw)
  To: Hector Martin
  Cc: Thomas Gleixner, Rob Herring, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On Thu, 24 Feb 2022 13:07:41 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> Introduce support for the new AICv2 hardware block in t6000/t6001 SoCs.
> 
> It seems these blocks are missing the information required to compute
> the event register offset in the capability registers, so we specify
> that in the DT.
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/irqchip/irq-apple-aic.c | 148 ++++++++++++++++++++++++++++----
>  1 file changed, 129 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> index 93c622435ba2..602c8b274170 100644
> --- a/drivers/irqchip/irq-apple-aic.c
> +++ b/drivers/irqchip/irq-apple-aic.c
> @@ -103,6 +103,57 @@
>  
>  #define AIC_MAX_IRQ		0x400
>  
> +/*
> + * AIC v2 registers (MMIO)
> + */
> +
> +#define AIC2_VERSION		0x0000
> +#define AIC2_VERSION_VER	GENMASK(7, 0)
> +
> +#define AIC2_INFO1		0x0004
> +#define AIC2_INFO1_NR_IRQ	GENMASK(15, 0)
> +#define AIC2_INFO1_LAST_DIE	GENMASK(27, 24)
> +
> +#define AIC2_INFO2		0x0008
> +
> +#define AIC2_INFO3		0x000c
> +#define AIC2_INFO3_MAX_IRQ	GENMASK(15, 0)
> +#define AIC2_INFO3_MAX_DIE	GENMASK(27, 24)
> +
> +#define AIC2_RESET		0x0010
> +#define AIC2_RESET_RESET	BIT(0)
> +
> +#define AIC2_CONFIG		0x0014
> +#define AIC2_CONFIG_ENABLE	BIT(0)
> +#define AIC2_CONFIG_PREFER_PCPU	BIT(28)
> +
> +#define AIC2_TIMEOUT		0x0028
> +#define AIC2_CLUSTER_PRIO	0x0030
> +#define AIC2_DELAY_GROUPS	0x0100
> +
> +#define AIC2_IRQ_CFG		0x2000
> +
> +/*
> + * AIC2 registers are laid out like this, starting at AIC2_IRQ_CFG:
> + *
> + * Repeat for each die:
> + *   IRQ_CFG: u32 * MAX_IRQS
> + *   SW_SET: u32 * (MAX_IRQS / 32)
> + *   SW_CLR: u32 * (MAX_IRQS / 32)
> + *   MASK_SET: u32 * (MAX_IRQS / 32)
> + *   MASK_CLR: u32 * (MAX_IRQS / 32)
> + *   HW_STATE: u32 * (MAX_IRQS / 32)
> + *
> + * This is followed by a set of event registers, each 16K page aligned.
> + * The first one is the AP event register we will use. Unfortunately,
> + * the actual implemented die count is not specified anywhere in the
> + * capability registers, so we have to explicitly specify the event
> + * register offset in the device tree to remain forward-compatible.
> + */
> +
> +#define AIC2_IRQ_CFG_TARGET	GENMASK(3, 0)
> +#define AIC2_IRQ_CFG_DELAY_IDX	GENMASK(7, 5)
> +
>  #define MASK_REG(x)		(4 * ((x) >> 5))
>  #define MASK_BIT(x)		BIT((x) & GENMASK(4, 0))
>  
> @@ -193,6 +244,7 @@ struct aic_info {
>  	/* Register offsets */
>  	u32 event;
>  	u32 target_cpu;
> +	u32 irq_cfg;
>  	u32 sw_set;
>  	u32 sw_clr;
>  	u32 mask_set;
> @@ -220,6 +272,14 @@ static const struct aic_info aic1_fipi_info = {
>  	.fast_ipi	= true,
>  };
>  
> +static const struct aic_info aic2_info = {
> +	.version	= 2,
> +
> +	.irq_cfg	= AIC2_IRQ_CFG,
> +
> +	.fast_ipi	= true,
> +};
> +
>  static const struct of_device_id aic_info_match[] = {
>  	{
>  		.compatible = "apple,t8103-aic",
> @@ -229,6 +289,10 @@ static const struct of_device_id aic_info_match[] = {
>  		.compatible = "apple,aic",
>  		.data = &aic1_info,
>  	},
> +	{
> +		.compatible = "apple,aic2",
> +		.data = &aic2_info,
> +	},
>  	{}
>  };
>  
> @@ -373,6 +437,14 @@ static struct irq_chip aic_chip = {
>  	.irq_set_type = aic_irq_set_type,
>  };
>  
> +static struct irq_chip aic2_chip = {
> +	.name = "AIC2",
> +	.irq_mask = aic_irq_mask,
> +	.irq_unmask = aic_irq_unmask,
> +	.irq_eoi = aic_irq_eoi,
> +	.irq_set_type = aic_irq_set_type,
> +};
> +
>  /*
>   * FIQ irqchip
>   */
> @@ -529,10 +601,15 @@ static struct irq_chip fiq_chip = {
>  static int aic_irq_domain_map(struct irq_domain *id, unsigned int irq,
>  			      irq_hw_number_t hw)
>  {
> +	struct aic_irq_chip *ic = id->host_data;
>  	u32 type = FIELD_GET(AIC_EVENT_TYPE, hw);
> +	struct irq_chip *chip = &aic_chip;
> +
> +	if (ic->info.version == 2)
> +		chip = &aic2_chip;
>  
>  	if (type == AIC_EVENT_TYPE_IRQ) {
> -		irq_domain_set_info(id, irq, hw, &aic_chip, id->host_data,
> +		irq_domain_set_info(id, irq, hw, chip, id->host_data,
>  				    handle_fasteoi_irq, NULL, NULL);
>  		irqd_set_single_target(irq_desc_get_irq_data(irq_to_desc(irq)));
>  	} else {
> @@ -888,24 +965,26 @@ static int aic_init_cpu(unsigned int cpu)
>  	/* Commit all of the above */
>  	isb();
>  
> -	/*
> -	 * Make sure the kernel's idea of logical CPU order is the same as AIC's
> -	 * If we ever end up with a mismatch here, we will have to introduce
> -	 * a mapping table similar to what other irqchip drivers do.
> -	 */
> -	WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());
> +	if (aic_irqc->info.version == 1) {
> +		/*
> +		 * Make sure the kernel's idea of logical CPU order is the same as AIC's
> +		 * If we ever end up with a mismatch here, we will have to introduce
> +		 * a mapping table similar to what other irqchip drivers do.
> +		 */
> +		WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());

Don't you have a similar issue with AICv2?  Or is it that AICv2
doesn't have this register?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2
  2022-02-24 13:07 ` [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2 Hector Martin
@ 2022-02-25 20:19   ` Rob Herring
  2022-02-25 21:58     ` Hector Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Rob Herring @ 2022-02-25 20:19 UTC (permalink / raw)
  To: Hector Martin
  Cc: Thomas Gleixner, Marc Zyngier, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On Thu, Feb 24, 2022 at 10:07:36PM +0900, Hector Martin wrote:
> This new incompatible revision of the AIC peripheral introduces
> multi-die support. This binding is based on apple,aic, but
> changes interrupt-cells to add a new die argument.
> 
> Also adds an apple,event-reg property to specify the offset of the event
> register. Inexplicably, the capability registers allow us to compute
> other register offsets, but not this one. This allows us to keep
> forward-compatibility with future SoCs that will likely implement
> different die counts, thus shifting the event register. Apple do the
> same thing in their device tree...
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  .../interrupt-controller/apple,aic2.yaml      | 99 +++++++++++++++++++
>  MAINTAINERS                                   |  2 +-
>  2 files changed, 100 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml
> 
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml b/Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml
> new file mode 100644
> index 000000000000..311900613b9e
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/apple,aic2.yaml
> @@ -0,0 +1,99 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/interrupt-controller/apple,aic2.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Apple Interrupt Controller 2
> +
> +maintainers:
> +  - Hector Martin <marcan@marcan.st>
> +
> +description: |
> +  The Apple Interrupt Controller 2 is a simple interrupt controller present on
> +  Apple ARM SoC platforms starting with t600x (M1 Pro and Max).
> +
> +  It provides the following features:
> +
> +  - Level-triggered hardware IRQs wired to SoC blocks
> +    - Single mask bit per IRQ
> +    - Automatic masking on event delivery (auto-ack)
> +    - Software triggering (ORed with hw line)
> +  - Automatic prioritization (single event/ack register per CPU, lower IRQs =
> +    higher priority)
> +  - Automatic masking on ack
> +  - Support for multiple dies
> +
> +  This device also represents the FIQ interrupt sources on platforms using AIC,
> +  which do not go through a discrete interrupt controller. It also handles
> +  FIQ-based Fast IPIs.
> +
> +properties:
> +  compatible:
> +    items:
> +      - const: apple,t6000-aic
> +      - const: apple,aic2

I feel I was sold on Apple doesn't change h/w and we're the 2nd chip in 
and the h/w changed. Just my musings, but aic3 will be rejected. :(

> +
> +  interrupt-controller: true
> +
> +  '#interrupt-cells':
> +    const: 4
> +    description: |
> +      The 1st cell contains the interrupt type:
> +        - 0: Hardware IRQ
> +        - 1: FIQ
> +
> +      The 2nd cell contains the die ID.
> +
> +      The next cell contains the interrupt number.
> +        - HW IRQs: interrupt number
> +        - FIQs:
> +          - 0: physical HV timer
> +          - 1: virtual HV timer
> +          - 2: physical guest timer
> +          - 3: virtual guest timer
> +
> +      The last cell contains the interrupt flags. This is normally
> +      IRQ_TYPE_LEVEL_HIGH (4).
> +
> +  reg:
> +    description: |
> +      Specifies base physical address and size of the AIC registers.
> +    maxItems: 1
> +
> +  power-domains:
> +    maxItems: 1
> +
> +  apple,event-reg:
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +    description:
> +      Specifies the offset of the event register, which lies after all the
> +      implemented die register sets, page aligned. This is not computable from
> +      capability register values, so we have to specify it explicitly.

If this is last, then couldn't it be a 2nd 'reg' entry?

'page aligned' is ambiguous. I assume that means 16K since that's what 
Apple uses, but I might assume 4K not knowing that.

> +
> +required:
> +  - compatible
> +  - '#interrupt-cells'
> +  - interrupt-controller
> +  - reg
> +  - apple,event-reg
> +
> +additionalProperties: false
> +
> +allOf:
> +  - $ref: /schemas/interrupt-controller.yaml#
> +
> +examples:
> +  - |
> +    soc {
> +        #address-cells = <2>;
> +        #size-cells = <2>;
> +
> +        aic: interrupt-controller@28e100000 {
> +            compatible = "apple,t6000-aic", "apple,aic2";
> +            #interrupt-cells = <4>;
> +            interrupt-controller;
> +            reg = <0x2 0x8e100000 0x0 0x10000>;
> +            apple,event-reg = <0xc000>;
> +        };
> +    };
> diff --git a/MAINTAINERS b/MAINTAINERS
> index aa0f6cbb634e..ad887ec7e96b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1806,7 +1806,7 @@ T:	git https://github.com/AsahiLinux/linux.git
>  F:	Documentation/devicetree/bindings/arm/apple.yaml
>  F:	Documentation/devicetree/bindings/arm/apple/*
>  F:	Documentation/devicetree/bindings/i2c/apple,i2c.yaml
> -F:	Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
> +F:	Documentation/devicetree/bindings/interrupt-controller/apple,*
>  F:	Documentation/devicetree/bindings/mailbox/apple,mailbox.yaml
>  F:	Documentation/devicetree/bindings/pci/apple,pcie.yaml
>  F:	Documentation/devicetree/bindings/pinctrl/apple,pinctrl.yaml
> -- 
> 2.33.0
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2
  2022-02-25 20:19   ` Rob Herring
@ 2022-02-25 21:58     ` Hector Martin
  2022-03-07 11:35       ` Marc Zyngier
  0 siblings, 1 reply; 19+ messages in thread
From: Hector Martin @ 2022-02-25 21:58 UTC (permalink / raw)
  To: Rob Herring
  Cc: Thomas Gleixner, Marc Zyngier, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On 26/02/2022 05.19, Rob Herring wrote:
>> +properties:
>> +  compatible:
>> +    items:
>> +      - const: apple,t6000-aic
>> +      - const: apple,aic2
> 
> I feel I was sold on Apple doesn't change h/w and we're the 2nd chip in 
> and the h/w changed. Just my musings, but aic3 will be rejected. :(

Well yes, after not changing hardware for N phone/tablet generations,
they figured out they *finally* had to make some changes for real
desktop chips... (t8103 was a tablet chip they shoehorned into laptops;
t6000 is the first real laptop/desktop chip). This isn't the 2nd chip
in, this is the 26th chip in or so, and yet it's called AIC2 (by Apple
even)... We aren't starting from chip #1, just the first chip they
decided to *let* us put Linux on.

It's pretty clear that the t6000 changes were made with future-proofing
in mind. I guess we'll find out in a couple weeks, since the rumor mill
says M2 is coming. If I'm right and we end up needing *zero* kernel
changes to boot on M2, will you be happy? ;-)

>> +  apple,event-reg:
>> +    $ref: /schemas/types.yaml#/definitions/uint32
>> +    description:
>> +      Specifies the offset of the event register, which lies after all the
>> +      implemented die register sets, page aligned. This is not computable from
>> +      capability register values, so we have to specify it explicitly.
> 
> If this is last, then couldn't it be a 2nd 'reg' entry?
> 
> 'page aligned' is ambiguous. I assume that means 16K since that's what 
> Apple uses, but I might assume 4K not knowing that.

16K, and yeah, it could be a 2nd reg entry if you think that works
better. Makes sense.

-- 
Hector Martin (marcan@marcan.st)
Public Key: https://mrcn.st/pub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 7/7] irqchip/apple-aic: Add support for AICv2
  2022-02-25 15:27   ` Marc Zyngier
@ 2022-02-25 22:05     ` Hector Martin
  0 siblings, 0 replies; 19+ messages in thread
From: Hector Martin @ 2022-02-25 22:05 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Gleixner, Rob Herring, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On 26/02/2022 00.27, Marc Zyngier wrote:
> On Thu, 24 Feb 2022 13:07:41 +0000,
> Hector Martin <marcan@marcan.st> wrote:
>> -	/*
>> -	 * Make sure the kernel's idea of logical CPU order is the same as AIC's
>> -	 * If we ever end up with a mismatch here, we will have to introduce
>> -	 * a mapping table similar to what other irqchip drivers do.
>> -	 */
>> -	WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());
>> +	if (aic_irqc->info.version == 1) {
>> +		/*
>> +		 * Make sure the kernel's idea of logical CPU order is the same as AIC's
>> +		 * If we ever end up with a mismatch here, we will have to introduce
>> +		 * a mapping table similar to what other irqchip drivers do.
>> +		 */
>> +		WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());
> 
> Don't you have a similar issue with AICv2?  Or is it that AICv2
> doesn't have this register?

No concept of individual CPUs in AICv2 at all, so no WHOAMI register
either :)

-- 
Hector Martin (marcan@marcan.st)
Public Key: https://mrcn.st/pub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support
  2022-02-25 14:39   ` Marc Zyngier
@ 2022-02-27 15:33     ` Hector Martin
  2022-03-07 11:35       ` Marc Zyngier
  0 siblings, 1 reply; 19+ messages in thread
From: Hector Martin @ 2022-02-27 15:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Gleixner, Rob Herring, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On 25/02/2022 23.39, Marc Zyngier wrote:
> On Thu, 24 Feb 2022 13:07:37 +0000,
>>  		if (!(pending & irq_bit) &&
>> -		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit))
>> -			send |= AIC_IPI_SEND_CPU(cpu);
>> +		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit)) {
>> +			if (static_branch_likely(&use_fast_ipi))
>> +				aic_ipi_send_fast(cpu);
> 
> OK, this is suffering from the same issue that GICv3 has, which is
> that memory barriers don't provide order against sysregs. You need a
> DSB for that, which is a pain. Something like this:

Doesn't the control flow here guarantee the ordering? atomic_read() must
complete before the sysreg is written since there is a control flow
dependency, and the prior atomic/barrier dance ensures that read is
ordered properly with everything that comes before it.

-- 
Hector Martin (marcan@marcan.st)
Public Key: https://mrcn.st/pub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support
  2022-02-27 15:33     ` Hector Martin
@ 2022-03-07 11:35       ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2022-03-07 11:35 UTC (permalink / raw)
  To: Hector Martin
  Cc: Thomas Gleixner, Rob Herring, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On Sun, 27 Feb 2022 15:33:54 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> On 25/02/2022 23.39, Marc Zyngier wrote:
> > On Thu, 24 Feb 2022 13:07:37 +0000,
> >>  		if (!(pending & irq_bit) &&
> >> -		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit))
> >> -			send |= AIC_IPI_SEND_CPU(cpu);
> >> +		    (atomic_read(per_cpu_ptr(&aic_vipi_enable, cpu)) & irq_bit)) {
> >> +			if (static_branch_likely(&use_fast_ipi))
> >> +				aic_ipi_send_fast(cpu);
> > 
> > OK, this is suffering from the same issue that GICv3 has, which is
> > that memory barriers don't provide order against sysregs. You need a
> > DSB for that, which is a pain. Something like this:
> 
> Doesn't the control flow here guarantee the ordering? atomic_read() must
> complete before the sysreg is written since there is a control flow
> dependency, and the prior atomic/barrier dance ensures that read is
> ordered properly with everything that comes before it.

Yes, you're right. Mixing memory ordering and control dependency hurts
my head badly, but hey, why not.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2
  2022-02-25 21:58     ` Hector Martin
@ 2022-03-07 11:35       ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2022-03-07 11:35 UTC (permalink / raw)
  To: Hector Martin
  Cc: Rob Herring, Thomas Gleixner, Sven Peter, Alyssa Rosenzweig,
	Mark Kettenis, linux-arm-kernel, linux-kernel, devicetree

On Fri, 25 Feb 2022 21:58:34 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> On 26/02/2022 05.19, Rob Herring wrote:
> >> +properties:
> >> +  compatible:
> >> +    items:
> >> +      - const: apple,t6000-aic
> >> +      - const: apple,aic2
> > 
> > I feel I was sold on Apple doesn't change h/w and we're the 2nd chip in 
> > and the h/w changed. Just my musings, but aic3 will be rejected. :(
> 
> Well yes, after not changing hardware for N phone/tablet generations,
> they figured out they *finally* had to make some changes for real
> desktop chips... (t8103 was a tablet chip they shoehorned into laptops;
> t6000 is the first real laptop/desktop chip). This isn't the 2nd chip
> in, this is the 26th chip in or so, and yet it's called AIC2 (by Apple
> even)... We aren't starting from chip #1, just the first chip they
> decided to *let* us put Linux on.
> 
> It's pretty clear that the t6000 changes were made with future-proofing
> in mind. I guess we'll find out in a couple weeks, since the rumor mill
> says M2 is coming. If I'm right and we end up needing *zero* kernel
> changes to boot on M2, will you be happy? ;-)
> 
> >> +  apple,event-reg:
> >> +    $ref: /schemas/types.yaml#/definitions/uint32
> >> +    description:
> >> +      Specifies the offset of the event register, which lies after all the
> >> +      implemented die register sets, page aligned. This is not computable from
> >> +      capability register values, so we have to specify it explicitly.
> > 
> > If this is last, then couldn't it be a 2nd 'reg' entry?
> > 
> > 'page aligned' is ambiguous. I assume that means 16K since that's what 
> > Apple uses, but I might assume 4K not knowing that.
> 
> 16K, and yeah, it could be a 2nd reg entry if you think that works
> better. Makes sense.

Do you plan to respin this? If I'm going to that this series for 5.18,
it needs to be this week.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-03-07 11:35 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-24 13:07 [PATCH v2 0/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
2022-02-24 13:07 ` [PATCH v2 1/7] PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form Hector Martin
2022-02-24 13:07 ` [PATCH v2 2/7] dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2 Hector Martin
2022-02-25 20:19   ` Rob Herring
2022-02-25 21:58     ` Hector Martin
2022-03-07 11:35       ` Marc Zyngier
2022-02-24 13:07 ` [PATCH v2 3/7] irqchip/apple-aic: Add Fast IPI support Hector Martin
2022-02-25 14:39   ` Marc Zyngier
2022-02-27 15:33     ` Hector Martin
2022-03-07 11:35       ` Marc Zyngier
2022-02-24 13:07 ` [PATCH v2 4/7] irqchip/apple-aic: Switch to irq_domain_create_tree and sparse hwirqs Hector Martin
2022-02-24 13:07 ` [PATCH v2 5/7] irqchip/apple-aic: Dynamically compute register offsets Hector Martin
2022-02-24 13:07 ` [PATCH v2 6/7] irqchip/apple-aic: Support multiple dies Hector Martin
2022-02-24 13:07 ` [PATCH v2 7/7] irqchip/apple-aic: Add support for AICv2 Hector Martin
2022-02-25 15:27   ` Marc Zyngier
2022-02-25 22:05     ` Hector Martin
2022-02-24 18:26 ` [PATCH v2 0/7] " Mark Rutland
2022-02-24 19:06   ` Marc Zyngier
2022-02-25  4:27     ` Hector Martin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).