* [PATCH v4 00/10] Linux RISC-V AIA Support
@ 2023-06-13 15:34 Anup Patel
  2023-06-13 15:34 ` [PATCH v4 01/10] RISC-V: Add riscv_fw_parent_hartid() function Anup Patel
                   ` (9 more replies)
  0 siblings, 10 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

The RISC-V AIA specification is now frozen as per the RISC-V International
process. The latest frozen specification can be found at:
https://github.com/riscv/riscv-aia/releases/download/1.0-RC1/riscv-interrupts-1.0-RC1.pdf

At a high-level, the AIA specification adds three things:
1) AIA CSRs
   - Improved local interrupt support
2) Incoming Message Signaled Interrupt Controller (IMSIC)
   - Per-HART MSI controller
   - Support MSI virtualization
   - Support IPI along with virtualization
3) Advanced Platform-Level Interrupt Controller (APLIC)
   - Wired interrupt controller
   - In MSI-mode, converts wired interrupt into MSIs (i.e. MSI generator)
   - In Direct-mode, injects external interrupts directly into HARTs

For an overview of the AIA specification, refer to the recent AIA virtualization
talk at KVM Forum 2022:
https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
https://www.youtube.com/watch?v=r071dL8Z0yo

PATCH2 of this series conflicts with the "irqchip/riscv-intc: Add ACPI
support" patch of the "Add basic ACPI support for RISC-V" series, hence this
series is based upon the "Add basic ACPI support for RISC-V" series.
(Refer to https://lore.kernel.org/lkml/20230515054928.2079268-1-sunilvl@ventanamicro.com/)

To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).

These patches can also be found in the riscv_aia_v4 branch at:
https://github.com/avpatel/linux.git

Changes since v3:
 - Rebased on Linux-6.4-rc6
 - Dropped PATCH2 of v3 series; instead we now set FWNODE_FLAG_BEST_EFFORT via
   IRQCHIP_DECLARE()
 - Extend riscv_fw_parent_hartid() to support both DT and ACPI in PATCH1
 - Extend iommu_dma_compose_msi_msg() instead of adding iommu_dma_select_msi()
   in PATCH6
 - Addressed Conor's comments in PATCH3
 - Addressed Conor's and Rob's comments in PATCH7

Changes since v2:
 - Rebased on Linux-6.4-rc1
 - Addressed Rob's comments on DT bindings patches 4 and 8.
 - Addressed Marc's comments on IMSIC driver PATCH5
 - Replaced use of OF APIs in APLIC and IMSIC drivers with FWNODE APIs;
   this makes both drivers easily portable for ACPI support. This also
   removes unnecessary indirection from the APLIC and IMSIC drivers.
 - PATCH1 is a new patch for portability with ACPI support
 - PATCH2 is a new patch to fix probing in APLIC drivers for APLIC-only systems.
 - PATCH7 is a new patch which addresses the IOMMU DMA domain issues pointed
   out by SiFive

Changes since v1:
 - Rebased on Linux-6.2-rc2
 - Addressed comments on IMSIC DT bindings for PATCH4
 - Use raw_spin_lock_irqsave() on ids_lock for PATCH5
 - Improved MMIO alignment checks in PATCH5 to allow MMIO regions
   with holes.
 - Addressed comments on APLIC DT bindings for PATCH6
 - Fixed warning splat in aplic_msi_write_msg() caused by
   zeroed MSI message in PATCH7
 - Dropped DT property riscv,slow-ipi; a module parameter will be added
   instead in the future.

Anup Patel (10):
  RISC-V: Add riscv_fw_parent_hartid() function
  irqchip/riscv-intc: Add support for RISC-V AIA
  dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  irqchip: Add RISC-V incoming MSI controller driver
  irqchip/riscv-imsic: Add support for PCI MSI irqdomain
  irqchip/riscv-imsic: Improve IOMMU DMA support
  dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  irqchip: Add RISC-V advanced PLIC driver
  RISC-V: Select APLIC and IMSIC drivers
  MAINTAINERS: Add entry for RISC-V AIA drivers

 .../interrupt-controller/riscv,aplic.yaml     |  169 +++
 .../interrupt-controller/riscv,imsics.yaml    |  172 +++
 MAINTAINERS                                   |   12 +
 arch/riscv/Kconfig                            |    2 +
 arch/riscv/include/asm/processor.h            |    3 +
 arch/riscv/kernel/cpu.c                       |   16 +
 drivers/iommu/dma-iommu.c                     |   24 +-
 drivers/irqchip/Kconfig                       |   20 +-
 drivers/irqchip/Makefile                      |    2 +
 drivers/irqchip/irq-riscv-aplic.c             |  765 ++++++++++++
 drivers/irqchip/irq-riscv-imsic.c             | 1076 +++++++++++++++++
 drivers/irqchip/irq-riscv-intc.c              |   36 +-
 include/linux/irqchip/riscv-aplic.h           |  119 ++
 include/linux/irqchip/riscv-imsic.h           |   86 ++
 14 files changed, 2492 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
 create mode 100644 drivers/irqchip/irq-riscv-aplic.c
 create mode 100644 drivers/irqchip/irq-riscv-imsic.c
 create mode 100644 include/linux/irqchip/riscv-aplic.h
 create mode 100644 include/linux/irqchip/riscv-imsic.h

-- 
2.34.1



* [PATCH v4 01/10] RISC-V: Add riscv_fw_parent_hartid() function
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-13 15:34 ` [PATCH v4 02/10] irqchip/riscv-intc: Add support for RISC-V AIA Anup Patel
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

Add a common riscv_fw_parent_hartid() function which helps device drivers
get the parent hartid of the INTC (i.e. local interrupt controller)
fwnode. This should work for both DT and ACPI.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 arch/riscv/include/asm/processor.h |  3 +++
 arch/riscv/kernel/cpu.c            | 16 ++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 94a0590c6971..6fb8bbec8459 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -77,6 +77,9 @@ struct device_node;
 int riscv_of_processor_hartid(struct device_node *node, unsigned long *hartid);
 int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid);
 
+struct fwnode_handle;
+int riscv_fw_parent_hartid(struct fwnode_handle *node, unsigned long *hartid);
+
 extern void riscv_fill_hwcap(void);
 extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 5de6fb703cc2..67b335789b22 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -73,6 +73,22 @@ int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid)
 	return -1;
 }
 
+/* Find hart ID of the CPU fwnode under which given fwnode falls. */
+int riscv_fw_parent_hartid(struct fwnode_handle *node, unsigned long *hartid)
+{
+	int rc;
+	u64 temp;
+
+	if (!is_of_node(node)) {
+		rc = fwnode_property_read_u64_array(node, "hartid", &temp, 1);
+		if (!rc)
+			*hartid = temp;
+	} else
+		rc = riscv_of_parent_hartid(to_of_node(node), hartid);
+
+	return rc;
+}
+
 DEFINE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);
 
 unsigned long riscv_cached_mvendorid(unsigned int cpu_id)
-- 
2.34.1



* [PATCH v4 02/10] irqchip/riscv-intc: Add support for RISC-V AIA
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
  2023-06-13 15:34 ` [PATCH v4 01/10] RISC-V: Add riscv_fw_parent_hartid() function Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-13 15:34 ` [PATCH v4 03/10] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller Anup Patel
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

The RISC-V advanced interrupt architecture (AIA) extends the per-HART
local interrupts in the following ways:
1. Minimum 64 local interrupts for both RV32 and RV64
2. Ability to process multiple pending local interrupts in the same
   interrupt handler
3. Priority configuration for each local interrupt
4. Special CSRs to configure/access the per-HART MSI controller

This patch adds support for RISC-V AIA in the RISC-V intc driver.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
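As an illustration only (not part of the patch), the mask/unmask and TOPI
handling in the diff below can be modelled in user space roughly as follows.
All sketch_* names are hypothetical, and the IID shift of 16 is an assumption
taken from the AIA specification's TOPI layout rather than from this diff.

```c
#include <stdbool.h>

/* Assumed value: the interrupt identity (IID) field of TOPI starts at bit 16. */
#define SKETCH_TOPI_IID_SHIFT 16

/* Hypothetical result type: which enable CSR holds the bit, and where. */
struct sketch_ie_slot {
	bool use_ieh;     /* true: high half (CSR_IEH), false: CSR_IE */
	unsigned int bit; /* bit position inside the selected CSR */
};

/*
 * Local interrupts below bits_per_long live in the IE CSR; on RV32 with AIA
 * the remaining ones live in the IEH CSR, offset by bits_per_long.
 */
struct sketch_ie_slot sketch_ie_slot_for(unsigned int hwirq,
					 unsigned int bits_per_long)
{
	struct sketch_ie_slot s;

	s.use_ieh = hwirq >= bits_per_long;
	s.bit = s.use_ieh ? hwirq - bits_per_long : hwirq;
	return s;
}

/* Extract the interrupt number from a TOPI value, as the AIA handler does. */
unsigned int sketch_topi_to_hwirq(unsigned long topi)
{
	return (unsigned int)(topi >> SKETCH_TOPI_IID_SHIFT);
}
```

With bits_per_long = 32, hwirq 35 would map to bit 3 of the high-half CSR,
while hwirq 9 stays in the low-half CSR at bit 9.
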
 drivers/irqchip/irq-riscv-intc.c | 36 ++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index 4adeee1bc391..e235bf1708a4 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/smp.h>
+#include <asm/hwcap.h>
 
 static struct irq_domain *intc_domain;
 
@@ -30,6 +31,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
 	generic_handle_domain_irq(intc_domain, cause);
 }
 
+static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
+{
+	unsigned long topi;
+
+	while ((topi = csr_read(CSR_TOPI)))
+		generic_handle_domain_irq(intc_domain,
+					  topi >> TOPI_IID_SHIFT);
+}
+
 /*
  * On RISC-V systems local interrupts are masked or unmasked by writing
  * the SIE (Supervisor Interrupt Enable) CSR.  As CSRs can only be written
@@ -39,12 +49,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
 
 static void riscv_intc_irq_mask(struct irq_data *d)
 {
-	csr_clear(CSR_IE, BIT(d->hwirq));
+	if (d->hwirq < BITS_PER_LONG)
+		csr_clear(CSR_IE, BIT(d->hwirq));
+	else
+		csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
 }
 
 static void riscv_intc_irq_unmask(struct irq_data *d)
 {
-	csr_set(CSR_IE, BIT(d->hwirq));
+	if (d->hwirq < BITS_PER_LONG)
+		csr_set(CSR_IE, BIT(d->hwirq));
+	else
+		csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
 }
 
 static void riscv_intc_irq_eoi(struct irq_data *d)
@@ -115,16 +131,22 @@ static struct fwnode_handle *riscv_intc_hwnode(void)
 
 static int __init riscv_intc_init_common(struct fwnode_handle *fn)
 {
-	int rc;
+	int rc, nr_irqs = BITS_PER_LONG;
+
+	if (riscv_isa_extension_available(NULL, SxAIA) && BITS_PER_LONG == 32)
+		nr_irqs = nr_irqs * 2;
 
-	intc_domain = irq_domain_create_linear(fn, BITS_PER_LONG,
+	intc_domain = irq_domain_create_linear(fn, nr_irqs,
 					       &riscv_intc_domain_ops, NULL);
 	if (!intc_domain) {
 		pr_err("unable to add IRQ domain\n");
 		return -ENXIO;
 	}
 
-	rc = set_handle_irq(&riscv_intc_irq);
+	if (riscv_isa_extension_available(NULL, SxAIA))
+		rc = set_handle_irq(&riscv_intc_aia_irq);
+	else
+		rc = set_handle_irq(&riscv_intc_irq);
 	if (rc) {
 		pr_err("failed to set irq handler\n");
 		return rc;
@@ -132,7 +154,9 @@ static int __init riscv_intc_init_common(struct fwnode_handle *fn)
 
 	riscv_set_intc_hwnode_fn(riscv_intc_hwnode);
 
-	pr_info("%d local interrupts mapped\n", BITS_PER_LONG);
+	pr_info("%d local interrupts mapped%s\n",
+		nr_irqs, (riscv_isa_extension_available(NULL, SxAIA)) ?
+			 " using AIA" : "");
 
 	return 0;
 }
-- 
2.34.1



* [PATCH v4 03/10] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
  2023-06-13 15:34 ` [PATCH v4 01/10] RISC-V: Add riscv_fw_parent_hartid() function Anup Patel
  2023-06-13 15:34 ` [PATCH v4 02/10] irqchip/riscv-intc: Add support for RISC-V AIA Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-13 15:34 ` [PATCH v4 04/10] irqchip: Add RISC-V incoming MSI controller driver Anup Patel
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel, Conor Dooley, Krzysztof Kozlowski

Add a DT bindings document for the RISC-V incoming MSI controller
(IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
specification.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
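As a rough illustration (not part of the binding), the MSI target address
layout described in the binding below can be computed as sketched here. The
sketch_* names are hypothetical; it assumes the base address has all group
index bits clear, that the HART index is relative to its group, and that each
interrupt file occupies one 4KB page.

```c
#include <stdint.h>

/* Each IMSIC interrupt file occupies one 4KB MMIO page. */
#define SKETCH_IMSIC_PAGE_SHIFT 12

/* Hypothetical container for the relevant binding properties. */
struct sketch_imsic_layout {
	uint64_t base;                  /* first "reg" base (group 0) */
	unsigned int guest_index_bits;  /* riscv,guest-index-bits */
	unsigned int group_index_shift; /* riscv,group-index-shift */
};

/*
 * Compose the MSI target address from group, HART (within the group) and
 * guest index. Guest index 0 is the interrupt file of the privilege level
 * itself; higher values select guest interrupt files.
 */
uint64_t sketch_imsic_msi_addr(const struct sketch_imsic_layout *l,
			       unsigned int group, unsigned int hart,
			       unsigned int guest)
{
	uint64_t addr = l->base;

	addr += (uint64_t)group << l->group_index_shift;
	addr += (uint64_t)hart << (l->guest_index_bits + SKETCH_IMSIC_PAGE_SHIFT);
	addr += (uint64_t)guest << SKETCH_IMSIC_PAGE_SHIFT;
	return addr;
}
```

For the values of Example 2 in the binding (base 0x28000000, group index shift
24, no guest index bits), HART 1 of group 1 would land at 0x29001000, which is
inside the Group1 region listed in "reg".
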
 .../interrupt-controller/riscv,imsics.yaml    | 172 ++++++++++++++++++
 1 file changed, 172 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
new file mode 100644
index 000000000000..84976f17a4a1
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
@@ -0,0 +1,172 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Incoming MSI Controller (IMSIC)
+
+maintainers:
+  - Anup Patel <anup@brainfault.org>
+
+description: |
+  The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
+  MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
+  AIA specification can be found at https://github.com/riscv/riscv-aia.
+
+  The IMSIC is a per-CPU (or per-HART) device with a separate interrupt file
+  for each privilege level (machine or supervisor). The configuration of
+  an IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO
+  space to receive MSIs from devices. Each IMSIC interrupt file supports a
+  fixed number of interrupt identities (to distinguish MSIs from devices)
+  which is the same for a given privilege level across CPUs (or HARTs).
+
+  The device tree of a RISC-V platform will have one IMSIC device tree node
+  for each privilege level (machine or supervisor) which collectively describe
+  IMSIC interrupt files at that privilege level across CPUs (or HARTs).
+
+  The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
+  follows a particular scheme defined by the RISC-V AIA specification. An IMSIC
+  group is a set of IMSIC interrupt files co-located in MMIO space and we can
+  have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
+  RISC-V platform. The MSI target address of an IMSIC interrupt file at a given
+  privilege level (machine or supervisor) encodes group index, HART index,
+  and guest index (shown below).
+
+  XLEN-1            > (HART Index MSB)                  12    0
+  |                  |                                  |     |
+  -------------------------------------------------------------
+  |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
+  -------------------------------------------------------------
+
+allOf:
+  - $ref: /schemas/interrupt-controller.yaml#
+  - $ref: /schemas/interrupt-controller/msi-controller.yaml#
+
+properties:
+  compatible:
+    items:
+      - enum:
+          - qemu,imsics
+      - const: riscv,imsics
+
+  reg:
+    minItems: 1
+    maxItems: 16384
+    description:
+      Base address of each IMSIC group.
+
+  interrupt-controller: true
+
+  "#interrupt-cells":
+    const: 0
+
+  msi-controller: true
+
+  "#msi-cells":
+    const: 0
+
+  interrupts-extended:
+    minItems: 1
+    maxItems: 16384
+    description:
+      This property represents the set of CPUs (or HARTs) for which the given
+      device tree node describes the IMSIC interrupt files. Each node pointed
+      to should be a riscv,cpu-intc node, which has a CPU node (i.e. RISC-V
+      HART) as parent.
+
+  riscv,num-ids:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 63
+    maximum: 2047
+    description:
+      Number of interrupt identities supported by IMSIC interrupt file.
+
+  riscv,num-guest-ids:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 63
+    maximum: 2047
+    description:
+      Number of interrupt identities supported by IMSIC guest interrupt
+      file. When not specified it is assumed to be the same as specified by the
+      riscv,num-ids property.
+
+  riscv,guest-index-bits:
+    minimum: 0
+    maximum: 7
+    default: 0
+    description:
+      Number of guest index bits in the MSI target address.
+
+  riscv,hart-index-bits:
+    minimum: 0
+    maximum: 15
+    description:
+      Number of HART index bits in the MSI target address. When not
+      specified it is calculated based on the interrupts-extended property.
+
+  riscv,group-index-bits:
+    minimum: 0
+    maximum: 7
+    default: 0
+    description:
+      Number of group index bits in the MSI target address.
+
+  riscv,group-index-shift:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 55
+    default: 24
+    description:
+      The least significant bit position of the group index bits in the
+      MSI target address.
+
+required:
+  - compatible
+  - reg
+  - interrupt-controller
+  - msi-controller
+  - "#msi-cells"
+  - interrupts-extended
+  - riscv,num-ids
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    // Example 1 (Machine-level IMSIC files with just one group):
+
+    interrupt-controller@24000000 {
+      compatible = "qemu,imsics", "riscv,imsics";
+      interrupts-extended = <&cpu1_intc 11>,
+                            <&cpu2_intc 11>,
+                            <&cpu3_intc 11>,
+                            <&cpu4_intc 11>;
+      reg = <0x24000000 0x4000>;
+      interrupt-controller;
+      #interrupt-cells = <0>;
+      msi-controller;
+      #msi-cells = <0>;
+      riscv,num-ids = <127>;
+    };
+
+  - |
+    // Example 2 (Supervisor-level IMSIC files with two groups):
+
+    interrupt-controller@28000000 {
+      compatible = "qemu,imsics", "riscv,imsics";
+      interrupts-extended = <&cpu1_intc 9>,
+                            <&cpu2_intc 9>,
+                            <&cpu3_intc 9>,
+                            <&cpu4_intc 9>;
+      reg = <0x28000000 0x2000>, /* Group0 IMSICs */
+            <0x29000000 0x2000>; /* Group1 IMSICs */
+      interrupt-controller;
+      #interrupt-cells = <0>;
+      msi-controller;
+      #msi-cells = <0>;
+      riscv,num-ids = <127>;
+      riscv,group-index-bits = <1>;
+      riscv,group-index-shift = <24>;
+    };
+...
-- 
2.34.1



* [PATCH v4 04/10] irqchip: Add RISC-V incoming MSI controller driver
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
                   ` (2 preceding siblings ...)
  2023-06-13 15:34 ` [PATCH v4 03/10] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-13 15:34 ` [PATCH v4 05/10] irqchip/riscv-imsic: Add support for PCI MSI irqdomain Anup Patel
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

The RISC-V advanced interrupt architecture (AIA) specification defines
a new MSI controller for managing MSIs and IPIs on a RISC-V platform.

This new MSI controller is referred to as the incoming message signalled
interrupt controller (IMSIC), which manages MSIs on a per-HART (or per-CPU)
basis. (For more details, refer to https://github.com/riscv/riscv-aia)

This patch adds an irqchip driver for RISC-V IMSIC which provides
IPIs and platform MSIs to the Linux RISC-V kernel.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
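As an illustration only (not part of the patch), the indirect register
selection done by __imsic_eix_update() in the driver below can be modelled in
user space roughly as follows. The sketch_* names are hypothetical, and the
register numbers 0x80 (first pending register) and 0xc0 (first enable
register) are assumptions taken from the AIA specification, since the header
defining them is not quoted here.

```c
/* Assumed AIA indirect register numbers and per-register width. */
#define SKETCH_IMSIC_EIP0      0x80
#define SKETCH_IMSIC_EIE0      0xc0
#define SKETCH_IMSIC_EIPX_BITS 32

/*
 * Select the indirect register holding interrupt identity "id".
 * The registers are numbered in 32-bit units, so with xlen == 64 only the
 * even-numbered registers are used and each one covers 64 identities.
 */
unsigned long sketch_eix_isel(unsigned long id, unsigned int xlen, int pend)
{
	unsigned long isel = id / xlen;

	isel *= xlen / SKETCH_IMSIC_EIPX_BITS;
	isel += pend ? SKETCH_IMSIC_EIP0 : SKETCH_IMSIC_EIE0;
	return isel;
}

/* Bit position of "id" inside the selected register. */
unsigned int sketch_eix_bit(unsigned long id, unsigned int xlen)
{
	return (unsigned int)(id & (xlen - 1));
}
```

For example, identity 40 would be bit 40 of register 0xc0 when xlen is 64, and
bit 8 of register 0xc1 when xlen is 32.
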
 drivers/irqchip/Kconfig             |    7 +-
 drivers/irqchip/Makefile            |    1 +
 drivers/irqchip/irq-riscv-imsic.c   | 1028 +++++++++++++++++++++++++++
 include/linux/irqchip/riscv-imsic.h |   86 +++
 4 files changed, 1121 insertions(+), 1 deletion(-)
 create mode 100644 drivers/irqchip/irq-riscv-imsic.c
 create mode 100644 include/linux/irqchip/riscv-imsic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 09e422da482f..8ef18be5f37b 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -30,7 +30,6 @@ config ARM_GIC_V2M
 
 config GIC_NON_BANKED
 	bool
-
 config ARM_GIC_V3
 	bool
 	select IRQ_DOMAIN_HIERARCHY
@@ -545,6 +544,12 @@ config SIFIVE_PLIC
 	select IRQ_DOMAIN_HIERARCHY
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
 
+config RISCV_IMSIC
+	bool
+	depends on RISCV
+	select IRQ_DOMAIN_HIERARCHY
+	select GENERIC_MSI_IRQ
+
 config EXYNOS_IRQ_COMBINER
 	bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
 	depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index ffd945fe71aa..577bde3e986b 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)			+= irq-qcom-mpm.o
 obj-$(CONFIG_CSKY_MPINTC)		+= irq-csky-mpintc.o
 obj-$(CONFIG_CSKY_APB_INTC)		+= irq-csky-apb-intc.o
 obj-$(CONFIG_RISCV_INTC)		+= irq-riscv-intc.o
+obj-$(CONFIG_RISCV_IMSIC)		+= irq-riscv-imsic.o
 obj-$(CONFIG_SIFIVE_PLIC)		+= irq-sifive-plic.o
 obj-$(CONFIG_IMX_IRQSTEER)		+= irq-imx-irqsteer.o
 obj-$(CONFIG_IMX_INTMUX)		+= irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
new file mode 100644
index 000000000000..971fad638c9f
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -0,0 +1,1028 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of_address.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <asm/hwcap.h>
+
+#define IMSIC_DISABLE_EIDELIVERY		0
+#define IMSIC_ENABLE_EIDELIVERY			1
+#define IMSIC_DISABLE_EITHRESHOLD		1
+#define IMSIC_ENABLE_EITHRESHOLD		0
+
+/*
+ * The IMSIC driver uses 1 IPI for ID synchronization and
+ * arch/riscv/kernel/smp.c requires 6 IPIs so we fix the
+ * total number of IPIs to 8.
+ */
+#define IMSIC_NR_IPI				8
+
+#define imsic_csr_write(__c, __v)		\
+do {						\
+	csr_write(CSR_ISELECT, __c);		\
+	csr_write(CSR_IREG, __v);		\
+} while (0)
+
+#define imsic_csr_read(__c)			\
+({						\
+	unsigned long __v;			\
+	csr_write(CSR_ISELECT, __c);		\
+	__v = csr_read(CSR_IREG);		\
+	__v;					\
+})
+
+#define imsic_csr_set(__c, __v)			\
+do {						\
+	csr_write(CSR_ISELECT, __c);		\
+	csr_set(CSR_IREG, __v);			\
+} while (0)
+
+#define imsic_csr_clear(__c, __v)		\
+do {						\
+	csr_write(CSR_ISELECT, __c);		\
+	csr_clear(CSR_IREG, __v);		\
+} while (0)
+
+struct imsic_priv {
+	/* Global configuration common for all HARTs */
+	struct imsic_global_config global;
+
+	/* Global state of interrupt identities */
+	raw_spinlock_t ids_lock;
+	unsigned long *ids_used_bimap;
+	unsigned long *ids_enabled_bimap;
+	unsigned int *ids_target_cpu;
+
+	/* Mask for connected CPUs */
+	struct cpumask lmask;
+
+	/* IPI interrupt identity and synchronization */
+	u32 ipi_id;
+	int ipi_virq;
+	struct irq_desc *ipi_lsync_desc;
+
+	/* IRQ domains */
+	struct irq_domain *base_domain;
+	struct irq_domain *plat_domain;
+};
+
+static struct imsic_priv *imsic;
+static int imsic_parent_irq;
+
+const struct imsic_global_config *imsic_get_global_config(void)
+{
+	return (imsic) ? &imsic->global : NULL;
+}
+EXPORT_SYMBOL_GPL(imsic_get_global_config);
+
+static int imsic_cpu_page_phys(unsigned int cpu,
+			       unsigned int guest_index,
+			       phys_addr_t *out_msi_pa)
+{
+	struct imsic_global_config *global;
+	struct imsic_local_config *local;
+
+	global = &imsic->global;
+	local = per_cpu_ptr(global->local, cpu);
+
+	if (BIT(global->guest_index_bits) <= guest_index)
+		return -EINVAL;
+
+	if (out_msi_pa)
+		*out_msi_pa = local->msi_pa +
+			      (guest_index * IMSIC_MMIO_PAGE_SZ);
+
+	return 0;
+}
+
+static int imsic_get_cpu(const struct cpumask *mask_val, bool force,
+			 unsigned int *out_target_cpu)
+{
+	struct cpumask amask;
+	unsigned int cpu;
+
+	cpumask_and(&amask, &imsic->lmask, mask_val);
+
+	if (force)
+		cpu = cpumask_first(&amask);
+	else
+		cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	if (out_target_cpu)
+		*out_target_cpu = cpu;
+
+	return 0;
+}
+
+static void imsic_id_set_target(unsigned int id, unsigned int target_cpu)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+	imsic->ids_target_cpu[id] = target_cpu;
+	raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+}
+
+static unsigned int imsic_id_get_target(unsigned int id)
+{
+	unsigned int ret;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+	ret = imsic->ids_target_cpu[id];
+	raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+	return ret;
+}
+
+static void __imsic_eix_update(unsigned long base_id,
+			       unsigned long num_id, bool pend, bool val)
+{
+	unsigned long i, isel, ireg;
+	unsigned long id = base_id, last_id = base_id + num_id;
+
+	while (id < last_id) {
+		isel = id / BITS_PER_LONG;
+		isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+		isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
+
+		ireg = 0;
+		for (i = id & (__riscv_xlen - 1);
+		     (id < last_id) && (i < __riscv_xlen); i++) {
+			ireg |= BIT(i);
+			id++;
+		}
+
+		/*
+		 * The IMSIC EIEx and EIPx registers are accessed
+		 * indirectly using the ISELECT and IREG CSRs so we
+		 * need to access these CSRs without getting preempted.
+		 *
+		 * All existing users of this function call this
+		 * function with local IRQs disabled so we don't
+		 * need to do anything special here.
+		 */
+		if (val)
+			imsic_csr_set(isel, ireg);
+		else
+			imsic_csr_clear(isel, ireg);
+	}
+}
+
+#define __imsic_id_enable(__id)		\
+	__imsic_eix_update((__id), 1, false, true)
+#define __imsic_id_disable(__id)	\
+	__imsic_eix_update((__id), 1, false, false)
+
+static void imsic_ids_local_sync(void)
+{
+	int i;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+	for (i = 1; i <= imsic->global.nr_ids; i++) {
+		if (imsic->ipi_id == i)
+			continue;
+
+		if (test_bit(i, imsic->ids_enabled_bimap))
+			__imsic_id_enable(i);
+		else
+			__imsic_id_disable(i);
+	}
+	raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+}
+
+static void imsic_ids_local_delivery(bool enable)
+{
+	if (enable) {
+		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
+		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
+	} else {
+		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
+		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
+	}
+}
+
+#ifdef CONFIG_SMP
+static irqreturn_t imsic_ids_sync_handler(int irq, void *data)
+{
+	imsic_ids_local_sync();
+	return IRQ_HANDLED;
+}
+
+static void imsic_ids_remote_sync(void)
+{
+	struct cpumask amask;
+
+	/*
+	 * We simply inject ID synchronization IPI to all target CPUs
+	 * except current CPU. The ipi_send_mask() implementation of
+	 * IPI mux will inject ID synchronization IPI only for CPUs
+	 * that have enabled it so offline CPUs won't receive IPI.
+	 * An offline CPU will unconditionally synchronize IDs through
+	 * imsic_starting_cpu() when the CPU is brought up.
+	 */
+	cpumask_andnot(&amask, &imsic->lmask, cpumask_of(smp_processor_id()));
+	__ipi_send_mask(imsic->ipi_lsync_desc, &amask);
+}
+#else
+#define imsic_ids_remote_sync()
+#endif
+
+static int imsic_ids_alloc(unsigned int order)
+{
+	int ret;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+	ret = bitmap_find_free_region(imsic->ids_used_bimap,
+				      imsic->global.nr_ids + 1, order);
+	raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+	return ret;
+}
+
+static void imsic_ids_free(unsigned int base_id, unsigned int order)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+	bitmap_release_region(imsic->ids_used_bimap, base_id, order);
+	raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+}
+
+static int __init imsic_ids_init(void)
+{
+	int i;
+	struct imsic_global_config *global = &imsic->global;
+
+	raw_spin_lock_init(&imsic->ids_lock);
+
+	/* Allocate used bitmap */
+	imsic->ids_used_bimap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
+	if (!imsic->ids_used_bimap)
+		return -ENOMEM;
+
+	/* Allocate enabled bitmap */
+	imsic->ids_enabled_bimap = bitmap_zalloc(global->nr_ids + 1,
+						GFP_KERNEL);
+	if (!imsic->ids_enabled_bimap) {
+		bitmap_free(imsic->ids_used_bimap);
+		return -ENOMEM;
+	}
+
+	/* Allocate target CPU array */
+	imsic->ids_target_cpu = kcalloc(global->nr_ids + 1,
+				       sizeof(unsigned int), GFP_KERNEL);
+	if (!imsic->ids_target_cpu) {
+		bitmap_free(imsic->ids_enabled_bimap);
+		bitmap_free(imsic->ids_used_bimap);
+		return -ENOMEM;
+	}
+	for (i = 0; i <= global->nr_ids; i++)
+		imsic->ids_target_cpu[i] = UINT_MAX;
+
+	/* Reserve ID#0 because it is special and never implemented */
+	bitmap_set(imsic->ids_used_bimap, 0, 1);
+
+	return 0;
+}
+
+static void __init imsic_ids_cleanup(void)
+{
+	kfree(imsic->ids_target_cpu);
+	bitmap_free(imsic->ids_enabled_bimap);
+	bitmap_free(imsic->ids_used_bimap);
+}
+
+#ifdef CONFIG_SMP
+static void imsic_ipi_send(unsigned int cpu)
+{
+	struct imsic_local_config *local =
+				per_cpu_ptr(imsic->global.local, cpu);
+
+	writel(imsic->ipi_id, local->msi_va);
+}
+
+static void imsic_ipi_starting_cpu(void)
+{
+	/* Enable IPIs for current CPU. */
+	__imsic_id_enable(imsic->ipi_id);
+
+	/* Enable virtual IPI used for IMSIC ID synchronization */
+	enable_percpu_irq(imsic->ipi_virq, 0);
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+	/*
+	 * Disable virtual IPI used for IMSIC ID synchronization so
+	 * that we don't receive ID synchronization requests.
+	 */
+	disable_percpu_irq(imsic->ipi_virq);
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+	int virq;
+
+	/* Allocate interrupt identity for IPIs */
+	virq = imsic_ids_alloc(get_count_order(1));
+	if (virq < 0)
+		return virq;
+	imsic->ipi_id = virq;
+
+	/* Create IMSIC IPI multiplexing */
+	virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
+	if (virq <= 0) {
+		imsic_ids_free(imsic->ipi_id, get_count_order(1));
+		return (virq < 0) ? virq : -ENOMEM;
+	}
+	imsic->ipi_virq = virq;
+
+	/* First vIRQ is used for IMSIC ID synchronization */
+	virq = request_percpu_irq(imsic->ipi_virq, imsic_ids_sync_handler,
+				  "riscv-imsic-lsync", imsic->global.local);
+	if (virq) {
+		imsic_ids_free(imsic->ipi_id, get_count_order(1));
+		return virq;
+	}
+	irq_set_status_flags(imsic->ipi_virq, IRQ_HIDDEN);
+	imsic->ipi_lsync_desc = irq_to_desc(imsic->ipi_virq);
+
+	/* Set vIRQ range */
+	riscv_ipi_set_virq_range(imsic->ipi_virq + 1, IMSIC_NR_IPI - 1, true);
+
+	return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(void)
+{
+	if (imsic->ipi_lsync_desc)
+		free_percpu_irq(imsic->ipi_virq, imsic->global.local);
+	imsic_ids_free(imsic->ipi_id, get_count_order(1));
+}
+#else
+static void imsic_ipi_starting_cpu(void)
+{
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+	/* Clear the IPI id because we are not using IPIs */
+	imsic->ipi_id = 0;
+	return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(void)
+{
+}
+#endif
+
+static void imsic_irq_mask(struct irq_data *d)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+	bitmap_clear(imsic->ids_enabled_bimap, d->hwirq, 1);
+	__imsic_id_disable(d->hwirq);
+	raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+	imsic_ids_remote_sync();
+}
+
+static void imsic_irq_unmask(struct irq_data *d)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+	bitmap_set(imsic->ids_enabled_bimap, d->hwirq, 1);
+	__imsic_id_enable(d->hwirq);
+	raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+	imsic_ids_remote_sync();
+}
+
+static void imsic_irq_compose_msi_msg(struct irq_data *d,
+				      struct msi_msg *msg)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(d);
+	phys_addr_t msi_addr;
+	unsigned int cpu;
+	int err;
+
+	cpu = imsic_id_get_target(d->hwirq);
+	if (WARN_ON(cpu == UINT_MAX))
+		return;
+
+	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+	if (WARN_ON(err))
+		return;
+
+	msg->address_hi = upper_32_bits(msi_addr);
+	msg->address_lo = lower_32_bits(msi_addr);
+	msg->data = d->hwirq;
+	iommu_dma_compose_msi_msg(desc, msg);
+}
+
+#ifdef CONFIG_SMP
+static int imsic_irq_set_affinity(struct irq_data *d,
+				  const struct cpumask *mask_val,
+				  bool force)
+{
+	unsigned int target_cpu;
+	int rc;
+
+	rc = imsic_get_cpu(mask_val, force, &target_cpu);
+	if (rc)
+		return rc;
+
+	imsic_id_set_target(d->hwirq, target_cpu);
+	irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
+
+	return IRQ_SET_MASK_OK;
+}
+#endif
+
+static struct irq_chip imsic_irq_base_chip = {
+	.name			= "RISC-V IMSIC-BASE",
+	.irq_mask		= imsic_irq_mask,
+	.irq_unmask		= imsic_irq_unmask,
+#ifdef CONFIG_SMP
+	.irq_set_affinity	= imsic_irq_set_affinity,
+#endif
+	.irq_compose_msi_msg	= imsic_irq_compose_msi_msg,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int imsic_irq_domain_alloc(struct irq_domain *domain,
+				  unsigned int virq,
+				  unsigned int nr_irqs,
+				  void *args)
+{
+	msi_alloc_info_t *info = args;
+	phys_addr_t msi_addr;
+	int i, hwirq, err = 0;
+	unsigned int cpu;
+
+	err = imsic_get_cpu(&imsic->lmask, false, &cpu);
+	if (err)
+		return err;
+
+	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+	if (err)
+		return err;
+
+	hwirq = imsic_ids_alloc(get_count_order(nr_irqs));
+	if (hwirq < 0)
+		return hwirq;
+
+	err = iommu_dma_prepare_msi(info->desc, msi_addr);
+	if (err)
+		goto fail;
+
+	for (i = 0; i < nr_irqs; i++) {
+		imsic_id_set_target(hwirq + i, cpu);
+		irq_domain_set_info(domain, virq + i, hwirq + i,
+				    &imsic_irq_base_chip, imsic,
+				    handle_simple_irq, NULL, NULL);
+		irq_set_noprobe(virq + i);
+		irq_set_affinity(virq + i, &imsic->lmask);
+		/*
+		 * IMSIC does not implement irq_disable() so Linux interrupt
+		 * subsystem will take a lazy approach for disabling an IMSIC
+		 * interrupt. This means IMSIC interrupts are left unmasked
+		 * upon system suspend and interrupts are not processed
+		 * immediately upon system wake up. To tackle this, we disable
+		 * the lazy approach for all IMSIC interrupts.
+		 */
+		irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+	}
+
+	return 0;
+
+fail:
+	imsic_ids_free(hwirq, get_count_order(nr_irqs));
+	return err;
+}
+
+static void imsic_irq_domain_free(struct irq_domain *domain,
+				  unsigned int virq,
+				  unsigned int nr_irqs)
+{
+	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+
+	imsic_ids_free(d->hwirq, get_count_order(nr_irqs));
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops imsic_base_domain_ops = {
+	.alloc		= imsic_irq_domain_alloc,
+	.free		= imsic_irq_domain_free,
+};
+
+static struct irq_chip imsic_plat_irq_chip = {
+	.name			= "RISC-V IMSIC-PLAT",
+};
+
+static struct msi_domain_ops imsic_plat_domain_ops = {
+};
+
+static struct msi_domain_info imsic_plat_domain_info = {
+	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
+	.ops	= &imsic_plat_domain_ops,
+	.chip	= &imsic_plat_irq_chip,
+};
+
+static int __init imsic_irq_domains_init(struct fwnode_handle *fwnode)
+{
+	/* Create Base IRQ domain */
+	imsic->base_domain = irq_domain_create_tree(fwnode,
+					&imsic_base_domain_ops, imsic);
+	if (!imsic->base_domain) {
+		pr_err("Failed to create IMSIC base domain\n");
+		return -ENOMEM;
+	}
+	irq_domain_update_bus_token(imsic->base_domain, DOMAIN_BUS_NEXUS);
+
+	/* Create Platform MSI domain */
+	imsic->plat_domain = platform_msi_create_irq_domain(fwnode,
+						&imsic_plat_domain_info,
+						imsic->base_domain);
+	if (!imsic->plat_domain) {
+		pr_err("Failed to create IMSIC platform domain\n");
+		irq_domain_remove(imsic->base_domain);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * To handle an interrupt, we read the TOPEI CSR and write zero in one
+ * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
+ * Linux interrupt number and let Linux IRQ subsystem handle it.
+ */
+static void imsic_handle_irq(struct irq_desc *desc)
+{
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	irq_hw_number_t hwirq;
+	int err;
+
+	chained_irq_enter(chip, desc);
+
+	while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
+		hwirq = hwirq >> TOPEI_ID_SHIFT;
+
+		if (hwirq == imsic->ipi_id) {
+#ifdef CONFIG_SMP
+			ipi_mux_process();
+#endif
+			continue;
+		}
+
+		err = generic_handle_domain_irq(imsic->base_domain, hwirq);
+		if (unlikely(err))
+			pr_warn_ratelimited(
+				"hwirq %lu mapping not found\n", hwirq);
+	}
+
+	chained_irq_exit(chip, desc);
+}
+
+static int imsic_starting_cpu(unsigned int cpu)
+{
+	/* Enable per-CPU parent interrupt */
+	enable_percpu_irq(imsic_parent_irq,
+			  irq_get_trigger_type(imsic_parent_irq));
+
+	/* Setup IPIs */
+	imsic_ipi_starting_cpu();
+
+	/*
+	 * Interrupt identities might have been enabled/disabled while
+	 * this CPU was not running so sync-up local enable/disable state.
+	 */
+	imsic_ids_local_sync();
+
+	/* Enable local interrupt delivery */
+	imsic_ids_local_delivery(true);
+
+	return 0;
+}
+
+static int imsic_dying_cpu(unsigned int cpu)
+{
+	/* Cleanup IPIs */
+	imsic_ipi_dying_cpu();
+
+	return 0;
+}
+
+static int __init imsic_get_parent_hartid(struct fwnode_handle *fwnode,
+					  u32 index, unsigned long *hartid)
+{
+	int rc;
+	struct fwnode_reference_args parent;
+
+	rc = fwnode_property_get_reference_args(fwnode,
+			"interrupts-extended", "#interrupt-cells",
+			0, index, &parent);
+	if (rc)
+		return rc;
+
+	/*
+	 * Skip interrupts other than external interrupts for
+	 * current privilege level.
+	 */
+	if (parent.args[0] != RV_IRQ_EXT)
+		return -EINVAL;
+
+	return riscv_fw_parent_hartid(parent.fwnode, hartid);
+}
+
+static int __init imsic_get_mmio_resource(struct fwnode_handle *fwnode,
+					  u32 index, struct resource *res)
+{
+	/*
+	 * Currently, only OF fwnode is supported, so extend this function
+	 * for other types of fwnode to add ACPI support.
+	 */
+	if (!is_of_node(fwnode))
+		return -EINVAL;
+	return of_address_to_resource(to_of_node(fwnode), index, res);
+}
+
+static int __init imsic_init(struct fwnode_handle *fwnode)
+{
+	int rc, cpu;
+	phys_addr_t base_addr;
+	struct irq_domain *domain;
+	void __iomem **mmios_va = NULL;
+	struct resource res, *mmios = NULL;
+	struct imsic_local_config *local;
+	struct imsic_global_config *global;
+	unsigned long reloff, hartid;
+	u32 i, j, index, nr_parent_irqs, nr_handlers = 0, num_mmios = 0;
+
+	/*
+	 * Only one IMSIC instance allowed in a platform for clean
+	 * implementation of SMP IRQ affinity and per-CPU IPIs.
+	 *
+	 * This means on a multi-socket (or multi-die) platform we
+	 * will have multiple MMIO regions for one IMSIC instance.
+	 */
+	if (imsic) {
+		pr_err("%pfwP: already initialized hence ignoring\n",
+			fwnode);
+		return -ENODEV;
+	}
+
+	if (!riscv_isa_extension_available(NULL, SxAIA)) {
+		pr_err("%pfwP: AIA support not available\n", fwnode);
+		return -ENODEV;
+	}
+
+	imsic = kzalloc(sizeof(*imsic), GFP_KERNEL);
+	if (!imsic)
+		return -ENOMEM;
+	global = &imsic->global;
+
+	global->local = alloc_percpu(typeof(*(global->local)));
+	if (!global->local) {
+		rc = -ENOMEM;
+		goto out_free_priv;
+	}
+
+	/* Find number of parent interrupts */
+	nr_parent_irqs = 0;
+	while (!imsic_get_parent_hartid(fwnode, nr_parent_irqs, &hartid))
+		nr_parent_irqs++;
+	if (!nr_parent_irqs) {
+		pr_err("%pfwP: no parent irqs available\n", fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+
+	/* Find number of guest index bits in MSI address */
+	rc = fwnode_property_read_u32_array(fwnode, "riscv,guest-index-bits",
+					    &global->guest_index_bits, 1);
+	if (rc)
+		global->guest_index_bits = 0;
+	i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+	if (i < global->guest_index_bits) {
+		pr_err("%pfwP: guest index bits too big\n", fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+
+	/* Find number of HART index bits */
+	rc = fwnode_property_read_u32_array(fwnode, "riscv,hart-index-bits",
+					    &global->hart_index_bits, 1);
+	if (rc) {
+		/* Assume default value */
+		global->hart_index_bits = __fls(nr_parent_irqs);
+		if (BIT(global->hart_index_bits) < nr_parent_irqs)
+			global->hart_index_bits++;
+	}
+	i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT - global->guest_index_bits;
+	if (i < global->hart_index_bits) {
+		pr_err("%pfwP: HART index bits too big\n", fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+
+	/* Find number of group index bits */
+	rc = fwnode_property_read_u32_array(fwnode, "riscv,group-index-bits",
+					    &global->group_index_bits, 1);
+	if (rc)
+		global->group_index_bits = 0;
+	i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+	    global->guest_index_bits - global->hart_index_bits;
+	if (i < global->group_index_bits) {
+		pr_err("%pfwP: group index bits too big\n", fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+
+	/*
+	 * Find first bit position of group index.
+	 * If not specified, assume the default APLIC-IMSIC configuration.
+	 */
+	rc = fwnode_property_read_u32_array(fwnode, "riscv,group-index-shift",
+					    &global->group_index_shift, 1);
+	if (rc)
+		global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
+	i = global->group_index_bits + global->group_index_shift - 1;
+	if (i >= BITS_PER_LONG) {
+		pr_err("%pfwP: group index shift too big\n", fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+
+	/* Find number of interrupt identities */
+	rc = fwnode_property_read_u32_array(fwnode, "riscv,num-ids",
+					    &global->nr_ids, 1);
+	if (rc) {
+		pr_err("%pfwP: number of interrupt identities not found\n",
+			fwnode);
+		goto out_free_local;
+	}
+	if ((global->nr_ids < IMSIC_MIN_ID) ||
+	    (global->nr_ids >= IMSIC_MAX_ID) ||
+	    ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+		pr_err("%pfwP: invalid number of interrupt identities\n",
+			fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+
+	/* Find number of guest interrupt identities */
+	if (fwnode_property_read_u32_array(fwnode, "riscv,num-guest-ids",
+					   &global->nr_guest_ids, 1))
+		global->nr_guest_ids = global->nr_ids;
+	if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
+	    (global->nr_guest_ids >= IMSIC_MAX_ID) ||
+	    ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+		pr_err("%pfwP: invalid number of guest interrupt identities\n",
+			fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+
+	/* Compute base address */
+	rc = imsic_get_mmio_resource(fwnode, 0, &res);
+	if (rc) {
+		pr_err("%pfwP: first MMIO resource not found\n", fwnode);
+		rc = -EINVAL;
+		goto out_free_local;
+	}
+	global->base_addr = res.start;
+	global->base_addr &= ~(BIT(global->guest_index_bits +
+				   global->hart_index_bits +
+				   IMSIC_MMIO_PAGE_SHIFT) - 1);
+	global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+			       global->group_index_shift);
+
+	/* Find number of MMIO register sets */
+	while (!imsic_get_mmio_resource(fwnode, num_mmios, &res))
+		num_mmios++;
+
+	/* Allocate MMIO resource array */
+	mmios = kcalloc(num_mmios, sizeof(*mmios), GFP_KERNEL);
+	if (!mmios) {
+		rc = -ENOMEM;
+		goto out_free_local;
+	}
+
+	/* Allocate MMIO virtual address array */
+	mmios_va = kcalloc(num_mmios, sizeof(*mmios_va), GFP_KERNEL);
+	if (!mmios_va) {
+		rc = -ENOMEM;
+		goto out_iounmap;
+	}
+
+	/* Parse and map MMIO register sets */
+	for (i = 0; i < num_mmios; i++) {
+		rc = imsic_get_mmio_resource(fwnode, i, &mmios[i]);
+		if (rc) {
+			pr_err("%pfwP: unable to parse MMIO regset %d\n",
+				fwnode, i);
+			goto out_iounmap;
+		}
+
+		base_addr = mmios[i].start;
+		base_addr &= ~(BIT(global->guest_index_bits +
+				   global->hart_index_bits +
+				   IMSIC_MMIO_PAGE_SHIFT) - 1);
+		base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+			       global->group_index_shift);
+		if (base_addr != global->base_addr) {
+			rc = -EINVAL;
+			pr_err("%pfwP: address mismatch for regset %d\n",
+				fwnode, i);
+			goto out_iounmap;
+		}
+
+		mmios_va[i] = ioremap(mmios[i].start, resource_size(&mmios[i]));
+		if (!mmios_va[i]) {
+			rc = -EIO;
+			pr_err("%pfwP: unable to map MMIO regset %d\n",
+				fwnode, i);
+			goto out_iounmap;
+		}
+	}
+
+	/* Initialize interrupt identity management */
+	rc = imsic_ids_init();
+	if (rc) {
+		pr_err("%pfwP: failed to initialize interrupt management\n",
+		       fwnode);
+		goto out_iounmap;
+	}
+
+	/* Configure handlers for target CPUs */
+	for (i = 0; i < nr_parent_irqs; i++) {
+		rc = imsic_get_parent_hartid(fwnode, i, &hartid);
+		if (rc) {
+			pr_warn("%pfwP: hart ID for parent irq%d not found\n",
+				fwnode, i);
+			continue;
+		}
+
+		cpu = riscv_hartid_to_cpuid(hartid);
+		if (cpu < 0) {
+			pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
+				fwnode, i);
+			continue;
+		}
+
+		/* Find MMIO location of MSI page */
+		index = num_mmios;
+		reloff = i * BIT(global->guest_index_bits) *
+			 IMSIC_MMIO_PAGE_SZ;
+		for (j = 0; j < num_mmios; j++) {
+			if (reloff < resource_size(&mmios[j])) {
+				index = j;
+				break;
+			}
+
+			/*
+			 * MMIO region size may not be aligned to
+			 * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+			 * if holes are present.
+			 */
+			reloff -= ALIGN(resource_size(&mmios[j]),
+			BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
+		}
+		if (index >= num_mmios) {
+			pr_warn("%pfwP: MMIO not found for parent irq%d\n",
+				fwnode, i);
+			continue;
+		}
+
+		cpumask_set_cpu(cpu, &imsic->lmask);
+
+		local = per_cpu_ptr(global->local, cpu);
+		local->msi_pa = mmios[index].start + reloff;
+		local->msi_va = mmios_va[index] + reloff;
+
+		nr_handlers++;
+	}
+
+	/* If no CPU handlers found then can't take interrupts */
+	if (!nr_handlers) {
+		pr_err("%pfwP: No CPU handlers found\n", fwnode);
+		rc = -ENODEV;
+		goto out_ids_cleanup;
+	}
+
+	/* Find parent domain and register chained handler */
+	domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+					  DOMAIN_BUS_ANY);
+	if (!domain) {
+		pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
+		rc = -ENOENT;
+		goto out_ids_cleanup;
+	}
+	imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+	if (!imsic_parent_irq) {
+		pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
+		rc = -ENOENT;
+		goto out_ids_cleanup;
+	}
+	irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
+
+	/* Initialize IPI domain */
+	rc = imsic_ipi_domain_init();
+	if (rc) {
+		pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
+		goto out_ids_cleanup;
+	}
+
+	/* Initialize IRQ and MSI domains */
+	rc = imsic_irq_domains_init(fwnode);
+	if (rc) {
+		pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
+		       fwnode);
+		goto out_ipi_domain_cleanup;
+	}
+
+	/*
+	 * Setup cpuhp state (must be done after setting imsic_parent_irq)
+	 *
+	 * Don't disable per-CPU IMSIC file when CPU goes offline
+	 * because this affects IPI and the masking/unmasking of
+	 * virtual IPIs is done via generic IPI-Mux
+	 */
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+			  "irqchip/riscv/imsic:starting",
+			  imsic_starting_cpu, imsic_dying_cpu);
+
+	/* We don't need MMIO arrays anymore so let's free-up */
+	kfree(mmios_va);
+	kfree(mmios);
+
+	pr_info("%pfwP:  hart-index-bits: %d,  guest-index-bits: %d\n",
+		fwnode, global->hart_index_bits, global->guest_index_bits);
+	pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
+		fwnode, global->group_index_bits, global->group_index_shift);
+	pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
+		fwnode, global->nr_ids, nr_handlers, &global->base_addr);
+	if (imsic->ipi_id)
+		pr_info("%pfwP: providing IPIs using interrupt %d\n",
+			fwnode, imsic->ipi_id);
+
+	return 0;
+
+out_ipi_domain_cleanup:
+	imsic_ipi_domain_cleanup();
+out_ids_cleanup:
+	imsic_ids_cleanup();
+out_iounmap:
+	for (i = 0; mmios_va && i < num_mmios; i++) {
+		if (mmios_va[i])
+			iounmap(mmios_va[i]);
+	}
+	kfree(mmios_va);
+	kfree(mmios);
+out_free_local:
+	free_percpu(imsic->global.local);
+out_free_priv:
+	kfree(imsic);
+	imsic = NULL;
+	return rc;
+}
+
+static int __init imsic_dt_init(struct device_node *node,
+				struct device_node *parent)
+{
+	return imsic_init(&node->fwnode);
+}
+IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
new file mode 100644
index 000000000000..1f6fc9a57218
--- /dev/null
+++ b/include/linux/irqchip/riscv-imsic.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
+#define __LINUX_IRQCHIP_RISCV_IMSIC_H
+
+#include <linux/types.h>
+#include <asm/csr.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT		12
+#define IMSIC_MMIO_PAGE_SZ		(1UL << IMSIC_MMIO_PAGE_SHIFT)
+#define IMSIC_MMIO_PAGE_LE		0x00
+#define IMSIC_MMIO_PAGE_BE		0x04
+
+#define IMSIC_MIN_ID			63
+#define IMSIC_MAX_ID			2048
+
+#define IMSIC_EIDELIVERY		0x70
+
+#define IMSIC_EITHRESHOLD		0x72
+
+#define IMSIC_EIP0			0x80
+#define IMSIC_EIP63			0xbf
+#define IMSIC_EIPx_BITS			32
+
+#define IMSIC_EIE0			0xc0
+#define IMSIC_EIE63			0xff
+#define IMSIC_EIEx_BITS			32
+
+#define IMSIC_FIRST			IMSIC_EIDELIVERY
+#define IMSIC_LAST			IMSIC_EIE63
+
+#define IMSIC_MMIO_SETIPNUM_LE		0x00
+#define IMSIC_MMIO_SETIPNUM_BE		0x04
+
+struct imsic_local_config {
+	phys_addr_t msi_pa;
+	void __iomem *msi_va;
+};
+
+struct imsic_global_config {
+	/*
+	 * MSI Target Address Scheme
+	 *
+	 * XLEN-1                                                12     0
+	 * |                                                     |     |
+	 * -------------------------------------------------------------
+	 * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
+	 * -------------------------------------------------------------
+	 */
+
+	/* Bits representing Guest index, HART index, and Group index */
+	u32 guest_index_bits;
+	u32 hart_index_bits;
+	u32 group_index_bits;
+	u32 group_index_shift;
+
+	/* Global base address matching all target MSI addresses */
+	phys_addr_t base_addr;
+
+	/* Number of interrupt identities */
+	u32 nr_ids;
+
+	/* Number of guest interrupt identities */
+	u32 nr_guest_ids;
+
+	/* Per-CPU IMSIC addresses */
+	struct imsic_local_config __percpu *local;
+};
+
+#ifdef CONFIG_RISCV_IMSIC
+
+extern const struct imsic_global_config *imsic_get_global_config(void);
+
+#else
+
+static inline const struct imsic_global_config *imsic_get_global_config(void)
+{
+	return NULL;
+}
+
+#endif
+
+#endif
-- 
2.34.1
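
Note (not part of the patch): the "MSI Target Address Scheme" comment in
struct imsic_global_config and the base address masking in imsic_init()
can be illustrated with a small user-space sketch. It assumes the target
address is formed by OR-ing the group, HART, and guest indices into the
base address at the bit positions shown in that comment. The names
struct msi_layout, msi_base(), and msi_target() are hypothetical and
exist only for this illustration; they are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* mirrors IMSIC_MMIO_PAGE_SHIFT from the patch */

/* Hypothetical type: the index fields of struct imsic_global_config */
struct msi_layout {
	uint32_t guest_index_bits;
	uint32_t hart_index_bits;
	uint32_t group_index_bits;
	uint32_t group_index_shift;
};

/*
 * Clear the page offset, guest index, HART index, and group index
 * fields of an address, the same masking imsic_init() applies when
 * computing global->base_addr.
 */
static uint64_t msi_base(const struct msi_layout *l, uint64_t addr)
{
	uint32_t low_bits = l->guest_index_bits + l->hart_index_bits +
			    PAGE_SHIFT;

	addr &= ~((UINT64_C(1) << low_bits) - 1);
	addr &= ~(((UINT64_C(1) << l->group_index_bits) - 1) <<
		  l->group_index_shift);
	return addr;
}

/*
 * Assumption: a target MSI page address is the base with each index
 * placed in its field (group, then HART, then guest, then page offset).
 */
static uint64_t msi_target(const struct msi_layout *l, uint64_t base,
			   uint32_t group, uint32_t hart, uint32_t guest)
{
	return base |
	       ((uint64_t)group << l->group_index_shift) |
	       ((uint64_t)hart << (l->guest_index_bits + PAGE_SHIFT)) |
	       ((uint64_t)guest << PAGE_SHIFT);
}
```

With one guest index bit, two HART index bits, and one group index bit
at shift 24, msi_base() applied to any address produced by msi_target()
returns the same base. That is the property imsic_init() relies on when
it rejects a regset with "address mismatch".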



* [PATCH v4 05/10] irqchip/riscv-imsic: Add support for PCI MSI irqdomain
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
                   ` (3 preceding siblings ...)
  2023-06-13 15:34 ` [PATCH v4 04/10] irqchip: Add RISC-V incoming MSI controller driver Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-13 15:34 ` [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support Anup Patel
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

The Linux PCI framework requires its own dedicated MSI irqdomain, so
create a PCI MSI irqdomain as a child of the IMSIC base irqdomain.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
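Note (not part of the commit message): the imsic_pci_mask_irq() and
imsic_pci_unmask_irq() wrappers in this patch act on the PCI device
first and then on the parent IMSIC base chip. A minimal user-space
sketch of that child-then-parent ordering follows. The struct irq_level
type, the trace buffer, and the child_mask()/child_unmask() helpers are
hypothetical stand-ins for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of one level in an irqdomain hierarchy */
struct irq_level {
	const char *name;
	bool masked;
	struct irq_level *parent;
};

/* Records the order in which levels were touched, e.g. "pci;base;" */
static char trace[64];

static void level_set_masked(struct irq_level *l, bool masked)
{
	l->masked = masked;
	strncat(trace, l->name, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

/* Mask at the child first, then forward to the parent level */
static void child_mask(struct irq_level *child)
{
	level_set_masked(child, true);	/* role of pci_msi_mask_irq() */
	if (child->parent)		/* role of irq_chip_mask_parent() */
		level_set_masked(child->parent, true);
}

/* Unmask in the same child-then-parent order used by the patch */
static void child_unmask(struct irq_level *child)
{
	level_set_masked(child, false);	/* role of pci_msi_unmask_irq() */
	if (child->parent)		/* role of irq_chip_unmask_parent() */
		level_set_masked(child->parent, false);
}
```

Calling child_mask() on a "pci" level whose parent is "base" leaves both
levels masked and appends "pci;base;" to the trace.
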
 drivers/irqchip/Kconfig           |  7 +++++
 drivers/irqchip/irq-riscv-imsic.c | 49 +++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 8ef18be5f37b..d700980372ef 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -550,6 +550,13 @@ config RISCV_IMSIC
 	select IRQ_DOMAIN_HIERARCHY
 	select GENERIC_MSI_IRQ
 
+config RISCV_IMSIC_PCI
+	bool
+	depends on RISCV_IMSIC
+	depends on PCI
+	depends on PCI_MSI
+	default RISCV_IMSIC
+
 config EXYNOS_IRQ_COMBINER
 	bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
 	depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
index 971fad638c9f..30247c84a6b0 100644
--- a/drivers/irqchip/irq-riscv-imsic.c
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -18,6 +18,7 @@
 #include <linux/module.h>
 #include <linux/msi.h>
 #include <linux/of_address.h>
+#include <linux/pci.h>
 #include <linux/platform_device.h>
 #include <linux/spinlock.h>
 #include <linux/smp.h>
@@ -81,6 +82,7 @@ struct imsic_priv {
 
 	/* IRQ domains */
 	struct irq_domain *base_domain;
+	struct irq_domain *pci_domain;
 	struct irq_domain *plat_domain;
 };
 
@@ -547,6 +549,39 @@ static const struct irq_domain_ops imsic_base_domain_ops = {
 	.free		= imsic_irq_domain_free,
 };
 
+#ifdef CONFIG_RISCV_IMSIC_PCI
+
+static void imsic_pci_mask_irq(struct irq_data *d)
+{
+	pci_msi_mask_irq(d);
+	irq_chip_mask_parent(d);
+}
+
+static void imsic_pci_unmask_irq(struct irq_data *d)
+{
+	pci_msi_unmask_irq(d);
+	irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip imsic_pci_irq_chip = {
+	.name			= "RISC-V IMSIC-PCI",
+	.irq_mask		= imsic_pci_mask_irq,
+	.irq_unmask		= imsic_pci_unmask_irq,
+	.irq_eoi		= irq_chip_eoi_parent,
+};
+
+static struct msi_domain_ops imsic_pci_domain_ops = {
+};
+
+static struct msi_domain_info imsic_pci_domain_info = {
+	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+		   MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
+	.ops	= &imsic_pci_domain_ops,
+	.chip	= &imsic_pci_irq_chip,
+};
+
+#endif
+
 static struct irq_chip imsic_plat_irq_chip = {
 	.name			= "RISC-V IMSIC-PLAT",
 };
@@ -571,12 +606,26 @@ static int __init imsic_irq_domains_init(struct fwnode_handle *fwnode)
 	}
 	irq_domain_update_bus_token(imsic->base_domain, DOMAIN_BUS_NEXUS);
 
+#ifdef CONFIG_RISCV_IMSIC_PCI
+	/* Create PCI MSI domain */
+	imsic->pci_domain = pci_msi_create_irq_domain(fwnode,
+						&imsic_pci_domain_info,
+						imsic->base_domain);
+	if (!imsic->pci_domain) {
+		pr_err("Failed to create IMSIC PCI domain\n");
+		irq_domain_remove(imsic->base_domain);
+		return -ENOMEM;
+	}
+#endif
+
 	/* Create Platform MSI domain */
 	imsic->plat_domain = platform_msi_create_irq_domain(fwnode,
 						&imsic_plat_domain_info,
 						imsic->base_domain);
 	if (!imsic->plat_domain) {
 		pr_err("Failed to create IMSIC platform domain\n");
+		if (imsic->pci_domain)
+			irq_domain_remove(imsic->pci_domain);
 		irq_domain_remove(imsic->base_domain);
 		return -ENOMEM;
 	}
-- 
2.34.1



* [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
                   ` (4 preceding siblings ...)
  2023-06-13 15:34 ` [PATCH v4 05/10] irqchip/riscv-imsic: Add support for PCI MSI irqdomain Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-14 14:46   ` Jason Gunthorpe
  2023-06-13 15:34 ` [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC Anup Patel
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel, Vincent Chen

We have a separate RISC-V IMSIC MSI address for each CPU, so changing
MSI (or IRQ) affinity results in re-programming of the MSI address in
the PCIe (or platform) device.

Currently, iommu_dma_prepare_msi() is called only once, at the time
of IRQ allocation, so the IOMMU DMA domain only has a mapping for one
MSI page. This means iommu_dma_compose_msi_msg(), called by
imsic_irq_compose_msi_msg(), always uses the same MSI page
irrespective of the target CPU's MSI address. In other words, changing
MSI (or IRQ) affinity for a device using an IOMMU DMA domain does not
work.

To address the above issue, we do the following:
1) Map MSI pages for all CPUs in imsic_irq_domain_alloc()
   using iommu_dma_prepare_msi().
2) Extend iommu_dma_compose_msi_msg() to look up the correct
   msi_page whenever the msi_page stored as the IOMMU cookie
   does not match.

Reported-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
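Note (not part of the commit message): the lookup described in step 2
of the commit message can be sketched in user space as follows. This is
a simplified model under stated assumptions: the MSI pages are kept in
a plain array instead of the cookie's msi_page_list, and struct
msi_page_model, struct msi_msg_model, find_msi_page(), and compose()
are hypothetical names, not the dma-iommu API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct iommu_dma_msi_page */
struct msi_page_model {
	uint64_t phys;	/* granule-aligned physical MSI page address */
	uint64_t iova;	/* IOVA the page was mapped at */
};

/* Hypothetical stand-in for the address part of struct msi_msg */
struct msi_msg_model {
	uint32_t address_hi;
	uint32_t address_lo;
};

/* Find the mapped page whose physical address matches msi_addr */
static const struct msi_page_model *
find_msi_page(const struct msi_page_model *pages, size_t n,
	      uint64_t granule, uint64_t msi_addr)
{
	uint64_t aligned = msi_addr & ~(granule - 1);

	for (size_t i = 0; i < n; i++) {
		if (pages[i].phys == aligned)
			return &pages[i];
	}
	return NULL;
}

/*
 * Rewrite the physical MSI address in msg to the matching IOVA,
 * keeping the offset within the granule. Returns 0 on success and
 * -1 (leaving msg untouched) when no page is mapped for the address.
 */
static int compose(const struct msi_page_model *pages, size_t n,
		   uint64_t granule, struct msi_msg_model *msg)
{
	uint64_t addr = ((uint64_t)msg->address_hi << 32) | msg->address_lo;
	const struct msi_page_model *page;

	page = find_msi_page(pages, n, granule, addr);
	if (!page)
		return -1;

	msg->address_hi = (uint32_t)(page->iova >> 32);
	msg->address_lo &= (uint32_t)(granule - 1);
	msg->address_lo += (uint32_t)page->iova;
	return 0;
}
```

Because step 1 maps one page per CPU up front, a lookup keyed on the
physical address in the message can succeed for whichever CPU the
affinity change selected, rather than reusing the page that was mapped
at allocation time.
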
 drivers/iommu/dma-iommu.c         | 24 +++++++++++++++++++++---
 drivers/irqchip/irq-riscv-imsic.c | 23 +++++++++++------------
 2 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7a9f0b0bddbd..df96bcccbe28 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1687,14 +1687,32 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
 	struct device *dev = msi_desc_to_dev(desc);
 	const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
 	const struct iommu_dma_msi_page *msi_page;
+	struct iommu_dma_cookie *cookie;
+	phys_addr_t msi_addr;
 
-	msi_page = msi_desc_get_iommu_cookie(desc);
+	if (!domain || !domain->iova_cookie)
+		return;
 
-	if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+	cookie = domain->iova_cookie;
+	msi_addr = ((u64)msg->address_hi << 32) | msg->address_lo;
+	msi_addr &= ~(phys_addr_t)(cookie_msi_granule(cookie) - 1);
+
+	msi_page = msi_desc_get_iommu_cookie(desc);
+	if (!msi_page || msi_page->phys != msi_addr) {
+		msi_desc_set_iommu_cookie(desc, NULL);
+		list_for_each_entry(msi_page, &cookie->msi_page_list, list) {
+			if (msi_page->phys == msi_addr) {
+				msi_desc_set_iommu_cookie(desc, msi_page);
+				break;
+			}
+		}
+		msi_page = msi_desc_get_iommu_cookie(desc);
+	}
+	if (WARN_ON(!msi_page))
 		return;
 
 	msg->address_hi = upper_32_bits(msi_page->iova);
-	msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+	msg->address_lo &= cookie_msi_granule(cookie) - 1;
 	msg->address_lo += lower_32_bits(msi_page->iova);
 }
 
diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
index 30247c84a6b0..19dedd036dd4 100644
--- a/drivers/irqchip/irq-riscv-imsic.c
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -493,11 +493,18 @@ static int imsic_irq_domain_alloc(struct irq_domain *domain,
 	int i, hwirq, err = 0;
 	unsigned int cpu;
 
-	err = imsic_get_cpu(&imsic->lmask, false, &cpu);
-	if (err)
-		return err;
+	/* Map MSI address of all CPUs */
+	for_each_cpu(cpu, &imsic->lmask) {
+		err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+		if (err)
+			return err;
+
+		err = iommu_dma_prepare_msi(info->desc, msi_addr);
+		if (err)
+			return err;
+	}
 
-	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+	err = imsic_get_cpu(&imsic->lmask, false, &cpu);
 	if (err)
 		return err;
 
@@ -505,10 +512,6 @@ static int imsic_irq_domain_alloc(struct irq_domain *domain,
 	if (hwirq < 0)
 		return hwirq;
 
-	err = iommu_dma_prepare_msi(info->desc, msi_addr);
-	if (err)
-		goto fail;
-
 	for (i = 0; i < nr_irqs; i++) {
 		imsic_id_set_target(hwirq + i, cpu);
 		irq_domain_set_info(domain, virq + i, hwirq + i,
@@ -528,10 +531,6 @@ static int imsic_irq_domain_alloc(struct irq_domain *domain,
 	}
 
 	return 0;
-
-fail:
-	imsic_ids_free(hwirq, get_count_order(nr_irqs));
-	return err;
 }
 
 static void imsic_irq_domain_free(struct irq_domain *domain,
-- 
2.34.1



* [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
                   ` (5 preceding siblings ...)
  2023-06-13 15:34 ` [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-14 19:27   ` Conor Dooley
  2023-06-13 15:34 ` [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver Anup Patel
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

Add a DT bindings document for the RISC-V advanced platform-level
interrupt controller (APLIC) defined by the RISC-V advanced interrupt
architecture (AIA) specification.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 .../interrupt-controller/riscv,aplic.yaml     | 169 ++++++++++++++++++
 1 file changed, 169 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
new file mode 100644
index 000000000000..e21de99b10a2
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
@@ -0,0 +1,169 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
+
+maintainers:
+  - Anup Patel <anup@brainfault.org>
+
+description:
+  The RISC-V advanced interrupt architecture (AIA) defines an advanced
+  platform level interrupt controller (APLIC) for handling wired interrupts
+  in a RISC-V platform. The RISC-V AIA specification can be found at
+  https://github.com/riscv/riscv-aia.
+
+  The RISC-V APLIC is implemented as hierarchical APLIC domains where all
+  interrupt sources connect to the root APLIC domain and a parent APLIC
+  domain can delegate interrupt sources to its child APLIC domains. There
+  is one device tree node for each APLIC domain.
+
+allOf:
+  - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+  compatible:
+    items:
+      - enum:
+          - qemu,aplic
+      - const: riscv,aplic
+
+  reg:
+    maxItems: 1
+
+  interrupt-controller: true
+
+  "#interrupt-cells":
+    const: 2
+
+  interrupts-extended:
+    minItems: 1
+    maxItems: 16384
+    description:
+      This APLIC domain directly injects external interrupts into a set of
+      RISC-V HARTs (or CPUs). Each node pointed to should be a riscv,cpu-intc
+      node, which has a CPU node (i.e. RISC-V HART) as parent.
+
+  msi-parent:
+    description:
+      Given APLIC domain forwards wired interrupts as MSIs to an AIA incoming
+      message signaled interrupt controller (IMSIC). If both "msi-parent" and
+      "interrupts-extended" properties are present then it means the APLIC
+      domain supports both MSI mode and Direct mode in HW. In this case, the
+      APLIC driver has to choose between MSI mode or Direct mode.
+
+  riscv,num-sources:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 1
+    maximum: 1023
+    description:
+      Specifies the number of wired interrupt sources supported by this
+      APLIC domain.
+
+  riscv,children:
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    minItems: 1
+    maxItems: 1024
+    items:
+      maxItems: 1
+    description:
+      A list of child APLIC domains for the given APLIC domain. Each child
+      APLIC domain is assigned a child index in increasing order, with the
+      first child APLIC domain assigned child index 0. The APLIC domain child
+      index is used by firmware to delegate interrupts from the given APLIC
+      domain to a particular child APLIC domain.
+
+  riscv,delegation:
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    minItems: 1
+    maxItems: 1024
+    items:
+      items:
+        - description: child APLIC domain phandle
+        - description: first interrupt number of the parent APLIC domain (inclusive)
+        - description: last interrupt number of the parent APLIC domain (inclusive)
+    description:
+      An interrupt delegation list where each entry is a triple consisting
+      of child APLIC domain phandle, first interrupt number of the parent
+      APLIC domain, and last interrupt number of the parent APLIC domain.
+      Firmware must configure interrupt delegation registers based on
+      this interrupt delegation list.
+
+required:
+  - compatible
+  - reg
+  - interrupt-controller
+  - "#interrupt-cells"
+  - riscv,num-sources
+
+anyOf:
+  - required:
+      - interrupts-extended
+  - required:
+      - msi-parent
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    // Example 1 (APLIC domains directly injecting interrupts into HARTs):
+
+    interrupt-controller@c000000 {
+      compatible = "qemu,aplic", "riscv,aplic";
+      interrupts-extended = <&cpu1_intc 11>,
+                            <&cpu2_intc 11>,
+                            <&cpu3_intc 11>,
+                            <&cpu4_intc 11>;
+      reg = <0xc000000 0x4080>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+      riscv,children = <&aplic1>, <&aplic2>;
+      riscv,delegation = <&aplic1 1 63>;
+    };
+
+    aplic1: interrupt-controller@d000000 {
+      compatible = "qemu,aplic", "riscv,aplic";
+      interrupts-extended = <&cpu1_intc 9>,
+                            <&cpu2_intc 9>;
+      reg = <0xd000000 0x4080>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+    };
+
+    aplic2: interrupt-controller@e000000 {
+      compatible = "qemu,aplic", "riscv,aplic";
+      interrupts-extended = <&cpu3_intc 9>,
+                            <&cpu4_intc 9>;
+      reg = <0xe000000 0x4080>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+    };
+
+  - |
+    // Example 2 (APLIC domains forwarding interrupts as MSIs):
+
+    interrupt-controller@c000000 {
+      compatible = "qemu,aplic", "riscv,aplic";
+      msi-parent = <&imsic_mlevel>;
+      reg = <0xc000000 0x4000>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+      riscv,children = <&aplic3>;
+      riscv,delegation = <&aplic3 1 63>;
+    };
+
+    aplic3: interrupt-controller@d000000 {
+      compatible = "qemu,aplic", "riscv,aplic";
+      msi-parent = <&imsic_slevel>;
+      reg = <0xd000000 0x4000>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+    };
+...
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
                   ` (6 preceding siblings ...)
  2023-06-13 15:34 ` [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-15 19:17   ` Saravana Kannan
  2023-06-13 15:34 ` [PATCH v4 09/10] RISC-V: Select APLIC and IMSIC drivers Anup Patel
  2023-06-13 15:34 ` [PATCH v4 10/10] MAINTAINERS: Add entry for RISC-V AIA drivers Anup Patel
  9 siblings, 1 reply; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

The RISC-V advanced interrupt architecture (AIA) specification defines
a new interrupt controller for managing wired interrupts on a RISC-V
platform. This new interrupt controller is referred to as advanced
platform-level interrupt controller (APLIC) which can forward wired
interrupts to CPUs (or HARTs) as local interrupts OR as message
signaled interrupts.
(For more details, refer to https://github.com/riscv/riscv-aia)

This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
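Note on the per-source target registers: the driver in this patch
writes one 32-bit target register per interrupt source, at
APLIC_TARGET_BASE + (hwirq - 1) * 4, and packs it differently in Direct
mode (hart index plus priority) and in MSI mode (hart index, guest
index and EIID). The sketch below restates that packing as standalone
C so the layout is easier to check. The function names are
hypothetical and not part of the driver; the constants mirror the
APLIC_TARGET_* definitions in riscv-aplic.h.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the APLIC_TARGET_* definitions in riscv-aplic.h. */
#define TARGET_BASE            0x3004u
#define TARGET_HART_IDX_SHIFT  18
#define TARGET_HART_IDX_MASK   0x3fffu
#define TARGET_GUEST_IDX_SHIFT 12
#define TARGET_GUEST_IDX_MASK  0x3fu
#define TARGET_IPRIO_MASK      0xffu
#define TARGET_EIID_MASK       0x7ffu

/* Register offset of the target register for a source (hwirq >= 1). */
static uint32_t aplic_target_offset(uint32_t hwirq)
{
	assert(hwirq >= 1);
	return TARGET_BASE + (hwirq - 1) * (uint32_t)sizeof(uint32_t);
}

/* Direct mode: hart index in the upper bits, priority in the low bits. */
static uint32_t aplic_target_direct(uint32_t hart_index, uint32_t prio)
{
	assert(hart_index <= TARGET_HART_IDX_MASK);
	assert(prio <= TARGET_IPRIO_MASK);
	return (hart_index << TARGET_HART_IDX_SHIFT) | prio;
}

/* MSI mode: hart index, guest index and external interrupt ID (EIID). */
static uint32_t aplic_target_msi(uint32_t hart_index, uint32_t guest_index,
				 uint32_t eiid)
{
	assert(hart_index <= TARGET_HART_IDX_MASK);
	assert(guest_index <= TARGET_GUEST_IDX_MASK);
	assert(eiid <= TARGET_EIID_MASK);
	return (hart_index << TARGET_HART_IDX_SHIFT) |
	       (guest_index << TARGET_GUEST_IDX_SHIFT) | eiid;
}
```

For example, hart index 2 with the driver's default priority of 1
would pack to 0x80001 in Direct mode, and the register for source 63
would sit at offset 0x30fc.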
 drivers/irqchip/Kconfig             |   6 +
 drivers/irqchip/Makefile            |   1 +
 drivers/irqchip/irq-riscv-aplic.c   | 765 ++++++++++++++++++++++++++++
 include/linux/irqchip/riscv-aplic.h | 119 +++++
 4 files changed, 891 insertions(+)
 create mode 100644 drivers/irqchip/irq-riscv-aplic.c
 create mode 100644 include/linux/irqchip/riscv-aplic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index d700980372ef..834c0329f583 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -544,6 +544,12 @@ config SIFIVE_PLIC
 	select IRQ_DOMAIN_HIERARCHY
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
 
+config RISCV_APLIC
+	bool
+	depends on RISCV
+	select IRQ_DOMAIN_HIERARCHY
+	select GENERIC_MSI_IRQ
+
 config RISCV_IMSIC
 	bool
 	depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 577bde3e986b..438b8e1a152c 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)			+= irq-qcom-mpm.o
 obj-$(CONFIG_CSKY_MPINTC)		+= irq-csky-mpintc.o
 obj-$(CONFIG_CSKY_APB_INTC)		+= irq-csky-apb-intc.o
 obj-$(CONFIG_RISCV_INTC)		+= irq-riscv-intc.o
+obj-$(CONFIG_RISCV_APLIC)		+= irq-riscv-aplic.o
 obj-$(CONFIG_RISCV_IMSIC)		+= irq-riscv-imsic.o
 obj-$(CONFIG_SIFIVE_PLIC)		+= irq-sifive-plic.o
 obj-$(CONFIG_IMX_IRQSTEER)		+= irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
new file mode 100644
index 000000000000..1e710fdf5608
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic.c
@@ -0,0 +1,765 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-aplic: " fmt
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/platform_device.h>
+#include <linux/smp.h>
+
+#define APLIC_DEFAULT_PRIORITY		1
+#define APLIC_DISABLE_IDELIVERY		0
+#define APLIC_ENABLE_IDELIVERY		1
+#define APLIC_DISABLE_ITHRESHOLD	1
+#define APLIC_ENABLE_ITHRESHOLD		0
+
+struct aplic_msicfg {
+	phys_addr_t		base_ppn;
+	u32			hhxs;
+	u32			hhxw;
+	u32			lhxs;
+	u32			lhxw;
+};
+
+struct aplic_idc {
+	unsigned int		hart_index;
+	void __iomem		*regs;
+	struct aplic_priv	*priv;
+};
+
+struct aplic_priv {
+	struct fwnode_handle	*fwnode;
+	u32			gsi_base;
+	u32			nr_irqs;
+	u32			nr_idcs;
+	void __iomem		*regs;
+	struct irq_domain	*irqdomain;
+	struct aplic_msicfg	msicfg;
+	struct cpumask		lmask;
+};
+
+static unsigned int aplic_idc_parent_irq;
+static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
+
+static void aplic_irq_unmask(struct irq_data *d)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	writel(d->hwirq, priv->regs + APLIC_SETIENUM);
+
+	if (!priv->nr_idcs)
+		irq_chip_unmask_parent(d);
+}
+
+static void aplic_irq_mask(struct irq_data *d)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
+
+	if (!priv->nr_idcs)
+		irq_chip_mask_parent(d);
+}
+
+static int aplic_set_type(struct irq_data *d, unsigned int type)
+{
+	u32 val = 0;
+	void __iomem *sourcecfg;
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	switch (type) {
+	case IRQ_TYPE_NONE:
+		val = APLIC_SOURCECFG_SM_INACTIVE;
+		break;
+	case IRQ_TYPE_LEVEL_LOW:
+		val = APLIC_SOURCECFG_SM_LEVEL_LOW;
+		break;
+	case IRQ_TYPE_LEVEL_HIGH:
+		val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
+		break;
+	case IRQ_TYPE_EDGE_FALLING:
+		val = APLIC_SOURCECFG_SM_EDGE_FALL;
+		break;
+	case IRQ_TYPE_EDGE_RISING:
+		val = APLIC_SOURCECFG_SM_EDGE_RISE;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
+	sourcecfg += (d->hwirq - 1) * sizeof(u32);
+	writel(val, sourcecfg);
+
+	return 0;
+}
+
+static void aplic_irq_eoi(struct irq_data *d)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+	u32 reg_off, reg_mask;
+
+	/*
+	 * EOI handling is required only for level-triggered
+	 * interrupts in APLIC MSI mode.
+	 */
+
+	if (priv->nr_idcs)
+		return;
+
+	reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
+	reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
+	switch (irqd_get_trigger_type(d)) {
+	case IRQ_TYPE_LEVEL_LOW:
+		if (!(readl(priv->regs + reg_off) & reg_mask))
+			writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+		break;
+	case IRQ_TYPE_LEVEL_HIGH:
+		if (readl(priv->regs + reg_off) & reg_mask)
+			writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+		break;
+	}
+}
+
+#ifdef CONFIG_SMP
+static int aplic_set_affinity(struct irq_data *d,
+			      const struct cpumask *mask_val, bool force)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+	struct aplic_idc *idc;
+	unsigned int cpu, val;
+	struct cpumask amask;
+	void __iomem *target;
+
+	if (!priv->nr_idcs)
+		return irq_chip_set_affinity_parent(d, mask_val, force);
+
+	cpumask_and(&amask, &priv->lmask, mask_val);
+
+	if (force)
+		cpu = cpumask_first(&amask);
+	else
+		cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	idc = per_cpu_ptr(&aplic_idcs, cpu);
+	target = priv->regs + APLIC_TARGET_BASE;
+	target += (d->hwirq - 1) * sizeof(u32);
+	val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+	val <<= APLIC_TARGET_HART_IDX_SHIFT;
+	val |= APLIC_DEFAULT_PRIORITY;
+	writel(val, target);
+
+	irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+	return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip aplic_chip = {
+	.name		= "RISC-V APLIC",
+	.irq_mask	= aplic_irq_mask,
+	.irq_unmask	= aplic_irq_unmask,
+	.irq_set_type	= aplic_set_type,
+	.irq_eoi	= aplic_irq_eoi,
+#ifdef CONFIG_SMP
+	.irq_set_affinity = aplic_set_affinity,
+#endif
+	.flags		= IRQCHIP_SET_TYPE_MASKED |
+			  IRQCHIP_SKIP_SET_WAKE |
+			  IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
+				     u32 gsi_base,
+				     unsigned long *hwirq,
+				     unsigned int *type)
+{
+	if (WARN_ON(fwspec->param_count < 2))
+		return -EINVAL;
+	if (WARN_ON(!fwspec->param[0]))
+		return -EINVAL;
+
+	/* For DT, gsi_base is always zero. */
+	*hwirq = fwspec->param[0] - gsi_base;
+	*type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+
+	WARN_ON(*type == IRQ_TYPE_NONE);
+
+	return 0;
+}
+
+static int aplic_irqdomain_msi_translate(struct irq_domain *d,
+					 struct irq_fwspec *fwspec,
+					 unsigned long *hwirq,
+					 unsigned int *type)
+{
+	struct aplic_priv *priv = platform_msi_get_host_data(d);
+
+	return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
+}
+
+static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *arg)
+{
+	int i, ret;
+	unsigned int type;
+	irq_hw_number_t hwirq;
+	struct irq_fwspec *fwspec = arg;
+	struct aplic_priv *priv = platform_msi_get_host_data(domain);
+
+	ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
+	if (ret)
+		return ret;
+
+	ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_domain_set_info(domain, virq + i, hwirq + i,
+				    &aplic_chip, priv, handle_fasteoi_irq,
+				    NULL, NULL);
+		/*
+		 * APLIC does not implement irq_disable() so Linux interrupt
+		 * subsystem will take a lazy approach for disabling an APLIC
+		 * interrupt. This means APLIC interrupts are left unmasked
+		 * upon system suspend and interrupts are not processed
+		 * immediately upon system wake up. To tackle this, we disable
+		 * the lazy approach for all APLIC interrupts.
+		 */
+		irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+	}
+
+	return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
+	.translate	= aplic_irqdomain_msi_translate,
+	.alloc		= aplic_irqdomain_msi_alloc,
+	.free		= platform_msi_device_domain_free,
+};
+
+static int aplic_irqdomain_idc_translate(struct irq_domain *d,
+					 struct irq_fwspec *fwspec,
+					 unsigned long *hwirq,
+					 unsigned int *type)
+{
+	struct aplic_priv *priv = d->host_data;
+
+	return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
+}
+
+static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *arg)
+{
+	int i, ret;
+	unsigned int type;
+	irq_hw_number_t hwirq;
+	struct irq_fwspec *fwspec = arg;
+	struct aplic_priv *priv = domain->host_data;
+
+	ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_domain_set_info(domain, virq + i, hwirq + i,
+				    &aplic_chip, priv, handle_fasteoi_irq,
+				    NULL, NULL);
+		irq_set_affinity(virq + i, &priv->lmask);
+		/* See the reason described in aplic_irqdomain_msi_alloc() */
+		irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+	}
+
+	return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
+	.translate	= aplic_irqdomain_idc_translate,
+	.alloc		= aplic_irqdomain_idc_alloc,
+	.free		= irq_domain_free_irqs_top,
+};
+
+static void aplic_init_hw_irqs(struct aplic_priv *priv)
+{
+	int i;
+
+	/* Disable all interrupts */
+	for (i = 0; i <= priv->nr_irqs; i += 32)
+		writel(-1U, priv->regs + APLIC_CLRIE_BASE +
+			    (i / 32) * sizeof(u32));
+
+	/* Set interrupt type and default priority for all interrupts */
+	for (i = 1; i <= priv->nr_irqs; i++) {
+		writel(0, priv->regs + APLIC_SOURCECFG_BASE +
+			  (i - 1) * sizeof(u32));
+		writel(APLIC_DEFAULT_PRIORITY,
+		       priv->regs + APLIC_TARGET_BASE +
+		       (i - 1) * sizeof(u32));
+	}
+
+	/* Clear APLIC domaincfg */
+	writel(0, priv->regs + APLIC_DOMAINCFG);
+}
+
+static void aplic_init_hw_global(struct aplic_priv *priv)
+{
+	u32 val;
+#ifdef CONFIG_RISCV_M_MODE
+	u32 valH;
+
+	if (!priv->nr_idcs) {
+		val = priv->msicfg.base_ppn;
+		valH = (priv->msicfg.base_ppn >> 32) &
+			APLIC_xMSICFGADDRH_BAPPN_MASK;
+		valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
+			<< APLIC_xMSICFGADDRH_LHXW_SHIFT;
+		valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
+			<< APLIC_xMSICFGADDRH_HHXW_SHIFT;
+		valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
+			<< APLIC_xMSICFGADDRH_LHXS_SHIFT;
+		valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
+			<< APLIC_xMSICFGADDRH_HHXS_SHIFT;
+		writel(val, priv->regs + APLIC_xMSICFGADDR);
+		writel(valH, priv->regs + APLIC_xMSICFGADDRH);
+	}
+#endif
+
+	/* Setup APLIC domaincfg register */
+	val = readl(priv->regs + APLIC_DOMAINCFG);
+	val |= APLIC_DOMAINCFG_IE;
+	if (!priv->nr_idcs)
+		val |= APLIC_DOMAINCFG_DM;
+	writel(val, priv->regs + APLIC_DOMAINCFG);
+	if (readl(priv->regs + APLIC_DOMAINCFG) != val)
+		pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
+			priv->fwnode, val);
+}
+
+static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+	unsigned int group_index, hart_index, guest_index, val;
+	struct irq_data *d = irq_get_irq_data(desc->irq);
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+	struct aplic_msicfg *mc = &priv->msicfg;
+	phys_addr_t tppn, tbppn, msg_addr;
+	void __iomem *target;
+
+	/* For zeroed MSI, simply write zero into the target register */
+	if (!msg->address_hi && !msg->address_lo && !msg->data) {
+		target = priv->regs + APLIC_TARGET_BASE;
+		target += (d->hwirq - 1) * sizeof(u32);
+		writel(0, target);
+		return;
+	}
+
+	/* Sanity check on message data */
+	WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
+
+	/* Compute target MSI address */
+	msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
+	tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+
+	/* Compute target HART Base PPN */
+	tbppn = tppn;
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+	WARN_ON(tbppn != mc->base_ppn);
+
+	/* Compute target group and hart indexes */
+	group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
+		     APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
+	hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
+		     APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
+	hart_index |= (group_index << mc->lhxw);
+	WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
+
+	/* Compute target guest index */
+	guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
+
+	/* Update IRQ TARGET register */
+	target = priv->regs + APLIC_TARGET_BASE;
+	target += (d->hwirq - 1) * sizeof(u32);
+	val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
+				<< APLIC_TARGET_HART_IDX_SHIFT;
+	val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
+				<< APLIC_TARGET_GUEST_IDX_SHIFT;
+	val |= (msg->data & APLIC_TARGET_EIID_MASK);
+	writel(val, target);
+}
+
+static int aplic_setup_msi(struct aplic_priv *priv)
+{
+	struct aplic_msicfg *mc = &priv->msicfg;
+	const struct imsic_global_config *imsic_global;
+
+	/*
+	 * The APLIC outgoing MSI config registers assume target MSI
+	 * controller to be RISC-V AIA IMSIC controller.
+	 */
+	imsic_global = imsic_get_global_config();
+	if (!imsic_global) {
+		pr_err("%pfwP: IMSIC global config not found\n",
+			priv->fwnode);
+		return -ENODEV;
+	}
+
+	/* Find number of guest index bits (LHXS) */
+	mc->lhxs = imsic_global->guest_index_bits;
+	if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
+		pr_err("%pfwP: IMSIC guest index bits too big for APLIC LHXS\n",
+			priv->fwnode);
+		return -EINVAL;
+	}
+
+	/* Find number of HART index bits (LHXW) */
+	mc->lhxw = imsic_global->hart_index_bits;
+	if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
+		pr_err("%pfwP: IMSIC hart index bits too big for APLIC LHXW\n",
+			priv->fwnode);
+		return -EINVAL;
+	}
+
+	/* Find number of group index bits (HHXW) */
+	mc->hhxw = imsic_global->group_index_bits;
+	if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
+		pr_err("%pfwP: IMSIC group index bits too big for APLIC HHXW\n",
+			priv->fwnode);
+		return -EINVAL;
+	}
+
+	/* Find first bit position of group index (HHXS) */
+	mc->hhxs = imsic_global->group_index_shift;
+	if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
+		pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
+			priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
+		return -EINVAL;
+	}
+	mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
+	if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
+		pr_err("%pfwP: IMSIC group index shift too big for APLIC HHXS\n",
+			priv->fwnode);
+		return -EINVAL;
+	}
+
+	/* Compute PPN base */
+	mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+
+	/* Use all possible CPUs as lmask */
+	cpumask_copy(&priv->lmask, cpu_possible_mask);
+
+	return 0;
+}
+
+/*
+ * To handle APLIC IDC interrupts, we just read the CLAIMI register,
+ * which returns the highest priority pending interrupt and clears the
+ * pending bit of that interrupt. This process is repeated until the
+ * CLAIMI register returns zero.
+ */
+static void aplic_idc_handle_irq(struct irq_desc *desc)
+{
+	struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	irq_hw_number_t hw_irq;
+	int irq;
+
+	chained_irq_enter(chip, desc);
+
+	while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
+		hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
+		irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
+
+		if (unlikely(irq <= 0))
+			pr_warn_ratelimited("hw_irq %lu mapping not found\n",
+					    hw_irq);
+		else
+			generic_handle_irq(irq);
+	}
+
+	chained_irq_exit(chip, desc);
+}
+
+static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
+{
+	u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
+	u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
+
+	/* Priority must be less than threshold for interrupt triggering */
+	writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
+
+	/* Delivery must be set to 1 for interrupt triggering */
+	writel(de, idc->regs + APLIC_IDC_IDELIVERY);
+}
+
+static int aplic_idc_dying_cpu(unsigned int cpu)
+{
+	if (aplic_idc_parent_irq)
+		disable_percpu_irq(aplic_idc_parent_irq);
+
+	return 0;
+}
+
+static int aplic_idc_starting_cpu(unsigned int cpu)
+{
+	if (aplic_idc_parent_irq)
+		enable_percpu_irq(aplic_idc_parent_irq,
+				  irq_get_trigger_type(aplic_idc_parent_irq));
+
+	return 0;
+}
+
+static int aplic_setup_idc(struct aplic_priv *priv)
+{
+	int i, j, rc, cpu, setup_count = 0;
+	struct fwnode_reference_args parent;
+	struct irq_domain *domain;
+	unsigned long hartid;
+	struct aplic_idc *idc;
+	u32 val;
+
+	/* Setup per-CPU IDC and target CPU mask */
+	for (i = 0; i < priv->nr_idcs; i++) {
+		rc = fwnode_property_get_reference_args(priv->fwnode,
+				"interrupts-extended", "#interrupt-cells",
+				0, i, &parent);
+		if (rc) {
+			pr_warn("%pfwP: parent irq for IDC%d not found\n",
+				priv->fwnode, i);
+			continue;
+		}
+
+		/*
+		 * Skip interrupts other than external interrupts for
+		 * current privilege level.
+		 */
+		if (parent.args[0] != RV_IRQ_EXT)
+			continue;
+
+		rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
+		if (rc) {
+			pr_warn("%pfwP: invalid hartid for IDC%d\n",
+				priv->fwnode, i);
+			continue;
+		}
+
+		cpu = riscv_hartid_to_cpuid(hartid);
+		if (cpu < 0) {
+			pr_warn("%pfwP: invalid cpuid for IDC%d\n",
+				priv->fwnode, i);
+			continue;
+		}
+
+		cpumask_set_cpu(cpu, &priv->lmask);
+
+		idc = per_cpu_ptr(&aplic_idcs, cpu);
+		idc->hart_index = i;
+		idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
+		idc->priv = priv;
+
+		aplic_idc_set_delivery(idc, true);
+
+		/*
+		 * Boot cpu might not have APLIC hart_index = 0 so check
+		 * and update target registers of all interrupts.
+		 */
+		if (cpu == smp_processor_id() && idc->hart_index) {
+			val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+			val <<= APLIC_TARGET_HART_IDX_SHIFT;
+			val |= APLIC_DEFAULT_PRIORITY;
+			for (j = 1; j <= priv->nr_irqs; j++)
+				writel(val, priv->regs + APLIC_TARGET_BASE +
+					    (j - 1) * sizeof(u32));
+		}
+
+		setup_count++;
+	}
+
+	/* Find parent domain and register chained handler */
+	domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+					  DOMAIN_BUS_ANY);
+	if (!aplic_idc_parent_irq && domain) {
+		aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+		if (aplic_idc_parent_irq) {
+			irq_set_chained_handler(aplic_idc_parent_irq,
+						aplic_idc_handle_irq);
+
+			/*
+			 * Setup CPUHP notifier to enable IDC parent
+			 * interrupt on all CPUs
+			 */
+			cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+					  "irqchip/riscv/aplic:starting",
+					  aplic_idc_starting_cpu,
+					  aplic_idc_dying_cpu);
+		}
+	}
+
+	/* Fail if we were not able to setup IDC for any CPU */
+	return (setup_count) ? 0 : -ENODEV;
+}
+
+static int aplic_probe(struct platform_device *pdev)
+{
+	struct fwnode_handle *fwnode = pdev->dev.fwnode;
+	struct fwnode_reference_args parent;
+	struct aplic_priv *priv;
+	struct resource *res;
+	phys_addr_t pa;
+	int rc;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+	priv->fwnode = fwnode;
+
+	/* Map the MMIO registers */
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
+		return -EINVAL;
+	}
+	priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
+	if (!priv->regs) {
+		pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
+		return -ENOMEM;
+	}
+
+	/*
+	 * Find out GSI base number
+	 *
+	 * Note: DT does not define "riscv,gsi-base" property so GSI
+	 * base is always zero for DT.
+	 */
+	rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
+					    &priv->gsi_base, 1);
+	if (rc)
+		priv->gsi_base = 0;
+
+	/* Find out number of interrupt sources */
+	rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
+					    &priv->nr_irqs, 1);
+	if (rc) {
+		pr_err("%pfwP: failed to get number of interrupt sources\n",
+			fwnode);
+		return rc;
+	}
+
+	/* Set up the initial state of APLIC interrupts */
+	aplic_init_hw_irqs(priv);
+
+	/*
+	 * Find out number of IDCs based on parent interrupts
+	 *
+	 * If "msi-parent" property is present then we ignore the
+	 * APLIC IDCs which forces the APLIC driver to use MSI mode.
+	 */
+	if (!fwnode_property_present(fwnode, "msi-parent")) {
+		while (!fwnode_property_get_reference_args(fwnode,
+				"interrupts-extended", "#interrupt-cells",
+				0, priv->nr_idcs, &parent))
+			priv->nr_idcs++;
+	}
+
+	/* Setup IDCs or MSIs based on number of IDCs */
+	if (priv->nr_idcs)
+		rc = aplic_setup_idc(priv);
+	else
+		rc = aplic_setup_msi(priv);
+	if (rc) {
+		pr_err("%pfwP: failed to set up %s\n",
+			fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
+		return rc;
+	}
+
+	/* Setup global config and interrupt delivery */
+	aplic_init_hw_global(priv);
+
+	/* Create irq domain instance for the APLIC */
+	if (priv->nr_idcs)
+		priv->irqdomain = irq_domain_create_linear(
+						priv->fwnode,
+						priv->nr_irqs + 1,
+						&aplic_irqdomain_idc_ops,
+						priv);
+	else
+		priv->irqdomain = platform_msi_create_device_domain(
+						&pdev->dev,
+						priv->nr_irqs + 1,
+						aplic_msi_write_msg,
+						&aplic_irqdomain_msi_ops,
+						priv);
+	if (!priv->irqdomain) {
+		pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
+		return -ENOMEM;
+	}
+
+	/* Advertise the interrupt controller */
+	if (priv->nr_idcs) {
+		pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
+			priv->fwnode, priv->nr_irqs, priv->nr_idcs);
+	} else {
+		pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
+		pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
+			priv->fwnode, priv->nr_irqs, &pa);
+	}
+
+	return 0;
+}
+
+static const struct of_device_id aplic_match[] = {
+	{ .compatible = "riscv,aplic" },
+	{}
+};
+
+static struct platform_driver aplic_driver = {
+	.driver = {
+		.name		= "riscv-aplic",
+		.of_match_table	= aplic_match,
+	},
+	.probe = aplic_probe,
+};
+builtin_platform_driver(aplic_driver);
+
+static int __init aplic_dt_init(struct device_node *node,
+				struct device_node *parent)
+{
+	/*
+	 * The APLIC platform driver needs to be probed early
+	 * so for device tree:
+	 *
+	 * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
+	 *    provides a hint to the device driver core to probe the
+	 *    platform driver early.
+	 * 2) Clear the OF_POPULATED flag in device_node because
+	 *    of_irq_init() sets it which prevents creation of
+	 *    platform device.
+	 */
+	node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
+	of_node_clear_flag(node, OF_POPULATED);
+	return 0;
+}
+IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
new file mode 100644
index 000000000000..97e198ea0109
--- /dev/null
+++ b/include/linux/irqchip/riscv-aplic.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
+#define __LINUX_IRQCHIP_RISCV_APLIC_H
+
+#include <linux/bitops.h>
+
+#define APLIC_MAX_IDC			BIT(14)
+#define APLIC_MAX_SOURCE		1024
+
+#define APLIC_DOMAINCFG			0x0000
+#define APLIC_DOMAINCFG_RDONLY		0x80000000
+#define APLIC_DOMAINCFG_IE		BIT(8)
+#define APLIC_DOMAINCFG_DM		BIT(2)
+#define APLIC_DOMAINCFG_BE		BIT(0)
+
+#define APLIC_SOURCECFG_BASE		0x0004
+#define APLIC_SOURCECFG_D		BIT(10)
+#define APLIC_SOURCECFG_CHILDIDX_MASK	0x000003ff
+#define APLIC_SOURCECFG_SM_MASK	0x00000007
+#define APLIC_SOURCECFG_SM_INACTIVE	0x0
+#define APLIC_SOURCECFG_SM_DETACH	0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE	0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL	0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH	0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW	0x7
+
+#define APLIC_MMSICFGADDR		0x1bc0
+#define APLIC_MMSICFGADDRH		0x1bc4
+#define APLIC_SMSICFGADDR		0x1bc8
+#define APLIC_SMSICFGADDRH		0x1bcc
+
+#ifdef CONFIG_RISCV_M_MODE
+#define APLIC_xMSICFGADDR		APLIC_MMSICFGADDR
+#define APLIC_xMSICFGADDRH		APLIC_MMSICFGADDRH
+#else
+#define APLIC_xMSICFGADDR		APLIC_SMSICFGADDR
+#define APLIC_xMSICFGADDRH		APLIC_SMSICFGADDRH
+#endif
+
+#define APLIC_xMSICFGADDRH_L		BIT(31)
+#define APLIC_xMSICFGADDRH_HHXS_MASK	0x1f
+#define APLIC_xMSICFGADDRH_HHXS_SHIFT	24
+#define APLIC_xMSICFGADDRH_LHXS_MASK	0x7
+#define APLIC_xMSICFGADDRH_LHXS_SHIFT	20
+#define APLIC_xMSICFGADDRH_HHXW_MASK	0x7
+#define APLIC_xMSICFGADDRH_HHXW_SHIFT	16
+#define APLIC_xMSICFGADDRH_LHXW_MASK	0xf
+#define APLIC_xMSICFGADDRH_LHXW_SHIFT	12
+#define APLIC_xMSICFGADDRH_BAPPN_MASK	0xfff
+
+#define APLIC_xMSICFGADDR_PPN_SHIFT	12
+
+#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
+	(BIT(__lhxs) - 1)
+
+#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
+	(BIT(__lhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
+	((__lhxs))
+#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
+	(APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
+	 APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
+
+#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
+	(BIT(__hhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
+	((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
+#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
+	(APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
+	 APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
+
+#define APLIC_IRQBITS_PER_REG		32
+
+#define APLIC_SETIP_BASE		0x1c00
+#define APLIC_SETIPNUM			0x1cdc
+
+#define APLIC_CLRIP_BASE		0x1d00
+#define APLIC_CLRIPNUM			0x1ddc
+
+#define APLIC_SETIE_BASE		0x1e00
+#define APLIC_SETIENUM			0x1edc
+
+#define APLIC_CLRIE_BASE		0x1f00
+#define APLIC_CLRIENUM			0x1fdc
+
+#define APLIC_SETIPNUM_LE		0x2000
+#define APLIC_SETIPNUM_BE		0x2004
+
+#define APLIC_GENMSI			0x3000
+
+#define APLIC_TARGET_BASE		0x3004
+#define APLIC_TARGET_HART_IDX_SHIFT	18
+#define APLIC_TARGET_HART_IDX_MASK	0x3fff
+#define APLIC_TARGET_GUEST_IDX_SHIFT	12
+#define APLIC_TARGET_GUEST_IDX_MASK	0x3f
+#define APLIC_TARGET_IPRIO_MASK	0xff
+#define APLIC_TARGET_EIID_MASK	0x7ff
+
+#define APLIC_IDC_BASE			0x4000
+#define APLIC_IDC_SIZE			32
+
+#define APLIC_IDC_IDELIVERY		0x00
+
+#define APLIC_IDC_IFORCE		0x04
+
+#define APLIC_IDC_ITHRESHOLD		0x08
+
+#define APLIC_IDC_TOPI			0x18
+#define APLIC_IDC_TOPI_ID_SHIFT	16
+#define APLIC_IDC_TOPI_ID_MASK	0x3ff
+#define APLIC_IDC_TOPI_PRIO_MASK	0xff
+
+#define APLIC_IDC_CLAIMI		0x1c
+
+#endif
-- 
2.34.1
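
A minimal sketch, not part of the patch, of how the register layout in
riscv-aplic.h is meant to be used. It assumes what aplic_set_type() and
aplic_set_affinity() in irq-riscv-aplic.c do: sourcecfg and target are
arrays of 32-bit words indexed from source 1, and the direct-mode target
value places the hart index above the priority. The macro values are
copied from the header above; the example_*() helpers are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the riscv-aplic.h hunk in this patch. */
#define APLIC_SOURCECFG_BASE		0x0004
#define APLIC_TARGET_BASE		0x3004
#define APLIC_TARGET_HART_IDX_SHIFT	18
#define APLIC_TARGET_HART_IDX_MASK	0x3fff
#define APLIC_TARGET_IPRIO_MASK		0xff
#define APLIC_xMSICFGADDRH_LHXS_MASK	0x7
#define APLIC_xMSICFGADDRH_LHXS_SHIFT	20
#define APLIC_xMSICFGADDRH_LHXW_MASK	0xf
#define APLIC_xMSICFGADDRH_LHXW_SHIFT	12

/* Hypothetical helper: byte offset of sourcecfg for a source (1-based). */
static uint32_t example_sourcecfg_off(uint32_t hwirq)
{
	assert(hwirq >= 1);	/* source 0 does not exist */
	return APLIC_SOURCECFG_BASE + (hwirq - 1) * (uint32_t)sizeof(uint32_t);
}

/* Hypothetical helper: byte offset of target for a source (1-based). */
static uint32_t example_target_off(uint32_t hwirq)
{
	assert(hwirq >= 1);
	return APLIC_TARGET_BASE + (hwirq - 1) * (uint32_t)sizeof(uint32_t);
}

/* Hypothetical helper: direct-mode target value (hart index + priority). */
static uint32_t example_target_val(uint32_t hart_index, uint32_t prio)
{
	uint32_t val = hart_index & APLIC_TARGET_HART_IDX_MASK;

	val <<= APLIC_TARGET_HART_IDX_SHIFT;
	val |= prio & APLIC_TARGET_IPRIO_MASK;
	return val;
}

/* Hypothetical helpers: extract LHXS and LHXW from an xMSICFGADDRH value. */
static uint32_t example_lhxs(uint32_t addrh)
{
	return (addrh >> APLIC_xMSICFGADDRH_LHXS_SHIFT) &
	       APLIC_xMSICFGADDRH_LHXS_MASK;
}

static uint32_t example_lhxw(uint32_t addrh)
{
	return (addrh >> APLIC_xMSICFGADDRH_LHXW_SHIFT) &
	       APLIC_xMSICFGADDRH_LHXW_MASK;
}
```

With these definitions, source 1023 maps to target offset 0x3ffc, the
last word before APLIC_IDC_BASE (0x4000), which is consistent with
APLIC_MAX_SOURCE being 1024.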



* [PATCH v4 09/10] RISC-V: Select APLIC and IMSIC drivers
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
                   ` (7 preceding siblings ...)
  2023-06-13 15:34 ` [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  2023-06-13 15:34 ` [PATCH v4 10/10] MAINTAINERS: Add entry for RISC-V AIA drivers Anup Patel
  9 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel, Conor Dooley

The QEMU virt machine supports AIA emulation and we also have
quite a few RISC-V platforms with AIA support under development
so let us select APLIC and IMSIC drivers for all RISC-V platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ff37d8ebe989..19233d59be37 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -136,6 +136,8 @@ config RISCV
 	select PCI_DOMAINS_GENERIC if PCI
 	select PCI_MSI if PCI
 	select RISCV_ALTERNATIVE if !XIP_KERNEL
+	select RISCV_APLIC
+	select RISCV_IMSIC
 	select RISCV_INTC
 	select RISCV_TIMER if RISCV_SBI
 	select SIFIVE_PLIC
-- 
2.34.1



* [PATCH v4 10/10] MAINTAINERS: Add entry for RISC-V AIA drivers
  2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
                   ` (8 preceding siblings ...)
  2023-06-13 15:34 ` [PATCH v4 09/10] RISC-V: Select APLIC and IMSIC drivers Anup Patel
@ 2023-06-13 15:34 ` Anup Patel
  9 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-13 15:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand
  Cc: Atish Patra, Andrew Jones, Conor Dooley, Saravana Kannan,
	Anup Patel, linux-riscv, linux-kernel, devicetree, iommu,
	Anup Patel

Add myself as maintainer for RISC-V AIA drivers including the
RISC-V INTC driver which supports both AIA and non-AIA platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 MAINTAINERS | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 51da90e60004..2d474eb902fa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18136,6 +18136,18 @@ S:	Maintained
 F:	drivers/mtd/nand/raw/r852.c
 F:	drivers/mtd/nand/raw/r852.h
 
+RISC-V AIA DRIVERS
+M:	Anup Patel <anup@brainfault.org>
+L:	linux-riscv@lists.infradead.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
+F:	Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
+F:	drivers/irqchip/irq-riscv-aplic.c
+F:	drivers/irqchip/irq-riscv-imsic.c
+F:	drivers/irqchip/irq-riscv-intc.c
+F:	include/linux/irqchip/riscv-aplic.h
+F:	include/linux/irqchip/riscv-imsic.h
+
 RISC-V ARCHITECTURE
 M:	Paul Walmsley <paul.walmsley@sifive.com>
 M:	Palmer Dabbelt <palmer@dabbelt.com>
-- 
2.34.1



* Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support
  2023-06-13 15:34 ` [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support Anup Patel
@ 2023-06-14 14:46   ` Jason Gunthorpe
  2023-06-14 16:17     ` Anup Patel
  0 siblings, 1 reply; 28+ messages in thread
From: Jason Gunthorpe @ 2023-06-14 14:46 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand, Atish Patra, Andrew Jones,
	Conor Dooley, Saravana Kannan, Anup Patel, linux-riscv,
	linux-kernel, devicetree, iommu, Vincent Chen

On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> We have a separate RISC-V IMSIC MSI address for each CPU so changing
> MSI (or IRQ) affinity results in re-programming of MSI address in
> the PCIe (or platform) device.
> 
> Currently, the iommu_dma_prepare_msi() is called only once at the
> time of IRQ allocation so IOMMU DMA domain will only have mapping
> for one MSI page. This means iommu_dma_compose_msi_msg() called
> by imsic_irq_compose_msi_msg() will always use the same MSI page
> irrespective of the target CPU MSI address. In other words, changing
> MSI (or IRQ) affinity for device using IOMMU DMA domain will not
> work.

You didn't answer my question from last time - there seems to be no
iommu driver here so why are you messing with iommu_dma_prepare_msi()?

This path is only for platforms that have IOMMU drivers that translate
the MSI window. You should add this code to link the interrupt
controller to the iommu driver when you introduce the iommu driver,
not in this series?

And, as I said before, I'd like to NOT see new users of
iommu_dma_prepare_msi() since it is a very problematic API.

This hacking of it here is not making it better :(

Jason


* Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support
  2023-06-14 14:46   ` Jason Gunthorpe
@ 2023-06-14 16:17     ` Anup Patel
  2023-06-14 16:50       ` Jason Gunthorpe
  0 siblings, 1 reply; 28+ messages in thread
From: Anup Patel @ 2023-06-14 16:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand, Atish Patra, Andrew Jones,
	Conor Dooley, Saravana Kannan, Anup Patel, linux-riscv,
	linux-kernel, devicetree, iommu, Vincent Chen

On Wed, Jun 14, 2023 at 8:16 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> > We have a separate RISC-V IMSIC MSI address for each CPU so changing
> > MSI (or IRQ) affinity results in re-programming of MSI address in
> > the PCIe (or platform) device.
> >
> > Currently, the iommu_dma_prepare_msi() is called only once at the
> > time of IRQ allocation so IOMMU DMA domain will only have mapping
> > for one MSI page. This means iommu_dma_compose_msi_msg() called
> > by imsic_irq_compose_msi_msg() will always use the same MSI page
> > irrespective of the target CPU MSI address. In other words, changing
> > MSI (or IRQ) affinity for device using IOMMU DMA domain will not
> > work.
>
> You didn't answer my question from last time - there seems to be no
> iommu driver here so why are you messing with iommu_dma_prepare_msi()?
>
> This path is only for platforms that have IOMMU drivers that translate
> the MSI window. You should add this code to link the interrupt
> controller to the iommu driver when you introduce the iommu driver,
> not in this series?
>
> And, as I said before, I'd like to NOT see new users of
> iommu_dma_prepare_msi() since it is a very problematic API.
>
> This hacking of it here is not making it better :(

I misunderstood your previous comments.

We can certainly deal with this later when the IOMMU
driver is available for RISC-V. I will drop this patch in the
next revision.

Regards,
Anup


* Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support
  2023-06-14 16:17     ` Anup Patel
@ 2023-06-14 16:50       ` Jason Gunthorpe
  2023-06-15  5:46         ` Anup Patel
  0 siblings, 1 reply; 28+ messages in thread
From: Jason Gunthorpe @ 2023-06-14 16:50 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand, Atish Patra, Andrew Jones,
	Conor Dooley, Saravana Kannan, Anup Patel, linux-riscv,
	linux-kernel, devicetree, iommu, Vincent Chen

On Wed, Jun 14, 2023 at 09:47:53PM +0530, Anup Patel wrote:
> On Wed, Jun 14, 2023 at 8:16 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> > > We have a separate RISC-V IMSIC MSI address for each CPU so changing
> > > MSI (or IRQ) affinity results in re-programming of MSI address in
> > > the PCIe (or platform) device.
> > >
> > > Currently, the iommu_dma_prepare_msi() is called only once at the
> > > time of IRQ allocation so IOMMU DMA domain will only have mapping
> > > for one MSI page. This means iommu_dma_compose_msi_msg() called
> > > by imsic_irq_compose_msi_msg() will always use the same MSI page
> > > irrespective of the target CPU MSI address. In other words, changing
> > > MSI (or IRQ) affinity for device using IOMMU DMA domain will not
> > > work.
> >
> > You didn't answer my question from last time - there seems to be no
> > iommu driver here so why are you messing with iommu_dma_prepare_msi()?
> >
> > This path is only for platforms that have IOMMU drivers that translate
> > the MSI window. You should add this code to link the interrupt
> > controller to the iommu driver when you introduce the iommu driver,
> > not in this series?
> >
> > And, as I said before, I'd like to NOT see new users of
> > iommu_dma_prepare_msi() since it is a very problematic API.
> >
> > This hacking of it here is not making it better :(
> 
> I misunderstood your previous comments.
> 
> We can certainly deal with this later when the IOMMU
> driver is available for RISC-V. I will drop this patch in the
> next revision.

Not only just this patch but the calls to iommu_dma_prepare_msi() and
related APIs in the prior patch too. Assume the MSI window is directly
visible to DMA without translation.

When you come with an iommu driver we can discuss how best to proceed.

Thanks,
Jason


* Re: [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-06-13 15:34 ` [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC Anup Patel
@ 2023-06-14 19:27   ` Conor Dooley
  2023-06-15  5:47     ` Anup Patel
  0 siblings, 1 reply; 28+ messages in thread
From: Conor Dooley @ 2023-06-14 19:27 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand, Atish Patra, Andrew Jones,
	Saravana Kannan, Anup Patel, linux-riscv, linux-kernel,
	devicetree, iommu


Hey Anup,

Mostly looks good, one minor comment.

On Tue, Jun 13, 2023 at 09:04:12PM +0530, Anup Patel wrote:

> +  riscv,children:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024
> +    items:
> +      maxItems: 1
> +    description:
> +      A list of child APLIC domains for the given APLIC domain. Each child
> +      APLIC domain is assigned a child index in increasing order, with the
> +      first child APLIC domain assigned child index 0. The APLIC domain child
> +      index is used by firmware to delegate interrupts from the given APLIC
> +      domain to a particular child APLIC domain.
> +
> +  riscv,delegation:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024
> +    items:
> +      items:
> +        - description: child APLIC domain phandle
> +        - description: first interrupt number of the parent APLIC domain (inclusive)
> +        - description: last interrupt number of the parent APLIC domain (inclusive)
> +    description:
> +      An interrupt delegation list where each entry is a triple consisting
> +      of child APLIC domain phandle, first interrupt number of the parent
> +      APLIC domain, and last interrupt number of the parent APLIC domain.
> +      Firmware must configure interrupt delegation registers based on
> +      interrupt delegation list.
> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupt-controller
> +  - "#interrupt-cells"
> +  - riscv,num-sources
> +
> +anyOf:
> +  - required:
> +      - interrupts-extended
> +  - required:
> +      - msi-parent

Not sure if you missed this from the last version, but I asked if we
needed a
	dependencies:
	  riscv,delegation: [ riscv,children ]

IOW, I don't think it is valid to have a delegation without having
children?

Otherwise,
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Cheers,
Conor.



* Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support
  2023-06-14 16:50       ` Jason Gunthorpe
@ 2023-06-15  5:46         ` Anup Patel
  0 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-15  5:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand, Atish Patra, Andrew Jones,
	Conor Dooley, Saravana Kannan, Anup Patel, linux-riscv,
	linux-kernel, devicetree, iommu, Vincent Chen

On Wed, Jun 14, 2023 at 10:20 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Wed, Jun 14, 2023 at 09:47:53PM +0530, Anup Patel wrote:
> > On Wed, Jun 14, 2023 at 8:16 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > >
> > > On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> > > > We have a separate RISC-V IMSIC MSI address for each CPU so changing
> > > > MSI (or IRQ) affinity results in re-programming of MSI address in
> > > > the PCIe (or platform) device.
> > > >
> > > > Currently, the iommu_dma_prepare_msi() is called only once at the
> > > > time of IRQ allocation so IOMMU DMA domain will only have mapping
> > > > for one MSI page. This means iommu_dma_compose_msi_msg() called
> > > > by imsic_irq_compose_msi_msg() will always use the same MSI page
> > > > irrespective of the target CPU MSI address. In other words, changing
> > > > MSI (or IRQ) affinity for device using IOMMU DMA domain will not
> > > > work.
> > >
> > > You didn't answer my question from last time - there seems to be no
> > > iommu driver here so why are you messing with iommu_dma_prepare_msi()?
> > >
> > > This path is only for platforms that have IOMMU drivers that translate
> > > the MSI window. You should add this code to link the interrupt
> > > controller to the iommu driver when you introduce the iommu driver,
> > > not in this series?
> > >
> > > And, as I said before, I'd like to NOT see new users of
> > > iommu_dma_prepare_msi() since it is a very problematic API.
> > >
> > > This hacking of it here is not making it better :(
> >
> > I misunderstood your previous comments.
> >
> > We can certainly deal with this later when the IOMMU
> > driver is available for RISC-V. I will drop this patch in the
> > next revision.
>
> Not only just this patch but the calls to iommu_dma_prepare_msi() and
> related APIs in the prior patch too. Assume the MSI window is directly
> visible to DMA without translation.

Okay, I will remove iommu_dma_xyz() usage from IMSIC driver in the
next revision.

>
> When you come with an iommu driver we can discuss how best to proceed.

Yes, that's better.

Regards,
Anup


* Re: [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-06-14 19:27   ` Conor Dooley
@ 2023-06-15  5:47     ` Anup Patel
  0 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-15  5:47 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand, Atish Patra, Andrew Jones,
	Saravana Kannan, Anup Patel, linux-riscv, linux-kernel,
	devicetree, iommu

On Thu, Jun 15, 2023 at 12:57 AM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Anup,
>
> Mostly looks good, one minor comment.
>
> On Tue, Jun 13, 2023 at 09:04:12PM +0530, Anup Patel wrote:
>
> > +  riscv,children:
> > +    $ref: /schemas/types.yaml#/definitions/phandle-array
> > +    minItems: 1
> > +    maxItems: 1024
> > +    items:
> > +      maxItems: 1
> > +    description:
> > +      A list of child APLIC domains for the given APLIC domain. Each child
> > +      APLIC domain is assigned a child index in increasing order, with the
> > +      first child APLIC domain assigned child index 0. The APLIC domain child
> > +      index is used by firmware to delegate interrupts from the given APLIC
> > +      domain to a particular child APLIC domain.
> > +
> > +  riscv,delegation:
> > +    $ref: /schemas/types.yaml#/definitions/phandle-array
> > +    minItems: 1
> > +    maxItems: 1024
> > +    items:
> > +      items:
> > +        - description: child APLIC domain phandle
> > +        - description: first interrupt number of the parent APLIC domain (inclusive)
> > +        - description: last interrupt number of the parent APLIC domain (inclusive)
> > +    description:
> > +      An interrupt delegation list where each entry is a triple consisting
> > +      of child APLIC domain phandle, first interrupt number of the parent
> > +      APLIC domain, and last interrupt number of the parent APLIC domain.
> > +      Firmware must configure interrupt delegation registers based on
> > +      interrupt delegation list.
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - interrupt-controller
> > +  - "#interrupt-cells"
> > +  - riscv,num-sources
> > +
> > +anyOf:
> > +  - required:
> > +      - interrupts-extended
> > +  - required:
> > +      - msi-parent
>
> Not sure if you missed this from the last version, but I asked if we
> needed a
>         dependencies:
>           riscv,delegation: [ riscv,children ]
>
> IOW, I don't think it is valid to have a delegation without having
> children?

Ahh, yes. I missed this one. I will update in the next revision.

>
> Otherwise,
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
>
> Cheers,
> Conor.

Regards,
Anup


* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-13 15:34 ` [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver Anup Patel
@ 2023-06-15 19:17   ` Saravana Kannan
  2023-06-15 19:31     ` Conor Dooley
  2023-06-16  2:01     ` Anup Patel
  0 siblings, 2 replies; 28+ messages in thread
From: Saravana Kannan @ 2023-06-15 19:17 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Robin Murphy, Joerg Roedel,
	Will Deacon, Frank Rowand, Atish Patra, Andrew Jones,
	Conor Dooley, Anup Patel, linux-riscv, linux-kernel, devicetree,
	iommu

On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
>
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new interrupt controller for managing wired interrupts on a RISC-V
> platform. This new interrupt controller is referred to as advanced
> platform-level interrupt controller (APLIC) which can forward wired
> interrupts to CPUs (or HARTs) as local interrupts OR as message
> signaled interrupts.
> (For more details, refer to https://github.com/riscv/riscv-aia)
>
> This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> platforms.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/Kconfig             |   6 +
>  drivers/irqchip/Makefile            |   1 +
>  drivers/irqchip/irq-riscv-aplic.c   | 765 ++++++++++++++++++++++++++++
>  include/linux/irqchip/riscv-aplic.h | 119 +++++
>  4 files changed, 891 insertions(+)
>  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
>  create mode 100644 include/linux/irqchip/riscv-aplic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index d700980372ef..834c0329f583 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -544,6 +544,12 @@ config SIFIVE_PLIC
>         select IRQ_DOMAIN_HIERARCHY
>         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_APLIC
> +       bool
> +       depends on RISCV
> +       select IRQ_DOMAIN_HIERARCHY
> +       select GENERIC_MSI_IRQ
> +
>  config RISCV_IMSIC
>         bool
>         depends on RISCV
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 577bde3e986b..438b8e1a152c 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
>  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
>  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
>  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
>  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
>  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
>  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> new file mode 100644
> index 000000000000..1e710fdf5608
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-aplic.c
> @@ -0,0 +1,765 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-aplic: " fmt
> +#include <linux/bitops.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-aplic.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/platform_device.h>
> +#include <linux/smp.h>
> +
> +#define APLIC_DEFAULT_PRIORITY         1
> +#define APLIC_DISABLE_IDELIVERY                0
> +#define APLIC_ENABLE_IDELIVERY         1
> +#define APLIC_DISABLE_ITHRESHOLD       1
> +#define APLIC_ENABLE_ITHRESHOLD                0
> +
> +struct aplic_msicfg {
> +       phys_addr_t             base_ppn;
> +       u32                     hhxs;
> +       u32                     hhxw;
> +       u32                     lhxs;
> +       u32                     lhxw;
> +};
> +
> +struct aplic_idc {
> +       unsigned int            hart_index;
> +       void __iomem            *regs;
> +       struct aplic_priv       *priv;
> +};
> +
> +struct aplic_priv {
> +       struct fwnode_handle    *fwnode;
> +       u32                     gsi_base;
> +       u32                     nr_irqs;
> +       u32                     nr_idcs;
> +       void __iomem            *regs;
> +       struct irq_domain       *irqdomain;
> +       struct aplic_msicfg     msicfg;
> +       struct cpumask          lmask;
> +};
> +
> +static unsigned int aplic_idc_parent_irq;
> +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> +
> +static void aplic_irq_unmask(struct irq_data *d)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> +
> +       if (!priv->nr_idcs)
> +               irq_chip_unmask_parent(d);
> +}
> +
> +static void aplic_irq_mask(struct irq_data *d)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> +
> +       if (!priv->nr_idcs)
> +               irq_chip_mask_parent(d);
> +}
> +
> +static int aplic_set_type(struct irq_data *d, unsigned int type)
> +{
> +       u32 val = 0;
> +       void __iomem *sourcecfg;
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       switch (type) {
> +       case IRQ_TYPE_NONE:
> +               val = APLIC_SOURCECFG_SM_INACTIVE;
> +               break;
> +       case IRQ_TYPE_LEVEL_LOW:
> +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> +               break;
> +       case IRQ_TYPE_LEVEL_HIGH:
> +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> +               break;
> +       case IRQ_TYPE_EDGE_FALLING:
> +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> +               break;
> +       case IRQ_TYPE_EDGE_RISING:
> +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> +       writel(val, sourcecfg);
> +
> +       return 0;
> +}
> +
> +static void aplic_irq_eoi(struct irq_data *d)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +       u32 reg_off, reg_mask;
> +
> +       /*
> +        * EOI handling is required only for level-triggered
> +        * interrupts in APLIC MSI mode.
> +        */
> +
> +       if (priv->nr_idcs)
> +               return;
> +
> +       reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> +       reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> +       switch (irqd_get_trigger_type(d)) {
> +       case IRQ_TYPE_LEVEL_LOW:
> +               if (!(readl(priv->regs + reg_off) & reg_mask))
> +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> +               break;
> +       case IRQ_TYPE_LEVEL_HIGH:
> +               if (readl(priv->regs + reg_off) & reg_mask)
> +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> +               break;
> +       }
> +}
> +
> +#ifdef CONFIG_SMP
> +static int aplic_set_affinity(struct irq_data *d,
> +                             const struct cpumask *mask_val, bool force)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +       struct aplic_idc *idc;
> +       unsigned int cpu, val;
> +       struct cpumask amask;
> +       void __iomem *target;
> +
> +       if (!priv->nr_idcs)
> +               return irq_chip_set_affinity_parent(d, mask_val, force);
> +
> +       cpumask_and(&amask, &priv->lmask, mask_val);
> +
> +       if (force)
> +               cpu = cpumask_first(&amask);
> +       else
> +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> +       if (cpu >= nr_cpu_ids)
> +               return -EINVAL;
> +
> +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> +       target = priv->regs + APLIC_TARGET_BASE;
> +       target += (d->hwirq - 1) * sizeof(u32);
> +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> +       val |= APLIC_DEFAULT_PRIORITY;
> +       writel(val, target);
> +
> +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +       return IRQ_SET_MASK_OK_DONE;
> +}
> +#endif
> +
> +static struct irq_chip aplic_chip = {
> +       .name           = "RISC-V APLIC",
> +       .irq_mask       = aplic_irq_mask,
> +       .irq_unmask     = aplic_irq_unmask,
> +       .irq_set_type   = aplic_set_type,
> +       .irq_eoi        = aplic_irq_eoi,
> +#ifdef CONFIG_SMP
> +       .irq_set_affinity = aplic_set_affinity,
> +#endif
> +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> +                         IRQCHIP_SKIP_SET_WAKE |
> +                         IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> +                                    u32 gsi_base,
> +                                    unsigned long *hwirq,
> +                                    unsigned int *type)
> +{
> +       if (WARN_ON(fwspec->param_count < 2))
> +               return -EINVAL;
> +       if (WARN_ON(!fwspec->param[0]))
> +               return -EINVAL;
> +
> +       /* For DT, gsi_base is always zero. */
> +       *hwirq = fwspec->param[0] - gsi_base;
> +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> +
> +       WARN_ON(*type == IRQ_TYPE_NONE);
> +
> +       return 0;
> +}
> +
> +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> +                                        struct irq_fwspec *fwspec,
> +                                        unsigned long *hwirq,
> +                                        unsigned int *type)
> +{
> +       struct aplic_priv *priv = platform_msi_get_host_data(d);
> +
> +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> +}
> +
> +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> +                                    unsigned int virq, unsigned int nr_irqs,
> +                                    void *arg)
> +{
> +       int i, ret;
> +       unsigned int type;
> +       irq_hw_number_t hwirq;
> +       struct irq_fwspec *fwspec = arg;
> +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> +
> +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> +       if (ret)
> +               return ret;
> +
> +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> +       if (ret)
> +               return ret;
> +
> +       for (i = 0; i < nr_irqs; i++) {
> +               irq_domain_set_info(domain, virq + i, hwirq + i,
> +                                   &aplic_chip, priv, handle_fasteoi_irq,
> +                                   NULL, NULL);
> +               /*
> +                * APLIC does not implement irq_disable() so Linux interrupt
> +                * subsystem will take a lazy approach for disabling an APLIC
> +                * interrupt. This means APLIC interrupts are left unmasked
> +                * upon system suspend and interrupts are not processed
> +                * immediately upon system wake up. To tackle this, we disable
> +                * the lazy approach for all APLIC interrupts.
> +                */
> +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> +       }
> +
> +       return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> +       .translate      = aplic_irqdomain_msi_translate,
> +       .alloc          = aplic_irqdomain_msi_alloc,
> +       .free           = platform_msi_device_domain_free,
> +};
> +
> +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> +                                        struct irq_fwspec *fwspec,
> +                                        unsigned long *hwirq,
> +                                        unsigned int *type)
> +{
> +       struct aplic_priv *priv = d->host_data;
> +
> +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> +}
> +
> +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> +                                    unsigned int virq, unsigned int nr_irqs,
> +                                    void *arg)
> +{
> +       int i, ret;
> +       unsigned int type;
> +       irq_hw_number_t hwirq;
> +       struct irq_fwspec *fwspec = arg;
> +       struct aplic_priv *priv = domain->host_data;
> +
> +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> +       if (ret)
> +               return ret;
> +
> +       for (i = 0; i < nr_irqs; i++) {
> +               irq_domain_set_info(domain, virq + i, hwirq + i,
> +                                   &aplic_chip, priv, handle_fasteoi_irq,
> +                                   NULL, NULL);
> +               irq_set_affinity(virq + i, &priv->lmask);
> +               /* See the reason described in aplic_irqdomain_msi_alloc() */
> +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> +       }
> +
> +       return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> +       .translate      = aplic_irqdomain_idc_translate,
> +       .alloc          = aplic_irqdomain_idc_alloc,
> +       .free           = irq_domain_free_irqs_top,
> +};
> +
> +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> +{
> +       int i;
> +
> +       /* Disable all interrupts */
> +       for (i = 0; i <= priv->nr_irqs; i += 32)
> +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> +                           (i / 32) * sizeof(u32));
> +
> +       /* Set interrupt type and default priority for all interrupts */
> +       for (i = 1; i <= priv->nr_irqs; i++) {
> +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> +                         (i - 1) * sizeof(u32));
> +               writel(APLIC_DEFAULT_PRIORITY,
> +                      priv->regs + APLIC_TARGET_BASE +
> +                      (i - 1) * sizeof(u32));
> +       }
> +
> +       /* Clear APLIC domaincfg */
> +       writel(0, priv->regs + APLIC_DOMAINCFG);
> +}
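
For readers following the register indexing in aplic_init_hw_irqs(), here is a minimal userspace sketch (not kernel code) of the offset arithmetic. The base offsets are copied from the quoted riscv-aplic.h; the helper names are made up for illustration. It also shows why the clrie loop uses "<=": source numbers run from 1 to nr_irqs, so bit nr_irqs can land in one extra 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets copied from the quoted riscv-aplic.h */
#define APLIC_SOURCECFG_BASE	0x0004
#define APLIC_CLRIE_BASE	0x1f00
#define APLIC_TARGET_BASE	0x3004

/* Offset of sourcecfg[irq]; APLIC source numbers start at 1 */
static inline uint32_t sourcecfg_off(uint32_t irq)
{
	assert(irq >= 1);
	return APLIC_SOURCECFG_BASE + (irq - 1) * (uint32_t)sizeof(uint32_t);
}

/* Offset of target[irq]; same 1-based indexing as sourcecfg */
static inline uint32_t target_off(uint32_t irq)
{
	assert(irq >= 1);
	return APLIC_TARGET_BASE + (irq - 1) * (uint32_t)sizeof(uint32_t);
}

/* Offset of the clrie word holding the enable bit of source irq */
static inline uint32_t clrie_off(uint32_t irq)
{
	return APLIC_CLRIE_BASE + (irq / 32) * (uint32_t)sizeof(uint32_t);
}

/* Words written by "for (i = 0; i <= nr_irqs; i += 32)" */
static inline uint32_t clrie_words(uint32_t nr_irqs)
{
	return nr_irqs / 32 + 1;
}
```

With nr_irqs = 96 the loop writes four words, because source 96 sits in bit 0 of the fourth word.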
> +
> +static void aplic_init_hw_global(struct aplic_priv *priv)
> +{
> +       u32 val;
> +#ifdef CONFIG_RISCV_M_MODE
> +       u32 valH;
> +
> +       if (!priv->nr_idcs) {
> +               val = priv->msicfg.base_ppn;
> +               valH = (priv->msicfg.base_ppn >> 32) &
> +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> +       }
> +#endif
> +
> +       /* Setup APLIC domaincfg register */
> +       val = readl(priv->regs + APLIC_DOMAINCFG);
> +       val |= APLIC_DOMAINCFG_IE;
> +       if (!priv->nr_idcs)
> +               val |= APLIC_DOMAINCFG_DM;
> +       writel(val, priv->regs + APLIC_DOMAINCFG);
> +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> +               pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> +                       priv->fwnode, val);
> +}
> +
> +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +       unsigned int group_index, hart_index, guest_index, val;
> +       struct irq_data *d = irq_get_irq_data(desc->irq);
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +       struct aplic_msicfg *mc = &priv->msicfg;
> +       phys_addr_t tppn, tbppn, msg_addr;
> +       void __iomem *target;
> +
> +       /* For zeroed MSI, simply write zero into the target register */
> +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> +               target = priv->regs + APLIC_TARGET_BASE;
> +               target += (d->hwirq - 1) * sizeof(u32);
> +               writel(0, target);
> +               return;
> +       }
> +
> +       /* Sanity check on message data */
> +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> +
> +       /* Compute target MSI address */
> +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> +
> +       /* Compute target HART Base PPN */
> +       tbppn = tppn;
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> +       WARN_ON(tbppn != mc->base_ppn);
> +
> +       /* Compute target group and hart indexes */
> +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> +       hart_index |= (group_index << mc->lhxw);
> +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> +
> +       /* Compute target guest index */
> +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> +
> +       /* Update IRQ TARGET register */
> +       target = priv->regs + APLIC_TARGET_BASE;
> +       target += (d->hwirq - 1) * sizeof(u32);
> +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> +                               << APLIC_TARGET_HART_IDX_SHIFT;
> +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> +       writel(val, target);
> +}
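
The address decomposition in aplic_msi_write_msg() can be tried out in userspace with the sketch below. It is an illustration under assumptions, not the driver code: the shift and mask values are copied from the quoted header, while struct msicfg_sketch, low_bits() and target_from_msi() are hypothetical names. It leaves out the zeroed-MSI special case and the WARN_ON() checks.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted riscv-aplic.h */
#define PPN_SHIFT		12
#define TARGET_HART_IDX_SHIFT	18
#define TARGET_HART_IDX_MASK	0x3fff
#define TARGET_GUEST_IDX_SHIFT	12
#define TARGET_GUEST_IDX_MASK	0x3f
#define TARGET_EIID_MASK	0x7ff

/* Hypothetical stand-in for the msicfg fields used by the driver */
struct msicfg_sketch {
	unsigned int lhxs, lhxw, hhxs, hhxw;
};

/* Mask with the n lowest bits set */
static inline uint64_t low_bits(unsigned int n)
{
	return (1ULL << n) - 1;
}

/* Build the TARGET register value for an MSI address and EIID */
static inline uint32_t target_from_msi(const struct msicfg_sketch *mc,
				       uint64_t msg_addr, uint32_t eiid)
{
	uint64_t tppn = msg_addr >> PPN_SHIFT;
	uint32_t group, hart, guest;

	assert(mc->lhxs < 32 && mc->lhxw < 32 && mc->hhxs < 32 && mc->hhxw < 32);

	/* Group index sits at PPN bit (hhxs + 12), hart index at PPN bit lhxs */
	group = (uint32_t)((tppn >> (mc->hhxs + PPN_SHIFT)) & low_bits(mc->hhxw));
	hart = (uint32_t)((tppn >> mc->lhxs) & low_bits(mc->lhxw));
	hart |= group << mc->lhxw;

	/* Guest index is the lowest lhxs bits of the PPN */
	guest = (uint32_t)(tppn & low_bits(mc->lhxs));

	return ((hart & TARGET_HART_IDX_MASK) << TARGET_HART_IDX_SHIFT) |
	       ((guest & TARGET_GUEST_IDX_MASK) << TARGET_GUEST_IDX_SHIFT) |
	       (eiid & TARGET_EIID_MASK);
}
```

For example, with lhxw = 2 and all other fields zero, an MSI address of 0x28003000 selects hart index 3, so EIID 5 gives a TARGET value of 0x000C0005. The base address here is only an example value.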
> +
> +static int aplic_setup_msi(struct aplic_priv *priv)
> +{
> +       struct aplic_msicfg *mc = &priv->msicfg;
> +       const struct imsic_global_config *imsic_global;
> +
> +       /*
> +        * The APLIC outgoing MSI config registers assume target MSI
> +        * controller to be RISC-V AIA IMSIC controller.
> +        */
> +       imsic_global = imsic_get_global_config();
> +       if (!imsic_global) {
> +               pr_err("%pfwP: IMSIC global config not found\n",
> +                       priv->fwnode);
> +               return -ENODEV;
> +       }
> +
> +       /* Find number of guest index bits (LHXS) */
> +       mc->lhxs = imsic_global->guest_index_bits;
> +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> +               pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
> +                       priv->fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of HART index bits (LHXW) */
> +       mc->lhxw = imsic_global->hart_index_bits;
> +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> +               pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
> +                       priv->fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of group index bits (HHXW) */
> +       mc->hhxw = imsic_global->group_index_bits;
> +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> +               pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
> +                       priv->fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find first bit position of group index (HHXS) */
> +       mc->hhxs = imsic_global->group_index_shift;
> +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> +               pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> +                       priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> +               return -EINVAL;
> +       }
> +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> +               pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
> +                       priv->fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Compute PPN base */
> +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> +
> +       /* Use all possible CPUs as lmask */
> +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> +
> +       return 0;
> +}
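
The HHXS adjustment and base PPN masking in aplic_setup_msi() can be sketched in userspace as follows. This is a simplified illustration: msicfg_base_ppn() and low_bits() are hypothetical helpers, the inputs stand in for the imsic_global fields, and the range checks against the xMSICFGADDRH field masks are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PPN_SHIFT	12	/* APLIC_xMSICFGADDR_PPN_SHIFT in the quoted header */

/* Mask with the n lowest bits set */
static inline uint64_t low_bits(unsigned int n)
{
	return (1ULL << n) - 1;
}

/*
 * Derive the base PPN from an IMSIC base address by clearing the
 * guest, hart and group index bits. Returns 0 on success or -EINVAL
 * if the group index shift is below 24 (2 * PPN_SHIFT).
 */
static inline int msicfg_base_ppn(uint64_t base_addr, unsigned int lhxs,
				  unsigned int lhxw, unsigned int hhxw,
				  unsigned int group_index_shift,
				  uint64_t *base_ppn)
{
	unsigned int hhxs;
	uint64_t ppn;

	assert(base_ppn);
	assert(lhxs < 32 && lhxw < 32 && hhxw < 32);

	if (group_index_shift < 2 * PPN_SHIFT)
		return -EINVAL;
	hhxs = group_index_shift - 2 * PPN_SHIFT;
	assert(hhxs < 32);

	ppn = base_addr >> PPN_SHIFT;
	ppn &= ~low_bits(lhxs);				/* guest index bits */
	ppn &= ~(low_bits(lhxw) << lhxs);		/* hart index bits */
	ppn &= ~(low_bits(hhxw) << (hhxs + PPN_SHIFT));	/* group index bits */

	*base_ppn = ppn;
	return 0;
}
```

With lhxw = 2 and a group index shift of 24, an address of 0x28003000 yields a base PPN of 0x28000. A group index shift of 20 is rejected.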
> +
> +/*
> + * To handle APLIC IDC interrupts, we just read the CLAIMI register
> + * which will return the highest priority pending interrupt and clear the
> + * pending bit of the interrupt. This process is repeated until the CLAIMI
> + * register returns zero.
> + */
> +static void aplic_idc_handle_irq(struct irq_desc *desc)
> +{
> +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> +       struct irq_chip *chip = irq_desc_get_chip(desc);
> +       irq_hw_number_t hw_irq;
> +       int irq;
> +
> +       chained_irq_enter(chip, desc);
> +
> +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> +
> +               if (unlikely(irq <= 0))
> +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> +                                           hw_irq);
> +               else
> +                       generic_handle_irq(irq);
> +       }
> +
> +       chained_irq_exit(chip, desc);
> +}
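
The claim loop above can be illustrated in userspace with the sketch below. It assumes the CLAIMI value uses the same layout as the TOPI fields in the quoted header (source ID from bit 16, priority in the low byte). The MMIO register is replaced by an array of fake read values, and claimi_source(), claimi_prio() and drain_claimi() are hypothetical helpers. Unlike the quoted handler, the sketch also applies the ID mask after shifting, as a defensive step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field layout copied from the quoted riscv-aplic.h */
#define APLIC_IDC_TOPI_ID_SHIFT		16
#define APLIC_IDC_TOPI_ID_MASK		0x3ff
#define APLIC_IDC_TOPI_PRIO_MASK	0xff

/* Interrupt source number encoded in a CLAIMI value */
static inline uint32_t claimi_source(uint32_t claimi)
{
	return (claimi >> APLIC_IDC_TOPI_ID_SHIFT) & APLIC_IDC_TOPI_ID_MASK;
}

/* Priority encoded in a CLAIMI value */
static inline uint32_t claimi_prio(uint32_t claimi)
{
	return claimi & APLIC_IDC_TOPI_PRIO_MASK;
}

/*
 * Consume fake CLAIMI reads until a zero value is seen, mirroring the
 * "repeat until CLAIMI returns zero" loop. Stores the claimed source
 * numbers in sources[] and returns how many were stored.
 */
static inline size_t drain_claimi(const uint32_t *reads, size_t n_reads,
				  uint32_t *sources, size_t max_sources)
{
	size_t i, n = 0;

	assert(reads && sources);
	for (i = 0; i < n_reads && reads[i] != 0; i++) {
		if (n < max_sources)
			sources[n++] = claimi_source(reads[i]);
	}
	return n;
}
```

A read sequence of 0x00050001, 0x00030001, 0 claims sources 5 and 3, then stops.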
> +
> +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> +{
> +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> +
> +       /* Priority must be less than threshold for interrupt triggering */
> +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> +
> +       /* Delivery must be set to 1 for interrupt triggering */
> +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> +}
> +
> +static int aplic_idc_dying_cpu(unsigned int cpu)
> +{
> +       if (aplic_idc_parent_irq)
> +               disable_percpu_irq(aplic_idc_parent_irq);
> +
> +       return 0;
> +}
> +
> +static int aplic_idc_starting_cpu(unsigned int cpu)
> +{
> +       if (aplic_idc_parent_irq)
> +               enable_percpu_irq(aplic_idc_parent_irq,
> +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> +
> +       return 0;
> +}
> +
> +static int aplic_setup_idc(struct aplic_priv *priv)
> +{
> +       int i, j, rc, cpu, setup_count = 0;
> +       struct fwnode_reference_args parent;
> +       struct irq_domain *domain;
> +       unsigned long hartid;
> +       struct aplic_idc *idc;
> +       u32 val;
> +
> +       /* Setup per-CPU IDC and target CPU mask */
> +       for (i = 0; i < priv->nr_idcs; i++) {
> +               rc = fwnode_property_get_reference_args(priv->fwnode,
> +                               "interrupts-extended", "#interrupt-cells",
> +                               0, i, &parent);
> +               if (rc) {
> +                       pr_warn("%pfwP: parent irq for IDC%d not found\n",
> +                               priv->fwnode, i);
> +                       continue;
> +               }
> +
> +               /*
> +                * Skip interrupts other than external interrupts for
> +                * current privilege level.
> +                */
> +               if (parent.args[0] != RV_IRQ_EXT)
> +                       continue;
> +
> +               rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> +               if (rc) {
> +                       pr_warn("%pfwP: invalid hartid for IDC%d\n",
> +                               priv->fwnode, i);
> +                       continue;
> +               }
> +
> +               cpu = riscv_hartid_to_cpuid(hartid);
> +               if (cpu < 0) {
> +                       pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> +                               priv->fwnode, i);
> +                       continue;
> +               }
> +
> +               cpumask_set_cpu(cpu, &priv->lmask);
> +
> +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> +               idc->hart_index = i;
> +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> +               idc->priv = priv;
> +
> +               aplic_idc_set_delivery(idc, true);
> +
> +               /*
> +                * Boot cpu might not have APLIC hart_index = 0 so check
> +                * and update target registers of all interrupts.
> +                */
> +               if (cpu == smp_processor_id() && idc->hart_index) {
> +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> +                       val |= APLIC_DEFAULT_PRIORITY;
> +                       for (j = 1; j <= priv->nr_irqs; j++)
> +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> +                                           (j - 1) * sizeof(u32));
> +               }
> +
> +               setup_count++;
> +       }
> +
> +       /* Find parent domain and register chained handler */
> +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> +                                         DOMAIN_BUS_ANY);
> +       if (!aplic_idc_parent_irq && domain) {
> +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> +               if (aplic_idc_parent_irq) {
> +                       irq_set_chained_handler(aplic_idc_parent_irq,
> +                                               aplic_idc_handle_irq);
> +
> +                       /*
> +                        * Setup CPUHP notifier to enable IDC parent
> +                        * interrupt on all CPUs
> +                        */
> +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> +                                         "irqchip/riscv/aplic:starting",
> +                                         aplic_idc_starting_cpu,
> +                                         aplic_idc_dying_cpu);
> +               }
> +       }
> +
> +       /* Fail if we were not able to setup IDC for any CPU */
> +       return (setup_count) ? 0 : -ENODEV;
> +}
> +
> +static int aplic_probe(struct platform_device *pdev)
> +{
> +       struct fwnode_handle *fwnode = pdev->dev.fwnode;
> +       struct fwnode_reference_args parent;
> +       struct aplic_priv *priv;
> +       struct resource *res;
> +       phys_addr_t pa;
> +       int rc;
> +
> +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> +       if (!priv)
> +               return -ENOMEM;
> +       priv->fwnode = fwnode;
> +
> +       /* Map the MMIO registers */
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       if (!res) {
> +               pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> +               return -EINVAL;
> +       }
> +       priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> +       if (!priv->regs) {
> +               pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> +               return -ENOMEM;
> +       }
> +
> +       /*
> +        * Find out GSI base number
> +        *
> +        * Note: DT does not define "riscv,gsi-base" property so GSI
> +        * base is always zero for DT.
> +        */
> +       rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> +                                           &priv->gsi_base, 1);
> +       if (rc)
> +               priv->gsi_base = 0;
> +
> +       /* Find out number of interrupt sources */
> +       rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> +                                           &priv->nr_irqs, 1);
> +       if (rc) {
> +               pr_err("%pfwP: failed to get number of interrupt sources\n",
> +                       fwnode);
> +               return rc;
> +       }
> +
> +       /* Setup initial state APLIC interrupts */
> +       aplic_init_hw_irqs(priv);
> +
> +       /*
> +        * Find out number of IDCs based on parent interrupts
> +        *
> +        * If "msi-parent" property is present then we ignore the
> +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> +        */
> +       if (!fwnode_property_present(fwnode, "msi-parent")) {
> +               while (!fwnode_property_get_reference_args(fwnode,
> +                               "interrupts-extended", "#interrupt-cells",
> +                               0, priv->nr_idcs, &parent))
> +                       priv->nr_idcs++;
> +       }
> +
> +       /* Setup IDCs or MSIs based on number of IDCs */
> +       if (priv->nr_idcs)
> +               rc = aplic_setup_idc(priv);
> +       else
> +               rc = aplic_setup_msi(priv);
> +       if (rc) {
> +               pr_err("%pfwP: failed to setup %s\n",
> +                       fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> +               return rc;
> +       }
> +
> +       /* Setup global config and interrupt delivery */
> +       aplic_init_hw_global(priv);
> +
> +       /* Create irq domain instance for the APLIC */
> +       if (priv->nr_idcs)
> +               priv->irqdomain = irq_domain_create_linear(
> +                                               priv->fwnode,
> +                                               priv->nr_irqs + 1,
> +                                               &aplic_irqdomain_idc_ops,
> +                                               priv);
> +       else
> +               priv->irqdomain = platform_msi_create_device_domain(
> +                                               &pdev->dev,
> +                                               priv->nr_irqs + 1,
> +                                               aplic_msi_write_msg,
> +                                               &aplic_irqdomain_msi_ops,
> +                                               priv);
> +       if (!priv->irqdomain) {
> +               pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> +               return -ENOMEM;
> +       }
> +
> +       /* Advertise the interrupt controller */
> +       if (priv->nr_idcs) {
> +               pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> +                       priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> +       } else {
> +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> +               pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> +                       priv->fwnode, priv->nr_irqs, &pa);
> +       }
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id aplic_match[] = {
> +       { .compatible = "riscv,aplic" },
> +       {}
> +};
> +
> +static struct platform_driver aplic_driver = {
> +       .driver = {
> +               .name           = "riscv-aplic",
> +               .of_match_table = aplic_match,
> +       },
> +       .probe = aplic_probe,
> +};
> +builtin_platform_driver(aplic_driver);
> +
> +static int __init aplic_dt_init(struct device_node *node,
> +                               struct device_node *parent)
> +{
> +       /*
> +        * The APLIC platform driver needs to be probed early
> +        * so for device tree:
> +        *
> +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> +        *    provides a hint to the device driver core to probe the
> +        *    platform driver early.
> +        * 2) Clear the OF_POPULATED flag in device_node because
> +        *    of_irq_init() sets it which prevents creation of
> +        *    platform device.
> +        */
> +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;

NACK. You are blindly plastering flags without trying to understand
the real issue and fixing this correctly.

> +       of_node_clear_flag(node, OF_POPULATED);
> +       return 0;
> +}
> +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);

This macro pretty much skips the entire driver core probe framework and
just calls the init function, so you are supposed to initialize the
device when that init function is called.

If you want your device/driver to follow the proper platform driver
path (which is recommended), then you need to use the
IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
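
As a rough sketch of what that could look like here (kernel code, so it only builds inside a kernel tree; aplic_of_init is a hypothetical name and its body is a placeholder, not the actual APLIC setup):

```c
#include <linux/irqchip.h>
#include <linux/of.h>

/* Hypothetical init callback; the real APLIC setup would go here */
static int aplic_of_init(struct device_node *node,
			 struct device_node *parent)
{
	return 0;
}

/* These macros from include/linux/irqchip.h build the platform driver */
IRQCHIP_PLATFORM_DRIVER_BEGIN(riscv_aplic)
IRQCHIP_MATCH("riscv,aplic", aplic_of_init)
IRQCHIP_PLATFORM_DRIVER_END(riscv_aplic)
```

The macros generate the of_device_id match table and a builtin platform driver whose probe calls the init callback.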

I offered to help you debug this issue and I asked for a dts file that
corresponds to a board you are testing this on and seeing an issue.
But you haven't answered my question [1] and are pointing to some
random commit and blaming it. That commit has no impact on any
existing devices/drivers.

Hi Marc,

Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
is used or until Anup actually works with us to debug the real issue.

-Saravana
[1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/


> diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> new file mode 100644
> index 000000000000..97e198ea0109
> --- /dev/null
> +++ b/include/linux/irqchip/riscv-aplic.h
> @@ -0,0 +1,119 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> +
> +#include <linux/bitops.h>
> +
> +#define APLIC_MAX_IDC                  BIT(14)
> +#define APLIC_MAX_SOURCE               1024
> +
> +#define APLIC_DOMAINCFG                        0x0000
> +#define APLIC_DOMAINCFG_RDONLY         0x80000000
> +#define APLIC_DOMAINCFG_IE             BIT(8)
> +#define APLIC_DOMAINCFG_DM             BIT(2)
> +#define APLIC_DOMAINCFG_BE             BIT(0)
> +
> +#define APLIC_SOURCECFG_BASE           0x0004
> +#define APLIC_SOURCECFG_D              BIT(10)
> +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x000003ff
> +#define APLIC_SOURCECFG_SM_MASK        0x00000007
> +#define APLIC_SOURCECFG_SM_INACTIVE    0x0
> +#define APLIC_SOURCECFG_SM_DETACH      0x1
> +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
> +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
> +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
> +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
> +
> +#define APLIC_MMSICFGADDR              0x1bc0
> +#define APLIC_MMSICFGADDRH             0x1bc4
> +#define APLIC_SMSICFGADDR              0x1bc8
> +#define APLIC_SMSICFGADDRH             0x1bcc
> +
> +#ifdef CONFIG_RISCV_M_MODE
> +#define APLIC_xMSICFGADDR              APLIC_MMSICFGADDR
> +#define APLIC_xMSICFGADDRH             APLIC_MMSICFGADDRH
> +#else
> +#define APLIC_xMSICFGADDR              APLIC_SMSICFGADDR
> +#define APLIC_xMSICFGADDRH             APLIC_SMSICFGADDRH
> +#endif
> +
> +#define APLIC_xMSICFGADDRH_L           BIT(31)
> +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
> +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
> +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
> +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
> +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
> +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
> +#define APLIC_xMSICFGADDRH_LHXW_MASK   0xf
> +#define APLIC_xMSICFGADDRH_LHXW_SHIFT  12
> +#define APLIC_xMSICFGADDRH_BAPPN_MASK  0xfff
> +
> +#define APLIC_xMSICFGADDR_PPN_SHIFT    12
> +
> +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> +       (BIT(__lhxs) - 1)
> +
> +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> +       (BIT(__lhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> +       ((__lhxs))
> +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> +       (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> +        APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> +
> +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> +       (BIT(__hhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> +       ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> +       (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> +        APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> +
> +#define APLIC_IRQBITS_PER_REG          32
> +
> +#define APLIC_SETIP_BASE               0x1c00
> +#define APLIC_SETIPNUM                 0x1cdc
> +
> +#define APLIC_CLRIP_BASE               0x1d00
> +#define APLIC_CLRIPNUM                 0x1ddc
> +
> +#define APLIC_SETIE_BASE               0x1e00
> +#define APLIC_SETIENUM                 0x1edc
> +
> +#define APLIC_CLRIE_BASE               0x1f00
> +#define APLIC_CLRIENUM                 0x1fdc
> +
> +#define APLIC_SETIPNUM_LE              0x2000
> +#define APLIC_SETIPNUM_BE              0x2004
> +
> +#define APLIC_GENMSI                   0x3000
> +
> +#define APLIC_TARGET_BASE              0x3004
> +#define APLIC_TARGET_HART_IDX_SHIFT    18
> +#define APLIC_TARGET_HART_IDX_MASK     0x3fff
> +#define APLIC_TARGET_GUEST_IDX_SHIFT   12
> +#define APLIC_TARGET_GUEST_IDX_MASK    0x3f
> +#define APLIC_TARGET_IPRIO_MASK        0xff
> +#define APLIC_TARGET_EIID_MASK 0x7ff
> +
> +#define APLIC_IDC_BASE                 0x4000
> +#define APLIC_IDC_SIZE                 32
> +
> +#define APLIC_IDC_IDELIVERY            0x00
> +
> +#define APLIC_IDC_IFORCE               0x04
> +
> +#define APLIC_IDC_ITHRESHOLD           0x08
> +
> +#define APLIC_IDC_TOPI                 0x18
> +#define APLIC_IDC_TOPI_ID_SHIFT        16
> +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> +#define APLIC_IDC_TOPI_PRIO_MASK       0xff
> +
> +#define APLIC_IDC_CLAIMI               0x1c
> +
> +#endif
> --
> 2.34.1
>

* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-15 19:17   ` Saravana Kannan
@ 2023-06-15 19:31     ` Conor Dooley
  2023-06-15 20:45       ` Saravana Kannan
  2023-06-16  2:01     ` Anup Patel
  1 sibling, 1 reply; 28+ messages in thread
From: Conor Dooley @ 2023-06-15 19:31 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Anup Patel, linux-riscv, linux-kernel, devicetree,
	iommu

Hey Saravana,

On Thu, Jun 15, 2023 at 12:17:08PM -0700, Saravana Kannan wrote:
> On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:

btw, please try to delete the 100s of lines of unrelated context when
replying

> > +static int __init aplic_dt_init(struct device_node *node,
> > +                               struct device_node *parent)
> > +{
> > +       /*
> > +        * The APLIC platform driver needs to be probed early
> > +        * so for device tree:
> > +        *
> > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > +        *    provides a hint to the device driver core to probe the
> > +        *    platform driver early.
> > +        * 2) Clear the OF_POPULATED flag in device_node because
> > +        *    of_irq_init() sets it which prevents creation of
> > +        *    platform device.
> > +        */
> > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> 
> NACK. You are blindly plastering flags without trying to understand
> the real issue and fixing this correctly.
> 
> > +       of_node_clear_flag(node, OF_POPULATED);
> > +       return 0;
> > +}
> > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> 
> This macro pretty much skips the entire driver core framework to probe
> and calls init and you are supposed to initialize the device when the
> init function is called.
> 
> If you want your device/driver to follow the proper platform driver
> path (which is recommended), then you need to use the
> IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> 
> I offered to help you debug this issue and I asked for a dts file that
> corresponds to a board you are testing this on and seeing an issue.

There isn't a dts file for this because there's no publicly available
hardware that actually has an APLIC. Maybe Ventana have pre-production
silicon that has it, but otherwise it's a QEMU job.

Cheers,
Conor.

> But you haven't answered my question [1] and are pointing to some
> random commit and blaming it. That commit has no impact on any
> existing devices/drivers.
> 
> Hi Marc,
> 
> Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> is used or until Anup actually works with us to debug the real issue.
> 
> -Saravana
> [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/


* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-15 19:31     ` Conor Dooley
@ 2023-06-15 20:45       ` Saravana Kannan
  2023-06-15 21:11         ` Conor Dooley
  0 siblings, 1 reply; 28+ messages in thread
From: Saravana Kannan @ 2023-06-15 20:45 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Anup Patel, linux-riscv, linux-kernel, devicetree,
	iommu

On Thu, Jun 15, 2023 at 12:31 PM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Saravana,
>
> On Thu, Jun 15, 2023 at 12:17:08PM -0700, Saravana Kannan wrote:
> > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
>
> btw, please try to delete the 100s of lines of unrelated context when
> replying

I always feel like some people like me to do this and others don't.
Also, at times, people might want to reference the other lines of code
when replying to my point. That's why I generally leave them in.

>
> > > +static int __init aplic_dt_init(struct device_node *node,
> > > +                               struct device_node *parent)
> > > +{
> > > +       /*
> > > +        * The APLIC platform driver needs to be probed early
> > > +        * so for device tree:
> > > +        *
> > > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > +        *    provides a hint to the device driver core to probe the
> > > +        *    platform driver early.
> > > +        * 2) Clear the OF_POPULATED flag in device_node because
> > > +        *    of_irq_init() sets it which prevents creation of
> > > +        *    platform device.
> > > +        */
> > > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> >
> > NACK. You are blindly plastering flags without trying to understand
> > the real issue and fixing this correctly.
> >
> > > +       of_node_clear_flag(node, OF_POPULATED);

Also, this part is not needed if the macros I mentioned below are used.

> > > +       return 0;
> > > +}
> > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> >
> > This macro pretty much skips the entire driver core framework to probe
> > and calls init and you are supposed to initialize the device when the
> > init function is called.
> >
> > If you want your device/driver to follow the proper platform driver
> > path (which is recommended), then you need to use the
> > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> >
> > I offered to help you debug this issue and I asked for a dts file that
> > corresponds to a board you are testing this on and seeing an issue.
>
> There isn't a dts file for this because there's no publicly available
> hardware that actually has an APLIC. Maybe Ventana have pre-production
> silicon that has it, but otherwise it's a QEMU job.

1. QEMU example is fine too if it can be reproduced. I just asked for
a dts file because I need the full global view of the dependencies. At
a minimum, I'd at least expect to see some example DT and explanation
of what dependency is causing the IRQ device to not be initialized on
time, etc. Instead I just see random uses of flags with no description
of the actual issue.

2. If there's no dts available upstream, why should these drivers be
accepted? I thought the norm was to only accept drivers that can
actually be used.

-Saravana

>
> Cheers,
> Conor.
>
> > But you haven't answered my question [1] and are pointing to some
> > random commit and blaming it. That commit has no impact on any
> > existing devices/drivers.
> >
> > Hi Marc,
> >
> > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > is used or until Anup actually works with us to debug the real issue.
> >
> > -Saravana
> > [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/

* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-15 20:45       ` Saravana Kannan
@ 2023-06-15 21:11         ` Conor Dooley
  0 siblings, 0 replies; 28+ messages in thread
From: Conor Dooley @ 2023-06-15 21:11 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Anup Patel, linux-riscv, linux-kernel, devicetree,
	iommu

On Thu, Jun 15, 2023 at 01:45:55PM -0700, Saravana Kannan wrote:
> On Thu, Jun 15, 2023 at 12:31 PM Conor Dooley <conor@kernel.org> wrote:
> > On Thu, Jun 15, 2023 at 12:17:08PM -0700, Saravana Kannan wrote:
> > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
> >
> > btw, please try to delete the 100s of lines of unrelated context when
> > replying
> 
> I always feel like some people like me to do this and others don't.
> Also, at times, people might want to reference the other lines of code
> when replying to my point. That's why I generally leave them in.

Yah, perhaps I cull too aggressively but there's a middle ground ;)

> > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > +                               struct device_node *parent)
> > > > +{
> > > > +       /*
> > > > +        * The APLIC platform driver needs to be probed early
> > > > +        * so for device tree:
> > > > +        *
> > > > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > +        *    provides a hint to the device driver core to probe the
> > > > +        *    platform driver early.
> > > > +        * 2) Clear the OF_POPULATED flag in device_node because
> > > > +        *    of_irq_init() sets it which prevents creation of
> > > > +        *    platform device.
> > > > +        */
> > > > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > >
> > > NACK. You are blindly plastering flags without trying to understand
> > > the real issue and fixing this correctly.
> > >
> > > > +       of_node_clear_flag(node, OF_POPULATED);
> 
> Also, this part is not needed if the macros I mentioned below are used.
> 
> > > > +       return 0;
> > > > +}
> > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > >
> > > This macro pretty much skips the entire driver core framework to probe
> > > and calls init and you are supposed to initialize the device when the
> > > init function is called.
> > >
> > > If you want your device/driver to follow the proper platform driver
> > > path (which is recommended), then you need to use the
> > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > >
> > > I offered to help you debug this issue and I asked for a dts file that
> > > corresponds to a board you are testing this on and seeing an issue.
> >
> > There isn't a dts file for this because there's no publicly available
> > hardware that actually has an APLIC. Maybe Ventana have pre-production
> > silicon that has it, but otherwise it's a QEMU job.
> 
> 1. QEMU example is fine too if it can be reproduced. I just asked for
> a dts file because I need the full global view of the dependencies. At
> a minimum, I'd at least expect to see some example DT and explanation
> of what dependency is causing the IRQ device to not be initialized on
> time, etc. Instead I just see random uses of flags with no description
> of the actual issue.

It's Anup's responsibility to provide you with that information, I have
not reproduced this issue, so I won't mislead you with QEMU invocations
that may not be what's required to reproduce.

> 2. If there's no dts available upstream, why should these drivers be
> accepted? I thought the norm was to only accept drivers that can
> actually be used.

I think it's not unusual (and desirable?) to start the upstreaming
process for stuff before hardware is publicly available, so that once it
is, support is already upstream, or close to it. I do know that people have
tested this series on FPGA-based hardware emulation platforms etc.
Posting patches for it also helps avoid duplication of effort between
the various vendors in RISC-V land, who would otherwise have to write
their own drivers. Also, the documented RISC-V policy for accepting
support for ISA stuff says:
	We'll only accept patches for new modules or extensions if the
	specifications for those modules or extensions are listed as being
	unlikely to be incompatibly changed in the future.  For
	specifications from the RISC-V foundation this means "Frozen"
	(Documentation/riscv/patch-acceptance.rst)
AIA (the spec behind the APLIC/IMSIC) is frozen, and qualifies from a
RISC-V point of view. What Marc is willing to accept, in terms of
pre-production hardware support, is up to him obviously!

Cheers,
Conor.

> > > But you haven't answered my question [1] and are pointing to some
> > > random commit and blaming it. That commit has no impact on any
> > > existing devices/drivers.
> > >
> > > Hi Marc,
> > >
> > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > is used or until Anup actually works with us to debug the real issue.
> > >
> > > -Saravana
> > > [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/

* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-15 19:17   ` Saravana Kannan
  2023-06-15 19:31     ` Conor Dooley
@ 2023-06-16  2:01     ` Anup Patel
  2023-06-16 22:05       ` Saravana Kannan
  1 sibling, 1 reply; 28+ messages in thread
From: Anup Patel @ 2023-06-16  2:01 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Conor Dooley, linux-riscv, linux-kernel,
	devicetree, iommu

On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <saravanak@google.com> wrote:
>
> On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new interrupt controller for managing wired interrupts on a RISC-V
> > platform. This new interrupt controller is referred to as advanced
> > platform-level interrupt controller (APLIC) which can forward wired
> > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > signaled interrupts.
> > > (For more details, refer to https://github.com/riscv/riscv-aia)
> >
> > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  drivers/irqchip/Kconfig             |   6 +
> >  drivers/irqchip/Makefile            |   1 +
> >  drivers/irqchip/irq-riscv-aplic.c   | 765 ++++++++++++++++++++++++++++
> >  include/linux/irqchip/riscv-aplic.h | 119 +++++
> >  4 files changed, 891 insertions(+)
> >  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> >  create mode 100644 include/linux/irqchip/riscv-aplic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index d700980372ef..834c0329f583 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> >         select IRQ_DOMAIN_HIERARCHY
> >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_APLIC
> > +       bool
> > +       depends on RISCV
> > +       select IRQ_DOMAIN_HIERARCHY
> > +       select GENERIC_MSI_IRQ
> > +
> >  config RISCV_IMSIC
> >         bool
> >         depends on RISCV
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 577bde3e986b..438b8e1a152c 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
> >  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > new file mode 100644
> > index 000000000000..1e710fdf5608
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > @@ -0,0 +1,765 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > +#include <linux/bitops.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-aplic.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/smp.h>
> > +
> > +#define APLIC_DEFAULT_PRIORITY         1
> > +#define APLIC_DISABLE_IDELIVERY                0
> > +#define APLIC_ENABLE_IDELIVERY         1
> > +#define APLIC_DISABLE_ITHRESHOLD       1
> > +#define APLIC_ENABLE_ITHRESHOLD                0
> > +
> > +struct aplic_msicfg {
> > +       phys_addr_t             base_ppn;
> > +       u32                     hhxs;
> > +       u32                     hhxw;
> > +       u32                     lhxs;
> > +       u32                     lhxw;
> > +};
> > +
> > +struct aplic_idc {
> > +       unsigned int            hart_index;
> > +       void __iomem            *regs;
> > +       struct aplic_priv       *priv;
> > +};
> > +
> > +struct aplic_priv {
> > +       struct fwnode_handle    *fwnode;
> > +       u32                     gsi_base;
> > +       u32                     nr_irqs;
> > +       u32                     nr_idcs;
> > +       void __iomem            *regs;
> > +       struct irq_domain       *irqdomain;
> > +       struct aplic_msicfg     msicfg;
> > +       struct cpumask          lmask;
> > +};
> > +
> > +static unsigned int aplic_idc_parent_irq;
> > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > +
> > +static void aplic_irq_unmask(struct irq_data *d)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > +
> > +       if (!priv->nr_idcs)
> > +               irq_chip_unmask_parent(d);
> > +}
> > +
> > +static void aplic_irq_mask(struct irq_data *d)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > +
> > +       if (!priv->nr_idcs)
> > +               irq_chip_mask_parent(d);
> > +}
> > +
> > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > +{
> > +       u32 val = 0;
> > +       void __iomem *sourcecfg;
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       switch (type) {
> > +       case IRQ_TYPE_NONE:
> > +               val = APLIC_SOURCECFG_SM_INACTIVE;
> > +               break;
> > +       case IRQ_TYPE_LEVEL_LOW:
> > +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > +               break;
> > +       case IRQ_TYPE_LEVEL_HIGH:
> > +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > +               break;
> > +       case IRQ_TYPE_EDGE_FALLING:
> > +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > +               break;
> > +       case IRQ_TYPE_EDGE_RISING:
> > +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > +               break;
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +
> > +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > +       writel(val, sourcecfg);
> > +
> > +       return 0;
> > +}
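
For readers following the register arithmetic in aplic_set_type(): APLIC sources are numbered from 1, so the register for source N sits (N - 1) words past the base. A minimal userspace sketch of that offset calculation follows. The two base offsets are assumptions taken from the AIA register map (the quoted patch only uses the names), and src_reg_off() is a hypothetical helper, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed offsets from the AIA register map; the patch only shows the names. */
#define SOURCECFG_BASE 0x0004u
#define TARGET_BASE    0x3004u

/* Byte offset of a per-source register; sources are numbered from 1. */
static uint32_t src_reg_off(uint32_t base, uint32_t hwirq)
{
	assert(hwirq >= 1);
	return base + (hwirq - 1) * (uint32_t)sizeof(uint32_t);
}
```

Under these assumptions, src_reg_off(SOURCECFG_BASE, 1) is 0x0004 and src_reg_off(TARGET_BASE, 3) is 0x300c. The same pattern is used for the target registers later in the patch.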
> > +
> > +static void aplic_irq_eoi(struct irq_data *d)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       u32 reg_off, reg_mask;
> > +
> > +       /*
> > +        * EOI handling is required only for level-triggered
> > +        * interrupts in APLIC MSI mode.
> > +        */
> > +
> > +       if (priv->nr_idcs)
> > +               return;
> > +
> > +       reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > +       reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > +       switch (irqd_get_trigger_type(d)) {
> > +       case IRQ_TYPE_LEVEL_LOW:
> > +               if (!(readl(priv->regs + reg_off) & reg_mask))
> > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > +               break;
> > +       case IRQ_TYPE_LEVEL_HIGH:
> > +               if (readl(priv->regs + reg_off) & reg_mask)
> > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > +               break;
> > +       }
> > +}
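
The reg_off/reg_mask computation in aplic_irq_eoi() picks one bit out of an array of 32-bit registers indexed by source number. A small sketch of that lookup is below. CLRIP_BASE and BITS_PER_REG are assumed values for APLIC_CLRIP_BASE and APLIC_IRQBITS_PER_REG (the header is not quoted here), and clrip_bit() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define CLRIP_BASE   0x1d00u /* assumed value of APLIC_CLRIP_BASE */
#define BITS_PER_REG 32u     /* assumed value of APLIC_IRQBITS_PER_REG */

struct bit_loc {
	uint32_t reg_off; /* byte offset of the 32-bit register */
	uint32_t mask;    /* single-bit mask inside that register */
};

/* Locate the bit for a given source number, as the EOI path does. */
static struct bit_loc clrip_bit(uint32_t hwirq)
{
	struct bit_loc loc;

	assert(hwirq >= 1);
	loc.reg_off = CLRIP_BASE + (hwirq / BITS_PER_REG) * 4;
	loc.mask = UINT32_C(1) << (hwirq % BITS_PER_REG);
	return loc;
}
```

With these assumed values, source 33 lands in the second register (offset 0x1d04) at bit 1.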
> > +
> > +#ifdef CONFIG_SMP
> > +static int aplic_set_affinity(struct irq_data *d,
> > +                             const struct cpumask *mask_val, bool force)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       struct aplic_idc *idc;
> > +       unsigned int cpu, val;
> > +       struct cpumask amask;
> > +       void __iomem *target;
> > +
> > +       if (!priv->nr_idcs)
> > +               return irq_chip_set_affinity_parent(d, mask_val, force);
> > +
> > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > +       if (force)
> > +               cpu = cpumask_first(&amask);
> > +       else
> > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > +       if (cpu >= nr_cpu_ids)
> > +               return -EINVAL;
> > +
> > +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> > +       target = priv->regs + APLIC_TARGET_BASE;
> > +       target += (d->hwirq - 1) * sizeof(u32);
> > +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > +       val |= APLIC_DEFAULT_PRIORITY;
> > +       writel(val, target);
> > +
> > +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > +
> > +       return IRQ_SET_MASK_OK_DONE;
> > +}
> > +#endif
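
As a rough illustration of what aplic_set_affinity() writes in direct mode: the TARGET register packs the hart index into the upper field and the priority into the low bits. The sketch below assumes the field layout from the AIA spec (hart index at bit 18, 14 bits wide; priority in the low 8 bits). Those constants and the target_direct() helper are assumptions for illustration; unlike the patch, the helper also range-checks the priority.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TARGET field layout in direct mode (per the AIA spec). */
#define TARGET_HART_IDX_SHIFT 18
#define TARGET_HART_IDX_MASK  0x3fffu
#define TARGET_IPRIO_MASK     0xffu

/* Pack a hart index and a non-zero priority into a TARGET value. */
static uint32_t target_direct(uint32_t hart_index, uint32_t prio)
{
	uint32_t val;

	assert(prio >= 1 && prio <= TARGET_IPRIO_MASK);
	val = (hart_index & TARGET_HART_IDX_MASK) << TARGET_HART_IDX_SHIFT;
	val |= prio;
	return val;
}
```

For example, hart index 2 with the default priority 1 would give 0x80001 under this layout.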
> > +
> > +static struct irq_chip aplic_chip = {
> > +       .name           = "RISC-V APLIC",
> > +       .irq_mask       = aplic_irq_mask,
> > +       .irq_unmask     = aplic_irq_unmask,
> > +       .irq_set_type   = aplic_set_type,
> > +       .irq_eoi        = aplic_irq_eoi,
> > +#ifdef CONFIG_SMP
> > +       .irq_set_affinity = aplic_set_affinity,
> > +#endif
> > +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> > +                         IRQCHIP_SKIP_SET_WAKE |
> > +                         IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > +                                    u32 gsi_base,
> > +                                    unsigned long *hwirq,
> > +                                    unsigned int *type)
> > +{
> > +       if (WARN_ON(fwspec->param_count < 2))
> > +               return -EINVAL;
> > +       if (WARN_ON(!fwspec->param[0]))
> > +               return -EINVAL;
> > +
> > +       /* For DT, gsi_base is always zero. */
> > +       *hwirq = fwspec->param[0] - gsi_base;
> > +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > +
> > +       WARN_ON(*type == IRQ_TYPE_NONE);
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > +                                        struct irq_fwspec *fwspec,
> > +                                        unsigned long *hwirq,
> > +                                        unsigned int *type)
> > +{
> > +       struct aplic_priv *priv = platform_msi_get_host_data(d);
> > +
> > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > +}
> > +
> > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > +                                    unsigned int virq, unsigned int nr_irqs,
> > +                                    void *arg)
> > +{
> > +       int i, ret;
> > +       unsigned int type;
> > +       irq_hw_number_t hwirq;
> > +       struct irq_fwspec *fwspec = arg;
> > +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > +
> > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > +       if (ret)
> > +               return ret;
> > +
> > +       for (i = 0; i < nr_irqs; i++) {
> > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > +                                   NULL, NULL);
> > +               /*
> > +                * APLIC does not implement irq_disable() so Linux interrupt
> > +                * subsystem will take a lazy approach for disabling an APLIC
> > +                * interrupt. This means APLIC interrupts are left unmasked
> > +                * upon system suspend and interrupts are not processed
> > +                * immediately upon system wake up. To tackle this, we disable
> > +                * the lazy approach for all APLIC interrupts.
> > +                */
> > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > +       .translate      = aplic_irqdomain_msi_translate,
> > +       .alloc          = aplic_irqdomain_msi_alloc,
> > +       .free           = platform_msi_device_domain_free,
> > +};
> > +
> > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > +                                        struct irq_fwspec *fwspec,
> > +                                        unsigned long *hwirq,
> > +                                        unsigned int *type)
> > +{
> > +       struct aplic_priv *priv = d->host_data;
> > +
> > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > +}
> > +
> > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > +                                    unsigned int virq, unsigned int nr_irqs,
> > +                                    void *arg)
> > +{
> > +       int i, ret;
> > +       unsigned int type;
> > +       irq_hw_number_t hwirq;
> > +       struct irq_fwspec *fwspec = arg;
> > +       struct aplic_priv *priv = domain->host_data;
> > +
> > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > +       if (ret)
> > +               return ret;
> > +
> > +       for (i = 0; i < nr_irqs; i++) {
> > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > +                                   NULL, NULL);
> > +               irq_set_affinity(virq + i, &priv->lmask);
> > +               /* See the reason described in aplic_irqdomain_msi_alloc() */
> > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > +       .translate      = aplic_irqdomain_idc_translate,
> > +       .alloc          = aplic_irqdomain_idc_alloc,
> > +       .free           = irq_domain_free_irqs_top,
> > +};
> > +
> > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > +{
> > +       int i;
> > +
> > +       /* Disable all interrupts */
> > +       for (i = 0; i <= priv->nr_irqs; i += 32)
> > +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > +                           (i / 32) * sizeof(u32));
> > +
> > +       /* Set interrupt type and default priority for all interrupts */
> > +       for (i = 1; i <= priv->nr_irqs; i++) {
> > +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > +                         (i - 1) * sizeof(u32));
> > +               writel(APLIC_DEFAULT_PRIORITY,
> > +                      priv->regs + APLIC_TARGET_BASE +
> > +                      (i - 1) * sizeof(u32));
> > +       }
> > +
> > +       /* Clear APLIC domaincfg */
> > +       writel(0, priv->regs + APLIC_DOMAINCFG);
> > +}
> > +
> > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > +{
> > +       u32 val;
> > +#ifdef CONFIG_RISCV_M_MODE
> > +       u32 valH;
> > +
> > +       if (!priv->nr_idcs) {
> > +               val = priv->msicfg.base_ppn;
> > +               valH = (priv->msicfg.base_ppn >> 32) &
> > +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> > +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> > +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > +       }
> > +#endif
> > +
> > +       /* Setup APLIC domaincfg register */
> > +       val = readl(priv->regs + APLIC_DOMAINCFG);
> > +       val |= APLIC_DOMAINCFG_IE;
> > +       if (!priv->nr_idcs)
> > +               val |= APLIC_DOMAINCFG_DM;
> > +       writel(val, priv->regs + APLIC_DOMAINCFG);
> > +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > +               pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > +                       priv->fwnode, val);
> > +}
> > +
> > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > +{
> > +       unsigned int group_index, hart_index, guest_index, val;
> > +       struct irq_data *d = irq_get_irq_data(desc->irq);
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       struct aplic_msicfg *mc = &priv->msicfg;
> > +       phys_addr_t tppn, tbppn, msg_addr;
> > +       void __iomem *target;
> > +
> > +       /* For zeroed MSI, simply write zero into the target register */
> > +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > +               target = priv->regs + APLIC_TARGET_BASE;
> > +               target += (d->hwirq - 1) * sizeof(u32);
> > +               writel(0, target);
> > +               return;
> > +       }
> > +
> > +       /* Sanity check on message data */
> > +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > +
> > +       /* Compute target MSI address */
> > +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > +
> > +       /* Compute target HART Base PPN */
> > +       tbppn = tppn;
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > +       WARN_ON(tbppn != mc->base_ppn);
> > +
> > +       /* Compute target group and hart indexes */
> > +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > +       hart_index |= (group_index << mc->lhxw);
> > +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > +
> > +       /* Compute target guest index */
> > +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > +
> > +       /* Update IRQ TARGET register */
> > +       target = priv->regs + APLIC_TARGET_BASE;
> > +       target += (d->hwirq - 1) * sizeof(u32);
> > +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > +                               << APLIC_TARGET_HART_IDX_SHIFT;
> > +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> > +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > +       writel(val, target);
> > +}
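
The address decomposition in aplic_msi_write_msg() may be easier to follow with concrete numbers. The sketch below mirrors the steps in userspace: shift the MSI address down to a PPN, pull out the group, hart and guest fields, and repack them into a TARGET value. All constants and the expansion of the APLIC_xMSICFGADDR_PPN_* macros are assumptions (the header is not part of the quoted hunk), and msi_decode() and target_msi() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define PPN_SHIFT              12      /* assumed APLIC_xMSICFGADDR_PPN_SHIFT */
#define TARGET_HART_IDX_SHIFT  18      /* assumed TARGET layout in MSI mode */
#define TARGET_HART_IDX_MASK   0x3fffu
#define TARGET_GUEST_IDX_SHIFT 12
#define TARGET_GUEST_IDX_MASK  0x3fu
#define TARGET_EIID_MASK       0x7ffu

struct msicfg {
	uint64_t base_ppn;
	uint32_t hhxs, hhxw, lhxs, lhxw;
};

struct msi_target {
	uint64_t base_ppn;
	uint32_t hart_index;
	uint32_t guest_index;
};

/* Mask with the low 'bits' bits set. */
static uint64_t low_mask(uint32_t bits)
{
	assert(bits < 32);
	return (UINT64_C(1) << bits) - 1;
}

/* Split an MSI address into base PPN, hart index and guest index. */
static struct msi_target msi_decode(uint64_t addr, const struct msicfg *mc)
{
	uint64_t tppn = addr >> PPN_SHIFT;
	uint32_t hhx_shift = mc->hhxs + PPN_SHIFT; /* assumed HHX shift */
	struct msi_target t;
	uint32_t group;

	group = (uint32_t)((tppn >> hhx_shift) & low_mask(mc->hhxw));
	t.hart_index = (uint32_t)((tppn >> mc->lhxs) & low_mask(mc->lhxw));
	t.hart_index |= group << mc->lhxw;
	t.guest_index = (uint32_t)(tppn & low_mask(mc->lhxs));
	t.base_ppn = tppn & ~low_mask(mc->lhxs)
			  & ~(low_mask(mc->lhxw) << mc->lhxs)
			  & ~(low_mask(mc->hhxw) << hhx_shift);
	return t;
}

/* Pack the decoded fields and the MSI data (EIID) into a TARGET value. */
static uint32_t target_msi(const struct msi_target *t, uint32_t eiid)
{
	uint32_t val;

	val = (t->hart_index & TARGET_HART_IDX_MASK) << TARGET_HART_IDX_SHIFT;
	val |= (t->guest_index & TARGET_GUEST_IDX_MASK)
		<< TARGET_GUEST_IDX_SHIFT;
	val |= eiid & TARGET_EIID_MASK;
	return val;
}
```

With a made-up configuration of base_ppn 0x28000, lhxs = 1, lhxw = 2 and no group bits, the address 0x28005000 decodes to hart index 2, guest index 1 and base PPN 0x28000, which is the equality the WARN_ON(tbppn != mc->base_ppn) check in the patch expects.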
> > +
> > +static int aplic_setup_msi(struct aplic_priv *priv)
> > +{
> > +       struct aplic_msicfg *mc = &priv->msicfg;
> > +       const struct imsic_global_config *imsic_global;
> > +
> > +       /*
> > +        * The APLIC outgoing MSI config registers assume target MSI
> > +        * controller to be RISC-V AIA IMSIC controller.
> > +        */
> > +       imsic_global = imsic_get_global_config();
> > +       if (!imsic_global) {
> > +               pr_err("%pfwP: IMSIC global config not found\n",
> > +                       priv->fwnode);
> > +               return -ENODEV;
> > +       }
> > +
> > +       /* Find number of guest index bits (LHXS) */
> > +       mc->lhxs = imsic_global->guest_index_bits;
> > +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > +               pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
> > +                       priv->fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of HART index bits (LHXW) */
> > +       mc->lhxw = imsic_global->hart_index_bits;
> > +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > +               pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
> > +                       priv->fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of group index bits (HHXW) */
> > +       mc->hhxw = imsic_global->group_index_bits;
> > +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > +               pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
> > +                       priv->fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find first bit position of group index (HHXS) */
> > +       mc->hhxs = imsic_global->group_index_shift;
> > +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > +               pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > +                       priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > +               return -EINVAL;
> > +       }
> > +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > +               pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
> > +                       priv->fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Compute PPN base */
> > +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > +
> > +       /* Use all possible CPUs as lmask */
> > +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > + * which returns the highest-priority pending interrupt and clears the
> > + * pending bit of that interrupt. This process is repeated until the
> > + * CLAIMI register returns zero.
> > + */
> > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > +{
> > +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > +       irq_hw_number_t hw_irq;
> > +       int irq;
> > +
> > +       chained_irq_enter(chip, desc);
> > +
> > +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > +
> > +               if (unlikely(irq <= 0))
> > +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > +                                           hw_irq);
> > +               else
> > +                       generic_handle_irq(irq);
> > +       }
> > +
> > +       chained_irq_exit(chip, desc);
> > +}
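
The claim loop above can be modelled without hardware. The sketch below treats a zero-terminated array as the sequence of values successive CLAIMI reads would return and extracts the source number from each. The field layout (identity at bit 16, 10 bits wide) is an assumption based on the AIA spec, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define IDC_TOPI_ID_SHIFT 16      /* assumed APLIC_IDC_TOPI_ID_SHIFT */
#define IDC_TOPI_ID_MASK  0x3ffu  /* assumed width of the identity field */

/* Extract the source number from one CLAIMI value. */
static uint32_t claimi_hwirq(uint32_t claimi)
{
	return (claimi >> IDC_TOPI_ID_SHIFT) & IDC_TOPI_ID_MASK;
}

/*
 * Drain a zero-terminated sequence of simulated CLAIMI reads, storing
 * at most 'max' source numbers in 'out'. Returns the number stored.
 */
static unsigned int claimi_drain(const uint32_t *claims, uint32_t *out,
				 unsigned int max)
{
	unsigned int n = 0;
	uint32_t v;

	assert(claims && out);
	while (n < max && (v = *claims++) != 0)
		out[n++] = claimi_hwirq(v);
	return n;
}
```

A simulated read of 0x50001 would correspond to source 5 at priority 1 under this layout; the terminating zero plays the role of CLAIMI reporting that nothing is pending.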
> > +
> > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > +{
> > +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > +
> > +       /* Priority must be less than threshold for interrupt triggering */
> > +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > +
> > +       /* Delivery must be set to 1 for interrupt triggering */
> > +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > +}
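
The choice of 0 for "enable" and 1 for "disable" in the ithreshold defines can look backwards at first. As far as I read the AIA spec, a threshold of zero lets every enabled source through, while a non-zero threshold P blocks sources whose priority number is P or higher. A small sketch of that rule follows; idc_can_signal() is a hypothetical helper and the semantics are my reading of the spec, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Whether a pending, enabled source with priority 'prio' can signal the
 * hart, given the IDC's idelivery and ithreshold values.
 */
static bool idc_can_signal(uint32_t idelivery, uint32_t ithreshold,
			   uint32_t prio)
{
	assert(idelivery <= 1);
	if (!idelivery)
		return false;
	return ithreshold == 0 || prio < ithreshold;
}
```

With the default priority of 1 used by this driver, a threshold of 1 masks every source and a threshold of 0 unmasks them, which matches how aplic_idc_set_delivery() uses the two defines.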
> > +
> > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > +{
> > +       if (aplic_idc_parent_irq)
> > +               disable_percpu_irq(aplic_idc_parent_irq);
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > +{
> > +       if (aplic_idc_parent_irq)
> > +               enable_percpu_irq(aplic_idc_parent_irq,
> > +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_setup_idc(struct aplic_priv *priv)
> > +{
> > +       int i, j, rc, cpu, setup_count = 0;
> > +       struct fwnode_reference_args parent;
> > +       struct irq_domain *domain;
> > +       unsigned long hartid;
> > +       struct aplic_idc *idc;
> > +       u32 val;
> > +
> > +       /* Setup per-CPU IDC and target CPU mask */
> > +       for (i = 0; i < priv->nr_idcs; i++) {
> > +               rc = fwnode_property_get_reference_args(priv->fwnode,
> > +                               "interrupts-extended", "#interrupt-cells",
> > +                               0, i, &parent);
> > +               if (rc) {
> > +                       pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > +                               priv->fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               /*
> > +                * Skip interrupts other than external interrupts for
> > +                * current privilege level.
> > +                */
> > +               if (parent.args[0] != RV_IRQ_EXT)
> > +                       continue;
> > +
> > +               rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > +               if (rc) {
> > +                       pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > +                               priv->fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               cpu = riscv_hartid_to_cpuid(hartid);
> > +               if (cpu < 0) {
> > +                       pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > +                               priv->fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               cpumask_set_cpu(cpu, &priv->lmask);
> > +
> > +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> > +               idc->hart_index = i;
> > +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > +               idc->priv = priv;
> > +
> > +               aplic_idc_set_delivery(idc, true);
> > +
> > +               /*
> > +                * Boot cpu might not have APLIC hart_index = 0 so check
> > +                * and update target registers of all interrupts.
> > +                */
> > +               if (cpu == smp_processor_id() && idc->hart_index) {
> > +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > +                       val |= APLIC_DEFAULT_PRIORITY;
> > +                       for (j = 1; j <= priv->nr_irqs; j++)
> > +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> > +                                           (j - 1) * sizeof(u32));
> > +               }
> > +
> > +               setup_count++;
> > +       }
> > +
> > +       /* Find parent domain and register chained handler */
> > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > +                                         DOMAIN_BUS_ANY);
> > +       if (!aplic_idc_parent_irq && domain) {
> > +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > +               if (aplic_idc_parent_irq) {
> > +                       irq_set_chained_handler(aplic_idc_parent_irq,
> > +                                               aplic_idc_handle_irq);
> > +
> > +                       /*
> > +                        * Setup CPUHP notifier to enable IDC parent
> > +                        * interrupt on all CPUs
> > +                        */
> > +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > +                                         "irqchip/riscv/aplic:starting",
> > +                                         aplic_idc_starting_cpu,
> > +                                         aplic_idc_dying_cpu);
> > +               }
> > +       }
> > +
> > +       /* Fail if we were not able to setup IDC for any CPU */
> > +       return (setup_count) ? 0 : -ENODEV;
> > +}
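
The TARGET register value and the per-IDC register offset computed in
aplic_setup_idc() can be checked outside the kernel. Below is a minimal
user-space sketch, assuming only the constants quoted in this patch. The
helper names aplic_target_val(), aplic_target_off() and aplic_idc_off()
are hypothetical and are not part of the driver.

```c
#include <stdint.h>

/* Constants copied from the quoted riscv-aplic.h / irq-riscv-aplic.c */
#define APLIC_TARGET_BASE		0x3004
#define APLIC_TARGET_HART_IDX_SHIFT	18
#define APLIC_TARGET_HART_IDX_MASK	0x3fff
#define APLIC_DEFAULT_PRIORITY		1
#define APLIC_IDC_BASE			0x4000
#define APLIC_IDC_SIZE			32

/* Hypothetical helper: value written to a TARGET register in direct mode */
static inline uint32_t aplic_target_val(uint32_t hart_index)
{
	uint32_t val = hart_index & APLIC_TARGET_HART_IDX_MASK;

	val <<= APLIC_TARGET_HART_IDX_SHIFT;
	return val | APLIC_DEFAULT_PRIORITY;
}

/* Hypothetical helper: byte offset of TARGET for a 1-based source number */
static inline uint32_t aplic_target_off(uint32_t irq)
{
	return APLIC_TARGET_BASE + (irq - 1) * (uint32_t)sizeof(uint32_t);
}

/* Hypothetical helper: byte offset of the IDC block for a given IDC index */
static inline uint32_t aplic_idc_off(uint32_t idc_index)
{
	return APLIC_IDC_BASE + idc_index * APLIC_IDC_SIZE;
}
```

With these definitions, a boot CPU whose APLIC hart index is 3 gets the
value (3 << 18) | 1 written to every TARGET register, which is what the
loop over priv->nr_irqs does in the patch.
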
> > +
> > +static int aplic_probe(struct platform_device *pdev)
> > +{
> > +       struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > +       struct fwnode_reference_args parent;
> > +       struct aplic_priv *priv;
> > +       struct resource *res;
> > +       phys_addr_t pa;
> > +       int rc;
> > +
> > +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +       priv->fwnode = fwnode;
> > +
> > +       /* Map the MMIO registers */
> > +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +       if (!res) {
> > +               pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +       priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > +       if (!priv->regs) {
> > +               pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > +               return -ENOMEM;
> > +       }
> > +
> > +       /*
> > +        * Find out GSI base number
> > +        *
> > +        * Note: DT does not define "riscv,gsi-base" property so GSI
> > +        * base is always zero for DT.
> > +        */
> > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > +                                           &priv->gsi_base, 1);
> > +       if (rc)
> > +               priv->gsi_base = 0;
> > +
> > +       /* Find out number of interrupt sources */
> > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > +                                           &priv->nr_irqs, 1);
> > +       if (rc) {
> > +               pr_err("%pfwP: failed to get number of interrupt sources\n",
> > +                       fwnode);
> > +               return rc;
> > +       }
> > +
> > +       /* Set up the initial state of APLIC interrupts */
> > +       aplic_init_hw_irqs(priv);
> > +
> > +       /*
> > +        * Find out number of IDCs based on parent interrupts
> > +        *
> > +        * If "msi-parent" property is present then we ignore the
> > +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > +        */
> > +       if (!fwnode_property_present(fwnode, "msi-parent")) {
> > +               while (!fwnode_property_get_reference_args(fwnode,
> > +                               "interrupts-extended", "#interrupt-cells",
> > +                               0, priv->nr_idcs, &parent))
> > +                       priv->nr_idcs++;
> > +       }
> > +
> > +       /* Setup IDCs or MSIs based on number of IDCs */
> > +       if (priv->nr_idcs)
> > +               rc = aplic_setup_idc(priv);
> > +       else
> > +               rc = aplic_setup_msi(priv);
> > +       if (rc) {
> > +               pr_err("%pfwP: failed to set up %s\n",
> > +                       fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > +               return rc;
> > +       }
> > +
> > +       /* Setup global config and interrupt delivery */
> > +       aplic_init_hw_global(priv);
> > +
> > +       /* Create irq domain instance for the APLIC */
> > +       if (priv->nr_idcs)
> > +               priv->irqdomain = irq_domain_create_linear(
> > +                                               priv->fwnode,
> > +                                               priv->nr_irqs + 1,
> > +                                               &aplic_irqdomain_idc_ops,
> > +                                               priv);
> > +       else
> > +               priv->irqdomain = platform_msi_create_device_domain(
> > +                                               &pdev->dev,
> > +                                               priv->nr_irqs + 1,
> > +                                               aplic_msi_write_msg,
> > +                                               &aplic_irqdomain_msi_ops,
> > +                                               priv);
> > +       if (!priv->irqdomain) {
> > +               pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > +               return -ENOMEM;
> > +       }
> > +
> > +       /* Advertise the interrupt controller */
> > +       if (priv->nr_idcs) {
> > +               pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > +                       priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > +       } else {
> > +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > +               pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > +                       priv->fwnode, priv->nr_irqs, &pa);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct of_device_id aplic_match[] = {
> > +       { .compatible = "riscv,aplic" },
> > +       {}
> > +};
> > +
> > +static struct platform_driver aplic_driver = {
> > +       .driver = {
> > +               .name           = "riscv-aplic",
> > +               .of_match_table = aplic_match,
> > +       },
> > +       .probe = aplic_probe,
> > +};
> > +builtin_platform_driver(aplic_driver);
> > +
> > +static int __init aplic_dt_init(struct device_node *node,
> > +                               struct device_node *parent)
> > +{
> > +       /*
> > +        * The APLIC platform driver needs to be probed early
> > +        * so for device tree:
> > +        *
> > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > +        *    provides a hint to the device driver core to probe the
> > +        *    platform driver early.
> > +        * 2) Clear the OF_POPULATED flag in device_node because
> > +        *    of_irq_init() sets it which prevents creation of
> > +        *    platform device.
> > +        */
> > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
>
> NACK. You are blindly plastering flags without trying to understand
> the real issue and fixing this correctly.
>
> > +       of_node_clear_flag(node, OF_POPULATED);
> > +       return 0;
> > +}
> > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
>
> This macro pretty much bypasses the driver core's probe path and calls
> your init function directly, so you are supposed to initialize the
> device from within that init function.
>
> If you want your device/driver to follow the proper platform driver
> path (which is recommended), then you need to use the
> IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
>
> I offered to help you debug this issue and I asked for a dts file that
> corresponds to a board you are testing this on and seeing an issue.
> But you haven't answered my question [1] and are pointing to some
> random commit and blaming it. That commit has no impact on any
> existing devices/drivers.
>
> Hi Marc,
>
> Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> is used or until Anup actually works with us to debug the real issue.

Maybe I misread your previous comment.

You can easily reproduce the issue on the QEMU virt machine for RISC-V:
1) Build qemu-system-riscv64 from the latest QEMU master
2) Build the kernel from the riscv_aia_v4 branch at https://github.com/avpatel/linux.git
(Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from the
 APLIC driver before building the kernel)
3) Boot an APLIC-only system on the QEMU virt machine
    qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
    -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
    -kernel ./build-riscv64/arch/riscv/boot/Image \
    -append "root=/dev/ram rw console=ttyS0 earlycon" \
    -initrd ./rootfs_riscv64.img

I hope the above steps help you reproduce the issue. I will certainly
test whatever fix you propose.

Regards,
Anup


>
> -Saravana
> [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/
>
>
> > diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> > new file mode 100644
> > index 000000000000..97e198ea0109
> > --- /dev/null
> > +++ b/include/linux/irqchip/riscv-aplic.h
> > @@ -0,0 +1,119 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> > +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> > +
> > +#include <linux/bitops.h>
> > +
> > +#define APLIC_MAX_IDC                  BIT(14)
> > +#define APLIC_MAX_SOURCE               1024
> > +
> > +#define APLIC_DOMAINCFG                        0x0000
> > +#define APLIC_DOMAINCFG_RDONLY         0x80000000
> > +#define APLIC_DOMAINCFG_IE             BIT(8)
> > +#define APLIC_DOMAINCFG_DM             BIT(2)
> > +#define APLIC_DOMAINCFG_BE             BIT(0)
> > +
> > +#define APLIC_SOURCECFG_BASE           0x0004
> > +#define APLIC_SOURCECFG_D              BIT(10)
> > +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x000003ff
> > +#define APLIC_SOURCECFG_SM_MASK        0x00000007
> > +#define APLIC_SOURCECFG_SM_INACTIVE    0x0
> > +#define APLIC_SOURCECFG_SM_DETACH      0x1
> > +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
> > +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
> > +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
> > +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
> > +
> > +#define APLIC_MMSICFGADDR              0x1bc0
> > +#define APLIC_MMSICFGADDRH             0x1bc4
> > +#define APLIC_SMSICFGADDR              0x1bc8
> > +#define APLIC_SMSICFGADDRH             0x1bcc
> > +
> > +#ifdef CONFIG_RISCV_M_MODE
> > +#define APLIC_xMSICFGADDR              APLIC_MMSICFGADDR
> > +#define APLIC_xMSICFGADDRH             APLIC_MMSICFGADDRH
> > +#else
> > +#define APLIC_xMSICFGADDR              APLIC_SMSICFGADDR
> > +#define APLIC_xMSICFGADDRH             APLIC_SMSICFGADDRH
> > +#endif
> > +
> > +#define APLIC_xMSICFGADDRH_L           BIT(31)
> > +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
> > +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
> > +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
> > +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
> > +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
> > +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
> > +#define APLIC_xMSICFGADDRH_LHXW_MASK   0xf
> > +#define APLIC_xMSICFGADDRH_LHXW_SHIFT  12
> > +#define APLIC_xMSICFGADDRH_BAPPN_MASK  0xfff
> > +
> > +#define APLIC_xMSICFGADDR_PPN_SHIFT    12
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> > +       (BIT(__lhxs) - 1)
> > +
> > +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> > +       (BIT(__lhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> > +       ((__lhxs))
> > +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> > +       (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> > +        APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> > +       (BIT(__hhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> > +       ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> > +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> > +       (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> > +        APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> > +
> > +#define APLIC_IRQBITS_PER_REG          32
> > +
> > +#define APLIC_SETIP_BASE               0x1c00
> > +#define APLIC_SETIPNUM                 0x1cdc
> > +
> > +#define APLIC_CLRIP_BASE               0x1d00
> > +#define APLIC_CLRIPNUM                 0x1ddc
> > +
> > +#define APLIC_SETIE_BASE               0x1e00
> > +#define APLIC_SETIENUM                 0x1edc
> > +
> > +#define APLIC_CLRIE_BASE               0x1f00
> > +#define APLIC_CLRIENUM                 0x1fdc
> > +
> > +#define APLIC_SETIPNUM_LE              0x2000
> > +#define APLIC_SETIPNUM_BE              0x2004
> > +
> > +#define APLIC_GENMSI                   0x3000
> > +
> > +#define APLIC_TARGET_BASE              0x3004
> > +#define APLIC_TARGET_HART_IDX_SHIFT    18
> > +#define APLIC_TARGET_HART_IDX_MASK     0x3fff
> > +#define APLIC_TARGET_GUEST_IDX_SHIFT   12
> > +#define APLIC_TARGET_GUEST_IDX_MASK    0x3f
> > +#define APLIC_TARGET_IPRIO_MASK        0xff
> > +#define APLIC_TARGET_EIID_MASK 0x7ff
> > +
> > +#define APLIC_IDC_BASE                 0x4000
> > +#define APLIC_IDC_SIZE                 32
> > +
> > +#define APLIC_IDC_IDELIVERY            0x00
> > +
> > +#define APLIC_IDC_IFORCE               0x04
> > +
> > +#define APLIC_IDC_ITHRESHOLD           0x08
> > +
> > +#define APLIC_IDC_TOPI                 0x18
> > +#define APLIC_IDC_TOPI_ID_SHIFT        16
> > +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> > +#define APLIC_IDC_TOPI_PRIO_MASK       0xff
> > +
> > +#define APLIC_IDC_CLAIMI               0x1c
> > +
> > +#endif
> > --
> > 2.34.1
> >


* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-16  2:01     ` Anup Patel
@ 2023-06-16 22:05       ` Saravana Kannan
  2023-06-19  6:13         ` Anup Patel
  0 siblings, 1 reply; 28+ messages in thread
From: Saravana Kannan @ 2023-06-16 22:05 UTC (permalink / raw)
  To: Anup Patel
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Conor Dooley, linux-riscv, linux-kernel,
	devicetree, iommu, Android Kernel Team

On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
> > >
> > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > platform. This new interrupt controller is referred to as advanced
> > > platform-level interrupt controller (APLIC) which can forward wired
> > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > signaled interrupts.
> > > > (For more details, refer to https://github.com/riscv/riscv-aia)
> > >
> > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > platforms.
> > >
> > > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > > ---
> > >  drivers/irqchip/Kconfig             |   6 +
> > >  drivers/irqchip/Makefile            |   1 +
> > >  drivers/irqchip/irq-riscv-aplic.c   | 765 ++++++++++++++++++++++++++++
> > >  include/linux/irqchip/riscv-aplic.h | 119 +++++
> > >  4 files changed, 891 insertions(+)
> > >  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > >  create mode 100644 include/linux/irqchip/riscv-aplic.h
> > >
> > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > index d700980372ef..834c0329f583 100644
> > > --- a/drivers/irqchip/Kconfig
> > > +++ b/drivers/irqchip/Kconfig
> > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > >         select IRQ_DOMAIN_HIERARCHY
> > >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > >
> > > +config RISCV_APLIC
> > > +       bool
> > > +       depends on RISCV
> > > +       select IRQ_DOMAIN_HIERARCHY
> > > +       select GENERIC_MSI_IRQ
> > > +
> > >  config RISCV_IMSIC
> > >         bool
> > >         depends on RISCV
> > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > index 577bde3e986b..438b8e1a152c 100644
> > > --- a/drivers/irqchip/Makefile
> > > +++ b/drivers/irqchip/Makefile
> > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> > >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> > >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> > >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > > +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
> > >  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> > >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> > >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > new file mode 100644
> > > index 000000000000..1e710fdf5608
> > > --- /dev/null
> > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > @@ -0,0 +1,765 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > + */
> > > +
> > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > +#include <linux/bitops.h>
> > > +#include <linux/cpu.h>
> > > +#include <linux/interrupt.h>
> > > +#include <linux/io.h>
> > > +#include <linux/irq.h>
> > > +#include <linux/irqchip.h>
> > > +#include <linux/irqchip/chained_irq.h>
> > > +#include <linux/irqchip/riscv-aplic.h>
> > > +#include <linux/irqchip/riscv-imsic.h>
> > > +#include <linux/irqdomain.h>
> > > +#include <linux/module.h>
> > > +#include <linux/msi.h>
> > > +#include <linux/platform_device.h>
> > > +#include <linux/smp.h>
> > > +
> > > +#define APLIC_DEFAULT_PRIORITY         1
> > > +#define APLIC_DISABLE_IDELIVERY                0
> > > +#define APLIC_ENABLE_IDELIVERY         1
> > > +#define APLIC_DISABLE_ITHRESHOLD       1
> > > +#define APLIC_ENABLE_ITHRESHOLD                0
> > > +
> > > +struct aplic_msicfg {
> > > +       phys_addr_t             base_ppn;
> > > +       u32                     hhxs;
> > > +       u32                     hhxw;
> > > +       u32                     lhxs;
> > > +       u32                     lhxw;
> > > +};
> > > +
> > > +struct aplic_idc {
> > > +       unsigned int            hart_index;
> > > +       void __iomem            *regs;
> > > +       struct aplic_priv       *priv;
> > > +};
> > > +
> > > +struct aplic_priv {
> > > +       struct fwnode_handle    *fwnode;
> > > +       u32                     gsi_base;
> > > +       u32                     nr_irqs;
> > > +       u32                     nr_idcs;
> > > +       void __iomem            *regs;
> > > +       struct irq_domain       *irqdomain;
> > > +       struct aplic_msicfg     msicfg;
> > > +       struct cpumask          lmask;
> > > +};
> > > +
> > > +static unsigned int aplic_idc_parent_irq;
> > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > +
> > > +static void aplic_irq_unmask(struct irq_data *d)
> > > +{
> > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +
> > > +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > +
> > > +       if (!priv->nr_idcs)
> > > +               irq_chip_unmask_parent(d);
> > > +}
> > > +
> > > +static void aplic_irq_mask(struct irq_data *d)
> > > +{
> > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +
> > > +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > +
> > > +       if (!priv->nr_idcs)
> > > +               irq_chip_mask_parent(d);
> > > +}
> > > +
> > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > +{
> > > +       u32 val = 0;
> > > +       void __iomem *sourcecfg;
> > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +
> > > +       switch (type) {
> > > +       case IRQ_TYPE_NONE:
> > > +               val = APLIC_SOURCECFG_SM_INACTIVE;
> > > +               break;
> > > +       case IRQ_TYPE_LEVEL_LOW:
> > > +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > +               break;
> > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > +               break;
> > > +       case IRQ_TYPE_EDGE_FALLING:
> > > +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > +               break;
> > > +       case IRQ_TYPE_EDGE_RISING:
> > > +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > +               break;
> > > +       default:
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > +       writel(val, sourcecfg);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void aplic_irq_eoi(struct irq_data *d)
> > > +{
> > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +       u32 reg_off, reg_mask;
> > > +
> > > +       /*
> > > > +        * EOI handling is required only for level-triggered
> > > > +        * interrupts in APLIC MSI mode.
> > > +        */
> > > +
> > > +       if (priv->nr_idcs)
> > > +               return;
> > > +
> > > +       reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > +       reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > +       switch (irqd_get_trigger_type(d)) {
> > > +       case IRQ_TYPE_LEVEL_LOW:
> > > +               if (!(readl(priv->regs + reg_off) & reg_mask))
> > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > +               break;
> > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > +               if (readl(priv->regs + reg_off) & reg_mask)
> > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > +               break;
> > > +       }
> > > +}
> > > +
> > > +#ifdef CONFIG_SMP
> > > +static int aplic_set_affinity(struct irq_data *d,
> > > +                             const struct cpumask *mask_val, bool force)
> > > +{
> > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +       struct aplic_idc *idc;
> > > +       unsigned int cpu, val;
> > > +       struct cpumask amask;
> > > +       void __iomem *target;
> > > +
> > > +       if (!priv->nr_idcs)
> > > +               return irq_chip_set_affinity_parent(d, mask_val, force);
> > > +
> > > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > > +
> > > +       if (force)
> > > +               cpu = cpumask_first(&amask);
> > > +       else
> > > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > +
> > > +       if (cpu >= nr_cpu_ids)
> > > +               return -EINVAL;
> > > +
> > > +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > +       val |= APLIC_DEFAULT_PRIORITY;
> > > +       writel(val, target);
> > > +
> > > +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > +
> > > +       return IRQ_SET_MASK_OK_DONE;
> > > +}
> > > +#endif
> > > +
> > > +static struct irq_chip aplic_chip = {
> > > +       .name           = "RISC-V APLIC",
> > > +       .irq_mask       = aplic_irq_mask,
> > > +       .irq_unmask     = aplic_irq_unmask,
> > > +       .irq_set_type   = aplic_set_type,
> > > +       .irq_eoi        = aplic_irq_eoi,
> > > +#ifdef CONFIG_SMP
> > > +       .irq_set_affinity = aplic_set_affinity,
> > > +#endif
> > > +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> > > +                         IRQCHIP_SKIP_SET_WAKE |
> > > +                         IRQCHIP_MASK_ON_SUSPEND,
> > > +};
> > > +
> > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > +                                    u32 gsi_base,
> > > +                                    unsigned long *hwirq,
> > > +                                    unsigned int *type)
> > > +{
> > > +       if (WARN_ON(fwspec->param_count < 2))
> > > +               return -EINVAL;
> > > +       if (WARN_ON(!fwspec->param[0]))
> > > +               return -EINVAL;
> > > +
> > > +       /* For DT, gsi_base is always zero. */
> > > +       *hwirq = fwspec->param[0] - gsi_base;
> > > +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > +
> > > +       WARN_ON(*type == IRQ_TYPE_NONE);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > +                                        struct irq_fwspec *fwspec,
> > > +                                        unsigned long *hwirq,
> > > +                                        unsigned int *type)
> > > +{
> > > +       struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > +
> > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > +}
> > > +
> > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > +                                    void *arg)
> > > +{
> > > +       int i, ret;
> > > +       unsigned int type;
> > > +       irq_hw_number_t hwirq;
> > > +       struct irq_fwspec *fwspec = arg;
> > > +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > +
> > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       for (i = 0; i < nr_irqs; i++) {
> > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > +                                   NULL, NULL);
> > > +               /*
> > > +                * APLIC does not implement irq_disable() so Linux interrupt
> > > +                * subsystem will take a lazy approach for disabling an APLIC
> > > +                * interrupt. This means APLIC interrupts are left unmasked
> > > +                * upon system suspend and interrupts are not processed
> > > +                * immediately upon system wake up. To tackle this, we disable
> > > +                * the lazy approach for all APLIC interrupts.
> > > +                */
> > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > +       .translate      = aplic_irqdomain_msi_translate,
> > > +       .alloc          = aplic_irqdomain_msi_alloc,
> > > +       .free           = platform_msi_device_domain_free,
> > > +};
> > > +
> > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > +                                        struct irq_fwspec *fwspec,
> > > +                                        unsigned long *hwirq,
> > > +                                        unsigned int *type)
> > > +{
> > > +       struct aplic_priv *priv = d->host_data;
> > > +
> > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > +}
> > > +
> > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > +                                    void *arg)
> > > +{
> > > +       int i, ret;
> > > +       unsigned int type;
> > > +       irq_hw_number_t hwirq;
> > > +       struct irq_fwspec *fwspec = arg;
> > > +       struct aplic_priv *priv = domain->host_data;
> > > +
> > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       for (i = 0; i < nr_irqs; i++) {
> > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > +                                   NULL, NULL);
> > > +               irq_set_affinity(virq + i, &priv->lmask);
> > > +               /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > +       .translate      = aplic_irqdomain_idc_translate,
> > > +       .alloc          = aplic_irqdomain_idc_alloc,
> > > +       .free           = irq_domain_free_irqs_top,
> > > +};
> > > +
> > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > +{
> > > +       int i;
> > > +
> > > +       /* Disable all interrupts */
> > > +       for (i = 0; i <= priv->nr_irqs; i += 32)
> > > +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > +                           (i / 32) * sizeof(u32));
> > > +
> > > +       /* Set interrupt type and default priority for all interrupts */
> > > +       for (i = 1; i <= priv->nr_irqs; i++) {
> > > +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > +                         (i - 1) * sizeof(u32));
> > > +               writel(APLIC_DEFAULT_PRIORITY,
> > > +                      priv->regs + APLIC_TARGET_BASE +
> > > +                      (i - 1) * sizeof(u32));
> > > +       }
> > > +
> > > +       /* Clear APLIC domaincfg */
> > > +       writel(0, priv->regs + APLIC_DOMAINCFG);
> > > +}
> > > +
> > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > +{
> > > +       u32 val;
> > > +#ifdef CONFIG_RISCV_M_MODE
> > > +       u32 valH;
> > > +
> > > +       if (!priv->nr_idcs) {
> > > +               val = priv->msicfg.base_ppn;
> > > +               valH = (priv->msicfg.base_ppn >> 32) &
> > > +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > +       }
> > > +#endif
> > > +
> > > +       /* Setup APLIC domaincfg register */
> > > +       val = readl(priv->regs + APLIC_DOMAINCFG);
> > > +       val |= APLIC_DOMAINCFG_IE;
> > > +       if (!priv->nr_idcs)
> > > +               val |= APLIC_DOMAINCFG_DM;
> > > +       writel(val, priv->regs + APLIC_DOMAINCFG);
> > > +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > +               pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > +                       priv->fwnode, val);
> > > +}
> > > +
> > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > +{
> > > +       unsigned int group_index, hart_index, guest_index, val;
> > > +       struct irq_data *d = irq_get_irq_data(desc->irq);
> > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > +       phys_addr_t tppn, tbppn, msg_addr;
> > > +       void __iomem *target;
> > > +
> > > +       /* For zeroed MSI, simply write zero into the target register */
> > > +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > +               target = priv->regs + APLIC_TARGET_BASE;
> > > +               target += (d->hwirq - 1) * sizeof(u32);
> > > +               writel(0, target);
> > > +               return;
> > > +       }
> > > +
> > > +       /* Sanity check on message data */
> > > +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > +
> > > +       /* Compute target MSI address */
> > > +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > +
> > > +       /* Compute target HART Base PPN */
> > > +       tbppn = tppn;
> > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > +       WARN_ON(tbppn != mc->base_ppn);
> > > +
> > > +       /* Compute target group and hart indexes */
> > > +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > +       hart_index |= (group_index << mc->lhxw);
> > > +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > +
> > > +       /* Compute target guest index */
> > > +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > +
> > > +       /* Update IRQ TARGET register */
> > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > +                               << APLIC_TARGET_HART_IDX_SHIFT;
> > > +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > +       writel(val, target);
> > > +}
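
For anyone following the arithmetic in aplic_msi_write_msg(), here is a small userspace sketch of the same address-to-TARGET computation. It is not the driver code. The field positions (hart index at bit 18, guest index at bit 12, 11-bit EIID) and the PPN shift of 12 are assumptions based on my reading of the AIA spec, since the header that defines them is not quoted here. The names aplic_target_from_msi(), low_bits() and struct msicfg_sketch are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register layout; the real values live in riscv-aplic.h. */
#define PPN_SHIFT        12
#define HART_IDX_SHIFT   18
#define HART_IDX_MASK    0x3fffu
#define GUEST_IDX_SHIFT  12
#define GUEST_IDX_MASK   0x3fu
#define EIID_MASK        0x7ffu

/* Hypothetical stand-in for the lhxs/lhxw/hhxs/hhxw part of aplic_msicfg. */
struct msicfg_sketch {
	uint32_t lhxs;	/* number of guest index bits */
	uint32_t lhxw;	/* number of hart index bits */
	uint32_t hhxs;	/* group index shift, already reduced as in aplic_setup_msi() */
	uint32_t hhxw;	/* number of group index bits */
};

/* Mask with the low n bits set (n must be below 32). */
static uint32_t low_bits(uint32_t n)
{
	assert(n < 32);
	return (1u << n) - 1;
}

/* Split an MSI address into guest, hart and group indexes and pack TARGET. */
static uint32_t aplic_target_from_msi(const struct msicfg_sketch *mc,
				      uint64_t msg_addr, uint32_t msg_data)
{
	uint64_t tppn = msg_addr >> PPN_SHIFT;
	uint32_t guest = (uint32_t)tppn & low_bits(mc->lhxs);
	uint32_t hart = (uint32_t)(tppn >> mc->lhxs) & low_bits(mc->lhxw);
	uint32_t group = (uint32_t)(tppn >> (mc->hhxs + PPN_SHIFT)) &
			 low_bits(mc->hhxw);

	/* The group index sits directly above the per-group hart index. */
	hart |= group << mc->lhxw;

	return ((hart & HART_IDX_MASK) << HART_IDX_SHIFT) |
	       ((guest & GUEST_IDX_MASK) << GUEST_IDX_SHIFT) |
	       (msg_data & EIID_MASK);
}
```

With no guest bits and two hart index bits, an MSI write to the (arbitrary example) address 0x28002000 with data 5 selects hart index 2, so the sketch yields 0x80005.
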
> > > +
> > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > +{
> > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > +       const struct imsic_global_config *imsic_global;
> > > +
> > > +       /*
> > > +        * The APLIC outgoing MSI config registers assume target MSI
> > > +        * controller to be RISC-V AIA IMSIC controller.
> > > +        */
> > > +       imsic_global = imsic_get_global_config();
> > > +       if (!imsic_global) {
> > > +               pr_err("%pfwP: IMSIC global config not found\n",
> > > +                       priv->fwnode);
> > > +               return -ENODEV;
> > > +       }
> > > +
> > > +       /* Find number of guest index bits (LHXS) */
> > > +       mc->lhxs = imsic_global->guest_index_bits;
> > > +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > +               pr_err("%pfwP: IMSIC guest index bits too big for APLIC LHXS\n",
> > > +                       priv->fwnode);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       /* Find number of HART index bits (LHXW) */
> > > +       mc->lhxw = imsic_global->hart_index_bits;
> > > +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > +               pr_err("%pfwP: IMSIC hart index bits too big for APLIC LHXW\n",
> > > +                       priv->fwnode);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       /* Find number of group index bits (HHXW) */
> > > +       mc->hhxw = imsic_global->group_index_bits;
> > > +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > +               pr_err("%pfwP: IMSIC group index bits too big for APLIC HHXW\n",
> > > +                       priv->fwnode);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       /* Find first bit position of group index (HHXS) */
> > > +       mc->hhxs = imsic_global->group_index_shift;
> > > +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > +               pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > +                       priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > +               return -EINVAL;
> > > +       }
> > > +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > +               pr_err("%pfwP: IMSIC group index shift too big for APLIC HHXS\n",
> > > +                       priv->fwnode);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       /* Compute PPN base */
> > > +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > +
> > > +       /* Use all possible CPUs as lmask */
> > > +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +/*
> > > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > > + * which returns the highest priority pending interrupt and clears the
> > > + * pending bit of that interrupt. This process is repeated until the
> > > + * CLAIMI register returns zero.
> > > + */
> > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > +{
> > > +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > > +       irq_hw_number_t hw_irq;
> > > +       int irq;
> > > +
> > > +       chained_irq_enter(chip, desc);
> > > +
> > > +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > +
> > > +               if (unlikely(irq <= 0))
> > > +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > +                                           hw_irq);
> > > +               else
> > > +                       generic_handle_irq(irq);
> > > +       }
> > > +
> > > +       chained_irq_exit(chip, desc);
> > > +}
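
The claim loop described in the comment above can be modelled in userspace as follows. This is a toy sketch under stated assumptions, not the driver: fake_read_claimi() is a made-up stand-in for readl() on CLAIMI, the interrupt ID is assumed to sit at bit 16 of the value read, and all sources are assumed to share one priority so that the lowest-numbered pending source is claimed first.

```c
#include <assert.h>
#include <stdint.h>

#define TOPI_ID_SHIFT 16	/* assumed position of the interrupt ID in CLAIMI */

/* Toy IDC: bit n of 'pending' means source n (1..31) is pending. */
struct fake_idc {
	uint32_t pending;
};

/* Model of a CLAIMI read: return and clear the lowest-numbered pending source. */
static uint32_t fake_read_claimi(struct fake_idc *idc)
{
	for (uint32_t n = 1; n < 32; n++) {
		if (idc->pending & (1u << n)) {
			idc->pending &= ~(1u << n);
			return (n << TOPI_ID_SHIFT) | 1u; /* priority 1 in the low bits */
		}
	}
	return 0; /* nothing pending */
}

/* Keep claiming until CLAIMI reads as zero; record the claimed IDs in order. */
static int drain_claimi(struct fake_idc *idc, uint32_t *ids, int max)
{
	int count = 0;
	uint32_t claim;

	assert(idc && ids);
	while (count < max && (claim = fake_read_claimi(idc)) != 0)
		ids[count++] = claim >> TOPI_ID_SHIFT;

	return count;
}
```

With sources 3, 7 and 10 pending, the loop claims them in that order and leaves nothing pending, which is the behaviour the chained handler relies on before calling chained_irq_exit().
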
> > > +
> > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > +{
> > > +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > +
> > > +       /* Priority must be less than threshold for interrupt triggering */
> > > +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > +
> > > +       /* Delivery must be set to 1 for interrupt triggering */
> > > +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > +}
> > > +
> > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > +{
> > > +       if (aplic_idc_parent_irq)
> > > +               disable_percpu_irq(aplic_idc_parent_irq);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > +{
> > > +       if (aplic_idc_parent_irq)
> > > +               enable_percpu_irq(aplic_idc_parent_irq,
> > > +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > +{
> > > +       int i, j, rc, cpu, setup_count = 0;
> > > +       struct fwnode_reference_args parent;
> > > +       struct irq_domain *domain;
> > > +       unsigned long hartid;
> > > +       struct aplic_idc *idc;
> > > +       u32 val;
> > > +
> > > +       /* Setup per-CPU IDC and target CPU mask */
> > > +       for (i = 0; i < priv->nr_idcs; i++) {
> > > +               rc = fwnode_property_get_reference_args(priv->fwnode,
> > > +                               "interrupts-extended", "#interrupt-cells",
> > > +                               0, i, &parent);
> > > +               if (rc) {
> > > +                       pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > +                               priv->fwnode, i);
> > > +                       continue;
> > > +               }
> > > +
> > > +               /*
> > > +                * Skip interrupts other than external interrupts for
> > > +                * current privilege level.
> > > +                */
> > > +               if (parent.args[0] != RV_IRQ_EXT)
> > > +                       continue;
> > > +
> > > +               rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > +               if (rc) {
> > > +                       pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > +                               priv->fwnode, i);
> > > +                       continue;
> > > +               }
> > > +
> > > +               cpu = riscv_hartid_to_cpuid(hartid);
> > > +               if (cpu < 0) {
> > > +                       pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > +                               priv->fwnode, i);
> > > +                       continue;
> > > +               }
> > > +
> > > +               cpumask_set_cpu(cpu, &priv->lmask);
> > > +
> > > +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > +               idc->hart_index = i;
> > > +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > +               idc->priv = priv;
> > > +
> > > +               aplic_idc_set_delivery(idc, true);
> > > +
> > > +               /*
> > > +                * Boot cpu might not have APLIC hart_index = 0 so check
> > > +                * and update target registers of all interrupts.
> > > +                */
> > > +               if (cpu == smp_processor_id() && idc->hart_index) {
> > > +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > +                       val |= APLIC_DEFAULT_PRIORITY;
> > > +                       for (j = 1; j <= priv->nr_irqs; j++)
> > > +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> > > +                                           (j - 1) * sizeof(u32));
> > > +               }
> > > +
> > > +               setup_count++;
> > > +       }
> > > +
> > > +       /* Find parent domain and register chained handler */
> > > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > +                                         DOMAIN_BUS_ANY);
> > > +       if (!aplic_idc_parent_irq && domain) {
> > > +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > +               if (aplic_idc_parent_irq) {
> > > +                       irq_set_chained_handler(aplic_idc_parent_irq,
> > > +                                               aplic_idc_handle_irq);
> > > +
> > > +                       /*
> > > +                        * Setup CPUHP notifier to enable IDC parent
> > > +                        * interrupt on all CPUs
> > > +                        */
> > > +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > +                                         "irqchip/riscv/aplic:starting",
> > > +                                         aplic_idc_starting_cpu,
> > > +                                         aplic_idc_dying_cpu);
> > > +               }
> > > +       }
> > > +
> > > +       /* Fail if we were not able to setup IDC for any CPU */
> > > +       return (setup_count) ? 0 : -ENODEV;
> > > +}
> > > +
> > > +static int aplic_probe(struct platform_device *pdev)
> > > +{
> > > +       struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > +       struct fwnode_reference_args parent;
> > > +       struct aplic_priv *priv;
> > > +       struct resource *res;
> > > +       phys_addr_t pa;
> > > +       int rc;
> > > +
> > > +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > +       if (!priv)
> > > +               return -ENOMEM;
> > > +       priv->fwnode = fwnode;
> > > +
> > > +       /* Map the MMIO registers */
> > > +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > +       if (!res) {
> > > +               pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > +               return -EINVAL;
> > > +       }
> > > +       priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > +       if (!priv->regs) {
> > > +               pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > > +               return -ENOMEM;
> > > +       }
> > > +
> > > +       /*
> > > +        * Find out GSI base number
> > > +        *
> > > +        * Note: DT does not define "riscv,gsi-base" property so GSI
> > > +        * base is always zero for DT.
> > > +        */
> > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > +                                           &priv->gsi_base, 1);
> > > +       if (rc)
> > > +               priv->gsi_base = 0;
> > > +
> > > +       /* Find out number of interrupt sources */
> > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > +                                           &priv->nr_irqs, 1);
> > > +       if (rc) {
> > > +               pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > +                       fwnode);
> > > +               return rc;
> > > +       }
> > > +
> > > +       /* Set up the initial state of APLIC interrupts */
> > > +       aplic_init_hw_irqs(priv);
> > > +
> > > +       /*
> > > +        * Find out number of IDCs based on parent interrupts
> > > +        *
> > > +        * If "msi-parent" property is present then we ignore the
> > > +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > +        */
> > > +       if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > +               while (!fwnode_property_get_reference_args(fwnode,
> > > +                               "interrupts-extended", "#interrupt-cells",
> > > +                               0, priv->nr_idcs, &parent))
> > > +                       priv->nr_idcs++;
> > > +       }
> > > +
> > > +       /* Setup IDCs or MSIs based on number of IDCs */
> > > +       if (priv->nr_idcs)
> > > +               rc = aplic_setup_idc(priv);
> > > +       else
> > > +               rc = aplic_setup_msi(priv);
> > > +       if (rc) {
> > > +               pr_err("%pfwP: failed to set up %s\n",
> > > +                       fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > +               return rc;
> > > +       }
> > > +
> > > +       /* Setup global config and interrupt delivery */
> > > +       aplic_init_hw_global(priv);
> > > +
> > > +       /* Create irq domain instance for the APLIC */
> > > +       if (priv->nr_idcs)
> > > +               priv->irqdomain = irq_domain_create_linear(
> > > +                                               priv->fwnode,
> > > +                                               priv->nr_irqs + 1,
> > > +                                               &aplic_irqdomain_idc_ops,
> > > +                                               priv);
> > > +       else
> > > +               priv->irqdomain = platform_msi_create_device_domain(
> > > +                                               &pdev->dev,
> > > +                                               priv->nr_irqs + 1,
> > > +                                               aplic_msi_write_msg,
> > > +                                               &aplic_irqdomain_msi_ops,
> > > +                                               priv);
> > > +       if (!priv->irqdomain) {
> > > +               pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > +               return -ENOMEM;
> > > +       }
> > > +
> > > +       /* Advertise the interrupt controller */
> > > +       if (priv->nr_idcs) {
> > > +               pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > +                       priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > +       } else {
> > > +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > +               pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > > +                       priv->fwnode, priv->nr_irqs, &pa);
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static const struct of_device_id aplic_match[] = {
> > > +       { .compatible = "riscv,aplic" },
> > > +       {}
> > > +};
> > > +
> > > +static struct platform_driver aplic_driver = {
> > > +       .driver = {
> > > +               .name           = "riscv-aplic",
> > > +               .of_match_table = aplic_match,
> > > +       },
> > > +       .probe = aplic_probe,
> > > +};
> > > +builtin_platform_driver(aplic_driver);
> > > +
> > > +static int __init aplic_dt_init(struct device_node *node,
> > > +                               struct device_node *parent)
> > > +{
> > > +       /*
> > > +        * The APLIC platform driver needs to be probed early
> > > +        * so for device tree:
> > > +        *
> > > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > +        *    provides a hint to the device driver core to probe the
> > > +        *    platform driver early.
> > > +        * 2) Clear the OF_POPULATED flag in device_node because
> > > +        *    of_irq_init() sets it which prevents creation of
> > > +        *    platform device.
> > > +        */
> > > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> >
> > NACK. You are blindly plastering flags without trying to understand
> > the real issue and fixing this correctly.
> >
> > > +       of_node_clear_flag(node, OF_POPULATED);
> > > +       return 0;
> > > +}
> > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> >
> > This macro pretty much skips the entire driver core probe framework
> > and calls init directly, so you are supposed to initialize the device
> > when the init function is called.
> >
> > If you want your device/driver to follow the proper platform driver
> > path (which is recommended), then you need to use the
> > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> >
> > I offered to help you debug this issue and I asked for a dts file that
> > corresponds to a board you are testing this on and seeing an issue.
> > But you haven't answered my question [1] and are pointing to some
> > random commit and blaming it. That commit has no impact on any
> > existing devices/drivers.
> >
> > Hi Marc,
> >
> > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > is used or until Anup actually works with us to debug the real issue.
>
> Maybe I misread your previous comment.
>
> You can easily reproduce the issue on QEMU virt machine for RISC-V:
> 1) Build qemu-system-riscv64 from latest QEMU master
> 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
>  APLIC driver at the time of building kernel)
> 3) Boot an APLIC-only system on the QEMU virt machine
>     qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
>     -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
>     -kernel ./build-riscv64/arch/riscv/boot/Image \
>     -append "root=/dev/ram rw console=ttyS0 earlycon" \
>     -initrd ./rootfs_riscv64.img

Unfortunately, I don't have the time to do all that, but I generally
don't need to run something to figure out the issue. It's generally
fairly obvious once I look at the DT. I'll also lean on you for some
debug logs.

Where is the dts file that corresponds to this QEMU run? This is the
third time I'm asking for a pointer to a dts file that has this issue,
can you point me to it please? I shouldn't have to say this but: put
it somewhere and point me to it please. Please don't point me to some
git repo and ask me to dig around.

Can you give me details on which supplier is causing the deferred probe
that's a problem for you? Are there any other details you can provide
that would help debug this issue?

> I hope the above steps help you reproduce the issue. I will certainly
> test whatever fix you propose.

Do you plan to try the fix I suggested already? The one about using
the correct macros?
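
For reference, here is roughly what that could look like. This is only an untested sketch and a kernel fragment, not something that builds on its own: the macros come from include/linux/irqchip.h, but the riscv_aplic driver name and the idea of moving the body of aplic_probe() into the init callback are assumptions about how it would map onto this patch.

```c
#include <linux/irqchip.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/platform_device.h>

/*
 * Called from the probe function of the platform driver that the macros
 * below generate. Hypothetical: the setup currently done in aplic_probe()
 * would move here.
 */
static int aplic_dt_init(struct device_node *node, struct device_node *parent)
{
	/* ... map registers, set up IDCs or MSIs, create the irq domain ... */
	return 0;
}

/* Replaces both IRQCHIP_DECLARE() and the hand-written platform_driver. */
IRQCHIP_PLATFORM_DRIVER_BEGIN(riscv_aplic)
IRQCHIP_MATCH("riscv,aplic", aplic_dt_init)
IRQCHIP_PLATFORM_DRIVER_END(riscv_aplic)
```

With this form there should be no need to touch FWNODE_FLAG_BEST_EFFORT or OF_POPULATED by hand.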

-Saravana

* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-16 22:05       ` Saravana Kannan
@ 2023-06-19  6:13         ` Anup Patel
  2023-06-22 20:56           ` Saravana Kannan
  0 siblings, 1 reply; 28+ messages in thread
From: Anup Patel @ 2023-06-19  6:13 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Conor Dooley, linux-riscv, linux-kernel,
	devicetree, iommu, Android Kernel Team

On Sat, Jun 17, 2023 at 3:36 AM Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
> > > >
> > > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > > platform. This new interrupt controller is referred to as advanced
> > > > platform-level interrupt controller (APLIC) which can forward wired
> > > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > > signaled interrupts.
> > > > > (For more details, refer to https://github.com/riscv/riscv-aia)
> > > >
> > > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > > platforms.
> > > >
> > > > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > > > ---
> > > >  drivers/irqchip/Kconfig             |   6 +
> > > >  drivers/irqchip/Makefile            |   1 +
> > > >  drivers/irqchip/irq-riscv-aplic.c   | 765 ++++++++++++++++++++++++++++
> > > >  include/linux/irqchip/riscv-aplic.h | 119 +++++
> > > >  4 files changed, 891 insertions(+)
> > > >  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > > >  create mode 100644 include/linux/irqchip/riscv-aplic.h
> > > >
> > > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > > index d700980372ef..834c0329f583 100644
> > > > --- a/drivers/irqchip/Kconfig
> > > > +++ b/drivers/irqchip/Kconfig
> > > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > > >         select IRQ_DOMAIN_HIERARCHY
> > > >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > > >
> > > > +config RISCV_APLIC
> > > > +       bool
> > > > +       depends on RISCV
> > > > +       select IRQ_DOMAIN_HIERARCHY
> > > > +       select GENERIC_MSI_IRQ
> > > > +
> > > >  config RISCV_IMSIC
> > > >         bool
> > > >         depends on RISCV
> > > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > > index 577bde3e986b..438b8e1a152c 100644
> > > > --- a/drivers/irqchip/Makefile
> > > > +++ b/drivers/irqchip/Makefile
> > > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> > > >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> > > >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> > > >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > > > +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
> > > >  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> > > >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> > > >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> > > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > > new file mode 100644
> > > > index 000000000000..1e710fdf5608
> > > > --- /dev/null
> > > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > > @@ -0,0 +1,765 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > > + */
> > > > +
> > > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > > +#include <linux/bitops.h>
> > > > +#include <linux/cpu.h>
> > > > +#include <linux/interrupt.h>
> > > > +#include <linux/io.h>
> > > > +#include <linux/irq.h>
> > > > +#include <linux/irqchip.h>
> > > > +#include <linux/irqchip/chained_irq.h>
> > > > +#include <linux/irqchip/riscv-aplic.h>
> > > > +#include <linux/irqchip/riscv-imsic.h>
> > > > +#include <linux/irqdomain.h>
> > > > +#include <linux/module.h>
> > > > +#include <linux/msi.h>
> > > > +#include <linux/platform_device.h>
> > > > +#include <linux/smp.h>
> > > > +
> > > > +#define APLIC_DEFAULT_PRIORITY         1
> > > > +#define APLIC_DISABLE_IDELIVERY                0
> > > > +#define APLIC_ENABLE_IDELIVERY         1
> > > > +#define APLIC_DISABLE_ITHRESHOLD       1
> > > > +#define APLIC_ENABLE_ITHRESHOLD                0
> > > > +
> > > > +struct aplic_msicfg {
> > > > +       phys_addr_t             base_ppn;
> > > > +       u32                     hhxs;
> > > > +       u32                     hhxw;
> > > > +       u32                     lhxs;
> > > > +       u32                     lhxw;
> > > > +};
> > > > +
> > > > +struct aplic_idc {
> > > > +       unsigned int            hart_index;
> > > > +       void __iomem            *regs;
> > > > +       struct aplic_priv       *priv;
> > > > +};
> > > > +
> > > > +struct aplic_priv {
> > > > +       struct fwnode_handle    *fwnode;
> > > > +       u32                     gsi_base;
> > > > +       u32                     nr_irqs;
> > > > +       u32                     nr_idcs;
> > > > +       void __iomem            *regs;
> > > > +       struct irq_domain       *irqdomain;
> > > > +       struct aplic_msicfg     msicfg;
> > > > +       struct cpumask          lmask;
> > > > +};
> > > > +
> > > > +static unsigned int aplic_idc_parent_irq;
> > > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > > +
> > > > +static void aplic_irq_unmask(struct irq_data *d)
> > > > +{
> > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +
> > > > +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > > +
> > > > +       if (!priv->nr_idcs)
> > > > +               irq_chip_unmask_parent(d);
> > > > +}
> > > > +
> > > > +static void aplic_irq_mask(struct irq_data *d)
> > > > +{
> > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +
> > > > +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > > +
> > > > +       if (!priv->nr_idcs)
> > > > +               irq_chip_mask_parent(d);
> > > > +}
> > > > +
> > > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > > +{
> > > > +       u32 val = 0;
> > > > +       void __iomem *sourcecfg;
> > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +
> > > > +       switch (type) {
> > > > +       case IRQ_TYPE_NONE:
> > > > +               val = APLIC_SOURCECFG_SM_INACTIVE;
> > > > +               break;
> > > > +       case IRQ_TYPE_LEVEL_LOW:
> > > > +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > > +               break;
> > > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > > +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > > +               break;
> > > > +       case IRQ_TYPE_EDGE_FALLING:
> > > > +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > > +               break;
> > > > +       case IRQ_TYPE_EDGE_RISING:
> > > > +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > > +               break;
> > > > +       default:
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > > +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > > +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > > +       writel(val, sourcecfg);
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void aplic_irq_eoi(struct irq_data *d)
> > > > +{
> > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +       u32 reg_off, reg_mask;
> > > > +
> > > > +       /*
> > > > > +        * EOI handling is required only for level-triggered
> > > > +        * interrupts in APLIC MSI mode.
> > > > +        */
> > > > +
> > > > +       if (priv->nr_idcs)
> > > > +               return;
> > > > +
> > > > +       reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > > +       reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > > +       switch (irqd_get_trigger_type(d)) {
> > > > +       case IRQ_TYPE_LEVEL_LOW:
> > > > +               if (!(readl(priv->regs + reg_off) & reg_mask))
> > > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > +               break;
> > > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > > +               if (readl(priv->regs + reg_off) & reg_mask)
> > > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > +               break;
> > > > +       }
> > > > +}
> > > > +
> > > > +#ifdef CONFIG_SMP
> > > > +static int aplic_set_affinity(struct irq_data *d,
> > > > +                             const struct cpumask *mask_val, bool force)
> > > > +{
> > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +       struct aplic_idc *idc;
> > > > +       unsigned int cpu, val;
> > > > +       struct cpumask amask;
> > > > +       void __iomem *target;
> > > > +
> > > > +       if (!priv->nr_idcs)
> > > > +               return irq_chip_set_affinity_parent(d, mask_val, force);
> > > > +
> > > > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > > > +
> > > > +       if (force)
> > > > +               cpu = cpumask_first(&amask);
> > > > +       else
> > > > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > +
> > > > +       if (cpu >= nr_cpu_ids)
> > > > +               return -EINVAL;
> > > > +
> > > > +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > > +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > +       val |= APLIC_DEFAULT_PRIORITY;
> > > > +       writel(val, target);
> > > > +
> > > > +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > +
> > > > +       return IRQ_SET_MASK_OK_DONE;
> > > > +}
> > > > +#endif
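
For direct (IDC) mode, the affinity callback above boils down to one write to the per-source TARGET register. The sketch below models only the value and offset computation in plain C. The constants are assumptions based on the AIA target register layout rather than values quoted in this thread, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values (AIA target register layout); not copied from the patch. */
#define APLIC_TARGET_BASE            0x3004u
#define APLIC_TARGET_HART_IDX_SHIFT  18
#define APLIC_TARGET_HART_IDX_MASK   0x3fffu
#define APLIC_DEFAULT_PRIORITY       1u

/* Hypothetical helper: byte offset of the TARGET register for hwirq (1-based). */
static inline uint32_t aplic_target_reg_off(uint32_t hwirq)
{
	assert(hwirq != 0); /* TARGET registers start at source 1 */
	return APLIC_TARGET_BASE + (hwirq - 1u) * (uint32_t)sizeof(uint32_t);
}

/* Hypothetical helper: direct-mode TARGET value, hart index plus priority. */
static inline uint32_t aplic_direct_target_val(uint32_t hart_index)
{
	uint32_t val = hart_index & APLIC_TARGET_HART_IDX_MASK;

	val <<= APLIC_TARGET_HART_IDX_SHIFT;
	val |= APLIC_DEFAULT_PRIORITY;
	return val;
}
```

Note that the value encodes the APLIC hart index (idc->hart_index in the patch), not the Linux CPU number, which is why the driver looks up the per-CPU IDC first.
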
> > > > +
> > > > +static struct irq_chip aplic_chip = {
> > > > +       .name           = "RISC-V APLIC",
> > > > +       .irq_mask       = aplic_irq_mask,
> > > > +       .irq_unmask     = aplic_irq_unmask,
> > > > +       .irq_set_type   = aplic_set_type,
> > > > +       .irq_eoi        = aplic_irq_eoi,
> > > > +#ifdef CONFIG_SMP
> > > > +       .irq_set_affinity = aplic_set_affinity,
> > > > +#endif
> > > > +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> > > > +                         IRQCHIP_SKIP_SET_WAKE |
> > > > +                         IRQCHIP_MASK_ON_SUSPEND,
> > > > +};
> > > > +
> > > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > > +                                    u32 gsi_base,
> > > > +                                    unsigned long *hwirq,
> > > > +                                    unsigned int *type)
> > > > +{
> > > > +       if (WARN_ON(fwspec->param_count < 2))
> > > > +               return -EINVAL;
> > > > +       if (WARN_ON(!fwspec->param[0]))
> > > > +               return -EINVAL;
> > > > +
> > > > +       /* For DT, gsi_base is always zero. */
> > > > +       *hwirq = fwspec->param[0] - gsi_base;
> > > > +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > > +
> > > > +       WARN_ON(*type == IRQ_TYPE_NONE);
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > > +                                        struct irq_fwspec *fwspec,
> > > > +                                        unsigned long *hwirq,
> > > > +                                        unsigned int *type)
> > > > +{
> > > > +       struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > > +
> > > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > +}
> > > > +
> > > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > > +                                    void *arg)
> > > > +{
> > > > +       int i, ret;
> > > > +       unsigned int type;
> > > > +       irq_hw_number_t hwirq;
> > > > +       struct irq_fwspec *fwspec = arg;
> > > > +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > > +
> > > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       for (i = 0; i < nr_irqs; i++) {
> > > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > > +                                   NULL, NULL);
> > > > +               /*
> > > > +                * APLIC does not implement irq_disable() so Linux interrupt
> > > > +                * subsystem will take a lazy approach for disabling an APLIC
> > > > +                * interrupt. This means APLIC interrupts are left unmasked
> > > > +                * upon system suspend and interrupts are not processed
> > > > +                * immediately upon system wake up. To tackle this, we disable
> > > > +                * the lazy approach for all APLIC interrupts.
> > > > +                */
> > > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > +       }
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > > +       .translate      = aplic_irqdomain_msi_translate,
> > > > +       .alloc          = aplic_irqdomain_msi_alloc,
> > > > +       .free           = platform_msi_device_domain_free,
> > > > +};
> > > > +
> > > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > > +                                        struct irq_fwspec *fwspec,
> > > > +                                        unsigned long *hwirq,
> > > > +                                        unsigned int *type)
> > > > +{
> > > > +       struct aplic_priv *priv = d->host_data;
> > > > +
> > > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > +}
> > > > +
> > > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > > +                                    void *arg)
> > > > +{
> > > > +       int i, ret;
> > > > +       unsigned int type;
> > > > +       irq_hw_number_t hwirq;
> > > > +       struct irq_fwspec *fwspec = arg;
> > > > +       struct aplic_priv *priv = domain->host_data;
> > > > +
> > > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       for (i = 0; i < nr_irqs; i++) {
> > > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > > +                                   NULL, NULL);
> > > > +               irq_set_affinity(virq + i, &priv->lmask);
> > > > +               /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > +       }
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > > +       .translate      = aplic_irqdomain_idc_translate,
> > > > +       .alloc          = aplic_irqdomain_idc_alloc,
> > > > +       .free           = irq_domain_free_irqs_top,
> > > > +};
> > > > +
> > > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > > +{
> > > > +       int i;
> > > > +
> > > > +       /* Disable all interrupts */
> > > > +       for (i = 0; i <= priv->nr_irqs; i += 32)
> > > > +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > > +                           (i / 32) * sizeof(u32));
> > > > +
> > > > +       /* Set interrupt type and default priority for all interrupts */
> > > > +       for (i = 1; i <= priv->nr_irqs; i++) {
> > > > +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > > +                         (i - 1) * sizeof(u32));
> > > > +               writel(APLIC_DEFAULT_PRIORITY,
> > > > +                      priv->regs + APLIC_TARGET_BASE +
> > > > +                      (i - 1) * sizeof(u32));
> > > > +       }
> > > > +
> > > > +       /* Clear APLIC domaincfg */
> > > > +       writel(0, priv->regs + APLIC_DOMAINCFG);
> > > > +}
> > > > +
> > > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > > +{
> > > > +       u32 val;
> > > > +#ifdef CONFIG_RISCV_M_MODE
> > > > +       u32 valH;
> > > > +
> > > > +       if (!priv->nr_idcs) {
> > > > +               val = priv->msicfg.base_ppn;
> > > > +               valH = (priv->msicfg.base_ppn >> 32) &
> > > > +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > > +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > > +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > > +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > > +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > > +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > > +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > > +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > > +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > > +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > > +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > > +       }
> > > > +#endif
> > > > +
> > > > +       /* Setup APLIC domaincfg register */
> > > > +       val = readl(priv->regs + APLIC_DOMAINCFG);
> > > > +       val |= APLIC_DOMAINCFG_IE;
> > > > +       if (!priv->nr_idcs)
> > > > +               val |= APLIC_DOMAINCFG_DM;
> > > > +       writel(val, priv->regs + APLIC_DOMAINCFG);
> > > > +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > > +               pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > > +                       priv->fwnode, val);
> > > > +}
> > > > +
> > > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > > +{
> > > > +       unsigned int group_index, hart_index, guest_index, val;
> > > > +       struct irq_data *d = irq_get_irq_data(desc->irq);
> > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > > +       phys_addr_t tppn, tbppn, msg_addr;
> > > > +       void __iomem *target;
> > > > +
> > > > +       /* For zeroed MSI, simply write zero into the target register */
> > > > +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > > +               target = priv->regs + APLIC_TARGET_BASE;
> > > > +               target += (d->hwirq - 1) * sizeof(u32);
> > > > +               writel(0, target);
> > > > +               return;
> > > > +       }
> > > > +
> > > > +       /* Sanity check on message data */
> > > > +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > > +
> > > > +       /* Compute target MSI address */
> > > > +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > > +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > +
> > > > +       /* Compute target HART Base PPN */
> > > > +       tbppn = tppn;
> > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > +       WARN_ON(tbppn != mc->base_ppn);
> > > > +
> > > > +       /* Compute target group and hart indexes */
> > > > +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > > +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > > +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > > +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > > +       hart_index |= (group_index << mc->lhxw);
> > > > +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > > +
> > > > +       /* Compute target guest index */
> > > > +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > > +
> > > > +       /* Update IRQ TARGET register */
> > > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > > +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > > +                               << APLIC_TARGET_HART_IDX_SHIFT;
> > > > +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > > +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > > +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > > +       writel(val, target);
> > > > +}
> > > > +
> > > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > > +{
> > > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > > +       const struct imsic_global_config *imsic_global;
> > > > +
> > > > +       /*
> > > > +        * The APLIC outgoing MSI config registers assume target MSI
> > > > +        * controller to be RISC-V AIA IMSIC controller.
> > > > +        */
> > > > +       imsic_global = imsic_get_global_config();
> > > > +       if (!imsic_global) {
> > > > +               pr_err("%pfwP: IMSIC global config not found\n",
> > > > +                       priv->fwnode);
> > > > +               return -ENODEV;
> > > > +       }
> > > > +
> > > > +       /* Find number of guest index bits (LHXS) */
> > > > +       mc->lhxs = imsic_global->guest_index_bits;
> > > > +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > > > +               pr_err("%pfwP: IMSIC guest index bits too big for APLIC LHXS\n",
> > > > +                       priv->fwnode);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > > +       /* Find number of HART index bits (LHXW) */
> > > > +       mc->lhxw = imsic_global->hart_index_bits;
> > > > +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > > > +               pr_err("%pfwP: IMSIC hart index bits too big for APLIC LHXW\n",
> > > > +                       priv->fwnode);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > > +       /* Find number of group index bits (HHXW) */
> > > > +       mc->hhxw = imsic_global->group_index_bits;
> > > > +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > > > +               pr_err("%pfwP: IMSIC group index bits too big for APLIC HHXW\n",
> > > > +                       priv->fwnode);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > > +       /* Find first bit position of group index (HHXS) */
> > > > +       mc->hhxs = imsic_global->group_index_shift;
> > > > +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > > +               pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > > +                       priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > > +               return -EINVAL;
> > > > +       }
> > > > +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > > +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > > > +               pr_err("%pfwP: IMSIC group index shift too big for APLIC HHXS\n",
> > > > +                       priv->fwnode);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > > +       /* Compute PPN base */
> > > > +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > +
> > > > +       /* Use all possible CPUs as lmask */
> > > > +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > > > > + * which returns the highest priority pending interrupt and clears the
> > > > > + * pending bit of that interrupt. This process is repeated until the
> > > > > + * CLAIMI register returns zero.
> > > > + */
> > > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > > +{
> > > > +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > > > +       irq_hw_number_t hw_irq;
> > > > +       int irq;
> > > > +
> > > > +       chained_irq_enter(chip, desc);
> > > > +
> > > > +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > > +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > > +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > > +
> > > > +               if (unlikely(irq <= 0))
> > > > +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > > +                                           hw_irq);
> > > > +               else
> > > > +                       generic_handle_irq(irq);
> > > > +       }
> > > > +
> > > > +       chained_irq_exit(chip, desc);
> > > > +}
> > > > +
> > > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > > +{
> > > > +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > > +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > > +
> > > > +       /* Priority must be less than threshold for interrupt triggering */
> > > > +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > > +
> > > > +       /* Delivery must be set to 1 for interrupt triggering */
> > > > +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > > +}
> > > > +
> > > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > > +{
> > > > +       if (aplic_idc_parent_irq)
> > > > +               disable_percpu_irq(aplic_idc_parent_irq);
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > > +{
> > > > +       if (aplic_idc_parent_irq)
> > > > +               enable_percpu_irq(aplic_idc_parent_irq,
> > > > +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > > +{
> > > > +       int i, j, rc, cpu, setup_count = 0;
> > > > +       struct fwnode_reference_args parent;
> > > > +       struct irq_domain *domain;
> > > > +       unsigned long hartid;
> > > > +       struct aplic_idc *idc;
> > > > +       u32 val;
> > > > +
> > > > +       /* Setup per-CPU IDC and target CPU mask */
> > > > +       for (i = 0; i < priv->nr_idcs; i++) {
> > > > +               rc = fwnode_property_get_reference_args(priv->fwnode,
> > > > +                               "interrupts-extended", "#interrupt-cells",
> > > > +                               0, i, &parent);
> > > > +               if (rc) {
> > > > +                       pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > > +                               priv->fwnode, i);
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               /*
> > > > +                * Skip interrupts other than external interrupts for
> > > > +                * current privilege level.
> > > > +                */
> > > > +               if (parent.args[0] != RV_IRQ_EXT)
> > > > +                       continue;
> > > > +
> > > > +               rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > > +               if (rc) {
> > > > +                       pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > > +                               priv->fwnode, i);
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               cpu = riscv_hartid_to_cpuid(hartid);
> > > > +               if (cpu < 0) {
> > > > +                       pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > > +                               priv->fwnode, i);
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               cpumask_set_cpu(cpu, &priv->lmask);
> > > > +
> > > > +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > +               idc->hart_index = i;
> > > > +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > > +               idc->priv = priv;
> > > > +
> > > > +               aplic_idc_set_delivery(idc, true);
> > > > +
> > > > +               /*
> > > > +                * Boot cpu might not have APLIC hart_index = 0 so check
> > > > +                * and update target registers of all interrupts.
> > > > +                */
> > > > +               if (cpu == smp_processor_id() && idc->hart_index) {
> > > > +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > +                       val |= APLIC_DEFAULT_PRIORITY;
> > > > +                       for (j = 1; j <= priv->nr_irqs; j++)
> > > > +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> > > > +                                           (j - 1) * sizeof(u32));
> > > > +               }
> > > > +
> > > > +               setup_count++;
> > > > +       }
> > > > +
> > > > +       /* Find parent domain and register chained handler */
> > > > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > > +                                         DOMAIN_BUS_ANY);
> > > > +       if (!aplic_idc_parent_irq && domain) {
> > > > +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > > +               if (aplic_idc_parent_irq) {
> > > > +                       irq_set_chained_handler(aplic_idc_parent_irq,
> > > > +                                               aplic_idc_handle_irq);
> > > > +
> > > > +                       /*
> > > > +                        * Setup CPUHP notifier to enable IDC parent
> > > > +                        * interrupt on all CPUs
> > > > +                        */
> > > > +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > > +                                         "irqchip/riscv/aplic:starting",
> > > > +                                         aplic_idc_starting_cpu,
> > > > +                                         aplic_idc_dying_cpu);
> > > > +               }
> > > > +       }
> > > > +
> > > > +       /* Fail if we were not able to setup IDC for any CPU */
> > > > +       return (setup_count) ? 0 : -ENODEV;
> > > > +}
> > > > +
> > > > +static int aplic_probe(struct platform_device *pdev)
> > > > +{
> > > > +       struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > > +       struct fwnode_reference_args parent;
> > > > +       struct aplic_priv *priv;
> > > > +       struct resource *res;
> > > > +       phys_addr_t pa;
> > > > +       int rc;
> > > > +
> > > > +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > +       if (!priv)
> > > > +               return -ENOMEM;
> > > > +       priv->fwnode = fwnode;
> > > > +
> > > > +       /* Map the MMIO registers */
> > > > +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > +       if (!res) {
> > > > +               pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +       priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > > +       if (!priv->regs) {
> > > > > +               pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > > > +               return -ENOMEM;
> > > > +       }
> > > > +
> > > > +       /*
> > > > +        * Find out GSI base number
> > > > +        *
> > > > +        * Note: DT does not define "riscv,gsi-base" property so GSI
> > > > +        * base is always zero for DT.
> > > > +        */
> > > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > > +                                           &priv->gsi_base, 1);
> > > > +       if (rc)
> > > > +               priv->gsi_base = 0;
> > > > +
> > > > +       /* Find out number of interrupt sources */
> > > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > > +                                           &priv->nr_irqs, 1);
> > > > +       if (rc) {
> > > > +               pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > > +                       fwnode);
> > > > +               return rc;
> > > > +       }
> > > > +
> > > > > +       /* Set up the initial state of APLIC interrupts */
> > > > +       aplic_init_hw_irqs(priv);
> > > > +
> > > > +       /*
> > > > +        * Find out number of IDCs based on parent interrupts
> > > > +        *
> > > > +        * If "msi-parent" property is present then we ignore the
> > > > +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > > +        */
> > > > +       if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > > +               while (!fwnode_property_get_reference_args(fwnode,
> > > > +                               "interrupts-extended", "#interrupt-cells",
> > > > +                               0, priv->nr_idcs, &parent))
> > > > +                       priv->nr_idcs++;
> > > > +       }
> > > > +
> > > > +       /* Setup IDCs or MSIs based on number of IDCs */
> > > > +       if (priv->nr_idcs)
> > > > +               rc = aplic_setup_idc(priv);
> > > > +       else
> > > > +               rc = aplic_setup_msi(priv);
> > > > +       if (rc) {
> > > > > +               pr_err("%pfwP: failed to setup %s\n",
> > > > +                       fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > > +               return rc;
> > > > +       }
> > > > +
> > > > +       /* Setup global config and interrupt delivery */
> > > > +       aplic_init_hw_global(priv);
> > > > +
> > > > +       /* Create irq domain instance for the APLIC */
> > > > +       if (priv->nr_idcs)
> > > > +               priv->irqdomain = irq_domain_create_linear(
> > > > +                                               priv->fwnode,
> > > > +                                               priv->nr_irqs + 1,
> > > > +                                               &aplic_irqdomain_idc_ops,
> > > > +                                               priv);
> > > > +       else
> > > > +               priv->irqdomain = platform_msi_create_device_domain(
> > > > +                                               &pdev->dev,
> > > > +                                               priv->nr_irqs + 1,
> > > > +                                               aplic_msi_write_msg,
> > > > +                                               &aplic_irqdomain_msi_ops,
> > > > +                                               priv);
> > > > +       if (!priv->irqdomain) {
> > > > +               pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > > +               return -ENOMEM;
> > > > +       }
> > > > +
> > > > +       /* Advertise the interrupt controller */
> > > > +       if (priv->nr_idcs) {
> > > > +               pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > > +                       priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > > +       } else {
> > > > +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > +               pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > > > +                       priv->fwnode, priv->nr_irqs, &pa);
> > > > +       }
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static const struct of_device_id aplic_match[] = {
> > > > +       { .compatible = "riscv,aplic" },
> > > > +       {}
> > > > +};
> > > > +
> > > > +static struct platform_driver aplic_driver = {
> > > > +       .driver = {
> > > > +               .name           = "riscv-aplic",
> > > > +               .of_match_table = aplic_match,
> > > > +       },
> > > > +       .probe = aplic_probe,
> > > > +};
> > > > +builtin_platform_driver(aplic_driver);
> > > > +
> > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > +                               struct device_node *parent)
> > > > +{
> > > > +       /*
> > > > +        * The APLIC platform driver needs to be probed early
> > > > +        * so for device tree:
> > > > +        *
> > > > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > +        *    provides a hint to the device driver core to probe the
> > > > +        *    platform driver early.
> > > > +        * 2) Clear the OF_POPULATED flag in device_node because
> > > > +        *    of_irq_init() sets it which prevents creation of
> > > > +        *    platform device.
> > > > +        */
> > > > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > >
> > > NACK. You are blindly plastering flags without trying to understand
> > > the real issue and fixing this correctly.
> > >
> > > > +       of_node_clear_flag(node, OF_POPULATED);
> > > > +       return 0;
> > > > +}
> > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > >
> > > > This macro pretty much bypasses the driver core's probe path and calls
> > > > the init function directly, so you are supposed to initialize the
> > > > device when that init function is called.
> > >
> > > If you want your device/driver to follow the proper platform driver
> > > path (which is recommended), then you need to use the
> > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > >
> > > I offered to help you debug this issue and I asked for a dts file that
> > > corresponds to a board you are testing this on and seeing an issue.
> > > But you haven't answered my question [1] and are pointing to some
> > > random commit and blaming it. That commit has no impact on any
> > > existing devices/drivers.
> > >
> > > Hi Marc,
> > >
> > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > is used or until Anup actually works with us to debug the real issue.
> >
> > Maybe I misread your previous comment.
> >
> > You can easily reproduce the issue on QEMU virt machine for RISC-V:
> > 1) Build qemu-system-riscv64 from latest QEMU master
> > 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> > (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
> >  APLIC driver at the time of building kernel)
> > 3) Boot an APLIC-only system on the QEMU virt machine
> >     qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
> >     -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
> >     -kernel ./build-riscv64/arch/riscv/boot/Image \
> >     -append "root=/dev/ram rw console=ttyS0 earlycon" \
> >     -initrd ./rootfs_riscv64.img
>
> Unfortunately, I don't have the time to do all that, but I generally
> don't need to run something to figure out the issue. It's generally
> fairly obvious once I look at the DT. I'll also lean on you for some
> debug logs.

The boot log with the FWNODE_FLAG_BEST_EFFORT flag set in the APLIC
driver can be found at:
https://drive.google.com/file/d/1C-uuHbh6Zk9xkAsfGLfhb_4WighvmQp1/view?usp=sharing

The boot log without the FWNODE_FLAG_BEST_EFFORT flag in the APLIC
driver can be found at:
https://drive.google.com/file/d/12SRdR-2Frv_5O06kbuI_LUJ88khjf_7O/view?usp=sharing

>
> Where is the dts file that corresponds to this QEMU run? This is the
> third time I'm asking for a pointer to a dts file that has this issue,
> can you point me to it please? I shouldn't have to say this but: put
> it somewhere and point me to it please. Please don't point me to some
> git repo and ask me to dig around.

For the QEMU virt machine, the DTB is generated at runtime as part of
virt machine creation. The DTS dumped by QEMU using the "dumpdtb"
command line option can be found at:
https://drive.google.com/file/d/1EU-exItL1B7EWuoXw4q-Ypocq--5Wvn8/view?usp=sharing

>
> Can you give me details on what supplier is causing the deferred probe
> that's a problem for you? Any other details you can provide that'll
> help debug this issue?

The fwnode supplier for the APLIC DT node is the OF framework.

>
> > I hope the above steps help you reproduce the issue. I will certainly
> > test whatever fix you propose.
>
> Do you plan to try the fix I suggested already? The one about using
> the correct macros?

Do you mean using IRQCHIP_DECLARE() in the APLIC driver, or
something else?

Regards,
Anup


* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-19  6:13         ` Anup Patel
@ 2023-06-22 20:56           ` Saravana Kannan
  2023-06-23 11:47             ` Anup Patel
  0 siblings, 1 reply; 28+ messages in thread
From: Saravana Kannan @ 2023-06-22 20:56 UTC (permalink / raw)
  To: Anup Patel
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Conor Dooley, linux-riscv, linux-kernel,
	devicetree, iommu, Android Kernel Team

On Sun, Jun 18, 2023 at 11:13 PM Anup Patel <apatel@ventanamicro.com> wrote:
>
> On Sat, Jun 17, 2023 at 3:36 AM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
> > > > >
> > > > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > > > platform. This new interrupt controller is referred to as advanced
> > > > > platform-level interrupt controller (APLIC) which can forward wired
> > > > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > > > signaled interrupts.
> > > > > > (For more details refer to https://github.com/riscv/riscv-aia)
> > > > >
> > > > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > > > platforms.
> > > > >
> > > > > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > > > > ---
> > > > >  drivers/irqchip/Kconfig             |   6 +
> > > > >  drivers/irqchip/Makefile            |   1 +
> > > > >  drivers/irqchip/irq-riscv-aplic.c   | 765 ++++++++++++++++++++++++++++
> > > > >  include/linux/irqchip/riscv-aplic.h | 119 +++++
> > > > >  4 files changed, 891 insertions(+)
> > > > >  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > > > >  create mode 100644 include/linux/irqchip/riscv-aplic.h
> > > > >
> > > > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > > > index d700980372ef..834c0329f583 100644
> > > > > --- a/drivers/irqchip/Kconfig
> > > > > +++ b/drivers/irqchip/Kconfig
> > > > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > > > >         select IRQ_DOMAIN_HIERARCHY
> > > > >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > > > >
> > > > > +config RISCV_APLIC
> > > > > +       bool
> > > > > +       depends on RISCV
> > > > > +       select IRQ_DOMAIN_HIERARCHY
> > > > > +       select GENERIC_MSI_IRQ
> > > > > +
> > > > >  config RISCV_IMSIC
> > > > >         bool
> > > > >         depends on RISCV
> > > > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > > > index 577bde3e986b..438b8e1a152c 100644
> > > > > --- a/drivers/irqchip/Makefile
> > > > > +++ b/drivers/irqchip/Makefile
> > > > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> > > > >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> > > > >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> > > > >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > > > > +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
> > > > >  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> > > > >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> > > > >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> > > > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > > > new file mode 100644
> > > > > index 000000000000..1e710fdf5608
> > > > > --- /dev/null
> > > > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > > > @@ -0,0 +1,765 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > +/*
> > > > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > > > + */
> > > > > +
> > > > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > > > +#include <linux/bitops.h>
> > > > > +#include <linux/cpu.h>
> > > > > +#include <linux/interrupt.h>
> > > > > +#include <linux/io.h>
> > > > > +#include <linux/irq.h>
> > > > > +#include <linux/irqchip.h>
> > > > > +#include <linux/irqchip/chained_irq.h>
> > > > > +#include <linux/irqchip/riscv-aplic.h>
> > > > > +#include <linux/irqchip/riscv-imsic.h>
> > > > > +#include <linux/irqdomain.h>
> > > > > +#include <linux/module.h>
> > > > > +#include <linux/msi.h>
> > > > > +#include <linux/platform_device.h>
> > > > > +#include <linux/smp.h>
> > > > > +
> > > > > +#define APLIC_DEFAULT_PRIORITY         1
> > > > > +#define APLIC_DISABLE_IDELIVERY                0
> > > > > +#define APLIC_ENABLE_IDELIVERY         1
> > > > > +#define APLIC_DISABLE_ITHRESHOLD       1
> > > > > +#define APLIC_ENABLE_ITHRESHOLD                0
> > > > > +
> > > > > +struct aplic_msicfg {
> > > > > +       phys_addr_t             base_ppn;
> > > > > +       u32                     hhxs;
> > > > > +       u32                     hhxw;
> > > > > +       u32                     lhxs;
> > > > > +       u32                     lhxw;
> > > > > +};
> > > > > +
> > > > > +struct aplic_idc {
> > > > > +       unsigned int            hart_index;
> > > > > +       void __iomem            *regs;
> > > > > +       struct aplic_priv       *priv;
> > > > > +};
> > > > > +
> > > > > +struct aplic_priv {
> > > > > +       struct fwnode_handle    *fwnode;
> > > > > +       u32                     gsi_base;
> > > > > +       u32                     nr_irqs;
> > > > > +       u32                     nr_idcs;
> > > > > +       void __iomem            *regs;
> > > > > +       struct irq_domain       *irqdomain;
> > > > > +       struct aplic_msicfg     msicfg;
> > > > > +       struct cpumask          lmask;
> > > > > +};
> > > > > +
> > > > > +static unsigned int aplic_idc_parent_irq;
> > > > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > > > +
> > > > > +static void aplic_irq_unmask(struct irq_data *d)
> > > > > +{
> > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +
> > > > > +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > > > +
> > > > > +       if (!priv->nr_idcs)
> > > > > +               irq_chip_unmask_parent(d);
> > > > > +}
> > > > > +
> > > > > +static void aplic_irq_mask(struct irq_data *d)
> > > > > +{
> > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +
> > > > > +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > > > +
> > > > > +       if (!priv->nr_idcs)
> > > > > +               irq_chip_mask_parent(d);
> > > > > +}
> > > > > +
> > > > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > > > +{
> > > > > +       u32 val = 0;
> > > > > +       void __iomem *sourcecfg;
> > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +
> > > > > +       switch (type) {
> > > > > +       case IRQ_TYPE_NONE:
> > > > > +               val = APLIC_SOURCECFG_SM_INACTIVE;
> > > > > +               break;
> > > > > +       case IRQ_TYPE_LEVEL_LOW:
> > > > > +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > > > +               break;
> > > > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > > > +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > > > +               break;
> > > > > +       case IRQ_TYPE_EDGE_FALLING:
> > > > > +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > > > +               break;
> > > > > +       case IRQ_TYPE_EDGE_RISING:
> > > > > +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > > > +               break;
> > > > > +       default:
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > > +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > > > +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > > > +       writel(val, sourcecfg);
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static void aplic_irq_eoi(struct irq_data *d)
> > > > > +{
> > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +       u32 reg_off, reg_mask;
> > > > > +
> > > > > +       /*
> > > > > > +        * EOI handling is required only for level-triggered
> > > > > > +        * interrupts in APLIC MSI mode.
> > > > > +        */
> > > > > +
> > > > > +       if (priv->nr_idcs)
> > > > > +               return;
> > > > > +
> > > > > +       reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > > > +       reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > > > +       switch (irqd_get_trigger_type(d)) {
> > > > > +       case IRQ_TYPE_LEVEL_LOW:
> > > > > +               if (!(readl(priv->regs + reg_off) & reg_mask))
> > > > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > +               break;
> > > > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > > > +               if (readl(priv->regs + reg_off) & reg_mask)
> > > > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > +               break;
> > > > > +       }
> > > > > +}
> > > > > +
> > > > > +#ifdef CONFIG_SMP
> > > > > +static int aplic_set_affinity(struct irq_data *d,
> > > > > +                             const struct cpumask *mask_val, bool force)
> > > > > +{
> > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +       struct aplic_idc *idc;
> > > > > +       unsigned int cpu, val;
> > > > > +       struct cpumask amask;
> > > > > +       void __iomem *target;
> > > > > +
> > > > > +       if (!priv->nr_idcs)
> > > > > +               return irq_chip_set_affinity_parent(d, mask_val, force);
> > > > > +
> > > > > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > > > > +
> > > > > +       if (force)
> > > > > +               cpu = cpumask_first(&amask);
> > > > > +       else
> > > > > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > +
> > > > > +       if (cpu >= nr_cpu_ids)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > > > +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > +       val |= APLIC_DEFAULT_PRIORITY;
> > > > > +       writel(val, target);
> > > > > +
> > > > > +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > +
> > > > > +       return IRQ_SET_MASK_OK_DONE;
> > > > > +}
> > > > > +#endif
> > > > > +
> > > > > +static struct irq_chip aplic_chip = {
> > > > > +       .name           = "RISC-V APLIC",
> > > > > +       .irq_mask       = aplic_irq_mask,
> > > > > +       .irq_unmask     = aplic_irq_unmask,
> > > > > +       .irq_set_type   = aplic_set_type,
> > > > > +       .irq_eoi        = aplic_irq_eoi,
> > > > > +#ifdef CONFIG_SMP
> > > > > +       .irq_set_affinity = aplic_set_affinity,
> > > > > +#endif
> > > > > +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> > > > > +                         IRQCHIP_SKIP_SET_WAKE |
> > > > > +                         IRQCHIP_MASK_ON_SUSPEND,
> > > > > +};
> > > > > +
> > > > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > > > +                                    u32 gsi_base,
> > > > > +                                    unsigned long *hwirq,
> > > > > +                                    unsigned int *type)
> > > > > +{
> > > > > +       if (WARN_ON(fwspec->param_count < 2))
> > > > > +               return -EINVAL;
> > > > > +       if (WARN_ON(!fwspec->param[0]))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       /* For DT, gsi_base is always zero. */
> > > > > +       *hwirq = fwspec->param[0] - gsi_base;
> > > > > +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > > > +
> > > > > +       WARN_ON(*type == IRQ_TYPE_NONE);
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > > > +                                        struct irq_fwspec *fwspec,
> > > > > +                                        unsigned long *hwirq,
> > > > > +                                        unsigned int *type)
> > > > > +{
> > > > > +       struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > > > +
> > > > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > +}
> > > > > +
> > > > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > > > +                                    void *arg)
> > > > > +{
> > > > > +       int i, ret;
> > > > > +       unsigned int type;
> > > > > +       irq_hw_number_t hwirq;
> > > > > +       struct irq_fwspec *fwspec = arg;
> > > > > +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > > > +
> > > > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       for (i = 0; i < nr_irqs; i++) {
> > > > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > > > +                                   NULL, NULL);
> > > > > +               /*
> > > > > +                * APLIC does not implement irq_disable() so Linux interrupt
> > > > > +                * subsystem will take a lazy approach for disabling an APLIC
> > > > > +                * interrupt. This means APLIC interrupts are left unmasked
> > > > > +                * upon system suspend and interrupts are not processed
> > > > > +                * immediately upon system wake up. To tackle this, we disable
> > > > > +                * the lazy approach for all APLIC interrupts.
> > > > > +                */
> > > > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > +       }
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > > > +       .translate      = aplic_irqdomain_msi_translate,
> > > > > +       .alloc          = aplic_irqdomain_msi_alloc,
> > > > > +       .free           = platform_msi_device_domain_free,
> > > > > +};
> > > > > +
> > > > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > > > +                                        struct irq_fwspec *fwspec,
> > > > > +                                        unsigned long *hwirq,
> > > > > +                                        unsigned int *type)
> > > > > +{
> > > > > +       struct aplic_priv *priv = d->host_data;
> > > > > +
> > > > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > +}
> > > > > +
> > > > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > > > +                                    void *arg)
> > > > > +{
> > > > > +       int i, ret;
> > > > > +       unsigned int type;
> > > > > +       irq_hw_number_t hwirq;
> > > > > +       struct irq_fwspec *fwspec = arg;
> > > > > +       struct aplic_priv *priv = domain->host_data;
> > > > > +
> > > > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       for (i = 0; i < nr_irqs; i++) {
> > > > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > > > +                                   NULL, NULL);
> > > > > +               irq_set_affinity(virq + i, &priv->lmask);
> > > > > +               /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > +       }
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > > > +       .translate      = aplic_irqdomain_idc_translate,
> > > > > +       .alloc          = aplic_irqdomain_idc_alloc,
> > > > > +       .free           = irq_domain_free_irqs_top,
> > > > > +};
> > > > > +
> > > > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > > > +{
> > > > > +       int i;
> > > > > +
> > > > > +       /* Disable all interrupts */
> > > > > +       for (i = 0; i <= priv->nr_irqs; i += 32)
> > > > > +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > > > +                           (i / 32) * sizeof(u32));
> > > > > +
> > > > > +       /* Set interrupt type and default priority for all interrupts */
> > > > > +       for (i = 1; i <= priv->nr_irqs; i++) {
> > > > > +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > > > +                         (i - 1) * sizeof(u32));
> > > > > +               writel(APLIC_DEFAULT_PRIORITY,
> > > > > +                      priv->regs + APLIC_TARGET_BASE +
> > > > > +                      (i - 1) * sizeof(u32));
> > > > > +       }
> > > > > +
> > > > > +       /* Clear APLIC domaincfg */
> > > > > +       writel(0, priv->regs + APLIC_DOMAINCFG);
> > > > > +}
> > > > > +
> > > > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > > > +{
> > > > > +       u32 val;
> > > > > +#ifdef CONFIG_RISCV_M_MODE
> > > > > +       u32 valH;
> > > > > +
> > > > > +       if (!priv->nr_idcs) {
> > > > > +               val = priv->msicfg.base_ppn;
> > > > > +               valH = (priv->msicfg.base_ppn >> 32) &
> > > > > +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > > > +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > > > +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > > > +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > > > +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > > > +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > > > +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > > > +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > > > +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > > > +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > > > +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > > > +       }
> > > > > +#endif
> > > > > +
> > > > > +       /* Setup APLIC domaincfg register */
> > > > > +       val = readl(priv->regs + APLIC_DOMAINCFG);
> > > > > +       val |= APLIC_DOMAINCFG_IE;
> > > > > +       if (!priv->nr_idcs)
> > > > > +               val |= APLIC_DOMAINCFG_DM;
> > > > > +       writel(val, priv->regs + APLIC_DOMAINCFG);
> > > > > +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > > > +               pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > > > +                       priv->fwnode, val);
> > > > > +}
> > > > > +
> > > > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > > > +{
> > > > > +       unsigned int group_index, hart_index, guest_index, val;
> > > > > +       struct irq_data *d = irq_get_irq_data(desc->irq);
> > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > > > +       phys_addr_t tppn, tbppn, msg_addr;
> > > > > +       void __iomem *target;
> > > > > +
> > > > > +       /* For zeroed MSI, simply write zero into the target register */
> > > > > +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > > > +               target = priv->regs + APLIC_TARGET_BASE;
> > > > > +               target += (d->hwirq - 1) * sizeof(u32);
> > > > > +               writel(0, target);
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       /* Sanity check on message data */
> > > > > +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > > > +
> > > > > +       /* Compute target MSI address */
> > > > > +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > > > +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > +
> > > > > +       /* Compute target HART Base PPN */
> > > > > +       tbppn = tppn;
> > > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > +       WARN_ON(tbppn != mc->base_ppn);
> > > > > +
> > > > > +       /* Compute target group and hart indexes */
> > > > > +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > > > +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > > > +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > > > +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > > > +       hart_index |= (group_index << mc->lhxw);
> > > > > +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > > > +
> > > > > +       /* Compute target guest index */
> > > > > +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > > > +
> > > > > +       /* Update IRQ TARGET register */
> > > > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > > > +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > > > +                               << APLIC_TARGET_HART_IDX_SHIFT;
> > > > > +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > > > +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > > > +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > > > +       writel(val, target);
> > > > > +}
> > > > > +
> > > > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > > > +{
> > > > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > > > +       const struct imsic_global_config *imsic_global;
> > > > > +
> > > > > +       /*
> > > > > +        * The APLIC outgoing MSI config registers assume target MSI
> > > > > +        * controller to be RISC-V AIA IMSIC controller.
> > > > > +        */
> > > > > +       imsic_global = imsic_get_global_config();
> > > > > +       if (!imsic_global) {
> > > > > +               pr_err("%pfwP: IMSIC global config not found\n",
> > > > > +                       priv->fwnode);
> > > > > +               return -ENODEV;
> > > > > +       }
> > > > > +
> > > > > +       /* Find number of guest index bits (LHXS) */
> > > > > +       mc->lhxs = imsic_global->guest_index_bits;
> > > > > +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > > > +               pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
> > > > > +                       priv->fwnode);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > > +       /* Find number of HART index bits (LHXW) */
> > > > > +       mc->lhxw = imsic_global->hart_index_bits;
> > > > > +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > > > +               pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
> > > > > +                       priv->fwnode);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > > +       /* Find number of group index bits (HHXW) */
> > > > > +       mc->hhxw = imsic_global->group_index_bits;
> > > > > +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > > > +               pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
> > > > > +                       priv->fwnode);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > > +       /* Find first bit position of group index (HHXS) */
> > > > > +       mc->hhxs = imsic_global->group_index_shift;
> > > > > +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > > > +               pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > > > +                       priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > > > +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > > > +               pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
> > > > > +                       priv->fwnode);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > > +       /* Compute PPN base */
> > > > > +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > +
> > > > > +       /* Use all possible CPUs as lmask */
> > > > > +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > > > > > + * which returns the highest-priority pending interrupt and clears the
> > > > > > + * pending bit of that interrupt. This process is repeated until the
> > > > > > + * CLAIMI register returns zero.
> > > > > + */
> > > > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > > > +{
> > > > > +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > > > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > > > > +       irq_hw_number_t hw_irq;
> > > > > +       int irq;
> > > > > +
> > > > > +       chained_irq_enter(chip, desc);
> > > > > +
> > > > > +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > > > +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > > > +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > > > +
> > > > > +               if (unlikely(irq <= 0))
> > > > > +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > > > +                                           hw_irq);
> > > > > +               else
> > > > > +                       generic_handle_irq(irq);
> > > > > +       }
> > > > > +
> > > > > +       chained_irq_exit(chip, desc);
> > > > > +}
> > > > > +
> > > > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > > > +{
> > > > > +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > > > +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > > > +
> > > > > +       /* Priority must be less than threshold for interrupt triggering */
> > > > > +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > > > +
> > > > > +       /* Delivery must be set to 1 for interrupt triggering */
> > > > > +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > > > +}
> > > > > +
> > > > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > > > +{
> > > > > +       if (aplic_idc_parent_irq)
> > > > > +               disable_percpu_irq(aplic_idc_parent_irq);
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > > > +{
> > > > > +       if (aplic_idc_parent_irq)
> > > > > +               enable_percpu_irq(aplic_idc_parent_irq,
> > > > > +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > > > +{
> > > > > +       int i, j, rc, cpu, setup_count = 0;
> > > > > +       struct fwnode_reference_args parent;
> > > > > +       struct irq_domain *domain;
> > > > > +       unsigned long hartid;
> > > > > +       struct aplic_idc *idc;
> > > > > +       u32 val;
> > > > > +
> > > > > +       /* Setup per-CPU IDC and target CPU mask */
> > > > > +       for (i = 0; i < priv->nr_idcs; i++) {
> > > > > +               rc = fwnode_property_get_reference_args(priv->fwnode,
> > > > > +                               "interrupts-extended", "#interrupt-cells",
> > > > > +                               0, i, &parent);
> > > > > +               if (rc) {
> > > > > +                       pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > > > +                               priv->fwnode, i);
> > > > > +                       continue;
> > > > > +               }
> > > > > +
> > > > > +               /*
> > > > > +                * Skip interrupts other than external interrupts for
> > > > > +                * current privilege level.
> > > > > +                */
> > > > > +               if (parent.args[0] != RV_IRQ_EXT)
> > > > > +                       continue;
> > > > > +
> > > > > +               rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > > > +               if (rc) {
> > > > > +                       pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > > > +                               priv->fwnode, i);
> > > > > +                       continue;
> > > > > +               }
> > > > > +
> > > > > +               cpu = riscv_hartid_to_cpuid(hartid);
> > > > > +               if (cpu < 0) {
> > > > > +                       pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > > > +                               priv->fwnode, i);
> > > > > +                       continue;
> > > > > +               }
> > > > > +
> > > > > +               cpumask_set_cpu(cpu, &priv->lmask);
> > > > > +
> > > > > +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > +               idc->hart_index = i;
> > > > > +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > > > +               idc->priv = priv;
> > > > > +
> > > > > +               aplic_idc_set_delivery(idc, true);
> > > > > +
> > > > > +               /*
> > > > > +                * Boot cpu might not have APLIC hart_index = 0 so check
> > > > > +                * and update target registers of all interrupts.
> > > > > +                */
> > > > > +               if (cpu == smp_processor_id() && idc->hart_index) {
> > > > > +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > +                       val |= APLIC_DEFAULT_PRIORITY;
> > > > > +                       for (j = 1; j <= priv->nr_irqs; j++)
> > > > > +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> > > > > +                                           (j - 1) * sizeof(u32));
> > > > > +               }
> > > > > +
> > > > > +               setup_count++;
> > > > > +       }
> > > > > +
> > > > > +       /* Find parent domain and register chained handler */
> > > > > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > > > +                                         DOMAIN_BUS_ANY);
> > > > > +       if (!aplic_idc_parent_irq && domain) {
> > > > > +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > > > +               if (aplic_idc_parent_irq) {
> > > > > +                       irq_set_chained_handler(aplic_idc_parent_irq,
> > > > > +                                               aplic_idc_handle_irq);
> > > > > +
> > > > > +                       /*
> > > > > +                        * Setup CPUHP notifier to enable IDC parent
> > > > > +                        * interrupt on all CPUs
> > > > > +                        */
> > > > > +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > > > +                                         "irqchip/riscv/aplic:starting",
> > > > > +                                         aplic_idc_starting_cpu,
> > > > > +                                         aplic_idc_dying_cpu);
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       /* Fail if we were not able to setup IDC for any CPU */
> > > > > +       return (setup_count) ? 0 : -ENODEV;
> > > > > +}
> > > > > +
> > > > > +static int aplic_probe(struct platform_device *pdev)
> > > > > +{
> > > > > +       struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > > > +       struct fwnode_reference_args parent;
> > > > > +       struct aplic_priv *priv;
> > > > > +       struct resource *res;
> > > > > +       phys_addr_t pa;
> > > > > +       int rc;
> > > > > +
> > > > > +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > > +       if (!priv)
> > > > > +               return -ENOMEM;
> > > > > +       priv->fwnode = fwnode;
> > > > > +
> > > > > +       /* Map the MMIO registers */
> > > > > +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > > +       if (!res) {
> > > > > +               pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +       priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > > > +       if (!priv->regs) {
> > > > > +               pr_err("%pfwP: failed map MMIO registers\n", fwnode);
> > > > > +               return -ENOMEM;
> > > > > +       }
> > > > > +
> > > > > +       /*
> > > > > +        * Find out GSI base number
> > > > > +        *
> > > > > +        * Note: DT does not define "riscv,gsi-base" property so GSI
> > > > > +        * base is always zero for DT.
> > > > > +        */
> > > > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > > > +                                           &priv->gsi_base, 1);
> > > > > +       if (rc)
> > > > > +               priv->gsi_base = 0;
> > > > > +
> > > > > +       /* Find out number of interrupt sources */
> > > > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > > > +                                           &priv->nr_irqs, 1);
> > > > > +       if (rc) {
> > > > > +               pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > > > +                       fwnode);
> > > > > +               return rc;
> > > > > +       }
> > > > > +
> > > > > +       /* Setup initial state APLIC interrupts */
> > > > > +       aplic_init_hw_irqs(priv);
> > > > > +
> > > > > +       /*
> > > > > +        * Find out number of IDCs based on parent interrupts
> > > > > +        *
> > > > > +        * If "msi-parent" property is present then we ignore the
> > > > > +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > > > +        */
> > > > > +       if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > > > +               while (!fwnode_property_get_reference_args(fwnode,
> > > > > +                               "interrupts-extended", "#interrupt-cells",
> > > > > +                               0, priv->nr_idcs, &parent))
> > > > > +                       priv->nr_idcs++;
> > > > > +       }
> > > > > +
> > > > > +       /* Setup IDCs or MSIs based on number of IDCs */
> > > > > +       if (priv->nr_idcs)
> > > > > +               rc = aplic_setup_idc(priv);
> > > > > +       else
> > > > > +               rc = aplic_setup_msi(priv);
> > > > > +       if (rc) {
> > > > > +               pr_err("%pfwP: failed setup %s\n",
> > > > > +                       fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > > > +               return rc;
> > > > > +       }
> > > > > +
> > > > > +       /* Setup global config and interrupt delivery */
> > > > > +       aplic_init_hw_global(priv);
> > > > > +
> > > > > +       /* Create irq domain instance for the APLIC */
> > > > > +       if (priv->nr_idcs)
> > > > > +               priv->irqdomain = irq_domain_create_linear(
> > > > > +                                               priv->fwnode,
> > > > > +                                               priv->nr_irqs + 1,
> > > > > +                                               &aplic_irqdomain_idc_ops,
> > > > > +                                               priv);
> > > > > +       else
> > > > > +               priv->irqdomain = platform_msi_create_device_domain(
> > > > > +                                               &pdev->dev,
> > > > > +                                               priv->nr_irqs + 1,
> > > > > +                                               aplic_msi_write_msg,
> > > > > +                                               &aplic_irqdomain_msi_ops,
> > > > > +                                               priv);
> > > > > +       if (!priv->irqdomain) {
> > > > > +               pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > > > +               return -ENOMEM;
> > > > > +       }
> > > > > +
> > > > > +       /* Advertise the interrupt controller */
> > > > > +       if (priv->nr_idcs) {
> > > > > +               pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > > > +                       priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > > > +       } else {
> > > > > +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > +               pr_info("%pfwP: %d interrupts forwared to MSI base %pa\n",
> > > > > +                       priv->fwnode, priv->nr_irqs, &pa);
> > > > > +       }
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static const struct of_device_id aplic_match[] = {
> > > > > +       { .compatible = "riscv,aplic" },
> > > > > +       {}
> > > > > +};
> > > > > +
> > > > > +static struct platform_driver aplic_driver = {
> > > > > +       .driver = {
> > > > > +               .name           = "riscv-aplic",
> > > > > +               .of_match_table = aplic_match,
> > > > > +       },
> > > > > +       .probe = aplic_probe,
> > > > > +};
> > > > > +builtin_platform_driver(aplic_driver);
> > > > > +
> > > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > > +                               struct device_node *parent)
> > > > > +{
> > > > > +       /*
> > > > > +        * The APLIC platform driver needs to be probed early
> > > > > +        * so for device tree:
> > > > > +        *
> > > > > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > > +        *    provides a hint to the device driver core to probe the
> > > > > +        *    platform driver early.
> > > > > +        * 2) Clear the OF_POPULATED flag in device_node because
> > > > > +        *    of_irq_init() sets it which prevents creation of
> > > > > +        *    platform device.
> > > > > +        */
> > > > > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > > >
> > > > NACK. You are blindly plastering flags without trying to understand
> > > > the real issue and fixing this correctly.
> > > >
> > > > > +       of_node_clear_flag(node, OF_POPULATED);
> > > > > +       return 0;
> > > > > +}
> > > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > > >
> > > > This macro pretty much skips the entire driver core framework to probe
> > > > and calls init and you are supposed to initialize the device when the
> > > > init function is called.
> > > >
> > > > If you want your device/driver to follow the proper platform driver
> > > > path (which is recommended), then you need to use the
> > > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > > >
> > > > I offered to help you debug this issue and I asked for a dts file that
> > > > corresponds to a board you are testing this on and seeing an issue.
> > > > But you haven't answered my question [1] and are pointing to some
> > > > random commit and blaming it. That commit has no impact on any
> > > > existing devices/drivers.
> > > >
> > > > Hi Marc,
> > > >
> > > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > > is used or until Anup actually works with us to debug the real issue.
> > >
> > > Maybe I misread your previous comment.
> > >
> > > You can easily reproduce the issue on QEMU virt machine for RISC-V:
> > > 1) Build qemu-system-riscv64 from latest QEMU master
> > > 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> > > (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
> > >  APLIC driver at the time of building kernel)
> > > 3) Boot an APLIC-only system on QEMU virt machine
> > >     qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
> > >     -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
> > >     -kernel ./build-riscv64/arch/riscv/boot/Image \
> > >     -append "root=/dev/ram rw console=ttyS0 earlycon" \
> > >     -initrd ./rootfs_riscv64.img
> >
> > Unfortunately, I don't have the time to do all that, but I generally
> > don't need to run something to figure out the issue. It's generally
> > fairly obvious once I look at the DT. I'll also lean on you for some
> > debug logs.
>
> The boot log with FWNODE_FLAG_BEST_EFFORT flag in APLIC can be
> found at:
> https://drive.google.com/file/d/1C-uuHbh6Zk9xkAsfGLfhb_4WighvmQp1/view?usp=sharing
>
> The boot log without FWNODE_FLAG_BEST_EFFORT flag in APLIC can
> be found at:
> https://drive.google.com/file/d/12SRdR-2Frv_5O06kbuI_LUJ88khjf_7O/view?usp=sharing
>
> >
> > Where is the dts file that corresponds to this QEMU run? This is the
> > third time I'm asking for a pointer to a dts file that has this issue,
> > can you point me to it please? I shouldn't have to say this but: put
> > it somewhere and point me to it please. Please don't point me to some
> > git repo and ask me to dig around.
>
> For QEMU virt machine, the DTB is generated at runtime as part of
> virt machine creation. The DTS dumped by QEMU using the "dumpdtb"
> command line option can be found at:
> https://drive.google.com/file/d/1EU-exItL1B7EWuoXw4q-Ypocq--5Wvn8/view?usp=sharing
>
> >
> > Can you give me details on what supplier is causing the deferred probe
> > that's a problem for you? Any other details you can provide that'll
> > help debug this issue?
>
> FWNODE supplier for APLIC DT node is the OF framework.
>
> >
> > > I hope the above steps help you reproduce the issue. I will certainly
> > > test whatever fix you propose.
> >
> > Do you plan to try the fix I suggested already? The one about using
> > the correct macros?
>
> Do you mean using IRQCHIP_DECLARE() in the APLIC driver,
> or something else?

No. My previous email asked you to NOT use IRQCHIP_DECLARE() and
to use the IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros instead.
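
For illustration only, a minimal sketch of what such a conversion might
look like. The callback name aplic_of_init() and the driver name
riscv_aplic are assumptions made for this sketch, not taken from the
patch, and the fragment builds only inside a kernel tree. How the
existing aplic_probe() logic would be reached from the callback is left
open here.

```c
/* Sketch only: kernel-tree fragment, not a standalone program. */
#include <linux/irqchip.h>
#include <linux/of.h>

/*
 * Hypothetical init callback (name is an assumption). With the
 * platform-driver macros this is called from the irqchip core's
 * platform probe path rather than from of_irq_init(), so the actual
 * device setup would be done here.
 */
static int aplic_of_init(struct device_node *node,
			 struct device_node *parent)
{
	/* Assumption: the setup now done in aplic_probe() moves here. */
	return 0;
}

/* Replaces the IRQCHIP_DECLARE() line and the FWNODE flag workaround. */
IRQCHIP_PLATFORM_DRIVER_BEGIN(riscv_aplic)
IRQCHIP_MATCH("riscv,aplic", aplic_of_init)
IRQCHIP_PLATFORM_DRIVER_END(riscv_aplic)
```

The existing users of these macros under drivers/irqchip/ remain the
authoritative examples.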

-Saravana

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-22 20:56           ` Saravana Kannan
@ 2023-06-23 11:47             ` Anup Patel
  2023-06-23 12:49               ` Marc Zyngier
  0 siblings, 1 reply; 28+ messages in thread
From: Anup Patel @ 2023-06-23 11:47 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Conor Dooley, linux-riscv, linux-kernel,
	devicetree, iommu, Android Kernel Team

On Fri, Jun 23, 2023 at 2:27 AM Saravana Kannan <saravanak@google.com> wrote:
>
> On Sun, Jun 18, 2023 at 11:13 PM Anup Patel <apatel@ventanamicro.com> wrote:
> >
> > On Sat, Jun 17, 2023 at 3:36 AM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <anup@brainfault.org> wrote:
> > > >
> > > > On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > >
> > > > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <apatel@ventanamicro.com> wrote:
> > > > > >
> > > > > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > > > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > > > > platform. This new interrupt controller is referred to as advanced
> > > > > > platform-level interrupt controller (APLIC) which can forward wired
> > > > > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > > > > signaled interrupts.
> > > > > > > (For more details refer to https://github.com/riscv/riscv-aia)
> > > > > >
> > > > > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > > > > platforms.
> > > > > >
> > > > > > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > > > > > ---
> > > > > >  drivers/irqchip/Kconfig             |   6 +
> > > > > >  drivers/irqchip/Makefile            |   1 +
> > > > > >  drivers/irqchip/irq-riscv-aplic.c   | 765 ++++++++++++++++++++++++++++
> > > > > >  include/linux/irqchip/riscv-aplic.h | 119 +++++
> > > > > >  4 files changed, 891 insertions(+)
> > > > > >  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > > > > >  create mode 100644 include/linux/irqchip/riscv-aplic.h
> > > > > >
> > > > > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > > > > index d700980372ef..834c0329f583 100644
> > > > > > --- a/drivers/irqchip/Kconfig
> > > > > > +++ b/drivers/irqchip/Kconfig
> > > > > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > > > > >         select IRQ_DOMAIN_HIERARCHY
> > > > > >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > > > > >
> > > > > > +config RISCV_APLIC
> > > > > > +       bool
> > > > > > +       depends on RISCV
> > > > > > +       select IRQ_DOMAIN_HIERARCHY
> > > > > > +       select GENERIC_MSI_IRQ
> > > > > > +
> > > > > >  config RISCV_IMSIC
> > > > > >         bool
> > > > > >         depends on RISCV
> > > > > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > > > > index 577bde3e986b..438b8e1a152c 100644
> > > > > > --- a/drivers/irqchip/Makefile
> > > > > > +++ b/drivers/irqchip/Makefile
> > > > > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> > > > > >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> > > > > >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> > > > > >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > > > > > +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
> > > > > >  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> > > > > >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> > > > > >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> > > > > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > > > > new file mode 100644
> > > > > > index 000000000000..1e710fdf5608
> > > > > > --- /dev/null
> > > > > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > > > > @@ -0,0 +1,765 @@
> > > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > > +/*
> > > > > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > > > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > > > > + */
> > > > > > +
> > > > > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > > > > +#include <linux/bitops.h>
> > > > > > +#include <linux/cpu.h>
> > > > > > +#include <linux/interrupt.h>
> > > > > > +#include <linux/io.h>
> > > > > > +#include <linux/irq.h>
> > > > > > +#include <linux/irqchip.h>
> > > > > > +#include <linux/irqchip/chained_irq.h>
> > > > > > +#include <linux/irqchip/riscv-aplic.h>
> > > > > > +#include <linux/irqchip/riscv-imsic.h>
> > > > > > +#include <linux/irqdomain.h>
> > > > > > +#include <linux/module.h>
> > > > > > +#include <linux/msi.h>
> > > > > > +#include <linux/platform_device.h>
> > > > > > +#include <linux/smp.h>
> > > > > > +
> > > > > > +#define APLIC_DEFAULT_PRIORITY         1
> > > > > > +#define APLIC_DISABLE_IDELIVERY                0
> > > > > > +#define APLIC_ENABLE_IDELIVERY         1
> > > > > > +#define APLIC_DISABLE_ITHRESHOLD       1
> > > > > > +#define APLIC_ENABLE_ITHRESHOLD                0
> > > > > > +
> > > > > > +struct aplic_msicfg {
> > > > > > +       phys_addr_t             base_ppn;
> > > > > > +       u32                     hhxs;
> > > > > > +       u32                     hhxw;
> > > > > > +       u32                     lhxs;
> > > > > > +       u32                     lhxw;
> > > > > > +};
> > > > > > +
> > > > > > +struct aplic_idc {
> > > > > > +       unsigned int            hart_index;
> > > > > > +       void __iomem            *regs;
> > > > > > +       struct aplic_priv       *priv;
> > > > > > +};
> > > > > > +
> > > > > > +struct aplic_priv {
> > > > > > +       struct fwnode_handle    *fwnode;
> > > > > > +       u32                     gsi_base;
> > > > > > +       u32                     nr_irqs;
> > > > > > +       u32                     nr_idcs;
> > > > > > +       void __iomem            *regs;
> > > > > > +       struct irq_domain       *irqdomain;
> > > > > > +       struct aplic_msicfg     msicfg;
> > > > > > +       struct cpumask          lmask;
> > > > > > +};
> > > > > > +
> > > > > > +static unsigned int aplic_idc_parent_irq;
> > > > > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > > > > +
> > > > > > +static void aplic_irq_unmask(struct irq_data *d)
> > > > > > +{
> > > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +
> > > > > > +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > > > > +
> > > > > > +       if (!priv->nr_idcs)
> > > > > > +               irq_chip_unmask_parent(d);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_irq_mask(struct irq_data *d)
> > > > > > +{
> > > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +
> > > > > > +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > > > > +
> > > > > > +       if (!priv->nr_idcs)
> > > > > > +               irq_chip_mask_parent(d);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > > > > +{
> > > > > > +       u32 val = 0;
> > > > > > +       void __iomem *sourcecfg;
> > > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +
> > > > > > +       switch (type) {
> > > > > > +       case IRQ_TYPE_NONE:
> > > > > > +               val = APLIC_SOURCECFG_SM_INACTIVE;
> > > > > > +               break;
> > > > > > +       case IRQ_TYPE_LEVEL_LOW:
> > > > > > +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > > > > +               break;
> > > > > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > > > > +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > > > > +               break;
> > > > > > +       case IRQ_TYPE_EDGE_FALLING:
> > > > > > +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > > > > +               break;
> > > > > > +       case IRQ_TYPE_EDGE_RISING:
> > > > > > +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > > > > +               break;
> > > > > > +       default:
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +
> > > > > > +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > > > > +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > > > > +       writel(val, sourcecfg);
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_irq_eoi(struct irq_data *d)
> > > > > > +{
> > > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +       u32 reg_off, reg_mask;
> > > > > > +
> > > > > > +       /*
> > > > > > +        * EOI handling only required only for level-triggered
> > > > > > +        * interrupts in APLIC MSI mode.
> > > > > > +        */
> > > > > > +
> > > > > > +       if (priv->nr_idcs)
> > > > > > +               return;
> > > > > > +
> > > > > > +       reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > > > > +       reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > > > > +       switch (irqd_get_trigger_type(d)) {
> > > > > > +       case IRQ_TYPE_LEVEL_LOW:
> > > > > > +               if (!(readl(priv->regs + reg_off) & reg_mask))
> > > > > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > > +               break;
> > > > > > +       case IRQ_TYPE_LEVEL_HIGH:
> > > > > > +               if (readl(priv->regs + reg_off) & reg_mask)
> > > > > > +                       writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > > +               break;
> > > > > > +       }
> > > > > > +}
> > > > > > +
> > > > > > +#ifdef CONFIG_SMP
> > > > > > +static int aplic_set_affinity(struct irq_data *d,
> > > > > > +                             const struct cpumask *mask_val, bool force)
> > > > > > +{
> > > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +       struct aplic_idc *idc;
> > > > > > +       unsigned int cpu, val;
> > > > > > +       struct cpumask amask;
> > > > > > +       void __iomem *target;
> > > > > > +
> > > > > > +       if (!priv->nr_idcs)
> > > > > > +               return irq_chip_set_affinity_parent(d, mask_val, force);
> > > > > > +
> > > > > > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > > > > > +
> > > > > > +       if (force)
> > > > > > +               cpu = cpumask_first(&amask);
> > > > > > +       else
> > > > > > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > > +
> > > > > > +       if (cpu >= nr_cpu_ids)
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > > > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > > > > +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > > +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > > +       val |= APLIC_DEFAULT_PRIORITY;
> > > > > > +       writel(val, target);
> > > > > > +
> > > > > > +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > > +
> > > > > > +       return IRQ_SET_MASK_OK_DONE;
> > > > > > +}
> > > > > > +#endif
> > > > > > +
> > > > > > +static struct irq_chip aplic_chip = {
> > > > > > +       .name           = "RISC-V APLIC",
> > > > > > +       .irq_mask       = aplic_irq_mask,
> > > > > > +       .irq_unmask     = aplic_irq_unmask,
> > > > > > +       .irq_set_type   = aplic_set_type,
> > > > > > +       .irq_eoi        = aplic_irq_eoi,
> > > > > > +#ifdef CONFIG_SMP
> > > > > > +       .irq_set_affinity = aplic_set_affinity,
> > > > > > +#endif
> > > > > > +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> > > > > > +                         IRQCHIP_SKIP_SET_WAKE |
> > > > > > +                         IRQCHIP_MASK_ON_SUSPEND,
> > > > > > +};
> > > > > > +
> > > > > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > > > > +                                    u32 gsi_base,
> > > > > > +                                    unsigned long *hwirq,
> > > > > > +                                    unsigned int *type)
> > > > > > +{
> > > > > > +       if (WARN_ON(fwspec->param_count < 2))
> > > > > > +               return -EINVAL;
> > > > > > +       if (WARN_ON(!fwspec->param[0]))
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       /* For DT, gsi_base is always zero. */
> > > > > > +       *hwirq = fwspec->param[0] - gsi_base;
> > > > > > +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > > > > +
> > > > > > +       WARN_ON(*type == IRQ_TYPE_NONE);
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > > > > +                                        struct irq_fwspec *fwspec,
> > > > > > +                                        unsigned long *hwirq,
> > > > > > +                                        unsigned int *type)
> > > > > > +{
> > > > > > +       struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > > > > +
> > > > > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > > > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > > > > +                                    void *arg)
> > > > > > +{
> > > > > > +       int i, ret;
> > > > > > +       unsigned int type;
> > > > > > +       irq_hw_number_t hwirq;
> > > > > > +       struct irq_fwspec *fwspec = arg;
> > > > > > +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > > > > +
> > > > > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > +
> > > > > > +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > +
> > > > > > +       for (i = 0; i < nr_irqs; i++) {
> > > > > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > > > > +                                   NULL, NULL);
> > > > > > +               /*
> > > > > > +                * APLIC does not implement irq_disable() so Linux interrupt
> > > > > > +                * subsystem will take a lazy approach for disabling an APLIC
> > > > > > +                * interrupt. This means APLIC interrupts are left unmasked
> > > > > > +                * upon system suspend and interrupts are not processed
> > > > > > +                * immediately upon system wake up. To tackle this, we disable
> > > > > > +                * the lazy approach for all APLIC interrupts.
> > > > > > +                */
> > > > > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > > +       }
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > > > > +       .translate      = aplic_irqdomain_msi_translate,
> > > > > > +       .alloc          = aplic_irqdomain_msi_alloc,
> > > > > > +       .free           = platform_msi_device_domain_free,
> > > > > > +};
> > > > > > +
> > > > > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > > > > +                                        struct irq_fwspec *fwspec,
> > > > > > +                                        unsigned long *hwirq,
> > > > > > +                                        unsigned int *type)
> > > > > > +{
> > > > > > +       struct aplic_priv *priv = d->host_data;
> > > > > > +
> > > > > > +       return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > > > > +                                    unsigned int virq, unsigned int nr_irqs,
> > > > > > +                                    void *arg)
> > > > > > +{
> > > > > > +       int i, ret;
> > > > > > +       unsigned int type;
> > > > > > +       irq_hw_number_t hwirq;
> > > > > > +       struct irq_fwspec *fwspec = arg;
> > > > > > +       struct aplic_priv *priv = domain->host_data;
> > > > > > +
> > > > > > +       ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > +
> > > > > > +       for (i = 0; i < nr_irqs; i++) {
> > > > > > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > > +                                   &aplic_chip, priv, handle_fasteoi_irq,
> > > > > > +                                   NULL, NULL);
> > > > > > +               irq_set_affinity(virq + i, &priv->lmask);
> > > > > > +               /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > > > > +               irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > > +       }
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > > > > +       .translate      = aplic_irqdomain_idc_translate,
> > > > > > +       .alloc          = aplic_irqdomain_idc_alloc,
> > > > > > +       .free           = irq_domain_free_irqs_top,
> > > > > > +};
> > > > > > +
> > > > > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > > > > +{
> > > > > > +       int i;
> > > > > > +
> > > > > > +       /* Disable all interrupts */
> > > > > > +       for (i = 0; i <= priv->nr_irqs; i += 32)
> > > > > > +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > > > > +                           (i / 32) * sizeof(u32));
> > > > > > +
> > > > > > +       /* Set interrupt type and default priority for all interrupts */
> > > > > > +       for (i = 1; i <= priv->nr_irqs; i++) {
> > > > > > +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > > > > +                         (i - 1) * sizeof(u32));
> > > > > > +               writel(APLIC_DEFAULT_PRIORITY,
> > > > > > +                      priv->regs + APLIC_TARGET_BASE +
> > > > > > +                      (i - 1) * sizeof(u32));
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Clear APLIC domaincfg */
> > > > > > +       writel(0, priv->regs + APLIC_DOMAINCFG);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > > > > +{
> > > > > > +       u32 val;
> > > > > > +#ifdef CONFIG_RISCV_M_MODE
> > > > > > +       u32 valH;
> > > > > > +
> > > > > > +       if (!priv->nr_idcs) {
> > > > > > +               val = priv->msicfg.base_ppn;
> > > > > > +               valH = (priv->msicfg.base_ppn >> 32) &
> > > > > > +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > > > > +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > > > > +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > > > > +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > > > > +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > > > > +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > > > > +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > > > > +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > > > > +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > > > > +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > > > > +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > > > > +       }
> > > > > > +#endif
> > > > > > +
> > > > > > +       /* Setup APLIC domaincfg register */
> > > > > > +       val = readl(priv->regs + APLIC_DOMAINCFG);
> > > > > > +       val |= APLIC_DOMAINCFG_IE;
> > > > > > +       if (!priv->nr_idcs)
> > > > > > +               val |= APLIC_DOMAINCFG_DM;
> > > > > > +       writel(val, priv->regs + APLIC_DOMAINCFG);
> > > > > > +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > > > > +               pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > > > > +                       priv->fwnode, val);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > > > > +{
> > > > > > +       unsigned int group_index, hart_index, guest_index, val;
> > > > > > +       struct irq_data *d = irq_get_irq_data(desc->irq);
> > > > > > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > > > > +       phys_addr_t tppn, tbppn, msg_addr;
> > > > > > +       void __iomem *target;
> > > > > > +
> > > > > > +       /* For zeroed MSI, simply write zero into the target register */
> > > > > > +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > > > > +               target = priv->regs + APLIC_TARGET_BASE;
> > > > > > +               target += (d->hwirq - 1) * sizeof(u32);
> > > > > > +               writel(0, target);
> > > > > > +               return;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Sanity check on message data */
> > > > > > +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > > > > +
> > > > > > +       /* Compute target MSI address */
> > > > > > +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > > > > +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > > +
> > > > > > +       /* Compute target HART Base PPN */
> > > > > > +       tbppn = tppn;
> > > > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > > +       WARN_ON(tbppn != mc->base_ppn);
> > > > > > +
> > > > > > +       /* Compute target group and hart indexes */
> > > > > > +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > > > > +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > > > > +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > > > > +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > > > > +       hart_index |= (group_index << mc->lhxw);
> > > > > > +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > > > > +
> > > > > > +       /* Compute target guest index */
> > > > > > +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > > +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > > > > +
> > > > > > +       /* Update IRQ TARGET register */
> > > > > > +       target = priv->regs + APLIC_TARGET_BASE;
> > > > > > +       target += (d->hwirq - 1) * sizeof(u32);
> > > > > > +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > > > > +                               << APLIC_TARGET_HART_IDX_SHIFT;
> > > > > > +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > > > > +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > > > > +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > > > > +       writel(val, target);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > > > > +{
> > > > > > +       struct aplic_msicfg *mc = &priv->msicfg;
> > > > > > +       const struct imsic_global_config *imsic_global;
> > > > > > +
> > > > > > +       /*
> > > > > > +        * The APLIC outgoing MSI config registers assume target MSI
> > > > > > +        * controller to be RISC-V AIA IMSIC controller.
> > > > > > +        */
> > > > > > +       imsic_global = imsic_get_global_config();
> > > > > > +       if (!imsic_global) {
> > > > > > +               pr_err("%pfwP: IMSIC global config not found\n",
> > > > > > +                       priv->fwnode);
> > > > > > +               return -ENODEV;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Find number of guest index bits (LHXS) */
> > > > > > +       mc->lhxs = imsic_global->guest_index_bits;
> > > > > > +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > > > > +               pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
> > > > > > +                       priv->fwnode);
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Find number of HART index bits (LHXW) */
> > > > > > +       mc->lhxw = imsic_global->hart_index_bits;
> > > > > > +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > > > > +               pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
> > > > > > +                       priv->fwnode);
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Find number of group index bits (HHXW) */
> > > > > > +       mc->hhxw = imsic_global->group_index_bits;
> > > > > > +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > > > > +               pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
> > > > > > +                       priv->fwnode);
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Find first bit position of group index (HHXS) */
> > > > > > +       mc->hhxs = imsic_global->group_index_shift;
> > > > > > +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > > > > +               pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > > > > +                       priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > > > > +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > > > > +               pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
> > > > > > +                       priv->fwnode);
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Compute PPN base */
> > > > > > +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > > +
> > > > > > +       /* Use all possible CPUs as lmask */
> > > > > > +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > > > > > > + * which returns the highest-priority pending interrupt and clears
> > > > > > > + * the pending bit of that interrupt. This process is repeated until
> > > > > > > + * the CLAIMI register returns zero.
> > > > > > + */
> > > > > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > > > > +{
> > > > > > +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > > > > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > > > > > +       irq_hw_number_t hw_irq;
> > > > > > +       int irq;
> > > > > > +
> > > > > > +       chained_irq_enter(chip, desc);
> > > > > > +
> > > > > > +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > > > > +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > > > > +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > > > > +
> > > > > > +               if (unlikely(irq <= 0))
> > > > > > +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > > > > +                                           hw_irq);
> > > > > > +               else
> > > > > > +                       generic_handle_irq(irq);
> > > > > > +       }
> > > > > > +
> > > > > > +       chained_irq_exit(chip, desc);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > > > > +{
> > > > > > +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > > > > +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > > > > +
> > > > > > +       /* Priority must be less than threshold for interrupt triggering */
> > > > > > +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > > > > +
> > > > > > +       /* Delivery must be set to 1 for interrupt triggering */
> > > > > > +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > > > > +{
> > > > > > +       if (aplic_idc_parent_irq)
> > > > > > +               disable_percpu_irq(aplic_idc_parent_irq);
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > > > > +{
> > > > > > +       if (aplic_idc_parent_irq)
> > > > > > +               enable_percpu_irq(aplic_idc_parent_irq,
> > > > > > +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > > > > +{
> > > > > > +       int i, j, rc, cpu, setup_count = 0;
> > > > > > +       struct fwnode_reference_args parent;
> > > > > > +       struct irq_domain *domain;
> > > > > > +       unsigned long hartid;
> > > > > > +       struct aplic_idc *idc;
> > > > > > +       u32 val;
> > > > > > +
> > > > > > +       /* Setup per-CPU IDC and target CPU mask */
> > > > > > +       for (i = 0; i < priv->nr_idcs; i++) {
> > > > > > +               rc = fwnode_property_get_reference_args(priv->fwnode,
> > > > > > +                               "interrupts-extended", "#interrupt-cells",
> > > > > > +                               0, i, &parent);
> > > > > > +               if (rc) {
> > > > > > +                       pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > > > > +                               priv->fwnode, i);
> > > > > > +                       continue;
> > > > > > +               }
> > > > > > +
> > > > > > +               /*
> > > > > > +                * Skip interrupts other than external interrupts for
> > > > > > +                * current privilege level.
> > > > > > +                */
> > > > > > +               if (parent.args[0] != RV_IRQ_EXT)
> > > > > > +                       continue;
> > > > > > +
> > > > > > +               rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > > > > +               if (rc) {
> > > > > > +                       pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > > > > +                               priv->fwnode, i);
> > > > > > +                       continue;
> > > > > > +               }
> > > > > > +
> > > > > > +               cpu = riscv_hartid_to_cpuid(hartid);
> > > > > > +               if (cpu < 0) {
> > > > > > +                       pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > > > > +                               priv->fwnode, i);
> > > > > > +                       continue;
> > > > > > +               }
> > > > > > +
> > > > > > +               cpumask_set_cpu(cpu, &priv->lmask);
> > > > > > +
> > > > > > +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > > +               idc->hart_index = i;
> > > > > > +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > > > > +               idc->priv = priv;
> > > > > > +
> > > > > > +               aplic_idc_set_delivery(idc, true);
> > > > > > +
> > > > > > +               /*
> > > > > > +                * Boot cpu might not have APLIC hart_index = 0 so check
> > > > > > +                * and update target registers of all interrupts.
> > > > > > +                */
> > > > > > +               if (cpu == smp_processor_id() && idc->hart_index) {
> > > > > > +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > > +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > > +                       val |= APLIC_DEFAULT_PRIORITY;
> > > > > > +                       for (j = 1; j <= priv->nr_irqs; j++)
> > > > > > +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> > > > > > +                                           (j - 1) * sizeof(u32));
> > > > > > +               }
> > > > > > +
> > > > > > +               setup_count++;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Find parent domain and register chained handler */
> > > > > > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > > > > +                                         DOMAIN_BUS_ANY);
> > > > > > +       if (!aplic_idc_parent_irq && domain) {
> > > > > > +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > > > > +               if (aplic_idc_parent_irq) {
> > > > > > +                       irq_set_chained_handler(aplic_idc_parent_irq,
> > > > > > +                                               aplic_idc_handle_irq);
> > > > > > +
> > > > > > +                       /*
> > > > > > +                        * Setup CPUHP notifier to enable IDC parent
> > > > > > +                        * interrupt on all CPUs
> > > > > > +                        */
> > > > > > +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > > > > +                                         "irqchip/riscv/aplic:starting",
> > > > > > +                                         aplic_idc_starting_cpu,
> > > > > > +                                         aplic_idc_dying_cpu);
> > > > > > +               }
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Fail if we were not able to setup IDC for any CPU */
> > > > > > +       return (setup_count) ? 0 : -ENODEV;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_probe(struct platform_device *pdev)
> > > > > > +{
> > > > > > +       struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > > > > +       struct fwnode_reference_args parent;
> > > > > > +       struct aplic_priv *priv;
> > > > > > +       struct resource *res;
> > > > > > +       phys_addr_t pa;
> > > > > > +       int rc;
> > > > > > +
> > > > > > +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > > > +       if (!priv)
> > > > > > +               return -ENOMEM;
> > > > > > +       priv->fwnode = fwnode;
> > > > > > +
> > > > > > +       /* Map the MMIO registers */
> > > > > > +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > > > +       if (!res) {
> > > > > > +               pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +       priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > > > > +       if (!priv->regs) {
> > > > > > +               pr_err("%pfwP: failed map MMIO registers\n", fwnode);
> > > > > > +               return -ENOMEM;
> > > > > > +       }
> > > > > > +
> > > > > > +       /*
> > > > > > +        * Find out GSI base number
> > > > > > +        *
> > > > > > +        * Note: DT does not define "riscv,gsi-base" property so GSI
> > > > > > +        * base is always zero for DT.
> > > > > > +        */
> > > > > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > > > > +                                           &priv->gsi_base, 1);
> > > > > > +       if (rc)
> > > > > > +               priv->gsi_base = 0;
> > > > > > +
> > > > > > +       /* Find out number of interrupt sources */
> > > > > > +       rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > > > > +                                           &priv->nr_irqs, 1);
> > > > > > +       if (rc) {
> > > > > > +               pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > > > > +                       fwnode);
> > > > > > +               return rc;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Setup initial state APLIC interrupts */
> > > > > > +       aplic_init_hw_irqs(priv);
> > > > > > +
> > > > > > +       /*
> > > > > > +        * Find out number of IDCs based on parent interrupts
> > > > > > +        *
> > > > > > +        * If "msi-parent" property is present then we ignore the
> > > > > > +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > > > > +        */
> > > > > > +       if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > > > > +               while (!fwnode_property_get_reference_args(fwnode,
> > > > > > +                               "interrupts-extended", "#interrupt-cells",
> > > > > > +                               0, priv->nr_idcs, &parent))
> > > > > > +                       priv->nr_idcs++;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Setup IDCs or MSIs based on number of IDCs */
> > > > > > +       if (priv->nr_idcs)
> > > > > > +               rc = aplic_setup_idc(priv);
> > > > > > +       else
> > > > > > +               rc = aplic_setup_msi(priv);
> > > > > > +       if (rc) {
> > > > > > +               pr_err("%pfwP: failed setup %s\n",
> > > > > > +                       fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > > > > +               return rc;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Setup global config and interrupt delivery */
> > > > > > +       aplic_init_hw_global(priv);
> > > > > > +
> > > > > > +       /* Create irq domain instance for the APLIC */
> > > > > > +       if (priv->nr_idcs)
> > > > > > +               priv->irqdomain = irq_domain_create_linear(
> > > > > > +                                               priv->fwnode,
> > > > > > +                                               priv->nr_irqs + 1,
> > > > > > +                                               &aplic_irqdomain_idc_ops,
> > > > > > +                                               priv);
> > > > > > +       else
> > > > > > +               priv->irqdomain = platform_msi_create_device_domain(
> > > > > > +                                               &pdev->dev,
> > > > > > +                                               priv->nr_irqs + 1,
> > > > > > +                                               aplic_msi_write_msg,
> > > > > > +                                               &aplic_irqdomain_msi_ops,
> > > > > > +                                               priv);
> > > > > > +       if (!priv->irqdomain) {
> > > > > > +               pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > > > > +               return -ENOMEM;
> > > > > > +       }
> > > > > > +
> > > > > > +       /* Advertise the interrupt controller */
> > > > > > +       if (priv->nr_idcs) {
> > > > > > +               pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > > > > +                       priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > > > > +       } else {
> > > > > > +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > > +               pr_info("%pfwP: %d interrupts forwared to MSI base %pa\n",
> > > > > > +                       priv->fwnode, priv->nr_irqs, &pa);
> > > > > > +       }
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static const struct of_device_id aplic_match[] = {
> > > > > > +       { .compatible = "riscv,aplic" },
> > > > > > +       {}
> > > > > > +};
> > > > > > +
> > > > > > +static struct platform_driver aplic_driver = {
> > > > > > +       .driver = {
> > > > > > +               .name           = "riscv-aplic",
> > > > > > +               .of_match_table = aplic_match,
> > > > > > +       },
> > > > > > +       .probe = aplic_probe,
> > > > > > +};
> > > > > > +builtin_platform_driver(aplic_driver);
> > > > > > +
> > > > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > > > +                               struct device_node *parent)
> > > > > > +{
> > > > > > +       /*
> > > > > > +        * The APLIC platform driver needs to be probed early
> > > > > > +        * so for device tree:
> > > > > > +        *
> > > > > > +        * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > > > +        *    provides a hint to the device driver core to probe the
> > > > > > +        *    platform driver early.
> > > > > > +        * 2) Clear the OF_POPULATED flag in device_node because
> > > > > > +        *    of_irq_init() sets it which prevents creation of
> > > > > > +        *    platform device.
> > > > > > +        */
> > > > > > +       node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > > > >
> > > > > NACK. You are blindly plastering flags without trying to understand
> > > > > the real issue and fixing this correctly.
> > > > >
> > > > > > +       of_node_clear_flag(node, OF_POPULATED);
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > > > >
> > > > > This macro pretty much skips the entire driver core framework to probe
> > > > > and calls init and you are supposed to initialize the device when the
> > > > > init function is called.
> > > > >
> > > > > If you want your device/driver to follow the proper platform driver
> > > > > path (which is recommended), then you need to use the
> > > > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > > > >
> > > > > I offered to help you debug this issue and I asked for a dts file that
> > > > > corresponds to a board you are testing this on and seeing an issue.
> > > > > But you haven't answered my question [1] and are pointing to some
> > > > > random commit and blaming it. That commit has no impact on any
> > > > > existing devices/drivers.
> > > > >
> > > > > Hi Marc,
> > > > >
> > > > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > > > is used or until Anup actually works with us to debug the real issue.
> > > >
> > > > Maybe I misread your previous comment.
> > > >
> > > > You can easily reproduce the issue on QEMU virt machine for RISC-V:
> > > > 1) Build qemu-system-riscv64 from latest QEMU master
> > > > 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> > > > (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
> > > >  APLIC driver at the time of building kernel)
> > > > 3) Boot an APLIC-only system on QEMU virt machine
> > > >     qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
> > > >     -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
> > > >     -kernel ./build-riscv64/arch/riscv/boot/Image \
> > > >     -append "root=/dev/ram rw console=ttyS0 earlycon" \
> > > >     -initrd ./rootfs_riscv64.img
> > >
> > > Unfortunately, I don't have the time to do all that, but I generally
> > > don't need to run something to figure out the issue. It's generally
> > > fairly obvious once I look at the DT. I'll also lean on you for some
> > > debug logs.
> >
> > The boot log with the FWNODE_FLAG_BEST_EFFORT flag in APLIC can be
> > found at:
> > https://drive.google.com/file/d/1C-uuHbh6Zk9xkAsfGLfhb_4WighvmQp1/view?usp=sharing
> >
> > The boot log without the FWNODE_FLAG_BEST_EFFORT flag in APLIC can
> > be found at:
> > https://drive.google.com/file/d/12SRdR-2Frv_5O06kbuI_LUJ88khjf_7O/view?usp=sharing
> >
> > >
> > > Where is the dts file that corresponds to this QEMU run? This is the
> > > third time I'm asking for a pointer to a dts file that has this issue,
> > > can you point me to it please? I shouldn't have to say this but: put
> > > it somewhere and point me to it please. Please don't point me to some
> > > git repo and ask me to dig around.
> >
> > For QEMU virt machine, the DTB is generated at runtime as part of
> > virt machine creation. The DTS dumped by QEMU using the "dumpdtb"
> > command line option can be found at:
> > https://drive.google.com/file/d/1EU-exItL1B7EWuoXw4q-Ypocq--5Wvn8/view?usp=sharing
> >
> > >
> > > Can you give me details on what supplier is causing the deferred probe
> > > that's a problem for you? Any other details you can provide that'll
> > > help debug this issue?
> >
> > FWNODE supplier for APLIC DT node is the OF framework.
> >
> > >
> > > > I hope the above steps help you reproduce the issue. I will certainly
> > > > test whatever fix you propose.
> > >
> > > Do you plan to try the fix I suggested already? The one about using
> > > the correct macros?
> >
> > You mean use IRQCHIP_DECLARE() in the APLIC driver ?
> > or something else ?
>
> No. My previous email asking you to NOT use IRQCHIP_DECLARE() and
> instead use IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros.

I tried IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros but these
macros are not suitable for APLIC driver because we need platform device
pointer in the APLIC probe() to create platform MSI device domain (refer,
platform_msi_create_device_domain()).

Further, I tried setting the "suppress_bind_attrs" flag in "struct
platform_driver aplic_driver" just like the
IRQCHIP_PLATFORM_DRIVER_END() macro but this did not work.

Regards,
Anup
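
For readers following the quoted patch: the address decoding done by
aplic_msi_write_msg() can be modelled in user space. The sketch below is
only an illustration, not the driver code. The TARGET field positions and
the rule that the group index sits PPN_SHIFT bits above HHXS in the PPN
are assumptions based on my reading of the AIA specification, since the
macro definitions are not part of the quoted hunk. The names
struct msicfg_model and aplic_target_from_msi() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout (not shown in the quoted hunk). */
#define PPN_SHIFT               12      /* MSI target pages are 4 KiB */
#define TARGET_HART_IDX_SHIFT   18
#define TARGET_HART_IDX_MASK    0x3fffu
#define TARGET_GUEST_IDX_SHIFT  12
#define TARGET_GUEST_IDX_MASK   0x3fu
#define TARGET_EIID_MASK        0x7ffu

/* Hypothetical mirror of the lhxs/lhxw/hhxs/hhxw fields of aplic_msicfg. */
struct msicfg_model {
	unsigned int lhxs;	/* number of guest index bits */
	unsigned int lhxw;	/* number of hart index bits */
	unsigned int hhxs;	/* group index shift minus 2 * PPN_SHIFT */
	unsigned int hhxw;	/* number of group index bits */
};

/* Mask with the low n bits set; n is assumed to be smaller than 64. */
static uint64_t low_bits(unsigned int n)
{
	return (1ULL << n) - 1;
}

/* Build the TARGET register value for one MSI address and EIID. */
static uint32_t aplic_target_from_msi(const struct msicfg_model *mc,
				      uint64_t msg_addr, uint32_t eiid)
{
	uint64_t tppn = msg_addr >> PPN_SHIFT;
	uint32_t group, hart, guest;

	/* Group index: assumed to start at PPN bit (hhxs + PPN_SHIFT). */
	group = (uint32_t)((tppn >> (mc->hhxs + PPN_SHIFT)) & low_bits(mc->hhxw));
	/* Hart index within the group sits just above the guest bits. */
	hart = (uint32_t)((tppn >> mc->lhxs) & low_bits(mc->lhxw));
	/* Guest index occupies the lowest lhxs bits of the PPN. */
	guest = (uint32_t)(tppn & low_bits(mc->lhxs));

	/* The APLIC hart index is the group index stacked on the hart index. */
	hart |= group << mc->lhxw;

	return ((hart & TARGET_HART_IDX_MASK) << TARGET_HART_IDX_SHIFT) |
	       ((guest & TARGET_GUEST_IDX_MASK) << TARGET_GUEST_IDX_SHIFT) |
	       (eiid & TARGET_EIID_MASK);
}
```

With lhxs = 1, lhxw = 2, hhxs = 0 and hhxw = 1 (made-up values), the
address 0x29005000 decodes to group 1, hart 2, guest 1. The combined hart
index is then 6 and, for EIID 25, the TARGET value is 0x181019.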

* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-23 11:47             ` Anup Patel
@ 2023-06-23 12:49               ` Marc Zyngier
  2023-06-23 13:52                 ` Anup Patel
  0 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2023-06-23 12:49 UTC (permalink / raw)
  To: Anup Patel
  Cc: Saravana Kannan, Anup Patel, Palmer Dabbelt, Paul Walmsley,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Conor Dooley, linux-riscv, linux-kernel,
	devicetree, iommu, Android Kernel Team

[here, let me trim all of this nonsense...]

On Fri, 23 Jun 2023 12:47:00 +0100,
Anup Patel <apatel@ventanamicro.com> wrote:
> > No. My previous email asking you to NOT use IRQCHIP_DECLARE() and
> > instead use IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros.
> 
> I tried IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros but these
> macros are not suitable for APLIC driver because we need platform device
> pointer in the APLIC probe() to create platform MSI device domain (refer,
> platform_msi_create_device_domain()).

Oh come on. How hard have you tried? Have you even looked at the other
drivers in the tree to see how they solve this insurmountable problem
with a *single* line of code?

        pdev = of_find_device_by_node(node);

That's it.

> Further, I tried setting the "suppress_bind_attrs" flag in "struct
> platform_driver aplic_driver" just like the
> IRQCHIP_PLATFORM_DRIVER_END() macro but this did not work.

I'm not sure how relevant this is to the conversation.

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver
  2023-06-23 12:49               ` Marc Zyngier
@ 2023-06-23 13:52                 ` Anup Patel
  0 siblings, 0 replies; 28+ messages in thread
From: Anup Patel @ 2023-06-23 13:52 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Anup Patel, Saravana Kannan, Palmer Dabbelt, Paul Walmsley,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Robin Murphy,
	Joerg Roedel, Will Deacon, Frank Rowand, Atish Patra,
	Andrew Jones, Conor Dooley, linux-riscv, linux-kernel,
	devicetree, iommu, Android Kernel Team

On Fri, Jun 23, 2023 at 6:19 PM Marc Zyngier <maz@kernel.org> wrote:
>
> [here, let me trim all of this nonsense...]
>
> On Fri, 23 Jun 2023 12:47:00 +0100,
> Anup Patel <apatel@ventanamicro.com> wrote:
> > > No. My previous email asking you to NOT use IRQCHIP_DECLARE() and
> > > instead use IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros.
> >
> > I tried IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros but these
> > macros are not suitable for APLIC driver because we need platform device
> > pointer in the APLIC probe() to create platform MSI device domain (refer,
> > platform_msi_create_device_domain()).
>
> Oh come on. How hard have you tried? Have you even looked at the other
> drivers in the tree to see how they solve this insurmountable problem
> with a *single* line of code?
>
>         pdev = of_find_device_by_node(node);
>
> That's it.

Please see the diff below. I tried the same thing, but the APLIC still
does not get probed without the FWNODE_FLAG_BEST_EFFORT flag. Please
note that the current APLIC driver works unmodified for both DT and ACPI,
but using of_find_device_by_node() here breaks ACPI support.

diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
index 1e710fdf5608..9ae9e7fb905f 100644
--- a/drivers/irqchip/irq-riscv-aplic.c
+++ b/drivers/irqchip/irq-riscv-aplic.c
@@ -17,6 +17,7 @@
 #include <linux/irqdomain.h>
 #include <linux/module.h>
 #include <linux/msi.h>
+#include <linux/of_platform.h>
 #include <linux/platform_device.h>
 #include <linux/smp.h>

@@ -730,36 +731,12 @@ static int aplic_probe(struct platform_device *pdev)
     return 0;
 }

-static const struct of_device_id aplic_match[] = {
-    { .compatible = "riscv,aplic" },
-    {}
-};
-
-static struct platform_driver aplic_driver = {
-    .driver = {
-        .name        = "riscv-aplic",
-        .of_match_table    = aplic_match,
-    },
-    .probe = aplic_probe,
-};
-builtin_platform_driver(aplic_driver);
-
-static int __init aplic_dt_init(struct device_node *node,
+static int __init aplic_of_init(struct device_node *dn,
                 struct device_node *parent)
 {
-    /*
-     * The APLIC platform driver needs to be probed early
-     * so for device tree:
-     *
-     * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
-     *    provides a hint to the device driver core to probe the
-     *    platform driver early.
-     * 2) Clear the OF_POPULATED flag in device_node because
-     *    of_irq_init() sets it which prevents creation of
-     *    platform device.
-     */
-    node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
-    of_node_clear_flag(node, OF_POPULATED);
-    return 0;
+    return aplic_probe(of_find_device_by_node(dn));
 }
-IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
+
+IRQCHIP_PLATFORM_DRIVER_BEGIN(aplic)
+IRQCHIP_MATCH("riscv,aplic", aplic_of_init)
+IRQCHIP_PLATFORM_DRIVER_END(aplic)

>
> > Further, I tried setting the "suppress_bind_attrs" flag in "struct
> > platform_driver aplic_driver" just like the
> > IRQCHIP_PLATFORM_DRIVER_END() macro but this did not work.
>
> I'm not sure how relevant this is to the conversation.

It's relevant because the only difference in the platform_driver
registered by IRQCHIP_PLATFORM_DRIVER_END() and
"struct platform_driver aplic_driver" is the "suppress_bind_attrs" flag.

Unfortunately, setting the "suppress_bind_attrs" flag does not
help either.

>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Regards,
Anup
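
As a side note on the direct-mode path in the quoted patch: the claim
loop in aplic_idc_handle_irq() can be modelled in user space as below.
This is a rough sketch under assumptions, not the driver itself. The
shift of 16 matches how the handler uses APLIC_IDC_TOPI_ID_SHIFT, while
the 10-bit identity mask is my assumption from the AIA specification.
struct fake_idc, fake_read_claimi() and drain_claimi() are hypothetical
names that stand in for the MMIO read of the CLAIMI register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDC_TOPI_ID_SHIFT	16	/* interrupt identity field position */
#define IDC_TOPI_ID_MASK	0x3ffu	/* assumed width of the identity field */

/* Hypothetical IDC model: a zero-terminated queue of CLAIMI values. */
struct fake_idc {
	const uint32_t *claims;
	size_t pos;
};

/* Return the next pending CLAIMI value, or zero when nothing is pending. */
static uint32_t fake_read_claimi(struct fake_idc *idc)
{
	uint32_t val = idc->claims[idc->pos];

	if (val)
		idc->pos++;
	return val;
}

/*
 * Read CLAIMI until it returns zero and record the interrupt identity of
 * each claim. At most max identities are stored in out; the return value
 * is the total number of claims seen.
 */
static size_t drain_claimi(struct fake_idc *idc, uint32_t *out, size_t max)
{
	size_t count = 0;
	uint32_t val;

	while ((val = fake_read_claimi(idc)) != 0) {
		if (count < max)
			out[count] = (val >> IDC_TOPI_ID_SHIFT) & IDC_TOPI_ID_MASK;
		count++;
	}
	return count;
}
```

In the real handler each decoded identity is passed to
irq_find_mapping() and then generic_handle_irq(); the model only records
the identities so the decoding can be checked in isolation.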

end of thread, other threads:[~2023-06-23 13:52 UTC | newest]

Thread overview: 28+ messages
2023-06-13 15:34 [PATCH v4 00/10] Linux RISC-V AIA Support Anup Patel
2023-06-13 15:34 ` [PATCH v4 01/10] RISC-V: Add riscv_fw_parent_hartid() function Anup Patel
2023-06-13 15:34 ` [PATCH v4 02/10] irqchip/riscv-intc: Add support for RISC-V AIA Anup Patel
2023-06-13 15:34 ` [PATCH v4 03/10] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller Anup Patel
2023-06-13 15:34 ` [PATCH v4 04/10] irqchip: Add RISC-V incoming MSI controller driver Anup Patel
2023-06-13 15:34 ` [PATCH v4 05/10] irqchip/riscv-imsic: Add support for PCI MSI irqdomain Anup Patel
2023-06-13 15:34 ` [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support Anup Patel
2023-06-14 14:46   ` Jason Gunthorpe
2023-06-14 16:17     ` Anup Patel
2023-06-14 16:50       ` Jason Gunthorpe
2023-06-15  5:46         ` Anup Patel
2023-06-13 15:34 ` [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC Anup Patel
2023-06-14 19:27   ` Conor Dooley
2023-06-15  5:47     ` Anup Patel
2023-06-13 15:34 ` [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver Anup Patel
2023-06-15 19:17   ` Saravana Kannan
2023-06-15 19:31     ` Conor Dooley
2023-06-15 20:45       ` Saravana Kannan
2023-06-15 21:11         ` Conor Dooley
2023-06-16  2:01     ` Anup Patel
2023-06-16 22:05       ` Saravana Kannan
2023-06-19  6:13         ` Anup Patel
2023-06-22 20:56           ` Saravana Kannan
2023-06-23 11:47             ` Anup Patel
2023-06-23 12:49               ` Marc Zyngier
2023-06-23 13:52                 ` Anup Patel
2023-06-13 15:34 ` [PATCH v4 09/10] RISC-V: Select APLIC and IMSIC drivers Anup Patel
2023-06-13 15:34 ` [PATCH v4 10/10] MAINTAINERS: Add entry for RISC-V AIA drivers Anup Patel
