* [PATCH v2 0/9] Linux RISC-V AIA Support
@ 2023-01-03 14:14 ` Anup Patel
  0 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

The RISC-V AIA specification is now frozen as per the RISC-V International
process. The latest frozen specification can be found at:
https://github.com/riscv/riscv-aia/releases/download/1.0-RC1/riscv-interrupts-1.0-RC1.pdf

At a high level, the AIA specification adds three things:
1) AIA CSRs
   - Improved local interrupt support
2) Incoming Message Signaled Interrupt Controller (IMSIC)
   - Per-HART MSI controller
   - Supports MSI virtualization
   - Supports IPIs and their virtualization
3) Advanced Platform-Level Interrupt Controller (APLIC)
   - Wired interrupt controller
   - In MSI-mode, converts wired interrupts into MSIs (i.e. acts as an MSI generator)
   - In Direct-mode, injects external interrupts directly into HARTs
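
To make the IMSIC and AIA-CSR points above concrete, here is a minimal
editorial sketch (not code from this series) of the two core operations
as they show up later in the series; handle_local_irq() is purely an
illustrative placeholder:

/*
 * Sender side (see PATCH5): an MSI or IPI is raised by writing the
 * interrupt identity to the target HART's IMSIC interrupt-file MMIO
 * page, where 'msi_va' is assumed to be the ioremap()'ed file.
 */
static void aia_send_msi(void __iomem *msi_va, u32 id)
{
	writel(id, msi_va);
}

/*
 * Receiver side (see PATCH1 and PATCH3): pending local interrupts are
 * claimed by reading the *topi CSR until it reads zero.
 */
static void aia_claim_local_irqs(void)
{
	unsigned long topi;

	while ((topi = csr_read(CSR_TOPI)))
		handle_local_irq(topi >> TOPI_IID_SHIFT);
}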

For an overview of the AIA specification, refer to the recent AIA virtualization
talk at KVM Forum 2022:
https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
https://www.youtube.com/watch?v=r071dL8Z0yo

This series adds the required Linux irqchip drivers for AIA. It depends on
the recent "RISC-V IPI Improvements" series.
(Refer to https://lore.kernel.org/lkml/20221101143400.690000-1-apatel@ventanamicro.com/t/)

To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).

These patches can also be found in the riscv_aia_v2 branch at:
https://github.com/avpatel/linux.git

Changes since v1:
 - Rebased on Linux-6.2-rc2
 - Addressed comments on IMSIC DT bindings for PATCH4
 - Used raw_spin_lock_irqsave() on ids_lock in PATCH5
 - Improved MMIO alignment checks in PATCH5 to allow MMIO regions
   with holes.
 - Addressed comments on APLIC DT bindings for PATCH6
 - Fixed a warning splat in aplic_msi_write_msg() caused by a
   zeroed MSI message in PATCH7
 - Dropped the DT property riscv,slow-ipi; a module parameter will
   be added in the future instead.

Anup Patel (9):
  RISC-V: Add AIA related CSR defines
  RISC-V: Detect AIA CSRs from ISA string
  irqchip/riscv-intc: Add support for RISC-V AIA
  dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  irqchip: Add RISC-V incoming MSI controller driver
  dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  irqchip: Add RISC-V advanced PLIC driver
  RISC-V: Select APLIC and IMSIC drivers
  MAINTAINERS: Add entry for RISC-V AIA drivers

 .../interrupt-controller/riscv,aplic.yaml     |  159 +++
 .../interrupt-controller/riscv,imsics.yaml    |  168 +++
 MAINTAINERS                                   |   12 +
 arch/riscv/Kconfig                            |    2 +
 arch/riscv/include/asm/csr.h                  |   92 ++
 arch/riscv/include/asm/hwcap.h                |    8 +
 arch/riscv/kernel/cpu.c                       |    2 +
 arch/riscv/kernel/cpufeature.c                |    2 +
 drivers/irqchip/Kconfig                       |   20 +-
 drivers/irqchip/Makefile                      |    2 +
 drivers/irqchip/irq-riscv-aplic.c             |  670 ++++++++++
 drivers/irqchip/irq-riscv-imsic.c             | 1174 +++++++++++++++++
 drivers/irqchip/irq-riscv-intc.c              |   37 +-
 include/linux/irqchip/riscv-aplic.h           |  117 ++
 include/linux/irqchip/riscv-imsic.h           |   92 ++
 15 files changed, 2550 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
 create mode 100644 drivers/irqchip/irq-riscv-aplic.c
 create mode 100644 drivers/irqchip/irq-riscv-imsic.c
 create mode 100644 include/linux/irqchip/riscv-aplic.h
 create mode 100644 include/linux/irqchip/riscv-imsic.h

-- 
2.34.1


* [PATCH v2 1/9] RISC-V: Add AIA related CSR defines
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

The RISC-V AIA specification improves the handling of per-HART local
interrupts in a backward compatible manner. This patch adds defines for
the new RISC-V AIA CSRs.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 arch/riscv/include/asm/csr.h | 92 ++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 0e571f6483d9..4e1356bad7b2 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -73,7 +73,10 @@
 #define IRQ_S_EXT		9
 #define IRQ_VS_EXT		10
 #define IRQ_M_EXT		11
+#define IRQ_S_GEXT		12
 #define IRQ_PMU_OVF		13
+#define IRQ_LOCAL_MAX		(IRQ_PMU_OVF + 1)
+#define IRQ_LOCAL_MASK		((_AC(1, UL) << IRQ_LOCAL_MAX) - 1)
 
 /* Exception causes */
 #define EXC_INST_MISALIGNED	0
@@ -156,6 +159,26 @@
 				 (_AC(1, UL) << IRQ_S_TIMER) | \
 				 (_AC(1, UL) << IRQ_S_EXT))
 
+/* AIA CSR bits */
+#define TOPI_IID_SHIFT		16
+#define TOPI_IID_MASK		0xfff
+#define TOPI_IPRIO_MASK		0xff
+#define TOPI_IPRIO_BITS		8
+
+#define TOPEI_ID_SHIFT		16
+#define TOPEI_ID_MASK		0x7ff
+#define TOPEI_PRIO_MASK		0x7ff
+
+#define ISELECT_IPRIO0		0x30
+#define ISELECT_IPRIO15		0x3f
+#define ISELECT_MASK		0x1ff
+
+#define HVICTL_VTI		0x40000000
+#define HVICTL_IID		0x0fff0000
+#define HVICTL_IID_SHIFT	16
+#define HVICTL_IPRIOM		0x00000100
+#define HVICTL_IPRIO		0x000000ff
+
 /* xENVCFG flags */
 #define ENVCFG_STCE			(_AC(1, ULL) << 63)
 #define ENVCFG_PBMTE			(_AC(1, ULL) << 62)
@@ -250,6 +273,18 @@
 #define CSR_STIMECMP		0x14D
 #define CSR_STIMECMPH		0x15D
 
+/* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */
+#define CSR_SISELECT		0x150
+#define CSR_SIREG		0x151
+
+/* Supervisor-Level Interrupts (AIA) */
+#define CSR_STOPEI		0x15c
+#define CSR_STOPI		0xdb0
+
+/* Supervisor-Level High-Half CSRs (AIA) */
+#define CSR_SIEH		0x114
+#define CSR_SIPH		0x154
+
 #define CSR_VSSTATUS		0x200
 #define CSR_VSIE		0x204
 #define CSR_VSTVEC		0x205
@@ -279,8 +314,32 @@
 #define CSR_HGATP		0x680
 #define CSR_HGEIP		0xe12
 
+/* Virtual Interrupts and Interrupt Priorities (H-extension with AIA) */
+#define CSR_HVIEN		0x608
+#define CSR_HVICTL		0x609
+#define CSR_HVIPRIO1		0x646
+#define CSR_HVIPRIO2		0x647
+
+/* VS-Level Window to Indirectly Accessed Registers (H-extension with AIA) */
+#define CSR_VSISELECT		0x250
+#define CSR_VSIREG		0x251
+
+/* VS-Level Interrupts (H-extension with AIA) */
+#define CSR_VSTOPEI		0x25c
+#define CSR_VSTOPI		0xeb0
+
+/* Hypervisor and VS-Level High-Half CSRs (H-extension with AIA) */
+#define CSR_HIDELEGH		0x613
+#define CSR_HVIENH		0x618
+#define CSR_HVIPH		0x655
+#define CSR_HVIPRIO1H		0x656
+#define CSR_HVIPRIO2H		0x657
+#define CSR_VSIEH		0x214
+#define CSR_VSIPH		0x254
+
 #define CSR_MSTATUS		0x300
 #define CSR_MISA		0x301
+#define CSR_MIDELEG		0x303
 #define CSR_MIE			0x304
 #define CSR_MTVEC		0x305
 #define CSR_MENVCFG		0x30a
@@ -297,6 +356,25 @@
 #define CSR_MIMPID		0xf13
 #define CSR_MHARTID		0xf14
 
+/* Machine-Level Window to Indirectly Accessed Registers (AIA) */
+#define CSR_MISELECT		0x350
+#define CSR_MIREG		0x351
+
+/* Machine-Level Interrupts (AIA) */
+#define CSR_MTOPEI		0x35c
+#define CSR_MTOPI		0xfb0
+
+/* Virtual Interrupts for Supervisor Level (AIA) */
+#define CSR_MVIEN		0x308
+#define CSR_MVIP		0x309
+
+/* Machine-Level High-Half CSRs (AIA) */
+#define CSR_MIDELEGH		0x313
+#define CSR_MIEH		0x314
+#define CSR_MVIENH		0x318
+#define CSR_MVIPH		0x319
+#define CSR_MIPH		0x354
+
 #ifdef CONFIG_RISCV_M_MODE
 # define CSR_STATUS	CSR_MSTATUS
 # define CSR_IE		CSR_MIE
@@ -307,6 +385,13 @@
 # define CSR_TVAL	CSR_MTVAL
 # define CSR_IP		CSR_MIP
 
+# define CSR_IEH		CSR_MIEH
+# define CSR_ISELECT	CSR_MISELECT
+# define CSR_IREG	CSR_MIREG
+# define CSR_IPH		CSR_MIPH
+# define CSR_TOPEI	CSR_MTOPEI
+# define CSR_TOPI	CSR_MTOPI
+
 # define SR_IE		SR_MIE
 # define SR_PIE		SR_MPIE
 # define SR_PP		SR_MPP
@@ -324,6 +409,13 @@
 # define CSR_TVAL	CSR_STVAL
 # define CSR_IP		CSR_SIP
 
+# define CSR_IEH		CSR_SIEH
+# define CSR_ISELECT	CSR_SISELECT
+# define CSR_IREG	CSR_SIREG
+# define CSR_IPH		CSR_SIPH
+# define CSR_TOPEI	CSR_STOPEI
+# define CSR_TOPI	CSR_STOPI
+
 # define SR_IE		SR_SIE
 # define SR_PIE		SR_SPIE
 # define SR_PP		SR_SPP
-- 
2.34.1
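
As an illustrative aside (not part of the patch), the *ISELECT/*IREG
defines above form a window to indirectly accessed AIA registers such as
the IMSIC EIE/EIP arrays. A hedged sketch of the access pattern, which
PATCH5 wraps in its imsic_csr_write()/imsic_csr_set() helpers, assuming
the caller has local interrupts disabled so the select/access pair is
not split:

	csr_write(CSR_ISELECT, IMSIC_EIE0);	/* select the indirect register */
	csr_set(CSR_IREG, BIT(id));		/* enable identity 'id' (id < XLEN assumed) */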


* [PATCH v2 2/9] RISC-V: Detect AIA CSRs from ISA string
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

We have two extension names for AIA ISA support: Smaia (M-mode AIA CSRs)
and Ssaia (S-mode AIA CSRs).

We extend the ISA string parsing to detect the Smaia and Ssaia extensions.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 arch/riscv/include/asm/hwcap.h | 8 ++++++++
 arch/riscv/kernel/cpu.c        | 2 ++
 arch/riscv/kernel/cpufeature.c | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 86328e3acb02..c649e85ed7bb 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -59,10 +59,18 @@ enum riscv_isa_ext_id {
 	RISCV_ISA_EXT_ZIHINTPAUSE,
 	RISCV_ISA_EXT_SSTC,
 	RISCV_ISA_EXT_SVINVAL,
+	RISCV_ISA_EXT_SSAIA,
+	RISCV_ISA_EXT_SMAIA,
 	RISCV_ISA_EXT_ID_MAX
 };
 static_assert(RISCV_ISA_EXT_ID_MAX <= RISCV_ISA_EXT_MAX);
 
+#ifdef CONFIG_RISCV_M_MODE
+#define RISCV_ISA_EXT_SxAIA		RISCV_ISA_EXT_SMAIA
+#else
+#define RISCV_ISA_EXT_SxAIA		RISCV_ISA_EXT_SSAIA
+#endif
+
 /*
  * This enum represents the logical ID for each RISC-V ISA extension static
  * keys. We can use static key to optimize code path if some ISA extensions
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 1b9a5a66e55a..a215ec929160 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -162,6 +162,8 @@ arch_initcall(riscv_cpuinfo_init);
  *    extensions by an underscore.
  */
 static struct riscv_isa_ext_data isa_ext_arr[] = {
+	__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+	__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
 	__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
 	__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
 	__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 93e45560af30..3c5b51f519d5 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -228,6 +228,8 @@ void __init riscv_fill_hwcap(void)
 				SET_ISA_EXT_MAP("zihintpause", RISCV_ISA_EXT_ZIHINTPAUSE);
 				SET_ISA_EXT_MAP("sstc", RISCV_ISA_EXT_SSTC);
 				SET_ISA_EXT_MAP("svinval", RISCV_ISA_EXT_SVINVAL);
+				SET_ISA_EXT_MAP("smaia", RISCV_ISA_EXT_SMAIA);
+				SET_ISA_EXT_MAP("ssaia", RISCV_ISA_EXT_SSAIA);
 			}
 #undef SET_ISA_EXT_MAP
 		}
-- 
2.34.1
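
As an illustrative aside (not part of the patch), later patches consume
this detection through the existing riscv_isa_extension_available()
helper and the RISCV_ISA_EXT_SxAIA alias added above, for example:

	/*
	 * Hedged sketch: SxAIA resolves to Smaia on M-mode kernels and to
	 * Ssaia otherwise, so a single check covers both (see PATCH3).
	 */
	if (riscv_isa_extension_available(NULL, SxAIA))
		pr_info("AIA CSRs detected\n");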


* [PATCH v2 3/9] irqchip/riscv-intc: Add support for RISC-V AIA
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

The RISC-V advanced interrupt architecture (AIA) extends the per-HART
local interrupts in the following ways:
1. A minimum of 64 local interrupts for both RV32 and RV64
2. Ability to process multiple pending local interrupts in the same
   interrupt handler invocation
3. Priority configuration for each local interrupt
4. Special CSRs to configure/access the per-HART MSI controller

This patch adds support for RISC-V AIA in the RISC-V intc driver.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 drivers/irqchip/irq-riscv-intc.c | 37 ++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index f229e3e66387..880d1639aadc 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -16,6 +16,7 @@
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/smp.h>
+#include <asm/hwcap.h>
 
 static struct irq_domain *intc_domain;
 
@@ -29,6 +30,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
 	generic_handle_domain_irq(intc_domain, cause);
 }
 
+static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
+{
+	unsigned long topi;
+
+	while ((topi = csr_read(CSR_TOPI)))
+		generic_handle_domain_irq(intc_domain,
+					  topi >> TOPI_IID_SHIFT);
+}
+
 /*
  * On RISC-V systems local interrupts are masked or unmasked by writing
  * the SIE (Supervisor Interrupt Enable) CSR.  As CSRs can only be written
@@ -38,12 +48,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
 
 static void riscv_intc_irq_mask(struct irq_data *d)
 {
-	csr_clear(CSR_IE, BIT(d->hwirq));
+	if (d->hwirq < BITS_PER_LONG)
+		csr_clear(CSR_IE, BIT(d->hwirq));
+	else
+		csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
 }
 
 static void riscv_intc_irq_unmask(struct irq_data *d)
 {
-	csr_set(CSR_IE, BIT(d->hwirq));
+	if (d->hwirq < BITS_PER_LONG)
+		csr_set(CSR_IE, BIT(d->hwirq));
+	else
+		csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
 }
 
 static void riscv_intc_irq_eoi(struct irq_data *d)
@@ -115,7 +131,7 @@ static struct fwnode_handle *riscv_intc_hwnode(void)
 static int __init riscv_intc_init(struct device_node *node,
 				  struct device_node *parent)
 {
-	int rc;
+	int rc, nr_irqs;
 	unsigned long hartid;
 
 	rc = riscv_of_parent_hartid(node, &hartid);
@@ -133,14 +149,21 @@ static int __init riscv_intc_init(struct device_node *node,
 	if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
 		return 0;
 
-	intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
+	nr_irqs = BITS_PER_LONG;
+	if (riscv_isa_extension_available(NULL, SxAIA) && BITS_PER_LONG == 32)
+		nr_irqs = nr_irqs * 2;
+
+	intc_domain = irq_domain_add_linear(node, nr_irqs,
 					    &riscv_intc_domain_ops, NULL);
 	if (!intc_domain) {
 		pr_err("unable to add IRQ domain\n");
 		return -ENXIO;
 	}
 
-	rc = set_handle_irq(&riscv_intc_irq);
+	if (riscv_isa_extension_available(NULL, SxAIA))
+		rc = set_handle_irq(&riscv_intc_aia_irq);
+	else
+		rc = set_handle_irq(&riscv_intc_irq);
 	if (rc) {
 		pr_err("failed to set irq handler\n");
 		return rc;
@@ -148,7 +171,9 @@ static int __init riscv_intc_init(struct device_node *node,
 
 	riscv_set_intc_hwnode_fn(riscv_intc_hwnode);
 
-	pr_info("%d local interrupts mapped\n", BITS_PER_LONG);
+	pr_info("%d local interrupts mapped%s\n",
+		nr_irqs, (riscv_isa_extension_available(NULL, SxAIA)) ?
+			 " using AIA" : "");
 
 	return 0;
 }
-- 
2.34.1
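
As an illustrative aside (not part of the patch), the new else-branches
in the mask/unmask paths only run on RV32, where AIA places local
interrupts 32..63 in the high-half sieh/siph CSRs. For example, masking
hwirq 44 on RV32 boils down to:

	/* Hedged sketch for RV32 (BITS_PER_LONG == 32). */
	csr_clear(CSR_IEH, BIT(44 - BITS_PER_LONG));	/* clears bit 12 of sieh */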


* [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

We add a DT bindings document for the RISC-V incoming MSI controller
(IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
specification.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 .../interrupt-controller/riscv,imsics.yaml    | 168 ++++++++++++++++++
 1 file changed, 168 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
new file mode 100644
index 000000000000..b9db03b6e95f
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
@@ -0,0 +1,168 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Incoming MSI Controller (IMSIC)
+
+maintainers:
+  - Anup Patel <anup@brainfault.org>
+
+description: |
+  The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
+  MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
+  AIA specification can be found at https://github.com/riscv/riscv-aia.
+
+  The IMSIC is a per-CPU (or per-HART) device with separate interrupt file
+  for each privilege level (machine or supervisor). The configuration of
+  a IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO
+  space to receive MSIs from devices. Each IMSIC interrupt file supports a
+  fixed number of interrupt identities (to distinguish MSIs from devices)
+  which is same for given privilege level across CPUs (or HARTs).
+
+  The device tree of a RISC-V platform will have one IMSIC device tree node
+  for each privilege level (machine or supervisor) which collectively describe
+  IMSIC interrupt files at that privilege level across CPUs (or HARTs).
+
+  The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
+  follows a particular scheme defined by the RISC-V AIA specification. A IMSIC
+  group is a set of IMSIC interrupt files co-located in MMIO space and we can
+  have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
+  RISC-V platform. The MSI target address of a IMSIC interrupt file at given
+  privilege level (machine or supervisor) encodes group index, HART index,
+  and guest index (shown below).
+
+  XLEN-1           >=24                                 12    0
+  |                  |                                  |     |
+  -------------------------------------------------------------
+  |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
+  -------------------------------------------------------------
+
+allOf:
+  - $ref: /schemas/interrupt-controller.yaml#
+  - $ref: /schemas/interrupt-controller/msi-controller.yaml#
+
+properties:
+  compatible:
+    items:
+      - enum:
+          - riscv,qemu-imsics
+      - const: riscv,imsics
+
+  reg:
+    minItems: 1
+    maxItems: 16384
+    description:
+      Base address of each IMSIC group.
+
+  interrupt-controller: true
+
+  "#interrupt-cells":
+    const: 0
+
+  msi-controller: true
+
+  interrupts-extended:
+    minItems: 1
+    maxItems: 16384
+    description:
+      This property represents the set of CPUs (or HARTs) for which given
+      device tree node describes the IMSIC interrupt files. Each node pointed
+      to should be a riscv,cpu-intc node, which has a riscv node (i.e. RISC-V
+      HART) as parent.
+
+  riscv,num-ids:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 63
+    maximum: 2047
+    description:
+      Number of interrupt identities supported by IMSIC interrupt file.
+
+  riscv,num-guest-ids:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 63
+    maximum: 2047
+    description:
+      Number of interrupt identities are supported by IMSIC guest interrupt
+      file. When not specified it is assumed to be same as specified by the
+      riscv,num-ids property.
+
+  riscv,guest-index-bits:
+    minimum: 0
+    maximum: 7
+    default: 0
+    description:
+      Number of guest index bits in the MSI target address. When not
+      specified it is assumed to be 0.
+
+  riscv,hart-index-bits:
+    minimum: 0
+    maximum: 15
+    description:
+      Number of HART index bits in the MSI target address. When not
+      specified it is estimated based on the interrupts-extended property.
+
+  riscv,group-index-bits:
+    minimum: 0
+    maximum: 7
+    default: 0
+    description:
+      Number of group index bits in the MSI target address. When not
+      specified it is assumed to be 0.
+
+  riscv,group-index-shift:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 55
+    default: 24
+    description:
+      The least significant bit position of the group index bits in the
+      MSI target address. When not specified it is assumed to be 24.
+
+required:
+  - compatible
+  - reg
+  - interrupt-controller
+  - msi-controller
+  - interrupts-extended
+  - riscv,num-ids
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    // Example 1 (Machine-level IMSIC files with just one group):
+
+    imsic_mlevel: interrupt-controller@24000000 {
+      compatible = "riscv,qemu-imsics", "riscv,imsics";
+      interrupts-extended = <&cpu1_intc 11>,
+                            <&cpu2_intc 11>,
+                            <&cpu3_intc 11>,
+                            <&cpu4_intc 11>;
+      reg = <0x28000000 0x4000>;
+      interrupt-controller;
+      #interrupt-cells = <0>;
+      msi-controller;
+      riscv,num-ids = <127>;
+    };
+
+  - |
+    // Example 2 (Supervisor-level IMSIC files with two groups):
+
+    imsic_slevel: interrupt-controller@28000000 {
+      compatible = "riscv,qemu-imsics", "riscv,imsics";
+      interrupts-extended = <&cpu1_intc 9>,
+                            <&cpu2_intc 9>,
+                            <&cpu3_intc 9>,
+                            <&cpu4_intc 9>;
+      reg = <0x28000000 0x2000>, /* Group0 IMSICs */
+            <0x29000000 0x2000>; /* Group1 IMSICs */
+      interrupt-controller;
+      #interrupt-cells = <0>;
+      msi-controller;
+      riscv,num-ids = <127>;
+      riscv,group-index-bits = <1>;
+      riscv,group-index-shift = <24>;
+    };
+...
-- 
2.34.1
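
As an illustrative aside (not part of the patch), the MSI target address
layout in the description above can be sanity-checked against Example 2:
with riscv,group-index-bits = <1> and riscv,group-index-shift = <24>,
consecutive IMSIC groups are spaced 1 << 24 bytes apart, which matches
the two reg entries (0x29000000 - 0x28000000 = 0x01000000 = 1 << 24).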


* [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

The RISC-V advanced interrupt architecture (AIA) specification defines
a new MSI controller for managing MSIs on a RISC-V platform. This new
MSI controller is referred to as the incoming message signaled interrupt
controller (IMSIC) and manages MSIs on a per-HART (or per-CPU) basis.
(For more details, refer to https://github.com/riscv/riscv-aia)

This patch adds an irqchip driver for the RISC-V IMSIC found on RISC-V
platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 drivers/irqchip/Kconfig             |   14 +-
 drivers/irqchip/Makefile            |    1 +
 drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
 include/linux/irqchip/riscv-imsic.h |   92 +++
 4 files changed, 1280 insertions(+), 1 deletion(-)
 create mode 100644 drivers/irqchip/irq-riscv-imsic.c
 create mode 100644 include/linux/irqchip/riscv-imsic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 9e65345ca3f6..a1315189a595 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -29,7 +29,6 @@ config ARM_GIC_V2M
 
 config GIC_NON_BANKED
 	bool
-
 config ARM_GIC_V3
 	bool
 	select IRQ_DOMAIN_HIERARCHY
@@ -548,6 +547,19 @@ config SIFIVE_PLIC
 	select IRQ_DOMAIN_HIERARCHY
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
 
+config RISCV_IMSIC
+	bool
+	depends on RISCV
+	select IRQ_DOMAIN_HIERARCHY
+	select GENERIC_MSI_IRQ_DOMAIN
+
+config RISCV_IMSIC_PCI
+	bool
+	depends on RISCV_IMSIC
+	depends on PCI
+	depends on PCI_MSI
+	default RISCV_IMSIC
+
 config EXYNOS_IRQ_COMBINER
 	bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
 	depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 87b49a10962c..22c723cc6ec8 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)			+= irq-qcom-mpm.o
 obj-$(CONFIG_CSKY_MPINTC)		+= irq-csky-mpintc.o
 obj-$(CONFIG_CSKY_APB_INTC)		+= irq-csky-apb-intc.o
 obj-$(CONFIG_RISCV_INTC)		+= irq-riscv-intc.o
+obj-$(CONFIG_RISCV_IMSIC)		+= irq-riscv-imsic.o
 obj-$(CONFIG_SIFIVE_PLIC)		+= irq-sifive-plic.o
 obj-$(CONFIG_IMX_IRQSTEER)		+= irq-imx-irqsteer.o
 obj-$(CONFIG_IMX_INTMUX)		+= irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
new file mode 100644
index 000000000000..4c16b66738d6
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -0,0 +1,1174 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <asm/hwcap.h>
+
+#define IMSIC_DISABLE_EIDELIVERY	0
+#define IMSIC_ENABLE_EIDELIVERY		1
+#define IMSIC_DISABLE_EITHRESHOLD	1
+#define IMSIC_ENABLE_EITHRESHOLD	0
+
+#define imsic_csr_write(__c, __v)	\
+do {					\
+	csr_write(CSR_ISELECT, __c);	\
+	csr_write(CSR_IREG, __v);	\
+} while (0)
+
+#define imsic_csr_read(__c)		\
+({					\
+	unsigned long __v;		\
+	csr_write(CSR_ISELECT, __c);	\
+	__v = csr_read(CSR_IREG);	\
+	__v;				\
+})
+
+#define imsic_csr_set(__c, __v)		\
+do {					\
+	csr_write(CSR_ISELECT, __c);	\
+	csr_set(CSR_IREG, __v);		\
+} while (0)
+
+#define imsic_csr_clear(__c, __v)	\
+do {					\
+	csr_write(CSR_ISELECT, __c);	\
+	csr_clear(CSR_IREG, __v);	\
+} while (0)
+
+struct imsic_mmio {
+	phys_addr_t pa;
+	void __iomem *va;
+	unsigned long size;
+};
+
+struct imsic_priv {
+	/* Global configuration common for all HARTs */
+	struct imsic_global_config global;
+
+	/* MMIO regions */
+	u32 num_mmios;
+	struct imsic_mmio *mmios;
+
+	/* Global state of interrupt identities */
+	raw_spinlock_t ids_lock;
+	unsigned long *ids_used_bimap;
+	unsigned long *ids_enabled_bimap;
+	unsigned int *ids_target_cpu;
+
+	/* Mask for connected CPUs */
+	struct cpumask lmask;
+
+	/* IPI interrupt identity */
+	u32 ipi_id;
+	u32 ipi_lsync_id;
+
+	/* IRQ domains */
+	struct irq_domain *base_domain;
+	struct irq_domain *pci_domain;
+	struct irq_domain *plat_domain;
+};
+
+struct imsic_handler {
+	/* Local configuration for given HART */
+	struct imsic_local_config local;
+
+	/* Pointer to private context */
+	struct imsic_priv *priv;
+};
+
+static bool imsic_init_done;
+
+static int imsic_parent_irq;
+static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
+
+const struct imsic_global_config *imsic_get_global_config(void)
+{
+	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
+
+	if (!handler || !handler->priv)
+		return NULL;
+
+	return &handler->priv->global;
+}
+EXPORT_SYMBOL_GPL(imsic_get_global_config);
+
+const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
+{
+	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
+
+	if (!handler || !handler->priv)
+		return NULL;
+
+	return &handler->local;
+}
+EXPORT_SYMBOL_GPL(imsic_get_local_config);
+
+static int imsic_cpu_page_phys(unsigned int cpu,
+			       unsigned int guest_index,
+			       phys_addr_t *out_msi_pa)
+{
+	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
+	struct imsic_global_config *global;
+	struct imsic_local_config *local;
+
+	if (!handler || !handler->priv)
+		return -ENODEV;
+	local = &handler->local;
+	global = &handler->priv->global;
+
+	if (BIT(global->guest_index_bits) <= guest_index)
+		return -EINVAL;
+
+	if (out_msi_pa)
+		*out_msi_pa = local->msi_pa +
+			      (guest_index * IMSIC_MMIO_PAGE_SZ);
+
+	return 0;
+}
+
+static int imsic_get_cpu(struct imsic_priv *priv,
+			 const struct cpumask *mask_val, bool force,
+			 unsigned int *out_target_cpu)
+{
+	struct cpumask amask;
+	unsigned int cpu;
+
+	cpumask_and(&amask, &priv->lmask, mask_val);
+
+	if (force)
+		cpu = cpumask_first(&amask);
+	else
+		cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	if (out_target_cpu)
+		*out_target_cpu = cpu;
+
+	return 0;
+}
+
+static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
+				 struct msi_msg *msg)
+{
+	phys_addr_t msi_addr;
+	int err;
+
+	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+	if (err)
+		return err;
+
+	msg->address_hi = upper_32_bits(msi_addr);
+	msg->address_lo = lower_32_bits(msi_addr);
+	msg->data = id;
+
+	return err;
+}
+
+static void imsic_id_set_target(struct imsic_priv *priv,
+				 unsigned int id, unsigned int target_cpu)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&priv->ids_lock, flags);
+	priv->ids_target_cpu[id] = target_cpu;
+	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+}
+
+static unsigned int imsic_id_get_target(struct imsic_priv *priv,
+					unsigned int id)
+{
+	unsigned int ret;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&priv->ids_lock, flags);
+	ret = priv->ids_target_cpu[id];
+	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+	return ret;
+}
+
+static void __imsic_eix_update(unsigned long base_id,
+			       unsigned long num_id, bool pend, bool val)
+{
+	unsigned long i, isel, ireg, flags;
+	unsigned long id = base_id, last_id = base_id + num_id;
+
+	while (id < last_id) {
+		isel = id / BITS_PER_LONG;
+		isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+		isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
+
+		ireg = 0;
+		for (i = id & (__riscv_xlen - 1);
+		     (id < last_id) && (i < __riscv_xlen); i++) {
+			ireg |= BIT(i);
+			id++;
+		}
+
+		/*
+		 * The IMSIC EIEx and EIPx registers are indirectly
+		 * accessed via using ISELECT and IREG CSRs so we
+		 * save/restore local IRQ to ensure that we don't
+		 * get preempted while accessing IMSIC registers.
+		 */
+		local_irq_save(flags);
+		if (val)
+			imsic_csr_set(isel, ireg);
+		else
+			imsic_csr_clear(isel, ireg);
+		local_irq_restore(flags);
+	}
+}
+
+#define __imsic_id_enable(__id)		\
+	__imsic_eix_update((__id), 1, false, true)
+#define __imsic_id_disable(__id)	\
+	__imsic_eix_update((__id), 1, false, false)
+
+#ifdef CONFIG_SMP
+static void __imsic_id_smp_sync(struct imsic_priv *priv)
+{
+	struct imsic_handler *handler;
+	struct cpumask amask;
+	int cpu;
+
+	cpumask_and(&amask, &priv->lmask, cpu_online_mask);
+	for_each_cpu(cpu, &amask) {
+		if (cpu == smp_processor_id())
+			continue;
+
+		handler = per_cpu_ptr(&imsic_handlers, cpu);
+		if (!handler || !handler->priv || !handler->local.msi_va) {
+			pr_warn("CPU%d: handler not initialized\n", cpu);
+			continue;
+		}
+
+		writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
+	}
+}
+#else
+#define __imsic_id_smp_sync(__priv)
+#endif
+
+static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&priv->ids_lock, flags);
+	bitmap_set(priv->ids_enabled_bimap, id, 1);
+	__imsic_id_enable(id);
+	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+	__imsic_id_smp_sync(priv);
+}
+
+static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&priv->ids_lock, flags);
+	bitmap_clear(priv->ids_enabled_bimap, id, 1);
+	__imsic_id_disable(id);
+	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+	__imsic_id_smp_sync(priv);
+}
+
+static void imsic_ids_local_sync(struct imsic_priv *priv)
+{
+	int i;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&priv->ids_lock, flags);
+	for (i = 1; i <= priv->global.nr_ids; i++) {
+		if (priv->ipi_id == i || priv->ipi_lsync_id == i)
+			continue;
+
+		if (test_bit(i, priv->ids_enabled_bimap))
+			__imsic_id_enable(i);
+		else
+			__imsic_id_disable(i);
+	}
+	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+}
+
+static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
+{
+	if (enable) {
+		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
+		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
+	} else {
+		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
+		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
+	}
+}
+
+static int imsic_ids_alloc(struct imsic_priv *priv,
+			   unsigned int max_id, unsigned int order)
+{
+	int ret;
+	unsigned long flags;
+
+	if ((priv->global.nr_ids < max_id) ||
+	    (max_id < BIT(order)))
+		return -EINVAL;
+
+	raw_spin_lock_irqsave(&priv->ids_lock, flags);
+	ret = bitmap_find_free_region(priv->ids_used_bimap,
+				      max_id + 1, order);
+	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+	return ret;
+}
+
+static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
+			   unsigned int order)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&priv->ids_lock, flags);
+	bitmap_release_region(priv->ids_used_bimap, base_id, order);
+	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+}
+
+static int __init imsic_ids_init(struct imsic_priv *priv)
+{
+	int i;
+	struct imsic_global_config *global = &priv->global;
+
+	raw_spin_lock_init(&priv->ids_lock);
+
+	/* Allocate used bitmap */
+	priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
+					sizeof(unsigned long), GFP_KERNEL);
+	if (!priv->ids_used_bimap)
+		return -ENOMEM;
+
+	/* Allocate enabled bitmap */
+	priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
+					   sizeof(unsigned long), GFP_KERNEL);
+	if (!priv->ids_enabled_bimap) {
+		kfree(priv->ids_used_bimap);
+		return -ENOMEM;
+	}
+
+	/* Allocate target CPU array */
+	priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
+				       sizeof(unsigned int), GFP_KERNEL);
+	if (!priv->ids_target_cpu) {
+		kfree(priv->ids_enabled_bimap);
+		kfree(priv->ids_used_bimap);
+		return -ENOMEM;
+	}
+	for (i = 0; i <= global->nr_ids; i++)
+		priv->ids_target_cpu[i] = UINT_MAX;
+
+	/* Reserve ID#0 because it is special and never implemented */
+	bitmap_set(priv->ids_used_bimap, 0, 1);
+
+	return 0;
+}
+
+static void __init imsic_ids_cleanup(struct imsic_priv *priv)
+{
+	kfree(priv->ids_target_cpu);
+	kfree(priv->ids_enabled_bimap);
+	kfree(priv->ids_used_bimap);
+}
+
+#ifdef CONFIG_SMP
+static void imsic_ipi_send(unsigned int cpu)
+{
+	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
+
+	if (!handler || !handler->priv || !handler->local.msi_va) {
+		pr_warn("CPU%d: handler not initialized\n", cpu);
+		return;
+	}
+
+	writel(handler->priv->ipi_id, handler->local.msi_va);
+}
+
+static void imsic_ipi_enable(struct imsic_priv *priv)
+{
+	__imsic_id_enable(priv->ipi_id);
+	__imsic_id_enable(priv->ipi_lsync_id);
+}
+
+static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
+{
+	int virq;
+
+	/* Allocate interrupt identity for IPIs */
+	virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
+	if (virq < 0)
+		return virq;
+	priv->ipi_id = virq;
+
+	/* Create IMSIC IPI multiplexing */
+	virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
+	if (virq <= 0) {
+		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
+		return (virq < 0) ? virq : -ENOMEM;
+	}
+
+	/* Set vIRQ range */
+	riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
+
+	/* Allocate interrupt identity for local enable/disable sync */
+	virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
+	if (virq < 0) {
+		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
+		return virq;
+	}
+	priv->ipi_lsync_id = virq;
+
+	return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
+{
+	imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
+	if (priv->ipi_id)
+		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
+}
+#else
+static void imsic_ipi_enable(struct imsic_priv *priv)
+{
+}
+
+static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
+{
+	/* Clear the IPI ids because we are not using IPIs */
+	priv->ipi_id = 0;
+	priv->ipi_lsync_id = 0;
+	return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
+{
+}
+#endif
+
+static void imsic_irq_mask(struct irq_data *d)
+{
+	imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
+}
+
+static void imsic_irq_unmask(struct irq_data *d)
+{
+	imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
+}
+
+static void imsic_irq_compose_msi_msg(struct irq_data *d,
+				      struct msi_msg *msg)
+{
+	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
+	unsigned int cpu;
+	int err;
+
+	cpu = imsic_id_get_target(priv, d->hwirq);
+	WARN_ON(cpu == UINT_MAX);
+
+	err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
+	WARN_ON(err);
+
+	iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
+}
+
+#ifdef CONFIG_SMP
+static int imsic_irq_set_affinity(struct irq_data *d,
+				  const struct cpumask *mask_val,
+				  bool force)
+{
+	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
+	unsigned int target_cpu;
+	int rc;
+
+	rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
+	if (rc)
+		return rc;
+
+	imsic_id_set_target(priv, d->hwirq, target_cpu);
+	irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
+
+	return IRQ_SET_MASK_OK;
+}
+#endif
+
+static struct irq_chip imsic_irq_base_chip = {
+	.name			= "RISC-V IMSIC-BASE",
+	.irq_mask		= imsic_irq_mask,
+	.irq_unmask		= imsic_irq_unmask,
+#ifdef CONFIG_SMP
+	.irq_set_affinity	= imsic_irq_set_affinity,
+#endif
+	.irq_compose_msi_msg	= imsic_irq_compose_msi_msg,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int imsic_irq_domain_alloc(struct irq_domain *domain,
+				  unsigned int virq,
+				  unsigned int nr_irqs,
+				  void *args)
+{
+	struct imsic_priv *priv = domain->host_data;
+	msi_alloc_info_t *info = args;
+	phys_addr_t msi_addr;
+	int i, hwirq, err = 0;
+	unsigned int cpu;
+
+	err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
+	if (err)
+		return err;
+
+	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+	if (err)
+		return err;
+
+	hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
+				get_count_order(nr_irqs));
+	if (hwirq < 0)
+		return hwirq;
+
+	err = iommu_dma_prepare_msi(info->desc, msi_addr);
+	if (err)
+		goto fail;
+
+	for (i = 0; i < nr_irqs; i++) {
+		imsic_id_set_target(priv, hwirq + i, cpu);
+		irq_domain_set_info(domain, virq + i, hwirq + i,
+				    &imsic_irq_base_chip, priv,
+				    handle_simple_irq, NULL, NULL);
+		irq_set_noprobe(virq + i);
+		irq_set_affinity(virq + i, &priv->lmask);
+	}
+
+	return 0;
+
+fail:
+	imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
+	return err;
+}
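+
+/*
+ * Note: a multi-MSI allocation (nr_irqs > 1) gets a naturally aligned,
+ * contiguous block of identities from imsic_ids_alloc() because it uses
+ * bitmap_find_free_region() with get_count_order(nr_irqs), which is what
+ * MSI_FLAG_MULTI_PCI_MSI in the PCI MSI domain below relies on.
+ */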
+
+static void imsic_irq_domain_free(struct irq_domain *domain,
+				  unsigned int virq,
+				  unsigned int nr_irqs)
+{
+	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+	struct imsic_priv *priv = domain->host_data;
+
+	imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops imsic_base_domain_ops = {
+	.alloc		= imsic_irq_domain_alloc,
+	.free		= imsic_irq_domain_free,
+};
+
+#ifdef CONFIG_RISCV_IMSIC_PCI
+
+static void imsic_pci_mask_irq(struct irq_data *d)
+{
+	pci_msi_mask_irq(d);
+	irq_chip_mask_parent(d);
+}
+
+static void imsic_pci_unmask_irq(struct irq_data *d)
+{
+	pci_msi_unmask_irq(d);
+	irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip imsic_pci_irq_chip = {
+	.name			= "RISC-V IMSIC-PCI",
+	.irq_mask		= imsic_pci_mask_irq,
+	.irq_unmask		= imsic_pci_unmask_irq,
+	.irq_eoi		= irq_chip_eoi_parent,
+};
+
+static struct msi_domain_ops imsic_pci_domain_ops = {
+};
+
+static struct msi_domain_info imsic_pci_domain_info = {
+	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+		   MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
+	.ops	= &imsic_pci_domain_ops,
+	.chip	= &imsic_pci_irq_chip,
+};
+
+#endif
+
+static struct irq_chip imsic_plat_irq_chip = {
+	.name			= "RISC-V IMSIC-PLAT",
+};
+
+static struct msi_domain_ops imsic_plat_domain_ops = {
+};
+
+static struct msi_domain_info imsic_plat_domain_info = {
+	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
+	.ops	= &imsic_plat_domain_ops,
+	.chip	= &imsic_plat_irq_chip,
+};
+
+static int __init imsic_irq_domains_init(struct imsic_priv *priv,
+					 struct fwnode_handle *fwnode)
+{
+	/* Create Base IRQ domain */
+	priv->base_domain = irq_domain_create_tree(fwnode,
+						&imsic_base_domain_ops, priv);
+	if (!priv->base_domain) {
+		pr_err("Failed to create IMSIC base domain\n");
+		return -ENOMEM;
+	}
+	irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
+
+#ifdef CONFIG_RISCV_IMSIC_PCI
+	/* Create PCI MSI domain */
+	priv->pci_domain = pci_msi_create_irq_domain(fwnode,
+						&imsic_pci_domain_info,
+						priv->base_domain);
+	if (!priv->pci_domain) {
+		pr_err("Failed to create IMSIC PCI domain\n");
+		irq_domain_remove(priv->base_domain);
+		return -ENOMEM;
+	}
+#endif
+
+	/* Create Platform MSI domain */
+	priv->plat_domain = platform_msi_create_irq_domain(fwnode,
+						&imsic_plat_domain_info,
+						priv->base_domain);
+	if (!priv->plat_domain) {
+		pr_err("Failed to create IMSIC platform domain\n");
+		if (priv->pci_domain)
+			irq_domain_remove(priv->pci_domain);
+		irq_domain_remove(priv->base_domain);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * To handle an interrupt, we read the TOPEI CSR and write zero to it in
+ * a single instruction. If the TOPEI CSR is non-zero, we translate TOPEI.ID
+ * to a Linux interrupt number and let the Linux IRQ subsystem handle it.
+ */
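+
+/*
+ * Worked example (illustrative): if identity 5 is the highest-priority
+ * pending-and-enabled interrupt, csr_swap(CSR_TOPEI, 0) returns
+ * (5 << TOPEI_ID_SHIFT) | 5 and claims it in the same instruction, so
+ * the loop below sees hwirq = 5 after the shift.
+ */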
+static void imsic_handle_irq(struct irq_desc *desc)
+{
+	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	struct imsic_priv *priv = handler->priv;
+	irq_hw_number_t hwirq;
+	int err;
+
+	WARN_ON_ONCE(!handler->priv);
+
+	chained_irq_enter(chip, desc);
+
+	while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
+		hwirq = hwirq >> TOPEI_ID_SHIFT;
+
+		if (hwirq == priv->ipi_id) {
+#ifdef CONFIG_SMP
+			ipi_mux_process();
+#endif
+			continue;
+		} else if (hwirq == priv->ipi_lsync_id) {
+			imsic_ids_local_sync(priv);
+			continue;
+		}
+
+		err = generic_handle_domain_irq(priv->base_domain, hwirq);
+		if (unlikely(err))
+			pr_warn_ratelimited(
+				"hwirq %lu mapping not found\n", hwirq);
+	}
+
+	chained_irq_exit(chip, desc);
+}
+
+static int imsic_starting_cpu(unsigned int cpu)
+{
+	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
+	struct imsic_priv *priv = handler->priv;
+
+	/* Enable per-CPU parent interrupt */
+	if (imsic_parent_irq)
+		enable_percpu_irq(imsic_parent_irq,
+				  irq_get_trigger_type(imsic_parent_irq));
+	else
+		pr_warn("cpu%d: parent irq not available\n", cpu);
+
+	/* Enable IPIs */
+	imsic_ipi_enable(priv);
+
+	/*
+	 * Interrupt identities might have been enabled/disabled while this
+	 * CPU was not running, so sync up the local enable/disable state.
+	 */
+	imsic_ids_local_sync(priv);
+
+	/* Locally enable interrupt delivery */
+	imsic_ids_local_delivery(priv, true);
+
+	return 0;
+}
+
+struct imsic_fwnode_ops {
+	u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
+			     void *fwopaque);
+	int (*parent_hartid)(struct fwnode_handle *fwnode,
+			     void *fwopaque, u32 index,
+			     unsigned long *out_hartid);
+	u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
+	int (*mmio_to_resource)(struct fwnode_handle *fwnode,
+				void *fwopaque, u32 index,
+				struct resource *res);
+	void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
+				  void *fwopaque, u32 index);
+	int (*read_u32)(struct fwnode_handle *fwnode,
+			void *fwopaque, const char *prop, u32 *out_val);
+	bool (*read_bool)(struct fwnode_handle *fwnode,
+			  void *fwopaque, const char *prop);
+};
+
+static int __init imsic_init(struct imsic_fwnode_ops *fwops,
+			     struct fwnode_handle *fwnode,
+			     void *fwopaque)
+{
+	struct resource res;
+	phys_addr_t base_addr;
+	int rc, nr_parent_irqs;
+	struct imsic_mmio *mmio;
+	struct imsic_priv *priv;
+	struct irq_domain *domain;
+	struct imsic_handler *handler;
+	struct imsic_global_config *global;
+	u32 i, tmp, nr_handlers = 0;
+
+	if (imsic_init_done) {
+		pr_err("%pfwP: already initialized hence ignoring\n",
+			fwnode);
+		return -ENODEV;
+	}
+
+	if (!riscv_isa_extension_available(NULL, SxAIA)) {
+		pr_err("%pfwP: AIA support not available\n", fwnode);
+		return -ENODEV;
+	}
+
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+	global = &priv->global;
+
+	/* Find number of parent interrupts */
+	nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
+	if (!nr_parent_irqs) {
+		pr_err("%pfwP: no parent irqs available\n", fwnode);
+		return -EINVAL;
+	}
+
+	/* Find number of guest index bits in MSI address */
+	rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
+			     &global->guest_index_bits);
+	if (rc)
+		global->guest_index_bits = 0;
+	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+	if (tmp < global->guest_index_bits) {
+		pr_err("%pfwP: guest index bits too big\n", fwnode);
+		return -EINVAL;
+	}
+
+	/* Find number of HART index bits */
+	rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
+			     &global->hart_index_bits);
+	if (rc) {
+		/* Assume default value */
+		global->hart_index_bits = __fls(nr_parent_irqs);
+		if (BIT(global->hart_index_bits) < nr_parent_irqs)
+			global->hart_index_bits++;
+	}
+	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+	      global->guest_index_bits;
+	if (tmp < global->hart_index_bits) {
+		pr_err("%pfwP: HART index bits too big\n", fwnode);
+		return -EINVAL;
+	}
+
+	/* Find number of group index bits */
+	rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
+			     &global->group_index_bits);
+	if (rc)
+		global->group_index_bits = 0;
+	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+	      global->guest_index_bits - global->hart_index_bits;
+	if (tmp < global->group_index_bits) {
+		pr_err("%pfwP: group index bits too big\n", fwnode);
+		return -EINVAL;
+	}
+
+	/*
+	 * Find the first bit position of the group index. If not
+	 * specified, assume the default APLIC-IMSIC configuration.
+	 */
+	rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
+			     &global->group_index_shift);
+	if (rc)
+		global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
+	tmp = global->group_index_bits + global->group_index_shift - 1;
+	if (tmp >= BITS_PER_LONG) {
+		pr_err("%pfwP: group index shift too big\n", fwnode);
+		return -EINVAL;
+	}
+
+	/* Find number of interrupt identities */
+	rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
+			     &global->nr_ids);
+	if (rc) {
+		pr_err("%pfwP: number of interrupt identities not found\n",
+			fwnode);
+		return rc;
+	}
+	if ((global->nr_ids < IMSIC_MIN_ID) ||
+	    (global->nr_ids >= IMSIC_MAX_ID) ||
+	    ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+		pr_err("%pfwP: invalid number of interrupt identities\n",
+			fwnode);
+		return -EINVAL;
+	}
+
+	/* Find number of guest interrupt identities */
+	if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
+			    &global->nr_guest_ids))
+		global->nr_guest_ids = global->nr_ids;
+	if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
+	    (global->nr_guest_ids >= IMSIC_MAX_ID) ||
+	    ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+		pr_err("%pfwP: invalid number of guest interrupt identities\n",
+			fwnode);
+		return -EINVAL;
+	}
+
+	/* Compute base address */
+	rc = fwops->mmio_to_resource(fwnode, fwopaque, 0, &res);
+	if (rc) {
+		pr_err("%pfwP: first MMIO resource not found\n", fwnode);
+		return -EINVAL;
+	}
+	global->base_addr = res.start;
+	global->base_addr &= ~(BIT(global->guest_index_bits +
+				   global->hart_index_bits +
+				   IMSIC_MMIO_PAGE_SHIFT) - 1);
+	global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+			       global->group_index_shift);
+
+	/* Find number of MMIO register sets */
+	priv->num_mmios = fwops->nr_mmio(fwnode, fwopaque);
+
+	/* Allocate MMIO register sets */
+	priv->mmios = kcalloc(priv->num_mmios, sizeof(*mmio), GFP_KERNEL);
+	if (!priv->mmios) {
+		rc = -ENOMEM;
+		goto out_free_priv;
+	}
+
+	/* Parse and map MMIO register sets */
+	for (i = 0; i < priv->num_mmios; i++) {
+		mmio = &priv->mmios[i];
+		rc = fwops->mmio_to_resource(fwnode, fwopaque, i, &res);
+		if (rc) {
+			pr_err("%pfwP: unable to parse MMIO regset %d\n",
+				fwnode, i);
+			goto out_iounmap;
+		}
+		mmio->pa = res.start;
+		mmio->size = res.end - res.start + 1;
+
+		base_addr = mmio->pa;
+		base_addr &= ~(BIT(global->guest_index_bits +
+				   global->hart_index_bits +
+				   IMSIC_MMIO_PAGE_SHIFT) - 1);
+		base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+			       global->group_index_shift);
+		if (base_addr != global->base_addr) {
+			rc = -EINVAL;
+			pr_err("%pfwP: address mismatch for regset %d\n",
+				fwnode, i);
+			goto out_iounmap;
+		}
+
+		mmio->va = fwops->mmio_map(fwnode, fwopaque, i);
+		if (!mmio->va) {
+			rc = -EIO;
+			pr_err("%pfwP: unable to map MMIO regset %d\n",
+				fwnode, i);
+			goto out_iounmap;
+		}
+	}
+
+	/* Initialize interrupt identity management */
+	rc = imsic_ids_init(priv);
+	if (rc) {
+		pr_err("%pfwP: failed to initialize interrupt management\n",
+		       fwnode);
+		goto out_iounmap;
+	}
+
+	/* Configure handlers for target CPUs */
+	for (i = 0; i < nr_parent_irqs; i++) {
+		unsigned long reloff, hartid;
+		int j, cpu;
+
+		rc = fwops->parent_hartid(fwnode, fwopaque, i, &hartid);
+		if (rc) {
+			pr_warn("%pfwP: hart ID for parent irq%d not found\n",
+				fwnode, i);
+			continue;
+		}
+
+		cpu = riscv_hartid_to_cpuid(hartid);
+		if (cpu < 0) {
+			pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
+				fwnode, i);
+			continue;
+		}
+
+		/* Find MMIO location of MSI page */
+		mmio = NULL;
+		reloff = i * BIT(global->guest_index_bits) *
+			 IMSIC_MMIO_PAGE_SZ;
+		for (j = 0; j < priv->num_mmios; j++) {
+			if (reloff < priv->mmios[j].size) {
+				mmio = &priv->mmios[j];
+				break;
+			}
+
+			/*
+			 * MMIO region size may not be aligned to
+			 * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+			 * if holes are present.
+			 */
+			reloff -= ALIGN(priv->mmios[j].size,
+					BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
+		}
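+
+		/*
+		 * Worked example for the search above (illustrative): with
+		 * guest_index_bits = 1 and two MMIO regsets of 16 pages
+		 * (64K) each, parent irq i = 9 starts with reloff =
+		 * 9 * 2 * 4K = 72K; that is not below mmios[0].size, so
+		 * 64K (already a multiple of the 8K per-hart stride) is
+		 * subtracted and mmios[1] matches with reloff = 8K, i.e.
+		 * HART index 9's guest-0 MSI page.
+		 */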
+		if (!mmio) {
+			pr_warn("%pfwP: MMIO not found for parent irq%d\n",
+				fwnode, i);
+			continue;
+		}
+
+		handler = per_cpu_ptr(&imsic_handlers, cpu);
+		if (handler->priv) {
+			pr_warn("%pfwP: CPU%d handler already configured.\n",
+				fwnode, cpu);
+			goto done;
+		}
+
+		cpumask_set_cpu(cpu, &priv->lmask);
+		handler->local.msi_pa = mmio->pa + reloff;
+		handler->local.msi_va = mmio->va + reloff;
+		handler->priv = priv;
+
+done:
+		nr_handlers++;
+	}
+
+	/* If no CPU handlers were found then we can't take interrupts */
+	if (!nr_handlers) {
+		pr_err("%pfwP: No CPU handlers found\n", fwnode);
+		rc = -ENODEV;
+		goto out_ids_cleanup;
+	}
+
+	/* Find parent domain and register chained handler */
+	domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+					  DOMAIN_BUS_ANY);
+	if (!domain) {
+		pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
+		rc = -ENOENT;
+		goto out_ids_cleanup;
+	}
+	imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+	if (!imsic_parent_irq) {
+		pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
+		rc = -ENOENT;
+		goto out_ids_cleanup;
+	}
+	irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
+
+	/* Initialize IPI domain */
+	rc = imsic_ipi_domain_init(priv);
+	if (rc) {
+		pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
+		goto out_ids_cleanup;
+	}
+
+	/* Initialize IRQ and MSI domains */
+	rc = imsic_irq_domains_init(priv, fwnode);
+	if (rc) {
+		pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
+		       fwnode);
+		goto out_ipi_domain_cleanup;
+	}
+
+	/*
+	 * Set up the cpuhp state.
+	 *
+	 * The per-CPU IMSIC file is not disabled when a CPU goes offline
+	 * because that would break IPIs; the masking/unmasking of virtual
+	 * IPIs is handled by the generic IPI mux instead.
+	 */
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+			  "irqchip/riscv/imsic:starting",
+			  imsic_starting_cpu, NULL);
+
+	/*
+	 * Only one IMSIC instance is allowed per platform, to keep the
+	 * implementation of SMP IRQ affinity and per-CPU IPIs clean.
+	 *
+	 * This means that on a multi-socket (or multi-die) platform we
+	 * will have multiple MMIO regions for the one IMSIC instance.
+	 */
+	imsic_init_done = true;
+
+	pr_info("%pfwP:  hart-index-bits: %d,  guest-index-bits: %d\n",
+		fwnode, global->hart_index_bits, global->guest_index_bits);
+	pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
+		fwnode, global->group_index_bits, global->group_index_shift);
+	pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
+		fwnode, global->nr_ids, nr_handlers, &global->base_addr);
+	if (priv->ipi_lsync_id)
+		pr_info("%pfwP: enable/disable sync using interrupt %d\n",
+			fwnode, priv->ipi_lsync_id);
+	if (priv->ipi_id)
+		pr_info("%pfwP: providing IPIs using interrupt %d\n",
+			fwnode, priv->ipi_id);
+
+	return 0;
+
+out_ipi_domain_cleanup:
+	imsic_ipi_domain_cleanup(priv);
+out_ids_cleanup:
+	imsic_ids_cleanup(priv);
+out_iounmap:
+	for (i = 0; i < priv->num_mmios; i++) {
+		if (priv->mmios[i].va)
+			iounmap(priv->mmios[i].va);
+	}
+	kfree(priv->mmios);
+out_free_priv:
+	kfree(priv);
+	return rc;
+}
+
+static u32 __init imsic_dt_nr_parent_irq(struct fwnode_handle *fwnode,
+					 void *fwopaque)
+{
+	return of_irq_count(to_of_node(fwnode));
+}
+
+static int __init imsic_dt_parent_hartid(struct fwnode_handle *fwnode,
+					 void *fwopaque, u32 index,
+					 unsigned long *out_hartid)
+{
+	struct of_phandle_args parent;
+	int rc;
+
+	rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
+	if (rc)
+		return rc;
+
+	/*
+	 * Skip interrupts other than external interrupts for the
+	 * current privilege level.
+	 */
+	if (parent.args[0] != RV_IRQ_EXT)
+		return -EINVAL;
+
+	return riscv_of_parent_hartid(parent.np, out_hartid);
+}
+
+static u32 __init imsic_dt_nr_mmio(struct fwnode_handle *fwnode,
+				   void *fwopaque)
+{
+	u32 ret = 0;
+	struct resource res;
+
+	while (!of_address_to_resource(to_of_node(fwnode), ret, &res))
+		ret++;
+
+	return ret;
+}
+
+static int __init imsic_mmio_to_resource(struct fwnode_handle *fwnode,
+					 void *fwopaque, u32 index,
+					 struct resource *res)
+{
+	return of_address_to_resource(to_of_node(fwnode), index, res);
+}
+
+static void __iomem __init *imsic_dt_mmio_map(struct fwnode_handle *fwnode,
+					      void *fwopaque, u32 index)
+{
+	return of_iomap(to_of_node(fwnode), index);
+}
+
+static int __init imsic_dt_read_u32(struct fwnode_handle *fwnode,
+				    void *fwopaque, const char *prop,
+				    u32 *out_val)
+{
+	return of_property_read_u32(to_of_node(fwnode), prop, out_val);
+}
+
+static bool __init imsic_dt_read_bool(struct fwnode_handle *fwnode,
+				      void *fwopaque, const char *prop)
+{
+	return of_property_read_bool(to_of_node(fwnode), prop);
+}
+
+static int __init imsic_dt_init(struct device_node *node,
+				struct device_node *parent)
+{
+	struct imsic_fwnode_ops ops = {
+		.nr_parent_irq = imsic_dt_nr_parent_irq,
+		.parent_hartid = imsic_dt_parent_hartid,
+		.nr_mmio = imsic_dt_nr_mmio,
+		.mmio_to_resource = imsic_mmio_to_resource,
+		.mmio_map = imsic_dt_mmio_map,
+		.read_u32 = imsic_dt_read_u32,
+		.read_bool = imsic_dt_read_bool,
+	};
+
+	return imsic_init(&ops, &node->fwnode, NULL);
+}
+IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
new file mode 100644
index 000000000000..5d1387adc0ba
--- /dev/null
+++ b/include/linux/irqchip/riscv-imsic.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
+#define __LINUX_IRQCHIP_RISCV_IMSIC_H
+
+#include <linux/types.h>
+#include <asm/csr.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT		12
+#define IMSIC_MMIO_PAGE_SZ		(1UL << IMSIC_MMIO_PAGE_SHIFT)
+#define IMSIC_MMIO_PAGE_LE		0x00
+#define IMSIC_MMIO_PAGE_BE		0x04
+
+#define IMSIC_MIN_ID			63
+#define IMSIC_MAX_ID			2048
+
+#define IMSIC_EIDELIVERY		0x70
+
+#define IMSIC_EITHRESHOLD		0x72
+
+#define IMSIC_EIP0			0x80
+#define IMSIC_EIP63			0xbf
+#define IMSIC_EIPx_BITS			32
+
+#define IMSIC_EIE0			0xc0
+#define IMSIC_EIE63			0xff
+#define IMSIC_EIEx_BITS			32
+
+#define IMSIC_FIRST			IMSIC_EIDELIVERY
+#define IMSIC_LAST			IMSIC_EIE63
+
+#define IMSIC_MMIO_SETIPNUM_LE		0x00
+#define IMSIC_MMIO_SETIPNUM_BE		0x04
+
+struct imsic_global_config {
+	/*
+	 * MSI Target Address Scheme
+	 *
+	 * XLEN-1                                                12     0
+	 * |                                                     |     |
+	 * -------------------------------------------------------------
+	 * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
+	 * -------------------------------------------------------------
+	 */
+
+	/* Bits representing Guest index, HART index, and Group index */
+	u32 guest_index_bits;
+	u32 hart_index_bits;
+	u32 group_index_bits;
+	u32 group_index_shift;
+
+	/* Global base address matching all target MSI addresses */
+	phys_addr_t base_addr;
+
+	/* Number of interrupt identities */
+	u32 nr_ids;
+
+	/* Number of guest interrupt identities */
+	u32 nr_guest_ids;
+};
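+
+/*
+ * Worked example of the address scheme above (illustrative): with
+ * base_addr = 0x28000000, group_index_bits = 0, hart_index_bits = 2 and
+ * guest_index_bits = 1, the MSI page of HART index h and guest index g
+ * sits at base_addr + (((h << 1) | g) << IMSIC_MMIO_PAGE_SHIFT), e.g.
+ * HART index 2, guest index 0 => 0x28004000.
+ */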
+
+struct imsic_local_config {
+	phys_addr_t msi_pa;
+	void __iomem *msi_va;
+};
+
+#ifdef CONFIG_RISCV_IMSIC
+
+extern const struct imsic_global_config *imsic_get_global_config(void);
+
+extern const struct imsic_local_config *imsic_get_local_config(
+							unsigned int cpu);
+
+#else
+
+static inline const struct imsic_global_config *imsic_get_global_config(void)
+{
+	return NULL;
+}
+
+static inline const struct imsic_local_config *imsic_get_local_config(
+							unsigned int cpu)
+{
+	return NULL;
+}
+
+#endif
+
+#endif
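
For reference, a minimal sketch (illustrative only, not part of this patch;
the function name is made up) of how a caller could raise interrupt
identity "id" in the IMSIC file of "cpu" using the helpers and constants
above. This is essentially what imsic_ipi_send() does, and it is also how
a device-originated MSI lands in an interrupt file in hardware:

  #include <linux/io.h>
  #include <linux/irqchip/riscv-imsic.h>

  static void example_imsic_raise(unsigned int cpu, u32 id)
  {
  	const struct imsic_local_config *local = imsic_get_local_config(cpu);

  	if (!local || !local->msi_va)
  		return;

  	/* A 32-bit write of the identity to the little-endian SETIPNUM page */
  	writel(id, local->msi_va + IMSIC_MMIO_SETIPNUM_LE);
  }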
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

+#define IMSIC_MMIO_SETIPNUM_BE		0x04
+
+struct imsic_global_config {
+	/*
+	 * MSI Target Address Scheme
+	 *
+	 * XLEN-1                                                12     0
+	 * |                                                     |     |
+	 * -------------------------------------------------------------
+	 * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
+	 * -------------------------------------------------------------
+	 */
+
+	/* Bits representing Guest index, HART index, and Group index */
+	u32 guest_index_bits;
+	u32 hart_index_bits;
+	u32 group_index_bits;
+	u32 group_index_shift;
+
+	/* Global base address matching all target MSI addresses */
+	phys_addr_t base_addr;
+
+	/* Number of interrupt identities */
+	u32 nr_ids;
+
+	/* Number of guest interrupt identities */
+	u32 nr_guest_ids;
+};
+
+struct imsic_local_config {
+	phys_addr_t msi_pa;
+	void __iomem *msi_va;
+};
+
+#ifdef CONFIG_RISCV_IMSIC
+
+extern const struct imsic_global_config *imsic_get_global_config(void);
+
+extern const struct imsic_local_config *imsic_get_local_config(
+							unsigned int cpu);
+
+#else
+
+static inline const struct imsic_global_config *imsic_get_global_config(void)
+{
+	return NULL;
+}
+
+static inline const struct imsic_local_config *imsic_get_local_config(
+							unsigned int cpu)
+{
+	return NULL;
+}
+
+#endif
+
+#endif
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread
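
A minimal standalone sketch of how an MSI target address is composed according
to the layout comment in riscv-imsic.h above (illustrative only; the helper
name and the example platform values are hypothetical and not part of this
series):

#include <stdint.h>
#include <stdio.h>

#define IMSIC_MMIO_PAGE_SHIFT	12

/*
 * Compose the MSI target address of one IMSIC file following the layout:
 *
 *   |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
 *
 * The guest index sits just above the 4KiB page offset, the hart index
 * sits above the guest index, and the group index starts at
 * group_index_shift.
 */
static uint64_t imsic_msi_addr(uint64_t base_addr, uint32_t guest_index_bits,
			       uint32_t group_index_shift, uint32_t group,
			       uint32_t hart, uint32_t guest)
{
	uint64_t addr = base_addr;

	addr |= (uint64_t)group << group_index_shift;
	addr |= (uint64_t)hart << (guest_index_bits + IMSIC_MMIO_PAGE_SHIFT);
	addr |= (uint64_t)guest << IMSIC_MMIO_PAGE_SHIFT;

	return addr;
}

int main(void)
{
	/* Hypothetical platform: base 0x28000000, 2 guest index bits */
	printf("hart 3 file: 0x%llx\n", (unsigned long long)
	       imsic_msi_addr(0x28000000ULL, 2, 24, 0, 3, 0));
	return 0;
}

This mirrors the per-parent-irq offset computed in the driver above
(reloff = i * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ).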

* [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

We add a DT bindings document for the RISC-V advanced platform level
interrupt controller (APLIC) defined by the RISC-V advanced
interrupt architecture (AIA) specification.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
 1 file changed, 159 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
new file mode 100644
index 000000000000..b7f20aad72c2
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
@@ -0,0 +1,159 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
+
+maintainers:
+  - Anup Patel <anup@brainfault.org>
+
+description:
+  The RISC-V advanced interrupt architecture (AIA) defines an advanced
+  platform level interrupt controller (APLIC) for handling wired interrupts
+  in a RISC-V platform. The RISC-V AIA specification can be found at
+  https://github.com/riscv/riscv-aia.
+
+  The RISC-V APLIC is implemented as hierarchical APLIC domains where all
+  interrupt sources connect to the root domain which can further delegate
+  interrupts to child domains. There is one device tree node for each APLIC
+  domain.
+
+allOf:
+  - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+  compatible:
+    items:
+      - enum:
+          - riscv,qemu-aplic
+      - const: riscv,aplic
+
+  reg:
+    maxItems: 1
+
+  interrupt-controller: true
+
+  "#interrupt-cells":
+    const: 2
+
+  interrupts-extended:
+    minItems: 1
+    maxItems: 16384
+    description:
+      The given APLIC domain directly injects external interrupts to a set of
+      RISC-V HARTs (or CPUs). Each node pointed to should be a riscv,cpu-intc
+      node, which has a riscv node (i.e. RISC-V HART) as parent.
+
+  msi-parent:
+    description:
+      The given APLIC domain forwards wired interrupts as MSIs to an AIA
+      incoming message signaled interrupt controller (IMSIC). This property
+      should be considered only when the interrupts-extended property is absent.
+
+  riscv,num-sources:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 1
+    maximum: 1023
+    description:
+      Specifies how many wired interrupts are supported by this APLIC domain.
+
+  riscv,children:
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    minItems: 1
+    maxItems: 1024
+    items:
+      maxItems: 1
+    description:
+      A list of child APLIC domains for the given APLIC domain. Each child
+      APLIC domain is assigned a child index in increasing order, with the
+      first child APLIC domain assigned child index 0. The APLIC domain
+      child index is used by firmware to delegate interrupts from the
+      given APLIC domain to a particular child APLIC domain.
+
+  riscv,delegate:
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    minItems: 1
+    maxItems: 1024
+    items:
+      items:
+        - description: child APLIC domain phandle
+        - description: first interrupt number (inclusive)
+        - description: last interrupt number (inclusive)
+    description:
+      An interrupt delegation list where each entry is a triple consisting
+      of a child APLIC domain phandle, first interrupt number, and last
+      interrupt number. The firmware will configure the interrupt delegation
+      registers based on this interrupt delegation list.
+
+required:
+  - compatible
+  - reg
+  - interrupt-controller
+  - "#interrupt-cells"
+  - riscv,num-sources
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    // Example 1 (APLIC domains directly injecting interrupts to HARTs):
+
+    aplic0: interrupt-controller@c000000 {
+      compatible = "riscv,qemu-aplic", "riscv,aplic";
+      interrupts-extended = <&cpu1_intc 11>,
+                            <&cpu2_intc 11>,
+                            <&cpu3_intc 11>,
+                            <&cpu4_intc 11>;
+      reg = <0xc000000 0x4080>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+      riscv,children = <&aplic1>, <&aplic2>;
+      riscv,delegate = <&aplic1 1 63>;
+    };
+
+    aplic1: interrupt-controller@d000000 {
+      compatible = "riscv,qemu-aplic", "riscv,aplic";
+      interrupts-extended = <&cpu1_intc 9>,
+                            <&cpu2_intc 9>;
+      reg = <0xd000000 0x4080>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+    };
+
+    aplic2: interrupt-controller@e000000 {
+      compatible = "riscv,qemu-aplic", "riscv,aplic";
+      interrupts-extended = <&cpu3_intc 9>,
+                            <&cpu4_intc 9>;
+      reg = <0xe000000 0x4080>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+    };
+
+  - |
+    // Example 2 (APLIC domains forwarding interrupts as MSIs):
+
+    aplic3: interrupt-controller@c000000 {
+      compatible = "riscv,qemu-aplic", "riscv,aplic";
+      msi-parent = <&imsic_mlevel>;
+      reg = <0xc000000 0x4000>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+      riscv,children = <&aplic4>;
+      riscv,delegate = <&aplic4 1 63>;
+    };
+
+    aplic4: interrupt-controller@d000000 {
+      compatible = "riscv,qemu-aplic", "riscv,aplic";
+      msi-parent = <&imsic_slevel>;
+      reg = <0xd000000 0x4000>;
+      interrupt-controller;
+      #interrupt-cells = <2>;
+      riscv,num-sources = <63>;
+    };
+...
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread
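
The riscv,delegate property above is consumed by firmware, which programs one
sourcecfg register per delegated source. A minimal sketch of that firmware-side
programming, assuming the APLIC sourcecfg layout from the riscv-aplic.h header
added later in this series (illustrative only; the helper name is hypothetical):

#include <stdint.h>

#define APLIC_SOURCECFG_BASE		0x0004
#define APLIC_SOURCECFG_D		(1U << 10)
#define APLIC_SOURCECFG_CHILDIDX_MASK	0x000003ff

/*
 * Delegate wired sources [first, last] of a parent APLIC domain to the
 * child domain identified by child_index, i.e. its position in the
 * parent's riscv,children list.
 */
void aplic_delegate_range(volatile uint32_t *parent_regs, uint32_t child_index,
			  uint32_t first, uint32_t last)
{
	uint32_t irq;

	for (irq = first; irq <= last; irq++)
		parent_regs[(APLIC_SOURCECFG_BASE + (irq - 1) * 4) / 4] =
			APLIC_SOURCECFG_D |
			(child_index & APLIC_SOURCECFG_CHILDIDX_MASK);
}

For Example 1 above, riscv,delegate = <&aplic1 1 63> would delegate sources
1..63 of aplic0 to child index 0, since &aplic1 is the first entry in
aplic0's riscv,children list.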

* [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

The RISC-V advanced interrupt architecture (AIA) specification defines
a new interrupt controller for managing wired interrupts on a RISC-V
platform. This new interrupt controller is referred to as the advanced
platform-level interrupt controller (APLIC), which can forward wired
interrupts to CPUs (or HARTs) either as local interrupts or as message
signaled interrupts.
(For more details, refer to https://github.com/riscv/riscv-aia)

This patch adds an irqchip driver for the RISC-V APLIC found on RISC-V
platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 drivers/irqchip/Kconfig             |   6 +
 drivers/irqchip/Makefile            |   1 +
 drivers/irqchip/irq-riscv-aplic.c   | 670 ++++++++++++++++++++++++++++
 include/linux/irqchip/riscv-aplic.h | 117 +++++
 4 files changed, 794 insertions(+)
 create mode 100644 drivers/irqchip/irq-riscv-aplic.c
 create mode 100644 include/linux/irqchip/riscv-aplic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index a1315189a595..936e59fe1f99 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -547,6 +547,12 @@ config SIFIVE_PLIC
 	select IRQ_DOMAIN_HIERARCHY
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
 
+config RISCV_APLIC
+	bool
+	depends on RISCV
+	select IRQ_DOMAIN_HIERARCHY
+	select GENERIC_MSI_IRQ_DOMAIN
+
 config RISCV_IMSIC
 	bool
 	depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 22c723cc6ec8..6154e5bc4228 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)			+= irq-qcom-mpm.o
 obj-$(CONFIG_CSKY_MPINTC)		+= irq-csky-mpintc.o
 obj-$(CONFIG_CSKY_APB_INTC)		+= irq-csky-apb-intc.o
 obj-$(CONFIG_RISCV_INTC)		+= irq-riscv-intc.o
+obj-$(CONFIG_RISCV_APLIC)		+= irq-riscv-aplic.o
 obj-$(CONFIG_RISCV_IMSIC)		+= irq-riscv-imsic.o
 obj-$(CONFIG_SIFIVE_PLIC)		+= irq-sifive-plic.o
 obj-$(CONFIG_IMX_IRQSTEER)		+= irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
new file mode 100644
index 000000000000..63f20892d7d3
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic.c
@@ -0,0 +1,670 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/smp.h>
+
+#define APLIC_DEFAULT_PRIORITY		1
+#define APLIC_DISABLE_IDELIVERY		0
+#define APLIC_ENABLE_IDELIVERY		1
+#define APLIC_DISABLE_ITHRESHOLD	1
+#define APLIC_ENABLE_ITHRESHOLD		0
+
+struct aplic_msicfg {
+	phys_addr_t		base_ppn;
+	u32			hhxs;
+	u32			hhxw;
+	u32			lhxs;
+	u32			lhxw;
+};
+
+struct aplic_idc {
+	unsigned int		hart_index;
+	void __iomem		*regs;
+	struct aplic_priv	*priv;
+};
+
+struct aplic_priv {
+	struct device		*dev;
+	u32			nr_irqs;
+	u32			nr_idcs;
+	void __iomem		*regs;
+	struct irq_domain	*irqdomain;
+	struct aplic_msicfg	msicfg;
+	struct cpumask		lmask;
+};
+
+static unsigned int aplic_idc_parent_irq;
+static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
+
+static void aplic_irq_unmask(struct irq_data *d)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	writel(d->hwirq, priv->regs + APLIC_SETIENUM);
+
+	if (!priv->nr_idcs)
+		irq_chip_unmask_parent(d);
+}
+
+static void aplic_irq_mask(struct irq_data *d)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
+
+	if (!priv->nr_idcs)
+		irq_chip_mask_parent(d);
+}
+
+static int aplic_set_type(struct irq_data *d, unsigned int type)
+{
+	u32 val = 0;
+	void __iomem *sourcecfg;
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	switch (type) {
+	case IRQ_TYPE_NONE:
+		val = APLIC_SOURCECFG_SM_INACTIVE;
+		break;
+	case IRQ_TYPE_LEVEL_LOW:
+		val = APLIC_SOURCECFG_SM_LEVEL_LOW;
+		break;
+	case IRQ_TYPE_LEVEL_HIGH:
+		val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
+		break;
+	case IRQ_TYPE_EDGE_FALLING:
+		val = APLIC_SOURCECFG_SM_EDGE_FALL;
+		break;
+	case IRQ_TYPE_EDGE_RISING:
+		val = APLIC_SOURCECFG_SM_EDGE_RISE;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
+	sourcecfg += (d->hwirq - 1) * sizeof(u32);
+	writel(val, sourcecfg);
+
+	return 0;
+}
+
+#ifdef CONFIG_SMP
+static int aplic_set_affinity(struct irq_data *d,
+			      const struct cpumask *mask_val, bool force)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+	struct aplic_idc *idc;
+	unsigned int cpu, val;
+	struct cpumask amask;
+	void __iomem *target;
+
+	if (!priv->nr_idcs)
+		return irq_chip_set_affinity_parent(d, mask_val, force);
+
+	cpumask_and(&amask, &priv->lmask, mask_val);
+
+	if (force)
+		cpu = cpumask_first(&amask);
+	else
+		cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	idc = per_cpu_ptr(&aplic_idcs, cpu);
+	target = priv->regs + APLIC_TARGET_BASE;
+	target += (d->hwirq - 1) * sizeof(u32);
+	val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+	val <<= APLIC_TARGET_HART_IDX_SHIFT;
+	val |= APLIC_DEFAULT_PRIORITY;
+	writel(val, target);
+
+	irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+	return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip aplic_chip = {
+	.name		= "RISC-V APLIC",
+	.irq_mask	= aplic_irq_mask,
+	.irq_unmask	= aplic_irq_unmask,
+	.irq_set_type	= aplic_set_type,
+#ifdef CONFIG_SMP
+	.irq_set_affinity = aplic_set_affinity,
+#endif
+	.flags		= IRQCHIP_SET_TYPE_MASKED |
+			  IRQCHIP_SKIP_SET_WAKE |
+			  IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_irqdomain_translate(struct irq_domain *d,
+				     struct irq_fwspec *fwspec,
+				     unsigned long *hwirq,
+				     unsigned int *type)
+{
+	if (WARN_ON(fwspec->param_count < 2))
+		return -EINVAL;
+	if (WARN_ON(!fwspec->param[0]))
+		return -EINVAL;
+
+	*hwirq = fwspec->param[0];
+	*type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+
+	WARN_ON(*type == IRQ_TYPE_NONE);
+
+	return 0;
+}
+
+static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *arg)
+{
+	int i, ret;
+	unsigned int type;
+	irq_hw_number_t hwirq;
+	struct irq_fwspec *fwspec = arg;
+	struct aplic_priv *priv = platform_msi_get_host_data(domain);
+
+	ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
+	if (ret)
+		return ret;
+
+	ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &aplic_chip, priv);
+
+	return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
+	.translate	= aplic_irqdomain_translate,
+	.alloc		= aplic_irqdomain_msi_alloc,
+	.free		= platform_msi_device_domain_free,
+};
+
+static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *arg)
+{
+	int i, ret;
+	unsigned int type;
+	irq_hw_number_t hwirq;
+	struct irq_fwspec *fwspec = arg;
+	struct aplic_priv *priv = domain->host_data;
+
+	ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_domain_set_info(domain, virq + i, hwirq + i,
+				    &aplic_chip, priv, handle_simple_irq,
+				    NULL, NULL);
+		irq_set_affinity(virq + i, &priv->lmask);
+	}
+
+	return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
+	.translate	= aplic_irqdomain_translate,
+	.alloc		= aplic_irqdomain_idc_alloc,
+	.free		= irq_domain_free_irqs_top,
+};
+
+static void aplic_init_hw_irqs(struct aplic_priv *priv)
+{
+	int i;
+
+	/* Disable all interrupts */
+	for (i = 0; i <= priv->nr_irqs; i += 32)
+		writel(-1U, priv->regs + APLIC_CLRIE_BASE +
+			    (i / 32) * sizeof(u32));
+
+	/* Set interrupt type and default priority for all interrupts */
+	for (i = 1; i <= priv->nr_irqs; i++) {
+		writel(0, priv->regs + APLIC_SOURCECFG_BASE +
+			  (i - 1) * sizeof(u32));
+		writel(APLIC_DEFAULT_PRIORITY,
+		       priv->regs + APLIC_TARGET_BASE +
+		       (i - 1) * sizeof(u32));
+	}
+
+	/* Clear APLIC domaincfg */
+	writel(0, priv->regs + APLIC_DOMAINCFG);
+}
+
+static void aplic_init_hw_global(struct aplic_priv *priv)
+{
+	u32 val;
+#ifdef CONFIG_RISCV_M_MODE
+	u32 valH;
+
+	if (!priv->nr_idcs) {
+		val = priv->msicfg.base_ppn;
+		valH = (priv->msicfg.base_ppn >> 32) &
+			APLIC_xMSICFGADDRH_BAPPN_MASK;
+		valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
+			<< APLIC_xMSICFGADDRH_LHXW_SHIFT;
+		valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
+			<< APLIC_xMSICFGADDRH_HHXW_SHIFT;
+		valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
+			<< APLIC_xMSICFGADDRH_LHXS_SHIFT;
+		valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
+			<< APLIC_xMSICFGADDRH_HHXS_SHIFT;
+		writel(val, priv->regs + APLIC_xMSICFGADDR);
+		writel(valH, priv->regs + APLIC_xMSICFGADDRH);
+	}
+#endif
+
+	/* Setup APLIC domaincfg register */
+	val = readl(priv->regs + APLIC_DOMAINCFG);
+	val |= APLIC_DOMAINCFG_IE;
+	if (!priv->nr_idcs)
+		val |= APLIC_DOMAINCFG_DM;
+	writel(val, priv->regs + APLIC_DOMAINCFG);
+	if (readl(priv->regs + APLIC_DOMAINCFG) != val)
+		dev_warn(priv->dev,
+			 "unable to write 0x%x in domaincfg\n", val);
+}
+
+static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+	unsigned int group_index, hart_index, guest_index, val;
+	struct device *dev = msi_desc_to_dev(desc);
+	struct aplic_priv *priv = dev_get_drvdata(dev);
+	struct irq_data *d = irq_get_irq_data(desc->irq);
+	struct aplic_msicfg *mc = &priv->msicfg;
+	phys_addr_t tppn, tbppn, msg_addr;
+	void __iomem *target;
+
+	/* For zeroed MSI, simply write zero into the target register */
+	if (!msg->address_hi && !msg->address_lo && !msg->data) {
+		target = priv->regs + APLIC_TARGET_BASE;
+		target += (d->hwirq - 1) * sizeof(u32);
+		writel(0, target);
+		return;
+	}
+
+	/* Sanity check on message data */
+	WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
+
+	/* Compute target MSI address */
+	msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
+	tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+
+	/* Compute target HART Base PPN */
+	tbppn = tppn;
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+	WARN_ON(tbppn != mc->base_ppn);
+
+	/* Compute target group and hart indexes */
+	group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
+		     APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
+	hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
+		     APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
+	hart_index |= (group_index << mc->lhxw);
+	WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
+
+	/* Compute target guest index */
+	guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
+
+	/* Update IRQ TARGET register */
+	target = priv->regs + APLIC_TARGET_BASE;
+	target += (d->hwirq - 1) * sizeof(u32);
+	val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
+				<< APLIC_TARGET_HART_IDX_SHIFT;
+	val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
+				<< APLIC_TARGET_GUEST_IDX_SHIFT;
+	val |= (msg->data & APLIC_TARGET_EIID_MASK);
+	writel(val, target);
+}
+
+static int aplic_setup_msi(struct aplic_priv *priv)
+{
+	struct device *dev = priv->dev;
+	struct aplic_msicfg *mc = &priv->msicfg;
+	const struct imsic_global_config *imsic_global;
+
+	/*
+	 * The APLIC outgoing MSI config registers assume the target MSI
+	 * controller to be a RISC-V AIA IMSIC controller.
+	 */
+	imsic_global = imsic_get_global_config();
+	if (!imsic_global) {
+		dev_err(dev, "IMSIC global config not found\n");
+		return -ENODEV;
+	}
+
+	/* Find number of guest index bits (LHXS) */
+	mc->lhxs = imsic_global->guest_index_bits;
+	if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
+		dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
+		return -EINVAL;
+	}
+
+	/* Find number of HART index bits (LHXW) */
+	mc->lhxw = imsic_global->hart_index_bits;
+	if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
+		dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
+		return -EINVAL;
+	}
+
+	/* Find number of group index bits (HHXW) */
+	mc->hhxw = imsic_global->group_index_bits;
+	if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
+		dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
+		return -EINVAL;
+	}
+
+	/* Find first bit position of group index (HHXS) */
+	mc->hhxs = imsic_global->group_index_shift;
+	if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
+		dev_err(dev, "IMSIC group index shift should be >= %d\n",
+			(2 * APLIC_xMSICFGADDR_PPN_SHIFT));
+		return -EINVAL;
+	}
+	mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
+	if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
+		dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
+		return -EINVAL;
+	}
+
+	/* Compute PPN base */
+	mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+
+	/* Use all possible CPUs as lmask */
+	cpumask_copy(&priv->lmask, cpu_possible_mask);
+
+	return 0;
+}
+
+/*
+ * To handle APLIC IDC interrupts, we just read the CLAIMI register,
+ * which returns the highest priority pending interrupt and clears its
+ * pending bit. This process is repeated until the CLAIMI register
+ * returns zero.
+ */
+static void aplic_idc_handle_irq(struct irq_desc *desc)
+{
+	struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	irq_hw_number_t hw_irq;
+	int irq;
+
+	chained_irq_enter(chip, desc);
+
+	while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
+		hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
+		irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
+
+		if (unlikely(irq <= 0))
+			pr_warn_ratelimited("hw_irq %lu mapping not found\n",
+					    hw_irq);
+		else
+			generic_handle_irq(irq);
+	}
+
+	chained_irq_exit(chip, desc);
+}
+
+static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
+{
+	u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
+	u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
+
+	/* Priority must be less than threshold for interrupt triggering */
+	writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
+
+	/* Delivery must be set to 1 for interrupt triggering */
+	writel(de, idc->regs + APLIC_IDC_IDELIVERY);
+}
+
+static int aplic_idc_dying_cpu(unsigned int cpu)
+{
+	if (aplic_idc_parent_irq)
+		disable_percpu_irq(aplic_idc_parent_irq);
+
+	return 0;
+}
+
+static int aplic_idc_starting_cpu(unsigned int cpu)
+{
+	if (aplic_idc_parent_irq)
+		enable_percpu_irq(aplic_idc_parent_irq,
+				  irq_get_trigger_type(aplic_idc_parent_irq));
+
+	return 0;
+}
+
+static int aplic_setup_idc(struct aplic_priv *priv)
+{
+	int i, j, rc, cpu, setup_count = 0;
+	struct device_node *node = priv->dev->of_node;
+	struct device *dev = priv->dev;
+	struct of_phandle_args parent;
+	struct irq_domain *domain;
+	unsigned long hartid;
+	struct aplic_idc *idc;
+	u32 val;
+
+	/* Setup per-CPU IDC and target CPU mask */
+	for (i = 0; i < priv->nr_idcs; i++) {
+		if (of_irq_parse_one(node, i, &parent)) {
+			dev_err(dev, "failed to parse parent for IDC%d.\n",
+				i);
+			return -EIO;
+		}
+
+		/* Skip IDCs which do not connect to external interrupts */
+		if (parent.args[0] != RV_IRQ_EXT)
+			continue;
+
+		rc = riscv_of_parent_hartid(parent.np, &hartid);
+		if (rc) {
+			dev_err(dev, "failed to parse hart ID for IDC%d.\n",
+				i);
+			return rc;
+		}
+
+		cpu = riscv_hartid_to_cpuid(hartid);
+		if (cpu < 0) {
+			dev_warn(dev, "invalid cpuid for IDC%d\n", i);
+			continue;
+		}
+
+		cpumask_set_cpu(cpu, &priv->lmask);
+
+		idc = per_cpu_ptr(&aplic_idcs, cpu);
+		WARN_ON(idc->priv);
+
+		idc->hart_index = i;
+		idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
+		idc->priv = priv;
+
+		aplic_idc_set_delivery(idc, true);
+
+		/*
+		 * The boot CPU might not have APLIC hart_index = 0, so check
+		 * and update the target registers of all interrupts.
+		 */
+		if (cpu == smp_processor_id() && idc->hart_index) {
+			val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+			val <<= APLIC_TARGET_HART_IDX_SHIFT;
+			val |= APLIC_DEFAULT_PRIORITY;
+			for (j = 1; j <= priv->nr_irqs; j++)
+				writel(val, priv->regs + APLIC_TARGET_BASE +
+					    (j - 1) * sizeof(u32));
+		}
+
+		setup_count++;
+	}
+
+	/* Find parent domain and register chained handler */
+	domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+					  DOMAIN_BUS_ANY);
+	if (!aplic_idc_parent_irq && domain) {
+		aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+		if (aplic_idc_parent_irq) {
+			irq_set_chained_handler(aplic_idc_parent_irq,
+						aplic_idc_handle_irq);
+
+			/*
+			 * Setup CPUHP notifier to enable IDC parent
+			 * interrupt on all CPUs
+			 */
+			cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+					  "irqchip/riscv/aplic:starting",
+					  aplic_idc_starting_cpu,
+					  aplic_idc_dying_cpu);
+		}
+	}
+
+	/* Fail if we were not able to setup IDC for any CPU */
+	return (setup_count) ? 0 : -ENODEV;
+}
+
+static int aplic_probe(struct platform_device *pdev)
+{
+	struct device_node *node = pdev->dev.of_node;
+	struct device *dev = &pdev->dev;
+	struct aplic_priv *priv;
+	struct resource *regs;
+	phys_addr_t pa;
+	int rc;
+
+	regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!regs) {
+		dev_err(dev, "cannot find registers resource\n");
+		return -ENOENT;
+	}
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+	platform_set_drvdata(pdev, priv);
+	priv->dev = dev;
+
+	priv->regs = devm_ioremap(dev, regs->start, resource_size(regs));
+	if (WARN_ON(!priv->regs)) {
+		dev_err(dev, "failed ioremap registers\n");
+		return -EIO;
+	}
+
+	of_property_read_u32(node, "riscv,num-sources", &priv->nr_irqs);
+	if (!priv->nr_irqs) {
+		dev_err(dev, "failed to get number of interrupt sources\n");
+		return -EINVAL;
+	}
+
+	/* Setup initial state of APLIC interrupts */
+	aplic_init_hw_irqs(priv);
+
+	/*
+	 * Setup IDCs or MSIs based on parent interrupts in DT node
+	 *
+	 * If "msi-parent" DT property is present then we ignore the
+	 * APLIC IDCs which forces the APLIC driver to use MSI mode.
+	 */
+	priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
+			0 : of_irq_count(node);
+	if (priv->nr_idcs)
+		rc = aplic_setup_idc(priv);
+	else
+		rc = aplic_setup_msi(priv);
+	if (rc)
+		return rc;
+
+	/* Setup global config and interrupt delivery */
+	aplic_init_hw_global(priv);
+
+	/* Create irq domain instance for the APLIC */
+	if (priv->nr_idcs)
+		priv->irqdomain = irq_domain_create_linear(
+						of_node_to_fwnode(node),
+						priv->nr_irqs + 1,
+						&aplic_irqdomain_idc_ops,
+						priv);
+	else
+		priv->irqdomain = platform_msi_create_device_domain(dev,
+						priv->nr_irqs + 1,
+						aplic_msi_write_msg,
+						&aplic_irqdomain_msi_ops,
+						priv);
+	if (!priv->irqdomain) {
+		dev_err(dev, "failed to add irq domain\n");
+		return -ENOMEM;
+	}
+
+	/* Advertise the interrupt controller */
+	if (priv->nr_idcs) {
+		dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
+			 priv->nr_irqs, priv->nr_idcs);
+	} else {
+		pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
+		dev_info(dev, "%d interrupts forwared to MSI base %pa\n",
+			 priv->nr_irqs, &pa);
+	}
+
+	return 0;
+}
+
+static int aplic_remove(struct platform_device *pdev)
+{
+	struct aplic_priv *priv = platform_get_drvdata(pdev);
+
+	irq_domain_remove(priv->irqdomain);
+
+	return 0;
+}
+
+static const struct of_device_id aplic_match[] = {
+	{ .compatible = "riscv,aplic" },
+	{}
+};
+
+static struct platform_driver aplic_driver = {
+	.driver = {
+		.name		= "riscv-aplic",
+		.of_match_table	= aplic_match,
+	},
+	.probe = aplic_probe,
+	.remove = aplic_remove,
+};
+
+static int __init aplic_init(void)
+{
+	return platform_driver_register(&aplic_driver);
+}
+core_initcall(aplic_init);
diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
new file mode 100644
index 000000000000..88177eefd411
--- /dev/null
+++ b/include/linux/irqchip/riscv-aplic.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
+#define __LINUX_IRQCHIP_RISCV_APLIC_H
+
+#include <linux/bitops.h>
+
+#define APLIC_MAX_IDC			BIT(14)
+#define APLIC_MAX_SOURCE		1024
+
+#define APLIC_DOMAINCFG			0x0000
+#define APLIC_DOMAINCFG_RDONLY		0x80000000
+#define APLIC_DOMAINCFG_IE		BIT(8)
+#define APLIC_DOMAINCFG_DM		BIT(2)
+#define APLIC_DOMAINCFG_BE		BIT(0)
+
+#define APLIC_SOURCECFG_BASE		0x0004
+#define APLIC_SOURCECFG_D		BIT(10)
+#define APLIC_SOURCECFG_CHILDIDX_MASK	0x000003ff
+#define APLIC_SOURCECFG_SM_MASK	0x00000007
+#define APLIC_SOURCECFG_SM_INACTIVE	0x0
+#define APLIC_SOURCECFG_SM_DETACH	0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE	0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL	0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH	0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW	0x7
+
+#define APLIC_MMSICFGADDR		0x1bc0
+#define APLIC_MMSICFGADDRH		0x1bc4
+#define APLIC_SMSICFGADDR		0x1bc8
+#define APLIC_SMSICFGADDRH		0x1bcc
+
+#ifdef CONFIG_RISCV_M_MODE
+#define APLIC_xMSICFGADDR		APLIC_MMSICFGADDR
+#define APLIC_xMSICFGADDRH		APLIC_MMSICFGADDRH
+#else
+#define APLIC_xMSICFGADDR		APLIC_SMSICFGADDR
+#define APLIC_xMSICFGADDRH		APLIC_SMSICFGADDRH
+#endif
+
+#define APLIC_xMSICFGADDRH_L		BIT(31)
+#define APLIC_xMSICFGADDRH_HHXS_MASK	0x1f
+#define APLIC_xMSICFGADDRH_HHXS_SHIFT	24
+#define APLIC_xMSICFGADDRH_LHXS_MASK	0x7
+#define APLIC_xMSICFGADDRH_LHXS_SHIFT	20
+#define APLIC_xMSICFGADDRH_HHXW_MASK	0x7
+#define APLIC_xMSICFGADDRH_HHXW_SHIFT	16
+#define APLIC_xMSICFGADDRH_LHXW_MASK	0xf
+#define APLIC_xMSICFGADDRH_LHXW_SHIFT	12
+#define APLIC_xMSICFGADDRH_BAPPN_MASK	0xfff
+
+#define APLIC_xMSICFGADDR_PPN_SHIFT	12
+
+#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
+	(BIT(__lhxs) - 1)
+
+#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
+	(BIT(__lhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
+	((__lhxs))
+#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
+	(APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
+	 APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
+
+#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
+	(BIT(__hhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
+	((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
+#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
+	(APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
+	 APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
+
+#define APLIC_SETIP_BASE		0x1c00
+#define APLIC_SETIPNUM			0x1cdc
+
+#define APLIC_CLRIP_BASE		0x1d00
+#define APLIC_CLRIPNUM			0x1ddc
+
+#define APLIC_SETIE_BASE		0x1e00
+#define APLIC_SETIENUM			0x1edc
+
+#define APLIC_CLRIE_BASE		0x1f00
+#define APLIC_CLRIENUM			0x1fdc
+
+#define APLIC_SETIPNUM_LE		0x2000
+#define APLIC_SETIPNUM_BE		0x2004
+
+#define APLIC_GENMSI			0x3000
+
+#define APLIC_TARGET_BASE		0x3004
+#define APLIC_TARGET_HART_IDX_SHIFT	18
+#define APLIC_TARGET_HART_IDX_MASK	0x3fff
+#define APLIC_TARGET_GUEST_IDX_SHIFT	12
+#define APLIC_TARGET_GUEST_IDX_MASK	0x3f
+#define APLIC_TARGET_IPRIO_MASK	0xff
+#define APLIC_TARGET_EIID_MASK	0x7ff
+
+#define APLIC_IDC_BASE			0x4000
+#define APLIC_IDC_SIZE			32
+
+#define APLIC_IDC_IDELIVERY		0x00
+
+#define APLIC_IDC_IFORCE		0x04
+
+#define APLIC_IDC_ITHRESHOLD		0x08
+
+#define APLIC_IDC_TOPI			0x18
+#define APLIC_IDC_TOPI_ID_SHIFT	16
+#define APLIC_IDC_TOPI_ID_MASK	0x3ff
+#define APLIC_IDC_TOPI_PRIO_MASK	0xff
+
+#define APLIC_IDC_CLAIMI		0x1c
+
+#endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread
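
Both modes of the driver above ultimately encode the same per-source target
register: aplic_set_affinity() writes a hart index and priority in direct
(IDC) mode, while aplic_msi_write_msg() writes a hart index, guest index and
external interrupt ID in MSI mode. A small standalone sketch of the two
encodings, using the masks from riscv-aplic.h (illustrative only; the helper
names are hypothetical):

#include <stdint.h>

#define APLIC_TARGET_HART_IDX_SHIFT	18
#define APLIC_TARGET_HART_IDX_MASK	0x3fff
#define APLIC_TARGET_GUEST_IDX_SHIFT	12
#define APLIC_TARGET_GUEST_IDX_MASK	0x3f
#define APLIC_TARGET_IPRIO_MASK		0xff
#define APLIC_TARGET_EIID_MASK		0x7ff

/* Direct (IDC) mode: target selects a hart index and an interrupt priority */
uint32_t aplic_target_direct(uint32_t hart_index, uint32_t prio)
{
	return ((hart_index & APLIC_TARGET_HART_IDX_MASK)
			<< APLIC_TARGET_HART_IDX_SHIFT) |
	       (prio & APLIC_TARGET_IPRIO_MASK);
}

/* MSI mode: target selects a hart index, guest index and external irq ID */
uint32_t aplic_target_msi(uint32_t hart_index, uint32_t guest_index,
			  uint32_t eiid)
{
	return ((hart_index & APLIC_TARGET_HART_IDX_MASK)
			<< APLIC_TARGET_HART_IDX_SHIFT) |
	       ((guest_index & APLIC_TARGET_GUEST_IDX_MASK)
			<< APLIC_TARGET_GUEST_IDX_SHIFT) |
	       (eiid & APLIC_TARGET_EIID_MASK);
}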

* [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
@ 2023-01-03 14:14   ` Anup Patel
  0 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

The RISC-V advanced interrupt architecture (AIA) specification defines
a new interrupt controller for managing wired interrupts on a RISC-V
platform. This new interrupt controller is referred to as the advanced
platform-level interrupt controller (APLIC), which can forward wired
interrupts to CPUs (or HARTs) either as local interrupts or as message
signaled interrupts.
(For more details, refer to https://github.com/riscv/riscv-aia)

This patch adds an irqchip driver for the RISC-V APLIC found on RISC-V
platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 drivers/irqchip/Kconfig             |   6 +
 drivers/irqchip/Makefile            |   1 +
 drivers/irqchip/irq-riscv-aplic.c   | 670 ++++++++++++++++++++++++++++
 include/linux/irqchip/riscv-aplic.h | 117 +++++
 4 files changed, 794 insertions(+)
 create mode 100644 drivers/irqchip/irq-riscv-aplic.c
 create mode 100644 include/linux/irqchip/riscv-aplic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index a1315189a595..936e59fe1f99 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -547,6 +547,12 @@ config SIFIVE_PLIC
 	select IRQ_DOMAIN_HIERARCHY
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
 
+config RISCV_APLIC
+	bool
+	depends on RISCV
+	select IRQ_DOMAIN_HIERARCHY
+	select GENERIC_MSI_IRQ_DOMAIN
+
 config RISCV_IMSIC
 	bool
 	depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 22c723cc6ec8..6154e5bc4228 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)			+= irq-qcom-mpm.o
 obj-$(CONFIG_CSKY_MPINTC)		+= irq-csky-mpintc.o
 obj-$(CONFIG_CSKY_APB_INTC)		+= irq-csky-apb-intc.o
 obj-$(CONFIG_RISCV_INTC)		+= irq-riscv-intc.o
+obj-$(CONFIG_RISCV_APLIC)		+= irq-riscv-aplic.o
 obj-$(CONFIG_RISCV_IMSIC)		+= irq-riscv-imsic.o
 obj-$(CONFIG_SIFIVE_PLIC)		+= irq-sifive-plic.o
 obj-$(CONFIG_IMX_IRQSTEER)		+= irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
new file mode 100644
index 000000000000..63f20892d7d3
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic.c
@@ -0,0 +1,670 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/smp.h>
+
+#define APLIC_DEFAULT_PRIORITY		1
+#define APLIC_DISABLE_IDELIVERY		0
+#define APLIC_ENABLE_IDELIVERY		1
+#define APLIC_DISABLE_ITHRESHOLD	1
+#define APLIC_ENABLE_ITHRESHOLD		0
+
+struct aplic_msicfg {
+	phys_addr_t		base_ppn;
+	u32			hhxs;
+	u32			hhxw;
+	u32			lhxs;
+	u32			lhxw;
+};
+
+struct aplic_idc {
+	unsigned int		hart_index;
+	void __iomem		*regs;
+	struct aplic_priv	*priv;
+};
+
+struct aplic_priv {
+	struct device		*dev;
+	u32			nr_irqs;
+	u32			nr_idcs;
+	void __iomem		*regs;
+	struct irq_domain	*irqdomain;
+	struct aplic_msicfg	msicfg;
+	struct cpumask		lmask;
+};
+
+static unsigned int aplic_idc_parent_irq;
+static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
+
+static void aplic_irq_unmask(struct irq_data *d)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	writel(d->hwirq, priv->regs + APLIC_SETIENUM);
+
+	if (!priv->nr_idcs)
+		irq_chip_unmask_parent(d);
+}
+
+static void aplic_irq_mask(struct irq_data *d)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
+
+	if (!priv->nr_idcs)
+		irq_chip_mask_parent(d);
+}
+
+static int aplic_set_type(struct irq_data *d, unsigned int type)
+{
+	u32 val = 0;
+	void __iomem *sourcecfg;
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+	switch (type) {
+	case IRQ_TYPE_NONE:
+		val = APLIC_SOURCECFG_SM_INACTIVE;
+		break;
+	case IRQ_TYPE_LEVEL_LOW:
+		val = APLIC_SOURCECFG_SM_LEVEL_LOW;
+		break;
+	case IRQ_TYPE_LEVEL_HIGH:
+		val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
+		break;
+	case IRQ_TYPE_EDGE_FALLING:
+		val = APLIC_SOURCECFG_SM_EDGE_FALL;
+		break;
+	case IRQ_TYPE_EDGE_RISING:
+		val = APLIC_SOURCECFG_SM_EDGE_RISE;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
+	sourcecfg += (d->hwirq - 1) * sizeof(u32);
+	writel(val, sourcecfg);
+
+	return 0;
+}
+
+#ifdef CONFIG_SMP
+static int aplic_set_affinity(struct irq_data *d,
+			      const struct cpumask *mask_val, bool force)
+{
+	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+	struct aplic_idc *idc;
+	unsigned int cpu, val;
+	struct cpumask amask;
+	void __iomem *target;
+
+	if (!priv->nr_idcs)
+		return irq_chip_set_affinity_parent(d, mask_val, force);
+
+	cpumask_and(&amask, &priv->lmask, mask_val);
+
+	if (force)
+		cpu = cpumask_first(&amask);
+	else
+		cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	idc = per_cpu_ptr(&aplic_idcs, cpu);
+	target = priv->regs + APLIC_TARGET_BASE;
+	target += (d->hwirq - 1) * sizeof(u32);
+	val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+	val <<= APLIC_TARGET_HART_IDX_SHIFT;
+	val |= APLIC_DEFAULT_PRIORITY;
+	writel(val, target);
+
+	irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+	return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip aplic_chip = {
+	.name		= "RISC-V APLIC",
+	.irq_mask	= aplic_irq_mask,
+	.irq_unmask	= aplic_irq_unmask,
+	.irq_set_type	= aplic_set_type,
+#ifdef CONFIG_SMP
+	.irq_set_affinity = aplic_set_affinity,
+#endif
+	.flags		= IRQCHIP_SET_TYPE_MASKED |
+			  IRQCHIP_SKIP_SET_WAKE |
+			  IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_irqdomain_translate(struct irq_domain *d,
+				     struct irq_fwspec *fwspec,
+				     unsigned long *hwirq,
+				     unsigned int *type)
+{
+	if (WARN_ON(fwspec->param_count < 2))
+		return -EINVAL;
+	if (WARN_ON(!fwspec->param[0]))
+		return -EINVAL;
+
+	*hwirq = fwspec->param[0];
+	*type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+
+	WARN_ON(*type == IRQ_TYPE_NONE);
+
+	return 0;
+}
+
+static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *arg)
+{
+	int i, ret;
+	unsigned int type;
+	irq_hw_number_t hwirq;
+	struct irq_fwspec *fwspec = arg;
+	struct aplic_priv *priv = platform_msi_get_host_data(domain);
+
+	ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
+	if (ret)
+		return ret;
+
+	ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &aplic_chip, priv);
+
+	return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
+	.translate	= aplic_irqdomain_translate,
+	.alloc		= aplic_irqdomain_msi_alloc,
+	.free		= platform_msi_device_domain_free,
+};
+
+static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *arg)
+{
+	int i, ret;
+	unsigned int type;
+	irq_hw_number_t hwirq;
+	struct irq_fwspec *fwspec = arg;
+	struct aplic_priv *priv = domain->host_data;
+
+	ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_domain_set_info(domain, virq + i, hwirq + i,
+				    &aplic_chip, priv, handle_simple_irq,
+				    NULL, NULL);
+		irq_set_affinity(virq + i, &priv->lmask);
+	}
+
+	return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
+	.translate	= aplic_irqdomain_translate,
+	.alloc		= aplic_irqdomain_idc_alloc,
+	.free		= irq_domain_free_irqs_top,
+};
+
+static void aplic_init_hw_irqs(struct aplic_priv *priv)
+{
+	int i;
+
+	/* Disable all interrupts */
+	for (i = 0; i <= priv->nr_irqs; i += 32)
+		writel(-1U, priv->regs + APLIC_CLRIE_BASE +
+			    (i / 32) * sizeof(u32));
+
+	/* Set interrupt type and default priority for all interrupts */
+	for (i = 1; i <= priv->nr_irqs; i++) {
+		writel(0, priv->regs + APLIC_SOURCECFG_BASE +
+			  (i - 1) * sizeof(u32));
+		writel(APLIC_DEFAULT_PRIORITY,
+		       priv->regs + APLIC_TARGET_BASE +
+		       (i - 1) * sizeof(u32));
+	}
+
+	/* Clear APLIC domaincfg */
+	writel(0, priv->regs + APLIC_DOMAINCFG);
+}
+
+static void aplic_init_hw_global(struct aplic_priv *priv)
+{
+	u32 val;
+#ifdef CONFIG_RISCV_M_MODE
+	u32 valH;
+
+	if (!priv->nr_idcs) {
+		val = priv->msicfg.base_ppn;
+		valH = (priv->msicfg.base_ppn >> 32) &
+			APLIC_xMSICFGADDRH_BAPPN_MASK;
+		valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
+			<< APLIC_xMSICFGADDRH_LHXW_SHIFT;
+		valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
+			<< APLIC_xMSICFGADDRH_HHXW_SHIFT;
+		valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
+			<< APLIC_xMSICFGADDRH_LHXS_SHIFT;
+		valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
+			<< APLIC_xMSICFGADDRH_HHXS_SHIFT;
+		writel(val, priv->regs + APLIC_xMSICFGADDR);
+		writel(valH, priv->regs + APLIC_xMSICFGADDRH);
+	}
+#endif
+
+	/* Setup APLIC domaincfg register */
+	val = readl(priv->regs + APLIC_DOMAINCFG);
+	val |= APLIC_DOMAINCFG_IE;
+	if (!priv->nr_idcs)
+		val |= APLIC_DOMAINCFG_DM;
+	writel(val, priv->regs + APLIC_DOMAINCFG);
+	if (readl(priv->regs + APLIC_DOMAINCFG) != val)
+		dev_warn(priv->dev,
+			 "unable to write 0x%x in domaincfg\n", val);
+}
+
+static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+	unsigned int group_index, hart_index, guest_index, val;
+	struct device *dev = msi_desc_to_dev(desc);
+	struct aplic_priv *priv = dev_get_drvdata(dev);
+	struct irq_data *d = irq_get_irq_data(desc->irq);
+	struct aplic_msicfg *mc = &priv->msicfg;
+	phys_addr_t tppn, tbppn, msg_addr;
+	void __iomem *target;
+
+	/* For zeroed MSI, simply write zero into the target register */
+	if (!msg->address_hi && !msg->address_lo && !msg->data) {
+		target = priv->regs + APLIC_TARGET_BASE;
+		target += (d->hwirq - 1) * sizeof(u32);
+		writel(0, target);
+		return;
+	}
+
+	/* Sanity check on message data */
+	WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
+
+	/* Compute target MSI address */
+	msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
+	tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+
+	/* Compute target HART Base PPN */
+	tbppn = tppn;
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+	tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+	WARN_ON(tbppn != mc->base_ppn);
+
+	/* Compute target group and hart indexes */
+	group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
+		     APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
+	hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
+		     APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
+	hart_index |= (group_index << mc->lhxw);
+	WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
+
+	/* Compute target guest index */
+	guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
+
+	/* Update IRQ TARGET register */
+	target = priv->regs + APLIC_TARGET_BASE;
+	target += (d->hwirq - 1) * sizeof(u32);
+	val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
+				<< APLIC_TARGET_HART_IDX_SHIFT;
+	val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
+				<< APLIC_TARGET_GUEST_IDX_SHIFT;
+	val |= (msg->data & APLIC_TARGET_EIID_MASK);
+	writel(val, target);
+}
+
+static int aplic_setup_msi(struct aplic_priv *priv)
+{
+	struct device *dev = priv->dev;
+	struct aplic_msicfg *mc = &priv->msicfg;
+	const struct imsic_global_config *imsic_global;
+
+	/*
+	 * The APLIC outgoing MSI config registers assume the target MSI
+	 * controller to be a RISC-V AIA IMSIC controller.
+	 */
+	imsic_global = imsic_get_global_config();
+	if (!imsic_global) {
+		dev_err(dev, "IMSIC global config not found\n");
+		return -ENODEV;
+	}
+
+	/* Find number of guest index bits (LHXS) */
+	mc->lhxs = imsic_global->guest_index_bits;
+	if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
+		dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
+		return -EINVAL;
+	}
+
+	/* Find number of HART index bits (LHXW) */
+	mc->lhxw = imsic_global->hart_index_bits;
+	if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
+		dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
+		return -EINVAL;
+	}
+
+	/* Find number of group index bits (HHXW) */
+	mc->hhxw = imsic_global->group_index_bits;
+	if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
+		dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
+		return -EINVAL;
+	}
+
+	/* Find first bit position of group index (HHXS) */
+	mc->hhxs = imsic_global->group_index_shift;
+	if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
+		dev_err(dev, "IMSIC group index shift should be >= %d\n",
+			(2 * APLIC_xMSICFGADDR_PPN_SHIFT));
+		return -EINVAL;
+	}
+	mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
+	if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
+		dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
+		return -EINVAL;
+	}
+
+	/* Compute PPN base */
+	mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+	mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+
+	/* Use all possible CPUs as lmask */
+	cpumask_copy(&priv->lmask, cpu_possible_mask);
+
+	return 0;
+}
+
+/*
+ * To handle APLIC IDC interrupts, we just read the CLAIMI register,
+ * which returns the highest priority pending interrupt and clears its
+ * pending bit. This process is repeated until the CLAIMI register
+ * returns zero.
+ */
+static void aplic_idc_handle_irq(struct irq_desc *desc)
+{
+	struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	irq_hw_number_t hw_irq;
+	int irq;
+
+	chained_irq_enter(chip, desc);
+
+	while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
+		hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
+		irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
+
+		if (unlikely(irq <= 0))
+			pr_warn_ratelimited("hw_irq %lu mapping not found\n",
+					    hw_irq);
+		else
+			generic_handle_irq(irq);
+	}
+
+	chained_irq_exit(chip, desc);
+}
+
+static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
+{
+	u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
+	u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
+
+	/* Priority must be less than threshold for interrupt triggering */
+	writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
+
+	/* Delivery must be set to 1 for interrupt triggering */
+	writel(de, idc->regs + APLIC_IDC_IDELIVERY);
+}
+
+static int aplic_idc_dying_cpu(unsigned int cpu)
+{
+	if (aplic_idc_parent_irq)
+		disable_percpu_irq(aplic_idc_parent_irq);
+
+	return 0;
+}
+
+static int aplic_idc_starting_cpu(unsigned int cpu)
+{
+	if (aplic_idc_parent_irq)
+		enable_percpu_irq(aplic_idc_parent_irq,
+				  irq_get_trigger_type(aplic_idc_parent_irq));
+
+	return 0;
+}
+
+static int aplic_setup_idc(struct aplic_priv *priv)
+{
+	int i, j, rc, cpu, setup_count = 0;
+	struct device_node *node = priv->dev->of_node;
+	struct device *dev = priv->dev;
+	struct of_phandle_args parent;
+	struct irq_domain *domain;
+	unsigned long hartid;
+	struct aplic_idc *idc;
+	u32 val;
+
+	/* Setup per-CPU IDC and target CPU mask */
+	for (i = 0; i < priv->nr_idcs; i++) {
+		if (of_irq_parse_one(node, i, &parent)) {
+			dev_err(dev, "failed to parse parent for IDC%d.\n",
+				i);
+			return -EIO;
+		}
+
+		/* Skip IDCs which do not connect to external interrupts */
+		if (parent.args[0] != RV_IRQ_EXT)
+			continue;
+
+		rc = riscv_of_parent_hartid(parent.np, &hartid);
+		if (rc) {
+			dev_err(dev, "failed to parse hart ID for IDC%d.\n",
+				i);
+			return rc;
+		}
+
+		cpu = riscv_hartid_to_cpuid(hartid);
+		if (cpu < 0) {
+			dev_warn(dev, "invalid cpuid for IDC%d\n", i);
+			continue;
+		}
+
+		cpumask_set_cpu(cpu, &priv->lmask);
+
+		idc = per_cpu_ptr(&aplic_idcs, cpu);
+		WARN_ON(idc->priv);
+
+		idc->hart_index = i;
+		idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
+		idc->priv = priv;
+
+		aplic_idc_set_delivery(idc, true);
+
+		/*
+		 * The boot CPU might not have APLIC hart_index = 0, so check
+		 * and update the target registers of all interrupts.
+		 */
+		if (cpu == smp_processor_id() && idc->hart_index) {
+			val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+			val <<= APLIC_TARGET_HART_IDX_SHIFT;
+			val |= APLIC_DEFAULT_PRIORITY;
+			for (j = 1; j <= priv->nr_irqs; j++)
+				writel(val, priv->regs + APLIC_TARGET_BASE +
+					    (j - 1) * sizeof(u32));
+		}
+
+		setup_count++;
+	}
+
+	/* Find parent domain and register chained handler */
+	domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+					  DOMAIN_BUS_ANY);
+	if (!aplic_idc_parent_irq && domain) {
+		aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+		if (aplic_idc_parent_irq) {
+			irq_set_chained_handler(aplic_idc_parent_irq,
+						aplic_idc_handle_irq);
+
+			/*
+			 * Setup CPUHP notifier to enable IDC parent
+			 * interrupt on all CPUs
+			 */
+			cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+					  "irqchip/riscv/aplic:starting",
+					  aplic_idc_starting_cpu,
+					  aplic_idc_dying_cpu);
+		}
+	}
+
+	/* Fail if we were not able to setup IDC for any CPU */
+	return (setup_count) ? 0 : -ENODEV;
+}
+
+static int aplic_probe(struct platform_device *pdev)
+{
+	struct device_node *node = pdev->dev.of_node;
+	struct device *dev = &pdev->dev;
+	struct aplic_priv *priv;
+	struct resource *regs;
+	phys_addr_t pa;
+	int rc;
+
+	regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!regs) {
+		dev_err(dev, "cannot find registers resource\n");
+		return -ENOENT;
+	}
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+	platform_set_drvdata(pdev, priv);
+	priv->dev = dev;
+
+	priv->regs = devm_ioremap(dev, regs->start, resource_size(regs));
+	if (WARN_ON(!priv->regs)) {
+		dev_err(dev, "failed ioremap registers\n");
+		return -EIO;
+	}
+
+	of_property_read_u32(node, "riscv,num-sources", &priv->nr_irqs);
+	if (!priv->nr_irqs) {
+		dev_err(dev, "failed to get number of interrupt sources\n");
+		return -EINVAL;
+	}
+
+	/* Setup initial state of APLIC interrupts */
+	aplic_init_hw_irqs(priv);
+
+	/*
+	 * Setup IDCs or MSIs based on parent interrupts in DT node
+	 *
+	 * If "msi-parent" DT property is present then we ignore the
+	 * APLIC IDCs which forces the APLIC driver to use MSI mode.
+	 */
+	priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
+			0 : of_irq_count(node);
+	if (priv->nr_idcs)
+		rc = aplic_setup_idc(priv);
+	else
+		rc = aplic_setup_msi(priv);
+	if (rc)
+		return rc;
+
+	/* Setup global config and interrupt delivery */
+	aplic_init_hw_global(priv);
+
+	/* Create irq domain instance for the APLIC */
+	if (priv->nr_idcs)
+		priv->irqdomain = irq_domain_create_linear(
+						of_node_to_fwnode(node),
+						priv->nr_irqs + 1,
+						&aplic_irqdomain_idc_ops,
+						priv);
+	else
+		priv->irqdomain = platform_msi_create_device_domain(dev,
+						priv->nr_irqs + 1,
+						aplic_msi_write_msg,
+						&aplic_irqdomain_msi_ops,
+						priv);
+	if (!priv->irqdomain) {
+		dev_err(dev, "failed to add irq domain\n");
+		return -ENOMEM;
+	}
+
+	/* Advertise the interrupt controller */
+	if (priv->nr_idcs) {
+		dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
+			 priv->nr_irqs, priv->nr_idcs);
+	} else {
+		pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
+		dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
+			 priv->nr_irqs, &pa);
+	}
+
+	return 0;
+}
+
+static int aplic_remove(struct platform_device *pdev)
+{
+	struct aplic_priv *priv = platform_get_drvdata(pdev);
+
+	irq_domain_remove(priv->irqdomain);
+
+	return 0;
+}
+
+static const struct of_device_id aplic_match[] = {
+	{ .compatible = "riscv,aplic" },
+	{}
+};
+
+static struct platform_driver aplic_driver = {
+	.driver = {
+		.name		= "riscv-aplic",
+		.of_match_table	= aplic_match,
+	},
+	.probe = aplic_probe,
+	.remove = aplic_remove,
+};
+
+static int __init aplic_init(void)
+{
+	return platform_driver_register(&aplic_driver);
+}
+core_initcall(aplic_init);
diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
new file mode 100644
index 000000000000..88177eefd411
--- /dev/null
+++ b/include/linux/irqchip/riscv-aplic.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
+#define __LINUX_IRQCHIP_RISCV_APLIC_H
+
+#include <linux/bitops.h>
+
+#define APLIC_MAX_IDC			BIT(14)
+#define APLIC_MAX_SOURCE		1024
+
+#define APLIC_DOMAINCFG			0x0000
+#define APLIC_DOMAINCFG_RDONLY		0x80000000
+#define APLIC_DOMAINCFG_IE		BIT(8)
+#define APLIC_DOMAINCFG_DM		BIT(2)
+#define APLIC_DOMAINCFG_BE		BIT(0)
+
+#define APLIC_SOURCECFG_BASE		0x0004
+#define APLIC_SOURCECFG_D		BIT(10)
+#define APLIC_SOURCECFG_CHILDIDX_MASK	0x000003ff
+#define APLIC_SOURCECFG_SM_MASK	0x00000007
+#define APLIC_SOURCECFG_SM_INACTIVE	0x0
+#define APLIC_SOURCECFG_SM_DETACH	0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE	0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL	0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH	0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW	0x7
+
+#define APLIC_MMSICFGADDR		0x1bc0
+#define APLIC_MMSICFGADDRH		0x1bc4
+#define APLIC_SMSICFGADDR		0x1bc8
+#define APLIC_SMSICFGADDRH		0x1bcc
+
+#ifdef CONFIG_RISCV_M_MODE
+#define APLIC_xMSICFGADDR		APLIC_MMSICFGADDR
+#define APLIC_xMSICFGADDRH		APLIC_MMSICFGADDRH
+#else
+#define APLIC_xMSICFGADDR		APLIC_SMSICFGADDR
+#define APLIC_xMSICFGADDRH		APLIC_SMSICFGADDRH
+#endif
+
+#define APLIC_xMSICFGADDRH_L		BIT(31)
+#define APLIC_xMSICFGADDRH_HHXS_MASK	0x1f
+#define APLIC_xMSICFGADDRH_HHXS_SHIFT	24
+#define APLIC_xMSICFGADDRH_LHXS_MASK	0x7
+#define APLIC_xMSICFGADDRH_LHXS_SHIFT	20
+#define APLIC_xMSICFGADDRH_HHXW_MASK	0x7
+#define APLIC_xMSICFGADDRH_HHXW_SHIFT	16
+#define APLIC_xMSICFGADDRH_LHXW_MASK	0xf
+#define APLIC_xMSICFGADDRH_LHXW_SHIFT	12
+#define APLIC_xMSICFGADDRH_BAPPN_MASK	0xfff
+
+#define APLIC_xMSICFGADDR_PPN_SHIFT	12
+
+#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
+	(BIT(__lhxs) - 1)
+
+#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
+	(BIT(__lhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
+	((__lhxs))
+#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
+	(APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
+	 APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
+
+#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
+	(BIT(__hhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
+	((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
+#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
+	(APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
+	 APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
+
+#define APLIC_SETIP_BASE		0x1c00
+#define APLIC_SETIPNUM			0x1cdc
+
+#define APLIC_CLRIP_BASE		0x1d00
+#define APLIC_CLRIPNUM			0x1ddc
+
+#define APLIC_SETIE_BASE		0x1e00
+#define APLIC_SETIENUM			0x1edc
+
+#define APLIC_CLRIE_BASE		0x1f00
+#define APLIC_CLRIENUM			0x1fdc
+
+#define APLIC_SETIPNUM_LE		0x2000
+#define APLIC_SETIPNUM_BE		0x2004
+
+#define APLIC_GENMSI			0x3000
+
+#define APLIC_TARGET_BASE		0x3004
+#define APLIC_TARGET_HART_IDX_SHIFT	18
+#define APLIC_TARGET_HART_IDX_MASK	0x3fff
+#define APLIC_TARGET_GUEST_IDX_SHIFT	12
+#define APLIC_TARGET_GUEST_IDX_MASK	0x3f
+#define APLIC_TARGET_IPRIO_MASK	0xff
+#define APLIC_TARGET_EIID_MASK	0x7ff
+
+#define APLIC_IDC_BASE			0x4000
+#define APLIC_IDC_SIZE			32
+
+#define APLIC_IDC_IDELIVERY		0x00
+
+#define APLIC_IDC_IFORCE		0x04
+
+#define APLIC_IDC_ITHRESHOLD		0x08
+
+#define APLIC_IDC_TOPI			0x18
+#define APLIC_IDC_TOPI_ID_SHIFT	16
+#define APLIC_IDC_TOPI_ID_MASK	0x3ff
+#define APLIC_IDC_TOPI_PRIO_MASK	0xff
+
+#define APLIC_IDC_CLAIMI		0x1c
+
+#endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v2 8/9] RISC-V: Select APLIC and IMSIC drivers
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

The QEMU virt machine supports AIA emulation, and we also have
quite a few RISC-V platforms with AIA support under development,
so let us select the APLIC and IMSIC drivers for all RISC-V platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 arch/riscv/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d153e1cd890b..616a27e43827 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -127,6 +127,8 @@ config RISCV
 	select OF_IRQ
 	select PCI_DOMAINS_GENERIC if PCI
 	select PCI_MSI if PCI
+	select RISCV_APLIC
+	select RISCV_IMSIC
 	select RISCV_INTC
 	select RISCV_TIMER if RISCV_SBI
 	select SIFIVE_PLIC
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v2 9/9] MAINTAINERS: Add entry for RISC-V AIA drivers
  2023-01-03 14:14 ` Anup Patel
@ 2023-01-03 14:14   ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-03 14:14 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree, Anup Patel

Add myself as maintainer for the RISC-V AIA drivers, including the
RISC-V INTC driver, which supports both AIA and non-AIA platforms.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 MAINTAINERS | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7f86d02cb427..c5b8eda0780e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17942,6 +17942,18 @@ F:	drivers/perf/riscv_pmu.c
 F:	drivers/perf/riscv_pmu_legacy.c
 F:	drivers/perf/riscv_pmu_sbi.c
 
+RISC-V AIA DRIVERS
+M:	Anup Patel <anup@brainfault.org>
+L:	linux-riscv@lists.infradead.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
+F:	Documentation/devicetree/bindings/interrupt-controller/riscv,imsic.yaml
+F:	drivers/irqchip/irq-riscv-aplic.c
+F:	drivers/irqchip/irq-riscv-imsic.c
+F:	drivers/irqchip/irq-riscv-intc.c
+F:	include/linux/irqchip/riscv-aplic.h
+F:	include/linux/irqchip/riscv-imsic.h
+
 RISC-V ARCHITECTURE
 M:	Paul Walmsley <paul.walmsley@sifive.com>
 M:	Palmer Dabbelt <palmer@dabbelt.com>
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-01-03 14:14   ` Anup Patel
@ 2023-01-04 22:16     ` Conor Dooley
  -1 siblings, 0 replies; 72+ messages in thread
From: Conor Dooley @ 2023-01-04 22:16 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel, devicetree


Hey Anup,

On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> We add DT bindings document for RISC-V advanced platform level
> interrupt controller (APLIC) defined by the RISC-V advanced
> interrupt architecture (AIA) specification.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
>  1 file changed, 159 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

> +  interrupts-extended:
> +    minItems: 1
> +    maxItems: 16384
> +    description:
> +      Given APLIC domain directly injects external interrupts to a set of
> +      RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
> +      node, which has a riscv node (i.e. RISC-V HART) as parent.
> +
> +  msi-parent:
> +    description:
> +      Given APLIC domain forwards wired interrupts as MSIs to a AIA incoming
> +      message signaled interrupt controller (IMSIC). This property should be
> +      considered only when the interrupts-extended property is absent.

Considered by what?
On v1 you said:
<quote>
If both "interrupts-extended" and "msi-parent" are present then it means
the APLIC domain supports both MSI mode and Direct mode in HW. In this
case, the APLIC driver has to choose between MSI mode or Direct mode.
<\quote>

The description is still pretty ambiguous IMO. Perhaps incorporate
some of that expanded comment into the property description?
Say, "If both foo and bar are present, the APLIC domain has hardware
support for both MSI and direct mode. Software may then choose either
mode".
Have I misunderstood your comment on v1? It read as if having both
present indicated that both were possible & that "should be considered
only..." was more of a suggestion and a comment about the Linux driver's
behaviour.
Apologies if I have misunderstood, but I suppose if I have then the
binding's description could be improved!!
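
For reference, the v2 driver in this series resolves that choice in
aplic_probe() (see the APLIC driver patch): whenever "msi-parent" is
present, the IDCs are ignored and the APLIC is driven in MSI mode.

	/* From aplic_probe(): "msi-parent" forces MSI mode */
	priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
			0 : of_irq_count(node);
	if (priv->nr_idcs)
		rc = aplic_setup_idc(priv);	/* Direct mode */
	else
		rc = aplic_setup_msi(priv);	/* MSI mode */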

> +  riscv,children:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024
> +    items:
> +      maxItems: 1
> +    description:
> +      A list of child APLIC domains for the given APLIC domain. Each child
> +      APLIC domain is assigned child index in increasing order with the

btw, missing article before child (& a comma after order I think).

> +      first child APLIC domain assigned child index 0. The APLIC domain
> +      child index is used by firmware to delegate interrupts from the
> +      given APLIC domain to a particular child APLIC domain.
> +
> +  riscv,delegate:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024

Is it valid to have a delegate property without children? If not, the
binding should reflect that dependency IMO.

> +    items:
> +      items:
> +        - description: child APLIC domain phandle
> +        - description: first interrupt number (inclusive)
> +        - description: last interrupt number (inclusive)
> +    description:
> +      A interrupt delegation list where each entry is a triple consisting
> +      of child APLIC domain phandle, first interrupt number, and last
> +      interrupt number. The firmware will configure interrupt delegation

btw, drop the article before firmware here.
Also, "firmware will" or "firmware must"? Semantics perhaps, but they
are different!

Kinda for my own curiosity here, but do you expect these properties to
generally be dynamically filled in by the bootloader or read by the
bootloader to set up the configuration?

> +      registers based on interrupt delegation list.

I'm sorry Anup, but this child versus delegate thing is still not clear
to me, binding-wise. See below.

> +    aplic0: interrupt-controller@c000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      interrupts-extended = <&cpu1_intc 11>,
> +                            <&cpu2_intc 11>,
> +                            <&cpu3_intc 11>,
> +                            <&cpu4_intc 11>;
> +      reg = <0xc000000 0x4080>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +      riscv,children = <&aplic1>, <&aplic2>;
> +      riscv,delegate = <&aplic1 1 63>;

Is aplic2 here for demonstrative purposes only, since it has not been
delegated any interrupts?
I suppose it is hardware present on the SoC that is not being used by
the current configuration?

Thanks,
Conor.

> +    };
> +
> +    aplic1: interrupt-controller@d000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      interrupts-extended = <&cpu1_intc 9>,
> +                            <&cpu2_intc 9>;
> +      reg = <0xd000000 0x4080>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +    };
> +
> +    aplic2: interrupt-controller@e000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      interrupts-extended = <&cpu3_intc 9>,
> +                            <&cpu4_intc 9>;
> +      reg = <0xe000000 0x4080>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +    };



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/9] RISC-V: Add AIA related CSR defines
  2023-01-03 14:14   ` Anup Patel
@ 2023-01-04 23:07     ` Conor Dooley
  -1 siblings, 0 replies; 72+ messages in thread
From: Conor Dooley @ 2023-01-04 23:07 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel, devicetree


Hey Anup!

On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
> The RISC-V AIA specification improves handling per-HART local interrupts
> in a backward compatible manner. This patch adds defines for new RISC-V
> AIA CSRs.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  arch/riscv/include/asm/csr.h | 92 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 92 insertions(+)
> 
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index 0e571f6483d9..4e1356bad7b2 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -73,7 +73,10 @@
>  #define IRQ_S_EXT		9
>  #define IRQ_VS_EXT		10
>  #define IRQ_M_EXT		11
> +#define IRQ_S_GEXT		12
>  #define IRQ_PMU_OVF		13
> +#define IRQ_LOCAL_MAX		(IRQ_PMU_OVF + 1)
> +#define IRQ_LOCAL_MASK		((_AC(1, UL) << IRQ_LOCAL_MAX) - 1)
>  
>  /* Exception causes */
>  #define EXC_INST_MISALIGNED	0
> @@ -156,6 +159,26 @@
>  				 (_AC(1, UL) << IRQ_S_TIMER) | \
>  				 (_AC(1, UL) << IRQ_S_EXT))
>  
> +/* AIA CSR bits */
> +#define TOPI_IID_SHIFT		16
> +#define TOPI_IID_MASK		0xfff
> +#define TOPI_IPRIO_MASK		0xff
> +#define TOPI_IPRIO_BITS		8
> +
> +#define TOPEI_ID_SHIFT		16
> +#define TOPEI_ID_MASK		0x7ff
> +#define TOPEI_PRIO_MASK		0x7ff
> +
> +#define ISELECT_IPRIO0		0x30
> +#define ISELECT_IPRIO15		0x3f
> +#define ISELECT_MASK		0x1ff
> +
> +#define HVICTL_VTI		0x40000000
> +#define HVICTL_IID		0x0fff0000
> +#define HVICTL_IID_SHIFT	16
> +#define HVICTL_IPRIOM		0x00000100
> +#define HVICTL_IPRIO		0x000000ff

Why not name these as masks, like you did for the other masks?
Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
to be used *pre*-shift.
Some consistency in naming and function would be great.


> +/* Machine-Level High-Half CSRs (AIA) */
> +#define CSR_MIDELEGH		0x313

I feel like I could find Midelegh in an Irish dictionary lol
Anyways, I went through the CSRs and they do all seem correct.

Thanks,
Conor.




^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-01-03 14:14   ` Anup Patel
@ 2023-01-04 23:21     ` Conor Dooley
  -1 siblings, 0 replies; 72+ messages in thread
From: Conor Dooley @ 2023-01-04 23:21 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel, devicetree


Hey Anup,

On Tue, Jan 03, 2023 at 07:44:04PM +0530, Anup Patel wrote:
> We add DT bindings document for the RISC-V incoming MSI controller
> (IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
> specification.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> +  The IMSIC is a per-CPU (or per-HART) device with separate interrupt file
> +  for each privilege level (machine or supervisor).

> +  The device tree of a RISC-V platform will have one IMSIC device tree node
> +  for each privilege level (machine or supervisor) which collectively describe
> +  IMSIC interrupt files at that privilege level across CPUs (or HARTs).

> +examples:
> +  - |
> +    // Example 1 (Machine-level IMSIC files with just one group):
> +
> +    imsic_mlevel: interrupt-controller@24000000 {
> +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> +      interrupts-extended = <&cpu1_intc 11>,
> +                            <&cpu2_intc 11>,
> +                            <&cpu3_intc 11>,
> +                            <&cpu4_intc 11>;
> +      reg = <0x28000000 0x4000>;
> +      interrupt-controller;
> +      #interrupt-cells = <0>;
> +      msi-controller;
> +      riscv,num-ids = <127>;
> +    };
> +
> +  - |
> +    // Example 2 (Supervisor-level IMSIC files with two groups):
> +
> +    imsic_slevel: interrupt-controller@28000000 {
> +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> +      interrupts-extended = <&cpu1_intc 9>,
> +                            <&cpu2_intc 9>,
> +                            <&cpu3_intc 9>,
> +                            <&cpu4_intc 9>;
> +      reg = <0x28000000 0x2000>, /* Group0 IMSICs */
> +            <0x29000000 0x2000>; /* Group1 IMSICs */
> +      interrupt-controller;
> +      #interrupt-cells = <0>;
> +      msi-controller;
> +      riscv,num-ids = <127>;
> +      riscv,group-index-bits = <1>;
> +      riscv,group-index-shift = <24>;
> +    };

How is, say, Linux meant to know which of the per-level IMSIC DT nodes
applies to it?
I had a quick look in the driver, but could see no mechanism for it.
Apologies if I missed something obvious!

Thanks,
Conor.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/9] RISC-V: Add AIA related CSR defines
  2023-01-04 23:07     ` Conor Dooley
@ 2023-01-09  5:09       ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-09  5:09 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Atish Patra,
	Alistair Francis, linux-riscv, linux-kernel, devicetree

On Thu, Jan 5, 2023 at 4:37 AM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Anup!
>
> On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
> > The RISC-V AIA specification improves handling per-HART local interrupts
> > in a backward compatible manner. This patch adds defines for new RISC-V
> > AIA CSRs.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  arch/riscv/include/asm/csr.h | 92 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 92 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> > index 0e571f6483d9..4e1356bad7b2 100644
> > --- a/arch/riscv/include/asm/csr.h
> > +++ b/arch/riscv/include/asm/csr.h
> > @@ -73,7 +73,10 @@
> >  #define IRQ_S_EXT            9
> >  #define IRQ_VS_EXT           10
> >  #define IRQ_M_EXT            11
> > +#define IRQ_S_GEXT           12
> >  #define IRQ_PMU_OVF          13
> > +#define IRQ_LOCAL_MAX                (IRQ_PMU_OVF + 1)
> > +#define IRQ_LOCAL_MASK               ((_AC(1, UL) << IRQ_LOCAL_MAX) - 1)
> >
> >  /* Exception causes */
> >  #define EXC_INST_MISALIGNED  0
> > @@ -156,6 +159,26 @@
> >                                (_AC(1, UL) << IRQ_S_TIMER) | \
> >                                (_AC(1, UL) << IRQ_S_EXT))
> >
> > +/* AIA CSR bits */
> > +#define TOPI_IID_SHIFT               16
> > +#define TOPI_IID_MASK                0xfff
> > +#define TOPI_IPRIO_MASK              0xff
> > +#define TOPI_IPRIO_BITS              8
> > +
> > +#define TOPEI_ID_SHIFT               16
> > +#define TOPEI_ID_MASK                0x7ff
> > +#define TOPEI_PRIO_MASK              0x7ff
> > +
> > +#define ISELECT_IPRIO0               0x30
> > +#define ISELECT_IPRIO15              0x3f
> > +#define ISELECT_MASK         0x1ff
> > +
> > +#define HVICTL_VTI           0x40000000
> > +#define HVICTL_IID           0x0fff0000
> > +#define HVICTL_IID_SHIFT     16
> > +#define HVICTL_IPRIOM                0x00000100
> > +#define HVICTL_IPRIO         0x000000ff
>
> Why not name these as masks, like you did for the other masks?
> Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
> intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
> to be used *pre*-shift.
> Some consistency in naming and function would be great.

The following convention is followed in asm/csr.h for defining the
MASK of any XYZ field in an ABC CSR:
1. ABC_XYZ: this name is used for a MASK which is intended
   to be used before the SHIFT
2. ABC_XYZ_MASK: this name is used for a MASK which is
   intended to be used after the SHIFT

The existing defines for [M|S]STATUS, HSTATUS, SATP, and xENVCFG
follow the above convention. The only outlier is HGATPx_VMID_MASK
define which I will fix in my next KVM RISC-V series.

I don't see how any of the AIA CSR defines are violating the above
convention.

The choice of ABC_XYZ versus ABC_XYZ_MASK name is up to
the developer as long as the above convention is not violated.
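
As an illustrative sketch only (not from this patch), the two conventions
are used like this, assuming the csr_read() helper and the CSR_HVICTL /
CSR_STOPI defines from this series:

	unsigned long iid;

	/* ABC_XYZ: mask applied to the raw CSR value, before the shift */
	iid = (csr_read(CSR_HVICTL) & HVICTL_IID) >> HVICTL_IID_SHIFT;

	/* ABC_XYZ_MASK: mask applied after the shift */
	iid = (csr_read(CSR_STOPI) >> TOPI_IID_SHIFT) & TOPI_IID_MASK;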

>
>
> > +/* Machine-Level High-Half CSRs (AIA) */
> > +#define CSR_MIDELEGH         0x313
>
> I feel like I could find Midelegh in an Irish dictionary lol
> Anyways, I went through the CSRs and they do all seem correct.
>
> Thanks,
> Conor.
>
>

Regards,
Anup

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-01-03 14:14   ` Anup Patel
@ 2023-01-12 20:49     ` Rob Herring
  -1 siblings, 0 replies; 72+ messages in thread
From: Rob Herring @ 2023-01-12 20:49 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Krzysztof Kozlowski, Atish Patra, Alistair Francis, Anup Patel,
	linux-riscv, linux-kernel, devicetree

On Tue, Jan 03, 2023 at 07:44:04PM +0530, Anup Patel wrote:
> We add DT bindings document for the RISC-V incoming MSI controller
> (IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
> specification.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  .../interrupt-controller/riscv,imsics.yaml    | 168 ++++++++++++++++++
>  1 file changed, 168 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> 
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> new file mode 100644
> index 000000000000..b9db03b6e95f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> @@ -0,0 +1,168 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: RISC-V Incoming MSI Controller (IMSIC)
> +
> +maintainers:
> +  - Anup Patel <anup@brainfault.org>
> +
> +description: |
> +  The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
> +  MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
> +  AIA specification can be found at https://github.com/riscv/riscv-aia.
> +
> +  The IMSIC is a per-CPU (or per-HART) device with separate interrupt file
> +  for each privilege level (machine or supervisor). The configuration of
> +  a IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO
> +  space to receive MSIs from devices. Each IMSIC interrupt file supports a
> +  fixed number of interrupt identities (to distinguish MSIs from devices)
> +  which is same for given privilege level across CPUs (or HARTs).
> +
> +  The device tree of a RISC-V platform will have one IMSIC device tree node
> +  for each privilege level (machine or supervisor) which collectively describe
> +  IMSIC interrupt files at that privilege level across CPUs (or HARTs).
> +
> +  The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
> +  follows a particular scheme defined by the RISC-V AIA specification. A IMSIC
> +  group is a set of IMSIC interrupt files co-located in MMIO space and we can
> +  have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
> +  RISC-V platform. The MSI target address of a IMSIC interrupt file at given
> +  privilege level (machine or supervisor) encodes group index, HART index,
> +  and guest index (shown below).
> +
> +  XLEN-1           >=24                                 12    0
> +  |                  |                                  |     |
> +  -------------------------------------------------------------
> +  |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
> +  -------------------------------------------------------------
> +
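
(As an illustrative aside, not part of the binding: with the fields above,
the MMIO address of a single interrupt file could be composed as sketched
below, where guest_index_bits and group_index_shift come from the riscv,*
properties and 12 is the 4KB interrupt-file stride. The helper name is
hypothetical.)

	/* Illustrative sketch: compose an IMSIC interrupt-file address */
	static u64 imsic_file_addr(u64 base_addr, u32 group, u32 hart,
				   u32 guest, u32 guest_index_bits,
				   u32 group_index_shift)
	{
		return base_addr |
		       ((u64)group << group_index_shift) |
		       ((u64)hart << (guest_index_bits + 12)) |
		       ((u64)guest << 12);
	}
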
> +allOf:
> +  - $ref: /schemas/interrupt-controller.yaml#
> +  - $ref: /schemas/interrupt-controller/msi-controller.yaml#
> +
> +properties:
> +  compatible:
> +    items:
> +      - enum:
> +          - riscv,qemu-imsics

The implementation/vendor is qemu, so: qemu,imsics (or qemu,riscv-imsics?)

> +      - const: riscv,imsics
> +
> +  reg:
> +    minItems: 1
> +    maxItems: 16384
> +    description:
> +      Base address of each IMSIC group.
> +
> +  interrupt-controller: true
> +
> +  "#interrupt-cells":
> +    const: 0
> +
> +  msi-controller: true
> +
> +  interrupts-extended:
> +    minItems: 1
> +    maxItems: 16384
> +    description:
> +      This property represents the set of CPUs (or HARTs) for which given
> +      device tree node describes the IMSIC interrupt files. Each node pointed
> +      to should be a riscv,cpu-intc node, which has a riscv node (i.e. RISC-V
> +      HART) as parent.
> +
> +  riscv,num-ids:
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +    minimum: 63
> +    maximum: 2047
> +    description:
> +      Number of interrupt identities supported by IMSIC interrupt file.
> +
> +  riscv,num-guest-ids:
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +    minimum: 63
> +    maximum: 2047
> +    description:
> +      Number of interrupt identities are supported by IMSIC guest interrupt
> +      file. When not specified it is assumed to be same as specified by the
> +      riscv,num-ids property.
> +
> +  riscv,guest-index-bits:
> +    minimum: 0
> +    maximum: 7
> +    default: 0
> +    description:
> +      Number of guest index bits in the MSI target address. When not
> +      specified it is assumed to be 0.

No need to repeat what 'default: 0' defines.

> +
> +  riscv,hart-index-bits:
> +    minimum: 0
> +    maximum: 15
> +    description:
> +      Number of HART index bits in the MSI target address. When not
> +      specified it is estimated based on the interrupts-extended property.

If guessing works, why do you need the property? Perhaps 
s/estimated/calculated/?

> +
> +  riscv,group-index-bits:
> +    minimum: 0
> +    maximum: 7
> +    default: 0
> +    description:
> +      Number of group index bits in the MSI target address. When not
> +      specified it is assumed to be 0.
> +
> +  riscv,group-index-shift:
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +    minimum: 0
> +    maximum: 55
> +    default: 24
> +    description:
> +      The least significant bit position of the group index bits in the
> +      MSI target address. When not specified it is assumed to be 24.
> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupt-controller
> +  - msi-controller

#msi-cells should be defined (as 0) and required. Best to be explicit 
and not rely on the default.

> +  - interrupts-extended
> +  - riscv,num-ids
> +
> +unevaluatedProperties: false
> +
> +examples:
> +  - |
> +    // Example 1 (Machine-level IMSIC files with just one group):
> +
> +    imsic_mlevel: interrupt-controller@24000000 {

Drop unused labels.

> +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> +      interrupts-extended = <&cpu1_intc 11>,
> +                            <&cpu2_intc 11>,
> +                            <&cpu3_intc 11>,
> +                            <&cpu4_intc 11>;
> +      reg = <0x28000000 0x4000>;
> +      interrupt-controller;
> +      #interrupt-cells = <0>;
> +      msi-controller;
> +      riscv,num-ids = <127>;
> +    };
> +
> +  - |
> +    // Example 2 (Supervisor-level IMSIC files with two groups):
> +
> +    imsic_slevel: interrupt-controller@28000000 {
> +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> +      interrupts-extended = <&cpu1_intc 9>,
> +                            <&cpu2_intc 9>,
> +                            <&cpu3_intc 9>,
> +                            <&cpu4_intc 9>;
> +      reg = <0x28000000 0x2000>, /* Group0 IMSICs */
> +            <0x29000000 0x2000>; /* Group1 IMSICs */
> +      interrupt-controller;
> +      #interrupt-cells = <0>;
> +      msi-controller;
> +      riscv,num-ids = <127>;
> +      riscv,group-index-bits = <1>;
> +      riscv,group-index-shift = <24>;
> +    };
> +...
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-01-03 14:14   ` Anup Patel
@ 2023-01-12 21:02     ` Rob Herring
  -1 siblings, 0 replies; 72+ messages in thread
From: Rob Herring @ 2023-01-12 21:02 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Krzysztof Kozlowski, Atish Patra, Alistair Francis, Anup Patel,
	linux-riscv, linux-kernel, devicetree

On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> We add DT bindings document for RISC-V advanced platform level
> interrupt controller (APLIC) defined by the RISC-V advanced
> interrupt architecture (AIA) specification.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
>  1 file changed, 159 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> 
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> new file mode 100644
> index 000000000000..b7f20aad72c2
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> @@ -0,0 +1,159 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
> +
> +maintainers:
> +  - Anup Patel <anup@brainfault.org>
> +
> +description:
> +  The RISC-V advanced interrupt architecture (AIA) defines an advanced
> +  platform level interrupt controller (APLIC) for handling wired interrupts
> +  in a RISC-V platform. The RISC-V AIA specification can be found at
> +  https://github.com/riscv/riscv-aia.
> +
> +  The RISC-V APLIC is implemented as hierarchical APLIC domains where all
> +  interrupt sources connect to the root domain which can further delegate
> +  interrupts to child domains. There is one device tree node for each APLIC
> +  domain.
> +
> +allOf:
> +  - $ref: /schemas/interrupt-controller.yaml#
> +
> +properties:
> +  compatible:
> +    items:
> +      - enum:
> +          - riscv,qemu-aplic

Make 'qemu' the vendor.

> +      - const: riscv,aplic
> +
> +  reg:
> +    maxItems: 1
> +
> +  interrupt-controller: true
> +
> +  "#interrupt-cells":
> +    const: 2
> +
> +  interrupts-extended:
> +    minItems: 1
> +    maxItems: 16384
> +    description:
> +      Given APLIC domain directly injects external interrupts to a set of
> +      RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
> +      node, which has a riscv node (i.e. RISC-V HART) as parent.
> +
> +  msi-parent:
> +    description:
> +      Given APLIC domain forwards wired interrupts as MSIs to a AIA incoming
> +      message signaled interrupt controller (IMSIC). This property should be
> +      considered only when the interrupts-extended property is absent.
> +
> +  riscv,num-sources:
> +    $ref: /schemas/types.yaml#/definitions/uint32
> +    minimum: 1
> +    maximum: 1023
> +    description:
> +      Specifies how many wired interrupts are supported by this APLIC domain.

We don't normally need to know how many interrupts, so why here?

> +
> +  riscv,children:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024
> +    items:
> +      maxItems: 1
> +    description:
> +      A list of child APLIC domains for the given APLIC domain. Each child
> +      APLIC domain is assigned child index in increasing order with the
> +      first child APLIC domain assigned child index 0. The APLIC domain
> +      child index is used by firmware to delegate interrupts from the
> +      given APLIC domain to a particular child APLIC domain.
> +
> +  riscv,delegate:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024
> +    items:
> +      items:
> +        - description: child APLIC domain phandle
> +        - description: first interrupt number (inclusive)
> +        - description: last interrupt number (inclusive)
> +    description:
> +      A interrupt delegation list where each entry is a triple consisting
> +      of child APLIC domain phandle, first interrupt number, and last
> +      interrupt number. The firmware will configure interrupt delegation
> +      registers based on interrupt delegation list.

Is the node's domain delegating its interrupts to the child domain, or 
the other way around? And are the interrupt numbers here this domain's 
or the child's?

> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupt-controller
> +  - "#interrupt-cells"
> +  - riscv,num-sources
> +
> +unevaluatedProperties: false
> +
> +examples:
> +  - |
> +    // Example 1 (APLIC domains directly injecting interrupt to HARTs):
> +
> +    aplic0: interrupt-controller@c000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      interrupts-extended = <&cpu1_intc 11>,
> +                            <&cpu2_intc 11>,
> +                            <&cpu3_intc 11>,
> +                            <&cpu4_intc 11>;
> +      reg = <0xc000000 0x4080>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +      riscv,children = <&aplic1>, <&aplic2>;
> +      riscv,delegate = <&aplic1 1 63>;
> +    };
> +
> +    aplic1: interrupt-controller@d000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      interrupts-extended = <&cpu1_intc 9>,
> +                            <&cpu2_intc 9>;
> +      reg = <0xd000000 0x4080>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +    };
> +
> +    aplic2: interrupt-controller@e000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      interrupts-extended = <&cpu3_intc 9>,
> +                            <&cpu4_intc 9>;
> +      reg = <0xe000000 0x4080>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +    };
> +
> +  - |
> +    // Example 2 (APLIC domains forwarding interrupts as MSIs):
> +
> +    aplic3: interrupt-controller@c000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      msi-parent = <&imsic_mlevel>;
> +      reg = <0xc000000 0x4000>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +      riscv,children = <&aplic4>;
> +      riscv,delegate = <&aplic4 1 63>;
> +    };
> +
> +    aplic4: interrupt-controller@d000000 {
> +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> +      msi-parent = <&imsic_slevel>;
> +      reg = <0xd000000 0x4000>;
> +      interrupt-controller;
> +      #interrupt-cells = <2>;
> +      riscv,num-sources = <63>;
> +    };
> +...
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 3/9] irqchip/riscv-intc: Add support for RISC-V AIA
  2023-01-03 14:14   ` Anup Patel
@ 2023-01-13  9:39     ` Marc Zyngier
  -1 siblings, 0 replies; 72+ messages in thread
From: Marc Zyngier @ 2023-01-13  9:39 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Rob Herring,
	Krzysztof Kozlowski, Atish Patra, Alistair Francis, Anup Patel,
	linux-riscv, linux-kernel, devicetree

On Tue, 03 Jan 2023 14:14:03 +0000,
Anup Patel <apatel@ventanamicro.com> wrote:
> 
> The RISC-V advanced interrupt architecture (AIA) extends the per-HART
> local interrupts in following ways:
> 1. Minimum 64 local interrupts for both RV32 and RV64
> 2. Ability to process multiple pending local interrupts in same
>    interrupt handler
> 3. Priority configuration for each local interrupts
> 4. Special CSRs to configure/access the per-HART MSI controller
> 
> This patch adds support for RISC-V AIA in the RISC-V intc driver.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/irq-riscv-intc.c | 37 ++++++++++++++++++++++++++------
>  1 file changed, 31 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
> index f229e3e66387..880d1639aadc 100644
> --- a/drivers/irqchip/irq-riscv-intc.c
> +++ b/drivers/irqchip/irq-riscv-intc.c
> @@ -16,6 +16,7 @@
>  #include <linux/module.h>
>  #include <linux/of.h>
>  #include <linux/smp.h>
> +#include <asm/hwcap.h>
>  
>  static struct irq_domain *intc_domain;
>  
> @@ -29,6 +30,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
>  	generic_handle_domain_irq(intc_domain, cause);
>  }
>  
> +static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)

What does "static asmlinkage" in a C file even mean? And clearly, this
isn't the only instance in this file...
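
For illustration, a minimal sketch of the alternative (assuming the
handler is only registered through set_handle_irq() and is therefore
only ever called from C, where asmlinkage buys nothing):

/* sketch: plain static is enough for a handler only called from C */
static void riscv_intc_aia_irq(struct pt_regs *regs)
{
        unsigned long topi;

        while ((topi = csr_read(CSR_TOPI)))
                generic_handle_domain_irq(intc_domain,
                                          topi >> TOPI_IID_SHIFT);
}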

> +{
> +	unsigned long topi;
> +
> +	while ((topi = csr_read(CSR_TOPI)))
> +		generic_handle_domain_irq(intc_domain,
> +					  topi >> TOPI_IID_SHIFT);
> +}
> +
>  /*
>   * On RISC-V systems local interrupts are masked or unmasked by writing
>   * the SIE (Supervisor Interrupt Enable) CSR.  As CSRs can only be written
> @@ -38,12 +48,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
>  
>  static void riscv_intc_irq_mask(struct irq_data *d)
>  {
> -	csr_clear(CSR_IE, BIT(d->hwirq));
> +	if (d->hwirq < BITS_PER_LONG)

And what if BITS_PER_LONG is 32, as I expect it to be on 32-bit, which
the commit message says is supported?

> +		csr_clear(CSR_IE, BIT(d->hwirq));
> +	else
> +		csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
>  }
>  
>  static void riscv_intc_irq_unmask(struct irq_data *d)
>  {
> -	csr_set(CSR_IE, BIT(d->hwirq));
> +	if (d->hwirq < BITS_PER_LONG)
> +		csr_set(CSR_IE, BIT(d->hwirq));
> +	else
> +		csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
>  }
>  
>  static void riscv_intc_irq_eoi(struct irq_data *d)
> @@ -115,7 +131,7 @@ static struct fwnode_handle *riscv_intc_hwnode(void)
>  static int __init riscv_intc_init(struct device_node *node,
>  				  struct device_node *parent)
>  {
> -	int rc;
> +	int rc, nr_irqs;
>  	unsigned long hartid;
>  
>  	rc = riscv_of_parent_hartid(node, &hartid);
> @@ -133,14 +149,21 @@ static int __init riscv_intc_init(struct device_node *node,
>  	if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
>  		return 0;
>  
> -	intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
> +	nr_irqs = BITS_PER_LONG;
> +	if (riscv_isa_extension_available(NULL, SxAIA) && BITS_PER_LONG == 32)
> +		nr_irqs = nr_irqs * 2;

Really, please drop this BITS_PER_LONG stuff. Use explicit numbers.
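
For example, an untested sketch with the widths spelled out (the define
name is a placeholder, and it assumes CSR_IEH is only defined for rv32
builds):

/* AIA provides 64 local interrupt ids regardless of xlen */
#define RISCV_INTC_AIA_NR_IRQS	64

static void riscv_intc_irq_mask(struct irq_data *d)
{
#ifdef CONFIG_32BIT
        if (d->hwirq >= 32) {
                csr_clear(CSR_IEH, BIT(d->hwirq - 32));
                return;
        }
#endif
        csr_clear(CSR_IE, BIT(d->hwirq));
}

riscv_intc_init() would then size the domain with RISCV_INTC_AIA_NR_IRQS
when SxAIA is available instead of doubling BITS_PER_LONG.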

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
  2023-01-03 14:14   ` Anup Patel
@ 2023-01-13 10:10     ` Marc Zyngier
  -1 siblings, 0 replies; 72+ messages in thread
From: Marc Zyngier @ 2023-01-13 10:10 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Rob Herring,
	Krzysztof Kozlowski, Atish Patra, Alistair Francis, Anup Patel,
	linux-riscv, linux-kernel, devicetree

On Tue, 03 Jan 2023 14:14:05 +0000,
Anup Patel <apatel@ventanamicro.com> wrote:
> 
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new MSI controller for managing MSIs on a RISC-V platform. This new
> MSI controller is referred to as incoming message signaled interrupt
> controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> (For more details refer https://github.com/riscv/riscv-aia)

And how about IPIs, which this driver seems to be concerned about?

> 
> This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> platforms.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/Kconfig             |   14 +-
>  drivers/irqchip/Makefile            |    1 +
>  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
>  include/linux/irqchip/riscv-imsic.h |   92 +++
>  4 files changed, 1280 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
>  create mode 100644 include/linux/irqchip/riscv-imsic.h
> 
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 9e65345ca3f6..a1315189a595 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -29,7 +29,6 @@ config ARM_GIC_V2M
>  
>  config GIC_NON_BANKED
>  	bool
> -
>  config ARM_GIC_V3
>  	bool
>  	select IRQ_DOMAIN_HIERARCHY
> @@ -548,6 +547,19 @@ config SIFIVE_PLIC
>  	select IRQ_DOMAIN_HIERARCHY
>  	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>  
> +config RISCV_IMSIC
> +	bool
> +	depends on RISCV
> +	select IRQ_DOMAIN_HIERARCHY
> +	select GENERIC_MSI_IRQ_DOMAIN
> +
> +config RISCV_IMSIC_PCI
> +	bool
> +	depends on RISCV_IMSIC
> +	depends on PCI
> +	depends on PCI_MSI
> +	default RISCV_IMSIC

This should definitely tell you that this driver needs splitting.

> +
>  config EXYNOS_IRQ_COMBINER
>  	bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
>  	depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 87b49a10962c..22c723cc6ec8 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)			+= irq-qcom-mpm.o
>  obj-$(CONFIG_CSKY_MPINTC)		+= irq-csky-mpintc.o
>  obj-$(CONFIG_CSKY_APB_INTC)		+= irq-csky-apb-intc.o
>  obj-$(CONFIG_RISCV_INTC)		+= irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_IMSIC)		+= irq-riscv-imsic.o
>  obj-$(CONFIG_SIFIVE_PLIC)		+= irq-sifive-plic.o
>  obj-$(CONFIG_IMX_IRQSTEER)		+= irq-imx-irqsteer.o
>  obj-$(CONFIG_IMX_INTMUX)		+= irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> new file mode 100644
> index 000000000000..4c16b66738d6
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic.c
> @@ -0,0 +1,1174 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/bitmap.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +#include <asm/hwcap.h>
> +
> +#define IMSIC_DISABLE_EIDELIVERY	0
> +#define IMSIC_ENABLE_EIDELIVERY		1
> +#define IMSIC_DISABLE_EITHRESHOLD	1
> +#define IMSIC_ENABLE_EITHRESHOLD	0
> +
> +#define imsic_csr_write(__c, __v)	\
> +do {					\
> +	csr_write(CSR_ISELECT, __c);	\
> +	csr_write(CSR_IREG, __v);	\
> +} while (0)
> +
> +#define imsic_csr_read(__c)		\
> +({					\
> +	unsigned long __v;		\
> +	csr_write(CSR_ISELECT, __c);	\
> +	__v = csr_read(CSR_IREG);	\
> +	__v;				\
> +})
> +
> +#define imsic_csr_set(__c, __v)		\
> +do {					\
> +	csr_write(CSR_ISELECT, __c);	\
> +	csr_set(CSR_IREG, __v);		\
> +} while (0)
> +
> +#define imsic_csr_clear(__c, __v)	\
> +do {					\
> +	csr_write(CSR_ISELECT, __c);	\
> +	csr_clear(CSR_IREG, __v);	\
> +} while (0)
> +
> +struct imsic_mmio {
> +	phys_addr_t pa;
> +	void __iomem *va;
> +	unsigned long size;
> +};
> +
> +struct imsic_priv {
> +	/* Global configuration common for all HARTs */
> +	struct imsic_global_config global;
> +
> +	/* MMIO regions */
> +	u32 num_mmios;
> +	struct imsic_mmio *mmios;
> +
> +	/* Global state of interrupt identities */
> +	raw_spinlock_t ids_lock;
> +	unsigned long *ids_used_bimap;
> +	unsigned long *ids_enabled_bimap;
> +	unsigned int *ids_target_cpu;
> +
> +	/* Mask for connected CPUs */
> +	struct cpumask lmask;
> +
> +	/* IPI interrupt identity */
> +	u32 ipi_id;
> +	u32 ipi_lsync_id;
> +
> +	/* IRQ domains */
> +	struct irq_domain *base_domain;
> +	struct irq_domain *pci_domain;
> +	struct irq_domain *plat_domain;
> +};
> +
> +struct imsic_handler {
> +	/* Local configuration for given HART */
> +	struct imsic_local_config local;
> +
> +	/* Pointer to private context */
> +	struct imsic_priv *priv;
> +};
> +
> +static bool imsic_init_done;
> +
> +static int imsic_parent_irq;
> +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> +
> +const struct imsic_global_config *imsic_get_global_config(void)
> +{
> +	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +
> +	if (!handler || !handler->priv)
> +		return NULL;
> +
> +	return &handler->priv->global;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> +
> +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> +{
> +	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +	if (!handler || !handler->priv)
> +		return NULL;

How can this happen?

> +
> +	return &handler->local;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_local_config);

Why are these symbols exported? They have no user, so they shouldn't
even exist here. I also seriously doubt there is a valid use case for
exposing this information to the rest of the kernel.

> +
> +static int imsic_cpu_page_phys(unsigned int cpu,
> +			       unsigned int guest_index,
> +			       phys_addr_t *out_msi_pa)
> +{
> +	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +	struct imsic_global_config *global;
> +	struct imsic_local_config *local;
> +
> +	if (!handler || !handler->priv)
> +		return -ENODEV;
> +	local = &handler->local;
> +	global = &handler->priv->global;
> +
> +	if (BIT(global->guest_index_bits) <= guest_index)
> +		return -EINVAL;
> +
> +	if (out_msi_pa)
> +		*out_msi_pa = local->msi_pa +
> +			      (guest_index * IMSIC_MMIO_PAGE_SZ);
> +
> +	return 0;
> +}
> +
> +static int imsic_get_cpu(struct imsic_priv *priv,
> +			 const struct cpumask *mask_val, bool force,
> +			 unsigned int *out_target_cpu)
> +{
> +	struct cpumask amask;
> +	unsigned int cpu;
> +
> +	cpumask_and(&amask, &priv->lmask, mask_val);
> +
> +	if (force)
> +		cpu = cpumask_first(&amask);
> +	else
> +		cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> +	if (cpu >= nr_cpu_ids)
> +		return -EINVAL;
> +
> +	if (out_target_cpu)
> +		*out_target_cpu = cpu;
> +
> +	return 0;
> +}
> +
> +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> +				 struct msi_msg *msg)
> +{
> +	phys_addr_t msi_addr;
> +	int err;
> +
> +	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +	if (err)
> +		return err;
> +
> +	msg->address_hi = upper_32_bits(msi_addr);
> +	msg->address_lo = lower_32_bits(msi_addr);
> +	msg->data = id;
> +
> +	return err;
> +}
> +
> +static void imsic_id_set_target(struct imsic_priv *priv,
> +				 unsigned int id, unsigned int target_cpu)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	priv->ids_target_cpu[id] = target_cpu;
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> +					unsigned int id)
> +{
> +	unsigned int ret;
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	ret = priv->ids_target_cpu[id];
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	return ret;
> +}
> +
> +static void __imsic_eix_update(unsigned long base_id,
> +			       unsigned long num_id, bool pend, bool val)
> +{
> +	unsigned long i, isel, ireg, flags;
> +	unsigned long id = base_id, last_id = base_id + num_id;
> +
> +	while (id < last_id) {
> +		isel = id / BITS_PER_LONG;
> +		isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> +		isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> +
> +		ireg = 0;
> +		for (i = id & (__riscv_xlen - 1);
> +		     (id < last_id) && (i < __riscv_xlen); i++) {
> +			ireg |= BIT(i);
> +			id++;
> +		}
> +
> +		/*
> +		 * The IMSIC EIEx and EIPx registers are indirectly
> +		 * accessed via using ISELECT and IREG CSRs so we
> +		 * save/restore local IRQ to ensure that we don't
> +		 * get preempted while accessing IMSIC registers.
> +		 */
> +		local_irq_save(flags);
> +		if (val)
> +			imsic_csr_set(isel, ireg);
> +		else
> +			imsic_csr_clear(isel, ireg);
> +		local_irq_restore(flags);

What is the actual requirement? no preemption? or no interrupts? This
isn't the same thing. Also, a bunch of the users already disable
interrupts. Consistency wouldn't hurt here.
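
One way to make the requirement explicit would be to fold the CSR pair
into a single helper that always runs with interrupts disabled (sketch
only, made-up name):

/*
 * Sketch: the ISELECT write and the following IREG access must not be
 * interleaved with another indirect CSR access on this hart, so do the
 * whole sequence with local interrupts disabled in one place.
 */
static void imsic_eix_write(unsigned long isel, unsigned long mask, bool set)
{
        unsigned long flags;

        local_irq_save(flags);
        csr_write(CSR_ISELECT, isel);
        if (set)
                csr_set(CSR_IREG, mask);
        else
                csr_clear(CSR_IREG, mask);
        local_irq_restore(flags);
}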

> +	}
> +}
> +
> +#define __imsic_id_enable(__id)		\
> +	__imsic_eix_update((__id), 1, false, true)
> +#define __imsic_id_disable(__id)	\
> +	__imsic_eix_update((__id), 1, false, false)
> +
> +#ifdef CONFIG_SMP
> +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> +{
> +	struct imsic_handler *handler;
> +	struct cpumask amask;
> +	int cpu;
> +
> +	cpumask_and(&amask, &priv->lmask, cpu_online_mask);

Can't this race against a CPU going down?

> +	for_each_cpu(cpu, &amask) {
> +		if (cpu == smp_processor_id())
> +			continue;
> +
> +		handler = per_cpu_ptr(&imsic_handlers, cpu);
> +		if (!handler || !handler->priv || !handler->local.msi_va) {
> +			pr_warn("CPU%d: handler not initialized\n", cpu);

How many times are you going to do that? On each failing synchronisation?

> +			continue;
> +		}
> +
> +		writel(handler->priv->ipi_lsync_id, handler->local.msi_va);

As I understand it, this is a "behind the scenes" IPI. Why isn't that
a *real* IPI?

> +	}
> +}
> +#else
> +#define __imsic_id_smp_sync(__priv)
> +#endif
> +
> +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	bitmap_set(priv->ids_enabled_bimap, id, 1);
> +	__imsic_id_enable(id);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	__imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	bitmap_clear(priv->ids_enabled_bimap, id, 1);
> +	__imsic_id_disable(id);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	__imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_ids_local_sync(struct imsic_priv *priv)
> +{
> +	int i;
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	for (i = 1; i <= priv->global.nr_ids; i++) {
> +		if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> +			continue;
> +
> +		if (test_bit(i, priv->ids_enabled_bimap))
> +			__imsic_id_enable(i);
> +		else
> +			__imsic_id_disable(i);
> +	}
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> +{
> +	if (enable) {
> +		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> +		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> +	} else {
> +		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> +		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> +	}
> +}
> +
> +static int imsic_ids_alloc(struct imsic_priv *priv,
> +			   unsigned int max_id, unsigned int order)
> +{
> +	int ret;
> +	unsigned long flags;
> +
> +	if ((priv->global.nr_ids < max_id) ||
> +	    (max_id < BIT(order)))
> +		return -EINVAL;

Why do we need this check? Shouldn't that be guaranteed by
construction?

> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	ret = bitmap_find_free_region(priv->ids_used_bimap,
> +				      max_id + 1, order);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	return ret;
> +}
> +
> +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> +			   unsigned int order)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	bitmap_release_region(priv->ids_used_bimap, base_id, order);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static int __init imsic_ids_init(struct imsic_priv *priv)
> +{
> +	int i;
> +	struct imsic_global_config *global = &priv->global;
> +
> +	raw_spin_lock_init(&priv->ids_lock);
> +
> +	/* Allocate used bitmap */
> +	priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +					sizeof(unsigned long), GFP_KERNEL);

How about bitmap_alloc?
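
i.e. a sketch:

        /* bitmap_zalloc()/bitmap_free() hide the BITS_TO_LONGS arithmetic */
        priv->ids_used_bimap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
        if (!priv->ids_used_bimap)
                return -ENOMEM;

        priv->ids_enabled_bimap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
        if (!priv->ids_enabled_bimap) {
                bitmap_free(priv->ids_used_bimap);
                return -ENOMEM;
        }

with matching bitmap_free() calls in imsic_ids_cleanup().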

> +	if (!priv->ids_used_bimap)
> +		return -ENOMEM;
> +
> +	/* Allocate enabled bitmap */
> +	priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +					   sizeof(unsigned long), GFP_KERNEL);
> +	if (!priv->ids_enabled_bimap) {
> +		kfree(priv->ids_used_bimap);
> +		return -ENOMEM;
> +	}
> +
> +	/* Allocate target CPU array */
> +	priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> +				       sizeof(unsigned int), GFP_KERNEL);
> +	if (!priv->ids_target_cpu) {
> +		kfree(priv->ids_enabled_bimap);
> +		kfree(priv->ids_used_bimap);
> +		return -ENOMEM;
> +	}
> +	for (i = 0; i <= global->nr_ids; i++)
> +		priv->ids_target_cpu[i] = UINT_MAX;
> +
> +	/* Reserve ID#0 because it is special and never implemented */
> +	bitmap_set(priv->ids_used_bimap, 0, 1);
> +
> +	return 0;
> +}
> +
> +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> +{
> +	kfree(priv->ids_target_cpu);
> +	kfree(priv->ids_enabled_bimap);
> +	kfree(priv->ids_used_bimap);
> +}
> +
> +#ifdef CONFIG_SMP
> +static void imsic_ipi_send(unsigned int cpu)
> +{
> +	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +	if (!handler || !handler->priv || !handler->local.msi_va) {
> +		pr_warn("CPU%d: handler not initialized\n", cpu);
> +		return;
> +	}
> +
> +	writel(handler->priv->ipi_id, handler->local.msi_va);
> +}
> +
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +	__imsic_id_enable(priv->ipi_id);
> +	__imsic_id_enable(priv->ipi_lsync_id);
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +	int virq;
> +
> +	/* Allocate interrupt identity for IPIs */
> +	virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +	if (virq < 0)
> +		return virq;
> +	priv->ipi_id = virq;
> +
> +	/* Create IMSIC IPI multiplexing */
> +	virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);

Please! This BITS_PER_BYTE makes zero sense here. Have a proper define
that says 8, and document *why* this is 8! You're not defining a type
system, you're writing an irqchip driver.
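
For example (sketch; the define name and the comment are placeholders
for the real justification):

/*
 * Sketch: one IMSIC id is demultiplexed into this many logical IPIs by
 * the IPI mux; say where the number comes from.
 */
#define IMSIC_NR_IPI	8

        /* both call sites then use the same named constant */
        virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
        riscv_ipi_set_virq_range(virq, IMSIC_NR_IPI, true);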

> +	if (virq <= 0) {
> +		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +		return (virq < 0) ? virq : -ENOMEM;
> +	}
> +
> +	/* Set vIRQ range */
> +	riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> +
> +	/* Allocate interrupt identity for local enable/disable sync */
> +	virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +	if (virq < 0) {
> +		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +		return virq;
> +	}
> +	priv->ipi_lsync_id = virq;
> +
> +	return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +	imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> +	if (priv->ipi_id)
> +		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +}
> +#else
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +	/* Clear the IPI ids because we are not using IPIs */
> +	priv->ipi_id = 0;
> +	priv->ipi_lsync_id = 0;
> +	return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +}
> +#endif
> +
> +static void imsic_irq_mask(struct irq_data *d)
> +{
> +	imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_unmask(struct irq_data *d)
> +{
> +	imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> +				      struct msi_msg *msg)
> +{
> +	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +	unsigned int cpu;
> +	int err;
> +
> +	cpu = imsic_id_get_target(priv, d->hwirq);
> +	WARN_ON(cpu == UINT_MAX);
> +
> +	err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> +	WARN_ON(err);
> +
> +	iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> +}
> +
> +#ifdef CONFIG_SMP
> +static int imsic_irq_set_affinity(struct irq_data *d,
> +				  const struct cpumask *mask_val,
> +				  bool force)
> +{
> +	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +	unsigned int target_cpu;
> +	int rc;
> +
> +	rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> +	if (rc)
> +		return rc;
> +
> +	imsic_id_set_target(priv, d->hwirq, target_cpu);
> +	irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> +
> +	return IRQ_SET_MASK_OK;
> +}
> +#endif
> +
> +static struct irq_chip imsic_irq_base_chip = {
> +	.name			= "RISC-V IMSIC-BASE",
> +	.irq_mask		= imsic_irq_mask,
> +	.irq_unmask		= imsic_irq_unmask,
> +#ifdef CONFIG_SMP
> +	.irq_set_affinity	= imsic_irq_set_affinity,
> +#endif
> +	.irq_compose_msi_msg	= imsic_irq_compose_msi_msg,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE |
> +				  IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> +				  unsigned int virq,
> +				  unsigned int nr_irqs,
> +				  void *args)
> +{
> +	struct imsic_priv *priv = domain->host_data;
> +	msi_alloc_info_t *info = args;
> +	phys_addr_t msi_addr;
> +	int i, hwirq, err = 0;
> +	unsigned int cpu;
> +
> +	err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> +	if (err)
> +		return err;
> +
> +	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +	if (err)
> +		return err;
> +
> +	hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> +				get_count_order(nr_irqs));
> +	if (hwirq < 0)
> +		return hwirq;
> +
> +	err = iommu_dma_prepare_msi(info->desc, msi_addr);
> +	if (err)
> +		goto fail;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		imsic_id_set_target(priv, hwirq + i, cpu);
> +		irq_domain_set_info(domain, virq + i, hwirq + i,
> +				    &imsic_irq_base_chip, priv,
> +				    handle_simple_irq, NULL, NULL);
> +		irq_set_noprobe(virq + i);
> +		irq_set_affinity(virq + i, &priv->lmask);
> +	}
> +
> +	return 0;
> +
> +fail:
> +	imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> +	return err;
> +}
> +
> +static void imsic_irq_domain_free(struct irq_domain *domain,
> +				  unsigned int virq,
> +				  unsigned int nr_irqs)
> +{
> +	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> +	struct imsic_priv *priv = domain->host_data;
> +
> +	imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> +	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> +}
> +
> +static const struct irq_domain_ops imsic_base_domain_ops = {
> +	.alloc		= imsic_irq_domain_alloc,
> +	.free		= imsic_irq_domain_free,
> +};
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +
> +static void imsic_pci_mask_irq(struct irq_data *d)
> +{
> +	pci_msi_mask_irq(d);
> +	irq_chip_mask_parent(d);
> +}
> +
> +static void imsic_pci_unmask_irq(struct irq_data *d)
> +{
> +	pci_msi_unmask_irq(d);
> +	irq_chip_unmask_parent(d);
> +}
> +
> +static struct irq_chip imsic_pci_irq_chip = {
> +	.name			= "RISC-V IMSIC-PCI",
> +	.irq_mask		= imsic_pci_mask_irq,
> +	.irq_unmask		= imsic_pci_unmask_irq,
> +	.irq_eoi		= irq_chip_eoi_parent,
> +};
> +
> +static struct msi_domain_ops imsic_pci_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_pci_domain_info = {
> +	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> +		   MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> +	.ops	= &imsic_pci_domain_ops,
> +	.chip	= &imsic_pci_irq_chip,
> +};
> +
> +#endif
> +
> +static struct irq_chip imsic_plat_irq_chip = {
> +	.name			= "RISC-V IMSIC-PLAT",
> +};
> +
> +static struct msi_domain_ops imsic_plat_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_plat_domain_info = {
> +	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> +	.ops	= &imsic_plat_domain_ops,
> +	.chip	= &imsic_plat_irq_chip,
> +};
> +
> +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> +					 struct fwnode_handle *fwnode)
> +{
> +	/* Create Base IRQ domain */
> +	priv->base_domain = irq_domain_create_tree(fwnode,
> +						&imsic_base_domain_ops, priv);
> +	if (!priv->base_domain) {
> +		pr_err("Failed to create IMSIC base domain\n");
> +		return -ENOMEM;
> +	}
> +	irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +	/* Create PCI MSI domain */
> +	priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> +						&imsic_pci_domain_info,
> +						priv->base_domain);
> +	if (!priv->pci_domain) {
> +		pr_err("Failed to create IMSIC PCI domain\n");
> +		irq_domain_remove(priv->base_domain);
> +		return -ENOMEM;
> +	}
> +#endif
> +
> +	/* Create Platform MSI domain */
> +	priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> +						&imsic_plat_domain_info,
> +						priv->base_domain);
> +	if (!priv->plat_domain) {
> +		pr_err("Failed to create IMSIC platform domain\n");
> +		if (priv->pci_domain)
> +			irq_domain_remove(priv->pci_domain);
> +		irq_domain_remove(priv->base_domain);
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> + * Linux interrupt number and let Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> +	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +	struct irq_chip *chip = irq_desc_get_chip(desc);
> +	struct imsic_priv *priv = handler->priv;
> +	irq_hw_number_t hwirq;
> +	int err;
> +
> +	WARN_ON_ONCE(!handler->priv);
> +
> +	chained_irq_enter(chip, desc);
> +
> +	while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> +		hwirq = hwirq >> TOPEI_ID_SHIFT;
> +
> +		if (hwirq == priv->ipi_id) {
> +#ifdef CONFIG_SMP
> +			ipi_mux_process();
> +#endif
> +			continue;
> +		} else if (hwirq == priv->ipi_lsync_id) {
> +			imsic_ids_local_sync(priv);
> +			continue;
> +		}
> +
> +		err = generic_handle_domain_irq(priv->base_domain, hwirq);
> +		if (unlikely(err))
> +			pr_warn_ratelimited(
> +				"hwirq %lu mapping not found\n", hwirq);
> +	}
> +
> +	chained_irq_exit(chip, desc);
> +}
> +
> +static int imsic_starting_cpu(unsigned int cpu)
> +{
> +	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +	struct imsic_priv *priv = handler->priv;
> +
> +	/* Enable per-CPU parent interrupt */
> +	if (imsic_parent_irq)
> +		enable_percpu_irq(imsic_parent_irq,
> +				  irq_get_trigger_type(imsic_parent_irq));

Shouldn't that be the default already?

> +	else
> +		pr_warn("cpu%d: parent irq not available\n", cpu);

And yet continue in sequence? Duh...
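
A sketch of the alternative, i.e. fail the hotplug callback instead of
warning and limping on (assuming the caller handles the error):

        /* sketch: refuse to bring the CPU up without a parent interrupt */
        if (!imsic_parent_irq) {
                pr_err("cpu%d: parent irq not available\n", cpu);
                return -ENODEV;
        }

        enable_percpu_irq(imsic_parent_irq,
                          irq_get_trigger_type(imsic_parent_irq));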

> +
> +	/* Enable IPIs */
> +	imsic_ipi_enable(priv);
> +
> +	/*
> +	 * Interrupts identities might have been enabled/disabled while
> +	 * this CPU was not running so sync-up local enable/disable state.
> +	 */
> +	imsic_ids_local_sync(priv);
> +
> +	/* Locally enable interrupt delivery */
> +	imsic_ids_local_delivery(priv, true);
> +
> +	return 0;
> +}
> +
> +struct imsic_fwnode_ops {
> +	u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> +			     void *fwopaque);
> +	int (*parent_hartid)(struct fwnode_handle *fwnode,
> +			     void *fwopaque, u32 index,
> +			     unsigned long *out_hartid);
> +	u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> +	int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> +				void *fwopaque, u32 index,
> +				struct resource *res);
> +	void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> +				  void *fwopaque, u32 index);
> +	int (*read_u32)(struct fwnode_handle *fwnode,
> +			void *fwopaque, const char *prop, u32 *out_val);
> +	bool (*read_bool)(struct fwnode_handle *fwnode,
> +			  void *fwopaque, const char *prop);
> +};

Why do we need this sort of (terrible) indirection?
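
For comparison, the generic fwnode property API (linux/property.h)
already abstracts DT vs ACPI, so most of this could be plain direct
calls, e.g. a sketch:

        rc = fwnode_property_read_u32(fwnode, "riscv,num-ids",
                                      &global->nr_ids);
        if (rc) {
                pr_err("%pfwP: number of interrupt identities not found\n",
                        fwnode);
                return rc;
        }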

> +
> +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> +			     struct fwnode_handle *fwnode,
> +			     void *fwopaque)
> +{
> +	struct resource res;
> +	phys_addr_t base_addr;
> +	int rc, nr_parent_irqs;
> +	struct imsic_mmio *mmio;
> +	struct imsic_priv *priv;
> +	struct irq_domain *domain;
> +	struct imsic_handler *handler;
> +	struct imsic_global_config *global;
> +	u32 i, tmp, nr_handlers = 0;
> +
> +	if (imsic_init_done) {
> +		pr_err("%pfwP: already initialized hence ignoring\n",
> +			fwnode);
> +		return -ENODEV;
> +	}
> +
> +	if (!riscv_isa_extension_available(NULL, SxAIA)) {
> +		pr_err("%pfwP: AIA support not available\n", fwnode);
> +		return -ENODEV;
> +	}
> +
> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +	global = &priv->global;
> +
> +	/* Find number of parent interrupts */
> +	nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> +	if (!nr_parent_irqs) {
> +		pr_err("%pfwP: no parent irqs available\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of guest index bits in MSI address */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> +			     &global->guest_index_bits);
> +	if (rc)
> +		global->guest_index_bits = 0;
> +	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> +	if (tmp < global->guest_index_bits) {
> +		pr_err("%pfwP: guest index bits too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of HART index bits */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> +			     &global->hart_index_bits);
> +	if (rc) {
> +		/* Assume default value */
> +		global->hart_index_bits = __fls(nr_parent_irqs);
> +		if (BIT(global->hart_index_bits) < nr_parent_irqs)
> +			global->hart_index_bits++;
> +	}
> +	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +	      global->guest_index_bits;
> +	if (tmp < global->hart_index_bits) {
> +		pr_err("%pfwP: HART index bits too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of group index bits */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> +			     &global->group_index_bits);
> +	if (rc)
> +		global->group_index_bits = 0;
> +	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +	      global->guest_index_bits - global->hart_index_bits;
> +	if (tmp < global->group_index_bits) {
> +		pr_err("%pfwP: group index bits too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Find first bit position of group index.
> +	 * If not specified assumed the default APLIC-IMSIC configuration.
> +	 */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> +			     &global->group_index_shift);
> +	if (rc)
> +		global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> +	tmp = global->group_index_bits + global->group_index_shift - 1;
> +	if (tmp >= BITS_PER_LONG) {
> +		pr_err("%pfwP: group index shift too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of interrupt identities */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> +			     &global->nr_ids);
> +	if (rc) {
> +		pr_err("%pfwP: number of interrupt identities not found\n",
> +			fwnode);
> +		return rc;
> +	}
> +	if ((global->nr_ids < IMSIC_MIN_ID) ||
> +	    (global->nr_ids >= IMSIC_MAX_ID) ||
> +	    ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +		pr_err("%pfwP: invalid number of interrupt identities\n",
> +			fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of guest interrupt identities */
> +	if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> +			    &global->nr_guest_ids))
> +		global->nr_guest_ids = global->nr_ids;
> +	if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> +	    (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> +	    ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +		pr_err("%pfwP: invalid number of guest interrupt identities\n",
> +			fwnode);
> +		return -EINVAL;
> +	}

Please split the whole guest stuff out. It is totally unused!

I've stopped reading. This needs structure, cleanups and a bit of
taste. Not a lot of that here at the moment.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
@ 2023-01-13 10:10     ` Marc Zyngier
  0 siblings, 0 replies; 72+ messages in thread
From: Marc Zyngier @ 2023-01-13 10:10 UTC (permalink / raw)
  To: Anup Patel
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Rob Herring,
	Krzysztof Kozlowski, Atish Patra, Alistair Francis, Anup Patel,
	linux-riscv, linux-kernel, devicetree

On Tue, 03 Jan 2023 14:14:05 +0000,
Anup Patel <apatel@ventanamicro.com> wrote:
> 
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new MSI controller for managing MSIs on a RISC-V platform. This new
> MSI controller is referred to as incoming message signaled interrupt
> controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> (For more details refer https://github.com/riscv/riscv-aia)

And how about IPIs, which this driver seems to be concerned about?

> 
> This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> platforms.
> 
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/Kconfig             |   14 +-
>  drivers/irqchip/Makefile            |    1 +
>  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
>  include/linux/irqchip/riscv-imsic.h |   92 +++
>  4 files changed, 1280 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
>  create mode 100644 include/linux/irqchip/riscv-imsic.h
> 
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 9e65345ca3f6..a1315189a595 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -29,7 +29,6 @@ config ARM_GIC_V2M
>  
>  config GIC_NON_BANKED
>  	bool
> -
>  config ARM_GIC_V3
>  	bool
>  	select IRQ_DOMAIN_HIERARCHY
> @@ -548,6 +547,19 @@ config SIFIVE_PLIC
>  	select IRQ_DOMAIN_HIERARCHY
>  	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>  
> +config RISCV_IMSIC
> +	bool
> +	depends on RISCV
> +	select IRQ_DOMAIN_HIERARCHY
> +	select GENERIC_MSI_IRQ_DOMAIN
> +
> +config RISCV_IMSIC_PCI
> +	bool
> +	depends on RISCV_IMSIC
> +	depends on PCI
> +	depends on PCI_MSI
> +	default RISCV_IMSIC

This should definitely tell you that this driver needs splitting.

> +
>  config EXYNOS_IRQ_COMBINER
>  	bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
>  	depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 87b49a10962c..22c723cc6ec8 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)			+= irq-qcom-mpm.o
>  obj-$(CONFIG_CSKY_MPINTC)		+= irq-csky-mpintc.o
>  obj-$(CONFIG_CSKY_APB_INTC)		+= irq-csky-apb-intc.o
>  obj-$(CONFIG_RISCV_INTC)		+= irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_IMSIC)		+= irq-riscv-imsic.o
>  obj-$(CONFIG_SIFIVE_PLIC)		+= irq-sifive-plic.o
>  obj-$(CONFIG_IMX_IRQSTEER)		+= irq-imx-irqsteer.o
>  obj-$(CONFIG_IMX_INTMUX)		+= irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> new file mode 100644
> index 000000000000..4c16b66738d6
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic.c
> @@ -0,0 +1,1174 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/bitmap.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +#include <asm/hwcap.h>
> +
> +#define IMSIC_DISABLE_EIDELIVERY	0
> +#define IMSIC_ENABLE_EIDELIVERY		1
> +#define IMSIC_DISABLE_EITHRESHOLD	1
> +#define IMSIC_ENABLE_EITHRESHOLD	0
> +
> +#define imsic_csr_write(__c, __v)	\
> +do {					\
> +	csr_write(CSR_ISELECT, __c);	\
> +	csr_write(CSR_IREG, __v);	\
> +} while (0)
> +
> +#define imsic_csr_read(__c)		\
> +({					\
> +	unsigned long __v;		\
> +	csr_write(CSR_ISELECT, __c);	\
> +	__v = csr_read(CSR_IREG);	\
> +	__v;				\
> +})
> +
> +#define imsic_csr_set(__c, __v)		\
> +do {					\
> +	csr_write(CSR_ISELECT, __c);	\
> +	csr_set(CSR_IREG, __v);		\
> +} while (0)
> +
> +#define imsic_csr_clear(__c, __v)	\
> +do {					\
> +	csr_write(CSR_ISELECT, __c);	\
> +	csr_clear(CSR_IREG, __v);	\
> +} while (0)
> +
> +struct imsic_mmio {
> +	phys_addr_t pa;
> +	void __iomem *va;
> +	unsigned long size;
> +};
> +
> +struct imsic_priv {
> +	/* Global configuration common for all HARTs */
> +	struct imsic_global_config global;
> +
> +	/* MMIO regions */
> +	u32 num_mmios;
> +	struct imsic_mmio *mmios;
> +
> +	/* Global state of interrupt identities */
> +	raw_spinlock_t ids_lock;
> +	unsigned long *ids_used_bimap;
> +	unsigned long *ids_enabled_bimap;
> +	unsigned int *ids_target_cpu;
> +
> +	/* Mask for connected CPUs */
> +	struct cpumask lmask;
> +
> +	/* IPI interrupt identity */
> +	u32 ipi_id;
> +	u32 ipi_lsync_id;
> +
> +	/* IRQ domains */
> +	struct irq_domain *base_domain;
> +	struct irq_domain *pci_domain;
> +	struct irq_domain *plat_domain;
> +};
> +
> +struct imsic_handler {
> +	/* Local configuration for given HART */
> +	struct imsic_local_config local;
> +
> +	/* Pointer to private context */
> +	struct imsic_priv *priv;
> +};
> +
> +static bool imsic_init_done;
> +
> +static int imsic_parent_irq;
> +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> +
> +const struct imsic_global_config *imsic_get_global_config(void)
> +{
> +	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +
> +	if (!handler || !handler->priv)
> +		return NULL;
> +
> +	return &handler->priv->global;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> +
> +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> +{
> +	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +	if (!handler || !handler->priv)
> +		return NULL;

How can this happen?

> +
> +	return &handler->local;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_local_config);

Why are these symbols exported? They have no user, so they shouldn't
even exist here. I also seriously doubt there is a valid use case for
exposing this information to the rest of the kernel.

> +
> +static int imsic_cpu_page_phys(unsigned int cpu,
> +			       unsigned int guest_index,
> +			       phys_addr_t *out_msi_pa)
> +{
> +	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +	struct imsic_global_config *global;
> +	struct imsic_local_config *local;
> +
> +	if (!handler || !handler->priv)
> +		return -ENODEV;
> +	local = &handler->local;
> +	global = &handler->priv->global;
> +
> +	if (BIT(global->guest_index_bits) <= guest_index)
> +		return -EINVAL;
> +
> +	if (out_msi_pa)
> +		*out_msi_pa = local->msi_pa +
> +			      (guest_index * IMSIC_MMIO_PAGE_SZ);
> +
> +	return 0;
> +}
> +
> +static int imsic_get_cpu(struct imsic_priv *priv,
> +			 const struct cpumask *mask_val, bool force,
> +			 unsigned int *out_target_cpu)
> +{
> +	struct cpumask amask;
> +	unsigned int cpu;
> +
> +	cpumask_and(&amask, &priv->lmask, mask_val);
> +
> +	if (force)
> +		cpu = cpumask_first(&amask);
> +	else
> +		cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> +	if (cpu >= nr_cpu_ids)
> +		return -EINVAL;
> +
> +	if (out_target_cpu)
> +		*out_target_cpu = cpu;
> +
> +	return 0;
> +}
> +
> +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> +				 struct msi_msg *msg)
> +{
> +	phys_addr_t msi_addr;
> +	int err;
> +
> +	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +	if (err)
> +		return err;
> +
> +	msg->address_hi = upper_32_bits(msi_addr);
> +	msg->address_lo = lower_32_bits(msi_addr);
> +	msg->data = id;
> +
> +	return err;
> +}
> +
> +static void imsic_id_set_target(struct imsic_priv *priv,
> +				 unsigned int id, unsigned int target_cpu)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	priv->ids_target_cpu[id] = target_cpu;
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> +					unsigned int id)
> +{
> +	unsigned int ret;
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	ret = priv->ids_target_cpu[id];
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	return ret;
> +}
> +
> +static void __imsic_eix_update(unsigned long base_id,
> +			       unsigned long num_id, bool pend, bool val)
> +{
> +	unsigned long i, isel, ireg, flags;
> +	unsigned long id = base_id, last_id = base_id + num_id;
> +
> +	while (id < last_id) {
> +		isel = id / BITS_PER_LONG;
> +		isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> +		isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> +
> +		ireg = 0;
> +		for (i = id & (__riscv_xlen - 1);
> +		     (id < last_id) && (i < __riscv_xlen); i++) {
> +			ireg |= BIT(i);
> +			id++;
> +		}
> +
> +		/*
> +		 * The IMSIC EIEx and EIPx registers are indirectly
> +		 * accessed via using ISELECT and IREG CSRs so we
> +		 * save/restore local IRQ to ensure that we don't
> +		 * get preempted while accessing IMSIC registers.
> +		 */
> +		local_irq_save(flags);
> +		if (val)
> +			imsic_csr_set(isel, ireg);
> +		else
> +			imsic_csr_clear(isel, ireg);
> +		local_irq_restore(flags);

What is the actual requirement? no preemption? or no interrupts? This
isn't the same thing. Also, a bunch of the users already disable
interrupts. Consistency wouldn't hurt here.
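
For illustration only (none of this is in the patch): if the actual
requirement is just "don't migrate off this hart between the two CSR
accesses", preempt_disable() is the matching primitive; disabling
interrupts is only warranted if interrupt context can also write
ISELECT/IREG on the same hart:

	/* enough if only task context touches ISELECT/IREG */
	preempt_disable();
	imsic_csr_set(isel, ireg);
	preempt_enable();

	/* needed if an IRQ handler can race on ISELECT as well */
	local_irq_save(flags);
	imsic_csr_set(isel, ireg);
	local_irq_restore(flags);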

> +	}
> +}
> +
> +#define __imsic_id_enable(__id)		\
> +	__imsic_eix_update((__id), 1, false, true)
> +#define __imsic_id_disable(__id)	\
> +	__imsic_eix_update((__id), 1, false, false)
> +
> +#ifdef CONFIG_SMP
> +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> +{
> +	struct imsic_handler *handler;
> +	struct cpumask amask;
> +	int cpu;
> +
> +	cpumask_and(&amask, &priv->lmask, cpu_online_mask);

Can't this race against a CPU going down?

> +	for_each_cpu(cpu, &amask) {
> +		if (cpu == smp_processor_id())
> +			continue;
> +
> +		handler = per_cpu_ptr(&imsic_handlers, cpu);
> +		if (!handler || !handler->priv || !handler->local.msi_va) {
> +			pr_warn("CPU%d: handler not initialized\n", cpu);

How many times are you going to do that? On each failing synchronisation?
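
(sketch of the obvious alternative, if the warning is worth keeping
at all:)

	pr_warn_once("CPU%d: handler not initialized\n", cpu);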

> +			continue;
> +		}
> +
> +		writel(handler->priv->ipi_lsync_id, handler->local.msi_va);

As I understand it, this is a "behind the scenes" IPI. Why isn't that
a *real* IPI?
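
(for example -- a sketch only, imsic_local_sync_fn() is a made-up
helper and the skip-self detail is ignored -- the same thing via the
regular SMP machinery:)

	on_each_cpu_mask(&amask, imsic_local_sync_fn, priv, 1);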

> +	}
> +}
> +#else
> +#define __imsic_id_smp_sync(__priv)
> +#endif
> +
> +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	bitmap_set(priv->ids_enabled_bimap, id, 1);
> +	__imsic_id_enable(id);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	__imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	bitmap_clear(priv->ids_enabled_bimap, id, 1);
> +	__imsic_id_disable(id);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	__imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_ids_local_sync(struct imsic_priv *priv)
> +{
> +	int i;
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	for (i = 1; i <= priv->global.nr_ids; i++) {
> +		if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> +			continue;
> +
> +		if (test_bit(i, priv->ids_enabled_bimap))
> +			__imsic_id_enable(i);
> +		else
> +			__imsic_id_disable(i);
> +	}
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> +{
> +	if (enable) {
> +		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> +		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> +	} else {
> +		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> +		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> +	}
> +}
> +
> +static int imsic_ids_alloc(struct imsic_priv *priv,
> +			   unsigned int max_id, unsigned int order)
> +{
> +	int ret;
> +	unsigned long flags;
> +
> +	if ((priv->global.nr_ids < max_id) ||
> +	    (max_id < BIT(order)))
> +		return -EINVAL;

Why do we need this check? Shouldn't that be guaranteed by
construction?

> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	ret = bitmap_find_free_region(priv->ids_used_bimap,
> +				      max_id + 1, order);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +	return ret;
> +}
> +
> +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> +			   unsigned int order)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +	bitmap_release_region(priv->ids_used_bimap, base_id, order);
> +	raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static int __init imsic_ids_init(struct imsic_priv *priv)
> +{
> +	int i;
> +	struct imsic_global_config *global = &priv->global;
> +
> +	raw_spin_lock_init(&priv->ids_lock);
> +
> +	/* Allocate used bitmap */
> +	priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +					sizeof(unsigned long), GFP_KERNEL);

How about bitmap_alloc?
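
i.e. something like (sketch; bitmap_zalloc() being the zeroing
variant, paired with bitmap_free() in the cleanup path):

	priv->ids_used_bimap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
	if (!priv->ids_used_bimap)
		return -ENOMEM;
	...
	bitmap_free(priv->ids_used_bimap);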

> +	if (!priv->ids_used_bimap)
> +		return -ENOMEM;
> +
> +	/* Allocate enabled bitmap */
> +	priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +					   sizeof(unsigned long), GFP_KERNEL);
> +	if (!priv->ids_enabled_bimap) {
> +		kfree(priv->ids_used_bimap);
> +		return -ENOMEM;
> +	}
> +
> +	/* Allocate target CPU array */
> +	priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> +				       sizeof(unsigned int), GFP_KERNEL);
> +	if (!priv->ids_target_cpu) {
> +		kfree(priv->ids_enabled_bimap);
> +		kfree(priv->ids_used_bimap);
> +		return -ENOMEM;
> +	}
> +	for (i = 0; i <= global->nr_ids; i++)
> +		priv->ids_target_cpu[i] = UINT_MAX;
> +
> +	/* Reserve ID#0 because it is special and never implemented */
> +	bitmap_set(priv->ids_used_bimap, 0, 1);
> +
> +	return 0;
> +}
> +
> +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> +{
> +	kfree(priv->ids_target_cpu);
> +	kfree(priv->ids_enabled_bimap);
> +	kfree(priv->ids_used_bimap);
> +}
> +
> +#ifdef CONFIG_SMP
> +static void imsic_ipi_send(unsigned int cpu)
> +{
> +	struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +	if (!handler || !handler->priv || !handler->local.msi_va) {
> +		pr_warn("CPU%d: handler not initialized\n", cpu);
> +		return;
> +	}
> +
> +	writel(handler->priv->ipi_id, handler->local.msi_va);
> +}
> +
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +	__imsic_id_enable(priv->ipi_id);
> +	__imsic_id_enable(priv->ipi_lsync_id);
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +	int virq;
> +
> +	/* Allocate interrupt identity for IPIs */
> +	virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +	if (virq < 0)
> +		return virq;
> +	priv->ipi_id = virq;
> +
> +	/* Create IMSIC IPI multiplexing */
> +	virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);

Please! This BITS_PER_BYTE makes zero sense here. Have a proper define
that says 8, and document *why* this is 8! You're not defining a type
system, you're writing an irqchip driver.
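
Something along these lines (a sketch; the name and the comment are
made up here, not taken from the patch):

	/* number of IPIs multiplexed on top of the single IMSIC IPI id */
	#define IMSIC_NR_IPI	8

	virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
	...
	riscv_ipi_set_virq_range(virq, IMSIC_NR_IPI, true);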

> +	if (virq <= 0) {
> +		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +		return (virq < 0) ? virq : -ENOMEM;
> +	}
> +
> +	/* Set vIRQ range */
> +	riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> +
> +	/* Allocate interrupt identity for local enable/disable sync */
> +	virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +	if (virq < 0) {
> +		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +		return virq;
> +	}
> +	priv->ipi_lsync_id = virq;
> +
> +	return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +	imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> +	if (priv->ipi_id)
> +		imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +}
> +#else
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +	/* Clear the IPI ids because we are not using IPIs */
> +	priv->ipi_id = 0;
> +	priv->ipi_lsync_id = 0;
> +	return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +}
> +#endif
> +
> +static void imsic_irq_mask(struct irq_data *d)
> +{
> +	imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_unmask(struct irq_data *d)
> +{
> +	imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> +				      struct msi_msg *msg)
> +{
> +	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +	unsigned int cpu;
> +	int err;
> +
> +	cpu = imsic_id_get_target(priv, d->hwirq);
> +	WARN_ON(cpu == UINT_MAX);
> +
> +	err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> +	WARN_ON(err);
> +
> +	iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> +}
> +
> +#ifdef CONFIG_SMP
> +static int imsic_irq_set_affinity(struct irq_data *d,
> +				  const struct cpumask *mask_val,
> +				  bool force)
> +{
> +	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +	unsigned int target_cpu;
> +	int rc;
> +
> +	rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> +	if (rc)
> +		return rc;
> +
> +	imsic_id_set_target(priv, d->hwirq, target_cpu);
> +	irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> +
> +	return IRQ_SET_MASK_OK;
> +}
> +#endif
> +
> +static struct irq_chip imsic_irq_base_chip = {
> +	.name			= "RISC-V IMSIC-BASE",
> +	.irq_mask		= imsic_irq_mask,
> +	.irq_unmask		= imsic_irq_unmask,
> +#ifdef CONFIG_SMP
> +	.irq_set_affinity	= imsic_irq_set_affinity,
> +#endif
> +	.irq_compose_msi_msg	= imsic_irq_compose_msi_msg,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE |
> +				  IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> +				  unsigned int virq,
> +				  unsigned int nr_irqs,
> +				  void *args)
> +{
> +	struct imsic_priv *priv = domain->host_data;
> +	msi_alloc_info_t *info = args;
> +	phys_addr_t msi_addr;
> +	int i, hwirq, err = 0;
> +	unsigned int cpu;
> +
> +	err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> +	if (err)
> +		return err;
> +
> +	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +	if (err)
> +		return err;
> +
> +	hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> +				get_count_order(nr_irqs));
> +	if (hwirq < 0)
> +		return hwirq;
> +
> +	err = iommu_dma_prepare_msi(info->desc, msi_addr);
> +	if (err)
> +		goto fail;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		imsic_id_set_target(priv, hwirq + i, cpu);
> +		irq_domain_set_info(domain, virq + i, hwirq + i,
> +				    &imsic_irq_base_chip, priv,
> +				    handle_simple_irq, NULL, NULL);
> +		irq_set_noprobe(virq + i);
> +		irq_set_affinity(virq + i, &priv->lmask);
> +	}
> +
> +	return 0;
> +
> +fail:
> +	imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> +	return err;
> +}
> +
> +static void imsic_irq_domain_free(struct irq_domain *domain,
> +				  unsigned int virq,
> +				  unsigned int nr_irqs)
> +{
> +	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> +	struct imsic_priv *priv = domain->host_data;
> +
> +	imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> +	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> +}
> +
> +static const struct irq_domain_ops imsic_base_domain_ops = {
> +	.alloc		= imsic_irq_domain_alloc,
> +	.free		= imsic_irq_domain_free,
> +};
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +
> +static void imsic_pci_mask_irq(struct irq_data *d)
> +{
> +	pci_msi_mask_irq(d);
> +	irq_chip_mask_parent(d);
> +}
> +
> +static void imsic_pci_unmask_irq(struct irq_data *d)
> +{
> +	pci_msi_unmask_irq(d);
> +	irq_chip_unmask_parent(d);
> +}
> +
> +static struct irq_chip imsic_pci_irq_chip = {
> +	.name			= "RISC-V IMSIC-PCI",
> +	.irq_mask		= imsic_pci_mask_irq,
> +	.irq_unmask		= imsic_pci_unmask_irq,
> +	.irq_eoi		= irq_chip_eoi_parent,
> +};
> +
> +static struct msi_domain_ops imsic_pci_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_pci_domain_info = {
> +	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> +		   MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> +	.ops	= &imsic_pci_domain_ops,
> +	.chip	= &imsic_pci_irq_chip,
> +};
> +
> +#endif
> +
> +static struct irq_chip imsic_plat_irq_chip = {
> +	.name			= "RISC-V IMSIC-PLAT",
> +};
> +
> +static struct msi_domain_ops imsic_plat_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_plat_domain_info = {
> +	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> +	.ops	= &imsic_plat_domain_ops,
> +	.chip	= &imsic_plat_irq_chip,
> +};
> +
> +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> +					 struct fwnode_handle *fwnode)
> +{
> +	/* Create Base IRQ domain */
> +	priv->base_domain = irq_domain_create_tree(fwnode,
> +						&imsic_base_domain_ops, priv);
> +	if (!priv->base_domain) {
> +		pr_err("Failed to create IMSIC base domain\n");
> +		return -ENOMEM;
> +	}
> +	irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +	/* Create PCI MSI domain */
> +	priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> +						&imsic_pci_domain_info,
> +						priv->base_domain);
> +	if (!priv->pci_domain) {
> +		pr_err("Failed to create IMSIC PCI domain\n");
> +		irq_domain_remove(priv->base_domain);
> +		return -ENOMEM;
> +	}
> +#endif
> +
> +	/* Create Platform MSI domain */
> +	priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> +						&imsic_plat_domain_info,
> +						priv->base_domain);
> +	if (!priv->plat_domain) {
> +		pr_err("Failed to create IMSIC platform domain\n");
> +		if (priv->pci_domain)
> +			irq_domain_remove(priv->pci_domain);
> +		irq_domain_remove(priv->base_domain);
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If the TOPEI CSR is non-zero, we translate TOPEI.ID to a
> + * Linux interrupt number and let the Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> +	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +	struct irq_chip *chip = irq_desc_get_chip(desc);
> +	struct imsic_priv *priv = handler->priv;
> +	irq_hw_number_t hwirq;
> +	int err;
> +
> +	WARN_ON_ONCE(!handler->priv);
> +
> +	chained_irq_enter(chip, desc);
> +
> +	while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> +		hwirq = hwirq >> TOPEI_ID_SHIFT;
> +
> +		if (hwirq == priv->ipi_id) {
> +#ifdef CONFIG_SMP
> +			ipi_mux_process();
> +#endif
> +			continue;
> +		} else if (hwirq == priv->ipi_lsync_id) {
> +			imsic_ids_local_sync(priv);
> +			continue;
> +		}
> +
> +		err = generic_handle_domain_irq(priv->base_domain, hwirq);
> +		if (unlikely(err))
> +			pr_warn_ratelimited(
> +				"hwirq %lu mapping not found\n", hwirq);
> +	}
> +
> +	chained_irq_exit(chip, desc);
> +}
> +
> +static int imsic_starting_cpu(unsigned int cpu)
> +{
> +	struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +	struct imsic_priv *priv = handler->priv;
> +
> +	/* Enable per-CPU parent interrupt */
> +	if (imsic_parent_irq)
> +		enable_percpu_irq(imsic_parent_irq,
> +				  irq_get_trigger_type(imsic_parent_irq));

Shouldn't that be the default already?

> +	else
> +		pr_warn("cpu%d: parent irq not available\n", cpu);

And yet continue in sequence? Duh...
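
(one option, as a sketch: fail the bring-up instead of limping on)

	if (!imsic_parent_irq)
		return -ENODEV;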

> +
> +	/* Enable IPIs */
> +	imsic_ipi_enable(priv);
> +
> +	/*
> +	 * Interrupts identities might have been enabled/disabled while
> +	 * this CPU was not running so sync-up local enable/disable state.
> +	 */
> +	imsic_ids_local_sync(priv);
> +
> +	/* Locally enable interrupt delivery */
> +	imsic_ids_local_delivery(priv, true);
> +
> +	return 0;
> +}
> +
> +struct imsic_fwnode_ops {
> +	u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> +			     void *fwopaque);
> +	int (*parent_hartid)(struct fwnode_handle *fwnode,
> +			     void *fwopaque, u32 index,
> +			     unsigned long *out_hartid);
> +	u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> +	int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> +				void *fwopaque, u32 index,
> +				struct resource *res);
> +	void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> +				  void *fwopaque, u32 index);
> +	int (*read_u32)(struct fwnode_handle *fwnode,
> +			void *fwopaque, const char *prop, u32 *out_val);
> +	bool (*read_bool)(struct fwnode_handle *fwnode,
> +			  void *fwopaque, const char *prop);
> +};

Why do we need this sort of (terrible) indirection?
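
(for reference, a sketch of what the generic fwnode property API in
linux/property.h already provides on top of both DT and ACPI fwnodes,
which would cover at least the read_u32/read_bool hooks:)

	rc = fwnode_property_read_u32(fwnode, "riscv,guest-index-bits",
				      &global->guest_index_bits);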

> +
> +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> +			     struct fwnode_handle *fwnode,
> +			     void *fwopaque)
> +{
> +	struct resource res;
> +	phys_addr_t base_addr;
> +	int rc, nr_parent_irqs;
> +	struct imsic_mmio *mmio;
> +	struct imsic_priv *priv;
> +	struct irq_domain *domain;
> +	struct imsic_handler *handler;
> +	struct imsic_global_config *global;
> +	u32 i, tmp, nr_handlers = 0;
> +
> +	if (imsic_init_done) {
> +		pr_err("%pfwP: already initialized hence ignoring\n",
> +			fwnode);
> +		return -ENODEV;
> +	}
> +
> +	if (!riscv_isa_extension_available(NULL, SxAIA)) {
> +		pr_err("%pfwP: AIA support not available\n", fwnode);
> +		return -ENODEV;
> +	}
> +
> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +	global = &priv->global;
> +
> +	/* Find number of parent interrupts */
> +	nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> +	if (!nr_parent_irqs) {
> +		pr_err("%pfwP: no parent irqs available\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of guest index bits in MSI address */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> +			     &global->guest_index_bits);
> +	if (rc)
> +		global->guest_index_bits = 0;
> +	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> +	if (tmp < global->guest_index_bits) {
> +		pr_err("%pfwP: guest index bits too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of HART index bits */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> +			     &global->hart_index_bits);
> +	if (rc) {
> +		/* Assume default value */
> +		global->hart_index_bits = __fls(nr_parent_irqs);
> +		if (BIT(global->hart_index_bits) < nr_parent_irqs)
> +			global->hart_index_bits++;
> +	}
> +	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +	      global->guest_index_bits;
> +	if (tmp < global->hart_index_bits) {
> +		pr_err("%pfwP: HART index bits too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of group index bits */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> +			     &global->group_index_bits);
> +	if (rc)
> +		global->group_index_bits = 0;
> +	tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +	      global->guest_index_bits - global->hart_index_bits;
> +	if (tmp < global->group_index_bits) {
> +		pr_err("%pfwP: group index bits too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Find first bit position of group index.
> +	 * If not specified, assume the default APLIC-IMSIC configuration.
> +	 */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> +			     &global->group_index_shift);
> +	if (rc)
> +		global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> +	tmp = global->group_index_bits + global->group_index_shift - 1;
> +	if (tmp >= BITS_PER_LONG) {
> +		pr_err("%pfwP: group index shift too big\n", fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of interrupt identities */
> +	rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> +			     &global->nr_ids);
> +	if (rc) {
> +		pr_err("%pfwP: number of interrupt identities not found\n",
> +			fwnode);
> +		return rc;
> +	}
> +	if ((global->nr_ids < IMSIC_MIN_ID) ||
> +	    (global->nr_ids >= IMSIC_MAX_ID) ||
> +	    ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +		pr_err("%pfwP: invalid number of interrupt identities\n",
> +			fwnode);
> +		return -EINVAL;
> +	}
> +
> +	/* Find number of guest interrupt identities */
> +	if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> +			    &global->nr_guest_ids))
> +		global->nr_guest_ids = global->nr_ids;
> +	if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> +	    (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> +	    ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +		pr_err("%pfwP: invalid number of guest interrupt identities\n",
> +			fwnode);
> +		return -EINVAL;
> +	}

Please split the whole guest stuff out. It is totally unused!

I've stopped reading. This needs structure, cleanups and a bit of
taste. Not a lot of that here at the moment.

	M.

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Fwd: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
       [not found]     ` <CABvJ_xhjMa8xTsO-Qa23TOqxPpYxyBYSfV6TmKney-Gp3oi8cA@mail.gmail.com>
@ 2023-01-17  7:09         ` Vincent Chen
  0 siblings, 0 replies; 72+ messages in thread
From: Vincent Chen @ 2023-01-17  7:09 UTC (permalink / raw)
  To: apatel
  Cc: Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel@vger.kernel.org List,
	devicetree, Vincent Chen

> From: Anup Patel <apatel@ventanamicro.com>
> Date: Wed, Jan 4, 2023 at 1:19 AM
> Subject: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
> To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
>
>
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new interrupt controller for managing wired interrupts on a RISC-V
> platform. This new interrupt controller is referred to as advanced
> platform-level interrupt controller (APLIC) which can forward wired
> interrupts to CPUs (or HARTs) as local interrupts OR as message
> signaled interrupts.
> (For more details refer https://github.com/riscv/riscv-aia)
>
I could not find an appropriate place to post my question, so I posted it here.

I am a little concerned about the current MSI IRQ handling in APLIC.
According to the specification, when domaincfg.DM = 1, the pending bit
is set to one by a low-to-high transition in the rectified input
value. When the APLIC forwards this interrupt as an MSI, the pending
bit is cleared regardless of whether the interrupt type is
level-sensitive or edge-triggered. However, the interrupt service
routine may not deal with all outstanding requests in one pass. If
requests remain pending after leaving the ISR, they may never be
serviced when the device's IRQ type is level-sensitive: the rectified
value of this interrupt only transitioned from 0 to 1 at the very
beginning, so its pending bit is never asserted again and the APLIC
never sends another MSI for this interrupt. The ISR therefore gets no
chance to deal with the remaining requests.

One possible solution is to let the APLIC driver check the rectified
input value of the serviced interrupt after returning from its ISR.
If the value is 1, the device still has pending requests, and the
APLIC driver can set the pending bit again through the setipnum (or
setip) register. That makes the APLIC send the next MSI for this
device, so the ISR gets a chance to deal with the remaining requests.
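
A rough sketch of that idea (APLIC_CLRIP_BASE and APLIC_SETIPNUM are
assumed names here, modelled on the spec's in_clrip/setipnum
registers, and where exactly to hook this in the driver is left open):

	u32 hwirq = d->hwirq;

	/* in_clrip, when read, returns the rectified input values */
	if (readl(priv->regs + APLIC_CLRIP_BASE + (hwirq / 32) * sizeof(u32)) &
	    BIT(hwirq % 32))
		/* line still asserted: re-arm pending so another MSI is sent */
		writel(hwirq, priv->regs + APLIC_SETIPNUM);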



> This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> platforms.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/Kconfig             |   6 +
>  drivers/irqchip/Makefile            |   1 +
>  drivers/irqchip/irq-riscv-aplic.c   | 670 ++++++++++++++++++++++++++++
>  include/linux/irqchip/riscv-aplic.h | 117 +++++
>  4 files changed, 794 insertions(+)
>  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
>  create mode 100644 include/linux/irqchip/riscv-aplic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index a1315189a595..936e59fe1f99 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -547,6 +547,12 @@ config SIFIVE_PLIC
>         select IRQ_DOMAIN_HIERARCHY
>         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_APLIC
> +       bool
> +       depends on RISCV
> +       select IRQ_DOMAIN_HIERARCHY
> +       select GENERIC_MSI_IRQ_DOMAIN
> +
>  config RISCV_IMSIC
>         bool
>         depends on RISCV
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 22c723cc6ec8..6154e5bc4228 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
>  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
>  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
>  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
>  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
>  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
>  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> new file mode 100644
> index 000000000000..63f20892d7d3
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-aplic.c
> @@ -0,0 +1,670 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#include <linux/bitops.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-aplic.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/platform_device.h>
> +#include <linux/smp.h>
> +
> +#define APLIC_DEFAULT_PRIORITY         1
> +#define APLIC_DISABLE_IDELIVERY                0
> +#define APLIC_ENABLE_IDELIVERY         1
> +#define APLIC_DISABLE_ITHRESHOLD       1
> +#define APLIC_ENABLE_ITHRESHOLD                0
> +
> +struct aplic_msicfg {
> +       phys_addr_t             base_ppn;
> +       u32                     hhxs;
> +       u32                     hhxw;
> +       u32                     lhxs;
> +       u32                     lhxw;
> +};
> +
> +struct aplic_idc {
> +       unsigned int            hart_index;
> +       void __iomem            *regs;
> +       struct aplic_priv       *priv;
> +};
> +
> +struct aplic_priv {
> +       struct device           *dev;
> +       u32                     nr_irqs;
> +       u32                     nr_idcs;
> +       void __iomem            *regs;
> +       struct irq_domain       *irqdomain;
> +       struct aplic_msicfg     msicfg;
> +       struct cpumask          lmask;
> +};
> +
> +static unsigned int aplic_idc_parent_irq;
> +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> +
> +static void aplic_irq_unmask(struct irq_data *d)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> +
> +       if (!priv->nr_idcs)
> +               irq_chip_unmask_parent(d);
> +}
> +
> +static void aplic_irq_mask(struct irq_data *d)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> +
> +       if (!priv->nr_idcs)
> +               irq_chip_mask_parent(d);
> +}
> +
> +static int aplic_set_type(struct irq_data *d, unsigned int type)
> +{
> +       u32 val = 0;
> +       void __iomem *sourcecfg;
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       switch (type) {
> +       case IRQ_TYPE_NONE:
> +               val = APLIC_SOURCECFG_SM_INACTIVE;
> +               break;
> +       case IRQ_TYPE_LEVEL_LOW:
> +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> +               break;
> +       case IRQ_TYPE_LEVEL_HIGH:
> +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> +               break;
> +       case IRQ_TYPE_EDGE_FALLING:
> +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> +               break;
> +       case IRQ_TYPE_EDGE_RISING:
> +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> +       writel(val, sourcecfg);
> +
> +       return 0;
> +}
> +
> +#ifdef CONFIG_SMP
> +static int aplic_set_affinity(struct irq_data *d,
> +                             const struct cpumask *mask_val, bool force)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +       struct aplic_idc *idc;
> +       unsigned int cpu, val;
> +       struct cpumask amask;
> +       void __iomem *target;
> +
> +       if (!priv->nr_idcs)
> +               return irq_chip_set_affinity_parent(d, mask_val, force);
> +
> +       cpumask_and(&amask, &priv->lmask, mask_val);
> +
> +       if (force)
> +               cpu = cpumask_first(&amask);
> +       else
> +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> +       if (cpu >= nr_cpu_ids)
> +               return -EINVAL;
> +
> +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> +       target = priv->regs + APLIC_TARGET_BASE;
> +       target += (d->hwirq - 1) * sizeof(u32);
> +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> +       val |= APLIC_DEFAULT_PRIORITY;
> +       writel(val, target);
> +
> +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +       return IRQ_SET_MASK_OK_DONE;
> +}
> +#endif
> +
> +static struct irq_chip aplic_chip = {
> +       .name           = "RISC-V APLIC",
> +       .irq_mask       = aplic_irq_mask,
> +       .irq_unmask     = aplic_irq_unmask,
> +       .irq_set_type   = aplic_set_type,
> +#ifdef CONFIG_SMP
> +       .irq_set_affinity = aplic_set_affinity,
> +#endif
> +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> +                         IRQCHIP_SKIP_SET_WAKE |
> +                         IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int aplic_irqdomain_translate(struct irq_domain *d,
> +                                    struct irq_fwspec *fwspec,
> +                                    unsigned long *hwirq,
> +                                    unsigned int *type)
> +{
> +       if (WARN_ON(fwspec->param_count < 2))
> +               return -EINVAL;
> +       if (WARN_ON(!fwspec->param[0]))
> +               return -EINVAL;
> +
> +       *hwirq = fwspec->param[0];
> +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> +
> +       WARN_ON(*type == IRQ_TYPE_NONE);
> +
> +       return 0;
> +}
> +
> +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> +                                    unsigned int virq, unsigned int nr_irqs,
> +                                    void *arg)
> +{
> +       int i, ret;
> +       unsigned int type;
> +       irq_hw_number_t hwirq;
> +       struct irq_fwspec *fwspec = arg;
> +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> +
> +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> +       if (ret)
> +               return ret;
> +
> +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> +       if (ret)
> +               return ret;
> +
> +       for (i = 0; i < nr_irqs; i++)
> +               irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
> +                                             &aplic_chip, priv);
> +
> +       return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> +       .translate      = aplic_irqdomain_translate,
> +       .alloc          = aplic_irqdomain_msi_alloc,
> +       .free           = platform_msi_device_domain_free,
> +};
> +
> +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> +                                    unsigned int virq, unsigned int nr_irqs,
> +                                    void *arg)
> +{
> +       int i, ret;
> +       unsigned int type;
> +       irq_hw_number_t hwirq;
> +       struct irq_fwspec *fwspec = arg;
> +       struct aplic_priv *priv = domain->host_data;
> +
> +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> +       if (ret)
> +               return ret;
> +
> +       for (i = 0; i < nr_irqs; i++) {
> +               irq_domain_set_info(domain, virq + i, hwirq + i,
> +                                   &aplic_chip, priv, handle_simple_irq,
> +                                   NULL, NULL);
> +               irq_set_affinity(virq + i, &priv->lmask);
> +       }
> +
> +       return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> +       .translate      = aplic_irqdomain_translate,
> +       .alloc          = aplic_irqdomain_idc_alloc,
> +       .free           = irq_domain_free_irqs_top,
> +};
> +
> +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> +{
> +       int i;
> +
> +       /* Disable all interrupts */
> +       for (i = 0; i <= priv->nr_irqs; i += 32)
> +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> +                           (i / 32) * sizeof(u32));
> +
> +       /* Set interrupt type and default priority for all interrupts */
> +       for (i = 1; i <= priv->nr_irqs; i++) {
> +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> +                         (i - 1) * sizeof(u32));
> +               writel(APLIC_DEFAULT_PRIORITY,
> +                      priv->regs + APLIC_TARGET_BASE +
> +                      (i - 1) * sizeof(u32));
> +       }
> +
> +       /* Clear APLIC domaincfg */
> +       writel(0, priv->regs + APLIC_DOMAINCFG);
> +}
> +
> +static void aplic_init_hw_global(struct aplic_priv *priv)
> +{
> +       u32 val;
> +#ifdef CONFIG_RISCV_M_MODE
> +       u32 valH;
> +
> +       if (!priv->nr_idcs) {
> +               val = priv->msicfg.base_ppn;
> +               valH = (priv->msicfg.base_ppn >> 32) &
> +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> +       }
> +#endif
> +
> +       /* Setup APLIC domaincfg register */
> +       val = readl(priv->regs + APLIC_DOMAINCFG);
> +       val |= APLIC_DOMAINCFG_IE;
> +       if (!priv->nr_idcs)
> +               val |= APLIC_DOMAINCFG_DM;
> +       writel(val, priv->regs + APLIC_DOMAINCFG);
> +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> +               dev_warn(priv->dev,
> +                        "unable to write 0x%x in domaincfg\n", val);
> +}
> +
> +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +       unsigned int group_index, hart_index, guest_index, val;
> +       struct device *dev = msi_desc_to_dev(desc);
> +       struct aplic_priv *priv = dev_get_drvdata(dev);
> +       struct irq_data *d = irq_get_irq_data(desc->irq);
> +       struct aplic_msicfg *mc = &priv->msicfg;
> +       phys_addr_t tppn, tbppn, msg_addr;
> +       void __iomem *target;
> +
> +       /* For zeroed MSI, simply write zero into the target register */
> +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> +               target = priv->regs + APLIC_TARGET_BASE;
> +               target += (d->hwirq - 1) * sizeof(u32);
> +               writel(0, target);
> +               return;
> +       }
> +
> +       /* Sanity check on message data */
> +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> +
> +       /* Compute target MSI address */
> +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> +
> +       /* Compute target HART Base PPN */
> +       tbppn = tppn;
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> +       WARN_ON(tbppn != mc->base_ppn);
> +
> +       /* Compute target group and hart indexes */
> +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> +       hart_index |= (group_index << mc->lhxw);
> +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> +
> +       /* Compute target guest index */
> +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> +
> +       /* Update IRQ TARGET register */
> +       target = priv->regs + APLIC_TARGET_BASE;
> +       target += (d->hwirq - 1) * sizeof(u32);
> +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> +                               << APLIC_TARGET_HART_IDX_SHIFT;
> +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> +       writel(val, target);
> +}
> +
> +static int aplic_setup_msi(struct aplic_priv *priv)
> +{
> +       struct device *dev = priv->dev;
> +       struct aplic_msicfg *mc = &priv->msicfg;
> +       const struct imsic_global_config *imsic_global;
> +
> +       /*
> +        * The APLIC outgoing MSI config registers assume the target MSI
> +        * controller to be a RISC-V AIA IMSIC controller.
> +        */
> +       imsic_global = imsic_get_global_config();
> +       if (!imsic_global) {
> +               dev_err(dev, "IMSIC global config not found\n");
> +               return -ENODEV;
> +       }
> +
> +       /* Find number of guest index bits (LHXS) */
> +       mc->lhxs = imsic_global->guest_index_bits;
> +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> +               dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of HART index bits (LHXW) */
> +       mc->lhxw = imsic_global->hart_index_bits;
> +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> +               dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of group index bits (HHXW) */
> +       mc->hhxw = imsic_global->group_index_bits;
> +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> +               dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Find first bit position of group index (HHXS) */
> +       mc->hhxs = imsic_global->group_index_shift;
> +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> +               dev_err(dev, "IMSIC group index shift should be >= %d\n",
> +                       (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> +               return -EINVAL;
> +       }
> +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> +               dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Compute PPN base */
> +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> +
> +       /* Use all possible CPUs as lmask */
> +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> +
> +       return 0;
> +}
> +
> +/*
> + * To handle an APLIC IDC interrupts, we just read the CLAIMI register
> + * which will return highest priority pending interrupt and clear the
> + * pending bit of the interrupt. This process is repeated until CLAIMI
> + * register return zero value.
> + */
> +static void aplic_idc_handle_irq(struct irq_desc *desc)
> +{
> +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> +       struct irq_chip *chip = irq_desc_get_chip(desc);
> +       irq_hw_number_t hw_irq;
> +       int irq;
> +
> +       chained_irq_enter(chip, desc);
> +
> +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> +
> +               if (unlikely(irq <= 0))
> +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> +                                           hw_irq);
> +               else
> +                       generic_handle_irq(irq);
> +       }
> +
> +       chained_irq_exit(chip, desc);
> +}
> +
> +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> +{
> +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> +
> +       /* Priority must be less than threshold for interrupt triggering */
> +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> +
> +       /* Delivery must be set to 1 for interrupt triggering */
> +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> +}
> +
> +static int aplic_idc_dying_cpu(unsigned int cpu)
> +{
> +       if (aplic_idc_parent_irq)
> +               disable_percpu_irq(aplic_idc_parent_irq);
> +
> +       return 0;
> +}
> +
> +static int aplic_idc_starting_cpu(unsigned int cpu)
> +{
> +       if (aplic_idc_parent_irq)
> +               enable_percpu_irq(aplic_idc_parent_irq,
> +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> +
> +       return 0;
> +}
> +
> +static int aplic_setup_idc(struct aplic_priv *priv)
> +{
> +       int i, j, rc, cpu, setup_count = 0;
> +       struct device_node *node = priv->dev->of_node;
> +       struct device *dev = priv->dev;
> +       struct of_phandle_args parent;
> +       struct irq_domain *domain;
> +       unsigned long hartid;
> +       struct aplic_idc *idc;
> +       u32 val;
> +
> +       /* Setup per-CPU IDC and target CPU mask */
> +       for (i = 0; i < priv->nr_idcs; i++) {
> +               if (of_irq_parse_one(node, i, &parent)) {
> +                       dev_err(dev, "failed to parse parent for IDC%d.\n",
> +                               i);
> +                       return -EIO;
> +               }
> +
> +               /* Skip IDCs which do not connect to external interrupts */
> +               if (parent.args[0] != RV_IRQ_EXT)
> +                       continue;
> +
> +               rc = riscv_of_parent_hartid(parent.np, &hartid);
> +               if (rc) {
> +                       dev_err(dev, "failed to parse hart ID for IDC%d.\n",
> +                               i);
> +                       return rc;
> +               }
> +
> +               cpu = riscv_hartid_to_cpuid(hartid);
> +               if (cpu < 0) {
> +                       dev_warn(dev, "invalid cpuid for IDC%d\n", i);
> +                       continue;
> +               }
> +
> +               cpumask_set_cpu(cpu, &priv->lmask);
> +
> +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> +               WARN_ON(idc->priv);
> +
> +               idc->hart_index = i;
> +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> +               idc->priv = priv;
> +
> +               aplic_idc_set_delivery(idc, true);
> +
> +               /*
> +                * The boot CPU might not have APLIC hart_index = 0, so check
> +                * and update the target registers of all interrupts.
> +                */
> +               if (cpu == smp_processor_id() && idc->hart_index) {
> +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> +                       val |= APLIC_DEFAULT_PRIORITY;
> +                       for (j = 1; j <= priv->nr_irqs; j++)
> +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> +                                           (j - 1) * sizeof(u32));
> +               }
> +
> +               setup_count++;
> +       }
> +
> +       /* Find parent domain and register chained handler */
> +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> +                                         DOMAIN_BUS_ANY);
> +       if (!aplic_idc_parent_irq && domain) {
> +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> +               if (aplic_idc_parent_irq) {
> +                       irq_set_chained_handler(aplic_idc_parent_irq,
> +                                               aplic_idc_handle_irq);
> +
> +                       /*
> +                        * Setup CPUHP notifier to enable IDC parent
> +                        * interrupt on all CPUs
> +                        */
> +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> +                                         "irqchip/riscv/aplic:starting",
> +                                         aplic_idc_starting_cpu,
> +                                         aplic_idc_dying_cpu);
> +               }
> +       }
> +
> +       /* Fail if we were not able to setup IDC for any CPU */
> +       return (setup_count) ? 0 : -ENODEV;
> +}
> +
> +static int aplic_probe(struct platform_device *pdev)
> +{
> +       struct device_node *node = pdev->dev.of_node;
> +       struct device *dev = &pdev->dev;
> +       struct aplic_priv *priv;
> +       struct resource *regs;
> +       phys_addr_t pa;
> +       int rc;
> +
> +       regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       if (!regs) {
> +               dev_err(dev, "cannot find registers resource\n");
> +               return -ENOENT;
> +       }
> +
> +       priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +       if (!priv)
> +               return -ENOMEM;
> +       platform_set_drvdata(pdev, priv);
> +       priv->dev = dev;
> +
> +       priv->regs = devm_ioremap(dev, regs->start, resource_size(regs));
> +       if (WARN_ON(!priv->regs)) {
> +               dev_err(dev, "failed ioremap registers\n");
> +               return -EIO;
> +       }
> +
> +       of_property_read_u32(node, "riscv,num-sources", &priv->nr_irqs);
> +       if (!priv->nr_irqs) {
> +               dev_err(dev, "failed to get number of interrupt sources\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Setup initial state of APLIC interrupts */
> +       aplic_init_hw_irqs(priv);
> +
> +       /*
> +        * Setup IDCs or MSIs based on parent interrupts in DT node
> +        *
> +        * If "msi-parent" DT property is present then we ignore the
> +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> +        */
> +       priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
> +                       0 : of_irq_count(node);
> +       if (priv->nr_idcs)
> +               rc = aplic_setup_idc(priv);
> +       else
> +               rc = aplic_setup_msi(priv);
> +       if (rc)
> +               return rc;
> +
> +       /* Setup global config and interrupt delivery */
> +       aplic_init_hw_global(priv);
> +
> +       /* Create irq domain instance for the APLIC */
> +       if (priv->nr_idcs)
> +               priv->irqdomain = irq_domain_create_linear(
> +                                               of_node_to_fwnode(node),
> +                                               priv->nr_irqs + 1,
> +                                               &aplic_irqdomain_idc_ops,
> +                                               priv);
> +       else
> +               priv->irqdomain = platform_msi_create_device_domain(dev,
> +                                               priv->nr_irqs + 1,
> +                                               aplic_msi_write_msg,
> +                                               &aplic_irqdomain_msi_ops,
> +                                               priv);
> +       if (!priv->irqdomain) {
> +               dev_err(dev, "failed to add irq domain\n");
> +               return -ENOMEM;
> +       }
> +
> +       /* Advertise the interrupt controller */
> +       if (priv->nr_idcs) {
> +               dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
> +                        priv->nr_irqs, priv->nr_idcs);
> +       } else {
> +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> +               dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
> +                        priv->nr_irqs, &pa);
> +       }
> +
> +       return 0;
> +}
> +
> +static int aplic_remove(struct platform_device *pdev)
> +{
> +       struct aplic_priv *priv = platform_get_drvdata(pdev);
> +
> +       irq_domain_remove(priv->irqdomain);
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id aplic_match[] = {
> +       { .compatible = "riscv,aplic" },
> +       {}
> +};
> +
> +static struct platform_driver aplic_driver = {
> +       .driver = {
> +               .name           = "riscv-aplic",
> +               .of_match_table = aplic_match,
> +       },
> +       .probe = aplic_probe,
> +       .remove = aplic_remove,
> +};
> +
> +static int __init aplic_init(void)
> +{
> +       return platform_driver_register(&aplic_driver);
> +}
> +core_initcall(aplic_init);
> diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> new file mode 100644
> index 000000000000..88177eefd411
> --- /dev/null
> +++ b/include/linux/irqchip/riscv-aplic.h
> @@ -0,0 +1,117 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> +
> +#include <linux/bitops.h>
> +
> +#define APLIC_MAX_IDC                  BIT(14)
> +#define APLIC_MAX_SOURCE               1024
> +
> +#define APLIC_DOMAINCFG                        0x0000
> +#define APLIC_DOMAINCFG_RDONLY         0x80000000
> +#define APLIC_DOMAINCFG_IE             BIT(8)
> +#define APLIC_DOMAINCFG_DM             BIT(2)
> +#define APLIC_DOMAINCFG_BE             BIT(0)
> +
> +#define APLIC_SOURCECFG_BASE           0x0004
> +#define APLIC_SOURCECFG_D              BIT(10)
> +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x000003ff
> +#define APLIC_SOURCECFG_SM_MASK        0x00000007
> +#define APLIC_SOURCECFG_SM_INACTIVE    0x0
> +#define APLIC_SOURCECFG_SM_DETACH      0x1
> +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
> +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
> +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
> +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
> +
> +#define APLIC_MMSICFGADDR              0x1bc0
> +#define APLIC_MMSICFGADDRH             0x1bc4
> +#define APLIC_SMSICFGADDR              0x1bc8
> +#define APLIC_SMSICFGADDRH             0x1bcc
> +
> +#ifdef CONFIG_RISCV_M_MODE
> +#define APLIC_xMSICFGADDR              APLIC_MMSICFGADDR
> +#define APLIC_xMSICFGADDRH             APLIC_MMSICFGADDRH
> +#else
> +#define APLIC_xMSICFGADDR              APLIC_SMSICFGADDR
> +#define APLIC_xMSICFGADDRH             APLIC_SMSICFGADDRH
> +#endif
> +
> +#define APLIC_xMSICFGADDRH_L           BIT(31)
> +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
> +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
> +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
> +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
> +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
> +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
> +#define APLIC_xMSICFGADDRH_LHXW_MASK   0xf
> +#define APLIC_xMSICFGADDRH_LHXW_SHIFT  12
> +#define APLIC_xMSICFGADDRH_BAPPN_MASK  0xfff
> +
> +#define APLIC_xMSICFGADDR_PPN_SHIFT    12
> +
> +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> +       (BIT(__lhxs) - 1)
> +
> +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> +       (BIT(__lhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> +       ((__lhxs))
> +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> +       (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> +        APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> +
> +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> +       (BIT(__hhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> +       ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> +       (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> +        APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> +
> +#define APLIC_SETIP_BASE               0x1c00
> +#define APLIC_SETIPNUM                 0x1cdc
> +
> +#define APLIC_CLRIP_BASE               0x1d00
> +#define APLIC_CLRIPNUM                 0x1ddc
> +
> +#define APLIC_SETIE_BASE               0x1e00
> +#define APLIC_SETIENUM                 0x1edc
> +
> +#define APLIC_CLRIE_BASE               0x1f00
> +#define APLIC_CLRIENUM                 0x1fdc
> +
> +#define APLIC_SETIPNUM_LE              0x2000
> +#define APLIC_SETIPNUM_BE              0x2004
> +
> +#define APLIC_GENMSI                   0x3000
> +
> +#define APLIC_TARGET_BASE              0x3004
> +#define APLIC_TARGET_HART_IDX_SHIFT    18
> +#define APLIC_TARGET_HART_IDX_MASK     0x3fff
> +#define APLIC_TARGET_GUEST_IDX_SHIFT   12
> +#define APLIC_TARGET_GUEST_IDX_MASK    0x3f
> +#define APLIC_TARGET_IPRIO_MASK        0xff
> +#define APLIC_TARGET_EIID_MASK 0x7ff
> +
> +#define APLIC_IDC_BASE                 0x4000
> +#define APLIC_IDC_SIZE                 32
> +
> +#define APLIC_IDC_IDELIVERY            0x00
> +
> +#define APLIC_IDC_IFORCE               0x04
> +
> +#define APLIC_IDC_ITHRESHOLD           0x08
> +
> +#define APLIC_IDC_TOPI                 0x18
> +#define APLIC_IDC_TOPI_ID_SHIFT        16
> +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> +#define APLIC_IDC_TOPI_PRIO_MASK       0xff
> +
> +#define APLIC_IDC_CLAIMI               0x1c
> +
> +#endif
> --
> 2.34.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Fwd: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
@ 2023-01-17  7:09         ` Vincent Chen
  0 siblings, 0 replies; 72+ messages in thread
From: Vincent Chen @ 2023-01-17  7:09 UTC (permalink / raw)
  To: apatel
  Cc: Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel@vger.kernel.org List,
	devicetree, Vincent Chen

> From: Anup Patel <apatel@ventanamicro.com>
> Date: Wed, Jan 4, 2023 at 1:19 AM
> Subject: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
> To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
>
>
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new interrupt controller for managing wired interrupts on a RISC-V
> platform. This new interrupt controller is referred to as advanced
> platform-level interrupt controller (APLIC) which can forward wired
> interrupts to CPUs (or HARTs) as local interrupts OR as message
> signaled interrupts.
> (For more details refer https://github.com/riscv/riscv-aia)
>
I could not find an appropriate place to post my question, so I posted it here.

I am a little concerned about the current MSI IRQ handling in the APLIC
driver. According to the specification, when domaincfg.DM = 1, the
pending bit is set to one by a low-to-high transition in the rectified
input value. When the APLIC forwards this interrupt as an MSI, the
pending bit is cleared regardless of whether the interrupt type is
level-sensitive or edge-triggered. However, the interrupt service
routine may not handle all outstanding requests in one pass. If requests
remain pending after the ISR returns and the IRQ type of the device is
level-sensitive, those requests may never be serviced: the rectified
value of this interrupt only transitions from 0 to 1 at the beginning,
so the pending bit is never asserted again and the APLIC never sends
another MSI for this interrupt. As a result, the ISR never gets a chance
to deal with the remaining requests.

One possible solution is to have the APLIC driver check whether the
rectified input value of the serviced interrupt is still one after its
ISR returns. If it is, the device still has pending requests, so the
driver can re-assert the pending bit through the setipnum or setip
register. This makes the APLIC send the next MSI for this device, giving
the ISR a chance to deal with the remaining requests, as sketched below.
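
A rough sketch of the idea (untested, not part of the posted driver, and
only reusing register offsets already defined in this patch) could look
like this:

/*
 * Hypothetical helper: after the ISR completes, re-read the rectified
 * input value via the in_clrip registers and, if the source is still
 * asserted, re-set its pending bit so that the APLIC generates another
 * MSI for it.
 */
static void aplic_msi_retrigger_level(struct aplic_priv *priv,
                                      irq_hw_number_t hwirq)
{
        u32 rectified = readl(priv->regs + APLIC_CLRIP_BASE +
                              (hwirq / 32) * sizeof(u32));

        if (rectified & BIT(hwirq % 32))
                writel(hwirq, priv->regs + APLIC_SETIPNUM);
}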



> This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> platforms.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/Kconfig             |   6 +
>  drivers/irqchip/Makefile            |   1 +
>  drivers/irqchip/irq-riscv-aplic.c   | 670 ++++++++++++++++++++++++++++
>  include/linux/irqchip/riscv-aplic.h | 117 +++++
>  4 files changed, 794 insertions(+)
>  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
>  create mode 100644 include/linux/irqchip/riscv-aplic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index a1315189a595..936e59fe1f99 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -547,6 +547,12 @@ config SIFIVE_PLIC
>         select IRQ_DOMAIN_HIERARCHY
>         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_APLIC
> +       bool
> +       depends on RISCV
> +       select IRQ_DOMAIN_HIERARCHY
> +       select GENERIC_MSI_IRQ_DOMAIN
> +
>  config RISCV_IMSIC
>         bool
>         depends on RISCV
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 22c723cc6ec8..6154e5bc4228 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
>  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
>  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
>  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
>  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
>  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
>  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> new file mode 100644
> index 000000000000..63f20892d7d3
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-aplic.c
> @@ -0,0 +1,670 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#include <linux/bitops.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-aplic.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/platform_device.h>
> +#include <linux/smp.h>
> +
> +#define APLIC_DEFAULT_PRIORITY         1
> +#define APLIC_DISABLE_IDELIVERY                0
> +#define APLIC_ENABLE_IDELIVERY         1
> +#define APLIC_DISABLE_ITHRESHOLD       1
> +#define APLIC_ENABLE_ITHRESHOLD                0
> +
> +struct aplic_msicfg {
> +       phys_addr_t             base_ppn;
> +       u32                     hhxs;
> +       u32                     hhxw;
> +       u32                     lhxs;
> +       u32                     lhxw;
> +};
> +
> +struct aplic_idc {
> +       unsigned int            hart_index;
> +       void __iomem            *regs;
> +       struct aplic_priv       *priv;
> +};
> +
> +struct aplic_priv {
> +       struct device           *dev;
> +       u32                     nr_irqs;
> +       u32                     nr_idcs;
> +       void __iomem            *regs;
> +       struct irq_domain       *irqdomain;
> +       struct aplic_msicfg     msicfg;
> +       struct cpumask          lmask;
> +};
> +
> +static unsigned int aplic_idc_parent_irq;
> +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> +
> +static void aplic_irq_unmask(struct irq_data *d)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> +
> +       if (!priv->nr_idcs)
> +               irq_chip_unmask_parent(d);
> +}
> +
> +static void aplic_irq_mask(struct irq_data *d)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> +
> +       if (!priv->nr_idcs)
> +               irq_chip_mask_parent(d);
> +}
> +
> +static int aplic_set_type(struct irq_data *d, unsigned int type)
> +{
> +       u32 val = 0;
> +       void __iomem *sourcecfg;
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> +       switch (type) {
> +       case IRQ_TYPE_NONE:
> +               val = APLIC_SOURCECFG_SM_INACTIVE;
> +               break;
> +       case IRQ_TYPE_LEVEL_LOW:
> +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> +               break;
> +       case IRQ_TYPE_LEVEL_HIGH:
> +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> +               break;
> +       case IRQ_TYPE_EDGE_FALLING:
> +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> +               break;
> +       case IRQ_TYPE_EDGE_RISING:
> +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> +       writel(val, sourcecfg);
> +
> +       return 0;
> +}
> +
> +#ifdef CONFIG_SMP
> +static int aplic_set_affinity(struct irq_data *d,
> +                             const struct cpumask *mask_val, bool force)
> +{
> +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +       struct aplic_idc *idc;
> +       unsigned int cpu, val;
> +       struct cpumask amask;
> +       void __iomem *target;
> +
> +       if (!priv->nr_idcs)
> +               return irq_chip_set_affinity_parent(d, mask_val, force);
> +
> +       cpumask_and(&amask, &priv->lmask, mask_val);
> +
> +       if (force)
> +               cpu = cpumask_first(&amask);
> +       else
> +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> +       if (cpu >= nr_cpu_ids)
> +               return -EINVAL;
> +
> +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> +       target = priv->regs + APLIC_TARGET_BASE;
> +       target += (d->hwirq - 1) * sizeof(u32);
> +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> +       val |= APLIC_DEFAULT_PRIORITY;
> +       writel(val, target);
> +
> +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +       return IRQ_SET_MASK_OK_DONE;
> +}
> +#endif
> +
> +static struct irq_chip aplic_chip = {
> +       .name           = "RISC-V APLIC",
> +       .irq_mask       = aplic_irq_mask,
> +       .irq_unmask     = aplic_irq_unmask,
> +       .irq_set_type   = aplic_set_type,
> +#ifdef CONFIG_SMP
> +       .irq_set_affinity = aplic_set_affinity,
> +#endif
> +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> +                         IRQCHIP_SKIP_SET_WAKE |
> +                         IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int aplic_irqdomain_translate(struct irq_domain *d,
> +                                    struct irq_fwspec *fwspec,
> +                                    unsigned long *hwirq,
> +                                    unsigned int *type)
> +{
> +       if (WARN_ON(fwspec->param_count < 2))
> +               return -EINVAL;
> +       if (WARN_ON(!fwspec->param[0]))
> +               return -EINVAL;
> +
> +       *hwirq = fwspec->param[0];
> +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> +
> +       WARN_ON(*type == IRQ_TYPE_NONE);
> +
> +       return 0;
> +}
> +
> +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> +                                    unsigned int virq, unsigned int nr_irqs,
> +                                    void *arg)
> +{
> +       int i, ret;
> +       unsigned int type;
> +       irq_hw_number_t hwirq;
> +       struct irq_fwspec *fwspec = arg;
> +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> +
> +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> +       if (ret)
> +               return ret;
> +
> +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> +       if (ret)
> +               return ret;
> +
> +       for (i = 0; i < nr_irqs; i++)
> +               irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
> +                                             &aplic_chip, priv);
> +
> +       return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> +       .translate      = aplic_irqdomain_translate,
> +       .alloc          = aplic_irqdomain_msi_alloc,
> +       .free           = platform_msi_device_domain_free,
> +};
> +
> +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> +                                    unsigned int virq, unsigned int nr_irqs,
> +                                    void *arg)
> +{
> +       int i, ret;
> +       unsigned int type;
> +       irq_hw_number_t hwirq;
> +       struct irq_fwspec *fwspec = arg;
> +       struct aplic_priv *priv = domain->host_data;
> +
> +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> +       if (ret)
> +               return ret;
> +
> +       for (i = 0; i < nr_irqs; i++) {
> +               irq_domain_set_info(domain, virq + i, hwirq + i,
> +                                   &aplic_chip, priv, handle_simple_irq,
> +                                   NULL, NULL);
> +               irq_set_affinity(virq + i, &priv->lmask);
> +       }
> +
> +       return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> +       .translate      = aplic_irqdomain_translate,
> +       .alloc          = aplic_irqdomain_idc_alloc,
> +       .free           = irq_domain_free_irqs_top,
> +};
> +
> +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> +{
> +       int i;
> +
> +       /* Disable all interrupts */
> +       for (i = 0; i <= priv->nr_irqs; i += 32)
> +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> +                           (i / 32) * sizeof(u32));
> +
> +       /* Set interrupt type and default priority for all interrupts */
> +       for (i = 1; i <= priv->nr_irqs; i++) {
> +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> +                         (i - 1) * sizeof(u32));
> +               writel(APLIC_DEFAULT_PRIORITY,
> +                      priv->regs + APLIC_TARGET_BASE +
> +                      (i - 1) * sizeof(u32));
> +       }
> +
> +       /* Clear APLIC domaincfg */
> +       writel(0, priv->regs + APLIC_DOMAINCFG);
> +}
> +
> +static void aplic_init_hw_global(struct aplic_priv *priv)
> +{
> +       u32 val;
> +#ifdef CONFIG_RISCV_M_MODE
> +       u32 valH;
> +
> +       if (!priv->nr_idcs) {
> +               val = priv->msicfg.base_ppn;
> +               valH = (priv->msicfg.base_ppn >> 32) &
> +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> +       }
> +#endif
> +
> +       /* Setup APLIC domaincfg register */
> +       val = readl(priv->regs + APLIC_DOMAINCFG);
> +       val |= APLIC_DOMAINCFG_IE;
> +       if (!priv->nr_idcs)
> +               val |= APLIC_DOMAINCFG_DM;
> +       writel(val, priv->regs + APLIC_DOMAINCFG);
> +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> +               dev_warn(priv->dev,
> +                        "unable to write 0x%x in domaincfg\n", val);
> +}
> +
> +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +       unsigned int group_index, hart_index, guest_index, val;
> +       struct device *dev = msi_desc_to_dev(desc);
> +       struct aplic_priv *priv = dev_get_drvdata(dev);
> +       struct irq_data *d = irq_get_irq_data(desc->irq);
> +       struct aplic_msicfg *mc = &priv->msicfg;
> +       phys_addr_t tppn, tbppn, msg_addr;
> +       void __iomem *target;
> +
> +       /* For zeroed MSI, simply write zero into the target register */
> +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> +               target = priv->regs + APLIC_TARGET_BASE;
> +               target += (d->hwirq - 1) * sizeof(u32);
> +               writel(0, target);
> +               return;
> +       }
> +
> +       /* Sanity check on message data */
> +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> +
> +       /* Compute target MSI address */
> +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> +
> +       /* Compute target HART Base PPN */
> +       tbppn = tppn;
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> +       WARN_ON(tbppn != mc->base_ppn);
> +
> +       /* Compute target group and hart indexes */
> +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> +       hart_index |= (group_index << mc->lhxw);
> +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> +
> +       /* Compute target guest index */
> +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> +
> +       /* Update IRQ TARGET register */
> +       target = priv->regs + APLIC_TARGET_BASE;
> +       target += (d->hwirq - 1) * sizeof(u32);
> +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> +                               << APLIC_TARGET_HART_IDX_SHIFT;
> +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> +       writel(val, target);
> +}
> +
> +static int aplic_setup_msi(struct aplic_priv *priv)
> +{
> +       struct device *dev = priv->dev;
> +       struct aplic_msicfg *mc = &priv->msicfg;
> +       const struct imsic_global_config *imsic_global;
> +
> +       /*
> +        * The APLIC outgoing MSI config registers assume target MSI
> +        * controller to be RISC-V AIA IMSIC controller.
> +        */
> +       imsic_global = imsic_get_global_config();
> +       if (!imsic_global) {
> +               dev_err(dev, "IMSIC global config not found\n");
> +               return -ENODEV;
> +       }
> +
> +       /* Find number of guest index bits (LHXS) */
> +       mc->lhxs = imsic_global->guest_index_bits;
> +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> +               dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of HART index bits (LHXW) */
> +       mc->lhxw = imsic_global->hart_index_bits;
> +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> +               dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of group index bits (HHXW) */
> +       mc->hhxw = imsic_global->group_index_bits;
> +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> +               dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Find first bit position of group index (HHXS) */
> +       mc->hhxs = imsic_global->group_index_shift;
> +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> +               dev_err(dev, "IMSIC group index shift should be >= %d\n",
> +                       (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> +               return -EINVAL;
> +       }
> +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> +               dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Compute PPN base */
> +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> +
> +       /* Use all possible CPUs as lmask */
> +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> +
> +       return 0;
> +}
> +
> +/*
> + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> + * which returns the highest priority pending interrupt and clears its
> + * pending bit. This process is repeated until the CLAIMI register
> + * returns zero.
> + */
> +static void aplic_idc_handle_irq(struct irq_desc *desc)
> +{
> +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> +       struct irq_chip *chip = irq_desc_get_chip(desc);
> +       irq_hw_number_t hw_irq;
> +       int irq;
> +
> +       chained_irq_enter(chip, desc);
> +
> +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> +
> +               if (unlikely(irq <= 0))
> +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> +                                           hw_irq);
> +               else
> +                       generic_handle_irq(irq);
> +       }
> +
> +       chained_irq_exit(chip, desc);
> +}
> +
> +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> +{
> +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> +
> +       /* Priority must be less than threshold for interrupt triggering */
> +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> +
> +       /* Delivery must be set to 1 for interrupt triggering */
> +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> +}
> +
> +static int aplic_idc_dying_cpu(unsigned int cpu)
> +{
> +       if (aplic_idc_parent_irq)
> +               disable_percpu_irq(aplic_idc_parent_irq);
> +
> +       return 0;
> +}
> +
> +static int aplic_idc_starting_cpu(unsigned int cpu)
> +{
> +       if (aplic_idc_parent_irq)
> +               enable_percpu_irq(aplic_idc_parent_irq,
> +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> +
> +       return 0;
> +}
> +
> +static int aplic_setup_idc(struct aplic_priv *priv)
> +{
> +       int i, j, rc, cpu, setup_count = 0;
> +       struct device_node *node = priv->dev->of_node;
> +       struct device *dev = priv->dev;
> +       struct of_phandle_args parent;
> +       struct irq_domain *domain;
> +       unsigned long hartid;
> +       struct aplic_idc *idc;
> +       u32 val;
> +
> +       /* Setup per-CPU IDC and target CPU mask */
> +       for (i = 0; i < priv->nr_idcs; i++) {
> +               if (of_irq_parse_one(node, i, &parent)) {
> +                       dev_err(dev, "failed to parse parent for IDC%d.\n",
> +                               i);
> +                       return -EIO;
> +               }
> +
> +               /* Skip IDCs which do not connect to external interrupts */
> +               if (parent.args[0] != RV_IRQ_EXT)
> +                       continue;
> +
> +               rc = riscv_of_parent_hartid(parent.np, &hartid);
> +               if (rc) {
> +                       dev_err(dev, "failed to parse hart ID for IDC%d.\n",
> +                               i);
> +                       return rc;
> +               }
> +
> +               cpu = riscv_hartid_to_cpuid(hartid);
> +               if (cpu < 0) {
> +                       dev_warn(dev, "invalid cpuid for IDC%d\n", i);
> +                       continue;
> +               }
> +
> +               cpumask_set_cpu(cpu, &priv->lmask);
> +
> +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> +               WARN_ON(idc->priv);
> +
> +               idc->hart_index = i;
> +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> +               idc->priv = priv;
> +
> +               aplic_idc_set_delivery(idc, true);
> +
> +               /*
> +                * The boot CPU might not have APLIC hart_index = 0, so
> +                * check and update the target registers of all interrupts.
> +                */
> +               if (cpu == smp_processor_id() && idc->hart_index) {
> +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> +                       val |= APLIC_DEFAULT_PRIORITY;
> +                       for (j = 1; j <= priv->nr_irqs; j++)
> +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> +                                           (j - 1) * sizeof(u32));
> +               }
> +
> +               setup_count++;
> +       }
> +
> +       /* Find parent domain and register chained handler */
> +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> +                                         DOMAIN_BUS_ANY);
> +       if (!aplic_idc_parent_irq && domain) {
> +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> +               if (aplic_idc_parent_irq) {
> +                       irq_set_chained_handler(aplic_idc_parent_irq,
> +                                               aplic_idc_handle_irq);
> +
> +                       /*
> +                        * Setup CPUHP notifier to enable IDC parent
> +                        * interrupt on all CPUs
> +                        */
> +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> +                                         "irqchip/riscv/aplic:starting",
> +                                         aplic_idc_starting_cpu,
> +                                         aplic_idc_dying_cpu);
> +               }
> +       }
> +
> +       /* Fail if we were not able to setup IDC for any CPU */
> +       return (setup_count) ? 0 : -ENODEV;
> +}
> +
> +static int aplic_probe(struct platform_device *pdev)
> +{
> +       struct device_node *node = pdev->dev.of_node;
> +       struct device *dev = &pdev->dev;
> +       struct aplic_priv *priv;
> +       struct resource *regs;
> +       phys_addr_t pa;
> +       int rc;
> +
> +       regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       if (!regs) {
> +               dev_err(dev, "cannot find registers resource\n");
> +               return -ENOENT;
> +       }
> +
> +       priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +       if (!priv)
> +               return -ENOMEM;
> +       platform_set_drvdata(pdev, priv);
> +       priv->dev = dev;
> +
> +       priv->regs = devm_ioremap(dev, regs->start, resource_size(regs));
> +       if (WARN_ON(!priv->regs)) {
> +               dev_err(dev, "failed ioremap registers\n");
> +               return -EIO;
> +       }
> +
> +       of_property_read_u32(node, "riscv,num-sources", &priv->nr_irqs);
> +       if (!priv->nr_irqs) {
> +               dev_err(dev, "failed to get number of interrupt sources\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Setup initial state of APLIC interrupts */
> +       aplic_init_hw_irqs(priv);
> +
> +       /*
> +        * Setup IDCs or MSIs based on parent interrupts in DT node
> +        *
> +        * If "msi-parent" DT property is present then we ignore the
> +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> +        */
> +       priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
> +                       0 : of_irq_count(node);
> +       if (priv->nr_idcs)
> +               rc = aplic_setup_idc(priv);
> +       else
> +               rc = aplic_setup_msi(priv);
> +       if (rc)
> +               return rc;
> +
> +       /* Setup global config and interrupt delivery */
> +       aplic_init_hw_global(priv);
> +
> +       /* Create irq domain instance for the APLIC */
> +       if (priv->nr_idcs)
> +               priv->irqdomain = irq_domain_create_linear(
> +                                               of_node_to_fwnode(node),
> +                                               priv->nr_irqs + 1,
> +                                               &aplic_irqdomain_idc_ops,
> +                                               priv);
> +       else
> +               priv->irqdomain = platform_msi_create_device_domain(dev,
> +                                               priv->nr_irqs + 1,
> +                                               aplic_msi_write_msg,
> +                                               &aplic_irqdomain_msi_ops,
> +                                               priv);
> +       if (!priv->irqdomain) {
> +               dev_err(dev, "failed to add irq domain\n");
> +               return -ENOMEM;
> +       }
> +
> +       /* Advertise the interrupt controller */
> +       if (priv->nr_idcs) {
> +               dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
> +                        priv->nr_irqs, priv->nr_idcs);
> +       } else {
> +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> +               dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
> +                        priv->nr_irqs, &pa);
> +       }
> +
> +       return 0;
> +}
> +
> +static int aplic_remove(struct platform_device *pdev)
> +{
> +       struct aplic_priv *priv = platform_get_drvdata(pdev);
> +
> +       irq_domain_remove(priv->irqdomain);
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id aplic_match[] = {
> +       { .compatible = "riscv,aplic" },
> +       {}
> +};
> +
> +static struct platform_driver aplic_driver = {
> +       .driver = {
> +               .name           = "riscv-aplic",
> +               .of_match_table = aplic_match,
> +       },
> +       .probe = aplic_probe,
> +       .remove = aplic_remove,
> +};
> +
> +static int __init aplic_init(void)
> +{
> +       return platform_driver_register(&aplic_driver);
> +}
> +core_initcall(aplic_init);
> diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> new file mode 100644
> index 000000000000..88177eefd411
> --- /dev/null
> +++ b/include/linux/irqchip/riscv-aplic.h
> @@ -0,0 +1,117 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> +
> +#include <linux/bitops.h>
> +
> +#define APLIC_MAX_IDC                  BIT(14)
> +#define APLIC_MAX_SOURCE               1024
> +
> +#define APLIC_DOMAINCFG                        0x0000
> +#define APLIC_DOMAINCFG_RDONLY         0x80000000
> +#define APLIC_DOMAINCFG_IE             BIT(8)
> +#define APLIC_DOMAINCFG_DM             BIT(2)
> +#define APLIC_DOMAINCFG_BE             BIT(0)
> +
> +#define APLIC_SOURCECFG_BASE           0x0004
> +#define APLIC_SOURCECFG_D              BIT(10)
> +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x000003ff
> +#define APLIC_SOURCECFG_SM_MASK        0x00000007
> +#define APLIC_SOURCECFG_SM_INACTIVE    0x0
> +#define APLIC_SOURCECFG_SM_DETACH      0x1
> +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
> +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
> +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
> +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
> +
> +#define APLIC_MMSICFGADDR              0x1bc0
> +#define APLIC_MMSICFGADDRH             0x1bc4
> +#define APLIC_SMSICFGADDR              0x1bc8
> +#define APLIC_SMSICFGADDRH             0x1bcc
> +
> +#ifdef CONFIG_RISCV_M_MODE
> +#define APLIC_xMSICFGADDR              APLIC_MMSICFGADDR
> +#define APLIC_xMSICFGADDRH             APLIC_MMSICFGADDRH
> +#else
> +#define APLIC_xMSICFGADDR              APLIC_SMSICFGADDR
> +#define APLIC_xMSICFGADDRH             APLIC_SMSICFGADDRH
> +#endif
> +
> +#define APLIC_xMSICFGADDRH_L           BIT(31)
> +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
> +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
> +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
> +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
> +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
> +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
> +#define APLIC_xMSICFGADDRH_LHXW_MASK   0xf
> +#define APLIC_xMSICFGADDRH_LHXW_SHIFT  12
> +#define APLIC_xMSICFGADDRH_BAPPN_MASK  0xfff
> +
> +#define APLIC_xMSICFGADDR_PPN_SHIFT    12
> +
> +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> +       (BIT(__lhxs) - 1)
> +
> +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> +       (BIT(__lhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> +       ((__lhxs))
> +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> +       (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> +        APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> +
> +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> +       (BIT(__hhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> +       ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> +       (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> +        APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> +
> +#define APLIC_SETIP_BASE               0x1c00
> +#define APLIC_SETIPNUM                 0x1cdc
> +
> +#define APLIC_CLRIP_BASE               0x1d00
> +#define APLIC_CLRIPNUM                 0x1ddc
> +
> +#define APLIC_SETIE_BASE               0x1e00
> +#define APLIC_SETIENUM                 0x1edc
> +
> +#define APLIC_CLRIE_BASE               0x1f00
> +#define APLIC_CLRIENUM                 0x1fdc
> +
> +#define APLIC_SETIPNUM_LE              0x2000
> +#define APLIC_SETIPNUM_BE              0x2004
> +
> +#define APLIC_GENMSI                   0x3000
> +
> +#define APLIC_TARGET_BASE              0x3004
> +#define APLIC_TARGET_HART_IDX_SHIFT    18
> +#define APLIC_TARGET_HART_IDX_MASK     0x3fff
> +#define APLIC_TARGET_GUEST_IDX_SHIFT   12
> +#define APLIC_TARGET_GUEST_IDX_MASK    0x3f
> +#define APLIC_TARGET_IPRIO_MASK        0xff
> +#define APLIC_TARGET_EIID_MASK 0x7ff
> +
> +#define APLIC_IDC_BASE                 0x4000
> +#define APLIC_IDC_SIZE                 32
> +
> +#define APLIC_IDC_IDELIVERY            0x00
> +
> +#define APLIC_IDC_IFORCE               0x04
> +
> +#define APLIC_IDC_ITHRESHOLD           0x08
> +
> +#define APLIC_IDC_TOPI                 0x18
> +#define APLIC_IDC_TOPI_ID_SHIFT        16
> +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> +#define APLIC_IDC_TOPI_PRIO_MASK       0xff
> +
> +#define APLIC_IDC_CLAIMI               0x1c
> +
> +#endif
> --
> 2.34.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/9] RISC-V: Add AIA related CSR defines
  2023-01-09  5:09       ` Anup Patel
@ 2023-01-17 20:42         ` Conor Dooley
  -1 siblings, 0 replies; 72+ messages in thread
From: Conor Dooley @ 2023-01-17 20:42 UTC (permalink / raw)
  To: Anup Patel
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Atish Patra,
	Alistair Francis, linux-riscv, linux-kernel, devicetree


[-- Attachment #1.1: Type: text/plain, Size: 2636 bytes --]

Hey Anup,

I thought I had already replied here but clearly not, sorry!

On Mon, Jan 09, 2023 at 10:39:08AM +0530, Anup Patel wrote:
> On Thu, Jan 5, 2023 at 4:37 AM Conor Dooley <conor@kernel.org> wrote:
> > On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:

> > > +/* AIA CSR bits */
> > > +#define TOPI_IID_SHIFT               16
> > > +#define TOPI_IID_MASK                0xfff

While I think of it, it'd be worth noting that these are generic across
all of topi, mtopi etc. Initially I thought that this mask was wrong as
the topi section says:
	bits 25:16 Interrupt identity (source number)
	bits 7:0 Interrupt priority

> > > +#define TOPI_IPRIO_MASK              0xff
> > > +#define TOPI_IPRIO_BITS              8
> > > +
> > > +#define TOPEI_ID_SHIFT               16
> > > +#define TOPEI_ID_MASK                0x7ff
> > > +#define TOPEI_PRIO_MASK              0x7ff
> > > +
> > > +#define ISELECT_IPRIO0               0x30
> > > +#define ISELECT_IPRIO15              0x3f
> > > +#define ISELECT_MASK         0x1ff
> > > +
> > > +#define HVICTL_VTI           0x40000000
> > > +#define HVICTL_IID           0x0fff0000
> > > +#define HVICTL_IID_SHIFT     16
> > > +#define HVICTL_IPRIOM                0x00000100
> > > +#define HVICTL_IPRIO         0x000000ff
> >
> > Why not name these as masks, like you did for the other masks?
> > Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
> > intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
> > to be used *pre*-shift.
> > Some consistency in naming and function would be great.
> 
> The following convention is being followed in asm/csr.h for defining
> MASK of any XYZ field in ABC CSR:
> 1. ABC_XYZ : This name is used for MASK which is intended
>    to be used before SHIFT
> 2. ABC_XYZ_MASK: This name is used for MASK which is
>    intended to be used after SHIFT

Which makes sense in theory.
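
In concrete terms (illustrative only, with hypothetical local variables
and the defines quoted above) the two styles read as:

        /* style 1: mask covers the field in place, used before the shift */
        iid = (hvictl & HVICTL_IID) >> HVICTL_IID_SHIFT;

        /* style 2: mask covers the low-order bits, used after the shift */
        iid = (topi >> TOPI_IID_SHIFT) & TOPI_IID_MASK;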

> The existing defines for [M|S]STATUS, HSTATUS, SATP, and xENVCFG
> follows the above convention. The only outlier is HGATPx_VMID_MASK
> define which I will fix in my next KVM RISC-V series.

Yup, it is liable to end up like that.

> I don't see how any of the AIA CSR defines are violating the above
> convention.

What I was advocating for was picking one style and sticking to it.
These copy-paste from docs things are tedious and error prone to review,
and I don't think having multiple styles is helpful.

Tedious as it was, I did check all the numbers though, so in that
respect:
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Thanks,
Conor.


[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Fwd: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
       [not found]     ` <CABvJ_xhcuC92A_oo1mWQoRvtRzE8XXx9bbXKs7N7wKm0=Z3_Cw@mail.gmail.com>
@ 2023-01-18  3:49         ` Vincent Chen
  0 siblings, 0 replies; 72+ messages in thread
From: Vincent Chen @ 2023-01-18  3:49 UTC (permalink / raw)
  To: apatel
  Cc: Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel@vger.kernel.org List,
	devicetree

> From: Anup Patel <apatel@ventanamicro.com>
> Date: Wed, Jan 4, 2023 at 1:19 AM
> Subject: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
> To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
>
>
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new MSI controller for managing MSIs on a RISC-V platform. This new
> MSI controller is referred to as incoming message signaled interrupt
> controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> (For more details refer https://github.com/riscv/riscv-aia)
>
> This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> platforms.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/Kconfig             |   14 +-
>  drivers/irqchip/Makefile            |    1 +
>  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
>  include/linux/irqchip/riscv-imsic.h |   92 +++
>  4 files changed, 1280 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
>  create mode 100644 include/linux/irqchip/riscv-imsic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 9e65345ca3f6..a1315189a595 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -29,7 +29,6 @@ config ARM_GIC_V2M
>
>  config GIC_NON_BANKED
>         bool
> -
>  config ARM_GIC_V3
>         bool
>         select IRQ_DOMAIN_HIERARCHY
> @@ -548,6 +547,19 @@ config SIFIVE_PLIC
>         select IRQ_DOMAIN_HIERARCHY
>         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_IMSIC
> +       bool
> +       depends on RISCV
> +       select IRQ_DOMAIN_HIERARCHY
> +       select GENERIC_MSI_IRQ_DOMAIN
> +
> +config RISCV_IMSIC_PCI
> +       bool
> +       depends on RISCV_IMSIC
> +       depends on PCI
> +       depends on PCI_MSI
> +       default RISCV_IMSIC
> +
>  config EXYNOS_IRQ_COMBINER
>         bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
>         depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 87b49a10962c..22c723cc6ec8 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
>  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
>  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
>  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
>  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
>  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
>  obj-$(CONFIG_IMX_INTMUX)               += irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> new file mode 100644
> index 000000000000..4c16b66738d6
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic.c
> @@ -0,0 +1,1174 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/bitmap.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +#include <asm/hwcap.h>
> +
> +#define IMSIC_DISABLE_EIDELIVERY       0
> +#define IMSIC_ENABLE_EIDELIVERY                1
> +#define IMSIC_DISABLE_EITHRESHOLD      1
> +#define IMSIC_ENABLE_EITHRESHOLD       0
> +
> +#define imsic_csr_write(__c, __v)      \
> +do {                                   \
> +       csr_write(CSR_ISELECT, __c);    \
> +       csr_write(CSR_IREG, __v);       \
> +} while (0)
> +
> +#define imsic_csr_read(__c)            \
> +({                                     \
> +       unsigned long __v;              \
> +       csr_write(CSR_ISELECT, __c);    \
> +       __v = csr_read(CSR_IREG);       \
> +       __v;                            \
> +})
> +
> +#define imsic_csr_set(__c, __v)                \
> +do {                                   \
> +       csr_write(CSR_ISELECT, __c);    \
> +       csr_set(CSR_IREG, __v);         \
> +} while (0)
> +
> +#define imsic_csr_clear(__c, __v)      \
> +do {                                   \
> +       csr_write(CSR_ISELECT, __c);    \
> +       csr_clear(CSR_IREG, __v);       \
> +} while (0)
> +
> +struct imsic_mmio {
> +       phys_addr_t pa;
> +       void __iomem *va;
> +       unsigned long size;
> +};
> +
> +struct imsic_priv {
> +       /* Global configuration common for all HARTs */
> +       struct imsic_global_config global;
> +
> +       /* MMIO regions */
> +       u32 num_mmios;
> +       struct imsic_mmio *mmios;
> +
> +       /* Global state of interrupt identities */
> +       raw_spinlock_t ids_lock;
> +       unsigned long *ids_used_bimap;
> +       unsigned long *ids_enabled_bimap;
> +       unsigned int *ids_target_cpu;
> +
> +       /* Mask for connected CPUs */
> +       struct cpumask lmask;
> +
> +       /* IPI interrupt identity */
> +       u32 ipi_id;
> +       u32 ipi_lsync_id;
> +
> +       /* IRQ domains */
> +       struct irq_domain *base_domain;
> +       struct irq_domain *pci_domain;
> +       struct irq_domain *plat_domain;
> +};
> +
> +struct imsic_handler {
> +       /* Local configuration for given HART */
> +       struct imsic_local_config local;
> +
> +       /* Pointer to private context */
> +       struct imsic_priv *priv;
> +};
> +
> +static bool imsic_init_done;
> +
> +static int imsic_parent_irq;
> +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> +
> +const struct imsic_global_config *imsic_get_global_config(void)
> +{
> +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +
> +       if (!handler || !handler->priv)
> +               return NULL;
> +
> +       return &handler->priv->global;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> +
> +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> +{
> +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +       if (!handler || !handler->priv)
> +               return NULL;
> +
> +       return &handler->local;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_local_config);
> +
> +static int imsic_cpu_page_phys(unsigned int cpu,
> +                              unsigned int guest_index,
> +                              phys_addr_t *out_msi_pa)
> +{
> +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +       struct imsic_global_config *global;
> +       struct imsic_local_config *local;
> +
> +       if (!handler || !handler->priv)
> +               return -ENODEV;
> +       local = &handler->local;
> +       global = &handler->priv->global;
> +
> +       if (BIT(global->guest_index_bits) <= guest_index)
> +               return -EINVAL;
> +
> +       if (out_msi_pa)
> +               *out_msi_pa = local->msi_pa +
> +                             (guest_index * IMSIC_MMIO_PAGE_SZ);
> +
> +       return 0;
> +}
> +
> +static int imsic_get_cpu(struct imsic_priv *priv,
> +                        const struct cpumask *mask_val, bool force,
> +                        unsigned int *out_target_cpu)
> +{
> +       struct cpumask amask;
> +       unsigned int cpu;
> +
> +       cpumask_and(&amask, &priv->lmask, mask_val);
> +
> +       if (force)
> +               cpu = cpumask_first(&amask);
> +       else
> +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> +       if (cpu >= nr_cpu_ids)
> +               return -EINVAL;
> +
> +       if (out_target_cpu)
> +               *out_target_cpu = cpu;
> +
> +       return 0;
> +}
> +
> +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> +                                struct msi_msg *msg)
> +{
> +       phys_addr_t msi_addr;
> +       int err;
> +
> +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +       if (err)
> +               return err;
> +
> +       msg->address_hi = upper_32_bits(msi_addr);
> +       msg->address_lo = lower_32_bits(msi_addr);
> +       msg->data = id;
> +
> +       return err;
> +}
> +
> +static void imsic_id_set_target(struct imsic_priv *priv,
> +                                unsigned int id, unsigned int target_cpu)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       priv->ids_target_cpu[id] = target_cpu;
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> +                                       unsigned int id)
> +{
> +       unsigned int ret;
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       ret = priv->ids_target_cpu[id];
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       return ret;
> +}
> +
> +static void __imsic_eix_update(unsigned long base_id,
> +                              unsigned long num_id, bool pend, bool val)
> +{
> +       unsigned long i, isel, ireg, flags;
> +       unsigned long id = base_id, last_id = base_id + num_id;
> +
> +       while (id < last_id) {
> +               isel = id / BITS_PER_LONG;
> +               isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> +               isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> +
> +               ireg = 0;
> +               for (i = id & (__riscv_xlen - 1);
> +                    (id < last_id) && (i < __riscv_xlen); i++) {
> +                       ireg |= BIT(i);
> +                       id++;
> +               }
> +
> +               /*
> +                * The IMSIC EIEx and EIPx registers are indirectly
> +                * accessed using the ISELECT and IREG CSRs, so we
> +                * save/restore local IRQ to ensure that we don't
> +                * get preempted while accessing IMSIC registers.
> +                */
> +               local_irq_save(flags);
> +               if (val)
> +                       imsic_csr_set(isel, ireg);
> +               else
> +                       imsic_csr_clear(isel, ireg);
> +               local_irq_restore(flags);
> +       }
> +}
> +
> +#define __imsic_id_enable(__id)                \
> +       __imsic_eix_update((__id), 1, false, true)
> +#define __imsic_id_disable(__id)       \
> +       __imsic_eix_update((__id), 1, false, false)
> +
> +#ifdef CONFIG_SMP
> +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> +{
> +       struct imsic_handler *handler;
> +       struct cpumask amask;
> +       int cpu;
> +
> +       cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> +       for_each_cpu(cpu, &amask) {
> +               if (cpu == smp_processor_id())
> +                       continue;
> +
> +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> +               if (!handler || !handler->priv || !handler->local.msi_va) {
> +                       pr_warn("CPU%d: handler not initialized\n", cpu);
> +                       continue;
> +               }
> +
> +               writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
> +       }
> +}
> +#else
> +#define __imsic_id_smp_sync(__priv)
> +#endif
> +
> +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       bitmap_set(priv->ids_enabled_bimap, id, 1);
> +       __imsic_id_enable(id);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       __imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       bitmap_clear(priv->ids_enabled_bimap, id, 1);
> +       __imsic_id_disable(id);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       __imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_ids_local_sync(struct imsic_priv *priv)
> +{
> +       int i;
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       for (i = 1; i <= priv->global.nr_ids; i++) {
> +               if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> +                       continue;
> +
> +               if (test_bit(i, priv->ids_enabled_bimap))
> +                       __imsic_id_enable(i);
> +               else
> +                       __imsic_id_disable(i);
> +       }
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> +{
> +       if (enable) {
> +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> +       } else {
> +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> +       }
> +}
> +
> +static int imsic_ids_alloc(struct imsic_priv *priv,
> +                          unsigned int max_id, unsigned int order)
> +{
> +       int ret;
> +       unsigned long flags;
> +
> +       if ((priv->global.nr_ids < max_id) ||
> +           (max_id < BIT(order)))
> +               return -EINVAL;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       ret = bitmap_find_free_region(priv->ids_used_bimap,
> +                                     max_id + 1, order);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       return ret;
> +}
> +
> +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> +                          unsigned int order)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       bitmap_release_region(priv->ids_used_bimap, base_id, order);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static int __init imsic_ids_init(struct imsic_priv *priv)
> +{
> +       int i;
> +       struct imsic_global_config *global = &priv->global;
> +
> +       raw_spin_lock_init(&priv->ids_lock);
> +
> +       /* Allocate used bitmap */
> +       priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +                                       sizeof(unsigned long), GFP_KERNEL);
> +       if (!priv->ids_used_bimap)
> +               return -ENOMEM;
> +
> +       /* Allocate enabled bitmap */
> +       priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +                                          sizeof(unsigned long), GFP_KERNEL);
> +       if (!priv->ids_enabled_bimap) {
> +               kfree(priv->ids_used_bimap);
> +               return -ENOMEM;
> +       }
> +
> +       /* Allocate target CPU array */
> +       priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> +                                      sizeof(unsigned int), GFP_KERNEL);
> +       if (!priv->ids_target_cpu) {
> +               kfree(priv->ids_enabled_bimap);
> +               kfree(priv->ids_used_bimap);
> +               return -ENOMEM;
> +       }
> +       for (i = 0; i <= global->nr_ids; i++)
> +               priv->ids_target_cpu[i] = UINT_MAX;
> +
> +       /* Reserve ID#0 because it is special and never implemented */
> +       bitmap_set(priv->ids_used_bimap, 0, 1);
> +
> +       return 0;
> +}
> +
> +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> +{
> +       kfree(priv->ids_target_cpu);
> +       kfree(priv->ids_enabled_bimap);
> +       kfree(priv->ids_used_bimap);
> +}
> +
> +#ifdef CONFIG_SMP
> +static void imsic_ipi_send(unsigned int cpu)
> +{
> +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +       if (!handler || !handler->priv || !handler->local.msi_va) {
> +               pr_warn("CPU%d: handler not initialized\n", cpu);
> +               return;
> +       }
> +
> +       writel(handler->priv->ipi_id, handler->local.msi_va);
> +}
> +
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +       __imsic_id_enable(priv->ipi_id);
> +       __imsic_id_enable(priv->ipi_lsync_id);
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +       int virq;
> +
> +       /* Allocate interrupt identity for IPIs */
> +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +       if (virq < 0)
> +               return virq;
> +       priv->ipi_id = virq;
> +
> +       /* Create IMSIC IPI multiplexing */
> +       virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
> +       if (virq <= 0) {
> +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +               return (virq < 0) ? virq : -ENOMEM;
> +       }
> +
> +       /* Set vIRQ range */
> +       riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> +
> +       /* Allocate interrupt identity for local enable/disable sync */
> +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +       if (virq < 0) {
> +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +               return virq;
> +       }
> +       priv->ipi_lsync_id = virq;
> +
> +       return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +       imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> +       if (priv->ipi_id)
> +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +}
> +#else
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +       /* Clear the IPI ids because we are not using IPIs */
> +       priv->ipi_id = 0;
> +       priv->ipi_lsync_id = 0;
> +       return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +}
> +#endif
> +
> +static void imsic_irq_mask(struct irq_data *d)
> +{
> +       imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_unmask(struct irq_data *d)
> +{
> +       imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> +                                     struct msi_msg *msg)
> +{
> +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +       unsigned int cpu;
> +       int err;
> +
> +       cpu = imsic_id_get_target(priv, d->hwirq);
> +       WARN_ON(cpu == UINT_MAX);
> +
> +       err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> +       WARN_ON(err);
> +
> +       iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> +}
> +
> +#ifdef CONFIG_SMP
> +static int imsic_irq_set_affinity(struct irq_data *d,
> +                                 const struct cpumask *mask_val,
> +                                 bool force)
> +{
> +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +       unsigned int target_cpu;
> +       int rc;
> +
> +       rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> +       if (rc)
> +               return rc;
> +
> +       imsic_id_set_target(priv, d->hwirq, target_cpu);
> +       irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> +
> +       return IRQ_SET_MASK_OK;
> +}
> +#endif
> +
> +static struct irq_chip imsic_irq_base_chip = {
> +       .name                   = "RISC-V IMSIC-BASE",
> +       .irq_mask               = imsic_irq_mask,
> +       .irq_unmask             = imsic_irq_unmask,
> +#ifdef CONFIG_SMP
> +       .irq_set_affinity       = imsic_irq_set_affinity,
> +#endif
> +       .irq_compose_msi_msg    = imsic_irq_compose_msi_msg,
> +       .flags                  = IRQCHIP_SKIP_SET_WAKE |
> +                                 IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> +                                 unsigned int virq,
> +                                 unsigned int nr_irqs,
> +                                 void *args)
> +{
> +       struct imsic_priv *priv = domain->host_data;
> +       msi_alloc_info_t *info = args;
> +       phys_addr_t msi_addr;
> +       int i, hwirq, err = 0;
> +       unsigned int cpu;
> +
> +       err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> +       if (err)
> +               return err;
> +
> +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +       if (err)
> +               return err;
> +
> +       hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> +                               get_count_order(nr_irqs));
> +       if (hwirq < 0)
> +               return hwirq;
> +
> +       err = iommu_dma_prepare_msi(info->desc, msi_addr);

Hi Anup,
First of all, thank you for completing this patch set to support all
AIA features. After investigating this patch, I'm concerned that it
may have an issue with changing CPU affinity.

As far as I understand, imsic_irq_domain_alloc() is called only once
for a device, when that device registers its IRQ, which means
iommu_dma_prepare_msi() is also called only once. When a device sits
behind an IOMMU, iommu_dma_prepare_msi() allocates an IOVA for the
device's physical MSI address, calls iommu_map() to create the
mapping between this IOVA and the physical MSI address, and then
calls msi_desc_set_iommu_cookie() to store the IOVA in the MSI
descriptor's iommu_cookie. Afterwards, iommu_dma_compose_msi_msg(),
called from imsic_irq_compose_msi_msg(), uses desc->iommu_cookie
directly to compose the MSI address for this device. However, as
mentioned above, iommu_dma_prepare_msi() runs only when the device
registers its IRQ, so it allocates an IOVA only for the MSI page of
the initially chosen CPU and stores that IOVA in desc->iommu_cookie.
In this situation, I worry that changing the CPU affinity will not
work for such a device: the IMSIC driver never creates an IOMMU
mapping for the MSI address of the new target CPU, and
desc->iommu_cookie still refers to the stale mapping for the old CPU.

One possible solution is to create the IOVA mappings for the IMSIC
MSI pages of all target CPUs in imsic_irq_domain_alloc(), and then,
when the IRQ affinity moves to a new CPU, have the IMSIC driver
update desc->iommu_cookie with the IOVA of that CPU's MSI page.
However, I'm not sure whether updating desc->iommu_cookie on every
affinity change would go against the original intent of the cookie.
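
To make the second half of that idea concrete, here is a rough,
untested sketch of what the affinity path could look like if the
driver refreshed the MSI doorbell mapping for the new target CPU
before retargeting the interrupt. It reuses the helpers introduced in
this patch (imsic_get_cpu(), imsic_cpu_page_phys(),
imsic_id_set_target()) and simply calls iommu_dma_prepare_msi() again
with the new CPU's MSI address. Whether that call is even allowed
from the set_affinity path (it can sleep, while affinity changes
typically happen with the descriptor lock held and interrupts
disabled) is part of the open question above, so please treat this
purely as an illustration of the concern, not as a proposed fix:

static int imsic_irq_set_affinity(struct irq_data *d,
                                  const struct cpumask *mask_val,
                                  bool force)
{
        struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
        struct msi_desc *desc = irq_data_get_msi_desc(d);
        unsigned int target_cpu;
        phys_addr_t msi_addr;
        int rc;

        rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
        if (rc)
                return rc;

        rc = imsic_cpu_page_phys(target_cpu, 0, &msi_addr);
        if (rc)
                return rc;

        /*
         * Hypothetical step: map (or look up) the MSI page of the new
         * target CPU and refresh desc->iommu_cookie so that the next
         * imsic_irq_compose_msi_msg() composes an address which the
         * IOMMU actually translates to the new CPU's IMSIC file.
         */
        if (desc) {
                rc = iommu_dma_prepare_msi(desc, msi_addr);
                if (rc)
                        return rc;
        }

        imsic_id_set_target(priv, d->hwirq, target_cpu);
        irq_data_update_effective_affinity(d, cpumask_of(target_cpu));

        return IRQ_SET_MASK_OK;
}

Pre-mapping the MSI pages of all target CPUs in
imsic_irq_domain_alloc(), as suggested above, would at least avoid
the allocation in this path, but the question of refreshing the
cookie at affinity-change time would remain.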


> +       if (err)
> +               goto fail;
> +
> +       for (i = 0; i < nr_irqs; i++) {
> +               imsic_id_set_target(priv, hwirq + i, cpu);
> +               irq_domain_set_info(domain, virq + i, hwirq + i,
> +                                   &imsic_irq_base_chip, priv,
> +                                   handle_simple_irq, NULL, NULL);
> +               irq_set_noprobe(virq + i);
> +               irq_set_affinity(virq + i, &priv->lmask);
> +       }
> +
> +       return 0;
> +
> +fail:
> +       imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> +       return err;
> +}
> +
> +static void imsic_irq_domain_free(struct irq_domain *domain,
> +                                 unsigned int virq,
> +                                 unsigned int nr_irqs)
> +{
> +       struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> +       struct imsic_priv *priv = domain->host_data;
> +
> +       imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> +       irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> +}
> +
> +static const struct irq_domain_ops imsic_base_domain_ops = {
> +       .alloc          = imsic_irq_domain_alloc,
> +       .free           = imsic_irq_domain_free,
> +};
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +
> +static void imsic_pci_mask_irq(struct irq_data *d)
> +{
> +       pci_msi_mask_irq(d);
> +       irq_chip_mask_parent(d);
> +}
> +
> +static void imsic_pci_unmask_irq(struct irq_data *d)
> +{
> +       pci_msi_unmask_irq(d);
> +       irq_chip_unmask_parent(d);
> +}
> +
> +static struct irq_chip imsic_pci_irq_chip = {
> +       .name                   = "RISC-V IMSIC-PCI",
> +       .irq_mask               = imsic_pci_mask_irq,
> +       .irq_unmask             = imsic_pci_unmask_irq,
> +       .irq_eoi                = irq_chip_eoi_parent,
> +};
> +
> +static struct msi_domain_ops imsic_pci_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_pci_domain_info = {
> +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> +                  MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> +       .ops    = &imsic_pci_domain_ops,
> +       .chip   = &imsic_pci_irq_chip,
> +};
> +
> +#endif
> +
> +static struct irq_chip imsic_plat_irq_chip = {
> +       .name                   = "RISC-V IMSIC-PLAT",
> +};
> +
> +static struct msi_domain_ops imsic_plat_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_plat_domain_info = {
> +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> +       .ops    = &imsic_plat_domain_ops,
> +       .chip   = &imsic_plat_irq_chip,
> +};
> +
> +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> +                                        struct fwnode_handle *fwnode)
> +{
> +       /* Create Base IRQ domain */
> +       priv->base_domain = irq_domain_create_tree(fwnode,
> +                                               &imsic_base_domain_ops, priv);
> +       if (!priv->base_domain) {
> +               pr_err("Failed to create IMSIC base domain\n");
> +               return -ENOMEM;
> +       }
> +       irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +       /* Create PCI MSI domain */
> +       priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> +                                               &imsic_pci_domain_info,
> +                                               priv->base_domain);
> +       if (!priv->pci_domain) {
> +               pr_err("Failed to create IMSIC PCI domain\n");
> +               irq_domain_remove(priv->base_domain);
> +               return -ENOMEM;
> +       }
> +#endif
> +
> +       /* Create Platform MSI domain */
> +       priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> +                                               &imsic_plat_domain_info,
> +                                               priv->base_domain);
> +       if (!priv->plat_domain) {
> +               pr_err("Failed to create IMSIC platform domain\n");
> +               if (priv->pci_domain)
> +                       irq_domain_remove(priv->pci_domain);
> +               irq_domain_remove(priv->base_domain);
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
> +}
> +
> +/*
> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> + * Linux interrupt number and let Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +       struct irq_chip *chip = irq_desc_get_chip(desc);
> +       struct imsic_priv *priv = handler->priv;
> +       irq_hw_number_t hwirq;
> +       int err;
> +
> +       WARN_ON_ONCE(!handler->priv);
> +
> +       chained_irq_enter(chip, desc);
> +
> +       while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> +               hwirq = hwirq >> TOPEI_ID_SHIFT;
> +
> +               if (hwirq == priv->ipi_id) {
> +#ifdef CONFIG_SMP
> +                       ipi_mux_process();
> +#endif
> +                       continue;
> +               } else if (hwirq == priv->ipi_lsync_id) {
> +                       imsic_ids_local_sync(priv);
> +                       continue;
> +               }
> +
> +               err = generic_handle_domain_irq(priv->base_domain, hwirq);
> +               if (unlikely(err))
> +                       pr_warn_ratelimited(
> +                               "hwirq %lu mapping not found\n", hwirq);
> +       }
> +
> +       chained_irq_exit(chip, desc);
> +}
> +
> +static int imsic_starting_cpu(unsigned int cpu)
> +{
> +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +       struct imsic_priv *priv = handler->priv;
> +
> +       /* Enable per-CPU parent interrupt */
> +       if (imsic_parent_irq)
> +               enable_percpu_irq(imsic_parent_irq,
> +                                 irq_get_trigger_type(imsic_parent_irq));
> +       else
> +               pr_warn("cpu%d: parent irq not available\n", cpu);
> +
> +       /* Enable IPIs */
> +       imsic_ipi_enable(priv);
> +
> +       /*
> +        * Interrupt identities might have been enabled/disabled while
> +        * this CPU was not running so sync-up local enable/disable state.
> +        */
> +       imsic_ids_local_sync(priv);
> +
> +       /* Locally enable interrupt delivery */
> +       imsic_ids_local_delivery(priv, true);
> +
> +       return 0;
> +}
> +
> +struct imsic_fwnode_ops {
> +       u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> +                            void *fwopaque);
> +       int (*parent_hartid)(struct fwnode_handle *fwnode,
> +                            void *fwopaque, u32 index,
> +                            unsigned long *out_hartid);
> +       u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> +       int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> +                               void *fwopaque, u32 index,
> +                               struct resource *res);
> +       void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> +                                 void *fwopaque, u32 index);
> +       int (*read_u32)(struct fwnode_handle *fwnode,
> +                       void *fwopaque, const char *prop, u32 *out_val);
> +       bool (*read_bool)(struct fwnode_handle *fwnode,
> +                         void *fwopaque, const char *prop);
> +};
> +
> +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> +                            struct fwnode_handle *fwnode,
> +                            void *fwopaque)
> +{
> +       struct resource res;
> +       phys_addr_t base_addr;
> +       int rc, nr_parent_irqs;
> +       struct imsic_mmio *mmio;
> +       struct imsic_priv *priv;
> +       struct irq_domain *domain;
> +       struct imsic_handler *handler;
> +       struct imsic_global_config *global;
> +       u32 i, tmp, nr_handlers = 0;
> +
> +       if (imsic_init_done) {
> +               pr_err("%pfwP: already initialized hence ignoring\n",
> +                       fwnode);
> +               return -ENODEV;
> +       }
> +
> +       if (!riscv_isa_extension_available(NULL, SxAIA)) {
> +               pr_err("%pfwP: AIA support not available\n", fwnode);
> +               return -ENODEV;
> +       }
> +
> +       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +       if (!priv)
> +               return -ENOMEM;
> +       global = &priv->global;
> +
> +       /* Find number of parent interrupts */
> +       nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> +       if (!nr_parent_irqs) {
> +               pr_err("%pfwP: no parent irqs available\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of guest index bits in MSI address */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> +                            &global->guest_index_bits);
> +       if (rc)
> +               global->guest_index_bits = 0;
> +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> +       if (tmp < global->guest_index_bits) {
> +               pr_err("%pfwP: guest index bits too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of HART index bits */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> +                            &global->hart_index_bits);
> +       if (rc) {
> +               /* Assume default value */
> +               global->hart_index_bits = __fls(nr_parent_irqs);
> +               if (BIT(global->hart_index_bits) < nr_parent_irqs)
> +                       global->hart_index_bits++;
> +       }
> +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +             global->guest_index_bits;
> +       if (tmp < global->hart_index_bits) {
> +               pr_err("%pfwP: HART index bits too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of group index bits */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> +                            &global->group_index_bits);
> +       if (rc)
> +               global->group_index_bits = 0;
> +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +             global->guest_index_bits - global->hart_index_bits;
> +       if (tmp < global->group_index_bits) {
> +               pr_err("%pfwP: group index bits too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /*
> +        * Find first bit position of group index.
> +        * If not specified, assume the default APLIC-IMSIC configuration.
> +        */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> +                            &global->group_index_shift);
> +       if (rc)
> +               global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> +       tmp = global->group_index_bits + global->group_index_shift - 1;
> +       if (tmp >= BITS_PER_LONG) {
> +               pr_err("%pfwP: group index shift too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of interrupt identities */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> +                            &global->nr_ids);
> +       if (rc) {
> +               pr_err("%pfwP: number of interrupt identities not found\n",
> +                       fwnode);
> +               return rc;
> +       }
> +       if ((global->nr_ids < IMSIC_MIN_ID) ||
> +           (global->nr_ids >= IMSIC_MAX_ID) ||
> +           ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +               pr_err("%pfwP: invalid number of interrupt identities\n",
> +                       fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of guest interrupt identities */
> +       if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> +                           &global->nr_guest_ids))
> +               global->nr_guest_ids = global->nr_ids;
> +       if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> +           (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> +           ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +               pr_err("%pfwP: invalid number of guest interrupt identities\n",
> +                       fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Compute base address */
> +       rc = fwops->mmio_to_resource(fwnode, fwopaque, 0, &res);
> +       if (rc) {
> +               pr_err("%pfwP: first MMIO resource not found\n", fwnode);
> +               return -EINVAL;
> +       }
> +       global->base_addr = res.start;
> +       global->base_addr &= ~(BIT(global->guest_index_bits +
> +                                  global->hart_index_bits +
> +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> +       global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> +                              global->group_index_shift);
> +
> +       /* Find number of MMIO register sets */
> +       priv->num_mmios = fwops->nr_mmio(fwnode, fwopaque);
> +
> +       /* Allocate MMIO register sets */
> +       priv->mmios = kcalloc(priv->num_mmios, sizeof(*mmio), GFP_KERNEL);
> +       if (!priv->mmios) {
> +               rc = -ENOMEM;
> +               goto out_free_priv;
> +       }
> +
> +       /* Parse and map MMIO register sets */
> +       for (i = 0; i < priv->num_mmios; i++) {
> +               mmio = &priv->mmios[i];
> +               rc = fwops->mmio_to_resource(fwnode, fwopaque, i, &res);
> +               if (rc) {
> +                       pr_err("%pfwP: unable to parse MMIO regset %d\n",
> +                               fwnode, i);
> +                       goto out_iounmap;
> +               }
> +               mmio->pa = res.start;
> +               mmio->size = res.end - res.start + 1;
> +
> +               base_addr = mmio->pa;
> +               base_addr &= ~(BIT(global->guest_index_bits +
> +                                  global->hart_index_bits +
> +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> +               base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> +                              global->group_index_shift);
> +               if (base_addr != global->base_addr) {
> +                       rc = -EINVAL;
> +                       pr_err("%pfwP: address mismatch for regset %d\n",
> +                               fwnode, i);
> +                       goto out_iounmap;
> +               }
> +
> +               mmio->va = fwops->mmio_map(fwnode, fwopaque, i);
> +               if (!mmio->va) {
> +                       rc = -EIO;
> +                       pr_err("%pfwP: unable to map MMIO regset %d\n",
> +                               fwnode, i);
> +                       goto out_iounmap;
> +               }
> +       }
> +
> +       /* Initialize interrupt identity management */
> +       rc = imsic_ids_init(priv);
> +       if (rc) {
> +               pr_err("%pfwP: failed to initialize interrupt management\n",
> +                      fwnode);
> +               goto out_iounmap;
> +       }
> +
> +       /* Configure handlers for target CPUs */
> +       for (i = 0; i < nr_parent_irqs; i++) {
> +               unsigned long reloff, hartid;
> +               int j, cpu;
> +
> +               rc = fwops->parent_hartid(fwnode, fwopaque, i, &hartid);
> +               if (rc) {
> +                       pr_warn("%pfwP: hart ID for parent irq%d not found\n",
> +                               fwnode, i);
> +                       continue;
> +               }
> +
> +               cpu = riscv_hartid_to_cpuid(hartid);
> +               if (cpu < 0) {
> +                       pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
> +                               fwnode, i);
> +                       continue;
> +               }
> +
> +               /* Find MMIO location of MSI page */
> +               mmio = NULL;
> +               reloff = i * BIT(global->guest_index_bits) *
> +                        IMSIC_MMIO_PAGE_SZ;
> +               for (j = 0; j < priv->num_mmios; j++) {
> +                       if (reloff < priv->mmios[j].size) {
> +                               mmio = &priv->mmios[j];
> +                               break;
> +                       }
> +
> +                       /*
> +                        * MMIO region size may not be aligned to
> +                        * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
> +                        * if holes are present.
> +                        */
> +                       reloff -= ALIGN(priv->mmios[j].size,
> +                       BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
> +               }
> +               if (!mmio) {
> +                       pr_warn("%pfwP: MMIO not found for parent irq%d\n",
> +                               fwnode, i);
> +                       continue;
> +               }
> +
> +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> +               if (handler->priv) {
> +                       pr_warn("%pfwP: CPU%d handler already configured.\n",
> +                               fwnode, cpu);
> +                       goto done;
> +               }
> +
> +               cpumask_set_cpu(cpu, &priv->lmask);
> +               handler->local.msi_pa = mmio->pa + reloff;
> +               handler->local.msi_va = mmio->va + reloff;
> +               handler->priv = priv;
> +
> +done:
> +               nr_handlers++;
> +       }
> +
> +       /* If no CPU handlers found then can't take interrupts */
> +       if (!nr_handlers) {
> +               pr_err("%pfwP: No CPU handlers found\n", fwnode);
> +               rc = -ENODEV;
> +               goto out_ids_cleanup;
> +       }
> +
> +       /* Find parent domain and register chained handler */
> +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> +                                         DOMAIN_BUS_ANY);
> +       if (!domain) {
> +               pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> +               rc = -ENOENT;
> +               goto out_ids_cleanup;
> +       }
> +       imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> +       if (!imsic_parent_irq) {
> +               pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> +               rc = -ENOENT;
> +               goto out_ids_cleanup;
> +       }
> +       irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> +
> +       /* Initialize IPI domain */
> +       rc = imsic_ipi_domain_init(priv);
> +       if (rc) {
> +               pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> +               goto out_ids_cleanup;
> +       }
> +
> +       /* Initialize IRQ and MSI domains */
> +       rc = imsic_irq_domains_init(priv, fwnode);
> +       if (rc) {
> +               pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
> +                      fwnode);
> +               goto out_ipi_domain_cleanup;
> +       }
> +
> +       /*
> +        * Setup cpuhp state
> +        *
> +        * Don't disable per-CPU IMSIC file when CPU goes offline
> +        * because this affects IPI and the masking/unmasking of
> +        * virtual IPIs is done via generic IPI-Mux
> +        */
> +       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> +                         "irqchip/riscv/imsic:starting",
> +                         imsic_starting_cpu, NULL);
> +
> +       /*
> +        * Only one IMSIC instance allowed in a platform for clean
> +        * implementation of SMP IRQ affinity and per-CPU IPIs.
> +        *
> +        * This means on a multi-socket (or multi-die) platform we
> +        * will have multiple MMIO regions for one IMSIC instance.
> +        */
> +       imsic_init_done = true;
> +
> +       pr_info("%pfwP:  hart-index-bits: %d,  guest-index-bits: %d\n",
> +               fwnode, global->hart_index_bits, global->guest_index_bits);
> +       pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
> +               fwnode, global->group_index_bits, global->group_index_shift);
> +       pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
> +               fwnode, global->nr_ids, nr_handlers, &global->base_addr);
> +       if (priv->ipi_lsync_id)
> +               pr_info("%pfwP: enable/disable sync using interrupt %d\n",
> +                       fwnode, priv->ipi_lsync_id);
> +       if (priv->ipi_id)
> +               pr_info("%pfwP: providing IPIs using interrupt %d\n",
> +                       fwnode, priv->ipi_id);
> +
> +       return 0;
> +
> +out_ipi_domain_cleanup:
> +       imsic_ipi_domain_cleanup(priv);
> +out_ids_cleanup:
> +       imsic_ids_cleanup(priv);
> +out_iounmap:
> +       for (i = 0; i < priv->num_mmios; i++) {
> +               if (priv->mmios[i].va)
> +                       iounmap(priv->mmios[i].va);
> +       }
> +       kfree(priv->mmios);
> +out_free_priv:
> +       kfree(priv);
> +       return rc;
> +}
> +
> +static u32 __init imsic_dt_nr_parent_irq(struct fwnode_handle *fwnode,
> +                                        void *fwopaque)
> +{
> +       return of_irq_count(to_of_node(fwnode));
> +}
> +
> +static int __init imsic_dt_parent_hartid(struct fwnode_handle *fwnode,
> +                                        void *fwopaque, u32 index,
> +                                        unsigned long *out_hartid)
> +{
> +       struct of_phandle_args parent;
> +       int rc;
> +
> +       rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
> +       if (rc)
> +               return rc;
> +
> +       /*
> +        * Skip interrupts other than external interrupts for
> +        * current privilege level.
> +        */
> +       if (parent.args[0] != RV_IRQ_EXT)
> +               return -EINVAL;
> +
> +       return riscv_of_parent_hartid(parent.np, out_hartid);
> +}
> +
> +static u32 __init imsic_dt_nr_mmio(struct fwnode_handle *fwnode,
> +                                  void *fwopaque)
> +{
> +       u32 ret = 0;
> +       struct resource res;
> +
> +       while (!of_address_to_resource(to_of_node(fwnode), ret, &res))
> +               ret++;
> +
> +       return ret;
> +}
> +
> +static int __init imsic_mmio_to_resource(struct fwnode_handle *fwnode,
> +                                        void *fwopaque, u32 index,
> +                                        struct resource *res)
> +{
> +       return of_address_to_resource(to_of_node(fwnode), index, res);
> +}
> +
> +static void __iomem __init *imsic_dt_mmio_map(struct fwnode_handle *fwnode,
> +                                             void *fwopaque, u32 index)
> +{
> +       return of_iomap(to_of_node(fwnode), index);
> +}
> +
> +static int __init imsic_dt_read_u32(struct fwnode_handle *fwnode,
> +                                   void *fwopaque, const char *prop,
> +                                   u32 *out_val)
> +{
> +       return of_property_read_u32(to_of_node(fwnode), prop, out_val);
> +}
> +
> +static bool __init imsic_dt_read_bool(struct fwnode_handle *fwnode,
> +                                     void *fwopaque, const char *prop)
> +{
> +       return of_property_read_bool(to_of_node(fwnode), prop);
> +}
> +
> +static int __init imsic_dt_init(struct device_node *node,
> +                               struct device_node *parent)
> +{
> +       struct imsic_fwnode_ops ops = {
> +               .nr_parent_irq = imsic_dt_nr_parent_irq,
> +               .parent_hartid = imsic_dt_parent_hartid,
> +               .nr_mmio = imsic_dt_nr_mmio,
> +               .mmio_to_resource = imsic_mmio_to_resource,
> +               .mmio_map = imsic_dt_mmio_map,
> +               .read_u32 = imsic_dt_read_u32,
> +               .read_bool = imsic_dt_read_bool,
> +       };
> +
> +       return imsic_init(&ops, &node->fwnode, NULL);
> +}
> +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
> diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
> new file mode 100644
> index 000000000000..5d1387adc0ba
> --- /dev/null
> +++ b/include/linux/irqchip/riscv-imsic.h
> @@ -0,0 +1,92 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
> +#define __LINUX_IRQCHIP_RISCV_IMSIC_H
> +
> +#include <linux/types.h>
> +#include <asm/csr.h>
> +
> +#define IMSIC_MMIO_PAGE_SHIFT          12
> +#define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
> +#define IMSIC_MMIO_PAGE_LE             0x00
> +#define IMSIC_MMIO_PAGE_BE             0x04
> +
> +#define IMSIC_MIN_ID                   63
> +#define IMSIC_MAX_ID                   2048
> +
> +#define IMSIC_EIDELIVERY               0x70
> +
> +#define IMSIC_EITHRESHOLD              0x72
> +
> +#define IMSIC_EIP0                     0x80
> +#define IMSIC_EIP63                    0xbf
> +#define IMSIC_EIPx_BITS                        32
> +
> +#define IMSIC_EIE0                     0xc0
> +#define IMSIC_EIE63                    0xff
> +#define IMSIC_EIEx_BITS                        32
> +
> +#define IMSIC_FIRST                    IMSIC_EIDELIVERY
> +#define IMSIC_LAST                     IMSIC_EIE63
> +
> +#define IMSIC_MMIO_SETIPNUM_LE         0x00
> +#define IMSIC_MMIO_SETIPNUM_BE         0x04
> +
> +struct imsic_global_config {
> +       /*
> +        * MSI Target Address Scheme
> +        *
> +        * XLEN-1                                                12     0
> +        * |                                                     |     |
> +        * -------------------------------------------------------------
> +        * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
> +        * -------------------------------------------------------------
> +        */
> +
> +       /* Bits representing Guest index, HART index, and Group index */
> +       u32 guest_index_bits;
> +       u32 hart_index_bits;
> +       u32 group_index_bits;
> +       u32 group_index_shift;
> +
> +       /* Global base address matching all target MSI addresses */
> +       phys_addr_t base_addr;
> +
> +       /* Number of interrupt identities */
> +       u32 nr_ids;
> +
> +       /* Number of guest interrupt identities */
> +       u32 nr_guest_ids;
> +};
> +
> +struct imsic_local_config {
> +       phys_addr_t msi_pa;
> +       void __iomem *msi_va;
> +};
> +
> +#ifdef CONFIG_RISCV_IMSIC
> +
> +extern const struct imsic_global_config *imsic_get_global_config(void);
> +
> +extern const struct imsic_local_config *imsic_get_local_config(
> +                                                       unsigned int cpu);
> +
> +#else
> +
> +static inline const struct imsic_global_config *imsic_get_global_config(void)
> +{
> +       return NULL;
> +}
> +
> +static inline const struct imsic_local_config *imsic_get_local_config(
> +                                                       unsigned int cpu)
> +{
> +       return NULL;
> +}
> +
> +#endif
> +
> +#endif
> --
> 2.34.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Fwd: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
@ 2023-01-18  3:49         ` Vincent Chen
  0 siblings, 0 replies; 72+ messages in thread
From: Vincent Chen @ 2023-01-18  3:49 UTC (permalink / raw)
  To: apatel
  Cc: Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel@vger.kernel.org List,
	devicetree

> From: Anup Patel <apatel@ventanamicro.com>
> Date: Wed, Jan 4, 2023 at 1:19 AM
> Subject: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
> To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
>
>
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new MSI controller for managing MSIs on a RISC-V platform. This new
> MSI controller is referred to as incoming message signaled interrupt
> controller (IMSIC), which manages MSIs on a per-HART (or per-CPU) basis.
> (For more details refer https://github.com/riscv/riscv-aia)
>
> This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> platforms.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  drivers/irqchip/Kconfig             |   14 +-
>  drivers/irqchip/Makefile            |    1 +
>  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
>  include/linux/irqchip/riscv-imsic.h |   92 +++
>  4 files changed, 1280 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
>  create mode 100644 include/linux/irqchip/riscv-imsic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 9e65345ca3f6..a1315189a595 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -29,7 +29,6 @@ config ARM_GIC_V2M
>
>  config GIC_NON_BANKED
>         bool
> -
>  config ARM_GIC_V3
>         bool
>         select IRQ_DOMAIN_HIERARCHY
> @@ -548,6 +547,19 @@ config SIFIVE_PLIC
>         select IRQ_DOMAIN_HIERARCHY
>         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_IMSIC
> +       bool
> +       depends on RISCV
> +       select IRQ_DOMAIN_HIERARCHY
> +       select GENERIC_MSI_IRQ_DOMAIN
> +
> +config RISCV_IMSIC_PCI
> +       bool
> +       depends on RISCV_IMSIC
> +       depends on PCI
> +       depends on PCI_MSI
> +       default RISCV_IMSIC
> +
>  config EXYNOS_IRQ_COMBINER
>         bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
>         depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 87b49a10962c..22c723cc6ec8 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
>  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
>  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
>  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
>  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
>  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
>  obj-$(CONFIG_IMX_INTMUX)               += irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> new file mode 100644
> index 000000000000..4c16b66738d6
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic.c
> @@ -0,0 +1,1174 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/bitmap.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +#include <asm/hwcap.h>
> +
> +#define IMSIC_DISABLE_EIDELIVERY       0
> +#define IMSIC_ENABLE_EIDELIVERY                1
> +#define IMSIC_DISABLE_EITHRESHOLD      1
> +#define IMSIC_ENABLE_EITHRESHOLD       0
> +
> +#define imsic_csr_write(__c, __v)      \
> +do {                                   \
> +       csr_write(CSR_ISELECT, __c);    \
> +       csr_write(CSR_IREG, __v);       \
> +} while (0)
> +
> +#define imsic_csr_read(__c)            \
> +({                                     \
> +       unsigned long __v;              \
> +       csr_write(CSR_ISELECT, __c);    \
> +       __v = csr_read(CSR_IREG);       \
> +       __v;                            \
> +})
> +
> +#define imsic_csr_set(__c, __v)                \
> +do {                                   \
> +       csr_write(CSR_ISELECT, __c);    \
> +       csr_set(CSR_IREG, __v);         \
> +} while (0)
> +
> +#define imsic_csr_clear(__c, __v)      \
> +do {                                   \
> +       csr_write(CSR_ISELECT, __c);    \
> +       csr_clear(CSR_IREG, __v);       \
> +} while (0)
> +
> +struct imsic_mmio {
> +       phys_addr_t pa;
> +       void __iomem *va;
> +       unsigned long size;
> +};
> +
> +struct imsic_priv {
> +       /* Global configuration common for all HARTs */
> +       struct imsic_global_config global;
> +
> +       /* MMIO regions */
> +       u32 num_mmios;
> +       struct imsic_mmio *mmios;
> +
> +       /* Global state of interrupt identities */
> +       raw_spinlock_t ids_lock;
> +       unsigned long *ids_used_bimap;
> +       unsigned long *ids_enabled_bimap;
> +       unsigned int *ids_target_cpu;
> +
> +       /* Mask for connected CPUs */
> +       struct cpumask lmask;
> +
> +       /* IPI interrupt identity */
> +       u32 ipi_id;
> +       u32 ipi_lsync_id;
> +
> +       /* IRQ domains */
> +       struct irq_domain *base_domain;
> +       struct irq_domain *pci_domain;
> +       struct irq_domain *plat_domain;
> +};
> +
> +struct imsic_handler {
> +       /* Local configuration for given HART */
> +       struct imsic_local_config local;
> +
> +       /* Pointer to private context */
> +       struct imsic_priv *priv;
> +};
> +
> +static bool imsic_init_done;
> +
> +static int imsic_parent_irq;
> +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> +
> +const struct imsic_global_config *imsic_get_global_config(void)
> +{
> +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +
> +       if (!handler || !handler->priv)
> +               return NULL;
> +
> +       return &handler->priv->global;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> +
> +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> +{
> +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +       if (!handler || !handler->priv)
> +               return NULL;
> +
> +       return &handler->local;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_local_config);
> +
> +static int imsic_cpu_page_phys(unsigned int cpu,
> +                              unsigned int guest_index,
> +                              phys_addr_t *out_msi_pa)
> +{
> +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +       struct imsic_global_config *global;
> +       struct imsic_local_config *local;
> +
> +       if (!handler || !handler->priv)
> +               return -ENODEV;
> +       local = &handler->local;
> +       global = &handler->priv->global;
> +
> +       if (BIT(global->guest_index_bits) <= guest_index)
> +               return -EINVAL;
> +
> +       if (out_msi_pa)
> +               *out_msi_pa = local->msi_pa +
> +                             (guest_index * IMSIC_MMIO_PAGE_SZ);
> +
> +       return 0;
> +}
> +
> +static int imsic_get_cpu(struct imsic_priv *priv,
> +                        const struct cpumask *mask_val, bool force,
> +                        unsigned int *out_target_cpu)
> +{
> +       struct cpumask amask;
> +       unsigned int cpu;
> +
> +       cpumask_and(&amask, &priv->lmask, mask_val);
> +
> +       if (force)
> +               cpu = cpumask_first(&amask);
> +       else
> +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> +       if (cpu >= nr_cpu_ids)
> +               return -EINVAL;
> +
> +       if (out_target_cpu)
> +               *out_target_cpu = cpu;
> +
> +       return 0;
> +}
> +
> +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> +                                struct msi_msg *msg)
> +{
> +       phys_addr_t msi_addr;
> +       int err;
> +
> +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +       if (err)
> +               return err;
> +
> +       msg->address_hi = upper_32_bits(msi_addr);
> +       msg->address_lo = lower_32_bits(msi_addr);
> +       msg->data = id;
> +
> +       return err;
> +}
> +
> +static void imsic_id_set_target(struct imsic_priv *priv,
> +                                unsigned int id, unsigned int target_cpu)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       priv->ids_target_cpu[id] = target_cpu;
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> +                                       unsigned int id)
> +{
> +       unsigned int ret;
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       ret = priv->ids_target_cpu[id];
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       return ret;
> +}
> +
> +static void __imsic_eix_update(unsigned long base_id,
> +                              unsigned long num_id, bool pend, bool val)
> +{
> +       unsigned long i, isel, ireg, flags;
> +       unsigned long id = base_id, last_id = base_id + num_id;
> +
> +       while (id < last_id) {
> +               isel = id / BITS_PER_LONG;
> +               isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> +               isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> +
> +               ireg = 0;
> +               for (i = id & (__riscv_xlen - 1);
> +                    (id < last_id) && (i < __riscv_xlen); i++) {
> +                       ireg |= BIT(i);
> +                       id++;
> +               }
> +
> +               /*
> +                * The IMSIC EIEx and EIPx registers are indirectly
> +                * accessed using the ISELECT and IREG CSRs, so we
> +                * save/restore local IRQ to ensure that we don't
> +                * get preempted while accessing IMSIC registers.
> +                */
> +               local_irq_save(flags);
> +               if (val)
> +                       imsic_csr_set(isel, ireg);
> +               else
> +                       imsic_csr_clear(isel, ireg);
> +               local_irq_restore(flags);
> +       }
> +}
> +
> +#define __imsic_id_enable(__id)                \
> +       __imsic_eix_update((__id), 1, false, true)
> +#define __imsic_id_disable(__id)       \
> +       __imsic_eix_update((__id), 1, false, false)
> +
> +#ifdef CONFIG_SMP
> +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> +{
> +       struct imsic_handler *handler;
> +       struct cpumask amask;
> +       int cpu;
> +
> +       cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> +       for_each_cpu(cpu, &amask) {
> +               if (cpu == smp_processor_id())
> +                       continue;
> +
> +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> +               if (!handler || !handler->priv || !handler->local.msi_va) {
> +                       pr_warn("CPU%d: handler not initialized\n", cpu);
> +                       continue;
> +               }
> +
> +               writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
> +       }
> +}
> +#else
> +#define __imsic_id_smp_sync(__priv)
> +#endif
> +
> +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       bitmap_set(priv->ids_enabled_bimap, id, 1);
> +       __imsic_id_enable(id);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       __imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       bitmap_clear(priv->ids_enabled_bimap, id, 1);
> +       __imsic_id_disable(id);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       __imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_ids_local_sync(struct imsic_priv *priv)
> +{
> +       int i;
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       for (i = 1; i <= priv->global.nr_ids; i++) {
> +               if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> +                       continue;
> +
> +               if (test_bit(i, priv->ids_enabled_bimap))
> +                       __imsic_id_enable(i);
> +               else
> +                       __imsic_id_disable(i);
> +       }
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> +{
> +       if (enable) {
> +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> +       } else {
> +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> +       }
> +}
> +
> +static int imsic_ids_alloc(struct imsic_priv *priv,
> +                          unsigned int max_id, unsigned int order)
> +{
> +       int ret;
> +       unsigned long flags;
> +
> +       if ((priv->global.nr_ids < max_id) ||
> +           (max_id < BIT(order)))
> +               return -EINVAL;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       ret = bitmap_find_free_region(priv->ids_used_bimap,
> +                                     max_id + 1, order);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> +       return ret;
> +}
> +
> +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> +                          unsigned int order)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> +       bitmap_release_region(priv->ids_used_bimap, base_id, order);
> +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static int __init imsic_ids_init(struct imsic_priv *priv)
> +{
> +       int i;
> +       struct imsic_global_config *global = &priv->global;
> +
> +       raw_spin_lock_init(&priv->ids_lock);
> +
> +       /* Allocate used bitmap */
> +       priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +                                       sizeof(unsigned long), GFP_KERNEL);
> +       if (!priv->ids_used_bimap)
> +               return -ENOMEM;
> +
> +       /* Allocate enabled bitmap */
> +       priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> +                                          sizeof(unsigned long), GFP_KERNEL);
> +       if (!priv->ids_enabled_bimap) {
> +               kfree(priv->ids_used_bimap);
> +               return -ENOMEM;
> +       }
> +
> +       /* Allocate target CPU array */
> +       priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> +                                      sizeof(unsigned int), GFP_KERNEL);
> +       if (!priv->ids_target_cpu) {
> +               kfree(priv->ids_enabled_bimap);
> +               kfree(priv->ids_used_bimap);
> +               return -ENOMEM;
> +       }
> +       for (i = 0; i <= global->nr_ids; i++)
> +               priv->ids_target_cpu[i] = UINT_MAX;
> +
> +       /* Reserve ID#0 because it is special and never implemented */
> +       bitmap_set(priv->ids_used_bimap, 0, 1);
> +
> +       return 0;
> +}
> +
> +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> +{
> +       kfree(priv->ids_target_cpu);
> +       kfree(priv->ids_enabled_bimap);
> +       kfree(priv->ids_used_bimap);
> +}
> +
> +#ifdef CONFIG_SMP
> +static void imsic_ipi_send(unsigned int cpu)
> +{
> +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> +       if (!handler || !handler->priv || !handler->local.msi_va) {
> +               pr_warn("CPU%d: handler not initialized\n", cpu);
> +               return;
> +       }
> +
> +       writel(handler->priv->ipi_id, handler->local.msi_va);
> +}
> +
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +       __imsic_id_enable(priv->ipi_id);
> +       __imsic_id_enable(priv->ipi_lsync_id);
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +       int virq;
> +
> +       /* Allocate interrupt identity for IPIs */
> +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +       if (virq < 0)
> +               return virq;
> +       priv->ipi_id = virq;
> +
> +       /* Create IMSIC IPI multiplexing */
> +       virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
> +       if (virq <= 0) {
> +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +               return (virq < 0) ? virq : -ENOMEM;
> +       }
> +
> +       /* Set vIRQ range */
> +       riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> +
> +       /* Allocate interrupt identity for local enable/disable sync */
> +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> +       if (virq < 0) {
> +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +               return virq;
> +       }
> +       priv->ipi_lsync_id = virq;
> +
> +       return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +       imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> +       if (priv->ipi_id)
> +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +}
> +#else
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> +       /* Clear the IPI ids because we are not using IPIs */
> +       priv->ipi_id = 0;
> +       priv->ipi_lsync_id = 0;
> +       return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +}
> +#endif
> +
> +static void imsic_irq_mask(struct irq_data *d)
> +{
> +       imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_unmask(struct irq_data *d)
> +{
> +       imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> +                                     struct msi_msg *msg)
> +{
> +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +       unsigned int cpu;
> +       int err;
> +
> +       cpu = imsic_id_get_target(priv, d->hwirq);
> +       WARN_ON(cpu == UINT_MAX);
> +
> +       err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> +       WARN_ON(err);
> +
> +       iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> +}
> +
> +#ifdef CONFIG_SMP
> +static int imsic_irq_set_affinity(struct irq_data *d,
> +                                 const struct cpumask *mask_val,
> +                                 bool force)
> +{
> +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> +       unsigned int target_cpu;
> +       int rc;
> +
> +       rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> +       if (rc)
> +               return rc;
> +
> +       imsic_id_set_target(priv, d->hwirq, target_cpu);
> +       irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> +
> +       return IRQ_SET_MASK_OK;
> +}
> +#endif
> +
> +static struct irq_chip imsic_irq_base_chip = {
> +       .name                   = "RISC-V IMSIC-BASE",
> +       .irq_mask               = imsic_irq_mask,
> +       .irq_unmask             = imsic_irq_unmask,
> +#ifdef CONFIG_SMP
> +       .irq_set_affinity       = imsic_irq_set_affinity,
> +#endif
> +       .irq_compose_msi_msg    = imsic_irq_compose_msi_msg,
> +       .flags                  = IRQCHIP_SKIP_SET_WAKE |
> +                                 IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> +                                 unsigned int virq,
> +                                 unsigned int nr_irqs,
> +                                 void *args)
> +{
> +       struct imsic_priv *priv = domain->host_data;
> +       msi_alloc_info_t *info = args;
> +       phys_addr_t msi_addr;
> +       int i, hwirq, err = 0;
> +       unsigned int cpu;
> +
> +       err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> +       if (err)
> +               return err;
> +
> +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> +       if (err)
> +               return err;
> +
> +       hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> +                               get_count_order(nr_irqs));
> +       if (hwirq < 0)
> +               return hwirq;
> +
> +       err = iommu_dma_prepare_msi(info->desc, msi_addr);

Hi Anup,
First of all, thank you for completing this patch set to support all
the AIA features. After investigating this patch, I'm concerned that
it may have an issue with changing CPU affinity.

As far as I understand, imsic_irq_domain_alloc() is called only once
for a device, when that device registers its IRQ, which means
iommu_dma_prepare_msi() is also called only once. When the device is
behind an IOMMU, iommu_dma_prepare_msi() allocates an IOVA for the
physical MSI address, calls iommu_map() to create the mapping between
this IOVA and the physical MSI address, and also calls
msi_desc_set_iommu_cookie() to store this mapping in the descriptor's
iommu_cookie. Afterwards, iommu_dma_compose_msi_msg(), called from
imsic_irq_compose_msi_msg(), directly uses desc->iommu_cookie to
compose the MSI address for this device. However, as mentioned above,
iommu_dma_prepare_msi() runs only when the device registers its IRQ,
so it allocates an IOVA only for the MSI page of the first online CPU
and stores that in desc->iommu_cookie. In this situation, I worry that
changing the CPU affinity will not work for this device: the IMSIC
driver does not create a new IOMMU mapping for the MSI address of the
new CPU when the affinity changes, and desc->iommu_cookie still holds
the mapping for the old CPU without being updated.
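
To make the flow concrete, this is how I read the two paths in this
patch (heavily trimmed, the comments are mine):

	/* imsic_irq_domain_alloc(): runs once, when the IRQ is allocated */
	err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
	err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
	...
	err = iommu_dma_prepare_msi(info->desc, msi_addr);
	/* maps an IOVA for this one CPU's MSI page and caches it in
	 * the iommu_cookie of the MSI descriptor */

	/* imsic_irq_compose_msi_msg(): runs again after an affinity change */
	cpu = imsic_id_get_target(priv, d->hwirq);   /* may be a new CPU */
	err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
	iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
	/* rewrites the address in msg from the cached cookie, i.e. from
	 * the page that was mapped at allocation time */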

To solve this problem, one possible solution is to create the IOMMU
mappings for the IMSIC MSI pages of all target CPUs in
imsic_irq_domain_alloc(). Then, when the IRQ affinity moves to a new
CPU, the IMSIC driver would update desc->iommu_cookie with the IOVA
of that new CPU. However, I'm not sure whether updating
desc->iommu_cookie every time the CPU affinity changes would go
against the original intent of the cookie. A rough sketch of what I
mean is below.
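
As a rough, untested sketch of the first half of that idea (the helper
name below is mine, and it assumes iommu_dma_prepare_msi() can safely
be called several times on the same descriptor with different
addresses):

/*
 * Hypothetical sketch: prepare the MSI doorbell pages of all CPUs
 * handled by this IMSIC at allocation time, so that an IOMMU mapping
 * already exists for whichever CPU the IRQ is later moved to. Note
 * that after this loop the iommu_cookie points at the page prepared
 * last, so it would still have to be refreshed whenever the target
 * CPU changes.
 */
static int imsic_msi_prepare_all_cpus(struct imsic_priv *priv,
				      struct msi_desc *desc)
{
	phys_addr_t msi_addr;
	unsigned int cpu;
	int err;

	for_each_cpu(cpu, &priv->lmask) {
		err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
		if (err)
			return err;

		err = iommu_dma_prepare_msi(desc, msi_addr);
		if (err)
			return err;
	}

	return 0;
}

The part I'm unsure about is the second half, i.e. refreshing
desc->iommu_cookie from imsic_irq_set_affinity(), which is why I'm
asking whether that would break the intended usage of the cookie.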


> +       if (err)
> +               goto fail;
> +
> +       for (i = 0; i < nr_irqs; i++) {
> +               imsic_id_set_target(priv, hwirq + i, cpu);
> +               irq_domain_set_info(domain, virq + i, hwirq + i,
> +                                   &imsic_irq_base_chip, priv,
> +                                   handle_simple_irq, NULL, NULL);
> +               irq_set_noprobe(virq + i);
> +               irq_set_affinity(virq + i, &priv->lmask);
> +       }
> +
> +       return 0;
> +
> +fail:
> +       imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> +       return err;
> +}
> +
> +static void imsic_irq_domain_free(struct irq_domain *domain,
> +                                 unsigned int virq,
> +                                 unsigned int nr_irqs)
> +{
> +       struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> +       struct imsic_priv *priv = domain->host_data;
> +
> +       imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> +       irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> +}
> +
> +static const struct irq_domain_ops imsic_base_domain_ops = {
> +       .alloc          = imsic_irq_domain_alloc,
> +       .free           = imsic_irq_domain_free,
> +};
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +
> +static void imsic_pci_mask_irq(struct irq_data *d)
> +{
> +       pci_msi_mask_irq(d);
> +       irq_chip_mask_parent(d);
> +}
> +
> +static void imsic_pci_unmask_irq(struct irq_data *d)
> +{
> +       pci_msi_unmask_irq(d);
> +       irq_chip_unmask_parent(d);
> +}
> +
> +static struct irq_chip imsic_pci_irq_chip = {
> +       .name                   = "RISC-V IMSIC-PCI",
> +       .irq_mask               = imsic_pci_mask_irq,
> +       .irq_unmask             = imsic_pci_unmask_irq,
> +       .irq_eoi                = irq_chip_eoi_parent,
> +};
> +
> +static struct msi_domain_ops imsic_pci_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_pci_domain_info = {
> +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> +                  MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> +       .ops    = &imsic_pci_domain_ops,
> +       .chip   = &imsic_pci_irq_chip,
> +};
> +
> +#endif
> +
> +static struct irq_chip imsic_plat_irq_chip = {
> +       .name                   = "RISC-V IMSIC-PLAT",
> +};
> +
> +static struct msi_domain_ops imsic_plat_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_plat_domain_info = {
> +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> +       .ops    = &imsic_plat_domain_ops,
> +       .chip   = &imsic_plat_irq_chip,
> +};
> +
> +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> +                                        struct fwnode_handle *fwnode)
> +{
> +       /* Create Base IRQ domain */
> +       priv->base_domain = irq_domain_create_tree(fwnode,
> +                                               &imsic_base_domain_ops, priv);
> +       if (!priv->base_domain) {
> +               pr_err("Failed to create IMSIC base domain\n");
> +               return -ENOMEM;
> +       }
> +       irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +       /* Create PCI MSI domain */
> +       priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> +                                               &imsic_pci_domain_info,
> +                                               priv->base_domain);
> +       if (!priv->pci_domain) {
> +               pr_err("Failed to create IMSIC PCI domain\n");
> +               irq_domain_remove(priv->base_domain);
> +               return -ENOMEM;
> +       }
> +#endif
> +
> +       /* Create Platform MSI domain */
> +       priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> +                                               &imsic_plat_domain_info,
> +                                               priv->base_domain);
> +       if (!priv->plat_domain) {
> +               pr_err("Failed to create IMSIC platform domain\n");
> +               if (priv->pci_domain)
> +                       irq_domain_remove(priv->pci_domain);
> +               irq_domain_remove(priv->base_domain);
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
> +}
> +
> +/*
> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> + * Linux interrupt number and let Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +       struct irq_chip *chip = irq_desc_get_chip(desc);
> +       struct imsic_priv *priv = handler->priv;
> +       irq_hw_number_t hwirq;
> +       int err;
> +
> +       WARN_ON_ONCE(!handler->priv);
> +
> +       chained_irq_enter(chip, desc);
> +
> +       while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> +               hwirq = hwirq >> TOPEI_ID_SHIFT;
> +
> +               if (hwirq == priv->ipi_id) {
> +#ifdef CONFIG_SMP
> +                       ipi_mux_process();
> +#endif
> +                       continue;
> +               } else if (hwirq == priv->ipi_lsync_id) {
> +                       imsic_ids_local_sync(priv);
> +                       continue;
> +               }
> +
> +               err = generic_handle_domain_irq(priv->base_domain, hwirq);
> +               if (unlikely(err))
> +                       pr_warn_ratelimited(
> +                               "hwirq %lu mapping not found\n", hwirq);
> +       }
> +
> +       chained_irq_exit(chip, desc);
> +}
> +
> +static int imsic_starting_cpu(unsigned int cpu)
> +{
> +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +       struct imsic_priv *priv = handler->priv;
> +
> +       /* Enable per-CPU parent interrupt */
> +       if (imsic_parent_irq)
> +               enable_percpu_irq(imsic_parent_irq,
> +                                 irq_get_trigger_type(imsic_parent_irq));
> +       else
> +               pr_warn("cpu%d: parent irq not available\n", cpu);
> +
> +       /* Enable IPIs */
> +       imsic_ipi_enable(priv);
> +
> +       /*
> +        * Interrupt identities might have been enabled/disabled while
> +        * this CPU was not running, so sync up the local enable/disable state.
> +        */
> +       imsic_ids_local_sync(priv);
> +
> +       /* Locally enable interrupt delivery */
> +       imsic_ids_local_delivery(priv, true);
> +
> +       return 0;
> +}
> +
> +struct imsic_fwnode_ops {
> +       u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> +                            void *fwopaque);
> +       int (*parent_hartid)(struct fwnode_handle *fwnode,
> +                            void *fwopaque, u32 index,
> +                            unsigned long *out_hartid);
> +       u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> +       int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> +                               void *fwopaque, u32 index,
> +                               struct resource *res);
> +       void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> +                                 void *fwopaque, u32 index);
> +       int (*read_u32)(struct fwnode_handle *fwnode,
> +                       void *fwopaque, const char *prop, u32 *out_val);
> +       bool (*read_bool)(struct fwnode_handle *fwnode,
> +                         void *fwopaque, const char *prop);
> +};
> +
> +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> +                            struct fwnode_handle *fwnode,
> +                            void *fwopaque)
> +{
> +       struct resource res;
> +       phys_addr_t base_addr;
> +       int rc, nr_parent_irqs;
> +       struct imsic_mmio *mmio;
> +       struct imsic_priv *priv;
> +       struct irq_domain *domain;
> +       struct imsic_handler *handler;
> +       struct imsic_global_config *global;
> +       u32 i, tmp, nr_handlers = 0;
> +
> +       if (imsic_init_done) {
> +               pr_err("%pfwP: already initialized hence ignoring\n",
> +                       fwnode);
> +               return -ENODEV;
> +       }
> +
> +       if (!riscv_isa_extension_available(NULL, SxAIA)) {
> +               pr_err("%pfwP: AIA support not available\n", fwnode);
> +               return -ENODEV;
> +       }
> +
> +       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +       if (!priv)
> +               return -ENOMEM;
> +       global = &priv->global;
> +
> +       /* Find number of parent interrupts */
> +       nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> +       if (!nr_parent_irqs) {
> +               pr_err("%pfwP: no parent irqs available\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of guest index bits in MSI address */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> +                            &global->guest_index_bits);
> +       if (rc)
> +               global->guest_index_bits = 0;
> +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> +       if (tmp < global->guest_index_bits) {
> +               pr_err("%pfwP: guest index bits too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of HART index bits */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> +                            &global->hart_index_bits);
> +       if (rc) {
> +               /* Assume default value */
> +               global->hart_index_bits = __fls(nr_parent_irqs);
> +               if (BIT(global->hart_index_bits) < nr_parent_irqs)
> +                       global->hart_index_bits++;
> +       }
> +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +             global->guest_index_bits;
> +       if (tmp < global->hart_index_bits) {
> +               pr_err("%pfwP: HART index bits too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of group index bits */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> +                            &global->group_index_bits);
> +       if (rc)
> +               global->group_index_bits = 0;
> +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> +             global->guest_index_bits - global->hart_index_bits;
> +       if (tmp < global->group_index_bits) {
> +               pr_err("%pfwP: group index bits too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /*
> +        * Find the first bit position of the group index.
> +        * If not specified, assume the default APLIC-IMSIC configuration.
> +        */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> +                            &global->group_index_shift);
> +       if (rc)
> +               global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> +       tmp = global->group_index_bits + global->group_index_shift - 1;
> +       if (tmp >= BITS_PER_LONG) {
> +               pr_err("%pfwP: group index shift too big\n", fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of interrupt identities */
> +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> +                            &global->nr_ids);
> +       if (rc) {
> +               pr_err("%pfwP: number of interrupt identities not found\n",
> +                       fwnode);
> +               return rc;
> +       }
> +       if ((global->nr_ids < IMSIC_MIN_ID) ||
> +           (global->nr_ids >= IMSIC_MAX_ID) ||
> +           ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +               pr_err("%pfwP: invalid number of interrupt identities\n",
> +                       fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Find number of guest interrupt identities */
> +       if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> +                           &global->nr_guest_ids))
> +               global->nr_guest_ids = global->nr_ids;
> +       if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> +           (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> +           ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> +               pr_err("%pfwP: invalid number of guest interrupt identities\n",
> +                       fwnode);
> +               return -EINVAL;
> +       }
> +
> +       /* Compute base address */
> +       rc = fwops->mmio_to_resource(fwnode, fwopaque, 0, &res);
> +       if (rc) {
> +               pr_err("%pfwP: first MMIO resource not found\n", fwnode);
> +               return -EINVAL;
> +       }
> +       global->base_addr = res.start;
> +       global->base_addr &= ~(BIT(global->guest_index_bits +
> +                                  global->hart_index_bits +
> +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> +       global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> +                              global->group_index_shift);
> +
> +       /* Find number of MMIO register sets */
> +       priv->num_mmios = fwops->nr_mmio(fwnode, fwopaque);
> +
> +       /* Allocate MMIO register sets */
> +       priv->mmios = kcalloc(priv->num_mmios, sizeof(*mmio), GFP_KERNEL);
> +       if (!priv->mmios) {
> +               rc = -ENOMEM;
> +               goto out_free_priv;
> +       }
> +
> +       /* Parse and map MMIO register sets */
> +       for (i = 0; i < priv->num_mmios; i++) {
> +               mmio = &priv->mmios[i];
> +               rc = fwops->mmio_to_resource(fwnode, fwopaque, i, &res);
> +               if (rc) {
> +                       pr_err("%pfwP: unable to parse MMIO regset %d\n",
> +                               fwnode, i);
> +                       goto out_iounmap;
> +               }
> +               mmio->pa = res.start;
> +               mmio->size = res.end - res.start + 1;
> +
> +               base_addr = mmio->pa;
> +               base_addr &= ~(BIT(global->guest_index_bits +
> +                                  global->hart_index_bits +
> +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> +               base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> +                              global->group_index_shift);
> +               if (base_addr != global->base_addr) {
> +                       rc = -EINVAL;
> +                       pr_err("%pfwP: address mismatch for regset %d\n",
> +                               fwnode, i);
> +                       goto out_iounmap;
> +               }
> +
> +               mmio->va = fwops->mmio_map(fwnode, fwopaque, i);
> +               if (!mmio->va) {
> +                       rc = -EIO;
> +                       pr_err("%pfwP: unable to map MMIO regset %d\n",
> +                               fwnode, i);
> +                       goto out_iounmap;
> +               }
> +       }
> +
> +       /* Initialize interrupt identity management */
> +       rc = imsic_ids_init(priv);
> +       if (rc) {
> +               pr_err("%pfwP: failed to initialize interrupt management\n",
> +                      fwnode);
> +               goto out_iounmap;
> +       }
> +
> +       /* Configure handlers for target CPUs */
> +       for (i = 0; i < nr_parent_irqs; i++) {
> +               unsigned long reloff, hartid;
> +               int j, cpu;
> +
> +               rc = fwops->parent_hartid(fwnode, fwopaque, i, &hartid);
> +               if (rc) {
> +                       pr_warn("%pfwP: hart ID for parent irq%d not found\n",
> +                               fwnode, i);
> +                       continue;
> +               }
> +
> +               cpu = riscv_hartid_to_cpuid(hartid);
> +               if (cpu < 0) {
> +                       pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
> +                               fwnode, i);
> +                       continue;
> +               }
> +
> +               /* Find MMIO location of MSI page */
> +               mmio = NULL;
> +               reloff = i * BIT(global->guest_index_bits) *
> +                        IMSIC_MMIO_PAGE_SZ;
> +               for (j = 0; j < priv->num_mmios; j++) {
> +                       if (reloff < priv->mmios[j].size) {
> +                               mmio = &priv->mmios[j];
> +                               break;
> +                       }
> +
> +                       /*
> +                        * MMIO region size may not be aligned to
> +                        * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
> +                        * if holes are present.
> +                        */
> +                       reloff -= ALIGN(priv->mmios[j].size,
> +                       BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
> +               }
> +               if (!mmio) {
> +                       pr_warn("%pfwP: MMIO not found for parent irq%d\n",
> +                               fwnode, i);
> +                       continue;
> +               }
> +
> +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> +               if (handler->priv) {
> +                       pr_warn("%pfwP: CPU%d handler already configured.\n",
> +                               fwnode, cpu);
> +                       goto done;
> +               }
> +
> +               cpumask_set_cpu(cpu, &priv->lmask);
> +               handler->local.msi_pa = mmio->pa + reloff;
> +               handler->local.msi_va = mmio->va + reloff;
> +               handler->priv = priv;
> +
> +done:
> +               nr_handlers++;
> +       }
> +
> +       /* If no CPU handlers found then can't take interrupts */
> +       if (!nr_handlers) {
> +               pr_err("%pfwP: No CPU handlers found\n", fwnode);
> +               rc = -ENODEV;
> +               goto out_ids_cleanup;
> +       }
> +
> +       /* Find parent domain and register chained handler */
> +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> +                                         DOMAIN_BUS_ANY);
> +       if (!domain) {
> +               pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> +               rc = -ENOENT;
> +               goto out_ids_cleanup;
> +       }
> +       imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> +       if (!imsic_parent_irq) {
> +               pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> +               rc = -ENOENT;
> +               goto out_ids_cleanup;
> +       }
> +       irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> +
> +       /* Initialize IPI domain */
> +       rc = imsic_ipi_domain_init(priv);
> +       if (rc) {
> +               pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> +               goto out_ids_cleanup;
> +       }
> +
> +       /* Initialize IRQ and MSI domains */
> +       rc = imsic_irq_domains_init(priv, fwnode);
> +       if (rc) {
> +               pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
> +                      fwnode);
> +               goto out_ipi_domain_cleanup;
> +       }
> +
> +       /*
> +        * Setup cpuhp state
> +        *
> +        * Don't disable per-CPU IMSIC file when CPU goes offline
> +        * because this affects IPI and the masking/unmasking of
> +        * virtual IPIs is done via generic IPI-Mux
> +        */
> +       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> +                         "irqchip/riscv/imsic:starting",
> +                         imsic_starting_cpu, NULL);
> +
> +       /*
> +        * Only one IMSIC instance allowed in a platform for clean
> +        * implementation of SMP IRQ affinity and per-CPU IPIs.
> +        *
> +        * This means on a multi-socket (or multi-die) platform we
> +        * will have multiple MMIO regions for one IMSIC instance.
> +        */
> +       imsic_init_done = true;
> +
> +       pr_info("%pfwP:  hart-index-bits: %d,  guest-index-bits: %d\n",
> +               fwnode, global->hart_index_bits, global->guest_index_bits);
> +       pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
> +               fwnode, global->group_index_bits, global->group_index_shift);
> +       pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
> +               fwnode, global->nr_ids, nr_handlers, &global->base_addr);
> +       if (priv->ipi_lsync_id)
> +               pr_info("%pfwP: enable/disable sync using interrupt %d\n",
> +                       fwnode, priv->ipi_lsync_id);
> +       if (priv->ipi_id)
> +               pr_info("%pfwP: providing IPIs using interrupt %d\n",
> +                       fwnode, priv->ipi_id);
> +
> +       return 0;
> +
> +out_ipi_domain_cleanup:
> +       imsic_ipi_domain_cleanup(priv);
> +out_ids_cleanup:
> +       imsic_ids_cleanup(priv);
> +out_iounmap:
> +       for (i = 0; i < priv->num_mmios; i++) {
> +               if (priv->mmios[i].va)
> +                       iounmap(priv->mmios[i].va);
> +       }
> +       kfree(priv->mmios);
> +out_free_priv:
> +       kfree(priv);
> +       return rc;
> +}
> +
> +static u32 __init imsic_dt_nr_parent_irq(struct fwnode_handle *fwnode,
> +                                        void *fwopaque)
> +{
> +       return of_irq_count(to_of_node(fwnode));
> +}
> +
> +static int __init imsic_dt_parent_hartid(struct fwnode_handle *fwnode,
> +                                        void *fwopaque, u32 index,
> +                                        unsigned long *out_hartid)
> +{
> +       struct of_phandle_args parent;
> +       int rc;
> +
> +       rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
> +       if (rc)
> +               return rc;
> +
> +       /*
> +        * Skip interrupts other than external interrupts for
> +        * current privilege level.
> +        */
> +       if (parent.args[0] != RV_IRQ_EXT)
> +               return -EINVAL;
> +
> +       return riscv_of_parent_hartid(parent.np, out_hartid);
> +}
> +
> +static u32 __init imsic_dt_nr_mmio(struct fwnode_handle *fwnode,
> +                                  void *fwopaque)
> +{
> +       u32 ret = 0;
> +       struct resource res;
> +
> +       while (!of_address_to_resource(to_of_node(fwnode), ret, &res))
> +               ret++;
> +
> +       return ret;
> +}
> +
> +static int __init imsic_mmio_to_resource(struct fwnode_handle *fwnode,
> +                                        void *fwopaque, u32 index,
> +                                        struct resource *res)
> +{
> +       return of_address_to_resource(to_of_node(fwnode), index, res);
> +}
> +
> +static void __iomem __init *imsic_dt_mmio_map(struct fwnode_handle *fwnode,
> +                                             void *fwopaque, u32 index)
> +{
> +       return of_iomap(to_of_node(fwnode), index);
> +}
> +
> +static int __init imsic_dt_read_u32(struct fwnode_handle *fwnode,
> +                                   void *fwopaque, const char *prop,
> +                                   u32 *out_val)
> +{
> +       return of_property_read_u32(to_of_node(fwnode), prop, out_val);
> +}
> +
> +static bool __init imsic_dt_read_bool(struct fwnode_handle *fwnode,
> +                                     void *fwopaque, const char *prop)
> +{
> +       return of_property_read_bool(to_of_node(fwnode), prop);
> +}
> +
> +static int __init imsic_dt_init(struct device_node *node,
> +                               struct device_node *parent)
> +{
> +       struct imsic_fwnode_ops ops = {
> +               .nr_parent_irq = imsic_dt_nr_parent_irq,
> +               .parent_hartid = imsic_dt_parent_hartid,
> +               .nr_mmio = imsic_dt_nr_mmio,
> +               .mmio_to_resource = imsic_mmio_to_resource,
> +               .mmio_map = imsic_dt_mmio_map,
> +               .read_u32 = imsic_dt_read_u32,
> +               .read_bool = imsic_dt_read_bool,
> +       };
> +
> +       return imsic_init(&ops, &node->fwnode, NULL);
> +}
> +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
> diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
> new file mode 100644
> index 000000000000..5d1387adc0ba
> --- /dev/null
> +++ b/include/linux/irqchip/riscv-imsic.h
> @@ -0,0 +1,92 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
> +#define __LINUX_IRQCHIP_RISCV_IMSIC_H
> +
> +#include <linux/types.h>
> +#include <asm/csr.h>
> +
> +#define IMSIC_MMIO_PAGE_SHIFT          12
> +#define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
> +#define IMSIC_MMIO_PAGE_LE             0x00
> +#define IMSIC_MMIO_PAGE_BE             0x04
> +
> +#define IMSIC_MIN_ID                   63
> +#define IMSIC_MAX_ID                   2048
> +
> +#define IMSIC_EIDELIVERY               0x70
> +
> +#define IMSIC_EITHRESHOLD              0x72
> +
> +#define IMSIC_EIP0                     0x80
> +#define IMSIC_EIP63                    0xbf
> +#define IMSIC_EIPx_BITS                        32
> +
> +#define IMSIC_EIE0                     0xc0
> +#define IMSIC_EIE63                    0xff
> +#define IMSIC_EIEx_BITS                        32
> +
> +#define IMSIC_FIRST                    IMSIC_EIDELIVERY
> +#define IMSIC_LAST                     IMSIC_EIE63
> +
> +#define IMSIC_MMIO_SETIPNUM_LE         0x00
> +#define IMSIC_MMIO_SETIPNUM_BE         0x04
> +
> +struct imsic_global_config {
> +       /*
> +        * MSI Target Address Scheme
> +        *
> +        * XLEN-1                                                12     0
> +        * |                                                     |     |
> +        * -------------------------------------------------------------
> +        * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
> +        * -------------------------------------------------------------
> +        */
> +
> +       /* Bits representing Guest index, HART index, and Group index */
> +       u32 guest_index_bits;
> +       u32 hart_index_bits;
> +       u32 group_index_bits;
> +       u32 group_index_shift;
> +
> +       /* Global base address matching all target MSI addresses */
> +       phys_addr_t base_addr;
> +
> +       /* Number of interrupt identities */
> +       u32 nr_ids;
> +
> +       /* Number of guest interrupt identities */
> +       u32 nr_guest_ids;
> +};
> +
> +struct imsic_local_config {
> +       phys_addr_t msi_pa;
> +       void __iomem *msi_va;
> +};
> +
> +#ifdef CONFIG_RISCV_IMSIC
> +
> +extern const struct imsic_global_config *imsic_get_global_config(void);
> +
> +extern const struct imsic_local_config *imsic_get_local_config(
> +                                                       unsigned int cpu);
> +
> +#else
> +
> +static inline const struct imsic_global_config *imsic_get_global_config(void)
> +{
> +       return NULL;
> +}
> +
> +static inline const struct imsic_local_config *imsic_get_local_config(
> +                                                       unsigned int cpu)
> +{
> +       return NULL;
> +}
> +
> +#endif
> +
> +#endif
> --
> 2.34.1
>
>

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
  2023-01-18  3:49         ` Vincent Chen
@ 2023-01-18  4:20           ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-18  4:20 UTC (permalink / raw)
  To: Vincent Chen
  Cc: apatel, Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	linux-riscv, linux-kernel@vger.kernel.org List, devicetree

On Wed, Jan 18, 2023 at 9:20 AM Vincent Chen <vincent.chen@sifive.com> wrote:
>
> > From: Anup Patel <apatel@ventanamicro.com>
> > Date: Wed, Jan 4, 2023 at 1:19 AM
> > Subject: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
> > To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> > Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
> >
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new MSI controller for managing MSIs on a RISC-V platform. This new
> > MSI controller is referred to as the incoming message signaled interrupt
> > controller (IMSIC), which manages MSIs on a per-HART (or per-CPU) basis.
> > (For more details, refer to https://github.com/riscv/riscv-aia)
> >
> > This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  drivers/irqchip/Kconfig             |   14 +-
> >  drivers/irqchip/Makefile            |    1 +
> >  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
> >  include/linux/irqchip/riscv-imsic.h |   92 +++
> >  4 files changed, 1280 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
> >  create mode 100644 include/linux/irqchip/riscv-imsic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index 9e65345ca3f6..a1315189a595 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -29,7 +29,6 @@ config ARM_GIC_V2M
> >
> >  config GIC_NON_BANKED
> >         bool
> > -
> >  config ARM_GIC_V3
> >         bool
> >         select IRQ_DOMAIN_HIERARCHY
> > @@ -548,6 +547,19 @@ config SIFIVE_PLIC
> >         select IRQ_DOMAIN_HIERARCHY
> >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_IMSIC
> > +       bool
> > +       depends on RISCV
> > +       select IRQ_DOMAIN_HIERARCHY
> > +       select GENERIC_MSI_IRQ_DOMAIN
> > +
> > +config RISCV_IMSIC_PCI
> > +       bool
> > +       depends on RISCV_IMSIC
> > +       depends on PCI
> > +       depends on PCI_MSI
> > +       default RISCV_IMSIC
> > +
> >  config EXYNOS_IRQ_COMBINER
> >         bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> >         depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 87b49a10962c..22c723cc6ec8 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> >  obj-$(CONFIG_IMX_INTMUX)               += irq-imx-intmux.o
> > diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> > new file mode 100644
> > index 000000000000..4c16b66738d6
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic.c
> > @@ -0,0 +1,1174 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/bitmap.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iommu.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/pci.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +#include <asm/hwcap.h>
> > +
> > +#define IMSIC_DISABLE_EIDELIVERY       0
> > +#define IMSIC_ENABLE_EIDELIVERY                1
> > +#define IMSIC_DISABLE_EITHRESHOLD      1
> > +#define IMSIC_ENABLE_EITHRESHOLD       0
> > +
> > +#define imsic_csr_write(__c, __v)      \
> > +do {                                   \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       csr_write(CSR_IREG, __v);       \
> > +} while (0)
> > +
> > +#define imsic_csr_read(__c)            \
> > +({                                     \
> > +       unsigned long __v;              \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       __v = csr_read(CSR_IREG);       \
> > +       __v;                            \
> > +})
> > +
> > +#define imsic_csr_set(__c, __v)                \
> > +do {                                   \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       csr_set(CSR_IREG, __v);         \
> > +} while (0)
> > +
> > +#define imsic_csr_clear(__c, __v)      \
> > +do {                                   \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       csr_clear(CSR_IREG, __v);       \
> > +} while (0)
> > +
> > +struct imsic_mmio {
> > +       phys_addr_t pa;
> > +       void __iomem *va;
> > +       unsigned long size;
> > +};
> > +
> > +struct imsic_priv {
> > +       /* Global configuration common for all HARTs */
> > +       struct imsic_global_config global;
> > +
> > +       /* MMIO regions */
> > +       u32 num_mmios;
> > +       struct imsic_mmio *mmios;
> > +
> > +       /* Global state of interrupt identities */
> > +       raw_spinlock_t ids_lock;
> > +       unsigned long *ids_used_bimap;
> > +       unsigned long *ids_enabled_bimap;
> > +       unsigned int *ids_target_cpu;
> > +
> > +       /* Mask for connected CPUs */
> > +       struct cpumask lmask;
> > +
> > +       /* IPI interrupt identity */
> > +       u32 ipi_id;
> > +       u32 ipi_lsync_id;
> > +
> > +       /* IRQ domains */
> > +       struct irq_domain *base_domain;
> > +       struct irq_domain *pci_domain;
> > +       struct irq_domain *plat_domain;
> > +};
> > +
> > +struct imsic_handler {
> > +       /* Local configuration for given HART */
> > +       struct imsic_local_config local;
> > +
> > +       /* Pointer to private context */
> > +       struct imsic_priv *priv;
> > +};
> > +
> > +static bool imsic_init_done;
> > +
> > +static int imsic_parent_irq;
> > +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> > +
> > +const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +
> > +       if (!handler || !handler->priv)
> > +               return NULL;
> > +
> > +       return &handler->priv->global;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > +
> > +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> > +{
> > +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > +       if (!handler || !handler->priv)
> > +               return NULL;
> > +
> > +       return &handler->local;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_local_config);
> > +
> > +static int imsic_cpu_page_phys(unsigned int cpu,
> > +                              unsigned int guest_index,
> > +                              phys_addr_t *out_msi_pa)
> > +{
> > +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +       struct imsic_global_config *global;
> > +       struct imsic_local_config *local;
> > +
> > +       if (!handler || !handler->priv)
> > +               return -ENODEV;
> > +       local = &handler->local;
> > +       global = &handler->priv->global;
> > +
> > +       if (BIT(global->guest_index_bits) <= guest_index)
> > +               return -EINVAL;
> > +
> > +       if (out_msi_pa)
> > +               *out_msi_pa = local->msi_pa +
> > +                             (guest_index * IMSIC_MMIO_PAGE_SZ);
> > +
> > +       return 0;
> > +}
> > +
> > +static int imsic_get_cpu(struct imsic_priv *priv,
> > +                        const struct cpumask *mask_val, bool force,
> > +                        unsigned int *out_target_cpu)
> > +{
> > +       struct cpumask amask;
> > +       unsigned int cpu;
> > +
> > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > +       if (force)
> > +               cpu = cpumask_first(&amask);
> > +       else
> > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > +       if (cpu >= nr_cpu_ids)
> > +               return -EINVAL;
> > +
> > +       if (out_target_cpu)
> > +               *out_target_cpu = cpu;
> > +
> > +       return 0;
> > +}
> > +
> > +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> > +                                struct msi_msg *msg)
> > +{
> > +       phys_addr_t msi_addr;
> > +       int err;
> > +
> > +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > +       if (err)
> > +               return err;
> > +
> > +       msg->address_hi = upper_32_bits(msi_addr);
> > +       msg->address_lo = lower_32_bits(msi_addr);
> > +       msg->data = id;
> > +
> > +       return err;
> > +}
> > +
> > +static void imsic_id_set_target(struct imsic_priv *priv,
> > +                                unsigned int id, unsigned int target_cpu)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       priv->ids_target_cpu[id] = target_cpu;
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> > +                                       unsigned int id)
> > +{
> > +       unsigned int ret;
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       ret = priv->ids_target_cpu[id];
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       return ret;
> > +}
> > +
> > +static void __imsic_eix_update(unsigned long base_id,
> > +                              unsigned long num_id, bool pend, bool val)
> > +{
> > +       unsigned long i, isel, ireg, flags;
> > +       unsigned long id = base_id, last_id = base_id + num_id;
> > +
> > +       while (id < last_id) {
> > +               isel = id / BITS_PER_LONG;
> > +               isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > +               isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> > +
> > +               ireg = 0;
> > +               for (i = id & (__riscv_xlen - 1);
> > +                    (id < last_id) && (i < __riscv_xlen); i++) {
> > +                       ireg |= BIT(i);
> > +                       id++;
> > +               }
> > +
> > +               /*
> > +                * The IMSIC EIEx and EIPx registers are indirectly
> > +                * accessed using the ISELECT and IREG CSRs so we
> > +                * save/restore local IRQ to ensure that we don't
> > +                * get preempted while accessing IMSIC registers.
> > +                */
> > +               local_irq_save(flags);
> > +               if (val)
> > +                       imsic_csr_set(isel, ireg);
> > +               else
> > +                       imsic_csr_clear(isel, ireg);
> > +               local_irq_restore(flags);
> > +       }
> > +}
> > +
> > +#define __imsic_id_enable(__id)                \
> > +       __imsic_eix_update((__id), 1, false, true)
> > +#define __imsic_id_disable(__id)       \
> > +       __imsic_eix_update((__id), 1, false, false)
> > +
> > +#ifdef CONFIG_SMP
> > +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> > +{
> > +       struct imsic_handler *handler;
> > +       struct cpumask amask;
> > +       int cpu;
> > +
> > +       cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > +       for_each_cpu(cpu, &amask) {
> > +               if (cpu == smp_processor_id())
> > +                       continue;
> > +
> > +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +               if (!handler || !handler->priv || !handler->local.msi_va) {
> > +                       pr_warn("CPU%d: handler not initialized\n", cpu);
> > +                       continue;
> > +               }
> > +
> > +               writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
> > +       }
> > +}
> > +#else
> > +#define __imsic_id_smp_sync(__priv)
> > +#endif
> > +
> > +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       bitmap_set(priv->ids_enabled_bimap, id, 1);
> > +       __imsic_id_enable(id);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       bitmap_clear(priv->ids_enabled_bimap, id, 1);
> > +       __imsic_id_disable(id);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_ids_local_sync(struct imsic_priv *priv)
> > +{
> > +       int i;
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       for (i = 1; i <= priv->global.nr_ids; i++) {
> > +               if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> > +                       continue;
> > +
> > +               if (test_bit(i, priv->ids_enabled_bimap))
> > +                       __imsic_id_enable(i);
> > +               else
> > +                       __imsic_id_disable(i);
> > +       }
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> > +{
> > +       if (enable) {
> > +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> > +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> > +       } else {
> > +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> > +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> > +       }
> > +}
> > +
> > +static int imsic_ids_alloc(struct imsic_priv *priv,
> > +                          unsigned int max_id, unsigned int order)
> > +{
> > +       int ret;
> > +       unsigned long flags;
> > +
> > +       if ((priv->global.nr_ids < max_id) ||
> > +           (max_id < BIT(order)))
> > +               return -EINVAL;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       ret = bitmap_find_free_region(priv->ids_used_bimap,
> > +                                     max_id + 1, order);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       return ret;
> > +}
> > +
> > +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> > +                          unsigned int order)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       bitmap_release_region(priv->ids_used_bimap, base_id, order);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static int __init imsic_ids_init(struct imsic_priv *priv)
> > +{
> > +       int i;
> > +       struct imsic_global_config *global = &priv->global;
> > +
> > +       raw_spin_lock_init(&priv->ids_lock);
> > +
> > +       /* Allocate used bitmap */
> > +       priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > +                                       sizeof(unsigned long), GFP_KERNEL);
> > +       if (!priv->ids_used_bimap)
> > +               return -ENOMEM;
> > +
> > +       /* Allocate enabled bitmap */
> > +       priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > +                                          sizeof(unsigned long), GFP_KERNEL);
> > +       if (!priv->ids_enabled_bimap) {
> > +               kfree(priv->ids_used_bimap);
> > +               return -ENOMEM;
> > +       }
> > +
> > +       /* Allocate target CPU array */
> > +       priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> > +                                      sizeof(unsigned int), GFP_KERNEL);
> > +       if (!priv->ids_target_cpu) {
> > +               kfree(priv->ids_enabled_bimap);
> > +               kfree(priv->ids_used_bimap);
> > +               return -ENOMEM;
> > +       }
> > +       for (i = 0; i <= global->nr_ids; i++)
> > +               priv->ids_target_cpu[i] = UINT_MAX;
> > +
> > +       /* Reserve ID#0 because it is special and never implemented */
> > +       bitmap_set(priv->ids_used_bimap, 0, 1);
> > +
> > +       return 0;
> > +}
> > +
> > +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> > +{
> > +       kfree(priv->ids_target_cpu);
> > +       kfree(priv->ids_enabled_bimap);
> > +       kfree(priv->ids_used_bimap);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_ipi_send(unsigned int cpu)
> > +{
> > +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > +       if (!handler || !handler->priv || !handler->local.msi_va) {
> > +               pr_warn("CPU%d: handler not initialized\n", cpu);
> > +               return;
> > +       }
> > +
> > +       writel(handler->priv->ipi_id, handler->local.msi_va);
> > +}
> > +
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > +       __imsic_id_enable(priv->ipi_id);
> > +       __imsic_id_enable(priv->ipi_lsync_id);
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > +       int virq;
> > +
> > +       /* Allocate interrupt identity for IPIs */
> > +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > +       if (virq < 0)
> > +               return virq;
> > +       priv->ipi_id = virq;
> > +
> > +       /* Create IMSIC IPI multiplexing */
> > +       virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
> > +       if (virq <= 0) {
> > +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +               return (virq < 0) ? virq : -ENOMEM;
> > +       }
> > +
> > +       /* Set vIRQ range */
> > +       riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> > +
> > +       /* Allocate interrupt identity for local enable/disable sync */
> > +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > +       if (virq < 0) {
> > +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +               return virq;
> > +       }
> > +       priv->ipi_lsync_id = virq;
> > +
> > +       return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > +       imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> > +       if (priv->ipi_id)
> > +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +}
> > +#else
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > +       /* Clear the IPI ids because we are not using IPIs */
> > +       priv->ipi_id = 0;
> > +       priv->ipi_lsync_id = 0;
> > +       return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > +}
> > +#endif
> > +
> > +static void imsic_irq_mask(struct irq_data *d)
> > +{
> > +       imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_unmask(struct irq_data *d)
> > +{
> > +       imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> > +                                     struct msi_msg *msg)
> > +{
> > +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       unsigned int cpu;
> > +       int err;
> > +
> > +       cpu = imsic_id_get_target(priv, d->hwirq);
> > +       WARN_ON(cpu == UINT_MAX);
> > +
> > +       err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> > +       WARN_ON(err);
> > +
> > +       iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static int imsic_irq_set_affinity(struct irq_data *d,
> > +                                 const struct cpumask *mask_val,
> > +                                 bool force)
> > +{
> > +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       unsigned int target_cpu;
> > +       int rc;
> > +
> > +       rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> > +       if (rc)
> > +               return rc;
> > +
> > +       imsic_id_set_target(priv, d->hwirq, target_cpu);
> > +       irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> > +
> > +       return IRQ_SET_MASK_OK;
> > +}
> > +#endif
> > +
> > +static struct irq_chip imsic_irq_base_chip = {
> > +       .name                   = "RISC-V IMSIC-BASE",
> > +       .irq_mask               = imsic_irq_mask,
> > +       .irq_unmask             = imsic_irq_unmask,
> > +#ifdef CONFIG_SMP
> > +       .irq_set_affinity       = imsic_irq_set_affinity,
> > +#endif
> > +       .irq_compose_msi_msg    = imsic_irq_compose_msi_msg,
> > +       .flags                  = IRQCHIP_SKIP_SET_WAKE |
> > +                                 IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> > +                                 unsigned int virq,
> > +                                 unsigned int nr_irqs,
> > +                                 void *args)
> > +{
> > +       struct imsic_priv *priv = domain->host_data;
> > +       msi_alloc_info_t *info = args;
> > +       phys_addr_t msi_addr;
> > +       int i, hwirq, err = 0;
> > +       unsigned int cpu;
> > +
> > +       err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> > +       if (err)
> > +               return err;
> > +
> > +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > +       if (err)
> > +               return err;
> > +
> > +       hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> > +                               get_count_order(nr_irqs));
> > +       if (hwirq < 0)
> > +               return hwirq;
> > +
> > +       err = iommu_dma_prepare_msi(info->desc, msi_addr);
>
> Hi Anup,
> First of all, thank you for completing this patch set to support all
> AIA features. After investigating this patch, I'm concerned that it
> may have a potential issue with changing CPU affinity.
>
> As far as I understand, imsic_irq_domain_alloc() is called only once
> for a device, when that device registers its IRQ, which means that
> iommu_dma_prepare_msi() is called only once as well. When a device
> sits behind an IOMMU, iommu_dma_prepare_msi() allocates an IOVA for
> the physical MSI address of this device, calls iommu_map() to create
> the mapping between this IOVA and the physical MSI address, and calls
> msi_desc_set_iommu_cookie() to record this IOVA in the iommu_cookie.
> Afterwards, iommu_dma_compose_msi_msg(), called by
> imsic_irq_compose_msi_msg(), directly uses this desc->iommu_cookie
> to compose the MSI address for the device. However, as mentioned
> earlier, iommu_dma_prepare_msi() is called only when a device
> registers its IRQ, so it only allocates an IOVA for the first online
> CPU and stores that IOVA in desc->iommu_cookie. In this situation,
> I worry that changing the CPU affinity will not work for this device:
> the IMSIC driver neither creates a new IOMMU mapping for the MSI
> address of the new CPU when the affinity changes, nor updates
> desc->iommu_cookie, which keeps referring to the MSI address of the
> old CPU.
>
> To solve this problem, one possible solution is to create the IOVA
> mappings for every per-CPU IMSIC MSI address in
> imsic_irq_domain_alloc(). Then, when changing the IRQ affinity to a
> new CPU, the IMSIC driver would update desc->iommu_cookie with the
> IOVA of the new CPU. However, I'm not sure whether updating
> desc->iommu_cookie every time the CPU affinity changes would break
> the original intent of the cookie.

Thanks for pointing out this potential issue. I will try to solve it in
the v3 patch.

Regards,
Anup
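
As an illustration of the direction Vincent suggests, one rough sketch
(reusing only helpers already defined in this patch, and deliberately
ignoring that iommu_dma_prepare_msi() may sleep while set_affinity is
typically called in atomic context) could look like the following; the
actual v3 change may well be structured differently:

static int imsic_irq_set_affinity_resync(struct irq_data *d,
					 const struct cpumask *mask_val,
					 bool force)
{
	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
	phys_addr_t msi_addr;
	unsigned int target_cpu;
	int rc;

	rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
	if (rc)
		return rc;

	/* Physical MSI page of the new target CPU (guest index 0) */
	rc = imsic_cpu_page_phys(target_cpu, 0, &msi_addr);
	if (rc)
		return rc;

	/*
	 * Refresh the per-descriptor IOMMU MSI cookie so that a later
	 * imsic_irq_compose_msi_msg() composes the new CPU's address
	 * instead of the one cached when the IRQ was first allocated.
	 */
	rc = iommu_dma_prepare_msi(irq_data_get_msi_desc(d), msi_addr);
	if (rc)
		return rc;

	imsic_id_set_target(priv, d->hwirq, target_cpu);
	irq_data_update_effective_affinity(d, cpumask_of(target_cpu));

	return IRQ_SET_MASK_OK;
}

Whether refreshing the preparation on every affinity change like this,
rather than mapping all per-CPU MSI pages up front at allocation time,
fits the intended semantics of the iommu_cookie is exactly the open
question raised above.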

>
>
> > +       if (err)
> > +               goto fail;
> > +
> > +       for (i = 0; i < nr_irqs; i++) {
> > +               imsic_id_set_target(priv, hwirq + i, cpu);
> > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > +                                   &imsic_irq_base_chip, priv,
> > +                                   handle_simple_irq, NULL, NULL);
> > +               irq_set_noprobe(virq + i);
> > +               irq_set_affinity(virq + i, &priv->lmask);
> > +       }
> > +
> > +       return 0;
> > +
> > +fail:
> > +       imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> > +       return err;
> > +}
> > +
> > +static void imsic_irq_domain_free(struct irq_domain *domain,
> > +                                 unsigned int virq,
> > +                                 unsigned int nr_irqs)
> > +{
> > +       struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> > +       struct imsic_priv *priv = domain->host_data;
> > +
> > +       imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> > +       irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> > +}
> > +
> > +static const struct irq_domain_ops imsic_base_domain_ops = {
> > +       .alloc          = imsic_irq_domain_alloc,
> > +       .free           = imsic_irq_domain_free,
> > +};
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +
> > +static void imsic_pci_mask_irq(struct irq_data *d)
> > +{
> > +       pci_msi_mask_irq(d);
> > +       irq_chip_mask_parent(d);
> > +}
> > +
> > +static void imsic_pci_unmask_irq(struct irq_data *d)
> > +{
> > +       pci_msi_unmask_irq(d);
> > +       irq_chip_unmask_parent(d);
> > +}
> > +
> > +static struct irq_chip imsic_pci_irq_chip = {
> > +       .name                   = "RISC-V IMSIC-PCI",
> > +       .irq_mask               = imsic_pci_mask_irq,
> > +       .irq_unmask             = imsic_pci_unmask_irq,
> > +       .irq_eoi                = irq_chip_eoi_parent,
> > +};
> > +
> > +static struct msi_domain_ops imsic_pci_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_pci_domain_info = {
> > +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> > +                  MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> > +       .ops    = &imsic_pci_domain_ops,
> > +       .chip   = &imsic_pci_irq_chip,
> > +};
> > +
> > +#endif
> > +
> > +static struct irq_chip imsic_plat_irq_chip = {
> > +       .name                   = "RISC-V IMSIC-PLAT",
> > +};
> > +
> > +static struct msi_domain_ops imsic_plat_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_plat_domain_info = {
> > +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> > +       .ops    = &imsic_plat_domain_ops,
> > +       .chip   = &imsic_plat_irq_chip,
> > +};
> > +
> > +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> > +                                        struct fwnode_handle *fwnode)
> > +{
> > +       /* Create Base IRQ domain */
> > +       priv->base_domain = irq_domain_create_tree(fwnode,
> > +                                               &imsic_base_domain_ops, priv);
> > +       if (!priv->base_domain) {
> > +               pr_err("Failed to create IMSIC base domain\n");
> > +               return -ENOMEM;
> > +       }
> > +       irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +       /* Create PCI MSI domain */
> > +       priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> > +                                               &imsic_pci_domain_info,
> > +                                               priv->base_domain);
> > +       if (!priv->pci_domain) {
> > +               pr_err("Failed to create IMSIC PCI domain\n");
> > +               irq_domain_remove(priv->base_domain);
> > +               return -ENOMEM;
> > +       }
> > +#endif
> > +
> > +       /* Create Platform MSI domain */
> > +       priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> > +                                               &imsic_plat_domain_info,
> > +                                               priv->base_domain);
> > +       if (!priv->plat_domain) {
> > +               pr_err("Failed to create IMSIC platform domain\n");
> > +               if (priv->pci_domain)
> > +                       irq_domain_remove(priv->pci_domain);
> > +               irq_domain_remove(priv->base_domain);
> > +               return -ENOMEM;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> > + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> > + * Linux interrupt number and let Linux IRQ subsystem handle it.
> > + */
> > +static void imsic_handle_irq(struct irq_desc *desc)
> > +{
> > +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > +       struct imsic_priv *priv = handler->priv;
> > +       irq_hw_number_t hwirq;
> > +       int err;
> > +
> > +       WARN_ON_ONCE(!handler->priv);
> > +
> > +       chained_irq_enter(chip, desc);
> > +
> > +       while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> > +               hwirq = hwirq >> TOPEI_ID_SHIFT;
> > +
> > +               if (hwirq == priv->ipi_id) {
> > +#ifdef CONFIG_SMP
> > +                       ipi_mux_process();
> > +#endif
> > +                       continue;
> > +               } else if (hwirq == priv->ipi_lsync_id) {
> > +                       imsic_ids_local_sync(priv);
> > +                       continue;
> > +               }
> > +
> > +               err = generic_handle_domain_irq(priv->base_domain, hwirq);
> > +               if (unlikely(err))
> > +                       pr_warn_ratelimited(
> > +                               "hwirq %lu mapping not found\n", hwirq);
> > +       }
> > +
> > +       chained_irq_exit(chip, desc);
> > +}
> > +
> > +static int imsic_starting_cpu(unsigned int cpu)
> > +{
> > +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +       struct imsic_priv *priv = handler->priv;
> > +
> > +       /* Enable per-CPU parent interrupt */
> > +       if (imsic_parent_irq)
> > +               enable_percpu_irq(imsic_parent_irq,
> > +                                 irq_get_trigger_type(imsic_parent_irq));
> > +       else
> > +               pr_warn("cpu%d: parent irq not available\n", cpu);
> > +
> > +       /* Enable IPIs */
> > +       imsic_ipi_enable(priv);
> > +
> > +       /*
> > +        * Interrupt identities might have been enabled/disabled while
> > +        * this CPU was not running so sync-up local enable/disable state.
> > +        */
> > +       imsic_ids_local_sync(priv);
> > +
> > +       /* Locally enable interrupt delivery */
> > +       imsic_ids_local_delivery(priv, true);
> > +
> > +       return 0;
> > +}
> > +
> > +struct imsic_fwnode_ops {
> > +       u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> > +                            void *fwopaque);
> > +       int (*parent_hartid)(struct fwnode_handle *fwnode,
> > +                            void *fwopaque, u32 index,
> > +                            unsigned long *out_hartid);
> > +       u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> > +       int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> > +                               void *fwopaque, u32 index,
> > +                               struct resource *res);
> > +       void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> > +                                 void *fwopaque, u32 index);
> > +       int (*read_u32)(struct fwnode_handle *fwnode,
> > +                       void *fwopaque, const char *prop, u32 *out_val);
> > +       bool (*read_bool)(struct fwnode_handle *fwnode,
> > +                         void *fwopaque, const char *prop);
> > +};
> > +
> > +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> > +                            struct fwnode_handle *fwnode,
> > +                            void *fwopaque)
> > +{
> > +       struct resource res;
> > +       phys_addr_t base_addr;
> > +       int rc, nr_parent_irqs;
> > +       struct imsic_mmio *mmio;
> > +       struct imsic_priv *priv;
> > +       struct irq_domain *domain;
> > +       struct imsic_handler *handler;
> > +       struct imsic_global_config *global;
> > +       u32 i, tmp, nr_handlers = 0;
> > +
> > +       if (imsic_init_done) {
> > +               pr_err("%pfwP: already initialized hence ignoring\n",
> > +                       fwnode);
> > +               return -ENODEV;
> > +       }
> > +
> > +       if (!riscv_isa_extension_available(NULL, SxAIA)) {
> > +               pr_err("%pfwP: AIA support not available\n", fwnode);
> > +               return -ENODEV;
> > +       }
> > +
> > +       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +       global = &priv->global;
> > +
> > +       /* Find number of parent interrupts */
> > +       nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> > +       if (!nr_parent_irqs) {
> > +               pr_err("%pfwP: no parent irqs available\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of guest index bits in MSI address */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> > +                            &global->guest_index_bits);
> > +       if (rc)
> > +               global->guest_index_bits = 0;
> > +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> > +       if (tmp < global->guest_index_bits) {
> > +               pr_err("%pfwP: guest index bits too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of HART index bits */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> > +                            &global->hart_index_bits);
> > +       if (rc) {
> > +               /* Assume default value */
> > +               global->hart_index_bits = __fls(nr_parent_irqs);
> > +               if (BIT(global->hart_index_bits) < nr_parent_irqs)
> > +                       global->hart_index_bits++;
> > +       }
> > +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > +             global->guest_index_bits;
> > +       if (tmp < global->hart_index_bits) {
> > +               pr_err("%pfwP: HART index bits too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of group index bits */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> > +                            &global->group_index_bits);
> > +       if (rc)
> > +               global->group_index_bits = 0;
> > +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > +             global->guest_index_bits - global->hart_index_bits;
> > +       if (tmp < global->group_index_bits) {
> > +               pr_err("%pfwP: group index bits too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /*
> > +        * Find the first bit position of the group index.
> > +        * If not specified, assume the default APLIC-IMSIC configuration.
> > +        */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> > +                            &global->group_index_shift);
> > +       if (rc)
> > +               global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> > +       tmp = global->group_index_bits + global->group_index_shift - 1;
> > +       if (tmp >= BITS_PER_LONG) {
> > +               pr_err("%pfwP: group index shift too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of interrupt identities */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> > +                            &global->nr_ids);
> > +       if (rc) {
> > +               pr_err("%pfwP: number of interrupt identities not found\n",
> > +                       fwnode);
> > +               return rc;
> > +       }
> > +       if ((global->nr_ids < IMSIC_MIN_ID) ||
> > +           (global->nr_ids >= IMSIC_MAX_ID) ||
> > +           ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > +               pr_err("%pfwP: invalid number of interrupt identities\n",
> > +                       fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of guest interrupt identities */
> > +       if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> > +                           &global->nr_guest_ids))
> > +               global->nr_guest_ids = global->nr_ids;
> > +       if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> > +           (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> > +           ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > +               pr_err("%pfwP: invalid number of guest interrupt identities\n",
> > +                       fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Compute base address */
> > +       rc = fwops->mmio_to_resource(fwnode, fwopaque, 0, &res);
> > +       if (rc) {
> > +               pr_err("%pfwP: first MMIO resource not found\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +       global->base_addr = res.start;
> > +       global->base_addr &= ~(BIT(global->guest_index_bits +
> > +                                  global->hart_index_bits +
> > +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> > +       global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> > +                              global->group_index_shift);
> > +
> > +       /* Find number of MMIO register sets */
> > +       priv->num_mmios = fwops->nr_mmio(fwnode, fwopaque);
> > +
> > +       /* Allocate MMIO register sets */
> > +       priv->mmios = kcalloc(priv->num_mmios, sizeof(*mmio), GFP_KERNEL);
> > +       if (!priv->mmios) {
> > +               rc = -ENOMEM;
> > +               goto out_free_priv;
> > +       }
> > +
> > +       /* Parse and map MMIO register sets */
> > +       for (i = 0; i < priv->num_mmios; i++) {
> > +               mmio = &priv->mmios[i];
> > +               rc = fwops->mmio_to_resource(fwnode, fwopaque, i, &res);
> > +               if (rc) {
> > +                       pr_err("%pfwP: unable to parse MMIO regset %d\n",
> > +                               fwnode, i);
> > +                       goto out_iounmap;
> > +               }
> > +               mmio->pa = res.start;
> > +               mmio->size = res.end - res.start + 1;
> > +
> > +               base_addr = mmio->pa;
> > +               base_addr &= ~(BIT(global->guest_index_bits +
> > +                                  global->hart_index_bits +
> > +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> > +               base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> > +                              global->group_index_shift);
> > +               if (base_addr != global->base_addr) {
> > +                       rc = -EINVAL;
> > +                       pr_err("%pfwP: address mismatch for regset %d\n",
> > +                               fwnode, i);
> > +                       goto out_iounmap;
> > +               }
> > +
> > +               mmio->va = fwops->mmio_map(fwnode, fwopaque, i);
> > +               if (!mmio->va) {
> > +                       rc = -EIO;
> > +                       pr_err("%pfwP: unable to map MMIO regset %d\n",
> > +                               fwnode, i);
> > +                       goto out_iounmap;
> > +               }
> > +       }
> > +
> > +       /* Initialize interrupt identity management */
> > +       rc = imsic_ids_init(priv);
> > +       if (rc) {
> > +               pr_err("%pfwP: failed to initialize interrupt management\n",
> > +                      fwnode);
> > +               goto out_iounmap;
> > +       }
> > +
> > +       /* Configure handlers for target CPUs */
> > +       for (i = 0; i < nr_parent_irqs; i++) {
> > +               unsigned long reloff, hartid;
> > +               int j, cpu;
> > +
> > +               rc = fwops->parent_hartid(fwnode, fwopaque, i, &hartid);
> > +               if (rc) {
> > +                       pr_warn("%pfwP: hart ID for parent irq%d not found\n",
> > +                               fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               cpu = riscv_hartid_to_cpuid(hartid);
> > +               if (cpu < 0) {
> > +                       pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
> > +                               fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               /* Find MMIO location of MSI page */
> > +               mmio = NULL;
> > +               reloff = i * BIT(global->guest_index_bits) *
> > +                        IMSIC_MMIO_PAGE_SZ;
> > +               for (j = 0; j < priv->num_mmios; j++) {
> > +                       if (reloff < priv->mmios[j].size) {
> > +                               mmio = &priv->mmios[j];
> > +                               break;
> > +                       }
> > +
> > +                       /*
> > +                        * MMIO region size may not be aligned to
> > +                        * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
> > +                        * if holes are present.
> > +                        */
> > +                       reloff -= ALIGN(priv->mmios[j].size,
> > +                                       BIT(global->guest_index_bits) *
> > +                                       IMSIC_MMIO_PAGE_SZ);
> > +               }
> > +               if (!mmio) {
> > +                       pr_warn("%pfwP: MMIO not found for parent irq%d\n",
> > +                               fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +               if (handler->priv) {
> > +                       pr_warn("%pfwP: CPU%d handler already configured.\n",
> > +                               fwnode, cpu);
> > +                       goto done;
> > +               }
> > +
> > +               cpumask_set_cpu(cpu, &priv->lmask);
> > +               handler->local.msi_pa = mmio->pa + reloff;
> > +               handler->local.msi_va = mmio->va + reloff;
> > +               handler->priv = priv;
> > +
> > +done:
> > +               nr_handlers++;
> > +       }
> > +
> > +       /* If no CPU handlers found then can't take interrupts */
> > +       if (!nr_handlers) {
> > +               pr_err("%pfwP: No CPU handlers found\n", fwnode);
> > +               rc = -ENODEV;
> > +               goto out_ids_cleanup;
> > +       }
> > +
> > +       /* Find parent domain and register chained handler */
> > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > +                                         DOMAIN_BUS_ANY);
> > +       if (!domain) {
> > +               pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> > +               rc = -ENOENT;
> > +               goto out_ids_cleanup;
> > +       }
> > +       imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > +       if (!imsic_parent_irq) {
> > +               pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> > +               rc = -ENOENT;
> > +               goto out_ids_cleanup;
> > +       }
> > +       irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> > +
> > +       /* Initialize IPI domain */
> > +       rc = imsic_ipi_domain_init(priv);
> > +       if (rc) {
> > +               pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> > +               goto out_ids_cleanup;
> > +       }
> > +
> > +       /* Initialize IRQ and MSI domains */
> > +       rc = imsic_irq_domains_init(priv, fwnode);
> > +       if (rc) {
> > +               pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
> > +                      fwnode);
> > +               goto out_ipi_domain_cleanup;
> > +       }
> > +
> > +       /*
> > +        * Setup cpuhp state
> > +        *
> > +        * Don't disable per-CPU IMSIC file when CPU goes offline
> > +        * because this affects IPI and the masking/unmasking of
> > +        * virtual IPIs is done via generic IPI-Mux
> > +        */
> > +       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > +                         "irqchip/riscv/imsic:starting",
> > +                         imsic_starting_cpu, NULL);
> > +
> > +       /*
> > +        * Only one IMSIC instance allowed in a platform for clean
> > +        * implementation of SMP IRQ affinity and per-CPU IPIs.
> > +        *
> > +        * This means on a multi-socket (or multi-die) platform we
> > +        * will have multiple MMIO regions for one IMSIC instance.
> > +        */
> > +       imsic_init_done = true;
> > +
> > +       pr_info("%pfwP:  hart-index-bits: %d,  guest-index-bits: %d\n",
> > +               fwnode, global->hart_index_bits, global->guest_index_bits);
> > +       pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
> > +               fwnode, global->group_index_bits, global->group_index_shift);
> > +       pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
> > +               fwnode, global->nr_ids, nr_handlers, &global->base_addr);
> > +       if (priv->ipi_lsync_id)
> > +               pr_info("%pfwP: enable/disable sync using interrupt %d\n",
> > +                       fwnode, priv->ipi_lsync_id);
> > +       if (priv->ipi_id)
> > +               pr_info("%pfwP: providing IPIs using interrupt %d\n",
> > +                       fwnode, priv->ipi_id);
> > +
> > +       return 0;
> > +
> > +out_ipi_domain_cleanup:
> > +       imsic_ipi_domain_cleanup(priv);
> > +out_ids_cleanup:
> > +       imsic_ids_cleanup(priv);
> > +out_iounmap:
> > +       for (i = 0; i < priv->num_mmios; i++) {
> > +               if (priv->mmios[i].va)
> > +                       iounmap(priv->mmios[i].va);
> > +       }
> > +       kfree(priv->mmios);
> > +out_free_priv:
> > +       kfree(priv);
> > +       return rc;
> > +}
> > +
> > +static u32 __init imsic_dt_nr_parent_irq(struct fwnode_handle *fwnode,
> > +                                        void *fwopaque)
> > +{
> > +       return of_irq_count(to_of_node(fwnode));
> > +}
> > +
> > +static int __init imsic_dt_parent_hartid(struct fwnode_handle *fwnode,
> > +                                        void *fwopaque, u32 index,
> > +                                        unsigned long *out_hartid)
> > +{
> > +       struct of_phandle_args parent;
> > +       int rc;
> > +
> > +       rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
> > +       if (rc)
> > +               return rc;
> > +
> > +       /*
> > +        * Skip interrupts other than external interrupts for
> > +        * current privilege level.
> > +        */
> > +       if (parent.args[0] != RV_IRQ_EXT)
> > +               return -EINVAL;
> > +
> > +       return riscv_of_parent_hartid(parent.np, out_hartid);
> > +}
> > +
> > +static u32 __init imsic_dt_nr_mmio(struct fwnode_handle *fwnode,
> > +                                  void *fwopaque)
> > +{
> > +       u32 ret = 0;
> > +       struct resource res;
> > +
> > +       while (!of_address_to_resource(to_of_node(fwnode), ret, &res))
> > +               ret++;
> > +
> > +       return ret;
> > +}
> > +
> > +static int __init imsic_mmio_to_resource(struct fwnode_handle *fwnode,
> > +                                        void *fwopaque, u32 index,
> > +                                        struct resource *res)
> > +{
> > +       return of_address_to_resource(to_of_node(fwnode), index, res);
> > +}
> > +
> > +static void __iomem __init *imsic_dt_mmio_map(struct fwnode_handle *fwnode,
> > +                                             void *fwopaque, u32 index)
> > +{
> > +       return of_iomap(to_of_node(fwnode), index);
> > +}
> > +
> > +static int __init imsic_dt_read_u32(struct fwnode_handle *fwnode,
> > +                                   void *fwopaque, const char *prop,
> > +                                   u32 *out_val)
> > +{
> > +       return of_property_read_u32(to_of_node(fwnode), prop, out_val);
> > +}
> > +
> > +static bool __init imsic_dt_read_bool(struct fwnode_handle *fwnode,
> > +                                     void *fwopaque, const char *prop)
> > +{
> > +       return of_property_read_bool(to_of_node(fwnode), prop);
> > +}
> > +
> > +static int __init imsic_dt_init(struct device_node *node,
> > +                               struct device_node *parent)
> > +{
> > +       struct imsic_fwnode_ops ops = {
> > +               .nr_parent_irq = imsic_dt_nr_parent_irq,
> > +               .parent_hartid = imsic_dt_parent_hartid,
> > +               .nr_mmio = imsic_dt_nr_mmio,
> > +               .mmio_to_resource = imsic_mmio_to_resource,
> > +               .mmio_map = imsic_dt_mmio_map,
> > +               .read_u32 = imsic_dt_read_u32,
> > +               .read_bool = imsic_dt_read_bool,
> > +       };
> > +
> > +       return imsic_init(&ops, &node->fwnode, NULL);
> > +}
> > +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
> > diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
> > new file mode 100644
> > index 000000000000..5d1387adc0ba
> > --- /dev/null
> > +++ b/include/linux/irqchip/riscv-imsic.h
> > @@ -0,0 +1,92 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
> > +#define __LINUX_IRQCHIP_RISCV_IMSIC_H
> > +
> > +#include <linux/types.h>
> > +#include <asm/csr.h>
> > +
> > +#define IMSIC_MMIO_PAGE_SHIFT          12
> > +#define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
> > +#define IMSIC_MMIO_PAGE_LE             0x00
> > +#define IMSIC_MMIO_PAGE_BE             0x04
> > +
> > +#define IMSIC_MIN_ID                   63
> > +#define IMSIC_MAX_ID                   2048
> > +
> > +#define IMSIC_EIDELIVERY               0x70
> > +
> > +#define IMSIC_EITHRESHOLD              0x72
> > +
> > +#define IMSIC_EIP0                     0x80
> > +#define IMSIC_EIP63                    0xbf
> > +#define IMSIC_EIPx_BITS                        32
> > +
> > +#define IMSIC_EIE0                     0xc0
> > +#define IMSIC_EIE63                    0xff
> > +#define IMSIC_EIEx_BITS                        32
> > +
> > +#define IMSIC_FIRST                    IMSIC_EIDELIVERY
> > +#define IMSIC_LAST                     IMSIC_EIE63
> > +
> > +#define IMSIC_MMIO_SETIPNUM_LE         0x00
> > +#define IMSIC_MMIO_SETIPNUM_BE         0x04
> > +
> > +struct imsic_global_config {
> > +       /*
> > +        * MSI Target Address Scheme
> > +        *
> > +        * XLEN-1                                                12     0
> > +        * |                                                     |     |
> > +        * -------------------------------------------------------------
> > +        * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
> > +        * -------------------------------------------------------------
> > +        */
> > +
> > +       /* Bits representing Guest index, HART index, and Group index */
> > +       u32 guest_index_bits;
> > +       u32 hart_index_bits;
> > +       u32 group_index_bits;
> > +       u32 group_index_shift;
> > +
> > +       /* Global base address matching all target MSI addresses */
> > +       phys_addr_t base_addr;
> > +
> > +       /* Number of interrupt identities */
> > +       u32 nr_ids;
> > +
> > +       /* Number of guest interrupt identities */
> > +       u32 nr_guest_ids;
> > +};
> > +
> > +struct imsic_local_config {
> > +       phys_addr_t msi_pa;
> > +       void __iomem *msi_va;
> > +};
> > +
> > +#ifdef CONFIG_RISCV_IMSIC
> > +
> > +extern const struct imsic_global_config *imsic_get_global_config(void);
> > +
> > +extern const struct imsic_local_config *imsic_get_local_config(
> > +                                                       unsigned int cpu);
> > +
> > +#else
> > +
> > +static inline const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > +       return NULL;
> > +}
> > +
> > +static inline const struct imsic_local_config *imsic_get_local_config(
> > +                                                       unsigned int cpu)
> > +{
> > +       return NULL;
> > +}
> > +
> > +#endif
> > +
> > +#endif
> > --
> > 2.34.1
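
As a concrete reading of the MSI target address scheme documented in
riscv-imsic.h above, the per-guest MSI page that a device ends up
writing to can be reconstructed from struct imsic_global_config roughly
as follows (an illustrative helper only, not part of the patch; it
mirrors the masking in imsic_init() and the page offsets used in
imsic_cpu_page_phys()):

static phys_addr_t imsic_msi_page_addr(const struct imsic_global_config *g,
				       u32 group_index, u32 hart_index,
				       u32 guest_index)
{
	phys_addr_t addr = g->base_addr;

	/* Group index lives at a platform-defined shift above the rest */
	addr |= (phys_addr_t)group_index << g->group_index_shift;

	/* HART and guest indexes each select one 4KiB MMIO page */
	addr |= (phys_addr_t)((hart_index << g->guest_index_bits) |
			      guest_index) << IMSIC_MMIO_PAGE_SHIFT;

	return addr;
}

With guest_index_bits and group_index_bits both zero this reduces to
base_addr + hart_index * IMSIC_MMIO_PAGE_SZ, which matches the reloff
computation done per parent interrupt in imsic_init().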
> >
> >

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
@ 2023-01-18  4:20           ` Anup Patel
  0 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-18  4:20 UTC (permalink / raw)
  To: Vincent Chen
  Cc: apatel, Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	linux-riscv, linux-kernel@vger.kernel.org List, devicetree

On Wed, Jan 18, 2023 at 9:20 AM Vincent Chen <vincent.chen@sifive.com> wrote:
>
> > From: Anup Patel <apatel@ventanamicro.com>
> > Date: Wed, Jan 4, 2023 at 1:19 AM
> > Subject: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
> > To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> > Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
> >
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new MSI controller for managing MSIs on a RISC-V platform. This new
> > MSI controller is referred to as the incoming message signaled interrupt
> > controller (IMSIC), which manages MSIs on a per-HART (or per-CPU) basis.
> > (For more details, refer to https://github.com/riscv/riscv-aia)
> >
> > This patch adds an irqchip driver for the RISC-V IMSIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  drivers/irqchip/Kconfig             |   14 +-
> >  drivers/irqchip/Makefile            |    1 +
> >  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
> >  include/linux/irqchip/riscv-imsic.h |   92 +++
> >  4 files changed, 1280 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
> >  create mode 100644 include/linux/irqchip/riscv-imsic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index 9e65345ca3f6..a1315189a595 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -29,7 +29,6 @@ config ARM_GIC_V2M
> >
> >  config GIC_NON_BANKED
> >         bool
> > -
> >  config ARM_GIC_V3
> >         bool
> >         select IRQ_DOMAIN_HIERARCHY
> > @@ -548,6 +547,19 @@ config SIFIVE_PLIC
> >         select IRQ_DOMAIN_HIERARCHY
> >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_IMSIC
> > +       bool
> > +       depends on RISCV
> > +       select IRQ_DOMAIN_HIERARCHY
> > +       select GENERIC_MSI_IRQ_DOMAIN
> > +
> > +config RISCV_IMSIC_PCI
> > +       bool
> > +       depends on RISCV_IMSIC
> > +       depends on PCI
> > +       depends on PCI_MSI
> > +       default RISCV_IMSIC
> > +
> >  config EXYNOS_IRQ_COMBINER
> >         bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> >         depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 87b49a10962c..22c723cc6ec8 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> >  obj-$(CONFIG_IMX_INTMUX)               += irq-imx-intmux.o
> > diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> > new file mode 100644
> > index 000000000000..4c16b66738d6
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic.c
> > @@ -0,0 +1,1174 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/bitmap.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iommu.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/pci.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +#include <asm/hwcap.h>
> > +
> > +#define IMSIC_DISABLE_EIDELIVERY       0
> > +#define IMSIC_ENABLE_EIDELIVERY                1
> > +#define IMSIC_DISABLE_EITHRESHOLD      1
> > +#define IMSIC_ENABLE_EITHRESHOLD       0
> > +
> > +#define imsic_csr_write(__c, __v)      \
> > +do {                                   \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       csr_write(CSR_IREG, __v);       \
> > +} while (0)
> > +
> > +#define imsic_csr_read(__c)            \
> > +({                                     \
> > +       unsigned long __v;              \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       __v = csr_read(CSR_IREG);       \
> > +       __v;                            \
> > +})
> > +
> > +#define imsic_csr_set(__c, __v)                \
> > +do {                                   \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       csr_set(CSR_IREG, __v);         \
> > +} while (0)
> > +
> > +#define imsic_csr_clear(__c, __v)      \
> > +do {                                   \
> > +       csr_write(CSR_ISELECT, __c);    \
> > +       csr_clear(CSR_IREG, __v);       \
> > +} while (0)
> > +
> > +struct imsic_mmio {
> > +       phys_addr_t pa;
> > +       void __iomem *va;
> > +       unsigned long size;
> > +};
> > +
> > +struct imsic_priv {
> > +       /* Global configuration common for all HARTs */
> > +       struct imsic_global_config global;
> > +
> > +       /* MMIO regions */
> > +       u32 num_mmios;
> > +       struct imsic_mmio *mmios;
> > +
> > +       /* Global state of interrupt identities */
> > +       raw_spinlock_t ids_lock;
> > +       unsigned long *ids_used_bimap;
> > +       unsigned long *ids_enabled_bimap;
> > +       unsigned int *ids_target_cpu;
> > +
> > +       /* Mask for connected CPUs */
> > +       struct cpumask lmask;
> > +
> > +       /* IPI interrupt identity */
> > +       u32 ipi_id;
> > +       u32 ipi_lsync_id;
> > +
> > +       /* IRQ domains */
> > +       struct irq_domain *base_domain;
> > +       struct irq_domain *pci_domain;
> > +       struct irq_domain *plat_domain;
> > +};
> > +
> > +struct imsic_handler {
> > +       /* Local configuration for given HART */
> > +       struct imsic_local_config local;
> > +
> > +       /* Pointer to private context */
> > +       struct imsic_priv *priv;
> > +};
> > +
> > +static bool imsic_init_done;
> > +
> > +static int imsic_parent_irq;
> > +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> > +
> > +const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +
> > +       if (!handler || !handler->priv)
> > +               return NULL;
> > +
> > +       return &handler->priv->global;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > +
> > +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> > +{
> > +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > +       if (!handler || !handler->priv)
> > +               return NULL;
> > +
> > +       return &handler->local;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_local_config);
> > +
> > +static int imsic_cpu_page_phys(unsigned int cpu,
> > +                              unsigned int guest_index,
> > +                              phys_addr_t *out_msi_pa)
> > +{
> > +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +       struct imsic_global_config *global;
> > +       struct imsic_local_config *local;
> > +
> > +       if (!handler || !handler->priv)
> > +               return -ENODEV;
> > +       local = &handler->local;
> > +       global = &handler->priv->global;
> > +
> > +       if (BIT(global->guest_index_bits) <= guest_index)
> > +               return -EINVAL;
> > +
> > +       if (out_msi_pa)
> > +               *out_msi_pa = local->msi_pa +
> > +                             (guest_index * IMSIC_MMIO_PAGE_SZ);
> > +
> > +       return 0;
> > +}
> > +
> > +static int imsic_get_cpu(struct imsic_priv *priv,
> > +                        const struct cpumask *mask_val, bool force,
> > +                        unsigned int *out_target_cpu)
> > +{
> > +       struct cpumask amask;
> > +       unsigned int cpu;
> > +
> > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > +       if (force)
> > +               cpu = cpumask_first(&amask);
> > +       else
> > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > +       if (cpu >= nr_cpu_ids)
> > +               return -EINVAL;
> > +
> > +       if (out_target_cpu)
> > +               *out_target_cpu = cpu;
> > +
> > +       return 0;
> > +}
> > +
> > +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> > +                                struct msi_msg *msg)
> > +{
> > +       phys_addr_t msi_addr;
> > +       int err;
> > +
> > +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > +       if (err)
> > +               return err;
> > +
> > +       msg->address_hi = upper_32_bits(msi_addr);
> > +       msg->address_lo = lower_32_bits(msi_addr);
> > +       msg->data = id;
> > +
> > +       return err;
> > +}
> > +
> > +static void imsic_id_set_target(struct imsic_priv *priv,
> > +                                unsigned int id, unsigned int target_cpu)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       priv->ids_target_cpu[id] = target_cpu;
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> > +                                       unsigned int id)
> > +{
> > +       unsigned int ret;
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       ret = priv->ids_target_cpu[id];
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       return ret;
> > +}
> > +
> > +static void __imsic_eix_update(unsigned long base_id,
> > +                              unsigned long num_id, bool pend, bool val)
> > +{
> > +       unsigned long i, isel, ireg, flags;
> > +       unsigned long id = base_id, last_id = base_id + num_id;
> > +
> > +       while (id < last_id) {
> > +               isel = id / BITS_PER_LONG;
> > +               isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > +               isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> > +
> > +               ireg = 0;
> > +               for (i = id & (__riscv_xlen - 1);
> > +                    (id < last_id) && (i < __riscv_xlen); i++) {
> > +                       ireg |= BIT(i);
> > +                       id++;
> > +               }
> > +
> > +               /*
> > +                * The IMSIC EIEx and EIPx registers are indirectly
> > +                * accessed using the ISELECT and IREG CSRs, so we
> > +                * save/restore the local IRQ state to ensure that we
> > +                * don't get preempted while accessing IMSIC registers.
> > +                */
> > +               local_irq_save(flags);
> > +               if (val)
> > +                       imsic_csr_set(isel, ireg);
> > +               else
> > +                       imsic_csr_clear(isel, ireg);
> > +               local_irq_restore(flags);
> > +       }
> > +}
> > +
> > +#define __imsic_id_enable(__id)                \
> > +       __imsic_eix_update((__id), 1, false, true)
> > +#define __imsic_id_disable(__id)       \
> > +       __imsic_eix_update((__id), 1, false, false)
> > +
> > +#ifdef CONFIG_SMP
> > +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> > +{
> > +       struct imsic_handler *handler;
> > +       struct cpumask amask;
> > +       int cpu;
> > +
> > +       cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > +       for_each_cpu(cpu, &amask) {
> > +               if (cpu == smp_processor_id())
> > +                       continue;
> > +
> > +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +               if (!handler || !handler->priv || !handler->local.msi_va) {
> > +                       pr_warn("CPU%d: handler not initialized\n", cpu);
> > +                       continue;
> > +               }
> > +
> > +               writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
> > +       }
> > +}
> > +#else
> > +#define __imsic_id_smp_sync(__priv)
> > +#endif
> > +
> > +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       bitmap_set(priv->ids_enabled_bimap, id, 1);
> > +       __imsic_id_enable(id);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       bitmap_clear(priv->ids_enabled_bimap, id, 1);
> > +       __imsic_id_disable(id);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_ids_local_sync(struct imsic_priv *priv)
> > +{
> > +       int i;
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       for (i = 1; i <= priv->global.nr_ids; i++) {
> > +               if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> > +                       continue;
> > +
> > +               if (test_bit(i, priv->ids_enabled_bimap))
> > +                       __imsic_id_enable(i);
> > +               else
> > +                       __imsic_id_disable(i);
> > +       }
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> > +{
> > +       if (enable) {
> > +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> > +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> > +       } else {
> > +               imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> > +               imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> > +       }
> > +}
> > +
> > +static int imsic_ids_alloc(struct imsic_priv *priv,
> > +                          unsigned int max_id, unsigned int order)
> > +{
> > +       int ret;
> > +       unsigned long flags;
> > +
> > +       if ((priv->global.nr_ids < max_id) ||
> > +           (max_id < BIT(order)))
> > +               return -EINVAL;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       ret = bitmap_find_free_region(priv->ids_used_bimap,
> > +                                     max_id + 1, order);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +       return ret;
> > +}
> > +
> > +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> > +                          unsigned int order)
> > +{
> > +       unsigned long flags;
> > +
> > +       raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +       bitmap_release_region(priv->ids_used_bimap, base_id, order);
> > +       raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static int __init imsic_ids_init(struct imsic_priv *priv)
> > +{
> > +       int i;
> > +       struct imsic_global_config *global = &priv->global;
> > +
> > +       raw_spin_lock_init(&priv->ids_lock);
> > +
> > +       /* Allocate used bitmap */
> > +       priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > +                                       sizeof(unsigned long), GFP_KERNEL);
> > +       if (!priv->ids_used_bimap)
> > +               return -ENOMEM;
> > +
> > +       /* Allocate enabled bitmap */
> > +       priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > +                                          sizeof(unsigned long), GFP_KERNEL);
> > +       if (!priv->ids_enabled_bimap) {
> > +               kfree(priv->ids_used_bimap);
> > +               return -ENOMEM;
> > +       }
> > +
> > +       /* Allocate target CPU array */
> > +       priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> > +                                      sizeof(unsigned int), GFP_KERNEL);
> > +       if (!priv->ids_target_cpu) {
> > +               kfree(priv->ids_enabled_bimap);
> > +               kfree(priv->ids_used_bimap);
> > +               return -ENOMEM;
> > +       }
> > +       for (i = 0; i <= global->nr_ids; i++)
> > +               priv->ids_target_cpu[i] = UINT_MAX;
> > +
> > +       /* Reserve ID#0 because it is special and never implemented */
> > +       bitmap_set(priv->ids_used_bimap, 0, 1);
> > +
> > +       return 0;
> > +}
> > +
> > +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> > +{
> > +       kfree(priv->ids_target_cpu);
> > +       kfree(priv->ids_enabled_bimap);
> > +       kfree(priv->ids_used_bimap);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_ipi_send(unsigned int cpu)
> > +{
> > +       struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > +       if (!handler || !handler->priv || !handler->local.msi_va) {
> > +               pr_warn("CPU%d: handler not initialized\n", cpu);
> > +               return;
> > +       }
> > +
> > +       writel(handler->priv->ipi_id, handler->local.msi_va);
> > +}
> > +
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > +       __imsic_id_enable(priv->ipi_id);
> > +       __imsic_id_enable(priv->ipi_lsync_id);
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > +       int virq;
> > +
> > +       /* Allocate interrupt identity for IPIs */
> > +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > +       if (virq < 0)
> > +               return virq;
> > +       priv->ipi_id = virq;
> > +
> > +       /* Create IMSIC IPI multiplexing */
> > +       virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
> > +       if (virq <= 0) {
> > +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +               return (virq < 0) ? virq : -ENOMEM;
> > +       }
> > +
> > +       /* Set vIRQ range */
> > +       riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> > +
> > +       /* Allocate interrupt identity for local enable/disable sync */
> > +       virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > +       if (virq < 0) {
> > +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +               return virq;
> > +       }
> > +       priv->ipi_lsync_id = virq;
> > +
> > +       return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > +       imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> > +       if (priv->ipi_id)
> > +               imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +}
> > +#else
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > +       /* Clear the IPI ids because we are not using IPIs */
> > +       priv->ipi_id = 0;
> > +       priv->ipi_lsync_id = 0;
> > +       return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > +}
> > +#endif
> > +
> > +static void imsic_irq_mask(struct irq_data *d)
> > +{
> > +       imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_unmask(struct irq_data *d)
> > +{
> > +       imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> > +                                     struct msi_msg *msg)
> > +{
> > +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       unsigned int cpu;
> > +       int err;
> > +
> > +       cpu = imsic_id_get_target(priv, d->hwirq);
> > +       WARN_ON(cpu == UINT_MAX);
> > +
> > +       err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> > +       WARN_ON(err);
> > +
> > +       iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static int imsic_irq_set_affinity(struct irq_data *d,
> > +                                 const struct cpumask *mask_val,
> > +                                 bool force)
> > +{
> > +       struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       unsigned int target_cpu;
> > +       int rc;
> > +
> > +       rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> > +       if (rc)
> > +               return rc;
> > +
> > +       imsic_id_set_target(priv, d->hwirq, target_cpu);
> > +       irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> > +
> > +       return IRQ_SET_MASK_OK;
> > +}
> > +#endif
> > +
> > +static struct irq_chip imsic_irq_base_chip = {
> > +       .name                   = "RISC-V IMSIC-BASE",
> > +       .irq_mask               = imsic_irq_mask,
> > +       .irq_unmask             = imsic_irq_unmask,
> > +#ifdef CONFIG_SMP
> > +       .irq_set_affinity       = imsic_irq_set_affinity,
> > +#endif
> > +       .irq_compose_msi_msg    = imsic_irq_compose_msi_msg,
> > +       .flags                  = IRQCHIP_SKIP_SET_WAKE |
> > +                                 IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> > +                                 unsigned int virq,
> > +                                 unsigned int nr_irqs,
> > +                                 void *args)
> > +{
> > +       struct imsic_priv *priv = domain->host_data;
> > +       msi_alloc_info_t *info = args;
> > +       phys_addr_t msi_addr;
> > +       int i, hwirq, err = 0;
> > +       unsigned int cpu;
> > +
> > +       err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> > +       if (err)
> > +               return err;
> > +
> > +       err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > +       if (err)
> > +               return err;
> > +
> > +       hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> > +                               get_count_order(nr_irqs));
> > +       if (hwirq < 0)
> > +               return hwirq;
> > +
> > +       err = iommu_dma_prepare_msi(info->desc, msi_addr);
>
> Hi Anup,
> First of all, thank you for completing this patch set to support all
> the AIA features. After reviewing this patch, I'm concerned that it
> may have an issue with changing CPU affinity.
>
> As far as I understand, imsic_irq_domain_alloc() is only called once
> for a device, when the device registers its IRQ. This means that
> iommu_dma_prepare_msi() is also called only once. When a device is
> behind an IOMMU, iommu_dma_prepare_msi() allocates an IOVA for the
> physical MSI address of this device and then calls iommu_map() to
> create the mapping between this IOVA and the physical MSI address.
> It also calls msi_desc_set_iommu_cookie() to store this mapping in
> the iommu_cookie. Afterwards, iommu_dma_compose_msi_msg(), called by
> imsic_irq_compose_msi_msg(), directly uses desc->iommu_cookie to
> compose the MSI address for this device. However, as mentioned
> earlier, iommu_dma_prepare_msi() is called only when the device
> registers its IRQ, so it only allocates an IOVA for the initially
> chosen target CPU and stores that IOVA in desc->iommu_cookie. My
> worry is that changing the CPU affinity will then not work for this
> device: the IMSIC driver does not create a new IOMMU mapping for the
> MSI address of the new CPU when the affinity changes, and
> desc->iommu_cookie still refers to the mapping of the old CPU
> without being updated.
>
> One possible solution is to create the IOVA mappings for all per-CPU
> IMSIC MSI addresses in imsic_irq_domain_alloc(). Then, when the IRQ
> affinity is changed to a new CPU, the IMSIC driver would update
> desc->iommu_cookie with the IOVA of that new CPU. However, I'm not
> sure whether updating desc->iommu_cookie every time the affinity
> changes would go against the original intent of the cookie.

Thanks for pointing out this potential issue. I will try to address it
in the v3 patch.

Regards,
Anup
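
For illustration, a minimal sketch of the direction discussed above
(assumed, not the actual v3 change): re-run iommu_dma_prepare_msi()
for the newly selected CPU from the set_affinity path so that
desc->iommu_cookie tracks the new target's IMSIC doorbell. It reuses
the helpers from the patch (imsic_get_cpu(), imsic_cpu_page_phys(),
imsic_id_set_target()) and leaves aside whether refreshing the cookie
on every affinity change is acceptable, which is the open question
raised above.

/*
 * Illustrative sketch only -- not the posted patch. Re-prepare the
 * IOMMU MSI mapping for the newly selected CPU so that the message
 * composed later via desc->iommu_cookie points at the new target's
 * IMSIC doorbell.
 */
static int imsic_irq_set_affinity(struct irq_data *d,
				  const struct cpumask *mask_val,
				  bool force)
{
	struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
	struct msi_desc *desc = irq_data_get_msi_desc(d);
	unsigned int target_cpu;
	phys_addr_t msi_addr;
	int rc;

	rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
	if (rc)
		return rc;

	/* Look up the physical MSI doorbell page of the new target CPU */
	rc = imsic_cpu_page_phys(target_cpu, 0, &msi_addr);
	if (rc)
		return rc;

	/* Re-map the doorbell and refresh desc->iommu_cookie */
	rc = iommu_dma_prepare_msi(desc, msi_addr);
	if (rc)
		return rc;

	imsic_id_set_target(priv, d->hwirq, target_cpu);
	irq_data_update_effective_affinity(d, cpumask_of(target_cpu));

	return IRQ_SET_MASK_OK;
}

Whether iommu_dma_prepare_msi() may safely be called from this
context (it can sleep) is a separate constraint that any real fix
would also have to address.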

>
>
> > +       if (err)
> > +               goto fail;
> > +
> > +       for (i = 0; i < nr_irqs; i++) {
> > +               imsic_id_set_target(priv, hwirq + i, cpu);
> > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > +                                   &imsic_irq_base_chip, priv,
> > +                                   handle_simple_irq, NULL, NULL);
> > +               irq_set_noprobe(virq + i);
> > +               irq_set_affinity(virq + i, &priv->lmask);
> > +       }
> > +
> > +       return 0;
> > +
> > +fail:
> > +       imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> > +       return err;
> > +}
> > +
> > +static void imsic_irq_domain_free(struct irq_domain *domain,
> > +                                 unsigned int virq,
> > +                                 unsigned int nr_irqs)
> > +{
> > +       struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> > +       struct imsic_priv *priv = domain->host_data;
> > +
> > +       imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> > +       irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> > +}
> > +
> > +static const struct irq_domain_ops imsic_base_domain_ops = {
> > +       .alloc          = imsic_irq_domain_alloc,
> > +       .free           = imsic_irq_domain_free,
> > +};
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +
> > +static void imsic_pci_mask_irq(struct irq_data *d)
> > +{
> > +       pci_msi_mask_irq(d);
> > +       irq_chip_mask_parent(d);
> > +}
> > +
> > +static void imsic_pci_unmask_irq(struct irq_data *d)
> > +{
> > +       pci_msi_unmask_irq(d);
> > +       irq_chip_unmask_parent(d);
> > +}
> > +
> > +static struct irq_chip imsic_pci_irq_chip = {
> > +       .name                   = "RISC-V IMSIC-PCI",
> > +       .irq_mask               = imsic_pci_mask_irq,
> > +       .irq_unmask             = imsic_pci_unmask_irq,
> > +       .irq_eoi                = irq_chip_eoi_parent,
> > +};
> > +
> > +static struct msi_domain_ops imsic_pci_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_pci_domain_info = {
> > +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> > +                  MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> > +       .ops    = &imsic_pci_domain_ops,
> > +       .chip   = &imsic_pci_irq_chip,
> > +};
> > +
> > +#endif
> > +
> > +static struct irq_chip imsic_plat_irq_chip = {
> > +       .name                   = "RISC-V IMSIC-PLAT",
> > +};
> > +
> > +static struct msi_domain_ops imsic_plat_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_plat_domain_info = {
> > +       .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> > +       .ops    = &imsic_plat_domain_ops,
> > +       .chip   = &imsic_plat_irq_chip,
> > +};
> > +
> > +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> > +                                        struct fwnode_handle *fwnode)
> > +{
> > +       /* Create Base IRQ domain */
> > +       priv->base_domain = irq_domain_create_tree(fwnode,
> > +                                               &imsic_base_domain_ops, priv);
> > +       if (!priv->base_domain) {
> > +               pr_err("Failed to create IMSIC base domain\n");
> > +               return -ENOMEM;
> > +       }
> > +       irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +       /* Create PCI MSI domain */
> > +       priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> > +                                               &imsic_pci_domain_info,
> > +                                               priv->base_domain);
> > +       if (!priv->pci_domain) {
> > +               pr_err("Failed to create IMSIC PCI domain\n");
> > +               irq_domain_remove(priv->base_domain);
> > +               return -ENOMEM;
> > +       }
> > +#endif
> > +
> > +       /* Create Platform MSI domain */
> > +       priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> > +                                               &imsic_plat_domain_info,
> > +                                               priv->base_domain);
> > +       if (!priv->plat_domain) {
> > +               pr_err("Failed to create IMSIC platform domain\n");
> > +               if (priv->pci_domain)
> > +                       irq_domain_remove(priv->pci_domain);
> > +               irq_domain_remove(priv->base_domain);
> > +               return -ENOMEM;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> > + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> > + * Linux interrupt number and let Linux IRQ subsystem handle it.
> > + */
> > +static void imsic_handle_irq(struct irq_desc *desc)
> > +{
> > +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > +       struct imsic_priv *priv = handler->priv;
> > +       irq_hw_number_t hwirq;
> > +       int err;
> > +
> > +       WARN_ON_ONCE(!handler->priv);
> > +
> > +       chained_irq_enter(chip, desc);
> > +
> > +       while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> > +               hwirq = hwirq >> TOPEI_ID_SHIFT;
> > +
> > +               if (hwirq == priv->ipi_id) {
> > +#ifdef CONFIG_SMP
> > +                       ipi_mux_process();
> > +#endif
> > +                       continue;
> > +               } else if (hwirq == priv->ipi_lsync_id) {
> > +                       imsic_ids_local_sync(priv);
> > +                       continue;
> > +               }
> > +
> > +               err = generic_handle_domain_irq(priv->base_domain, hwirq);
> > +               if (unlikely(err))
> > +                       pr_warn_ratelimited(
> > +                               "hwirq %lu mapping not found\n", hwirq);
> > +       }
> > +
> > +       chained_irq_exit(chip, desc);
> > +}
> > +
> > +static int imsic_starting_cpu(unsigned int cpu)
> > +{
> > +       struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +       struct imsic_priv *priv = handler->priv;
> > +
> > +       /* Enable per-CPU parent interrupt */
> > +       if (imsic_parent_irq)
> > +               enable_percpu_irq(imsic_parent_irq,
> > +                                 irq_get_trigger_type(imsic_parent_irq));
> > +       else
> > +               pr_warn("cpu%d: parent irq not available\n", cpu);
> > +
> > +       /* Enable IPIs */
> > +       imsic_ipi_enable(priv);
> > +
> > +       /*
> > +        * Interrupts identities might have been enabled/disabled while
> > +        * this CPU was not running so sync-up local enable/disable state.
> > +        */
> > +       imsic_ids_local_sync(priv);
> > +
> > +       /* Locally enable interrupt delivery */
> > +       imsic_ids_local_delivery(priv, true);
> > +
> > +       return 0;
> > +}
> > +
> > +struct imsic_fwnode_ops {
> > +       u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> > +                            void *fwopaque);
> > +       int (*parent_hartid)(struct fwnode_handle *fwnode,
> > +                            void *fwopaque, u32 index,
> > +                            unsigned long *out_hartid);
> > +       u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> > +       int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> > +                               void *fwopaque, u32 index,
> > +                               struct resource *res);
> > +       void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> > +                                 void *fwopaque, u32 index);
> > +       int (*read_u32)(struct fwnode_handle *fwnode,
> > +                       void *fwopaque, const char *prop, u32 *out_val);
> > +       bool (*read_bool)(struct fwnode_handle *fwnode,
> > +                         void *fwopaque, const char *prop);
> > +};
> > +
> > +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> > +                            struct fwnode_handle *fwnode,
> > +                            void *fwopaque)
> > +{
> > +       struct resource res;
> > +       phys_addr_t base_addr;
> > +       int rc, nr_parent_irqs;
> > +       struct imsic_mmio *mmio;
> > +       struct imsic_priv *priv;
> > +       struct irq_domain *domain;
> > +       struct imsic_handler *handler;
> > +       struct imsic_global_config *global;
> > +       u32 i, tmp, nr_handlers = 0;
> > +
> > +       if (imsic_init_done) {
> > +               pr_err("%pfwP: already initialized hence ignoring\n",
> > +                       fwnode);
> > +               return -ENODEV;
> > +       }
> > +
> > +       if (!riscv_isa_extension_available(NULL, SxAIA)) {
> > +               pr_err("%pfwP: AIA support not available\n", fwnode);
> > +               return -ENODEV;
> > +       }
> > +
> > +       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +       global = &priv->global;
> > +
> > +       /* Find number of parent interrupts */
> > +       nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> > +       if (!nr_parent_irqs) {
> > +               pr_err("%pfwP: no parent irqs available\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of guest index bits in MSI address */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> > +                            &global->guest_index_bits);
> > +       if (rc)
> > +               global->guest_index_bits = 0;
> > +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> > +       if (tmp < global->guest_index_bits) {
> > +               pr_err("%pfwP: guest index bits too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of HART index bits */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> > +                            &global->hart_index_bits);
> > +       if (rc) {
> > +               /* Assume default value */
> > +               global->hart_index_bits = __fls(nr_parent_irqs);
> > +               if (BIT(global->hart_index_bits) < nr_parent_irqs)
> > +                       global->hart_index_bits++;
> > +       }
> > +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > +             global->guest_index_bits;
> > +       if (tmp < global->hart_index_bits) {
> > +               pr_err("%pfwP: HART index bits too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of group index bits */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> > +                            &global->group_index_bits);
> > +       if (rc)
> > +               global->group_index_bits = 0;
> > +       tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > +             global->guest_index_bits - global->hart_index_bits;
> > +       if (tmp < global->group_index_bits) {
> > +               pr_err("%pfwP: group index bits too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /*
> > +        * Find first bit position of group index.
> > +        * If not specified assumed the default APLIC-IMSIC configuration.
> > +        */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> > +                            &global->group_index_shift);
> > +       if (rc)
> > +               global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> > +       tmp = global->group_index_bits + global->group_index_shift - 1;
> > +       if (tmp >= BITS_PER_LONG) {
> > +               pr_err("%pfwP: group index shift too big\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of interrupt identities */
> > +       rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> > +                            &global->nr_ids);
> > +       if (rc) {
> > +               pr_err("%pfwP: number of interrupt identities not found\n",
> > +                       fwnode);
> > +               return rc;
> > +       }
> > +       if ((global->nr_ids < IMSIC_MIN_ID) ||
> > +           (global->nr_ids >= IMSIC_MAX_ID) ||
> > +           ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > +               pr_err("%pfwP: invalid number of interrupt identities\n",
> > +                       fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of guest interrupt identities */
> > +       if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> > +                           &global->nr_guest_ids))
> > +               global->nr_guest_ids = global->nr_ids;
> > +       if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> > +           (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> > +           ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > +               pr_err("%pfwP: invalid number of guest interrupt identities\n",
> > +                       fwnode);
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Compute base address */
> > +       rc = fwops->mmio_to_resource(fwnode, fwopaque, 0, &res);
> > +       if (rc) {
> > +               pr_err("%pfwP: first MMIO resource not found\n", fwnode);
> > +               return -EINVAL;
> > +       }
> > +       global->base_addr = res.start;
> > +       global->base_addr &= ~(BIT(global->guest_index_bits +
> > +                                  global->hart_index_bits +
> > +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> > +       global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> > +                              global->group_index_shift);
> > +
> > +       /* Find number of MMIO register sets */
> > +       priv->num_mmios = fwops->nr_mmio(fwnode, fwopaque);
> > +
> > +       /* Allocate MMIO register sets */
> > +       priv->mmios = kcalloc(priv->num_mmios, sizeof(*mmio), GFP_KERNEL);
> > +       if (!priv->mmios) {
> > +               rc = -ENOMEM;
> > +               goto out_free_priv;
> > +       }
> > +
> > +       /* Parse and map MMIO register sets */
> > +       for (i = 0; i < priv->num_mmios; i++) {
> > +               mmio = &priv->mmios[i];
> > +               rc = fwops->mmio_to_resource(fwnode, fwopaque, i, &res);
> > +               if (rc) {
> > +                       pr_err("%pfwP: unable to parse MMIO regset %d\n",
> > +                               fwnode, i);
> > +                       goto out_iounmap;
> > +               }
> > +               mmio->pa = res.start;
> > +               mmio->size = res.end - res.start + 1;
> > +
> > +               base_addr = mmio->pa;
> > +               base_addr &= ~(BIT(global->guest_index_bits +
> > +                                  global->hart_index_bits +
> > +                                  IMSIC_MMIO_PAGE_SHIFT) - 1);
> > +               base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> > +                              global->group_index_shift);
> > +               if (base_addr != global->base_addr) {
> > +                       rc = -EINVAL;
> > +                       pr_err("%pfwP: address mismatch for regset %d\n",
> > +                               fwnode, i);
> > +                       goto out_iounmap;
> > +               }
> > +
> > +               mmio->va = fwops->mmio_map(fwnode, fwopaque, i);
> > +               if (!mmio->va) {
> > +                       rc = -EIO;
> > +                       pr_err("%pfwP: unable to map MMIO regset %d\n",
> > +                               fwnode, i);
> > +                       goto out_iounmap;
> > +               }
> > +       }
> > +
> > +       /* Initialize interrupt identity management */
> > +       rc = imsic_ids_init(priv);
> > +       if (rc) {
> > +               pr_err("%pfwP: failed to initialize interrupt management\n",
> > +                      fwnode);
> > +               goto out_iounmap;
> > +       }
> > +
> > +       /* Configure handlers for target CPUs */
> > +       for (i = 0; i < nr_parent_irqs; i++) {
> > +               unsigned long reloff, hartid;
> > +               int j, cpu;
> > +
> > +               rc = fwops->parent_hartid(fwnode, fwopaque, i, &hartid);
> > +               if (rc) {
> > +                       pr_warn("%pfwP: hart ID for parent irq%d not found\n",
> > +                               fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               cpu = riscv_hartid_to_cpuid(hartid);
> > +               if (cpu < 0) {
> > +                       pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
> > +                               fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               /* Find MMIO location of MSI page */
> > +               mmio = NULL;
> > +               reloff = i * BIT(global->guest_index_bits) *
> > +                        IMSIC_MMIO_PAGE_SZ;
> > +               for (j = 0; j < priv->num_mmios; j++) {
> > +                       if (reloff < priv->mmios[j].size) {
> > +                               mmio = &priv->mmios[j];
> > +                               break;
> > +                       }
> > +
> > +                       /*
> > +                        * MMIO region size may not be aligned to
> > +                        * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
> > +                        * if holes are present.
> > +                        */
> > +                       reloff -= ALIGN(priv->mmios[j].size,
> > +                                       BIT(global->guest_index_bits) *
> > +                                       IMSIC_MMIO_PAGE_SZ);
> > +               }
> > +               if (!mmio) {
> > +                       pr_warn("%pfwP: MMIO not found for parent irq%d\n",
> > +                               fwnode, i);
> > +                       continue;
> > +               }
> > +
> > +               handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +               if (handler->priv) {
> > +                       pr_warn("%pfwP: CPU%d handler already configured.\n",
> > +                               fwnode, cpu);
> > +                       goto done;
> > +               }
> > +
> > +               cpumask_set_cpu(cpu, &priv->lmask);
> > +               handler->local.msi_pa = mmio->pa + reloff;
> > +               handler->local.msi_va = mmio->va + reloff;
> > +               handler->priv = priv;
> > +
> > +done:
> > +               nr_handlers++;
> > +       }
> > +
> > +       /* If no CPU handlers found then can't take interrupts */
> > +       if (!nr_handlers) {
> > +               pr_err("%pfwP: No CPU handlers found\n", fwnode);
> > +               rc = -ENODEV;
> > +               goto out_ids_cleanup;
> > +       }
> > +
> > +       /* Find parent domain and register chained handler */
> > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > +                                         DOMAIN_BUS_ANY);
> > +       if (!domain) {
> > +               pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> > +               rc = -ENOENT;
> > +               goto out_ids_cleanup;
> > +       }
> > +       imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > +       if (!imsic_parent_irq) {
> > +               pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> > +               rc = -ENOENT;
> > +               goto out_ids_cleanup;
> > +       }
> > +       irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> > +
> > +       /* Initialize IPI domain */
> > +       rc = imsic_ipi_domain_init(priv);
> > +       if (rc) {
> > +               pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> > +               goto out_ids_cleanup;
> > +       }
> > +
> > +       /* Initialize IRQ and MSI domains */
> > +       rc = imsic_irq_domains_init(priv, fwnode);
> > +       if (rc) {
> > +               pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
> > +                      fwnode);
> > +               goto out_ipi_domain_cleanup;
> > +       }
> > +
> > +       /*
> > +        * Setup cpuhp state
> > +        *
> > +        * Don't disable per-CPU IMSIC file when CPU goes offline
> > +        * because this affects IPI and the masking/unmasking of
> > +        * virtual IPIs is done via generic IPI-Mux
> > +        */
> > +       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > +                         "irqchip/riscv/imsic:starting",
> > +                         imsic_starting_cpu, NULL);
> > +
> > +       /*
> > +        * Only one IMSIC instance allowed in a platform for clean
> > +        * implementation of SMP IRQ affinity and per-CPU IPIs.
> > +        *
> > +        * This means on a multi-socket (or multi-die) platform we
> > +        * will have multiple MMIO regions for one IMSIC instance.
> > +        */
> > +       imsic_init_done = true;
> > +
> > +       pr_info("%pfwP:  hart-index-bits: %d,  guest-index-bits: %d\n",
> > +               fwnode, global->hart_index_bits, global->guest_index_bits);
> > +       pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
> > +               fwnode, global->group_index_bits, global->group_index_shift);
> > +       pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
> > +               fwnode, global->nr_ids, nr_handlers, &global->base_addr);
> > +       if (priv->ipi_lsync_id)
> > +               pr_info("%pfwP: enable/disable sync using interrupt %d\n",
> > +                       fwnode, priv->ipi_lsync_id);
> > +       if (priv->ipi_id)
> > +               pr_info("%pfwP: providing IPIs using interrupt %d\n",
> > +                       fwnode, priv->ipi_id);
> > +
> > +       return 0;
> > +
> > +out_ipi_domain_cleanup:
> > +       imsic_ipi_domain_cleanup(priv);
> > +out_ids_cleanup:
> > +       imsic_ids_cleanup(priv);
> > +out_iounmap:
> > +       for (i = 0; i < priv->num_mmios; i++) {
> > +               if (priv->mmios[i].va)
> > +                       iounmap(priv->mmios[i].va);
> > +       }
> > +       kfree(priv->mmios);
> > +out_free_priv:
> > +       kfree(priv);
> > +       return rc;
> > +}
> > +
> > +static u32 __init imsic_dt_nr_parent_irq(struct fwnode_handle *fwnode,
> > +                                        void *fwopaque)
> > +{
> > +       return of_irq_count(to_of_node(fwnode));
> > +}
> > +
> > +static int __init imsic_dt_parent_hartid(struct fwnode_handle *fwnode,
> > +                                        void *fwopaque, u32 index,
> > +                                        unsigned long *out_hartid)
> > +{
> > +       struct of_phandle_args parent;
> > +       int rc;
> > +
> > +       rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
> > +       if (rc)
> > +               return rc;
> > +
> > +       /*
> > +        * Skip interrupts other than external interrupts for
> > +        * current privilege level.
> > +        */
> > +       if (parent.args[0] != RV_IRQ_EXT)
> > +               return -EINVAL;
> > +
> > +       return riscv_of_parent_hartid(parent.np, out_hartid);
> > +}
> > +
> > +static u32 __init imsic_dt_nr_mmio(struct fwnode_handle *fwnode,
> > +                                  void *fwopaque)
> > +{
> > +       u32 ret = 0;
> > +       struct resource res;
> > +
> > +       while (!of_address_to_resource(to_of_node(fwnode), ret, &res))
> > +               ret++;
> > +
> > +       return ret;
> > +}
> > +
> > +static int __init imsic_mmio_to_resource(struct fwnode_handle *fwnode,
> > +                                        void *fwopaque, u32 index,
> > +                                        struct resource *res)
> > +{
> > +       return of_address_to_resource(to_of_node(fwnode), index, res);
> > +}
> > +
> > +static void __iomem __init *imsic_dt_mmio_map(struct fwnode_handle *fwnode,
> > +                                             void *fwopaque, u32 index)
> > +{
> > +       return of_iomap(to_of_node(fwnode), index);
> > +}
> > +
> > +static int __init imsic_dt_read_u32(struct fwnode_handle *fwnode,
> > +                                   void *fwopaque, const char *prop,
> > +                                   u32 *out_val)
> > +{
> > +       return of_property_read_u32(to_of_node(fwnode), prop, out_val);
> > +}
> > +
> > +static bool __init imsic_dt_read_bool(struct fwnode_handle *fwnode,
> > +                                     void *fwopaque, const char *prop)
> > +{
> > +       return of_property_read_bool(to_of_node(fwnode), prop);
> > +}
> > +
> > +static int __init imsic_dt_init(struct device_node *node,
> > +                               struct device_node *parent)
> > +{
> > +       struct imsic_fwnode_ops ops = {
> > +               .nr_parent_irq = imsic_dt_nr_parent_irq,
> > +               .parent_hartid = imsic_dt_parent_hartid,
> > +               .nr_mmio = imsic_dt_nr_mmio,
> > +               .mmio_to_resource = imsic_mmio_to_resource,
> > +               .mmio_map = imsic_dt_mmio_map,
> > +               .read_u32 = imsic_dt_read_u32,
> > +               .read_bool = imsic_dt_read_bool,
> > +       };
> > +
> > +       return imsic_init(&ops, &node->fwnode, NULL);
> > +}
> > +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
> > diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
> > new file mode 100644
> > index 000000000000..5d1387adc0ba
> > --- /dev/null
> > +++ b/include/linux/irqchip/riscv-imsic.h
> > @@ -0,0 +1,92 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
> > +#define __LINUX_IRQCHIP_RISCV_IMSIC_H
> > +
> > +#include <linux/types.h>
> > +#include <asm/csr.h>
> > +
> > +#define IMSIC_MMIO_PAGE_SHIFT          12
> > +#define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
> > +#define IMSIC_MMIO_PAGE_LE             0x00
> > +#define IMSIC_MMIO_PAGE_BE             0x04
> > +
> > +#define IMSIC_MIN_ID                   63
> > +#define IMSIC_MAX_ID                   2048
> > +
> > +#define IMSIC_EIDELIVERY               0x70
> > +
> > +#define IMSIC_EITHRESHOLD              0x72
> > +
> > +#define IMSIC_EIP0                     0x80
> > +#define IMSIC_EIP63                    0xbf
> > +#define IMSIC_EIPx_BITS                        32
> > +
> > +#define IMSIC_EIE0                     0xc0
> > +#define IMSIC_EIE63                    0xff
> > +#define IMSIC_EIEx_BITS                        32
> > +
> > +#define IMSIC_FIRST                    IMSIC_EIDELIVERY
> > +#define IMSIC_LAST                     IMSIC_EIE63
> > +
> > +#define IMSIC_MMIO_SETIPNUM_LE         0x00
> > +#define IMSIC_MMIO_SETIPNUM_BE         0x04
> > +
> > +struct imsic_global_config {
> > +       /*
> > +        * MSI Target Address Scheme
> > +        *
> > +        * XLEN-1                                                12     0
> > +        * |                                                     |     |
> > +        * -------------------------------------------------------------
> > +        * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
> > +        * -------------------------------------------------------------
> > +        */
> > +
> > +       /* Bits representing Guest index, HART index, and Group index */
> > +       u32 guest_index_bits;
> > +       u32 hart_index_bits;
> > +       u32 group_index_bits;
> > +       u32 group_index_shift;
> > +
> > +       /* Global base address matching all target MSI addresses */
> > +       phys_addr_t base_addr;
> > +
> > +       /* Number of interrupt identities */
> > +       u32 nr_ids;
> > +
> > +       /* Number of guest interrupt identities */
> > +       u32 nr_guest_ids;
> > +};
> > +
> > +struct imsic_local_config {
> > +       phys_addr_t msi_pa;
> > +       void __iomem *msi_va;
> > +};
> > +
> > +#ifdef CONFIG_RISCV_IMSIC
> > +
> > +extern const struct imsic_global_config *imsic_get_global_config(void);
> > +
> > +extern const struct imsic_local_config *imsic_get_local_config(
> > +                                                       unsigned int cpu);
> > +
> > +#else
> > +
> > +static inline const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > +       return NULL;
> > +}
> > +
> > +static inline const struct imsic_local_config *imsic_get_local_config(
> > +                                                       unsigned int cpu)
> > +{
> > +       return NULL;
> > +}
> > +
> > +#endif
> > +
> > +#endif
> > --
> > 2.34.1
> >
> >

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
  2023-01-17  7:09         ` Vincent Chen
@ 2023-01-18  4:37           ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-18  4:37 UTC (permalink / raw)
  To: Vincent Chen
  Cc: Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel@vger.kernel.org List,
	devicetree

On Tue, Jan 17, 2023 at 12:39 PM Vincent Chen <vincent.chen@sifive.com> wrote:
>
> > From: Anup Patel <apatel@ventanamicro.com>
> > Date: Wed, Jan 4, 2023 at 1:19 AM
> > Subject: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
> > To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> > Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
> >
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new interrupt controller for managing wired interrupts on a RISC-V
> > platform. This new interrupt controller is referred to as advanced
> > platform-level interrupt controller (APLIC) which can forward wired
> > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > signaled interrupts.
> > (For more details refer https://github.com/riscv/riscv-aia)
> >
> I could not find an appropriate place to post my question, so I posted it here.
>
> I am a little concerned about the current MSI IRQ handling in APLIC.
> According to the specification, when domaincfg.DM = 1, the pending
> bit is set to one by a low-to-high transition in the rectified input
> value. When the APLIC forwards this interrupt as an MSI, the pending
> bit is cleared regardless of whether the interrupt is level-sensitive
> or edge-triggered. However, the interrupt service routine may not
> handle all outstanding requests in one pass. If requests remain
> pending after the ISR returns, they may never be serviced when the
> IRQ type of this device is level-sensitive. This is because the
> rectified value of this interrupt only transitioned from 0 to 1 at
> the beginning, so the pending bit will not be asserted again and the
> APLIC will not send the next MSI for this interrupt. Therefore, the
> ISR never gets a chance to handle the remaining requests.
>
> One possible solution is to let the APLIC driver check whether the
> rectified input of the serviced interrupt is still one after the ISR
> returns. If it is, the device still has pending requests, so the
> APLIC driver can set its pending bit again through the setipnum or
> setip register. The APLIC will then send the next MSI for this
> device, and the ISR will get a chance to handle the remaining
> requests.

This situation and the possible solutions are already described in
section "4.9.2 Special consideration for level-sensitive interrupt
sources" of the AIA specification.

It was an oversight on my part to have missed adding the relevant
code here. Thanks for pointing it out. I will fix this in the v3 patch.

Regards,
Anup
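
For illustration, a minimal sketch of the retrigger idea described
above (assumed, not the actual v3 fix): an irq_eoi hook for MSI mode
that writes the interrupt number back to the setipnum register for
level-triggered sources. Per AIA section 4.9.2, such a write sets the
pending bit only if the rectified input is still asserted, so the
APLIC sends another MSI only when the device still has work pending.
The APLIC_SETIPNUM offset name is assumed here from the riscv-aplic.h
header of this series, and aplic_chip would additionally need a
fasteoi-style flow handler so that .irq_eoi actually gets invoked.

/*
 * Illustrative sketch only -- not the posted patch.
 */
static void aplic_msi_irq_eoi(struct irq_data *d)
{
	struct aplic_priv *priv = irq_data_get_irq_chip_data(d);

	/*
	 * For level-triggered sources in MSI delivery mode, re-set the
	 * pending bit after the handler has run. The write is a no-op
	 * once the device has deasserted its interrupt line.
	 */
	switch (irqd_get_trigger_type(d)) {
	case IRQ_TYPE_LEVEL_LOW:
	case IRQ_TYPE_LEVEL_HIGH:
		writel(d->hwirq, priv->regs + APLIC_SETIPNUM);
		break;
	}
}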

>
>
>
> > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  drivers/irqchip/Kconfig             |   6 +
> >  drivers/irqchip/Makefile            |   1 +
> >  drivers/irqchip/irq-riscv-aplic.c   | 670 ++++++++++++++++++++++++++++
> >  include/linux/irqchip/riscv-aplic.h | 117 +++++
> >  4 files changed, 794 insertions(+)
> >  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> >  create mode 100644 include/linux/irqchip/riscv-aplic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index a1315189a595..936e59fe1f99 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -547,6 +547,12 @@ config SIFIVE_PLIC
> >         select IRQ_DOMAIN_HIERARCHY
> >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_APLIC
> > +       bool
> > +       depends on RISCV
> > +       select IRQ_DOMAIN_HIERARCHY
> > +       select GENERIC_MSI_IRQ_DOMAIN
> > +
> >  config RISCV_IMSIC
> >         bool
> >         depends on RISCV
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 22c723cc6ec8..6154e5bc4228 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
> >  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > new file mode 100644
> > index 000000000000..63f20892d7d3
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > @@ -0,0 +1,670 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#include <linux/bitops.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-aplic.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/smp.h>
> > +
> > +#define APLIC_DEFAULT_PRIORITY         1
> > +#define APLIC_DISABLE_IDELIVERY                0
> > +#define APLIC_ENABLE_IDELIVERY         1
> > +#define APLIC_DISABLE_ITHRESHOLD       1
> > +#define APLIC_ENABLE_ITHRESHOLD                0
> > +
> > +struct aplic_msicfg {
> > +       phys_addr_t             base_ppn;
> > +       u32                     hhxs;
> > +       u32                     hhxw;
> > +       u32                     lhxs;
> > +       u32                     lhxw;
> > +};
> > +
> > +struct aplic_idc {
> > +       unsigned int            hart_index;
> > +       void __iomem            *regs;
> > +       struct aplic_priv       *priv;
> > +};
> > +
> > +struct aplic_priv {
> > +       struct device           *dev;
> > +       u32                     nr_irqs;
> > +       u32                     nr_idcs;
> > +       void __iomem            *regs;
> > +       struct irq_domain       *irqdomain;
> > +       struct aplic_msicfg     msicfg;
> > +       struct cpumask          lmask;
> > +};
> > +
> > +static unsigned int aplic_idc_parent_irq;
> > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > +
> > +static void aplic_irq_unmask(struct irq_data *d)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > +
> > +       if (!priv->nr_idcs)
> > +               irq_chip_unmask_parent(d);
> > +}
> > +
> > +static void aplic_irq_mask(struct irq_data *d)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > +
> > +       if (!priv->nr_idcs)
> > +               irq_chip_mask_parent(d);
> > +}
> > +
> > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > +{
> > +       u32 val = 0;
> > +       void __iomem *sourcecfg;
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       switch (type) {
> > +       case IRQ_TYPE_NONE:
> > +               val = APLIC_SOURCECFG_SM_INACTIVE;
> > +               break;
> > +       case IRQ_TYPE_LEVEL_LOW:
> > +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > +               break;
> > +       case IRQ_TYPE_LEVEL_HIGH:
> > +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > +               break;
> > +       case IRQ_TYPE_EDGE_FALLING:
> > +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > +               break;
> > +       case IRQ_TYPE_EDGE_RISING:
> > +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > +               break;
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +
> > +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > +       writel(val, sourcecfg);
> > +
> > +       return 0;
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static int aplic_set_affinity(struct irq_data *d,
> > +                             const struct cpumask *mask_val, bool force)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       struct aplic_idc *idc;
> > +       unsigned int cpu, val;
> > +       struct cpumask amask;
> > +       void __iomem *target;
> > +
> > +       if (!priv->nr_idcs)
> > +               return irq_chip_set_affinity_parent(d, mask_val, force);
> > +
> > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > +       if (force)
> > +               cpu = cpumask_first(&amask);
> > +       else
> > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > +       if (cpu >= nr_cpu_ids)
> > +               return -EINVAL;
> > +
> > +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> > +       target = priv->regs + APLIC_TARGET_BASE;
> > +       target += (d->hwirq - 1) * sizeof(u32);
> > +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > +       val |= APLIC_DEFAULT_PRIORITY;
> > +       writel(val, target);
> > +
> > +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > +
> > +       return IRQ_SET_MASK_OK_DONE;
> > +}
> > +#endif
> > +
> > +static struct irq_chip aplic_chip = {
> > +       .name           = "RISC-V APLIC",
> > +       .irq_mask       = aplic_irq_mask,
> > +       .irq_unmask     = aplic_irq_unmask,
> > +       .irq_set_type   = aplic_set_type,
> > +#ifdef CONFIG_SMP
> > +       .irq_set_affinity = aplic_set_affinity,
> > +#endif
> > +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> > +                         IRQCHIP_SKIP_SET_WAKE |
> > +                         IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int aplic_irqdomain_translate(struct irq_domain *d,
> > +                                    struct irq_fwspec *fwspec,
> > +                                    unsigned long *hwirq,
> > +                                    unsigned int *type)
> > +{
> > +       if (WARN_ON(fwspec->param_count < 2))
> > +               return -EINVAL;
> > +       if (WARN_ON(!fwspec->param[0]))
> > +               return -EINVAL;
> > +
> > +       *hwirq = fwspec->param[0];
> > +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > +
> > +       WARN_ON(*type == IRQ_TYPE_NONE);
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > +                                    unsigned int virq, unsigned int nr_irqs,
> > +                                    void *arg)
> > +{
> > +       int i, ret;
> > +       unsigned int type;
> > +       irq_hw_number_t hwirq;
> > +       struct irq_fwspec *fwspec = arg;
> > +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > +
> > +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > +       if (ret)
> > +               return ret;
> > +
> > +       for (i = 0; i < nr_irqs; i++)
> > +               irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
> > +                                             &aplic_chip, priv);
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > +       .translate      = aplic_irqdomain_translate,
> > +       .alloc          = aplic_irqdomain_msi_alloc,
> > +       .free           = platform_msi_device_domain_free,
> > +};
> > +
> > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > +                                    unsigned int virq, unsigned int nr_irqs,
> > +                                    void *arg)
> > +{
> > +       int i, ret;
> > +       unsigned int type;
> > +       irq_hw_number_t hwirq;
> > +       struct irq_fwspec *fwspec = arg;
> > +       struct aplic_priv *priv = domain->host_data;
> > +
> > +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> > +       if (ret)
> > +               return ret;
> > +
> > +       for (i = 0; i < nr_irqs; i++) {
> > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > +                                   &aplic_chip, priv, handle_simple_irq,
> > +                                   NULL, NULL);
> > +               irq_set_affinity(virq + i, &priv->lmask);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > +       .translate      = aplic_irqdomain_translate,
> > +       .alloc          = aplic_irqdomain_idc_alloc,
> > +       .free           = irq_domain_free_irqs_top,
> > +};
> > +
> > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > +{
> > +       int i;
> > +
> > +       /* Disable all interrupts */
> > +       for (i = 0; i <= priv->nr_irqs; i += 32)
> > +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > +                           (i / 32) * sizeof(u32));
> > +
> > +       /* Set interrupt type and default priority for all interrupts */
> > +       for (i = 1; i <= priv->nr_irqs; i++) {
> > +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > +                         (i - 1) * sizeof(u32));
> > +               writel(APLIC_DEFAULT_PRIORITY,
> > +                      priv->regs + APLIC_TARGET_BASE +
> > +                      (i - 1) * sizeof(u32));
> > +       }
> > +
> > +       /* Clear APLIC domaincfg */
> > +       writel(0, priv->regs + APLIC_DOMAINCFG);
> > +}
> > +
> > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > +{
> > +       u32 val;
> > +#ifdef CONFIG_RISCV_M_MODE
> > +       u32 valH;
> > +
> > +       if (!priv->nr_idcs) {
> > +               val = priv->msicfg.base_ppn;
> > +               valH = (priv->msicfg.base_ppn >> 32) &
> > +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> > +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> > +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > +       }
> > +#endif
> > +
> > +       /* Setup APLIC domaincfg register */
> > +       val = readl(priv->regs + APLIC_DOMAINCFG);
> > +       val |= APLIC_DOMAINCFG_IE;
> > +       if (!priv->nr_idcs)
> > +               val |= APLIC_DOMAINCFG_DM;
> > +       writel(val, priv->regs + APLIC_DOMAINCFG);
> > +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > +               dev_warn(priv->dev,
> > +                        "unable to write 0x%x in domaincfg\n", val);
> > +}
> > +
> > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > +{
> > +       unsigned int group_index, hart_index, guest_index, val;
> > +       struct device *dev = msi_desc_to_dev(desc);
> > +       struct aplic_priv *priv = dev_get_drvdata(dev);
> > +       struct irq_data *d = irq_get_irq_data(desc->irq);
> > +       struct aplic_msicfg *mc = &priv->msicfg;
> > +       phys_addr_t tppn, tbppn, msg_addr;
> > +       void __iomem *target;
> > +
> > +       /* For zeroed MSI, simply write zero into the target register */
> > +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > +               target = priv->regs + APLIC_TARGET_BASE;
> > +               target += (d->hwirq - 1) * sizeof(u32);
> > +               writel(0, target);
> > +               return;
> > +       }
> > +
> > +       /* Sanity check on message data */
> > +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > +
> > +       /* Compute target MSI address */
> > +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > +
> > +       /* Compute target HART Base PPN */
> > +       tbppn = tppn;
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > +       WARN_ON(tbppn != mc->base_ppn);
> > +
> > +       /* Compute target group and hart indexes */
> > +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > +       hart_index |= (group_index << mc->lhxw);
> > +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > +
> > +       /* Compute target guest index */
> > +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > +
> > +       /* Update IRQ TARGET register */
> > +       target = priv->regs + APLIC_TARGET_BASE;
> > +       target += (d->hwirq - 1) * sizeof(u32);
> > +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > +                               << APLIC_TARGET_HART_IDX_SHIFT;
> > +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> > +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > +       writel(val, target);
> > +}
> > +
> > +static int aplic_setup_msi(struct aplic_priv *priv)
> > +{
> > +       struct device *dev = priv->dev;
> > +       struct aplic_msicfg *mc = &priv->msicfg;
> > +       const struct imsic_global_config *imsic_global;
> > +
> > +       /*
> > +        * The APLIC outgoing MSI config registers assume target MSI
> > +        * controller to be RISC-V AIA IMSIC controller.
> > +        */
> > +       imsic_global = imsic_get_global_config();
> > +       if (!imsic_global) {
> > +               dev_err(dev, "IMSIC global config not found\n");
> > +               return -ENODEV;
> > +       }
> > +
> > +       /* Find number of guest index bits (LHXS) */
> > +       mc->lhxs = imsic_global->guest_index_bits;
> > +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > +               dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of HART index bits (LHXW) */
> > +       mc->lhxw = imsic_global->hart_index_bits;
> > +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > +               dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of group index bits (HHXW) */
> > +       mc->hhxw = imsic_global->group_index_bits;
> > +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > +               dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find first bit position of group index (HHXS) */
> > +       mc->hhxs = imsic_global->group_index_shift;
> > +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > +               dev_err(dev, "IMSIC group index shift should be >= %d\n",
> > +                       (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > +               return -EINVAL;
> > +       }
> > +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > +               dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Compute PPN base */
> > +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > +
> > +       /* Use all possible CPUs as lmask */
> > +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * To handle an APLIC IDC interrupts, we just read the CLAIMI register
> > + * which will return highest priority pending interrupt and clear the
> > + * pending bit of the interrupt. This process is repeated until CLAIMI
> > + * register return zero value.
> > + */
> > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > +{
> > +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > +       irq_hw_number_t hw_irq;
> > +       int irq;
> > +
> > +       chained_irq_enter(chip, desc);
> > +
> > +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > +
> > +               if (unlikely(irq <= 0))
> > +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > +                                           hw_irq);
> > +               else
> > +                       generic_handle_irq(irq);
> > +       }
> > +
> > +       chained_irq_exit(chip, desc);
> > +}
> > +
> > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > +{
> > +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > +
> > +       /* Priority must be less than threshold for interrupt triggering */
> > +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > +
> > +       /* Delivery must be set to 1 for interrupt triggering */
> > +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > +}
> > +
> > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > +{
> > +       if (aplic_idc_parent_irq)
> > +               disable_percpu_irq(aplic_idc_parent_irq);
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > +{
> > +       if (aplic_idc_parent_irq)
> > +               enable_percpu_irq(aplic_idc_parent_irq,
> > +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_setup_idc(struct aplic_priv *priv)
> > +{
> > +       int i, j, rc, cpu, setup_count = 0;
> > +       struct device_node *node = priv->dev->of_node;
> > +       struct device *dev = priv->dev;
> > +       struct of_phandle_args parent;
> > +       struct irq_domain *domain;
> > +       unsigned long hartid;
> > +       struct aplic_idc *idc;
> > +       u32 val;
> > +
> > +       /* Setup per-CPU IDC and target CPU mask */
> > +       for (i = 0; i < priv->nr_idcs; i++) {
> > +               if (of_irq_parse_one(node, i, &parent)) {
> > +                       dev_err(dev, "failed to parse parent for IDC%d.\n",
> > +                               i);
> > +                       return -EIO;
> > +               }
> > +
> > +               /* Skip IDCs which do not connect to external interrupts */
> > +               if (parent.args[0] != RV_IRQ_EXT)
> > +                       continue;
> > +
> > +               rc = riscv_of_parent_hartid(parent.np, &hartid);
> > +               if (rc) {
> > +                       dev_err(dev, "failed to parse hart ID for IDC%d.\n",
> > +                               i);
> > +                       return rc;
> > +               }
> > +
> > +               cpu = riscv_hartid_to_cpuid(hartid);
> > +               if (cpu < 0) {
> > +                       dev_warn(dev, "invalid cpuid for IDC%d\n", i);
> > +                       continue;
> > +               }
> > +
> > +               cpumask_set_cpu(cpu, &priv->lmask);
> > +
> > +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> > +               WARN_ON(idc->priv);
> > +
> > +               idc->hart_index = i;
> > +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > +               idc->priv = priv;
> > +
> > +               aplic_idc_set_delivery(idc, true);
> > +
> > +               /*
> > +                * The boot CPU might not have APLIC hart_index = 0, so
> > +                * check and update the target registers of all interrupts.
> > +                */
> > +               if (cpu == smp_processor_id() && idc->hart_index) {
> > +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > +                       val |= APLIC_DEFAULT_PRIORITY;
> > +                       for (j = 1; j <= priv->nr_irqs; j++)
> > +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> > +                                           (j - 1) * sizeof(u32));
> > +               }
> > +
> > +               setup_count++;
> > +       }
> > +
> > +       /* Find parent domain and register chained handler */
> > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > +                                         DOMAIN_BUS_ANY);
> > +       if (!aplic_idc_parent_irq && domain) {
> > +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > +               if (aplic_idc_parent_irq) {
> > +                       irq_set_chained_handler(aplic_idc_parent_irq,
> > +                                               aplic_idc_handle_irq);
> > +
> > +                       /*
> > +                        * Setup CPUHP notifier to enable IDC parent
> > +                        * interrupt on all CPUs
> > +                        */
> > +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > +                                         "irqchip/riscv/aplic:starting",
> > +                                         aplic_idc_starting_cpu,
> > +                                         aplic_idc_dying_cpu);
> > +               }
> > +       }
> > +
> > +       /* Fail if we were not able to setup IDC for any CPU */
> > +       return (setup_count) ? 0 : -ENODEV;
> > +}
> > +
> > +static int aplic_probe(struct platform_device *pdev)
> > +{
> > +       struct device_node *node = pdev->dev.of_node;
> > +       struct device *dev = &pdev->dev;
> > +       struct aplic_priv *priv;
> > +       struct resource *regs;
> > +       phys_addr_t pa;
> > +       int rc;
> > +
> > +       regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +       if (!regs) {
> > +               dev_err(dev, "cannot find registers resource\n");
> > +               return -ENOENT;
> > +       }
> > +
> > +       priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +       platform_set_drvdata(pdev, priv);
> > +       priv->dev = dev;
> > +
> > +       priv->regs = devm_ioremap(dev, regs->start, resource_size(regs));
> > +       if (WARN_ON(!priv->regs)) {
> > +               dev_err(dev, "failed ioremap registers\n");
> > +               return -EIO;
> > +       }
> > +
> > +       of_property_read_u32(node, "riscv,num-sources", &priv->nr_irqs);
> > +       if (!priv->nr_irqs) {
> > +               dev_err(dev, "failed to get number of interrupt sources\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Setup initial state of APLIC interrupts */
> > +       aplic_init_hw_irqs(priv);
> > +
> > +       /*
> > +        * Setup IDCs or MSIs based on parent interrupts in DT node
> > +        *
> > +        * If "msi-parent" DT property is present then we ignore the
> > +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > +        */
> > +       priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
> > +                       0 : of_irq_count(node);
> > +       if (priv->nr_idcs)
> > +               rc = aplic_setup_idc(priv);
> > +       else
> > +               rc = aplic_setup_msi(priv);
> > +       if (rc)
> > +               return rc;
> > +
> > +       /* Setup global config and interrupt delivery */
> > +       aplic_init_hw_global(priv);
> > +
> > +       /* Create irq domain instance for the APLIC */
> > +       if (priv->nr_idcs)
> > +               priv->irqdomain = irq_domain_create_linear(
> > +                                               of_node_to_fwnode(node),
> > +                                               priv->nr_irqs + 1,
> > +                                               &aplic_irqdomain_idc_ops,
> > +                                               priv);
> > +       else
> > +               priv->irqdomain = platform_msi_create_device_domain(dev,
> > +                                               priv->nr_irqs + 1,
> > +                                               aplic_msi_write_msg,
> > +                                               &aplic_irqdomain_msi_ops,
> > +                                               priv);
> > +       if (!priv->irqdomain) {
> > +               dev_err(dev, "failed to add irq domain\n");
> > +               return -ENOMEM;
> > +       }
> > +
> > +       /* Advertise the interrupt controller */
> > +       if (priv->nr_idcs) {
> > +               dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
> > +                        priv->nr_irqs, priv->nr_idcs);
> > +       } else {
> > +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > +               dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
> > +                        priv->nr_irqs, &pa);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_remove(struct platform_device *pdev)
> > +{
> > +       struct aplic_priv *priv = platform_get_drvdata(pdev);
> > +
> > +       irq_domain_remove(priv->irqdomain);
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct of_device_id aplic_match[] = {
> > +       { .compatible = "riscv,aplic" },
> > +       {}
> > +};
> > +
> > +static struct platform_driver aplic_driver = {
> > +       .driver = {
> > +               .name           = "riscv-aplic",
> > +               .of_match_table = aplic_match,
> > +       },
> > +       .probe = aplic_probe,
> > +       .remove = aplic_remove,
> > +};
> > +
> > +static int __init aplic_init(void)
> > +{
> > +       return platform_driver_register(&aplic_driver);
> > +}
> > +core_initcall(aplic_init);
> > diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> > new file mode 100644
> > index 000000000000..88177eefd411
> > --- /dev/null
> > +++ b/include/linux/irqchip/riscv-aplic.h
> > @@ -0,0 +1,117 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> > +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> > +
> > +#include <linux/bitops.h>
> > +
> > +#define APLIC_MAX_IDC                  BIT(14)
> > +#define APLIC_MAX_SOURCE               1024
> > +
> > +#define APLIC_DOMAINCFG                        0x0000
> > +#define APLIC_DOMAINCFG_RDONLY         0x80000000
> > +#define APLIC_DOMAINCFG_IE             BIT(8)
> > +#define APLIC_DOMAINCFG_DM             BIT(2)
> > +#define APLIC_DOMAINCFG_BE             BIT(0)
> > +
> > +#define APLIC_SOURCECFG_BASE           0x0004
> > +#define APLIC_SOURCECFG_D              BIT(10)
> > +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x000003ff
> > +#define APLIC_SOURCECFG_SM_MASK        0x00000007
> > +#define APLIC_SOURCECFG_SM_INACTIVE    0x0
> > +#define APLIC_SOURCECFG_SM_DETACH      0x1
> > +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
> > +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
> > +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
> > +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
> > +
> > +#define APLIC_MMSICFGADDR              0x1bc0
> > +#define APLIC_MMSICFGADDRH             0x1bc4
> > +#define APLIC_SMSICFGADDR              0x1bc8
> > +#define APLIC_SMSICFGADDRH             0x1bcc
> > +
> > +#ifdef CONFIG_RISCV_M_MODE
> > +#define APLIC_xMSICFGADDR              APLIC_MMSICFGADDR
> > +#define APLIC_xMSICFGADDRH             APLIC_MMSICFGADDRH
> > +#else
> > +#define APLIC_xMSICFGADDR              APLIC_SMSICFGADDR
> > +#define APLIC_xMSICFGADDRH             APLIC_SMSICFGADDRH
> > +#endif
> > +
> > +#define APLIC_xMSICFGADDRH_L           BIT(31)
> > +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
> > +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
> > +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
> > +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
> > +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
> > +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
> > +#define APLIC_xMSICFGADDRH_LHXW_MASK   0xf
> > +#define APLIC_xMSICFGADDRH_LHXW_SHIFT  12
> > +#define APLIC_xMSICFGADDRH_BAPPN_MASK  0xfff
> > +
> > +#define APLIC_xMSICFGADDR_PPN_SHIFT    12
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> > +       (BIT(__lhxs) - 1)
> > +
> > +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> > +       (BIT(__lhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> > +       ((__lhxs))
> > +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> > +       (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> > +        APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> > +       (BIT(__hhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> > +       ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> > +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> > +       (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> > +        APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> > +
> > +#define APLIC_SETIP_BASE               0x1c00
> > +#define APLIC_SETIPNUM                 0x1cdc
> > +
> > +#define APLIC_CLRIP_BASE               0x1d00
> > +#define APLIC_CLRIPNUM                 0x1ddc
> > +
> > +#define APLIC_SETIE_BASE               0x1e00
> > +#define APLIC_SETIENUM                 0x1edc
> > +
> > +#define APLIC_CLRIE_BASE               0x1f00
> > +#define APLIC_CLRIENUM                 0x1fdc
> > +
> > +#define APLIC_SETIPNUM_LE              0x2000
> > +#define APLIC_SETIPNUM_BE              0x2004
> > +
> > +#define APLIC_GENMSI                   0x3000
> > +
> > +#define APLIC_TARGET_BASE              0x3004
> > +#define APLIC_TARGET_HART_IDX_SHIFT    18
> > +#define APLIC_TARGET_HART_IDX_MASK     0x3fff
> > +#define APLIC_TARGET_GUEST_IDX_SHIFT   12
> > +#define APLIC_TARGET_GUEST_IDX_MASK    0x3f
> > +#define APLIC_TARGET_IPRIO_MASK        0xff
> > +#define APLIC_TARGET_EIID_MASK 0x7ff
> > +
> > +#define APLIC_IDC_BASE                 0x4000
> > +#define APLIC_IDC_SIZE                 32
> > +
> > +#define APLIC_IDC_IDELIVERY            0x00
> > +
> > +#define APLIC_IDC_IFORCE               0x04
> > +
> > +#define APLIC_IDC_ITHRESHOLD           0x08
> > +
> > +#define APLIC_IDC_TOPI                 0x18
> > +#define APLIC_IDC_TOPI_ID_SHIFT        16
> > +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> > +#define APLIC_IDC_TOPI_PRIO_MASK       0xff
> > +
> > +#define APLIC_IDC_CLAIMI               0x1c
> > +
> > +#endif
> > --
> > 2.34.1
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
@ 2023-01-18  4:37           ` Anup Patel
  0 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-18  4:37 UTC (permalink / raw)
  To: Vincent Chen
  Cc: Palmer Dabbelt, Paul Walmsley, tglx, maz, robh+dt,
	krzysztof.kozlowski+dt, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel@vger.kernel.org List,
	devicetree

On Tue, Jan 17, 2023 at 12:39 PM Vincent Chen <vincent.chen@sifive.com> wrote:
>
> > From: Anup Patel <apatel@ventanamicro.com>
> > Date: Wed, Jan 4, 2023 at 1:19 AM
> > Subject: [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver
> > To: Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh+dt@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
> > Cc: Atish Patra <atishp@atishpatra.org>, Alistair Francis <Alistair.Francis@wdc.com>, Anup Patel <anup@brainfault.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <devicetree@vger.kernel.org>, Anup Patel <apatel@ventanamicro.com>
> >
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new interrupt controller for managing wired interrupts on a RISC-V
> > platform. This new interrupt controller is referred to as advanced
> > platform-level interrupt controller (APLIC) which can forward wired
> > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > signaled interrupts.
> > (For more details refer https://github.com/riscv/riscv-aia)
> >
> I could not find an appropriate place to post my question, so I posted it here.
>
> I am a little concerned about the current MSI IRQ handling in APLIC.
> According to the specification, when domaincfg.DM = 1, the pending bit
> is set to one by a low-to-high transition in the rectified input
> value. When the APLIC forwards this interrupt as an MSI, the pending
> bit is cleared regardless of whether the interrupt type is
> level-sensitive or edge-triggered. However, the interrupt service
> routine may not deal with all outstanding requests in one pass. When
> requests remain pending after leaving the ISR, they may never be
> serviced if the IRQ type of this device is level-sensitive. This is
> because the rectified value of this interrupt changes from 0 to 1 only
> at the beginning, so the pending bit of this interrupt will not be
> asserted again and the APLIC will not send the next MSI for this
> interrupt. Therefore, the ISR does not get a chance to deal with the
> remaining requests.
>
> One possible way to fix this issue is to let the APLIC driver check
> whether the rectified value of the serviced interrupt is one after
> returning from its ISR. If the value is 1, the device still has
> pending requests, and the APLIC driver can set the pending bit again
> via the setipnum or setip register. This lets the APLIC send the next
> MSI for this device, so the ISR gets a chance to deal with the
> remaining requests.

This situation and the possible solutions are already described in
section "4.9.2 Special consideration for level-sensitive interrupt
sources" of the AIA specification.

It is an oversight on my part to have missed adding the relevant code
here. Thanks for pointing it out. I will fix this in the v3 patch.
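
For reference, a minimal sketch of the kind of check being discussed
(hypothetical helper name, assuming it runs for level-sensitive sources
in MSI mode after the handler completes) would be something like:

static void aplic_msi_resend_if_asserted(struct aplic_priv *priv, u32 hwirq)
{
        u32 off = APLIC_CLRIP_BASE + (hwirq / 32) * sizeof(u32);

        /* A read of the clrip array returns the rectified input values */
        if (readl(priv->regs + off) & BIT(hwirq % 32))
                /* Source still asserted, so re-set pending to get a new MSI */
                writel(hwirq, priv->regs + APLIC_SETIPNUM);
}

The exact hook point (for example, an irq_eoi callback) is something
I will sort out as part of v3.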

Regards,
Anup

>
>
>
> > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  drivers/irqchip/Kconfig             |   6 +
> >  drivers/irqchip/Makefile            |   1 +
> >  drivers/irqchip/irq-riscv-aplic.c   | 670 ++++++++++++++++++++++++++++
> >  include/linux/irqchip/riscv-aplic.h | 117 +++++
> >  4 files changed, 794 insertions(+)
> >  create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> >  create mode 100644 include/linux/irqchip/riscv-aplic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index a1315189a595..936e59fe1f99 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -547,6 +547,12 @@ config SIFIVE_PLIC
> >         select IRQ_DOMAIN_HIERARCHY
> >         select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_APLIC
> > +       bool
> > +       depends on RISCV
> > +       select IRQ_DOMAIN_HIERARCHY
> > +       select GENERIC_MSI_IRQ_DOMAIN
> > +
> >  config RISCV_IMSIC
> >         bool
> >         depends on RISCV
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 22c723cc6ec8..6154e5bc4228 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                        += irq-qcom-mpm.o
> >  obj-$(CONFIG_CSKY_MPINTC)              += irq-csky-mpintc.o
> >  obj-$(CONFIG_CSKY_APB_INTC)            += irq-csky-apb-intc.o
> >  obj-$(CONFIG_RISCV_INTC)               += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_APLIC)              += irq-riscv-aplic.o
> >  obj-$(CONFIG_RISCV_IMSIC)              += irq-riscv-imsic.o
> >  obj-$(CONFIG_SIFIVE_PLIC)              += irq-sifive-plic.o
> >  obj-$(CONFIG_IMX_IRQSTEER)             += irq-imx-irqsteer.o
> > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > new file mode 100644
> > index 000000000000..63f20892d7d3
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > @@ -0,0 +1,670 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#include <linux/bitops.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-aplic.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/smp.h>
> > +
> > +#define APLIC_DEFAULT_PRIORITY         1
> > +#define APLIC_DISABLE_IDELIVERY                0
> > +#define APLIC_ENABLE_IDELIVERY         1
> > +#define APLIC_DISABLE_ITHRESHOLD       1
> > +#define APLIC_ENABLE_ITHRESHOLD                0
> > +
> > +struct aplic_msicfg {
> > +       phys_addr_t             base_ppn;
> > +       u32                     hhxs;
> > +       u32                     hhxw;
> > +       u32                     lhxs;
> > +       u32                     lhxw;
> > +};
> > +
> > +struct aplic_idc {
> > +       unsigned int            hart_index;
> > +       void __iomem            *regs;
> > +       struct aplic_priv       *priv;
> > +};
> > +
> > +struct aplic_priv {
> > +       struct device           *dev;
> > +       u32                     nr_irqs;
> > +       u32                     nr_idcs;
> > +       void __iomem            *regs;
> > +       struct irq_domain       *irqdomain;
> > +       struct aplic_msicfg     msicfg;
> > +       struct cpumask          lmask;
> > +};
> > +
> > +static unsigned int aplic_idc_parent_irq;
> > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > +
> > +static void aplic_irq_unmask(struct irq_data *d)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > +
> > +       if (!priv->nr_idcs)
> > +               irq_chip_unmask_parent(d);
> > +}
> > +
> > +static void aplic_irq_mask(struct irq_data *d)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > +
> > +       if (!priv->nr_idcs)
> > +               irq_chip_mask_parent(d);
> > +}
> > +
> > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > +{
> > +       u32 val = 0;
> > +       void __iomem *sourcecfg;
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +       switch (type) {
> > +       case IRQ_TYPE_NONE:
> > +               val = APLIC_SOURCECFG_SM_INACTIVE;
> > +               break;
> > +       case IRQ_TYPE_LEVEL_LOW:
> > +               val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > +               break;
> > +       case IRQ_TYPE_LEVEL_HIGH:
> > +               val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > +               break;
> > +       case IRQ_TYPE_EDGE_FALLING:
> > +               val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > +               break;
> > +       case IRQ_TYPE_EDGE_RISING:
> > +               val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > +               break;
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +
> > +       sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > +       sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > +       writel(val, sourcecfg);
> > +
> > +       return 0;
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static int aplic_set_affinity(struct irq_data *d,
> > +                             const struct cpumask *mask_val, bool force)
> > +{
> > +       struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +       struct aplic_idc *idc;
> > +       unsigned int cpu, val;
> > +       struct cpumask amask;
> > +       void __iomem *target;
> > +
> > +       if (!priv->nr_idcs)
> > +               return irq_chip_set_affinity_parent(d, mask_val, force);
> > +
> > +       cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > +       if (force)
> > +               cpu = cpumask_first(&amask);
> > +       else
> > +               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > +       if (cpu >= nr_cpu_ids)
> > +               return -EINVAL;
> > +
> > +       idc = per_cpu_ptr(&aplic_idcs, cpu);
> > +       target = priv->regs + APLIC_TARGET_BASE;
> > +       target += (d->hwirq - 1) * sizeof(u32);
> > +       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > +       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > +       val |= APLIC_DEFAULT_PRIORITY;
> > +       writel(val, target);
> > +
> > +       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > +
> > +       return IRQ_SET_MASK_OK_DONE;
> > +}
> > +#endif
> > +
> > +static struct irq_chip aplic_chip = {
> > +       .name           = "RISC-V APLIC",
> > +       .irq_mask       = aplic_irq_mask,
> > +       .irq_unmask     = aplic_irq_unmask,
> > +       .irq_set_type   = aplic_set_type,
> > +#ifdef CONFIG_SMP
> > +       .irq_set_affinity = aplic_set_affinity,
> > +#endif
> > +       .flags          = IRQCHIP_SET_TYPE_MASKED |
> > +                         IRQCHIP_SKIP_SET_WAKE |
> > +                         IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int aplic_irqdomain_translate(struct irq_domain *d,
> > +                                    struct irq_fwspec *fwspec,
> > +                                    unsigned long *hwirq,
> > +                                    unsigned int *type)
> > +{
> > +       if (WARN_ON(fwspec->param_count < 2))
> > +               return -EINVAL;
> > +       if (WARN_ON(!fwspec->param[0]))
> > +               return -EINVAL;
> > +
> > +       *hwirq = fwspec->param[0];
> > +       *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > +
> > +       WARN_ON(*type == IRQ_TYPE_NONE);
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > +                                    unsigned int virq, unsigned int nr_irqs,
> > +                                    void *arg)
> > +{
> > +       int i, ret;
> > +       unsigned int type;
> > +       irq_hw_number_t hwirq;
> > +       struct irq_fwspec *fwspec = arg;
> > +       struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > +
> > +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > +       if (ret)
> > +               return ret;
> > +
> > +       for (i = 0; i < nr_irqs; i++)
> > +               irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
> > +                                             &aplic_chip, priv);
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > +       .translate      = aplic_irqdomain_translate,
> > +       .alloc          = aplic_irqdomain_msi_alloc,
> > +       .free           = platform_msi_device_domain_free,
> > +};
> > +
> > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > +                                    unsigned int virq, unsigned int nr_irqs,
> > +                                    void *arg)
> > +{
> > +       int i, ret;
> > +       unsigned int type;
> > +       irq_hw_number_t hwirq;
> > +       struct irq_fwspec *fwspec = arg;
> > +       struct aplic_priv *priv = domain->host_data;
> > +
> > +       ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
> > +       if (ret)
> > +               return ret;
> > +
> > +       for (i = 0; i < nr_irqs; i++) {
> > +               irq_domain_set_info(domain, virq + i, hwirq + i,
> > +                                   &aplic_chip, priv, handle_simple_irq,
> > +                                   NULL, NULL);
> > +               irq_set_affinity(virq + i, &priv->lmask);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > +       .translate      = aplic_irqdomain_translate,
> > +       .alloc          = aplic_irqdomain_idc_alloc,
> > +       .free           = irq_domain_free_irqs_top,
> > +};
> > +
> > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > +{
> > +       int i;
> > +
> > +       /* Disable all interrupts */
> > +       for (i = 0; i <= priv->nr_irqs; i += 32)
> > +               writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > +                           (i / 32) * sizeof(u32));
> > +
> > +       /* Set interrupt type and default priority for all interrupts */
> > +       for (i = 1; i <= priv->nr_irqs; i++) {
> > +               writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > +                         (i - 1) * sizeof(u32));
> > +               writel(APLIC_DEFAULT_PRIORITY,
> > +                      priv->regs + APLIC_TARGET_BASE +
> > +                      (i - 1) * sizeof(u32));
> > +       }
> > +
> > +       /* Clear APLIC domaincfg */
> > +       writel(0, priv->regs + APLIC_DOMAINCFG);
> > +}
> > +
> > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > +{
> > +       u32 val;
> > +#ifdef CONFIG_RISCV_M_MODE
> > +       u32 valH;
> > +
> > +       if (!priv->nr_idcs) {
> > +               val = priv->msicfg.base_ppn;
> > +               valH = (priv->msicfg.base_ppn >> 32) &
> > +                       APLIC_xMSICFGADDRH_BAPPN_MASK;
> > +               valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > +                       << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > +               valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > +                       << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > +               valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > +                       << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > +               valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > +                       << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > +               writel(val, priv->regs + APLIC_xMSICFGADDR);
> > +               writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > +       }
> > +#endif
> > +
> > +       /* Setup APLIC domaincfg register */
> > +       val = readl(priv->regs + APLIC_DOMAINCFG);
> > +       val |= APLIC_DOMAINCFG_IE;
> > +       if (!priv->nr_idcs)
> > +               val |= APLIC_DOMAINCFG_DM;
> > +       writel(val, priv->regs + APLIC_DOMAINCFG);
> > +       if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > +               dev_warn(priv->dev,
> > +                        "unable to write 0x%x in domaincfg\n", val);
> > +}
> > +
> > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > +{
> > +       unsigned int group_index, hart_index, guest_index, val;
> > +       struct device *dev = msi_desc_to_dev(desc);
> > +       struct aplic_priv *priv = dev_get_drvdata(dev);
> > +       struct irq_data *d = irq_get_irq_data(desc->irq);
> > +       struct aplic_msicfg *mc = &priv->msicfg;
> > +       phys_addr_t tppn, tbppn, msg_addr;
> > +       void __iomem *target;
> > +
> > +       /* For zeroed MSI, simply write zero into the target register */
> > +       if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > +               target = priv->regs + APLIC_TARGET_BASE;
> > +               target += (d->hwirq - 1) * sizeof(u32);
> > +               writel(0, target);
> > +               return;
> > +       }
> > +
> > +       /* Sanity check on message data */
> > +       WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > +
> > +       /* Compute target MSI address */
> > +       msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > +       tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > +
> > +       /* Compute target HART Base PPN */
> > +       tbppn = tppn;
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > +       tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > +       WARN_ON(tbppn != mc->base_ppn);
> > +
> > +       /* Compute target group and hart indexes */
> > +       group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > +                    APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > +       hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > +                    APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > +       hart_index |= (group_index << mc->lhxw);
> > +       WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > +
> > +       /* Compute target guest index */
> > +       guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > +
> > +       /* Update IRQ TARGET register */
> > +       target = priv->regs + APLIC_TARGET_BASE;
> > +       target += (d->hwirq - 1) * sizeof(u32);
> > +       val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > +                               << APLIC_TARGET_HART_IDX_SHIFT;
> > +       val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > +                               << APLIC_TARGET_GUEST_IDX_SHIFT;
> > +       val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > +       writel(val, target);
> > +}
> > +
> > +static int aplic_setup_msi(struct aplic_priv *priv)
> > +{
> > +       struct device *dev = priv->dev;
> > +       struct aplic_msicfg *mc = &priv->msicfg;
> > +       const struct imsic_global_config *imsic_global;
> > +
> > +       /*
> > +        * The APLIC outgoing MSI config registers assume target MSI
> > +        * controller to be RISC-V AIA IMSIC controller.
> > +        */
> > +       imsic_global = imsic_get_global_config();
> > +       if (!imsic_global) {
> > +               dev_err(dev, "IMSIC global config not found\n");
> > +               return -ENODEV;
> > +       }
> > +
> > +       /* Find number of guest index bits (LHXS) */
> > +       mc->lhxs = imsic_global->guest_index_bits;
> > +       if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > +               dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of HART index bits (LHXW) */
> > +       mc->lhxw = imsic_global->hart_index_bits;
> > +       if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > +               dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find number of group index bits (HHXW) */
> > +       mc->hhxw = imsic_global->group_index_bits;
> > +       if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > +               dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Find first bit position of group index (HHXS) */
> > +       mc->hhxs = imsic_global->group_index_shift;
> > +       if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > +               dev_err(dev, "IMSIC group index shift should be >= %d\n",
> > +                       (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > +               return -EINVAL;
> > +       }
> > +       mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > +       if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > +               dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Compute PPN base */
> > +       mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > +       mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > +
> > +       /* Use all possible CPUs as lmask */
> > +       cpumask_copy(&priv->lmask, cpu_possible_mask);
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > + * which returns the highest priority pending interrupt and clears its
> > + * pending bit. This process is repeated until the CLAIMI register
> > + * returns zero.
> > + */
> > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > +{
> > +       struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > +       struct irq_chip *chip = irq_desc_get_chip(desc);
> > +       irq_hw_number_t hw_irq;
> > +       int irq;
> > +
> > +       chained_irq_enter(chip, desc);
> > +
> > +       while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > +               hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > +               irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > +
> > +               if (unlikely(irq <= 0))
> > +                       pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > +                                           hw_irq);
> > +               else
> > +                       generic_handle_irq(irq);
> > +       }
> > +
> > +       chained_irq_exit(chip, desc);
> > +}
> > +
> > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > +{
> > +       u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > +       u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > +
> > +       /* Priority must be less than threshold for interrupt triggering */
> > +       writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > +
> > +       /* Delivery must be set to 1 for interrupt triggering */
> > +       writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > +}
> > +
> > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > +{
> > +       if (aplic_idc_parent_irq)
> > +               disable_percpu_irq(aplic_idc_parent_irq);
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > +{
> > +       if (aplic_idc_parent_irq)
> > +               enable_percpu_irq(aplic_idc_parent_irq,
> > +                                 irq_get_trigger_type(aplic_idc_parent_irq));
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_setup_idc(struct aplic_priv *priv)
> > +{
> > +       int i, j, rc, cpu, setup_count = 0;
> > +       struct device_node *node = priv->dev->of_node;
> > +       struct device *dev = priv->dev;
> > +       struct of_phandle_args parent;
> > +       struct irq_domain *domain;
> > +       unsigned long hartid;
> > +       struct aplic_idc *idc;
> > +       u32 val;
> > +
> > +       /* Setup per-CPU IDC and target CPU mask */
> > +       for (i = 0; i < priv->nr_idcs; i++) {
> > +               if (of_irq_parse_one(node, i, &parent)) {
> > +                       dev_err(dev, "failed to parse parent for IDC%d.\n",
> > +                               i);
> > +                       return -EIO;
> > +               }
> > +
> > +               /* Skip IDCs which do not connect to external interrupts */
> > +               if (parent.args[0] != RV_IRQ_EXT)
> > +                       continue;
> > +
> > +               rc = riscv_of_parent_hartid(parent.np, &hartid);
> > +               if (rc) {
> > +                       dev_err(dev, "failed to parse hart ID for IDC%d.\n",
> > +                               i);
> > +                       return rc;
> > +               }
> > +
> > +               cpu = riscv_hartid_to_cpuid(hartid);
> > +               if (cpu < 0) {
> > +                       dev_warn(dev, "invalid cpuid for IDC%d\n", i);
> > +                       continue;
> > +               }
> > +
> > +               cpumask_set_cpu(cpu, &priv->lmask);
> > +
> > +               idc = per_cpu_ptr(&aplic_idcs, cpu);
> > +               WARN_ON(idc->priv);
> > +
> > +               idc->hart_index = i;
> > +               idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > +               idc->priv = priv;
> > +
> > +               aplic_idc_set_delivery(idc, true);
> > +
> > +               /*
> > +                * The boot CPU might not have APLIC hart_index = 0, so
> > +                * check and update the target registers of all interrupts.
> > +                */
> > +               if (cpu == smp_processor_id() && idc->hart_index) {
> > +                       val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > +                       val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > +                       val |= APLIC_DEFAULT_PRIORITY;
> > +                       for (j = 1; j <= priv->nr_irqs; j++)
> > +                               writel(val, priv->regs + APLIC_TARGET_BASE +
> > +                                           (j - 1) * sizeof(u32));
> > +               }
> > +
> > +               setup_count++;
> > +       }
> > +
> > +       /* Find parent domain and register chained handler */
> > +       domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > +                                         DOMAIN_BUS_ANY);
> > +       if (!aplic_idc_parent_irq && domain) {
> > +               aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > +               if (aplic_idc_parent_irq) {
> > +                       irq_set_chained_handler(aplic_idc_parent_irq,
> > +                                               aplic_idc_handle_irq);
> > +
> > +                       /*
> > +                        * Setup CPUHP notifier to enable IDC parent
> > +                        * interrupt on all CPUs
> > +                        */
> > +                       cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > +                                         "irqchip/riscv/aplic:starting",
> > +                                         aplic_idc_starting_cpu,
> > +                                         aplic_idc_dying_cpu);
> > +               }
> > +       }
> > +
> > +       /* Fail if we were not able to setup IDC for any CPU */
> > +       return (setup_count) ? 0 : -ENODEV;
> > +}
> > +
> > +static int aplic_probe(struct platform_device *pdev)
> > +{
> > +       struct device_node *node = pdev->dev.of_node;
> > +       struct device *dev = &pdev->dev;
> > +       struct aplic_priv *priv;
> > +       struct resource *regs;
> > +       phys_addr_t pa;
> > +       int rc;
> > +
> > +       regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +       if (!regs) {
> > +               dev_err(dev, "cannot find registers resource\n");
> > +               return -ENOENT;
> > +       }
> > +
> > +       priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +       platform_set_drvdata(pdev, priv);
> > +       priv->dev = dev;
> > +
> > +       priv->regs = devm_ioremap(dev, regs->start, resource_size(regs));
> > +       if (WARN_ON(!priv->regs)) {
> > +               dev_err(dev, "failed ioremap registers\n");
> > +               return -EIO;
> > +       }
> > +
> > +       of_property_read_u32(node, "riscv,num-sources", &priv->nr_irqs);
> > +       if (!priv->nr_irqs) {
> > +               dev_err(dev, "failed to get number of interrupt sources\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       /* Setup initial state of APLIC interrupts */
> > +       aplic_init_hw_irqs(priv);
> > +
> > +       /*
> > +        * Setup IDCs or MSIs based on parent interrupts in DT node
> > +        *
> > +        * If "msi-parent" DT property is present then we ignore the
> > +        * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > +        */
> > +       priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
> > +                       0 : of_irq_count(node);
> > +       if (priv->nr_idcs)
> > +               rc = aplic_setup_idc(priv);
> > +       else
> > +               rc = aplic_setup_msi(priv);
> > +       if (rc)
> > +               return rc;
> > +
> > +       /* Setup global config and interrupt delivery */
> > +       aplic_init_hw_global(priv);
> > +
> > +       /* Create irq domain instance for the APLIC */
> > +       if (priv->nr_idcs)
> > +               priv->irqdomain = irq_domain_create_linear(
> > +                                               of_node_to_fwnode(node),
> > +                                               priv->nr_irqs + 1,
> > +                                               &aplic_irqdomain_idc_ops,
> > +                                               priv);
> > +       else
> > +               priv->irqdomain = platform_msi_create_device_domain(dev,
> > +                                               priv->nr_irqs + 1,
> > +                                               aplic_msi_write_msg,
> > +                                               &aplic_irqdomain_msi_ops,
> > +                                               priv);
> > +       if (!priv->irqdomain) {
> > +               dev_err(dev, "failed to add irq domain\n");
> > +               return -ENOMEM;
> > +       }
> > +
> > +       /* Advertise the interrupt controller */
> > +       if (priv->nr_idcs) {
> > +               dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
> > +                        priv->nr_irqs, priv->nr_idcs);
> > +       } else {
> > +               pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > +               dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
> > +                        priv->nr_irqs, &pa);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int aplic_remove(struct platform_device *pdev)
> > +{
> > +       struct aplic_priv *priv = platform_get_drvdata(pdev);
> > +
> > +       irq_domain_remove(priv->irqdomain);
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct of_device_id aplic_match[] = {
> > +       { .compatible = "riscv,aplic" },
> > +       {}
> > +};
> > +
> > +static struct platform_driver aplic_driver = {
> > +       .driver = {
> > +               .name           = "riscv-aplic",
> > +               .of_match_table = aplic_match,
> > +       },
> > +       .probe = aplic_probe,
> > +       .remove = aplic_remove,
> > +};
> > +
> > +static int __init aplic_init(void)
> > +{
> > +       return platform_driver_register(&aplic_driver);
> > +}
> > +core_initcall(aplic_init);
> > diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> > new file mode 100644
> > index 000000000000..88177eefd411
> > --- /dev/null
> > +++ b/include/linux/irqchip/riscv-aplic.h
> > @@ -0,0 +1,117 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> > +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> > +
> > +#include <linux/bitops.h>
> > +
> > +#define APLIC_MAX_IDC                  BIT(14)
> > +#define APLIC_MAX_SOURCE               1024
> > +
> > +#define APLIC_DOMAINCFG                        0x0000
> > +#define APLIC_DOMAINCFG_RDONLY         0x80000000
> > +#define APLIC_DOMAINCFG_IE             BIT(8)
> > +#define APLIC_DOMAINCFG_DM             BIT(2)
> > +#define APLIC_DOMAINCFG_BE             BIT(0)
> > +
> > +#define APLIC_SOURCECFG_BASE           0x0004
> > +#define APLIC_SOURCECFG_D              BIT(10)
> > +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x000003ff
> > +#define APLIC_SOURCECFG_SM_MASK        0x00000007
> > +#define APLIC_SOURCECFG_SM_INACTIVE    0x0
> > +#define APLIC_SOURCECFG_SM_DETACH      0x1
> > +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
> > +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
> > +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
> > +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
> > +
> > +#define APLIC_MMSICFGADDR              0x1bc0
> > +#define APLIC_MMSICFGADDRH             0x1bc4
> > +#define APLIC_SMSICFGADDR              0x1bc8
> > +#define APLIC_SMSICFGADDRH             0x1bcc
> > +
> > +#ifdef CONFIG_RISCV_M_MODE
> > +#define APLIC_xMSICFGADDR              APLIC_MMSICFGADDR
> > +#define APLIC_xMSICFGADDRH             APLIC_MMSICFGADDRH
> > +#else
> > +#define APLIC_xMSICFGADDR              APLIC_SMSICFGADDR
> > +#define APLIC_xMSICFGADDRH             APLIC_SMSICFGADDRH
> > +#endif
> > +
> > +#define APLIC_xMSICFGADDRH_L           BIT(31)
> > +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
> > +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
> > +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
> > +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
> > +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
> > +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
> > +#define APLIC_xMSICFGADDRH_LHXW_MASK   0xf
> > +#define APLIC_xMSICFGADDRH_LHXW_SHIFT  12
> > +#define APLIC_xMSICFGADDRH_BAPPN_MASK  0xfff
> > +
> > +#define APLIC_xMSICFGADDR_PPN_SHIFT    12
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> > +       (BIT(__lhxs) - 1)
> > +
> > +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> > +       (BIT(__lhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> > +       ((__lhxs))
> > +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> > +       (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> > +        APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> > +       (BIT(__hhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> > +       ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> > +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> > +       (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> > +        APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> > +
> > +#define APLIC_SETIP_BASE               0x1c00
> > +#define APLIC_SETIPNUM                 0x1cdc
> > +
> > +#define APLIC_CLRIP_BASE               0x1d00
> > +#define APLIC_CLRIPNUM                 0x1ddc
> > +
> > +#define APLIC_SETIE_BASE               0x1e00
> > +#define APLIC_SETIENUM                 0x1edc
> > +
> > +#define APLIC_CLRIE_BASE               0x1f00
> > +#define APLIC_CLRIENUM                 0x1fdc
> > +
> > +#define APLIC_SETIPNUM_LE              0x2000
> > +#define APLIC_SETIPNUM_BE              0x2004
> > +
> > +#define APLIC_GENMSI                   0x3000
> > +
> > +#define APLIC_TARGET_BASE              0x3004
> > +#define APLIC_TARGET_HART_IDX_SHIFT    18
> > +#define APLIC_TARGET_HART_IDX_MASK     0x3fff
> > +#define APLIC_TARGET_GUEST_IDX_SHIFT   12
> > +#define APLIC_TARGET_GUEST_IDX_MASK    0x3f
> > +#define APLIC_TARGET_IPRIO_MASK        0xff
> > +#define APLIC_TARGET_EIID_MASK 0x7ff
> > +
> > +#define APLIC_IDC_BASE                 0x4000
> > +#define APLIC_IDC_SIZE                 32
> > +
> > +#define APLIC_IDC_IDELIVERY            0x00
> > +
> > +#define APLIC_IDC_IFORCE               0x04
> > +
> > +#define APLIC_IDC_ITHRESHOLD           0x08
> > +
> > +#define APLIC_IDC_TOPI                 0x18
> > +#define APLIC_IDC_TOPI_ID_SHIFT        16
> > +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> > +#define APLIC_IDC_TOPI_PRIO_MASK       0xff
> > +
> > +#define APLIC_IDC_CLAIMI               0x1c
> > +
> > +#endif
> > --
> > 2.34.1
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/9] RISC-V: Add AIA related CSR defines
  2023-01-17 20:42         ` Conor Dooley
@ 2023-01-27 11:58           ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-27 11:58 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Atish Patra,
	Alistair Francis, linux-riscv, linux-kernel, devicetree

On Wed, Jan 18, 2023 at 2:12 AM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Anup,
>
> I thought I had already replied here but clearly not, sorry!
>
> On Mon, Jan 09, 2023 at 10:39:08AM +0530, Anup Patel wrote:
> > On Thu, Jan 5, 2023 at 4:37 AM Conor Dooley <conor@kernel.org> wrote:
> > > On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
>
> > > > +/* AIA CSR bits */
> > > > +#define TOPI_IID_SHIFT               16
> > > > +#define TOPI_IID_MASK                0xfff
>
> While I think of it, it'd be worth noting that these are generic across
> all of topi, mtopi etc. Initially I thought that this mask was wrong as
> the topi section says:
>         bits 25:16 Interrupt identity (source number)
>         bits 7:0 Interrupt priority

These defines are for the AIA CSRs and not AIA APLIC IDC registers.

As per the latest frozen spec, the mtopi/stopi/vstopi CSRs have the following bits:
    bits: 27:16 IID
    bits: 7:0 IPRIO
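
So, for example, extracting the interrupt identity would look like this
(sketch, assuming the usual csr_read() helper and the CSR_STOPI number
define from this series):

        unsigned long iid = (csr_read(CSR_STOPI) >> TOPI_IID_SHIFT) & TOPI_IID_MASK;

which is why a 12-bit post-shift mask (0xfff) is used here.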

>
> > > > +#define TOPI_IPRIO_MASK              0xff
> > > > +#define TOPI_IPRIO_BITS              8
> > > > +
> > > > +#define TOPEI_ID_SHIFT               16
> > > > +#define TOPEI_ID_MASK                0x7ff
> > > > +#define TOPEI_PRIO_MASK              0x7ff
> > > > +
> > > > +#define ISELECT_IPRIO0               0x30
> > > > +#define ISELECT_IPRIO15              0x3f
> > > > +#define ISELECT_MASK         0x1ff
> > > > +
> > > > +#define HVICTL_VTI           0x40000000
> > > > +#define HVICTL_IID           0x0fff0000
> > > > +#define HVICTL_IID_SHIFT     16
> > > > +#define HVICTL_IPRIOM                0x00000100
> > > > +#define HVICTL_IPRIO         0x000000ff
> > >
> > > Why not name these as masks, like you did for the other masks?
> > > Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
> > > intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
> > > to be used *pre*-shift.
> > > Some consistency in naming and function would be great.
> >
> > The following convention is being followed in asm/csr.h for defining
> > the MASK of any field XYZ in a CSR ABC:
> > 1. ABC_XYZ : This name is used for a MASK which is intended
> >    to be used before the SHIFT
> > 2. ABC_XYZ_MASK: This name is used for a MASK which is
> >    intended to be used after the SHIFT
>
> Which makes sense in theory.
>
> > The existing defines for [M|S]STATUS, HSTATUS, SATP, and xENVCFG
> > follow the above convention. The only outlier is the HGATPx_VMID_MASK
> > define, which I will fix in my next KVM RISC-V series.
>
> Yup, it is liable to end up like that.
>
> > I don't see how any of the AIA CSR defines are violating the above
> > convention.
>
> What I was advocating for was picking one style and sticking to it.
> These copy-paste from docs things are tedious and error prone to review,
> and I don't think having multiple styles is helpful.

On the other hand, I think we should let developers choose a style
which is better suited for a particular register field instead of enforcing
it here. The best we can do is follow a naming convention for defines.
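
For example, for a hypothetical field XYZ in a CSR ABC, the two styles
end up being used as:

        val = (csr_val & ABC_XYZ) >> ABC_XYZ_SHIFT;       /* pre-shift mask */
        val = (csr_val >> ABC_XYZ_SHIFT) & ABC_XYZ_MASK;  /* post-shift mask */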

>
> Tedious as it was, I did check all the numbers though, so in that
> respect:
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

BTW, this patch is shared with KVM AIA CSR series so most likely
I will take this patch through that series.

Regards,
Anup

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/9] RISC-V: Add AIA related CSR defines
@ 2023-01-27 11:58           ` Anup Patel
  0 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-01-27 11:58 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Atish Patra,
	Alistair Francis, linux-riscv, linux-kernel, devicetree

On Wed, Jan 18, 2023 at 2:12 AM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Anup,
>
> I thought I had already replied here but clearly not, sorry!
>
> On Mon, Jan 09, 2023 at 10:39:08AM +0530, Anup Patel wrote:
> > On Thu, Jan 5, 2023 at 4:37 AM Conor Dooley <conor@kernel.org> wrote:
> > > On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
>
> > > > +/* AIA CSR bits */
> > > > +#define TOPI_IID_SHIFT               16
> > > > +#define TOPI_IID_MASK                0xfff
>
> While I think of it, it'd be worth noting that these are generic across
> all of topi, mtopi etc. Initially I thought that this mask was wrong as
> the topi section says:
>         bits 25:16 Interrupt identity (source number)
>         bits 7:0 Interrupt priority

These defines are for the AIA CSRs and not AIA APLIC IDC registers.

As per the latest frozen spec, the mtopi/stopi/vstopi CSRs have the following bits:
    bits: 27:16 IID
    bits: 7:0 IPRIO

>
> > > > +#define TOPI_IPRIO_MASK              0xff
> > > > +#define TOPI_IPRIO_BITS              8
> > > > +
> > > > +#define TOPEI_ID_SHIFT               16
> > > > +#define TOPEI_ID_MASK                0x7ff
> > > > +#define TOPEI_PRIO_MASK              0x7ff
> > > > +
> > > > +#define ISELECT_IPRIO0               0x30
> > > > +#define ISELECT_IPRIO15              0x3f
> > > > +#define ISELECT_MASK         0x1ff
> > > > +
> > > > +#define HVICTL_VTI           0x40000000
> > > > +#define HVICTL_IID           0x0fff0000
> > > > +#define HVICTL_IID_SHIFT     16
> > > > +#define HVICTL_IPRIOM                0x00000100
> > > > +#define HVICTL_IPRIO         0x000000ff
> > >
> > > Why not name these as masks, like you did for the other masks?
> > > Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
> > > intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
> > > to be used *pre*-shift.
> > > Some consistency in naming and function would be great.
> >
> > The following convention is being followed in asm/csr.h for defining
> > the MASK of any field XYZ in a CSR ABC:
> > 1. ABC_XYZ : This name is used for a MASK which is intended
> >    to be used before the SHIFT
> > 2. ABC_XYZ_MASK: This name is used for a MASK which is
> >    intended to be used after the SHIFT
>
> Which makes sense in theory.
>
> > The existing defines for [M|S]STATUS, HSTATUS, SATP, and xENVCFG
> > follow the above convention. The only outlier is the HGATPx_VMID_MASK
> > define, which I will fix in my next KVM RISC-V series.
>
> Yup, it is liable to end up like that.
>
> > I don't see how any of the AIA CSR defines are violating the above
> > convention.
>
> What I was advocating for was picking one style and sticking to it.
> These copy-paste from docs things are tedious and error prone to review,
> and I don't think having multiple styles is helpful.

On the other hand, I think we should let developers choose a style
which is better suited for a particular register field instead of enforcing
it here. The best we can do is follow a naming convention for defines.

>
> Tedious as it was, I did check all the numbers though, so in that
> respect:
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

BTW, this patch is shared with KVM AIA CSR series so most likely
I will take this patch through that series.

Regards,
Anup

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/9] RISC-V: Add AIA related CSR defines
  2023-01-27 11:58           ` Anup Patel
@ 2023-01-27 14:20             ` Conor Dooley
  -1 siblings, 0 replies; 72+ messages in thread
From: Conor Dooley @ 2023-01-27 14:20 UTC (permalink / raw)
  To: Anup Patel
  Cc: Conor Dooley, Anup Patel, Palmer Dabbelt, Paul Walmsley,
	Thomas Gleixner, Marc Zyngier, Rob Herring, Krzysztof Kozlowski,
	Atish Patra, Alistair Francis, linux-riscv, linux-kernel,
	devicetree

On Fri, Jan 27, 2023 at 05:28:57PM +0530, Anup Patel wrote:
> On Wed, Jan 18, 2023 at 2:12 AM Conor Dooley <conor@kernel.org> wrote:

> > > > > +/* AIA CSR bits */
> > > > > +#define TOPI_IID_SHIFT               16
> > > > > +#define TOPI_IID_MASK                0xfff
> >
> > While I think of it, it'd be worth noting that these are generic across
> > all of topi, mtopi etc. Initially I thought that this mask was wrong as
> > the topi section says:
> >         bits 25:16 Interrupt identity (source number)
> >         bits 7:0 Interrupt priority
> 
> These defines are for the AIA CSRs and not for the AIA APLIC IDC registers.
> 
> As per the latest frozen spec, the mtopi/stopi/vstopi CSRs have the following bit fields:
>     bits 27:16 IID
>     bits 7:0 IPRIO

I know that those CSRs use those bits, hence leaving an R-b for the
patch - but your define says TOPI, which is *not* accurate for the
APLIC's topi register. That is confusing and should be noted.

> > What I was advocating for was picking one style and sticking to it.
> > These copy-paste from docs things are tedious and error prone to review,
> > and I don't think having multiple styles is helpful.
> 
> On the other hand, I think we should let developers choose a style
> which is better suited for a particular register field instead of
> enforcing it here. The best we can do is follow a naming convention
> for defines.

Well, we shall have to agree to disagree, I suppose!

> > Tedious as it was, I did check all the numbers though, so in that
> > respect:
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> 
> BTW, this patch is shared with KVM AIA CSR series so most likely
> I will take this patch through that series.

Since the path by which it gets applied is between you and Palmer to
decide, feel free to add the R-b whichever way the patch ends up going!

Thanks,
Conor.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-01-03 14:14   ` Anup Patel
@ 2023-02-19 11:17     ` Vivian Wang
  -1 siblings, 0 replies; 72+ messages in thread
From: Vivian Wang @ 2023-02-19 11:17 UTC (permalink / raw)
  To: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree

On 1/3/23 22:14, Anup Patel wrote:
> We add DT bindings document for the RISC-V incoming MSI controller
> (IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
> specification.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  .../interrupt-controller/riscv,imsics.yaml    | 168 ++++++++++++++++++
>  1 file changed, 168 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> new file mode 100644
> index 000000000000..b9db03b6e95f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> <snip>
> +
> +  interrupts-extended:
> +    minItems: 1
> +    maxItems: 16384
> +    description:
> +      This property represents the set of CPUs (or HARTs) for which given
> +      device tree node describes the IMSIC interrupt files. Each node pointed
> +      to should be a riscv,cpu-intc node, which has a riscv node (i.e. RISC-V
> +      HART) as parent.
> +

This property doesn't seem to describe guest external interrupts. Should
we add a reference to e.g. <&cpuN_intc 12> to indicate that the IMSIC can
send a 'Supervisor guest external interrupt'? Or, just an idea, maybe we
can add an additional interrupt controller to the CPU nodes to handle
SGEI (various properties omitted):

cpu0: cpu@N {
	compatible = "riscv";

	cpu0_intc: interrupt-controller {
		compatible = "riscv,cpu-intc";

		cpu0_gei: interrupt-controller {
			/* intc for hart-local hgeie/hgeip */
			compatible = "riscv,..."; /* Something here */
			interrupt-parent = <&cpu0_intc>;
			interrupts = <12>; /* SGEI */
			interrupt-controller;
			#interrupt-cells = <1>;
		};
	};
};

interrupt-controller@... {
	compatible = "riscv,imsics";
	interrupts-extended = <&cpu0_intc 11>, <&cpu0_gei 1>, <&cpu0_gei 2> /* ... */;
};

I feel that this would be more appropriate, since the guest external
interrupts are defined in the privileged architecture specification and
are not specific to AIA. Though please do suggest more appropriate ways
to formulate it.
> <snip>
> +...

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-01-03 14:14   ` Anup Patel
@ 2023-02-19 11:48     ` Vivian Wang
  -1 siblings, 0 replies; 72+ messages in thread
From: Vivian Wang @ 2023-02-19 11:48 UTC (permalink / raw)
  To: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski
  Cc: Atish Patra, Alistair Francis, Anup Patel, linux-riscv,
	linux-kernel, devicetree

On 1/3/23 22:14, Anup Patel wrote:
> We add DT bindings document for RISC-V advanced platform level
> interrupt controller (APLIC) defined by the RISC-V advanced
> interrupt architecture (AIA) specification.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
>  1 file changed, 159 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> new file mode 100644
> index 000000000000..b7f20aad72c2
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> @@ -0,0 +1,159 @@
>
> <snip>
>
> +  riscv,children:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024
> +    items:
> +      maxItems: 1
> +    description:
> +      A list of child APLIC domains for the given APLIC domain. Each child
> +      APLIC domain is assigned child index in increasing order with the
> +      first child APLIC domain assigned child index 0. The APLIC domain
> +      child index is used by firmware to delegate interrupts from the
> +      given APLIC domain to a particular child APLIC domain.
> +
> +  riscv,delegate:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    minItems: 1
> +    maxItems: 1024
> +    items:
> +      items:
> +        - description: child APLIC domain phandle
> +        - description: first interrupt number (inclusive)
> +        - description: last interrupt number (inclusive)
> +    description:
> +      A interrupt delegation list where each entry is a triple consisting
> +      of child APLIC domain phandle, first interrupt number, and last
> +      interrupt number. The firmware will configure interrupt delegation
> +      registers based on interrupt delegation list.
> +

I'm not sure if this is the right place to ask, since it could be more
of an OpenSBI/QEMU problem, but I think a more detailed description of
what 'the firmware' does is appropriate here.

My main confusion is how to describe wired interrupts connected to
APLICs. Say we have two APLIC nodes with labels aplic_m and aplic_s that
are the APLIC domains for M-mode and S-mode respectively. IIUC, wired
interrupts are connected directly to aplic_m. So how do I refer to it in
the device nodes?

 1. <&aplic_s num IRQ_TYPE_foo>, but it would be a lie to M-mode
    software, which could be a problem. QEMU 7.2.0 seems to take this
    approach. (I could also be misunderstanding QEMU and it actually
    does connect wired interrupts to the S-mode APLIC, but then
    riscv,children and riscv,delegate would be lies.)
 2. <&aplic_m ...>, and when M-mode software gives S-mode software
    access to devices, it delegates relevant interrupts and patches it
    into <&aplic_s num IRQ_TYPE_foo>. Seems to be the 'correct'
    approach, but pretty complicated.
 3. <&aplic_m ...>, S-mode software sees this, and sees that aplic_m has
    num in riscv,delegate, so goes to find the child it's been delegated
    to, which is (should be) aplic_s. A bit annoyingly abstraction
    breaking, since S-mode shouldn't even need to know about aplic_m.

I see that others are also confused by the riscv,delegate and riscv,children
properties. It would be great if we could clarify the expected behavior
here rather than just saying 'the firmware will do the thing'.

> <snip>
> +...
Thanks,
Vivian

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-01-04 23:21     ` Conor Dooley
@ 2023-02-20  3:15       ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-02-20  3:15 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel, devicetree

On Thu, Jan 5, 2023 at 4:51 AM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Anup,
>
> On Tue, Jan 03, 2023 at 07:44:04PM +0530, Anup Patel wrote:
> > We add DT bindings document for the RISC-V incoming MSI controller
> > (IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
> > specification.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > +  The IMSIC is a per-CPU (or per-HART) device with separate interrupt file
> > +  for each privilege level (machine or supervisor).
>
> > +  The device tree of a RISC-V platform will have one IMSIC device tree node
> > +  for each privilege level (machine or supervisor) which collectively describe
> > +  IMSIC interrupt files at that privilege level across CPUs (or HARTs).
>
> > +examples:
> > +  - |
> > +    // Example 1 (Machine-level IMSIC files with just one group):
> > +
> > +    imsic_mlevel: interrupt-controller@24000000 {
> > +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> > +      interrupts-extended = <&cpu1_intc 11>,
> > +                            <&cpu2_intc 11>,
> > +                            <&cpu3_intc 11>,
> > +                            <&cpu4_intc 11>;
> > +      reg = <0x28000000 0x4000>;
> > +      interrupt-controller;
> > +      #interrupt-cells = <0>;
> > +      msi-controller;
> > +      riscv,num-ids = <127>;
> > +    };
> > +
> > +  - |
> > +    // Example 2 (Supervisor-level IMSIC files with two groups):
> > +
> > +    imsic_slevel: interrupt-controller@28000000 {
> > +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> > +      interrupts-extended = <&cpu1_intc 9>,
> > +                            <&cpu2_intc 9>,
> > +                            <&cpu3_intc 9>,
> > +                            <&cpu4_intc 9>;
> > +      reg = <0x28000000 0x2000>, /* Group0 IMSICs */
> > +            <0x29000000 0x2000>; /* Group1 IMSICs */
> > +      interrupt-controller;
> > +      #interrupt-cells = <0>;
> > +      msi-controller;
> > +      riscv,num-ids = <127>;
> > +      riscv,group-index-bits = <1>;
> > +      riscv,group-index-shift = <24>;
> > +    };
>
> How is, say linux, meant to know which of the per-level imsic DT nodes
> applies to it?
> I had a quick look in the driver, but could see no mechanism for it.
> Apologies if I missed something obvious!

This is very straightforward. We simply look at the local interrupt number
in the "interrupts-extended" DT property.

Currently, we use the same technique in the PLIC driver to distinguish
the M-mode PLIC context from the S-mode PLIC context.
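
Roughly, the check boils down to something like the sketch below
(of_irq_parse_one() is the generic OF helper; 9 and 11 are the standard
supervisor/machine external interrupt numbers; this is only an
illustration, not code from this series):

#include <linux/of.h>
#include <linux/of_irq.h>

static bool imsic_node_is_smode(struct device_node *node)
{
	struct of_phandle_args parent;
	bool smode;

	/* The first interrupts-extended entry points at a riscv,cpu-intc node. */
	if (of_irq_parse_one(node, 0, &parent))
		return false;

	/* 9 = supervisor external interrupt, 11 = machine external interrupt */
	smode = (parent.args[0] == 9);
	of_node_put(parent.np);
	return smode;
}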

Regards,
Anup

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-01-12 20:49     ` Rob Herring
@ 2023-02-20  3:20       ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-02-20  3:20 UTC (permalink / raw)
  To: Rob Herring
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Krzysztof Kozlowski, Atish Patra, Alistair Francis, Anup Patel,
	linux-riscv, linux-kernel, devicetree

On Fri, Jan 13, 2023 at 2:19 AM Rob Herring <robh@kernel.org> wrote:
>
> On Tue, Jan 03, 2023 at 07:44:04PM +0530, Anup Patel wrote:
> > We add DT bindings document for the RISC-V incoming MSI controller
> > (IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
> > specification.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  .../interrupt-controller/riscv,imsics.yaml    | 168 ++++++++++++++++++
> >  1 file changed, 168 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> > new file mode 100644
> > index 000000000000..b9db03b6e95f
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> > @@ -0,0 +1,168 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: RISC-V Incoming MSI Controller (IMSIC)
> > +
> > +maintainers:
> > +  - Anup Patel <anup@brainfault.org>
> > +
> > +description: |
> > +  The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
> > +  MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
> > +  AIA specification can be found at https://github.com/riscv/riscv-aia.
> > +
> > +  The IMSIC is a per-CPU (or per-HART) device with separate interrupt file
> > +  for each privilege level (machine or supervisor). The configuration of
> > +  a IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO
> > +  space to receive MSIs from devices. Each IMSIC interrupt file supports a
> > +  fixed number of interrupt identities (to distinguish MSIs from devices)
> > +  which is same for given privilege level across CPUs (or HARTs).
> > +
> > +  The device tree of a RISC-V platform will have one IMSIC device tree node
> > +  for each privilege level (machine or supervisor) which collectively describe
> > +  IMSIC interrupt files at that privilege level across CPUs (or HARTs).
> > +
> > +  The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
> > +  follows a particular scheme defined by the RISC-V AIA specification. A IMSIC
> > +  group is a set of IMSIC interrupt files co-located in MMIO space and we can
> > +  have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
> > +  RISC-V platform. The MSI target address of a IMSIC interrupt file at given
> > +  privilege level (machine or supervisor) encodes group index, HART index,
> > +  and guest index (shown below).
> > +
> > +  XLEN-1           >=24                                 12    0
> > +  |                  |                                  |     |
> > +  -------------------------------------------------------------
> > +  |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
> > +  -------------------------------------------------------------
> > +
> > +allOf:
> > +  - $ref: /schemas/interrupt-controller.yaml#
> > +  - $ref: /schemas/interrupt-controller/msi-controller.yaml#
> > +
> > +properties:
> > +  compatible:
> > +    items:
> > +      - enum:
> > +          - riscv,qemu-imsics
>
> The implementation/vendor is qemu, so: qemu,imsics (or qemu,riscv-imsics?)

Okay, I will update.

>
> > +      - const: riscv,imsics
> > +
> > +  reg:
> > +    minItems: 1
> > +    maxItems: 16384
> > +    description:
> > +      Base address of each IMSIC group.
> > +
> > +  interrupt-controller: true
> > +
> > +  "#interrupt-cells":
> > +    const: 0
> > +
> > +  msi-controller: true
> > +
> > +  interrupts-extended:
> > +    minItems: 1
> > +    maxItems: 16384
> > +    description:
> > +      This property represents the set of CPUs (or HARTs) for which given
> > +      device tree node describes the IMSIC interrupt files. Each node pointed
> > +      to should be a riscv,cpu-intc node, which has a riscv node (i.e. RISC-V
> > +      HART) as parent.
> > +
> > +  riscv,num-ids:
> > +    $ref: /schemas/types.yaml#/definitions/uint32
> > +    minimum: 63
> > +    maximum: 2047
> > +    description:
> > +      Number of interrupt identities supported by IMSIC interrupt file.
> > +
> > +  riscv,num-guest-ids:
> > +    $ref: /schemas/types.yaml#/definitions/uint32
> > +    minimum: 63
> > +    maximum: 2047
> > +    description:
> > +      Number of interrupt identities are supported by IMSIC guest interrupt
> > +      file. When not specified it is assumed to be same as specified by the
> > +      riscv,num-ids property.
> > +
> > +  riscv,guest-index-bits:
> > +    minimum: 0
> > +    maximum: 7
> > +    default: 0
> > +    description:
> > +      Number of guest index bits in the MSI target address. When not
> > +      specified it is assumed to be 0.
>
> No need to repeat what 'default: 0' defines.

Okay, I will update.

>
> > +
> > +  riscv,hart-index-bits:
> > +    minimum: 0
> > +    maximum: 15
> > +    description:
> > +      Number of HART index bits in the MSI target address. When not
> > +      specified it is estimated based on the interrupts-extended property.
>
> If guessing works, why do you need the property? Perhaps
> s/estimated/calculated/?

Okay, I will fix the wording.

We need this property because IMSIC files of two consecutive HARTs
are not required to be contiguous since there could be holes (unused
space) in between.
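
For reference, the address formation that the index-bits properties
describe boils down to roughly this (a sketch based on the layout diagram
quoted above; the function name and parameters are illustrative):

/* MSI write address of the interrupt file for (group, hart, guest) */
static unsigned long imsic_file_addr(unsigned long group_base,
				     unsigned long group, unsigned long hart,
				     unsigned long guest,
				     unsigned int guest_index_bits,
				     unsigned int group_index_shift)
{
	return group_base |
	       (group << group_index_shift) |
	       (hart << (12 + guest_index_bits)) |
	       (guest << 12);
}

Because of the holes, riscv,hart-index-bits has to be wide enough for the
largest hart index actually used, which the driver cannot always infer
from the number of interrupts-extended entries.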

>
> > +
> > +  riscv,group-index-bits:
> > +    minimum: 0
> > +    maximum: 7
> > +    default: 0
> > +    description:
> > +      Number of group index bits in the MSI target address. When not
> > +      specified it is assumed to be 0.
> > +
> > +  riscv,group-index-shift:
> > +    $ref: /schemas/types.yaml#/definitions/uint32
> > +    minimum: 0
> > +    maximum: 55
> > +    default: 24
> > +    description:
> > +      The least significant bit position of the group index bits in the
> > +      MSI target address. When not specified it is assumed to be 24.
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - interrupt-controller
> > +  - msi-controller
>
> #msi-cells should be defined (as 0) and required. Best to be explicit
> and not rely on the default.

Okay, I will update.

>
> > +  - interrupts-extended
> > +  - riscv,num-ids
> > +
> > +unevaluatedProperties: false
> > +
> > +examples:
> > +  - |
> > +    // Example 1 (Machine-level IMSIC files with just one group):
> > +
> > +    imsic_mlevel: interrupt-controller@24000000 {
>
> Drop unused labels.

Okay, I will update.

>
> > +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> > +      interrupts-extended = <&cpu1_intc 11>,
> > +                            <&cpu2_intc 11>,
> > +                            <&cpu3_intc 11>,
> > +                            <&cpu4_intc 11>;
> > +      reg = <0x28000000 0x4000>;
> > +      interrupt-controller;
> > +      #interrupt-cells = <0>;
> > +      msi-controller;
> > +      riscv,num-ids = <127>;
> > +    };
> > +
> > +  - |
> > +    // Example 2 (Supervisor-level IMSIC files with two groups):
> > +
> > +    imsic_slevel: interrupt-controller@28000000 {
> > +      compatible = "riscv,qemu-imsics", "riscv,imsics";
> > +      interrupts-extended = <&cpu1_intc 9>,
> > +                            <&cpu2_intc 9>,
> > +                            <&cpu3_intc 9>,
> > +                            <&cpu4_intc 9>;
> > +      reg = <0x28000000 0x2000>, /* Group0 IMSICs */
> > +            <0x29000000 0x2000>; /* Group1 IMSICs */
> > +      interrupt-controller;
> > +      #interrupt-cells = <0>;
> > +      msi-controller;
> > +      riscv,num-ids = <127>;
> > +      riscv,group-index-bits = <1>;
> > +      riscv,group-index-shift = <24>;
> > +    };
> > +...
> > --
> > 2.34.1
> >

Regards,
Anup

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
  2023-02-19 11:17     ` Vivian Wang
@ 2023-02-20  3:31       ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-02-20  3:31 UTC (permalink / raw)
  To: Vivian Wang
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel, devicetree

On Sun, Feb 19, 2023 at 4:48 PM Vivian Wang <uwu@dram.page> wrote:
>
> On 1/3/23 22:14, Anup Patel wrote:
> > We add DT bindings document for the RISC-V incoming MSI controller
> > (IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
> > specification.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  .../interrupt-controller/riscv,imsics.yaml    | 168 ++++++++++++++++++
> >  1 file changed, 168 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> > new file mode 100644
> > index 000000000000..b9db03b6e95f
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> > <snip>
> > +
> > +  interrupts-extended:
> > +    minItems: 1
> > +    maxItems: 16384
> > +    description:
> > +      This property represents the set of CPUs (or HARTs) for which given
> > +      device tree node describes the IMSIC interrupt files. Each node pointed
> > +      to should be a riscv,cpu-intc node, which has a riscv node (i.e. RISC-V
> > +      HART) as parent.
> > +
>
> This property doesn't seem to describe guest external interrupts. Should
> we add a reference to e.g. <&cpuN_intc 12> to indicate that IMSIC can
> send a 'Supervisor guest external interrupt'? Or just an idea, maybe we
> can add an additional interrupt controller to the CPU nodes to handle
> SGEI: (Various properties omitted)
>
> cpu0: cpu@N {
>         compatible = "riscv";
>
>         cpu0_intc: interrupt-controller {
>                 compatible = "riscv,cpu-intc";
>
>                 cpu0_gei: interrupt-controller {
>                         /* intc for hart-local hgeie/hgeip */
>                         compatible = "riscv,..."; /* Something here */
>                         interrupt-parent = <&cpu0_intc>;
>                         interrupts = <12>; /* SGEI */
>                         interrupt-controller;
>                         #interrupt-cells = <1>;
>                 }
>         }
> }
>
> interrupt-controller@... {
>         compatible = "riscv,imsics";
>         interrupts-extended = <&cpu0_intc 11>, <&cpu0_gei 1>, <&cpu0_gei 2> /* ... */;
> }
>
> I feel that this would be more appropriate, since the guest external
> interrupts are defined in the privileged architecture specification and
> are not specific to AIA. Though please do suggest more appropriate ways
> to formulate it.

This is unnecessary because GEILEN can be detected by init-time
writes to the hgeie CSR. Please look at the KVM RISC-V AIA
implementation for more details. We only need the "riscv,guest-index-bits"
DT property for address space holes.
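
The probe being referred to is essentially this (a sketch using the
csr_read()/csr_write() helpers and fls_long(); not the exact KVM code):

unsigned long old, val, geilen;

old = csr_read(CSR_HGEIE);
csr_write(CSR_HGEIE, -1UL);		/* try to set every guest-file enable bit */
val = csr_read(CSR_HGEIE);		/* bits 1..GEILEN read back as one */
csr_write(CSR_HGEIE, old);
geilen = val ? fls_long(val) - 1 : 0;	/* bit 0 is hardwired to zero */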

In fact, we have tested these DT bindings with a variety of NUMA
configurations containing different numbers of IMSIC guest files
per HART.

Regards,
Anup

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-01-04 22:16     ` Conor Dooley
@ 2023-02-20  4:36       ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-02-20  4:36 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel, devicetree

On Thu, Jan 5, 2023 at 3:47 AM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Anup,
>
> On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> > We add DT bindings document for RISC-V advanced platform level
> > interrupt controller (APLIC) defined by the RISC-V advanced
> > interrupt architecture (AIA) specification.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
> >  1 file changed, 159 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
>
> > +  interrupts-extended:
> > +    minItems: 1
> > +    maxItems: 16384
> > +    description:
> > +      Given APLIC domain directly injects external interrupts to a set of
> > +      RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
> > +      node, which has a riscv node (i.e. RISC-V HART) as parent.
> > +
> > +  msi-parent:
> > +    description:
> > +      Given APLIC domain forwards wired interrupts as MSIs to a AIA incoming
> > +      message signaled interrupt controller (IMSIC). This property should be
> > +      considered only when the interrupts-extended property is absent.
>
> Considered by what?
> On v1 you said:
> <quote>
> If both "interrupts-extended" and "msi-parent" are present then it means
> the APLIC domain supports both MSI mode and Direct mode in HW. In this
> case, the APLIC driver has to choose between MSI mode or Direct mode.
> <\quote>
>
> The description is still pretty ambiguous IMO. Perhaps incorporate
> some of that expanded comment into the property description?
> Say, "If both foo and bar are present, the APLIC domain has hardware
> support for both MSI and direct mode. Software may then chose either
> mode".
> Have I misunderstood your comment on v1? It read as if having both
> present indicated that both were possible & that "should be considered
> only..." was more of a suggestion and a comment about the Linux driver's
> behaviour.
> Apologies if I have misunderstood, but I suppose if I have then the
> binding's description could be improved!!

Yes, when both DT properties are present then it's up to the Linux
APLIC driver to choose the appropriate APLIC mode.
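
Something along these lines (a sketch only; the helper names are made up
for illustration and are not from the driver in this series):

static int aplic_choose_mode(struct device_node *node)
{
	/* Prefer MSI mode whenever an IMSIC parent is described. */
	if (of_property_read_bool(node, "msi-parent"))
		return aplic_setup_msi_mode(node);	/* hypothetical helper */

	return aplic_setup_direct_mode(node);		/* hypothetical helper */
}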

I forgot to update the text here in v2 but I will update it in v3.
Thanks for pointing it out.

>
> > +  riscv,children:
> > +    $ref: /schemas/types.yaml#/definitions/phandle-array
> > +    minItems: 1
> > +    maxItems: 1024
> > +    items:
> > +      maxItems: 1
> > +    description:
> > +      A list of child APLIC domains for the given APLIC domain. Each child
> > +      APLIC domain is assigned child index in increasing order with the
>
> btw, missing article before child (& a comma after order I think).

Okay, I will update.

>
> > +      first child APLIC domain assigned child index 0. The APLIC domain
> > +      child index is used by firmware to delegate interrupts from the
> > +      given APLIC domain to a particular child APLIC domain.
> > +
> > +  riscv,delegate:
> > +    $ref: /schemas/types.yaml#/definitions/phandle-array
> > +    minItems: 1
> > +    maxItems: 1024
>
> Is it valid to have a delegate property without children? If not, the
> binding should reflect that dependency IMO.

Okay, I will update.

>
> > +    items:
> > +      items:
> > +        - description: child APLIC domain phandle
> > +        - description: first interrupt number (inclusive)
> > +        - description: last interrupt number (inclusive)
> > +    description:
> > +      A interrupt delegation list where each entry is a triple consisting
> > +      of child APLIC domain phandle, first interrupt number, and last
> > +      interrupt number. The firmware will configure interrupt delegation
>
> btw, drop the article before firmware here.
> Also, "firmware will" or "firmware must"? Semantics perhaps, but they
> are different!

I think "firmware must" is better because APLIC M-mode domains are
not accessible to S-mode so firmware must configure delegation for
at least all APLIC M-mode domains.

>
> Kinda for my own curiosity here, but do you expect these properties to
> generally be dynamically filled in by the bootloader or read by the
> bootloader to set up the configuration?

Firmware (or bootloader) will look at this property and set up delegation
before booting the OS kernel.

>
> > +      registers based on interrupt delegation list.
>
> I'm sorry Anup, but this child versus delegate thing is still not clear
> to me binding wise. See below.

There are two different pieces of information in the context of an APLIC domain:

1) HW child domain numbering: If an APLIC domain has N children
    then HW will have a fixed child index for each of the N children
    in the range 0 to N-1. This HW child index is required at the time
    of setting up interrupt delegation in sourcecfgX registers. The
    "riscv,children" DT property helps firmware (or bootloader) find
    the total number of child APLIC domains and the corresponding
    HW child index numbers.

2) IRQ delegation to child domains: An APLIC domain can delegate
   any IRQ range(s) to a particular APLIC child domain. The
   "riscv,delegate" DT property is simply a table where we have
   one row for each IRQ range which is delegated to some child
   APLIC domain. This property is more of a system setting fixed
   by the RISC-V platform vendor.
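
Concretely, what firmware does with each <child first last> triple is
roughly the following (a sketch assuming the sourcecfg layout from the
AIA spec, i.e. a delegate bit plus a child index; the macro names and
the plain writel() are illustrative, not code from this series):

#define APLIC_SOURCECFG(i)	(0x0000 + (i) * 4)	/* sourcecfg[1..1023] */
#define APLIC_SOURCECFG_D	(1U << 10)		/* delegate to child  */

static void aplic_delegate_range(void __iomem *parent, u32 child_index,
				 u32 first_irq, u32 last_irq)
{
	u32 irq;

	for (irq = first_irq; irq <= last_irq; irq++)
		writel(APLIC_SOURCECFG_D | child_index,
		       parent + APLIC_SOURCECFG(irq));
}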

>
> > +    aplic0: interrupt-controller@c000000 {
> > +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> > +      interrupts-extended = <&cpu1_intc 11>,
> > +                            <&cpu2_intc 11>,
> > +                            <&cpu3_intc 11>,
> > +                            <&cpu4_intc 11>;
> > +      reg = <0xc000000 0x4080>;
> > +      interrupt-controller;
> > +      #interrupt-cells = <2>;
> > +      riscv,num-sources = <63>;
> > +      riscv,children = <&aplic1>, <&aplic2>;
> > +      riscv,delegate = <&aplic1 1 63>;
>
> Is aplic2 here for demonstrative purposes only, since it has not been
> delegated any interrupts?

Yes, it's for demonstrative purposes only.

> I suppose it is hardware present on the SoC that is not being used by
> the current configuration?

Yes, in this example aplic2 is unused because it has no interrupts
delegated to it.

>
> Thanks,
> Conor.
>
> > +    };
> > +
> > +    aplic1: interrupt-controller@d000000 {
> > +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> > +      interrupts-extended = <&cpu1_intc 9>,
> > +                            <&cpu2_intc 9>;
> > +      reg = <0xd000000 0x4080>;
> > +      interrupt-controller;
> > +      #interrupt-cells = <2>;
> > +      riscv,num-sources = <63>;
> > +    };
> > +
> > +    aplic2: interrupt-controller@e000000 {
> > +      compatible = "riscv,qemu-aplic", "riscv,aplic";
> > +      interrupts-extended = <&cpu3_intc 9>,
> > +                            <&cpu4_intc 9>;
> > +      reg = <0xe000000 0x4080>;
> > +      interrupt-controller;
> > +      #interrupt-cells = <2>;
> > +      riscv,num-sources = <63>;
> > +    };
>

Regards,
Anup

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-02-19 11:48     ` Vivian Wang
@ 2023-02-20  5:09       ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-02-20  5:09 UTC (permalink / raw)
  To: Vivian Wang
  Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Marc Zyngier,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	Anup Patel, linux-riscv, linux-kernel, devicetree

On Sun, Feb 19, 2023 at 5:18 PM Vivian Wang <uwu@dram.page> wrote:
>
> On 1/3/23 22:14, Anup Patel wrote:
> > We add DT bindings document for RISC-V advanced platform level
> > interrupt controller (APLIC) defined by the RISC-V advanced
> > interrupt architecture (AIA) specification.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
> >  1 file changed, 159 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> > new file mode 100644
> > index 000000000000..b7f20aad72c2
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> > @@ -0,0 +1,159 @@
> >
> > <snip>
> >
> > +  riscv,children:
> > +    $ref: /schemas/types.yaml#/definitions/phandle-array
> > +    minItems: 1
> > +    maxItems: 1024
> > +    items:
> > +      maxItems: 1
> > +    description:
> > +      A list of child APLIC domains for the given APLIC domain. Each child
> > +      APLIC domain is assigned child index in increasing order with the
> > +      first child APLIC domain assigned child index 0. The APLIC domain
> > +      child index is used by firmware to delegate interrupts from the
> > +      given APLIC domain to a particular child APLIC domain.
> > +
> > +  riscv,delegate:
> > +    $ref: /schemas/types.yaml#/definitions/phandle-array
> > +    minItems: 1
> > +    maxItems: 1024
> > +    items:
> > +      items:
> > +        - description: child APLIC domain phandle
> > +        - description: first interrupt number (inclusive)
> > +        - description: last interrupt number (inclusive)
> > +    description:
> > +      A interrupt delegation list where each entry is a triple consisting
> > +      of child APLIC domain phandle, first interrupt number, and last
> > +      interrupt number. The firmware will configure interrupt delegation
> > +      registers based on interrupt delegation list.
> > +
>
> I'm not sure if this is the right place to ask, since it could be more
> of a OpenSBI/QEMU problem, but I think a more detailed description about
> what 'the firmware' does is appropriate here.
>
> My main confusion is how to describe wired interrupts connected to
> APLICs. Say we have two APLIC nodes with labels aplic_m and aplic_s that
> are the APLIC domains for M-mode and S-mode respectively. IIUC, wired
> interrupts are connected directly to aplic_m. So how do I refer to it in
> the device nodes?

Please see my previous reply to Conor about these DT properties.
The riscv,children DT property describes the HW child numbering,
whereas the riscv,delegate DT property is a table of IRQ delegations.

In your example, let's assume we have N wired interrupts. This
means we will have devices connected to the root APLIC domain
(aplic_m). Now, since aplic_s is a child of aplic_m, we will have
N wired interrupts going from aplic_m to aplic_s, where aplic_m
will route a wired/device interrupt x to aplic_s if
sourcecfg[x].D = 1 and sourcecfg[x].child = 0.

>
>  1. <&aplic_s num IRQ_TYPE_foo>, but it would be a lie to M-mode
>     software, which could be a problem. QEMU 7.2.0 seems to take this
>     approach. (I could also be misunderstanding QEMU and it actually
>     does connect wired interrupts to the S-mode APLIC, but then
>     riscv,children and riscv,delegate would be lies.)

No, it's not a lie. The <&aplic_s num IRQ_TYPE_foo> in a device DT
node is based on the IRQ delegation fixed by the RISC-V platform.
QEMU has its own strategy of delegating IRQs to APLIC S-mode
while other platforms can use a different strategy.

>  2. <&aplic_m ...>, and when M-mode software gives S-mode software
>     access to devices, it delegates relevant interrupts and patches it
>     into <&aplic_s num IRQ_TYPE_foo>. Seems to be the 'correct'
>     approach, but pretty complicated.

The APLIC M-mode domain is not accessible to S-mode software, so
Linux cannot create an irqdomain using the APLIC M-mode DT node.
This means device DT nodes must have <&aplic_s num IRQ_TYPE_foo>,
which points to the APLIC S-mode domain.

It is totally up to the RISC-V firmware and platform whether to
dynamically add/patch <&aplic_s num IRQ_TYPE_foo> in device DT
nodes. Currently, we do not patch device DT nodes in OpenSBI and
instead have the device DT nodes point to the correct APLIC domain
based on the IRQ delegation.

>  3. <&aplic_m ...>, S-mode software sees this, and sees that aplic_m has
>     num in riscv,delegate, so goes to find the child it's been delegated
>     to, which is (should be) aplic_s. A bit annoyingly abstraction
>     breaking, since S-mode shouldn't even need to know about aplic_m.

Yes, S-mode should know about aplic_m and if it tries to access aplic_m
then it will get an access fault.

This is exactly why a device DT node should have its "interrupts" DT
property pointing to the actual APLIC domain which is delivering the
interrupt to S-mode.

>
> I see that others are also confused by riscv,delegate and riscv,children
> properties. It would be great if we could clarify the expected behavior
> here rather than just saying 'the firmware will do the thing'.

Regards,
Anup

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-02-20  4:36       ` Anup Patel
@ 2023-02-20 10:32         ` Conor Dooley
  -1 siblings, 0 replies; 72+ messages in thread
From: Conor Dooley @ 2023-02-20 10:32 UTC (permalink / raw)
  To: Anup Patel
  Cc: Conor Dooley, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Atish Patra,
	Alistair Francis, Anup Patel, linux-riscv, linux-kernel,
	devicetree

[-- Attachment #1: Type: text/plain, Size: 1923 bytes --]

Hey Anup,

On Mon, Feb 20, 2023 at 10:06:49AM +0530, Anup Patel wrote:
> On Thu, Jan 5, 2023 at 3:47 AM Conor Dooley <conor@kernel.org> wrote:
> > On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> > > We add DT bindings document for RISC-V advanced platform level
> > > interrupt controller (APLIC) defined by the RISC-V advanced
> > > interrupt architecture (AIA) specification.
> > >
> > > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > > ---
> > >  .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
> > >  1 file changed, 159 insertions(+)
> > >  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

> > I'm sorry Anup, but this child versus delegate thing is still not clear
> > to me binding wise. See below.
> 
> There are two different information in-context of APLIC domain:
> 
> 1) HW child domain numbering: If an APLIC domain has N children
>     then HW will have a fixed child index for each of the N children
>     in the range 0 to N-1. This HW child index is required at the time
>     of setting up interrupt delegation in sourcecfgX registers. The
>     "riscv,children" DT property helps firmware (or bootloader) find
>     the total number of child APLIC domains and corresponding
>     HW child index number.
> 
> 2) IRQ delegation to child domains: An APLIC domain can delegate
>    any IRQ range(s) to a particular APLIC child domain. The
>    "riscv,delegate" DT property is simply a table where we have
>    one row for each IRQ range which is delegated to some child
>    APLIC domain. This property is more of a system setting fixed
>    by the RISC-V platform vendor.

Thanks for the explanations. It's been a while since my brain swapped
this stuff out, but I think delegate/child makes sense to me now.
Just don't ask me to write the dt entry as proof...

Thanks,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
  2023-02-20 10:32         ` Conor Dooley
@ 2023-02-20 10:56           ` Conor Dooley
  -1 siblings, 0 replies; 72+ messages in thread
From: Conor Dooley @ 2023-02-20 10:56 UTC (permalink / raw)
  To: Anup Patel
  Cc: Conor Dooley, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Marc Zyngier, Rob Herring, Krzysztof Kozlowski, Atish Patra,
	Alistair Francis, Anup Patel, linux-riscv, linux-kernel,
	devicetree

[-- Attachment #1: Type: text/plain, Size: 2346 bytes --]

On Mon, Feb 20, 2023 at 10:32:57AM +0000, Conor Dooley wrote:
> On Mon, Feb 20, 2023 at 10:06:49AM +0530, Anup Patel wrote:
> > On Thu, Jan 5, 2023 at 3:47 AM Conor Dooley <conor@kernel.org> wrote:
> > > On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> > > > We add DT bindings document for RISC-V advanced platform level
> > > > interrupt controller (APLIC) defined by the RISC-V advanced
> > > > interrupt architecture (AIA) specification.
> > > >
> > > > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > > > ---
> > > >  .../interrupt-controller/riscv,aplic.yaml     | 159 ++++++++++++++++++
> > > >  1 file changed, 159 insertions(+)
> > > >  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> 
> > > I'm sorry Anup, but this child versus delegate thing is still not clear
> > > to me binding wise. See below.
> > 
> > There are two different information in-context of APLIC domain:
> > 
> > 1) HW child domain numbering: If an APLIC domain has N children
> >     then HW will have a fixed child index for each of the N children
> >     in the range 0 to N-1. This HW child index is required at the time
> >     of setting up interrupt delegation in sourcecfgX registers. The
> >     "riscv,children" DT property helps firmware (or bootloader) find
> >     the total number of child APLIC domains and corresponding
> >     HW child index number.
> > 
> > 2) IRQ delegation to child domains: An APLIC domain can delegate
> >    any IRQ range(s) to a particular APLIC child domain. The
> >    "riscv,delegate" DT property is simply a table where we have
> >    one row for each IRQ range which is delegated to some child
> >    APLIC domain. This property is more of a system setting fixed
> >    by the RISC-V platform vendor.
> 
> Thanks for the explanations. It's been a while since my brain swapped
> this stuff out, but I think delegate/child makes sense to me now.

> Just don't ask me to write the dt entry as proof...

Having looked at Dramforever's QEMU dtb dump a bit more and your
responses to her, I think that I have "come to terms" with it now
actually.
I suppose when the next version comes around I'll make sure that I
arrive in the same ballpark that QEMU does, based off the descriptions
etc in the binding.

Thanks!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
  2023-01-13 10:10     ` Marc Zyngier
@ 2023-05-01  8:28       ` Anup Patel
  -1 siblings, 0 replies; 72+ messages in thread
From: Anup Patel @ 2023-05-01  8:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	linux-riscv, linux-kernel, devicetree

On Fri, Jan 13, 2023 at 3:40 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 03 Jan 2023 14:14:05 +0000,
> Anup Patel <apatel@ventanamicro.com> wrote:
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new MSI controller for managing MSIs on a RISC-V platform. This new
> > MSI controller is referred to as incoming message signaled interrupt
> > controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> > (For more details refer https://github.com/riscv/riscv-aia)
>
> And how about IPIs, which this driver seems to be concerned about?

Okay, I will mention IPIs in the commit description.

>
> >
> > This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > ---
> >  drivers/irqchip/Kconfig             |   14 +-
> >  drivers/irqchip/Makefile            |    1 +
> >  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
> >  include/linux/irqchip/riscv-imsic.h |   92 +++
> >  4 files changed, 1280 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
> >  create mode 100644 include/linux/irqchip/riscv-imsic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index 9e65345ca3f6..a1315189a595 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -29,7 +29,6 @@ config ARM_GIC_V2M
> >
> >  config GIC_NON_BANKED
> >       bool
> > -
> >  config ARM_GIC_V3
> >       bool
> >       select IRQ_DOMAIN_HIERARCHY
> > @@ -548,6 +547,19 @@ config SIFIVE_PLIC
> >       select IRQ_DOMAIN_HIERARCHY
> >       select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_IMSIC
> > +     bool
> > +     depends on RISCV
> > +     select IRQ_DOMAIN_HIERARCHY
> > +     select GENERIC_MSI_IRQ_DOMAIN
> > +
> > +config RISCV_IMSIC_PCI
> > +     bool
> > +     depends on RISCV_IMSIC
> > +     depends on PCI
> > +     depends on PCI_MSI
> > +     default RISCV_IMSIC
>
> This should definitely tell you that this driver needs splitting.

The code under "#ifdef CONFIG_RISCV_IMSIC_PCI" is hardly 40 lines
so I felt it was too small to deserve its own source file.

>
> > +
> >  config EXYNOS_IRQ_COMBINER
> >       bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> >       depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 87b49a10962c..22c723cc6ec8 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                      += irq-qcom-mpm.o
> >  obj-$(CONFIG_CSKY_MPINTC)            += irq-csky-mpintc.o
> >  obj-$(CONFIG_CSKY_APB_INTC)          += irq-csky-apb-intc.o
> >  obj-$(CONFIG_RISCV_INTC)             += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_IMSIC)            += irq-riscv-imsic.o
> >  obj-$(CONFIG_SIFIVE_PLIC)            += irq-sifive-plic.o
> >  obj-$(CONFIG_IMX_IRQSTEER)           += irq-imx-irqsteer.o
> >  obj-$(CONFIG_IMX_INTMUX)             += irq-imx-intmux.o
> > diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> > new file mode 100644
> > index 000000000000..4c16b66738d6
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic.c
> > @@ -0,0 +1,1174 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/bitmap.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iommu.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/pci.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +#include <asm/hwcap.h>
> > +
> > +#define IMSIC_DISABLE_EIDELIVERY     0
> > +#define IMSIC_ENABLE_EIDELIVERY              1
> > +#define IMSIC_DISABLE_EITHRESHOLD    1
> > +#define IMSIC_ENABLE_EITHRESHOLD     0
> > +
> > +#define imsic_csr_write(__c, __v)    \
> > +do {                                 \
> > +     csr_write(CSR_ISELECT, __c);    \
> > +     csr_write(CSR_IREG, __v);       \
> > +} while (0)
> > +
> > +#define imsic_csr_read(__c)          \
> > +({                                   \
> > +     unsigned long __v;              \
> > +     csr_write(CSR_ISELECT, __c);    \
> > +     __v = csr_read(CSR_IREG);       \
> > +     __v;                            \
> > +})
> > +
> > +#define imsic_csr_set(__c, __v)              \
> > +do {                                 \
> > +     csr_write(CSR_ISELECT, __c);    \
> > +     csr_set(CSR_IREG, __v);         \
> > +} while (0)
> > +
> > +#define imsic_csr_clear(__c, __v)    \
> > +do {                                 \
> > +     csr_write(CSR_ISELECT, __c);    \
> > +     csr_clear(CSR_IREG, __v);       \
> > +} while (0)
> > +
> > +struct imsic_mmio {
> > +     phys_addr_t pa;
> > +     void __iomem *va;
> > +     unsigned long size;
> > +};
> > +
> > +struct imsic_priv {
> > +     /* Global configuration common for all HARTs */
> > +     struct imsic_global_config global;
> > +
> > +     /* MMIO regions */
> > +     u32 num_mmios;
> > +     struct imsic_mmio *mmios;
> > +
> > +     /* Global state of interrupt identities */
> > +     raw_spinlock_t ids_lock;
> > +     unsigned long *ids_used_bimap;
> > +     unsigned long *ids_enabled_bimap;
> > +     unsigned int *ids_target_cpu;
> > +
> > +     /* Mask for connected CPUs */
> > +     struct cpumask lmask;
> > +
> > +     /* IPI interrupt identity */
> > +     u32 ipi_id;
> > +     u32 ipi_lsync_id;
> > +
> > +     /* IRQ domains */
> > +     struct irq_domain *base_domain;
> > +     struct irq_domain *pci_domain;
> > +     struct irq_domain *plat_domain;
> > +};
> > +
> > +struct imsic_handler {
> > +     /* Local configuration for given HART */
> > +     struct imsic_local_config local;
> > +
> > +     /* Pointer to private context */
> > +     struct imsic_priv *priv;
> > +};
> > +
> > +static bool imsic_init_done;
> > +
> > +static int imsic_parent_irq;
> > +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> > +
> > +const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > +     struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +
> > +     if (!handler || !handler->priv)
> > +             return NULL;
> > +
> > +     return &handler->priv->global;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > +
> > +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> > +{
> > +     struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > +     if (!handler || !handler->priv)
> > +             return NULL;
>
> How can this happen?

These are redundant checks. I will drop them.

>
> > +
> > +     return &handler->local;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_local_config);
>
> Why are these symbols exported? They have no user, so they shouldn't
> even exist here. I also seriously doubt there is a valid use case for
> exposing this information to the rest of the kernel.

imsic_get_global_config() is used by the APLIC driver and the KVM
RISC-V module, whereas imsic_get_local_config() is only used by
KVM RISC-V.

The KVM RISC-V AIA irqchip patches are available in the
riscv_kvm_aia_v1 branch at: https://github.com/avpatel/linux.git.
I have not posted the KVM RISC-V patches yet due to various
interdependencies.

>
> > +
> > +static int imsic_cpu_page_phys(unsigned int cpu,
> > +                            unsigned int guest_index,
> > +                            phys_addr_t *out_msi_pa)
> > +{
> > +     struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +     struct imsic_global_config *global;
> > +     struct imsic_local_config *local;
> > +
> > +     if (!handler || !handler->priv)
> > +             return -ENODEV;
> > +     local = &handler->local;
> > +     global = &handler->priv->global;
> > +
> > +     if (BIT(global->guest_index_bits) <= guest_index)
> > +             return -EINVAL;
> > +
> > +     if (out_msi_pa)
> > +             *out_msi_pa = local->msi_pa +
> > +                           (guest_index * IMSIC_MMIO_PAGE_SZ);
> > +
> > +     return 0;
> > +}
> > +
> > +static int imsic_get_cpu(struct imsic_priv *priv,
> > +                      const struct cpumask *mask_val, bool force,
> > +                      unsigned int *out_target_cpu)
> > +{
> > +     struct cpumask amask;
> > +     unsigned int cpu;
> > +
> > +     cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > +     if (force)
> > +             cpu = cpumask_first(&amask);
> > +     else
> > +             cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > +     if (cpu >= nr_cpu_ids)
> > +             return -EINVAL;
> > +
> > +     if (out_target_cpu)
> > +             *out_target_cpu = cpu;
> > +
> > +     return 0;
> > +}
> > +
> > +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> > +                              struct msi_msg *msg)
> > +{
> > +     phys_addr_t msi_addr;
> > +     int err;
> > +
> > +     err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > +     if (err)
> > +             return err;
> > +
> > +     msg->address_hi = upper_32_bits(msi_addr);
> > +     msg->address_lo = lower_32_bits(msi_addr);
> > +     msg->data = id;
> > +
> > +     return err;
> > +}
> > +
> > +static void imsic_id_set_target(struct imsic_priv *priv,
> > +                              unsigned int id, unsigned int target_cpu)
> > +{
> > +     unsigned long flags;
> > +
> > +     raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +     priv->ids_target_cpu[id] = target_cpu;
> > +     raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> > +                                     unsigned int id)
> > +{
> > +     unsigned int ret;
> > +     unsigned long flags;
> > +
> > +     raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +     ret = priv->ids_target_cpu[id];
> > +     raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +     return ret;
> > +}
> > +
> > +static void __imsic_eix_update(unsigned long base_id,
> > +                            unsigned long num_id, bool pend, bool val)
> > +{
> > +     unsigned long i, isel, ireg, flags;
> > +     unsigned long id = base_id, last_id = base_id + num_id;
> > +
> > +     while (id < last_id) {
> > +             isel = id / BITS_PER_LONG;
> > +             isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > +             isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> > +
> > +             ireg = 0;
> > +             for (i = id & (__riscv_xlen - 1);
> > +                  (id < last_id) && (i < __riscv_xlen); i++) {
> > +                     ireg |= BIT(i);
> > +                     id++;
> > +             }
> > +
> > +             /*
> > +              * The IMSIC EIEx and EIPx registers are indirectly
> > +              * accessed via using ISELECT and IREG CSRs so we
> > +              * save/restore local IRQ to ensure that we don't
> > +              * get preempted while accessing IMSIC registers.
> > +              */
> > +             local_irq_save(flags);
> > +             if (val)
> > +                     imsic_csr_set(isel, ireg);
> > +             else
> > +                     imsic_csr_clear(isel, ireg);
> > +             local_irq_restore(flags);
>
> What is the actual requirement? no preemption? or no interrupts? This
> isn't the same thing. Also, a bunch of the users already disable
> interrupts. Consistency wouldn't hurt here.

No preemption is the only requirement. Since the callers of these
functions disable local IRQs, I think we don't need to do anything
special here. I will drop the local IRQ save/restore and update
the comments as well.
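
The updated comment would be along these lines (wording is only a
sketch):

	/*
	 * The IMSIC EIEx/EIPx registers are accessed indirectly via
	 * the ISELECT/IREG CSR pair, so the two CSR accesses must not
	 * be split across CPUs. Callers are expected to run with
	 * local interrupts (and hence preemption) disabled.
	 */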

>
> > +     }
> > +}
> > +
> > +#define __imsic_id_enable(__id)              \
> > +     __imsic_eix_update((__id), 1, false, true)
> > +#define __imsic_id_disable(__id)     \
> > +     __imsic_eix_update((__id), 1, false, false)
> > +
> > +#ifdef CONFIG_SMP
> > +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> > +{
> > +     struct imsic_handler *handler;
> > +     struct cpumask amask;
> > +     int cpu;
> > +
> > +     cpumask_and(&amask, &priv->lmask, cpu_online_mask);
>
> Can't this race against a CPU going down?

Yes, it can race if a CPU goes down while we are in this function,
but this won't be a problem because imsic_starting_cpu() will
unconditionally do imsic_ids_local_sync() when the CPU is brought
up again. I will add a multiline comment block explaining this.
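
Something along these lines (again, only a sketch of the wording):

	/*
	 * A CPU may go offline after the cpumask_and() above and thus
	 * miss the synchronization write below. That is harmless:
	 * when the CPU is brought online again, imsic_starting_cpu()
	 * unconditionally calls imsic_ids_local_sync(), which
	 * re-derives its local enable bits from ids_enabled_bimap.
	 */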

>
> > +     for_each_cpu(cpu, &amask) {
> > +             if (cpu == smp_processor_id())
> > +                     continue;
> > +
> > +             handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +             if (!handler || !handler->priv || !handler->local.msi_va) {
> > +                     pr_warn("CPU%d: handler not initialized\n", cpu);
>
> How many times are you going to do that? On each failing synchronisation?

My bad for adding these paranoid checks. I will remove them
wherever possible.

>
> > +                     continue;
> > +             }
> > +
> > +             writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
>
> As I understand it, this is a "behind the scenes" IPI. Why isn't that
> a *real* IPI?

Yes, that's correct. The ID enable bits are per-CPU and accessible
only via CSRs, hence we have a special "behind the scenes" IPI to
synchronize the state of the ID enable bits.
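
On the receive side this roughly boils down to the following in the
per-CPU interrupt handler (a simplified sketch, not the exact code):

	/* id popped from the TOPEI CSR by the top-level handler */
	if (id == priv->ipi_lsync_id) {
		/* re-apply the shared enable bitmap to this CPU's EIEx CSRs */
		imsic_ids_local_sync(priv);
	}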

>
> > +     }
> > +}
> > +#else
> > +#define __imsic_id_smp_sync(__priv)
> > +#endif
> > +
> > +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> > +{
> > +     unsigned long flags;
> > +
> > +     raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +     bitmap_set(priv->ids_enabled_bimap, id, 1);
> > +     __imsic_id_enable(id);
> > +     raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +     __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> > +{
> > +     unsigned long flags;
> > +
> > +     raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +     bitmap_clear(priv->ids_enabled_bimap, id, 1);
> > +     __imsic_id_disable(id);
> > +     raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +     __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_ids_local_sync(struct imsic_priv *priv)
> > +{
> > +     int i;
> > +     unsigned long flags;
> > +
> > +     raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +     for (i = 1; i <= priv->global.nr_ids; i++) {
> > +             if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> > +                     continue;
> > +
> > +             if (test_bit(i, priv->ids_enabled_bimap))
> > +                     __imsic_id_enable(i);
> > +             else
> > +                     __imsic_id_disable(i);
> > +     }
> > +     raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> > +{
> > +     if (enable) {
> > +             imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> > +             imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> > +     } else {
> > +             imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> > +             imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> > +     }
> > +}
> > +
> > +static int imsic_ids_alloc(struct imsic_priv *priv,
> > +                        unsigned int max_id, unsigned int order)
> > +{
> > +     int ret;
> > +     unsigned long flags;
> > +
> > +     if ((priv->global.nr_ids < max_id) ||
> > +         (max_id < BIT(order)))
> > +             return -EINVAL;
>
> Why do we need this check? Shouldn't that be guaranteed by
> construction?

Yes, these are redundant checks. I will remove them.

>
> > +
> > +     raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +     ret = bitmap_find_free_region(priv->ids_used_bimap,
> > +                                   max_id + 1, order);
> > +     raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > +     return ret;
> > +}
> > +
> > +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> > +                        unsigned int order)
> > +{
> > +     unsigned long flags;
> > +
> > +     raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > +     bitmap_release_region(priv->ids_used_bimap, base_id, order);
> > +     raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static int __init imsic_ids_init(struct imsic_priv *priv)
> > +{
> > +     int i;
> > +     struct imsic_global_config *global = &priv->global;
> > +
> > +     raw_spin_lock_init(&priv->ids_lock);
> > +
> > +     /* Allocate used bitmap */
> > +     priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > +                                     sizeof(unsigned long), GFP_KERNEL);
>
> How about bitmap_alloc?

Okay, I will use bitmap_zalloc() here.
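
i.e. something along these lines (a sketch of the change, not the
final patch), with matching bitmap_free() calls in
imsic_ids_cleanup():

	/* One bit per interrupt identity, IDs 0 to nr_ids inclusive */
	priv->ids_used_bimap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
	if (!priv->ids_used_bimap)
		return -ENOMEM;

	priv->ids_enabled_bimap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
	if (!priv->ids_enabled_bimap) {
		bitmap_free(priv->ids_used_bimap);
		return -ENOMEM;
	}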

>
> > +     if (!priv->ids_used_bimap)
> > +             return -ENOMEM;
> > +
> > +     /* Allocate enabled bitmap */
> > +     priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > +                                        sizeof(unsigned long), GFP_KERNEL);
> > +     if (!priv->ids_enabled_bimap) {
> > +             kfree(priv->ids_used_bimap);
> > +             return -ENOMEM;
> > +     }
> > +
> > +     /* Allocate target CPU array */
> > +     priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> > +                                    sizeof(unsigned int), GFP_KERNEL);
> > +     if (!priv->ids_target_cpu) {
> > +             kfree(priv->ids_enabled_bimap);
> > +             kfree(priv->ids_used_bimap);
> > +             return -ENOMEM;
> > +     }
> > +     for (i = 0; i <= global->nr_ids; i++)
> > +             priv->ids_target_cpu[i] = UINT_MAX;
> > +
> > +     /* Reserve ID#0 because it is special and never implemented */
> > +     bitmap_set(priv->ids_used_bimap, 0, 1);
> > +
> > +     return 0;
> > +}
> > +
> > +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> > +{
> > +     kfree(priv->ids_target_cpu);
> > +     kfree(priv->ids_enabled_bimap);
> > +     kfree(priv->ids_used_bimap);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_ipi_send(unsigned int cpu)
> > +{
> > +     struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > +     if (!handler || !handler->priv || !handler->local.msi_va) {
> > +             pr_warn("CPU%d: handler not initialized\n", cpu);
> > +             return;
> > +     }
> > +
> > +     writel(handler->priv->ipi_id, handler->local.msi_va);
> > +}
> > +
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > +     __imsic_id_enable(priv->ipi_id);
> > +     __imsic_id_enable(priv->ipi_lsync_id);
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > +     int virq;
> > +
> > +     /* Allocate interrupt identity for IPIs */
> > +     virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > +     if (virq < 0)
> > +             return virq;
> > +     priv->ipi_id = virq;
> > +
> > +     /* Create IMSIC IPI multiplexing */
> > +     virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
>
> Please! This BITS_PER_BYTE makes zero sense here. Have a proper define
> that says 8, and document *why* this is 8! You're not defining a type
> system, you're writing a irqchip driver.

Okay, I will add a "#define" for the number of IPI with an explanation
for *why*.
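
For example (the name and comment wording below are only
illustrative):

	/*
	 * All Linux IPI messages are multiplexed over a single IMSIC
	 * interrupt identity (ipi_id) using ipi_mux. 8 is an upper
	 * bound on the number of multiplexed IPI messages; Linux
	 * currently defines fewer IPI message types than that.
	 */
	#define IMSIC_NR_IPI	8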

>
> > +     if (virq <= 0) {
> > +             imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +             return (virq < 0) ? virq : -ENOMEM;
> > +     }
> > +
> > +     /* Set vIRQ range */
> > +     riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> > +
> > +     /* Allocate interrupt identity for local enable/disable sync */
> > +     virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > +     if (virq < 0) {
> > +             imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +             return virq;
> > +     }
> > +     priv->ipi_lsync_id = virq;
> > +
> > +     return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > +     imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> > +     if (priv->ipi_id)
> > +             imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +}
> > +#else
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > +     /* Clear the IPI ids because we are not using IPIs */
> > +     priv->ipi_id = 0;
> > +     priv->ipi_lsync_id = 0;
> > +     return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > +}
> > +#endif
> > +
> > +static void imsic_irq_mask(struct irq_data *d)
> > +{
> > +     imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_unmask(struct irq_data *d)
> > +{
> > +     imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> > +                                   struct msi_msg *msg)
> > +{
> > +     struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > +     unsigned int cpu;
> > +     int err;
> > +
> > +     cpu = imsic_id_get_target(priv, d->hwirq);
> > +     WARN_ON(cpu == UINT_MAX);
> > +
> > +     err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> > +     WARN_ON(err);
> > +
> > +     iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static int imsic_irq_set_affinity(struct irq_data *d,
> > +                               const struct cpumask *mask_val,
> > +                               bool force)
> > +{
> > +     struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > +     unsigned int target_cpu;
> > +     int rc;
> > +
> > +     rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> > +     if (rc)
> > +             return rc;
> > +
> > +     imsic_id_set_target(priv, d->hwirq, target_cpu);
> > +     irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> > +
> > +     return IRQ_SET_MASK_OK;
> > +}
> > +#endif
> > +
> > +static struct irq_chip imsic_irq_base_chip = {
> > +     .name                   = "RISC-V IMSIC-BASE",
> > +     .irq_mask               = imsic_irq_mask,
> > +     .irq_unmask             = imsic_irq_unmask,
> > +#ifdef CONFIG_SMP
> > +     .irq_set_affinity       = imsic_irq_set_affinity,
> > +#endif
> > +     .irq_compose_msi_msg    = imsic_irq_compose_msi_msg,
> > +     .flags                  = IRQCHIP_SKIP_SET_WAKE |
> > +                               IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> > +                               unsigned int virq,
> > +                               unsigned int nr_irqs,
> > +                               void *args)
> > +{
> > +     struct imsic_priv *priv = domain->host_data;
> > +     msi_alloc_info_t *info = args;
> > +     phys_addr_t msi_addr;
> > +     int i, hwirq, err = 0;
> > +     unsigned int cpu;
> > +
> > +     err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> > +     if (err)
> > +             return err;
> > +
> > +     err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > +     if (err)
> > +             return err;
> > +
> > +     hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> > +                             get_count_order(nr_irqs));
> > +     if (hwirq < 0)
> > +             return hwirq;
> > +
> > +     err = iommu_dma_prepare_msi(info->desc, msi_addr);
> > +     if (err)
> > +             goto fail;
> > +
> > +     for (i = 0; i < nr_irqs; i++) {
> > +             imsic_id_set_target(priv, hwirq + i, cpu);
> > +             irq_domain_set_info(domain, virq + i, hwirq + i,
> > +                                 &imsic_irq_base_chip, priv,
> > +                                 handle_simple_irq, NULL, NULL);
> > +             irq_set_noprobe(virq + i);
> > +             irq_set_affinity(virq + i, &priv->lmask);
> > +     }
> > +
> > +     return 0;
> > +
> > +fail:
> > +     imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> > +     return err;
> > +}
> > +
> > +static void imsic_irq_domain_free(struct irq_domain *domain,
> > +                               unsigned int virq,
> > +                               unsigned int nr_irqs)
> > +{
> > +     struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> > +     struct imsic_priv *priv = domain->host_data;
> > +
> > +     imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> > +     irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> > +}
> > +
> > +static const struct irq_domain_ops imsic_base_domain_ops = {
> > +     .alloc          = imsic_irq_domain_alloc,
> > +     .free           = imsic_irq_domain_free,
> > +};
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +
> > +static void imsic_pci_mask_irq(struct irq_data *d)
> > +{
> > +     pci_msi_mask_irq(d);
> > +     irq_chip_mask_parent(d);
> > +}
> > +
> > +static void imsic_pci_unmask_irq(struct irq_data *d)
> > +{
> > +     pci_msi_unmask_irq(d);
> > +     irq_chip_unmask_parent(d);
> > +}
> > +
> > +static struct irq_chip imsic_pci_irq_chip = {
> > +     .name                   = "RISC-V IMSIC-PCI",
> > +     .irq_mask               = imsic_pci_mask_irq,
> > +     .irq_unmask             = imsic_pci_unmask_irq,
> > +     .irq_eoi                = irq_chip_eoi_parent,
> > +};
> > +
> > +static struct msi_domain_ops imsic_pci_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_pci_domain_info = {
> > +     .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> > +                MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> > +     .ops    = &imsic_pci_domain_ops,
> > +     .chip   = &imsic_pci_irq_chip,
> > +};
> > +
> > +#endif
> > +
> > +static struct irq_chip imsic_plat_irq_chip = {
> > +     .name                   = "RISC-V IMSIC-PLAT",
> > +};
> > +
> > +static struct msi_domain_ops imsic_plat_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_plat_domain_info = {
> > +     .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> > +     .ops    = &imsic_plat_domain_ops,
> > +     .chip   = &imsic_plat_irq_chip,
> > +};
> > +
> > +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> > +                                      struct fwnode_handle *fwnode)
> > +{
> > +     /* Create Base IRQ domain */
> > +     priv->base_domain = irq_domain_create_tree(fwnode,
> > +                                             &imsic_base_domain_ops, priv);
> > +     if (!priv->base_domain) {
> > +             pr_err("Failed to create IMSIC base domain\n");
> > +             return -ENOMEM;
> > +     }
> > +     irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +     /* Create PCI MSI domain */
> > +     priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> > +                                             &imsic_pci_domain_info,
> > +                                             priv->base_domain);
> > +     if (!priv->pci_domain) {
> > +             pr_err("Failed to create IMSIC PCI domain\n");
> > +             irq_domain_remove(priv->base_domain);
> > +             return -ENOMEM;
> > +     }
> > +#endif
> > +
> > +     /* Create Platform MSI domain */
> > +     priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> > +                                             &imsic_plat_domain_info,
> > +                                             priv->base_domain);
> > +     if (!priv->plat_domain) {
> > +             pr_err("Failed to create IMSIC platform domain\n");
> > +             if (priv->pci_domain)
> > +                     irq_domain_remove(priv->pci_domain);
> > +             irq_domain_remove(priv->base_domain);
> > +             return -ENOMEM;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> > + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> > + * Linux interrupt number and let Linux IRQ subsystem handle it.
> > + */
> > +static void imsic_handle_irq(struct irq_desc *desc)
> > +{
> > +     struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +     struct irq_chip *chip = irq_desc_get_chip(desc);
> > +     struct imsic_priv *priv = handler->priv;
> > +     irq_hw_number_t hwirq;
> > +     int err;
> > +
> > +     WARN_ON_ONCE(!handler->priv);
> > +
> > +     chained_irq_enter(chip, desc);
> > +
> > +     while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> > +             hwirq = hwirq >> TOPEI_ID_SHIFT;
> > +
> > +             if (hwirq == priv->ipi_id) {
> > +#ifdef CONFIG_SMP
> > +                     ipi_mux_process();
> > +#endif
> > +                     continue;
> > +             } else if (hwirq == priv->ipi_lsync_id) {
> > +                     imsic_ids_local_sync(priv);
> > +                     continue;
> > +             }
> > +
> > +             err = generic_handle_domain_irq(priv->base_domain, hwirq);
> > +             if (unlikely(err))
> > +                     pr_warn_ratelimited(
> > +                             "hwirq %lu mapping not found\n", hwirq);
> > +     }
> > +
> > +     chained_irq_exit(chip, desc);
> > +}
> > +
> > +static int imsic_starting_cpu(unsigned int cpu)
> > +{
> > +     struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +     struct imsic_priv *priv = handler->priv;
> > +
> > +     /* Enable per-CPU parent interrupt */
> > +     if (imsic_parent_irq)
> > +             enable_percpu_irq(imsic_parent_irq,
> > +                               irq_get_trigger_type(imsic_parent_irq));
>
> Shouldn't that be the default already?

The imsic_parent_irq is already set before imsic_starting_cpu() is called
on each CPU, so we can drop the if-check.
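That is, something like this (sketch of the function with the check dropped):

  static int imsic_starting_cpu(unsigned int cpu)
  {
        struct imsic_priv *priv = this_cpu_ptr(&imsic_handlers)->priv;

        /* Enable the per-CPU parent interrupt */
        enable_percpu_irq(imsic_parent_irq,
                          irq_get_trigger_type(imsic_parent_irq));

        /* Enable IPIs */
        imsic_ipi_enable(priv);

        /* Sync-up ID enable/disable state changed while this CPU was offline */
        imsic_ids_local_sync(priv);

        /* Locally enable interrupt delivery */
        imsic_ids_local_delivery(priv, true);

        return 0;
  }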

>
> > +     else
> > +             pr_warn("cpu%d: parent irq not available\n", cpu);
>
> And yet continue in sequence? Duh...

This warning is also not required.

>
> > +
> > +     /* Enable IPIs */
> > +     imsic_ipi_enable(priv);
> > +
> > +     /*
> > +      * Interrupts identities might have been enabled/disabled while
> > +      * this CPU was not running so sync-up local enable/disable state.
> > +      */
> > +     imsic_ids_local_sync(priv);
> > +
> > +     /* Locally enable interrupt delivery */
> > +     imsic_ids_local_delivery(priv, true);
> > +
> > +     return 0;
> > +}
> > +
> > +struct imsic_fwnode_ops {
> > +     u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> > +                          void *fwopaque);
> > +     int (*parent_hartid)(struct fwnode_handle *fwnode,
> > +                          void *fwopaque, u32 index,
> > +                          unsigned long *out_hartid);
> > +     u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> > +     int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> > +                             void *fwopaque, u32 index,
> > +                             struct resource *res);
> > +     void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> > +                               void *fwopaque, u32 index);
> > +     int (*read_u32)(struct fwnode_handle *fwnode,
> > +                     void *fwopaque, const char *prop, u32 *out_val);
> > +     bool (*read_bool)(struct fwnode_handle *fwnode,
> > +                       void *fwopaque, const char *prop);
> > +};
>
> Why do we need this sort of (terrible) indirection?

Okay, I will replace this indirection with fwnode APIs.
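For example, the property reads in imsic_init() would become direct fwnode
calls (rough sketch using the generic <linux/property.h> helpers):

  rc = fwnode_property_read_u32(fwnode, "riscv,num-ids", &global->nr_ids);
  if (rc) {
        pr_err("%pfwP: number of interrupt identities not found\n", fwnode);
        return rc;
  }

  /* Optional property, defaults to 0 when absent */
  if (fwnode_property_read_u32(fwnode, "riscv,guest-index-bits",
                               &global->guest_index_bits))
        global->guest_index_bits = 0;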

>
> > +
> > +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> > +                          struct fwnode_handle *fwnode,
> > +                          void *fwopaque)
> > +{
> > +     struct resource res;
> > +     phys_addr_t base_addr;
> > +     int rc, nr_parent_irqs;
> > +     struct imsic_mmio *mmio;
> > +     struct imsic_priv *priv;
> > +     struct irq_domain *domain;
> > +     struct imsic_handler *handler;
> > +     struct imsic_global_config *global;
> > +     u32 i, tmp, nr_handlers = 0;
> > +
> > +     if (imsic_init_done) {
> > +             pr_err("%pfwP: already initialized hence ignoring\n",
> > +                     fwnode);
> > +             return -ENODEV;
> > +     }
> > +
> > +     if (!riscv_isa_extension_available(NULL, SxAIA)) {
> > +             pr_err("%pfwP: AIA support not available\n", fwnode);
> > +             return -ENODEV;
> > +     }
> > +
> > +     priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> > +     if (!priv)
> > +             return -ENOMEM;
> > +     global = &priv->global;
> > +
> > +     /* Find number of parent interrupts */
> > +     nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> > +     if (!nr_parent_irqs) {
> > +             pr_err("%pfwP: no parent irqs available\n", fwnode);
> > +             return -EINVAL;
> > +     }
> > +
> > +     /* Find number of guest index bits in MSI address */
> > +     rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> > +                          &global->guest_index_bits);
> > +     if (rc)
> > +             global->guest_index_bits = 0;
> > +     tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> > +     if (tmp < global->guest_index_bits) {
> > +             pr_err("%pfwP: guest index bits too big\n", fwnode);
> > +             return -EINVAL;
> > +     }
> > +
> > +     /* Find number of HART index bits */
> > +     rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> > +                          &global->hart_index_bits);
> > +     if (rc) {
> > +             /* Assume default value */
> > +             global->hart_index_bits = __fls(nr_parent_irqs);
> > +             if (BIT(global->hart_index_bits) < nr_parent_irqs)
> > +                     global->hart_index_bits++;
> > +     }
> > +     tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > +           global->guest_index_bits;
> > +     if (tmp < global->hart_index_bits) {
> > +             pr_err("%pfwP: HART index bits too big\n", fwnode);
> > +             return -EINVAL;
> > +     }
> > +
> > +     /* Find number of group index bits */
> > +     rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> > +                          &global->group_index_bits);
> > +     if (rc)
> > +             global->group_index_bits = 0;
> > +     tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > +           global->guest_index_bits - global->hart_index_bits;
> > +     if (tmp < global->group_index_bits) {
> > +             pr_err("%pfwP: group index bits too big\n", fwnode);
> > +             return -EINVAL;
> > +     }
> > +
> > +     /*
> > +      * Find first bit position of group index.
> > +      * If not specified assumed the default APLIC-IMSIC configuration.
> > +      */
> > +     rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> > +                          &global->group_index_shift);
> > +     if (rc)
> > +             global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> > +     tmp = global->group_index_bits + global->group_index_shift - 1;
> > +     if (tmp >= BITS_PER_LONG) {
> > +             pr_err("%pfwP: group index shift too big\n", fwnode);
> > +             return -EINVAL;
> > +     }
> > +
> > +     /* Find number of interrupt identities */
> > +     rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> > +                          &global->nr_ids);
> > +     if (rc) {
> > +             pr_err("%pfwP: number of interrupt identities not found\n",
> > +                     fwnode);
> > +             return rc;
> > +     }
> > +     if ((global->nr_ids < IMSIC_MIN_ID) ||
> > +         (global->nr_ids >= IMSIC_MAX_ID) ||
> > +         ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > +             pr_err("%pfwP: invalid number of interrupt identities\n",
> > +                     fwnode);
> > +             return -EINVAL;
> > +     }
> > +
> > +     /* Find number of guest interrupt identities */
> > +     if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> > +                         &global->nr_guest_ids))
> > +             global->nr_guest_ids = global->nr_ids;
> > +     if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> > +         (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> > +         ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > +             pr_err("%pfwP: invalid number of guest interrupt identities\n",
> > +                     fwnode);
> > +             return -EINVAL;
> > +     }
>
> Please split the whole guest stuff out. It is totally unused!

The number of guest IDs is used by the KVM RISC-V AIA support which
is in the pipeline. KVM RISC-V only needs imsic_get_global_config()
and imsic_get_local_config(), and "nr_guest_ids" is part of the
IMSIC global config.
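For context, the KVM side would only do something along these lines
(illustrative sketch, not part of this series; kvm_aia_nr_guest_ids is a
made-up name):

  const struct imsic_global_config *gc = imsic_get_global_config();

  if (!gc)
        return -ENODEV;

  /* Number of guest interrupt identities advertised by the platform */
  kvm_aia_nr_guest_ids = gc->nr_guest_ids;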

>
> I've stopped reading. This needs structure, cleanups and a bit of
> taste. Not a lot of that here at the moment.
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

It took a while to address all your comments and I also got
preempted by other stuff. Sorry for the delay.

Regards,
Anup


^ permalink raw reply	[flat|nested] 72+ messages in thread


* Re: [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver
  2023-05-01  8:28       ` Anup Patel
@ 2023-05-01  8:44         ` Marc Zyngier
  -1 siblings, 0 replies; 72+ messages in thread
From: Marc Zyngier @ 2023-05-01  8:44 UTC (permalink / raw)
  To: Anup Patel
  Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Thomas Gleixner,
	Rob Herring, Krzysztof Kozlowski, Atish Patra, Alistair Francis,
	linux-riscv, linux-kernel, devicetree

On Mon, 01 May 2023 09:28:16 +0100,
Anup Patel <anup@brainfault.org> wrote:
> 
> On Fri, Jan 13, 2023 at 3:40 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Tue, 03 Jan 2023 14:14:05 +0000,
> > Anup Patel <apatel@ventanamicro.com> wrote:
> > >
> > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > a new MSI controller for managing MSIs on a RISC-V platform. This new
> > > MSI controller is referred to as incoming message signaled interrupt
> > > controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> > > (For more details refer https://github.com/riscv/riscv-aia)
> >
> > And how about IPIs, which this driver seems to be concerned about?
> 
> Okay, I will mention about IPIs in the commit description.
> 
> >
> > >
> > > This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> > > platforms.
> > >
> > > Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> > > ---
> > >  drivers/irqchip/Kconfig             |   14 +-
> > >  drivers/irqchip/Makefile            |    1 +
> > >  drivers/irqchip/irq-riscv-imsic.c   | 1174 +++++++++++++++++++++++++++
> > >  include/linux/irqchip/riscv-imsic.h |   92 +++
> > >  4 files changed, 1280 insertions(+), 1 deletion(-)
> > >  create mode 100644 drivers/irqchip/irq-riscv-imsic.c
> > >  create mode 100644 include/linux/irqchip/riscv-imsic.h
> > >
> > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > index 9e65345ca3f6..a1315189a595 100644
> > > --- a/drivers/irqchip/Kconfig
> > > +++ b/drivers/irqchip/Kconfig
> > > @@ -29,7 +29,6 @@ config ARM_GIC_V2M
> > >
> > >  config GIC_NON_BANKED
> > >       bool
> > > -
> > >  config ARM_GIC_V3
> > >       bool
> > >       select IRQ_DOMAIN_HIERARCHY
> > > @@ -548,6 +547,19 @@ config SIFIVE_PLIC
> > >       select IRQ_DOMAIN_HIERARCHY
> > >       select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > >
> > > +config RISCV_IMSIC
> > > +     bool
> > > +     depends on RISCV
> > > +     select IRQ_DOMAIN_HIERARCHY
> > > +     select GENERIC_MSI_IRQ_DOMAIN
> > > +
> > > +config RISCV_IMSIC_PCI
> > > +     bool
> > > +     depends on RISCV_IMSIC
> > > +     depends on PCI
> > > +     depends on PCI_MSI
> > > +     default RISCV_IMSIC
> >
> > This should definitely tell you that this driver needs splitting.
> 
> The code under "#ifdef CONFIG_RISCV_IMSIC_PCI" is hardly 40 lines
> so I felt it was too small to deserve its own source file.

It at least needs its own patch.

> 
> >
> > > +
> > >  config EXYNOS_IRQ_COMBINER
> > >       bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> > >       depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > index 87b49a10962c..22c723cc6ec8 100644
> > > --- a/drivers/irqchip/Makefile
> > > +++ b/drivers/irqchip/Makefile
> > > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM)                      += irq-qcom-mpm.o
> > >  obj-$(CONFIG_CSKY_MPINTC)            += irq-csky-mpintc.o
> > >  obj-$(CONFIG_CSKY_APB_INTC)          += irq-csky-apb-intc.o
> > >  obj-$(CONFIG_RISCV_INTC)             += irq-riscv-intc.o
> > > +obj-$(CONFIG_RISCV_IMSIC)            += irq-riscv-imsic.o
> > >  obj-$(CONFIG_SIFIVE_PLIC)            += irq-sifive-plic.o
> > >  obj-$(CONFIG_IMX_IRQSTEER)           += irq-imx-irqsteer.o
> > >  obj-$(CONFIG_IMX_INTMUX)             += irq-imx-intmux.o
> > > diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> > > new file mode 100644
> > > index 000000000000..4c16b66738d6
> > > --- /dev/null
> > > +++ b/drivers/irqchip/irq-riscv-imsic.c
> > > @@ -0,0 +1,1174 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > + */
> > > +
> > > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > > +#include <linux/bitmap.h>
> > > +#include <linux/cpu.h>
> > > +#include <linux/interrupt.h>
> > > +#include <linux/io.h>
> > > +#include <linux/iommu.h>
> > > +#include <linux/irq.h>
> > > +#include <linux/irqchip.h>
> > > +#include <linux/irqchip/chained_irq.h>
> > > +#include <linux/irqchip/riscv-imsic.h>
> > > +#include <linux/irqdomain.h>
> > > +#include <linux/module.h>
> > > +#include <linux/msi.h>
> > > +#include <linux/of.h>
> > > +#include <linux/of_address.h>
> > > +#include <linux/of_irq.h>
> > > +#include <linux/pci.h>
> > > +#include <linux/platform_device.h>
> > > +#include <linux/spinlock.h>
> > > +#include <linux/smp.h>
> > > +#include <asm/hwcap.h>
> > > +
> > > +#define IMSIC_DISABLE_EIDELIVERY     0
> > > +#define IMSIC_ENABLE_EIDELIVERY              1
> > > +#define IMSIC_DISABLE_EITHRESHOLD    1
> > > +#define IMSIC_ENABLE_EITHRESHOLD     0
> > > +
> > > +#define imsic_csr_write(__c, __v)    \
> > > +do {                                 \
> > > +     csr_write(CSR_ISELECT, __c);    \
> > > +     csr_write(CSR_IREG, __v);       \
> > > +} while (0)
> > > +
> > > +#define imsic_csr_read(__c)          \
> > > +({                                   \
> > > +     unsigned long __v;              \
> > > +     csr_write(CSR_ISELECT, __c);    \
> > > +     __v = csr_read(CSR_IREG);       \
> > > +     __v;                            \
> > > +})
> > > +
> > > +#define imsic_csr_set(__c, __v)              \
> > > +do {                                 \
> > > +     csr_write(CSR_ISELECT, __c);    \
> > > +     csr_set(CSR_IREG, __v);         \
> > > +} while (0)
> > > +
> > > +#define imsic_csr_clear(__c, __v)    \
> > > +do {                                 \
> > > +     csr_write(CSR_ISELECT, __c);    \
> > > +     csr_clear(CSR_IREG, __v);       \
> > > +} while (0)
> > > +
> > > +struct imsic_mmio {
> > > +     phys_addr_t pa;
> > > +     void __iomem *va;
> > > +     unsigned long size;
> > > +};
> > > +
> > > +struct imsic_priv {
> > > +     /* Global configuration common for all HARTs */
> > > +     struct imsic_global_config global;
> > > +
> > > +     /* MMIO regions */
> > > +     u32 num_mmios;
> > > +     struct imsic_mmio *mmios;
> > > +
> > > +     /* Global state of interrupt identities */
> > > +     raw_spinlock_t ids_lock;
> > > +     unsigned long *ids_used_bimap;
> > > +     unsigned long *ids_enabled_bimap;
> > > +     unsigned int *ids_target_cpu;
> > > +
> > > +     /* Mask for connected CPUs */
> > > +     struct cpumask lmask;
> > > +
> > > +     /* IPI interrupt identity */
> > > +     u32 ipi_id;
> > > +     u32 ipi_lsync_id;
> > > +
> > > +     /* IRQ domains */
> > > +     struct irq_domain *base_domain;
> > > +     struct irq_domain *pci_domain;
> > > +     struct irq_domain *plat_domain;
> > > +};
> > > +
> > > +struct imsic_handler {
> > > +     /* Local configuration for given HART */
> > > +     struct imsic_local_config local;
> > > +
> > > +     /* Pointer to private context */
> > > +     struct imsic_priv *priv;
> > > +};
> > > +
> > > +static bool imsic_init_done;
> > > +
> > > +static int imsic_parent_irq;
> > > +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> > > +
> > > +const struct imsic_global_config *imsic_get_global_config(void)
> > > +{
> > > +     struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > > +
> > > +     if (!handler || !handler->priv)
> > > +             return NULL;
> > > +
> > > +     return &handler->priv->global;
> > > +}
> > > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > > +
> > > +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> > > +{
> > > +     struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > > +
> > > +     if (!handler || !handler->priv)
> > > +             return NULL;
> >
> > How can this happen?
> 
> These are redundant checks. I will drop.
> 
> >
> > > +
> > > +     return &handler->local;
> > > +}
> > > +EXPORT_SYMBOL_GPL(imsic_get_local_config);
> >
> > Why are these symbols exported? They have no user, so they shouldn't
> > even exist here. I also seriously doubt there is a valid use case for
> > exposing this information to the rest of the kernel.
> 
> The imsic_get_global_config() is used by the APLIC driver and the
> KVM RISC-V module, whereas imsic_get_local_config() is only used by
> KVM RISC-V.
> 
> The KVM RISC-V AIA irqchip patches are available in riscv_kvm_aia_v1
> branch at: https://github.com/avpatel/linux.git. I have not posted KVM RISC-V
> patches due to various interdependencies.

Then the symbols can wait, can't they? It'd make more sense if the
KVM-dependent bits were brought together with the KVM patches.

Even better, you'd use some level of abstraction between KVM and the
irqchip code. GIC makes some heavy use of irq_set_vcpu_affinity() as a
private API with KVM, and I'd suggest you look into something similar.
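
As a very rough sketch of that kind of abstraction (every name below
is hypothetical, not from the posted driver): the IMSIC irqchip could
implement the .irq_set_vcpu_affinity callback and KVM would hand over
its routing information through irq_set_vcpu_affinity() instead of
poking at the IMSIC configuration directly:

/* hypothetical cookie passed in by KVM */
struct imsic_vcpu_info {
	unsigned int cpu;		/* target HART */
	unsigned int guest_index;	/* guest interrupt file on that HART */
};

static int imsic_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
{
	struct imsic_vcpu_info *info = vcpu_info;

	if (!info)
		/* hypothetical helper: route the MSI back to the host file */
		return imsic_msi_retarget_to_host(d);

	/* hypothetical helper: rewrite the MSI address/data pair */
	return imsic_msi_retarget_to_guest(d, info->cpu, info->guest_index);
}

static struct irq_chip imsic_irq_chip = {
	.name			= "RISC-V IMSIC",
	.irq_set_vcpu_affinity	= imsic_irq_set_vcpu_affinity,
	/* remaining callbacks as in the posted driver */
};

With something along those lines, KVM would only ever call
irq_set_vcpu_affinity(host_irq, &info) or
irq_set_vcpu_affinity(host_irq, NULL), and the imsic_get_*_config()
exports would not be needed.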

[...]

> > > +#ifdef CONFIG_SMP
> > > +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> > > +{
> > > +     struct imsic_handler *handler;
> > > +     struct cpumask amask;
> > > +     int cpu;
> > > +
> > > +     cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> >
> > Can't this race against a CPU going down?
> 
> Yes, it can race if a CPU goes down while we are in this function,
> but this won't be a problem because imsic_starting_cpu() will
> unconditionally do imsic_ids_local_sync() when the CPU is brought
> up again. I will add a multiline comment block explaining this.

I'd rather you avoid the race instead of papering over it.
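
One way to actually close the window (sketch only; it assumes the
sync is not invoked with ids_lock held in atomic context, otherwise
the locking would have to be restructured first) is to walk the mask
under the CPU hotplug read lock:

static void imsic_id_smp_sync(struct imsic_priv *priv)
{
	struct imsic_handler *handler;
	struct cpumask amask;
	int self, cpu;

	cpus_read_lock();	/* no CPU can go offline while we iterate */
	self = get_cpu();
	cpumask_and(&amask, &priv->lmask, cpu_online_mask);
	for_each_cpu(cpu, &amask) {
		if (cpu == self)
			continue;
		handler = per_cpu_ptr(&imsic_handlers, cpu);
		writel(priv->ipi_lsync_id, handler->local.msi_va);
	}
	put_cpu();
	cpus_read_unlock();
}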

> 
> >
> > > +     for_each_cpu(cpu, &amask) {
> > > +             if (cpu == smp_processor_id())
> > > +                     continue;
> > > +
> > > +             handler = per_cpu_ptr(&imsic_handlers, cpu);
> > > +             if (!handler || !handler->priv || !handler->local.msi_va) {
> > > +                     pr_warn("CPU%d: handler not initialized\n", cpu);
> >
> > How many times are you going to do that? On each failing synchronisation?
> 
> My bad for adding these paranoid checks. I will remove these checks
> wherever possible.
> 
> >
> > > +                     continue;
> > > +             }
> > > +
> > > +             writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
> >
> > As I understand it, this is a "behind the scenes" IPI. Why isn't that
> > a *real* IPI?
> 
> Yes, that's correct. The ID enable bits are per-CPU and accessible
> only via CSRs, hence we have a special "behind the scenes" IPI to
> synchronize the state of the ID enable bits.

My question still stands: why isn't this a *real*, Linux visible IPI?
This sideband signalling makes everything hard to follow, hard to
debug, and screws up accounting.
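
A sketch of what that could look like (assuming the IMSIC base domain
is registered as an IPI-capable irqdomain; imsic_local_resync() below
is a stand-in for the per-CPU enable-bit resync done by
imsic_ids_local_sync() in the posted patch, and none of this is the
actual driver code):

static int imsic_lsync_virq;

static irqreturn_t imsic_lsync_handler(int irq, void *data)
{
	/* resync this CPU's enable bits from the global state */
	imsic_local_resync(this_cpu_ptr(&imsic_handlers)->priv);
	return IRQ_HANDLED;
}

static int imsic_lsync_init(struct imsic_priv *priv)
{
	int virq = irq_reserve_ipi(priv->base_domain, &priv->lmask);

	if (virq <= 0)
		return -ENODEV;

	imsic_lsync_virq = virq;
	return request_irq(virq, imsic_lsync_handler,
			   IRQF_PERCPU | IRQF_NO_THREAD,
			   "riscv-imsic lsync", NULL);
}

/* sending side: an accounted, Linux-visible IPI instead of a raw writel() */
static void imsic_ids_remote_sync(struct imsic_priv *priv)
{
	ipi_send_mask(imsic_lsync_virq, &priv->lmask);
}

That way the sync interrupt shows up in /proc/interrupts and goes
through the normal genirq accounting.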

> > Please split the whole guest stuff out. It is totally unused!
> 
> The number of guest IDs is used by KVM RISC-V AIA support, which
> is in the pipeline. KVM RISC-V only needs imsic_get_global_config()
> and imsic_get_local_config(). The "nr_guest_ids" is part of the
> IMSIC global config.

And yet it isn't needed for a minimal driver, which is what I'd like
to see at first. Shoving the kitchen sink into an initial patch isn't a
great way to get it merged.

      M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2023-05-01  8:46 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-03 14:14 [PATCH v2 0/9] Linux RISC-V AIA Support Anup Patel
2023-01-03 14:14 ` Anup Patel
2023-01-03 14:14 ` [PATCH v2 1/9] RISC-V: Add AIA related CSR defines Anup Patel
2023-01-03 14:14   ` Anup Patel
2023-01-04 23:07   ` Conor Dooley
2023-01-04 23:07     ` Conor Dooley
2023-01-09  5:09     ` Anup Patel
2023-01-09  5:09       ` Anup Patel
2023-01-17 20:42       ` Conor Dooley
2023-01-17 20:42         ` Conor Dooley
2023-01-27 11:58         ` Anup Patel
2023-01-27 11:58           ` Anup Patel
2023-01-27 14:20           ` Conor Dooley
2023-01-27 14:20             ` Conor Dooley
2023-01-03 14:14 ` [PATCH v2 2/9] RISC-V: Detect AIA CSRs from ISA string Anup Patel
2023-01-03 14:14   ` Anup Patel
2023-01-03 14:14 ` [PATCH v2 3/9] irqchip/riscv-intc: Add support for RISC-V AIA Anup Patel
2023-01-03 14:14   ` Anup Patel
2023-01-13  9:39   ` Marc Zyngier
2023-01-13  9:39     ` Marc Zyngier
2023-01-03 14:14 ` [PATCH v2 4/9] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller Anup Patel
2023-01-03 14:14   ` Anup Patel
2023-01-04 23:21   ` Conor Dooley
2023-01-04 23:21     ` Conor Dooley
2023-02-20  3:15     ` Anup Patel
2023-02-20  3:15       ` Anup Patel
2023-01-12 20:49   ` Rob Herring
2023-01-12 20:49     ` Rob Herring
2023-02-20  3:20     ` Anup Patel
2023-02-20  3:20       ` Anup Patel
2023-02-19 11:17   ` Vivian Wang
2023-02-19 11:17     ` Vivian Wang
2023-02-20  3:31     ` Anup Patel
2023-02-20  3:31       ` Anup Patel
2023-01-03 14:14 ` [PATCH v2 5/9] irqchip: Add RISC-V incoming MSI controller driver Anup Patel
2023-01-03 14:14   ` Anup Patel
2023-01-13 10:10   ` Marc Zyngier
2023-01-13 10:10     ` Marc Zyngier
2023-05-01  8:28     ` Anup Patel
2023-05-01  8:28       ` Anup Patel
2023-05-01  8:44       ` Marc Zyngier
2023-05-01  8:44         ` Marc Zyngier
     [not found]   ` <CAPqJEFqhd-=-RYepKqnco7HySoxk7AhEctL+vzNozMSWe0mv7A@mail.gmail.com>
     [not found]     ` <CABvJ_xhcuC92A_oo1mWQoRvtRzE8XXx9bbXKs7N7wKm0=Z3_Cw@mail.gmail.com>
2023-01-18  3:49       ` Fwd: " Vincent Chen
2023-01-18  3:49         ` Vincent Chen
2023-01-18  4:20         ` Anup Patel
2023-01-18  4:20           ` Anup Patel
2023-01-03 14:14 ` [PATCH v2 6/9] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC Anup Patel
2023-01-03 14:14   ` Anup Patel
2023-01-04 22:16   ` Conor Dooley
2023-01-04 22:16     ` Conor Dooley
2023-02-20  4:36     ` Anup Patel
2023-02-20  4:36       ` Anup Patel
2023-02-20 10:32       ` Conor Dooley
2023-02-20 10:32         ` Conor Dooley
2023-02-20 10:56         ` Conor Dooley
2023-02-20 10:56           ` Conor Dooley
2023-01-12 21:02   ` Rob Herring
2023-01-12 21:02     ` Rob Herring
2023-02-19 11:48   ` Vivian Wang
2023-02-19 11:48     ` Vivian Wang
2023-02-20  5:09     ` Anup Patel
2023-02-20  5:09       ` Anup Patel
2023-01-03 14:14 ` [PATCH v2 7/9] irqchip: Add RISC-V advanced PLIC driver Anup Patel
2023-01-03 14:14   ` Anup Patel
     [not found]   ` <CAPqJEFpmAvWiOdackxYwSPBfjo4DnTHXrXVSCC4snMn8tnZXPw@mail.gmail.com>
     [not found]     ` <CABvJ_xhjMa8xTsO-Qa23TOqxPpYxyBYSfV6TmKney-Gp3oi8cA@mail.gmail.com>
2023-01-17  7:09       ` Fwd: " Vincent Chen
2023-01-17  7:09         ` Vincent Chen
2023-01-18  4:37         ` Anup Patel
2023-01-18  4:37           ` Anup Patel
2023-01-03 14:14 ` [PATCH v2 8/9] RISC-V: Select APLIC and IMSIC drivers Anup Patel
2023-01-03 14:14   ` Anup Patel
2023-01-03 14:14 ` [PATCH v2 9/9] MAINTAINERS: Add entry for RISC-V AIA drivers Anup Patel
2023-01-03 14:14   ` Anup Patel
