* [RFC Part2 v1 00/21] Enable hierarchy irqdomain on x86 platforms
@ 2014-09-11 14:03 ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

We plan to restructure the x86 interrupt code around hierarchy irqdomains:
build irqdomains for the CPU vector, interrupt remapping unit, IOAPIC, MSI,
HPET and so on, and organize those irqdomains as a hierarchy. Each
irqdomain manages its corresponding interrupt controller and talks to the
parent interrupt controller through public irqdomain interfaces. We also
support stacked irq_chips on top of hierarchy irqdomains. With hierarchy
irqdomains and stacked irq_chips, the x86 interrupt architecture becomes
clearer and easier to maintain. It may also help the ARM interrupt
management architecture.
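
As a rough illustration of the hierarchy described above, here is a small
user-space model (not code from this series). struct model_domain and the
three example domains are hypothetical stand-ins for struct irq_domain and
its new parent pointer; the MSI -> interrupt remapping -> CPU vector chain
is just one arrangement assumed for the example.

```c
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's struct irq_domain. */
struct model_domain {
	const char *name;
	struct model_domain *parent;	/* NULL for the root domain */
};

/* Walk the parent pointers up to the root domain. */
static struct model_domain *model_domain_root(struct model_domain *d)
{
	while (d && d->parent)
		d = d->parent;
	return d;
}

/* Count the domains from d up to the root, inclusive. */
static unsigned int model_domain_depth(const struct model_domain *d)
{
	unsigned int depth = 0;

	for (; d; d = d->parent)
		depth++;
	return depth;
}

/* Assumed example chain: MSI -> interrupt remapping -> CPU vector. */
static struct model_domain vector_domain = { "VECTOR", NULL };
static struct model_domain remap_domain  = { "IR", &vector_domain };
static struct model_domain msi_domain    = { "MSI", &remap_domain };
```

In this model, model_domain_root(&msi_domain) yields the vector domain,
which plays the role of the x86 root irqdomain.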

This is the second patch set to enable support of hierarchy irqdomain
on x86 platforms. It depends on the first part at:
https://lkml.org/lkml/2014/9/11/101 
And you may access it at:
https://github.com/jiangliu/linux.git irqdomain/p2v1

A third patch set will convert the IOAPIC driver to support hierarchy
irqdomain and clean up the code.

The first patch extends the irqdomain interfaces to support hierarchy
irqdomains. We hope this interface can also be used by other
architectures, such as ARM/ARM64.
The second patch introduces two helper functions to support stacked
irq_chips.
Patches 3-9 implement an irqdomain to manage CPU interrupt vectors; it is
the root irqdomain for x86 platforms.
Patches 10-13 convert the Intel and AMD interrupt remapping drivers to
support hierarchy irqdomains.
Patches 14-17 convert the HPET, MSI and HT_IRQ drivers to support
hierarchy irqdomains.
Patches 18-21 clean up unused code in the x86 arch code and the interrupt
remapping drivers.

We have tested this patch set on Intel 32-bit and 64-bit systems. The
HT_IRQ and AMD interrupt remapping drivers have only been compile-tested
because we lack the hardware. Tests on AMD platforms are warmly welcomed!

Jiang Liu (21):
  irqdomain: Introduce new interfaces to support hierarchy irqdomains
  genirq: Introduce helper functions to support stacked irq_chip
  x86, irq: Save destination CPU ID in irq_cfg
  x86, irq: Use hierarchy irqdomain to manage CPU interrupt vectors
  x86, hpet: Use new irqdomain interfaces to allocate/free IRQ
  x86, MSI: Use new irqdomain interfaces to allocate/free IRQ
  x86, uv: Use new irqdomain interfaces to allocate/free IRQ
  x86, htirq: Use new irqdomain interfaces to allocate/free IRQ
  x86, dmar: Use new irqdomain interfaces to allocate/free IRQ
  x86: irq_remapping: Introduce new interfaces to support hierarchy
    irqdomain
  iommu/vt-d: Change prototypes to prepare for enabling hierarchy
    irqdomain
  iommu/vt-d: Enhance Intel IR driver to support hierarchy irqdomain
  iommu/amd: Enhance AMD IR driver to support hierarchy irqdomain
  x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
  x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
  x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ
  x86, htirq: Use hierarchy irqdomain to manage Hypertransport
    interrupts
  iommu/vt-d: Clean up unused MSI related code
  iommu/amd: Clean up unused MSI related code
  x86: irq_remapping: Clean up unused MSI related code
  x86, irq: Clean up unused MSI related code and interfaces

 arch/x86/Kconfig                     |    3 +-
 arch/x86/include/asm/hpet.h          |   16 +-
 arch/x86/include/asm/hw_irq.h        |   64 +++++
 arch/x86/include/asm/irq_remapping.h |   66 +++--
 arch/x86/include/asm/pci.h           |    5 -
 arch/x86/include/asm/x86_init.h      |    4 -
 arch/x86/kernel/apic/htirq.c         |  179 +++++++++----
 arch/x86/kernel/apic/io_apic.c       |    3 -
 arch/x86/kernel/apic/msi.c           |  430 +++++++++++++++++++++++--------
 arch/x86/kernel/apic/vector.c        |  158 +++++++++++-
 arch/x86/kernel/hpet.c               |   57 ++---
 arch/x86/kernel/x86_init.c           |    2 -
 arch/x86/platform/uv/uv_irq.c        |   27 +-
 drivers/iommu/amd_iommu.c            |  385 ++++++++++++++++++++++------
 drivers/iommu/amd_iommu_init.c       |    4 +
 drivers/iommu/amd_iommu_proto.h      |    9 +
 drivers/iommu/amd_iommu_types.h      |    5 +
 drivers/iommu/intel_irq_remapping.c  |  468 +++++++++++++++++++++++-----------
 drivers/iommu/irq_remapping.c        |  221 ++++++----------
 drivers/iommu/irq_remapping.h        |   22 +-
 drivers/pci/htirq.c                  |   48 +---
 include/linux/htirq.h                |   22 +-
 include/linux/intel-iommu.h          |    4 +
 include/linux/irq.h                  |    8 +
 include/linux/irqdomain.h            |   60 +++++
 kernel/irq/Kconfig                   |    3 +
 kernel/irq/chip.c                    |   21 ++
 kernel/irq/irqdomain.c               |  349 ++++++++++++++++++++++++-
 28 files changed, 1934 insertions(+), 709 deletions(-)

-- 
1.7.10.4


* [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

We plan to use hierarchy irqdomains to support CPU vector assignment,
the interrupt remapping controller, the IO-APIC controller, MSI
interrupts, HyperTransport interrupts and so on for x86 platforms. So
extend the irqdomain interfaces to support hierarchy irqdomains.

There are already many clients of the current irqdomain interfaces.
To minimize changes, we introduce new version 2 interfaces to support
hierarchy instead of extending the existing irqdomain interfaces.

Following Thomas's suggestion, the most important design decision is
to build a hierarchy of struct irq_data, so that the data belonging to
each level of the irqdomain hierarchy can be saved in its own struct
irq_data. With hierarchy irq_data, we can also support stacked
irq_chips. This is most useful for set_affinity().
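
To make the irq_data chain and the set_affinity() case concrete, here is a
user-space sketch (a simplified model, not the kernel code in this patch).
All model_* names, the dest_cpu field and the two example chips are
hypothetical; only the parent_data link and the lookup-by-domain walk
mirror what the patch adds.

```c
#include <stddef.h>

struct model_irq_data;

/* Stand-in for struct irq_chip with a single callback. */
struct model_chip {
	const char *name;
	int (*set_affinity)(struct model_irq_data *d, unsigned int cpu);
};

/* Stand-in for struct irq_data: one instance per hierarchy level. */
struct model_irq_data {
	unsigned long hwirq;
	struct model_chip *chip;
	const void *domain;			/* opaque domain token */
	struct model_irq_data *parent_data;	/* next level toward the root */
	unsigned int dest_cpu;			/* illustrative per-level state */
};

/* Find the irq_data that belongs to @domain, starting at the outermost. */
static struct model_irq_data *
model_get_irq_data(struct model_irq_data *outer, const void *domain)
{
	for (; outer; outer = outer->parent_data)
		if (outer->domain == domain)
			return outer;
	return NULL;
}

/* Root chip: only records the target CPU. */
static int vector_set_affinity(struct model_irq_data *d, unsigned int cpu)
{
	d->dest_cpu = cpu;
	return 0;
}

/* Stacked chip: let the parent level decide first, then update itself. */
static int msi_set_affinity(struct model_irq_data *d, unsigned int cpu)
{
	struct model_irq_data *p = d->parent_data;
	int ret;

	if (!p || !p->chip || !p->chip->set_affinity)
		return -1;
	ret = p->chip->set_affinity(p, cpu);
	if (ret == 0)
		d->dest_cpu = cpu;	/* e.g. rewrite the MSI message here */
	return ret;
}

static struct model_chip vector_chip = { "VECTOR", vector_set_affinity };
static struct model_chip msi_chip    = { "MSI", msi_set_affinity };
```

The point of the sketch is that the outer chip does not need to know how
the parent picks a CPU; it only delegates and then updates its own level.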

The new hierarchy irqdomain support introduces the following interfaces:
1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQs
   and related resources.
2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
   interrupt controllers to activate/deactivate an interrupt.

There are also several helper functions to ease irqdomain implementations:
1) irq_domain_get_irq_data(): get the irq_data associated with a specific
   irqdomain.
2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
   irq_data.
3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
   the parent irqdomain's alloc/free callbacks.
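
The way a child domain's alloc/free callbacks are expected to chain to the
parent can be sketched in user space as follows. This is a simplified model
under stated assumptions, not the kernel implementation: the model_* types,
the fixed-size tables and the 0x20 vector base are made up for the example.
Only the "call the parent's alloc, return -ENOSYS if there is none" shape
follows irq_domain_alloc_irqs_parent() in this patch.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_IRQS 8

struct model_domain;

struct model_domain_ops {
	int (*alloc)(struct model_domain *d, unsigned int virq,
		     unsigned int nr_irqs, void *arg);
	void (*free)(struct model_domain *d, unsigned int virq,
		     unsigned int nr_irqs);
};

struct model_domain {
	const struct model_domain_ops *ops;
	struct model_domain *parent;
	unsigned long hwirq[MODEL_MAX_IRQS];	/* per-virq hwirq at this level */
	int in_use[MODEL_MAX_IRQS];
};

/* Same shape as irq_domain_alloc_irqs_parent() in the patch. */
static int model_alloc_parent(struct model_domain *d, unsigned int virq,
			      unsigned int nr_irqs, void *arg)
{
	if (d->parent && d->parent->ops->alloc)
		return d->parent->ops->alloc(d->parent, virq, nr_irqs, arg);
	return -ENOSYS;
}

static void model_free_parent(struct model_domain *d, unsigned int virq,
			      unsigned int nr_irqs)
{
	if (d->parent && d->parent->ops->free)
		d->parent->ops->free(d->parent, virq, nr_irqs);
}

/* Mark or clear this level's bookkeeping for a range of virqs. */
static void model_mark(struct model_domain *d, unsigned int virq,
		       unsigned int nr_irqs, unsigned long hwirq_base, int used)
{
	unsigned int i;

	for (i = 0; i < nr_irqs && virq + i < MODEL_MAX_IRQS; i++) {
		d->in_use[virq + i] = used;
		d->hwirq[virq + i] = used ? hwirq_base + virq + i : 0;
	}
}

/* Root domain: owns the scarce resource (0x20 is an arbitrary base). */
static int root_alloc(struct model_domain *d, unsigned int virq,
		      unsigned int nr_irqs, void *arg)
{
	(void)arg;
	if (virq + nr_irqs > MODEL_MAX_IRQS)
		return -ENOSPC;
	model_mark(d, virq, nr_irqs, 0x20, 1);
	return 0;
}

static void root_free(struct model_domain *d, unsigned int virq,
		      unsigned int nr_irqs)
{
	model_mark(d, virq, nr_irqs, 0, 0);
}

/* Child domain: allocate from the parent first, then set up this level. */
static int child_alloc(struct model_domain *d, unsigned int virq,
		       unsigned int nr_irqs, void *arg)
{
	int ret = model_alloc_parent(d, virq, nr_irqs, arg);

	if (ret < 0)
		return ret;
	model_mark(d, virq, nr_irqs, 0, 1);
	return 0;
}

static void child_free(struct model_domain *d, unsigned int virq,
		       unsigned int nr_irqs)
{
	model_mark(d, virq, nr_irqs, 0, 0);
	model_free_parent(d, virq, nr_irqs);
}

static const struct model_domain_ops root_ops  = { root_alloc, root_free };
static const struct model_domain_ops child_ops = { child_alloc, child_free };
```

In the real interfaces the per-level state would live in the irq_data
obtained through irq_domain_get_irq_data() and be filled in with
irq_domain_set_hwirq_and_chip(); the arrays here only stand in for that.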

We also changed irq_startup()/irq_shutdown() to invoke
irq_domain_activate_irq()/irq_domain_deactivate_irq() to program the
interrupt controllers when starting/stopping interrupts.
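
The ordering used by the two calls can be modelled in user space like this
(a sketch, not the kernel code; model_irq_data, the level numbers and the
log array are hypothetical). As in the patch, activation recurses to the
parent before touching the current level, and deactivation handles the
current level before the parent.

```c
#include <stddef.h>

/* One node per hierarchy level; level 0 is the root in this model. */
struct model_irq_data {
	int level;
	struct model_irq_data *parent_data;
};

#define MODEL_LOG_MAX 8
static int model_log[MODEL_LOG_MAX];	/* order in which levels were visited */
static unsigned int model_log_len;

static void model_log_level(int level)
{
	if (model_log_len < MODEL_LOG_MAX)
		model_log[model_log_len++] = level;
}

/* Parent first, then this level: the root ends up programmed first. */
static int model_activate(struct model_irq_data *d)
{
	int ret = 0;

	if (!d)
		return 0;
	if (d->parent_data)
		ret = model_activate(d->parent_data);
	if (ret == 0)
		model_log_level(d->level);
	return ret;
}

/* This level first, then the parent: the root is torn down last. */
static int model_deactivate(struct model_irq_data *d)
{
	if (!d)
		return 0;
	model_log_level(d->level);
	return model_deactivate(d->parent_data);
}
```

For a three-level chain (outer level 2, middle level 1, root level 0) the
model visits 0, 1, 2 on activation and 2, 1, 0 on deactivation.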

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 include/linux/irq.h       |    3 +
 include/linux/irqdomain.h |   60 ++++++++
 kernel/irq/Kconfig        |    3 +
 kernel/irq/chip.c         |    3 +
 kernel/irq/irqdomain.c    |  349 +++++++++++++++++++++++++++++++++++++++++++--
 5 files changed, 404 insertions(+), 14 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62af59242ddc..4b74565690ce 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -151,6 +151,9 @@ struct irq_data {
 	unsigned int		state_use_accessors;
 	struct irq_chip		*chip;
 	struct irq_domain	*domain;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_data		*parent_data;
+#endif
 	void			*handler_data;
 	void			*chip_data;
 	struct msi_desc		*msi_desc;
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index b0f9d16e48f6..a9ddc8534c63 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -38,6 +38,7 @@
 struct device_node;
 struct irq_domain;
 struct of_device_id;
+struct irq_chip;
 
 /* Number of irqs reserved for a legacy isa controller */
 #define NUM_ISA_INTERRUPTS	16
@@ -64,6 +65,16 @@ struct irq_domain_ops {
 	int (*xlate)(struct irq_domain *d, struct device_node *node,
 		     const u32 *intspec, unsigned int intsize,
 		     unsigned long *out_hwirq, unsigned int *out_type);
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	/* extended V2 interfaces to support hierarchy irqdomains */
+	int (*alloc)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs, void *arg);
+	void (*free)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs);
+	int (*activate)(struct irq_domain *d, struct irq_data *irq_data);
+	int (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
+#endif
 };
 
 extern struct irq_domain_ops irq_generic_chip_ops;
@@ -101,6 +112,9 @@ struct irq_domain {
 	/* Optional data */
 	struct device_node *of_node;
 	struct irq_domain_chip_generic *gc;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_domain *parent;
+#endif
 
 	/* reverse map data. The linear map gets appended to the irq_domain */
 	irq_hw_number_t hwirq_max;
@@ -220,8 +234,54 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d, struct device_node *ctrlr,
 			const u32 *intspec, unsigned int intsize,
 			irq_hw_number_t *out_hwirq, unsigned int *out_type);
 
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+/* V2 interfaces to support hierarchy IRQ domains. */
+extern struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+						unsigned int virq);
+extern int irq_domain_set_hwirq_and_chip(struct irq_domain *domain,
+					 unsigned int virq,
+					 irq_hw_number_t hwirq,
+					 struct irq_chip *chip,
+					 void *chip_data);
+extern void irq_domain_reset_irq_data(struct irq_data *irq_data);
+extern int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+				   unsigned int nr_irqs, int node, void *arg,
+				   bool realloc);
+extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
+extern int irq_domain_activate_irq(struct irq_data *irq_data);
+extern int irq_domain_deactivate_irq(struct irq_data *irq_data);
+
+static inline int irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+				unsigned int nr_irqs, int node, void *arg)
+{
+	return __irq_domain_alloc_irqs(domain, irq_base, nr_irqs, node,
+				       arg, false);
+}
+
+static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
+				int irq_base, unsigned int nr_irqs, void *arg)
+{
+	if (domain->parent && domain->parent->ops->alloc)
+		return domain->parent->ops->alloc(domain->parent, irq_base,
+						  nr_irqs, arg);
+	return -ENOSYS;
+}
+
+static inline void irq_domain_free_irqs_parent(struct irq_domain *domain,
+					int irq_base, unsigned int nr_irqs)
+{
+	if (domain->parent && domain->parent->ops->free)
+		domain->parent->ops->free(domain->parent, irq_base, nr_irqs);
+}
+#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+
 #else /* CONFIG_IRQ_DOMAIN */
 static inline void irq_dispose_mapping(unsigned int virq) { }
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
 #endif /* !CONFIG_IRQ_DOMAIN */
 
 #endif /* _LINUX_IRQDOMAIN_H */
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index d269cecdfbf0..dc1f3d08892e 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -55,6 +55,9 @@ config GENERIC_IRQ_CHIP
 config IRQ_DOMAIN
 	bool
 
+config IRQ_DOMAIN_HIERARCHY
+	bool
+
 config IRQ_DOMAIN_DEBUG
 	bool "Expose hardware/virtual IRQ mapping via debugfs"
 	depends on IRQ_DOMAIN && DEBUG_FS
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6223fab9a9d2..46bd5e2190c3 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
+#include <linux/irqdomain.h>
 
 #include <trace/events/irq.h>
 
@@ -178,6 +179,7 @@ int irq_startup(struct irq_desc *desc, bool resend)
 	irq_state_clr_disabled(desc);
 	desc->depth = 0;
 
+	irq_domain_activate_irq(&desc->irq_data);
 	if (desc->irq_data.chip->irq_startup) {
 		ret = desc->irq_data.chip->irq_startup(&desc->irq_data);
 		irq_state_clr_masked(desc);
@@ -199,6 +201,7 @@ void irq_shutdown(struct irq_desc *desc)
 		desc->irq_data.chip->irq_disable(&desc->irq_data);
 	else
 		desc->irq_data.chip->irq_mask(&desc->irq_data);
+	irq_domain_deactivate_irq(&desc->irq_data);
 	irq_state_set_masked(desc);
 }
 
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 6534ff6ce02e..e285f3abc595 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -23,6 +23,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
 static DEFINE_MUTEX(revmap_trees_mutex);
 static struct irq_domain *irq_default_domain;
 
+static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
+				  irq_hw_number_t hwirq, int node);
+
 /**
  * __irq_domain_add() - Allocate a new irq_domain data structure
  * @of_node: optional device-tree node of the interrupt controller
@@ -30,7 +33,7 @@ static struct irq_domain *irq_default_domain;
  * @hwirq_max: Maximum number of interrupts supported by controller
  * @direct_max: Maximum value of direct maps; Use ~0 for no limit; 0 for no
  *              direct mapping
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates and initialize and irq_domain structure.
@@ -109,7 +112,7 @@ EXPORT_SYMBOL_GPL(irq_domain_remove);
  * @first_irq: first number of irq block assigned to the domain,
  *	pass zero to assign irqs on-the-fly. If first_irq is non-zero, then
  *	pre-map all of the irqs in the domain to virqs starting at first_irq.
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates an irq_domain, and optionally if first_irq is positive then also
@@ -174,10 +177,8 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
 
 	domain = __irq_domain_add(of_node, first_hwirq + size,
 				  first_hwirq + size, 0, ops, host_data);
-	if (!domain)
-		return NULL;
-
-	irq_domain_associate_many(domain, first_irq, first_hwirq, size);
+	if (domain)
+		irq_domain_associate_many(domain, first_irq, first_hwirq, size);
 
 	return domain;
 }
@@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
 unsigned int irq_create_mapping(struct irq_domain *domain,
 				irq_hw_number_t hwirq)
 {
-	unsigned int hint;
 	int virq;
 
 	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
@@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 	}
 
 	/* Allocate a virtual interrupt number */
-	hint = hwirq % nr_irqs;
-	if (hint == 0)
-		hint++;
-	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
-	if (virq <= 0)
-		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
+	virq = irq_domain_alloc_descs(-1, 1, hwirq,
+				      of_node_to_nid(domain->of_node));
 	if (virq <= 0) {
 		pr_debug("-> virq allocation failed\n");
 		return 0;
@@ -490,7 +486,13 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
 	}
 
 	/* Create mapping */
-	virq = irq_create_mapping(domain, hwirq);
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	if (domain->ops->alloc)
+		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
+					     irq_data);
+	else
+#endif
+		virq = irq_create_mapping(domain, hwirq);
 	if (!virq)
 		return virq;
 
@@ -540,7 +542,11 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
 		return 0;
 
 	if (hwirq < domain->revmap_direct_max_irq) {
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+		data = irq_domain_get_irq_data(domain, hwirq);
+#else
 		data = irq_get_irq_data(hwirq);
+#endif
 		if (data && (data->domain == domain) && (data->hwirq == hwirq))
 			return hwirq;
 	}
@@ -709,3 +715,318 @@ const struct irq_domain_ops irq_domain_simple_ops = {
 	.xlate = irq_domain_xlate_onetwocell,
 };
 EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
+
+static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
+				  irq_hw_number_t hwirq, int node)
+{
+	unsigned int hint;
+
+	if (virq >= 0) {
+		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
+	} else {
+		hint = hwirq % nr_irqs;
+		if (hint == 0)
+			hint++;
+		virq = irq_alloc_descs_from(hint, nr_irqs, node);
+		if (virq <= 0 && hint > 1)
+			virq = irq_alloc_descs_from(1, nr_irqs, node);
+	}
+
+	return virq;
+}
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_free_desc(virq + i);
+}
+
+static void irq_domain_insert_irq(int virq)
+{
+	struct irq_data *data;
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = virq;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_insert(&domain->revmap_tree, hwirq, data);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+
+		/* If not already assigned, give the domain the chip's name */
+		if (!domain->name && data->chip)
+			domain->name = data->chip->name;
+	}
+
+	irq_clear_status_flags(virq, IRQ_NOREQUEST);
+}
+
+static void irq_domain_remove_irq(int virq)
+{
+	struct irq_data *data;
+
+	irq_set_status_flags(virq, IRQ_NOREQUEST);
+	irq_set_chip_and_handler(virq, NULL, NULL);
+	synchronize_irq(virq);
+	smp_mb();
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = 0;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_delete(&domain->revmap_tree, hwirq);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+	}
+}
+
+static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
+						   struct irq_data *child)
+{
+	struct irq_data *irq_data;
+
+	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
+	if (irq_data) {
+		child->parent_data = irq_data;
+		irq_data->irq = child->irq;
+		irq_data->node = child->node;
+		irq_data->domain = domain;
+	}
+
+	return irq_data;
+}
+
+static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data, *tmp;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		tmp = irq_data->parent_data;
+		irq_data->parent_data = NULL;
+		irq_data->domain = NULL;
+
+		while (tmp) {
+			irq_data = tmp;
+			tmp = tmp->parent_data;
+			kfree(irq_data);
+		}
+	}
+}
+
+static int irq_domain_alloc_irq_data(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data;
+	struct irq_domain *parent;
+
+	/* The outermost irq_data is embedded in struct irq_desc */
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		irq_data->domain = domain;
+
+		for (parent = domain->parent; parent; parent = parent->parent) {
+			irq_data = irq_domain_insert_irq_data(parent, irq_data);
+			if (!irq_data) {
+				irq_domain_free_irq_data(virq, i + 1);
+				return -ENOMEM;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
+ * @domain: domain to match
+ * @virq: IRQ number to get irq_data
+ */
+struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+					 unsigned int virq)
+{
+	struct irq_data *irq_data;
+
+	for (irq_data = irq_get_irq_data(virq); irq_data;
+	     irq_data = irq_data->parent_data)
+		if (irq_data->domain == domain)
+			return irq_data;
+
+	return NULL;
+}
+
+int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
+				  irq_hw_number_t hwirq, struct irq_chip *chip,
+				  void *chip_data)
+{
+	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
+
+	if (!irq_data)
+		return -ENOENT;
+
+	irq_data->hwirq = hwirq;
+	irq_data->chip = chip;
+	irq_data->chip_data = chip_data;
+
+	return 0;
+}
+
+void irq_domain_reset_irq_data(struct irq_data *irq_data)
+{
+	irq_data->hwirq = 0;
+	irq_data->chip = NULL;
+	irq_data->chip_data = NULL;
+}
+
+/**
+ * __irq_domain_alloc_irqs - Allocate IRQs from domain
+ * @domain: domain to allocate from
+ * @irq_base: allocate specified IRQ number if irq_base >= 0
+ * @nr_irqs: number of IRQs to allocate
+ * @node: NUMA node id for memory allocation
+ * @arg: domain specific argument
+ * @realloc: IRQ descriptors have already been allocated if true
+ *
+ * Allocate IRQ numbers and initialize all data structures to support
+ * hierarchy IRQ domains.
+ * Parameter @realloc is mainly to support legacy IRQs.
+ * Returns error code or allocated IRQ number
+ */
+int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+			    unsigned int nr_irqs, int node, void *arg,
+			    bool realloc)
+{
+	int i, ret, virq;
+
+	if (domain == NULL) {
+		domain = irq_default_domain;
+		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
+			return -EINVAL;
+	}
+
+	if (!domain->ops->alloc) {
+		pr_debug("domain->ops->alloc() is NULL\n");
+		return -ENOSYS;
+	}
+
+	if (realloc && irq_base >= 0) {
+		virq =  irq_base;
+	} else {
+		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
+		if (virq < 0) {
+			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
+				 irq_base, nr_irqs);
+			return virq;
+		}
+	}
+
+	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
+		pr_debug("cannot allocate memory for IRQ%d\n", virq);
+		ret = -ENOMEM;
+		goto out_free_desc;
+	}
+
+	mutex_lock(&irq_domain_mutex);
+	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
+	if (ret < 0) {
+		mutex_unlock(&irq_domain_mutex);
+		goto out_free_irq_data;
+	}
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_insert_irq(virq + i);
+	mutex_unlock(&irq_domain_mutex);
+
+	return virq;
+
+out_free_irq_data:
+	irq_domain_free_irq_data(virq, nr_irqs);
+out_free_desc:
+	irq_domain_free_descs(virq, nr_irqs);
+	return ret;
+}
+
+/**
+ * irq_domain_free_irqs - Free IRQ number and associated data structures
+ * @virq: base IRQ number
+ * @nr_irqs: number of IRQs to free
+ */
+void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *data = irq_get_irq_data(virq);
+
+	if (WARN(!data || !data->domain || !data->domain->ops->free,
+		 "NULL pointer, cannot free irq\n"))
+		return;
+
+	mutex_lock(&irq_domain_mutex);
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_remove_irq(virq + i);
+	data->domain->ops->free(data->domain, virq, nr_irqs);
+	mutex_unlock(&irq_domain_mutex);
+
+	irq_domain_free_irq_data(virq, nr_irqs);
+	irq_domain_free_descs(virq, nr_irqs);
+}
+
+/**
+ * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
+ *			     interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->activate to program interrupt controllers, so the
+ * interrupt can actually be delivered.
+ */
+int irq_domain_activate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (irq_data->parent_data)
+			ret = irq_domain_activate_irq(irq_data->parent_data);
+		if (ret == 0 && domain->ops->activate)
+			ret = domain->ops->activate(domain, irq_data);
+	}
+
+	return ret;
+}
+
+/**
+ * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
+ *			       deactivate interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->deactivate to program interrupt controllers to disable
+ * interrupt delivery.
+ */
+int irq_domain_deactivate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (domain->ops->deactivate)
+			ret = domain->ops->deactivate(domain, irq_data);
+		if (ret == 0 && irq_data->parent_data)
+			ret = irq_domain_deactivate_irq(irq_data->parent_data);
+	}
+
+	return ret;
+}
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

We plan to use hierarchy irqdomain to suppport CPU vector assignment,
interrupt remapping controller, IO-APIC controller, MSI interrupt
and hypertransport interrupt etc on x86 platforms. So extend irqdomain
interfaces to support hierarchy irqdomain.

There are already many clients of current irqdomain interfaces.
To minimize the changes, we choose to introduce new version 2 interfaces
to support hierarchy instead of extending existing irqdomain interfaces.

According to Thomas's suggestion, the most important design decision is
to build hierarchy struct irq_data to support hierarchy irqdomain, so
hierarchy irqdomain related data could be saved in struct irq_data.
With support of hierarchy irq_data, we could also support stacked
irq_chips. This is most useful in case of set_affinity().

The new hierarchy irqdomain introduces following interfaces:
1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQ
   and related resources.
2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
   interrupt controllers to activate/deactivate interrupt.

There are also several help functions to ease irqdomain implemenations:
1) irq_domain_get_irq_data(): get irq_data associated with a specific
   irqdomain.
2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
   irq_data.
3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
   parent irqdomain's alloc/free callbacks.

We also changed irq_startup()/irq_shutdown() to invoke
irq_domain_activate_irq()/irq_domain_deactivate_irq() to program the
interrupt controllers when starting/stopping interrupts.
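
The ordering used by the activate/deactivate paths can be sketched in
userspace as follows. The sketch_* names and the trace array are
assumptions made for illustration only; the recursion mirrors the two
functions added by this patch: activation programs the parent first,
deactivation starts at the outermost level.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_TRACE_MAX 8

struct sketch_irq_data;

struct sketch_domain {
	const char *name;
	/* Optional callbacks, mirroring ->activate / ->deactivate. */
	int (*activate)(struct sketch_domain *d, struct sketch_irq_data *data);
	int (*deactivate)(struct sketch_domain *d, struct sketch_irq_data *data);
};

struct sketch_irq_data {
	struct sketch_domain *domain;
	struct sketch_irq_data *parent_data;
};

/* Records the order in which domains were programmed. */
const char *sketch_trace[SKETCH_TRACE_MAX];
unsigned int sketch_trace_len;

/* Hypothetical callback that only logs which domain was called. */
int sketch_record(struct sketch_domain *d, struct sketch_irq_data *data)
{
	(void)data;
	assert(sketch_trace_len < SKETCH_TRACE_MAX);
	sketch_trace[sketch_trace_len++] = d->name;
	return 0;
}

/* Parent first: the root controller is programmed before its children. */
int sketch_activate(struct sketch_irq_data *data)
{
	int ret = 0;

	if (data && data->domain) {
		if (data->parent_data)
			ret = sketch_activate(data->parent_data);
		if (ret == 0 && data->domain->activate)
			ret = data->domain->activate(data->domain, data);
	}
	return ret;
}

/* Child first: tear down in the reverse order of activation. */
int sketch_deactivate(struct sketch_irq_data *data)
{
	int ret = 0;

	if (data && data->domain) {
		if (data->domain->deactivate)
			ret = data->domain->deactivate(data->domain, data);
		if (ret == 0 && data->parent_data)
			ret = sketch_deactivate(data->parent_data);
	}
	return ret;
}
```

Domains without callbacks are simply skipped, so a middle layer that
needs no programming does not interrupt the walk.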

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 include/linux/irq.h       |    3 +
 include/linux/irqdomain.h |   60 ++++++++
 kernel/irq/Kconfig        |    3 +
 kernel/irq/chip.c         |    3 +
 kernel/irq/irqdomain.c    |  349 +++++++++++++++++++++++++++++++++++++++++++--
 5 files changed, 404 insertions(+), 14 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62af59242ddc..4b74565690ce 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -151,6 +151,9 @@ struct irq_data {
 	unsigned int		state_use_accessors;
 	struct irq_chip		*chip;
 	struct irq_domain	*domain;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_data		*parent_data;
+#endif
 	void			*handler_data;
 	void			*chip_data;
 	struct msi_desc		*msi_desc;
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index b0f9d16e48f6..a9ddc8534c63 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -38,6 +38,7 @@
 struct device_node;
 struct irq_domain;
 struct of_device_id;
+struct irq_chip;
 
 /* Number of irqs reserved for a legacy isa controller */
 #define NUM_ISA_INTERRUPTS	16
@@ -64,6 +65,16 @@ struct irq_domain_ops {
 	int (*xlate)(struct irq_domain *d, struct device_node *node,
 		     const u32 *intspec, unsigned int intsize,
 		     unsigned long *out_hwirq, unsigned int *out_type);
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	/* extended V2 interfaces to support hierarchy irqdomains */
+	int (*alloc)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs, void *arg);
+	void (*free)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs);
+	int (*activate)(struct irq_domain *d, struct irq_data *irq_data);
+	int (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
+#endif
 };
 
 extern struct irq_domain_ops irq_generic_chip_ops;
@@ -101,6 +112,9 @@ struct irq_domain {
 	/* Optional data */
 	struct device_node *of_node;
 	struct irq_domain_chip_generic *gc;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_domain *parent;
+#endif
 
 	/* reverse map data. The linear map gets appended to the irq_domain */
 	irq_hw_number_t hwirq_max;
@@ -220,8 +234,54 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d, struct device_node *ctrlr,
 			const u32 *intspec, unsigned int intsize,
 			irq_hw_number_t *out_hwirq, unsigned int *out_type);
 
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+/* V2 interfaces to support hierarchy IRQ domains. */
+extern struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+						unsigned int virq);
+extern int irq_domain_set_hwirq_and_chip(struct irq_domain *domain,
+					 unsigned int virq,
+					 irq_hw_number_t hwirq,
+					 struct irq_chip *chip,
+					 void *chip_data);
+extern void irq_domain_reset_irq_data(struct irq_data *irq_data);
+extern int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+				   unsigned int nr_irqs, int node, void *arg,
+				   bool realloc);
+extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
+extern int irq_domain_activate_irq(struct irq_data *irq_data);
+extern int irq_domain_deactivate_irq(struct irq_data *irq_data);
+
+static inline int irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+				unsigned int nr_irqs, int node, void *arg)
+{
+	return __irq_domain_alloc_irqs(domain, irq_base, nr_irqs, node,
+				       arg, false);
+}
+
+static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
+				int irq_base, unsigned int nr_irqs, void *arg)
+{
+	if (domain->parent && domain->parent->ops->alloc)
+		return domain->parent->ops->alloc(domain->parent, irq_base,
+						  nr_irqs, arg);
+	return -ENOSYS;
+}
+
+static inline void irq_domain_free_irqs_parent(struct irq_domain *domain,
+					int irq_base, unsigned int nr_irqs)
+{
+	if (domain->parent && domain->parent->ops->free)
+		domain->parent->ops->free(domain->parent, irq_base, nr_irqs);
+}
+#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+
 #else /* CONFIG_IRQ_DOMAIN */
 static inline void irq_dispose_mapping(unsigned int virq) { }
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
 #endif /* !CONFIG_IRQ_DOMAIN */
 
 #endif /* _LINUX_IRQDOMAIN_H */
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index d269cecdfbf0..dc1f3d08892e 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -55,6 +55,9 @@ config GENERIC_IRQ_CHIP
 config IRQ_DOMAIN
 	bool
 
+config IRQ_DOMAIN_HIERARCHY
+	bool
+
 config IRQ_DOMAIN_DEBUG
 	bool "Expose hardware/virtual IRQ mapping via debugfs"
 	depends on IRQ_DOMAIN && DEBUG_FS
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6223fab9a9d2..46bd5e2190c3 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
+#include <linux/irqdomain.h>
 
 #include <trace/events/irq.h>
 
@@ -178,6 +179,7 @@ int irq_startup(struct irq_desc *desc, bool resend)
 	irq_state_clr_disabled(desc);
 	desc->depth = 0;
 
+	irq_domain_activate_irq(&desc->irq_data);
 	if (desc->irq_data.chip->irq_startup) {
 		ret = desc->irq_data.chip->irq_startup(&desc->irq_data);
 		irq_state_clr_masked(desc);
@@ -199,6 +201,7 @@ void irq_shutdown(struct irq_desc *desc)
 		desc->irq_data.chip->irq_disable(&desc->irq_data);
 	else
 		desc->irq_data.chip->irq_mask(&desc->irq_data);
+	irq_domain_deactivate_irq(&desc->irq_data);
 	irq_state_set_masked(desc);
 }
 
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 6534ff6ce02e..e285f3abc595 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -23,6 +23,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
 static DEFINE_MUTEX(revmap_trees_mutex);
 static struct irq_domain *irq_default_domain;
 
+static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
+				  irq_hw_number_t hwirq, int node);
+
 /**
  * __irq_domain_add() - Allocate a new irq_domain data structure
  * @of_node: optional device-tree node of the interrupt controller
@@ -30,7 +33,7 @@ static struct irq_domain *irq_default_domain;
  * @hwirq_max: Maximum number of interrupts supported by controller
  * @direct_max: Maximum value of direct maps; Use ~0 for no limit; 0 for no
  *              direct mapping
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates and initialize and irq_domain structure.
@@ -109,7 +112,7 @@ EXPORT_SYMBOL_GPL(irq_domain_remove);
  * @first_irq: first number of irq block assigned to the domain,
  *	pass zero to assign irqs on-the-fly. If first_irq is non-zero, then
  *	pre-map all of the irqs in the domain to virqs starting at first_irq.
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates an irq_domain, and optionally if first_irq is positive then also
@@ -174,10 +177,8 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
 
 	domain = __irq_domain_add(of_node, first_hwirq + size,
 				  first_hwirq + size, 0, ops, host_data);
-	if (!domain)
-		return NULL;
-
-	irq_domain_associate_many(domain, first_irq, first_hwirq, size);
+	if (domain)
+		irq_domain_associate_many(domain, first_irq, first_hwirq, size);
 
 	return domain;
 }
@@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
 unsigned int irq_create_mapping(struct irq_domain *domain,
 				irq_hw_number_t hwirq)
 {
-	unsigned int hint;
 	int virq;
 
 	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
@@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 	}
 
 	/* Allocate a virtual interrupt number */
-	hint = hwirq % nr_irqs;
-	if (hint == 0)
-		hint++;
-	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
-	if (virq <= 0)
-		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
+	virq = irq_domain_alloc_descs(-1, 1, hwirq,
+				      of_node_to_nid(domain->of_node));
 	if (virq <= 0) {
 		pr_debug("-> virq allocation failed\n");
 		return 0;
@@ -490,7 +486,13 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
 	}
 
 	/* Create mapping */
-	virq = irq_create_mapping(domain, hwirq);
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	if (domain->ops->alloc)
+		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
+					     irq_data);
+	else
+#endif
+		virq = irq_create_mapping(domain, hwirq);
 	if (!virq)
 		return virq;
 
@@ -540,7 +542,11 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
 		return 0;
 
 	if (hwirq < domain->revmap_direct_max_irq) {
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+		data = irq_domain_get_irq_data(domain, hwirq);
+#else
 		data = irq_get_irq_data(hwirq);
+#endif
 		if (data && (data->domain == domain) && (data->hwirq == hwirq))
 			return hwirq;
 	}
@@ -709,3 +715,318 @@ const struct irq_domain_ops irq_domain_simple_ops = {
 	.xlate = irq_domain_xlate_onetwocell,
 };
 EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
+
+static int irq_domain_alloc_descs(int virq, unsigned int cnt,
+				  irq_hw_number_t hwirq, int node)
+{
+	unsigned int hint;
+
+	if (virq >= 0) {
+		virq = irq_alloc_descs(virq, virq, cnt, node);
+	} else {
+		hint = hwirq % nr_irqs;
+		if (hint == 0)
+			hint++;
+		virq = irq_alloc_descs_from(hint, cnt, node);
+		if (virq <= 0 && hint > 1)
+			virq = irq_alloc_descs_from(1, cnt, node);
+	}
+
+	return virq;
+}
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_free_desc(virq + i);
+}
+
+static void irq_domain_insert_irq(int virq)
+{
+	struct irq_data *data;
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = virq;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_insert(&domain->revmap_tree, hwirq, data);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+
+		/* If not already assigned, give the domain the chip's name */
+		if (!domain->name && data->chip)
+			domain->name = data->chip->name;
+	}
+
+	irq_clear_status_flags(virq, IRQ_NOREQUEST);
+}
+
+static void irq_domain_remove_irq(int virq)
+{
+	struct irq_data *data;
+
+	irq_set_status_flags(virq, IRQ_NOREQUEST);
+	irq_set_chip_and_handler(virq, NULL, NULL);
+	synchronize_irq(virq);
+	smp_mb();
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = 0;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_delete(&domain->revmap_tree, hwirq);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+	}
+}
+
+static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
+						   struct irq_data *child)
+{
+	struct irq_data *irq_data;
+
+	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
+	if (irq_data) {
+		child->parent_data = irq_data;
+		irq_data->irq = child->irq;
+		irq_data->node = child->node;
+		irq_data->domain = domain;
+	}
+
+	return irq_data;
+}
+
+static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data, *tmp;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		tmp = irq_data->parent_data;
+		irq_data->parent_data = NULL;
+		irq_data->domain = NULL;
+
+		while (tmp) {
+			irq_data = tmp;
+			tmp = tmp->parent_data;
+			kfree(irq_data);
+		}
+	}
+}
+
+static int irq_domain_alloc_irq_data(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data;
+	struct irq_domain *parent;
+
+	/* The outermost irq_data is embedded in struct irq_desc */
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		irq_data->domain = domain;
+
+		for (parent = domain->parent; parent; parent = parent->parent) {
+			irq_data = irq_domain_insert_irq_data(parent, irq_data);
+			if (!irq_data) {
+				irq_domain_free_irq_data(virq, i + 1);
+				return -ENOMEM;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
+ * @domain: domain to match
+ * @virq: IRQ number to get irq_data
+ */
+struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+					 unsigned int virq)
+{
+	struct irq_data *irq_data;
+
+	for (irq_data = irq_get_irq_data(virq); irq_data;
+	     irq_data = irq_data->parent_data)
+		if (irq_data->domain == domain)
+			return irq_data;
+
+	return NULL;
+}
+
+int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
+				  irq_hw_number_t hwirq, struct irq_chip *chip,
+				  void *chip_data)
+{
+	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
+
+	if (!irq_data)
+		return -ENOENT;
+
+	irq_data->hwirq = hwirq;
+	irq_data->chip = chip;
+	irq_data->chip_data = chip_data;
+
+	return 0;
+}
+
+void irq_domain_reset_irq_data(struct irq_data *irq_data)
+{
+	irq_data->hwirq = 0;
+	irq_data->chip = NULL;
+	irq_data->chip_data = NULL;
+}
+
+/**
+ * __irq_domain_alloc_irqs - Allocate IRQs from domain
+ * @domain: domain to allocate from
+ * @irq_base: allocate specified IRQ number if irq_base >= 0
+ * @nr_irqs: number of IRQs to allocate
+ * @node: NUMA node id for memory allocation
+ * @arg: domain specific argument
+ * @realloc: IRQ descriptors have already been allocated if true
+ *
+ * Allocate IRQ numbers and initialize all data structures to support
+ * hierarchy IRQ domains.
+ * Parameter @realloc is mainly to support legacy IRQs.
+ * Returns error code or allocated IRQ number
+ */
+int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+			    unsigned int nr_irqs, int node, void *arg,
+			    bool realloc)
+{
+	int i, ret, virq;
+
+	if (domain == NULL) {
+		domain = irq_default_domain;
+		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
+			return -EINVAL;
+	}
+
+	if (!domain->ops->alloc) {
+		pr_debug("domain->ops->alloc() is NULL\n");
+		return -ENOSYS;
+	}
+
+	if (realloc && irq_base >= 0) {
+		virq =  irq_base;
+	} else {
+		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
+		if (virq < 0) {
+			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
+				 irq_base, nr_irqs);
+			return virq;
+		}
+	}
+
+	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
+		pr_debug("cannot allocate memory for IRQ%d\n", virq);
+		ret = -ENOMEM;
+		goto out_free_desc;
+	}
+
+	mutex_lock(&irq_domain_mutex);
+	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
+	if (ret < 0) {
+		mutex_unlock(&irq_domain_mutex);
+		goto out_free_irq_data;
+	}
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_insert_irq(virq + i);
+	mutex_unlock(&irq_domain_mutex);
+
+	return virq;
+
+out_free_irq_data:
+	irq_domain_free_irq_data(virq, nr_irqs);
+out_free_desc:
+	irq_domain_free_descs(virq, nr_irqs);
+	return ret;
+}
+
+/**
+ * irq_domain_free_irqs - Free IRQ number and associated data structures
+ * @virq: base IRQ number
+ * @nr_irqs: number of IRQs to free
+ */
+void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *data = irq_get_irq_data(virq);
+
+	if (WARN(!data || !data->domain || !data->domain->ops->free,
+		 "NULL pointer, cannot free irq\n"))
+		return;
+
+	mutex_lock(&irq_domain_mutex);
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_remove_irq(virq + i);
+	data->domain->ops->free(data->domain, virq, nr_irqs);
+	mutex_unlock(&irq_domain_mutex);
+
+	irq_domain_free_irq_data(virq, nr_irqs);
+	irq_domain_free_descs(virq, nr_irqs);
+}
+
+/**
+ * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
+ *			     interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->activate to program interrupt controllers, so the
+ * interrupt can actually be delivered.
+ */
+int irq_domain_activate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (irq_data->parent_data)
+			ret = irq_domain_activate_irq(irq_data->parent_data);
+		if (ret == 0 && domain->ops->activate)
+			ret = domain->ops->activate(domain, irq_data);
+	}
+
+	return ret;
+}
+
+/**
+ * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
+ *			       deactivate interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->deactivate to program interrupt controllers to disable
+ * interrupt delivery.
+ */
+int irq_domain_deactivate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (domain->ops->deactivate)
+			ret = domain->ops->deactivate(domain, irq_data);
+		if (ret == 0 && irq_data->parent_data)
+			ret = irq_domain_deactivate_irq(irq_data->parent_data);
+	}
+
+	return ret;
+}
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 02/21] genirq: Introduce helper functions to support stacked irq_chip
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, Jiang Liu, linux-arm-kernel

Hierarchy irq_data is now supported, so introduce helper functions to
support stacked irq_chips.
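
A minimal userspace sketch of how a stacked irq_chip might use these
helpers, assuming simplified sketch_* structures and a made-up root chip
(none of these names exist in the kernel): a child chip with no ack or
retrigger logic of its own plugs the helpers in as its callbacks and so
defers to its ancestors.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_irq_data;

struct sketch_chip {
	const char *name;
	void (*irq_ack)(struct sketch_irq_data *data);
	int (*irq_retrigger)(struct sketch_irq_data *data);
};

struct sketch_irq_data {
	struct sketch_chip *chip;
	struct sketch_irq_data *parent_data;
	unsigned int acks;	/* how often this level was acknowledged */
};

/* Forward an ack to the immediate parent level, if it implements one. */
void sketch_ack_parent(struct sketch_irq_data *data)
{
	assert(data != NULL);
	data = data->parent_data;
	if (data && data->chip && data->chip->irq_ack)
		data->chip->irq_ack(data);
}

/* Let the nearest ancestor that implements irq_retrigger handle it. */
int sketch_retrigger_hierarchy(struct sketch_irq_data *data)
{
	assert(data != NULL);
	for (data = data->parent_data; data; data = data->parent_data)
		if (data->chip && data->chip->irq_retrigger)
			return data->chip->irq_retrigger(data);
	return -ENOSYS;
}

/* Hypothetical root chip: counts acks and reports a successful resend. */
void sketch_root_ack(struct sketch_irq_data *data)
{
	data->acks++;
}

int sketch_root_retrigger(struct sketch_irq_data *data)
{
	(void)data;
	return 1;
}

struct sketch_chip sketch_root_chip = {
	.name = "root",
	.irq_ack = sketch_root_ack,
	.irq_retrigger = sketch_root_retrigger,
};

/* Hypothetical child chip that defers both operations to its parents. */
struct sketch_chip sketch_child_chip = {
	.name = "child",
	.irq_ack = sketch_ack_parent,
	.irq_retrigger = sketch_retrigger_hierarchy,
};
```

Because the child chip's callbacks are the helpers themselves, an ack
issued at the outermost level propagates level by level until it reaches
a chip that actually acknowledges the hardware.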

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 include/linux/irq.h |    5 +++++
 kernel/irq/chip.c   |   18 ++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 4b74565690ce..bfa027f6814a 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -433,6 +433,11 @@ extern void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc);
 extern void handle_bad_irq(unsigned int irq, struct irq_desc *desc);
 extern void handle_nested_irq(unsigned int irq);
 
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+extern void irq_chip_ack_parent(struct irq_data *data);
+extern int irq_chip_retrigger_hierarchy(struct irq_data *data);
+#endif
+
 /* Handling of unhandled and spurious interrupts: */
 extern void note_interrupt(unsigned int irq, struct irq_desc *desc,
 			   irqreturn_t action_ret);
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 46bd5e2190c3..b8ee27efde73 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -821,3 +821,21 @@ void irq_cpu_offline(void)
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 	}
 }
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+void irq_chip_ack_parent(struct irq_data *data)
+{
+	data = data->parent_data;
+	if (data && data->chip && data->chip->irq_ack)
+		data->chip->irq_ack(data);
+}
+
+int irq_chip_retrigger_hierarchy(struct irq_data *data)
+{
+	for (data = data->parent_data; data; data = data->parent_data)
+		if (data->chip && data->chip->irq_retrigger)
+			return data->chip->irq_retrigger(data);
+
+	return -ENOSYS;
+}
+#endif
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 03/21] x86, irq: Save destination CPU ID in irq_cfg
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Cache the destination CPU APIC ID in struct irq_cfg when assigning a
vector to an interrupt. Upper layers then just read the cached APIC ID
instead of calling apic->cpu_mask_to_apicid_and(), which helps hide APIC
driver details from the IOAPIC/HPET/MSI drivers.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h |    1 +
 arch/x86/kernel/apic/vector.c |    4 ++++
 2 files changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 7624fffc2822..3d51d74d6c01 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -116,6 +116,7 @@ struct irq_data;
 struct irq_cfg {
 	cpumask_var_t		domain;
 	cpumask_var_t		old_domain;
+	unsigned int		dest_apicid;
 	u8			vector;
 	u8			move_in_progress : 1;
 #ifdef CONFIG_IRQ_REMAP
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 7562cb15b3bd..287ae4e8d500 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -188,6 +188,10 @@ next:
 	}
 	free_cpumask_var(tmp_mask);
 
+	if (!err)
+		err = apic->cpu_mask_to_apicid_and(mask, cfg->domain,
+						   &cfg->dest_apicid);
+
 	return err;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 04/21] x86, irq: Use hierarchy irqdomain to manage CPU interrupt vectors
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Abstract the CPU local APIC as an interrupt controller and create an
irqdomain for it to manage CPU interrupt vectors. It's the base for
enabling hierarchy irqdomain on x86 systems. Eventually we will build
an irqdomain hierarchy as below:
IOAPIC domain-------|
MSI/MSI-X domain------> [Interrupt Remapping domain] -> CPU vector domain
HPET_IRQ domain-----|                                          ^
DMAR domain----------------------------------------------------|
HT_IRQ domain--------------------------------------------------|

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/Kconfig               |    3 +-
 arch/x86/include/asm/hw_irq.h  |   10 +++
 arch/x86/kernel/apic/io_apic.c |    3 -
 arch/x86/kernel/apic/vector.c  |  151 ++++++++++++++++++++++++++++++++++++----
 4 files changed, 150 insertions(+), 17 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7d0ca80d628f..451121658f70 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -837,11 +837,12 @@ config X86_LOCAL_APIC
 	def_bool y
 	depends on X86_64 || SMP || X86_32_NON_STANDARD || X86_UP_APIC || PCI_MSI
 	select GENERIC_IRQ_LEGACY_ALLOC_HWIRQ
+	select IRQ_DOMAIN
+	select IRQ_DOMAIN_HIERARCHY
 
 config X86_IO_APIC
 	def_bool X86_64 || SMP || X86_32_NON_STANDARD || X86_UP_IOAPIC
 	depends on X86_LOCAL_APIC
-	select IRQ_DOMAIN
 
 config X86_REROUTE_FOR_BROKEN_BOOT_IRQS
 	bool "Reroute for broken boot IRQs"
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 3d51d74d6c01..313ae21a0784 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -113,6 +113,10 @@ struct irq_2_irte {
 #ifdef	CONFIG_X86_LOCAL_APIC
 struct irq_data;
 
+struct irq_alloc_info {
+	const struct cpumask *mask;	/* CPU mask for vector allocation */
+};
+
 struct irq_cfg {
 	cpumask_var_t		domain;
 	cpumask_var_t		old_domain;
@@ -135,6 +139,12 @@ struct irq_cfg {
 	};
 };
 
+extern struct irq_domain *x86_vector_domain;
+
+extern void init_irq_alloc_info(struct irq_alloc_info *info,
+				const struct cpumask *mask);
+extern void copy_irq_alloc_info(struct irq_alloc_info *dst,
+				struct irq_alloc_info *src);
 extern struct irq_cfg *irq_cfg(unsigned int irq);
 extern struct irq_cfg *irqd_cfg(struct irq_data *irq_data);
 extern struct irq_cfg *alloc_irq_and_cfg_at(unsigned int at, int node);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index e02e78ec579f..2a8f1ba7e25f 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2352,9 +2352,6 @@ static int mp_irqdomain_create(int ioapic)
 		ioapic_dynirq_base = max(ioapic_dynirq_base,
 					 gsi_cfg->gsi_end + 1);
 
-	if (gsi_cfg->gsi_base == 0)
-		irq_set_default_host(ip->irqdomain);
-
 	return 0;
 }
 
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 287ae4e8d500..774ab5ba95f2 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -19,7 +21,9 @@
 #include <asm/desc.h>
 #include <asm/irq_remapping.h>
 
+struct irq_domain *x86_vector_domain;
 static DEFINE_RAW_SPINLOCK(vector_lock);
+static struct irq_chip vector_chip;
 
 void lock_vector_lock(void)
 {
@@ -36,15 +40,21 @@ void unlock_vector_lock(void)
 
 struct irq_cfg *irq_cfg(unsigned int irq)
 {
-	return irq_get_chip_data(irq);
+	return irqd_cfg(irq_get_irq_data(irq));
 }
 
 struct irq_cfg *irqd_cfg(struct irq_data *irq_data)
 {
+	if (!irq_data)
+		return NULL;
+
+	while (irq_data->parent_data)
+		irq_data = irq_data->parent_data;
+
 	return irq_data->chip_data;
 }
 
-static struct irq_cfg *alloc_irq_cfg(unsigned int irq, int node)
+static struct irq_cfg *alloc_irq_cfg(int node)
 {
 	struct irq_cfg *cfg;
 
@@ -79,7 +89,7 @@ struct irq_cfg *alloc_irq_and_cfg_at(unsigned int at, int node)
 			return cfg;
 	}
 
-	cfg = alloc_irq_cfg(at, node);
+	cfg = alloc_irq_cfg(node);
 	if (cfg)
 		irq_set_chip_data(at, cfg);
 	else
@@ -87,14 +97,13 @@ struct irq_cfg *alloc_irq_and_cfg_at(unsigned int at, int node)
 	return cfg;
 }
 
-static void free_irq_cfg(unsigned int at, struct irq_cfg *cfg)
+static void free_irq_cfg(struct irq_cfg *cfg)
 {
-	if (!cfg)
-		return;
-	irq_set_chip_data(at, NULL);
-	free_cpumask_var(cfg->domain);
-	free_cpumask_var(cfg->old_domain);
-	kfree(cfg);
+	if (cfg) {
+		free_cpumask_var(cfg->domain);
+		free_cpumask_var(cfg->old_domain);
+		kfree(cfg);
+	}
 }
 
 static int
@@ -239,6 +248,85 @@ void clear_irq_vector(int irq, struct irq_cfg *cfg)
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
 }
 
+void init_irq_alloc_info(struct irq_alloc_info *info,
+			 const struct cpumask *mask)
+{
+	memset(info, 0, sizeof(*info));
+	info->mask = mask;
+}
+
+void copy_irq_alloc_info(struct irq_alloc_info *dst, struct irq_alloc_info *src)
+{
+	if (src)
+		*dst = *src;
+	else
+		memset(dst, 0, sizeof(*dst));
+}
+
+static inline const struct cpumask *
+irq_alloc_info_get_mask(struct irq_alloc_info *info)
+{
+	return (!info || !info->mask) ? apic->target_cpus() : info->mask;
+}
+
+static void x86_vector_free_irqs(struct irq_domain *domain,
+				 unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(x86_vector_domain, virq + i);
+		if (irq_data && irq_data->chip_data) {
+			free_remapped_irq(virq + i);
+			clear_irq_vector(virq + i, irq_data->chip_data);
+			free_irq_cfg(irq_data->chip_data);
+			irq_domain_reset_irq_data(irq_data);
+		}
+	}
+}
+
+static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
+				 unsigned int nr_irqs, void *arg)
+{
+	int i, err;
+	struct irq_cfg *cfg;
+	struct irq_data *irq_data;
+	const struct cpumask *mask;
+
+	if (disable_apic)
+		return -ENXIO;
+
+	mask = irq_alloc_info_get_mask(arg);
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		BUG_ON(!irq_data);
+		cfg = alloc_irq_cfg(irq_data->node);
+		if (!cfg) {
+			err = -ENOMEM;
+			goto error;
+		}
+
+		irq_data->chip = &vector_chip;
+		irq_data->chip_data = cfg;
+		irq_data->hwirq = virq + i;
+		err = assign_irq_vector(virq + i, cfg, mask);
+		if (err)
+			goto error;
+	}
+
+	return 0;
+
+error:
+	x86_vector_free_irqs(domain, virq, i + 1);
+	return err;
+}
+
+static struct irq_domain_ops x86_vector_domain_ops = {
+	.alloc = x86_vector_alloc_irqs,
+	.free = x86_vector_free_irqs,
+};
+
 int __init arch_probe_nr_irqs(void)
 {
 	int nr;
@@ -264,6 +352,11 @@ int __init arch_probe_nr_irqs(void)
 
 int __init arch_early_irq_init(void)
 {
+	x86_vector_domain = irq_domain_add_linear(NULL, nr_irqs,
+						  &x86_vector_domain_ops, NULL);
+	BUG_ON(x86_vector_domain == NULL);
+	irq_set_default_host(x86_vector_domain);
+
 	return arch_early_ioapic_init();
 }
 
@@ -378,6 +471,37 @@ int apic_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	return 0;
 }
 
+static int vector_set_affinity(struct irq_data *irq_data,
+			       const struct cpumask *dest, bool force)
+{
+	int err;
+	int irq = irq_data->irq;
+	struct irq_cfg *cfg = irq_data->chip_data;
+
+	if (!config_enabled(CONFIG_SMP))
+		return -EPERM;
+
+	if (!cpumask_intersects(dest, cpu_online_mask))
+		return -EINVAL;
+
+	err = assign_irq_vector(irq, cfg, dest);
+	if (err) {
+		struct irq_data *top = irq_get_irq_data(irq);
+
+		if (assign_irq_vector(irq, cfg, top->affinity))
+			pr_err("Failed to recover vector for irq %d\n", irq);
+		return err;
+	}
+
+	return IRQ_SET_MASK_OK;
+}
+
+static struct irq_chip vector_chip = {
+	.irq_ack = apic_ack_edge,
+	.irq_set_affinity = vector_set_affinity,
+	.irq_retrigger = apic_retrigger_irq,
+};
+
 #ifdef CONFIG_SMP
 void send_cleanup_vector(struct irq_cfg *cfg)
 {
@@ -497,7 +621,7 @@ int arch_setup_hwirq(unsigned int irq, int node)
 	unsigned long flags;
 	int ret;
 
-	cfg = alloc_irq_cfg(irq, node);
+	cfg = alloc_irq_cfg(node);
 	if (!cfg)
 		return -ENOMEM;
 
@@ -508,7 +632,7 @@ int arch_setup_hwirq(unsigned int irq, int node)
 	if (!ret)
 		irq_set_chip_data(irq, cfg);
 	else
-		free_irq_cfg(irq, cfg);
+		free_irq_cfg(cfg);
 	return ret;
 }
 
@@ -518,7 +642,8 @@ void arch_teardown_hwirq(unsigned int irq)
 
 	free_remapped_irq(irq);
 	clear_irq_vector(irq, cfg);
-	free_irq_cfg(irq, cfg);
+	irq_set_chip_data(irq, NULL);
+	free_irq_cfg(cfg);
 }
 
 static void __init print_APIC_field(int base)
-- 
1.7.10.4
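
Note for reviewers: with stacked domains, irqd_cfg() has to find the
irq_cfg at the root (CPU vector) level no matter which level's irq_data
it is handed. The walk can be modelled in user space roughly as below.
This is only a sketch under stated assumptions: struct model_irq_data and
model_irqd_cfg() are hypothetical stand-ins that keep just the two fields
the walk touches, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for struct irq_data: only the fields the walk uses. */
struct model_irq_data {
	struct model_irq_data *parent_data;	/* next domain up, NULL at root */
	void *chip_data;			/* per-domain private data */
};

/* Follow parent_data up to the root level and return its chip_data. */
static void *model_irqd_cfg(struct model_irq_data *d)
{
	if (!d)
		return NULL;
	while (d->parent_data)
		d = d->parent_data;
	return d->chip_data;
}
```

Handing in the irq_data of an MSI or remapping level, or of the root
itself, should yield the same pointer, which is what lets existing callers
of irq_cfg() keep working once more levels are stacked on top.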

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 05/21] x86, hpet: Use new irqdomain interfaces to allocate/free IRQ
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Use the new irqdomain interfaces to allocate and free IRQs for HPET, so
that GENERIC_IRQ_LEGACY_ALLOC_HWIRQ can be removed later.
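
Note for reviewers: the failure convention changes with this conversion,
which is why the check becomes "irq <= 0". A user-space model of the
difference is sketched below. The model_* functions are hypothetical
stand-ins, not the kernel helpers, and the IRQ number 42 is arbitrary. The
assumption is that the legacy helper returns 0 on failure while the
irqdomain helper returns a negative errno or the allocated IRQ number.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the legacy helper: 0 means failure. */
static unsigned int model_irq_alloc_hwirq(int fail)
{
	return fail ? 0 : 42;
}

/* Hypothetical stand-in for the irqdomain helper: -errno or the virq. */
static int model_irq_domain_alloc_irqs(int fail)
{
	return fail ? -ENOMEM : 42;
}

/* Caller-side check in the style of the converted code. */
static int model_alloc_checked(int fail)
{
	int irq = model_irq_domain_alloc_irqs(fail);

	if (irq <= 0)
		return -EINVAL;
	return irq;
}

/* The old "if (!irq)" test: returns 1 when the value is accepted. */
static int model_legacy_check_accepts(int irq)
{
	return irq != 0;
}
```

Keeping the old test would let a negative errno through as if it were a
valid IRQ number, so the variable also has to become a signed int.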

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/kernel/hpet.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 319bcb9372fe..cb60652f59a3 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -11,6 +11,7 @@
 #include <linux/cpu.h>
 #include <linux/pm.h>
 #include <linux/io.h>
+#include <linux/irqdomain.h>
 
 #include <asm/fixmap.h>
 #include <asm/hpet.h>
@@ -476,7 +477,7 @@ static int hpet_msi_next_event(unsigned long delta,
 static int hpet_setup_msi_irq(unsigned int irq)
 {
 	if (x86_msi.setup_hpet_msi(irq, hpet_blockid)) {
-		irq_free_hwirq(irq);
+		irq_domain_free_irqs(irq, 1);
 		return -EINVAL;
 	}
 	return 0;
@@ -484,9 +485,10 @@ static int hpet_setup_msi_irq(unsigned int irq)
 
 static int hpet_assign_irq(struct hpet_dev *dev)
 {
-	unsigned int irq = irq_alloc_hwirq(-1);
+	int irq;
 
-	if (!irq)
+	irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+	if (irq <= 0)
 		return -EINVAL;
 
 	irq_set_handler_data(irq, dev);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 06/21] x86, MSI: Use new irqdomain interfaces to allocate/free IRQ
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, Jiang Liu, linux-arm-kernel

Use the new irqdomain interfaces to allocate and free IRQs for MSI, so
that GENERIC_IRQ_LEGACY_ALLOC_HWIRQ can be removed later.
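
Note for reviewers: the per-descriptor loop keeps its shape, only the
allocator changes. A user-space model of the loop and its failure path is
sketched below. All model_* names and the starting IRQ number are
hypothetical. The model releases only the IRQ whose setup failed and
leaves earlier ones allocated, on the assumption that tearing those down
remains the caller's job.

```c
#include <assert.h>
#include <errno.h>

static int model_live;		/* IRQs currently allocated in the model */
static int model_next = 32;	/* hypothetical next free IRQ number */

static int model_alloc_irq(void)
{
	model_live++;
	return model_next++;
}

static void model_free_irq(int irq)
{
	assert(irq > 0);
	model_live--;
}

/* Set up n descriptors; setup of index fail_at fails (-1 for none). */
static int model_setup_msi_irqs(int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++) {
		int irq = model_alloc_irq();

		if (irq <= 0)
			return -ENOSPC;
		if (i == fail_at) {
			/* setup failed: release only the IRQ just allocated */
			model_free_irq(irq);
			return -EINVAL;
		}
	}
	return 0;
}
```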

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/kernel/apic/msi.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index fb45663395ca..2439d383c10c 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -14,6 +14,7 @@
 #include <linux/dmar.h>
 #include <linux/hpet.h>
 #include <linux/msi.h>
+#include <linux/irqdomain.h>
 #include <asm/msidef.h>
 #include <asm/hpet.h>
 #include <asm/hw_irq.h>
@@ -145,23 +146,20 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct msi_desc *msidesc;
-	unsigned int irq;
-	int node, ret;
+	int irq, ret;
 
 	/* Multiple MSI vectors only supported with interrupt remapping */
 	if (type == PCI_CAP_ID_MSI && nvec > 1)
 		return 1;
 
-	node = dev_to_node(&dev->dev);
-
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_alloc_hwirq(node);
-		if (!irq)
+		irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+		if (irq <= 0)
 			return -ENOSPC;
 
 		ret = setup_msi_irq(dev, msidesc, irq, 0);
 		if (ret < 0) {
-			irq_free_hwirq(irq);
+			irq_domain_free_irqs(irq, 1);
 			return ret;
 		}
 
@@ -171,7 +169,7 @@ int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 
 void native_teardown_msi_irq(unsigned int irq)
 {
-	irq_free_hwirq(irq);
+	irq_domain_free_irqs(irq, 1);
 }
 
 #ifdef CONFIG_DMAR_TABLE
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 07/21] x86, uv: Use new irqdomain interfaces to allocate/free IRQ
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

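Use the new irqdomain interfaces to allocate and free IRQs for UV, so
that GENERIC_IRQ_LEGACY_ALLOC_HWIRQ can be removed later.

The target CPU is now passed to the allocator through struct
irq_alloc_info, so arch_enable_uv_irq() no longer assigns the vector
itself and takes the destination from cfg->dest_apicid.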
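
Note for reviewers: the CPU mask reaches the vector domain through the
irq_alloc_info helpers added earlier in this series. Their behaviour can
be modelled in user space as below. This is a sketch only: the cpumask is
reduced to an unsigned long, model_default_cpus is a hypothetical stand-in
for apic->target_cpus(), and all model_* names are illustrative.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for struct cpumask: one bit per CPU. */
typedef unsigned long model_cpumask_t;

/* Mirrors the single field struct irq_alloc_info has in this series. */
struct model_irq_alloc_info {
	const model_cpumask_t *mask;
};

/* Hypothetical default mask, standing in for apic->target_cpus(). */
static const model_cpumask_t model_default_cpus = 0xfUL;

static void model_init_irq_alloc_info(struct model_irq_alloc_info *info,
				      const model_cpumask_t *mask)
{
	assert(info);
	memset(info, 0, sizeof(*info));
	info->mask = mask;
}

/* Fall back to the default mask when no info or no mask is supplied. */
static const model_cpumask_t *
model_irq_alloc_info_get_mask(const struct model_irq_alloc_info *info)
{
	return (!info || !info->mask) ? &model_default_cpus : info->mask;
}
```

A caller that wants a specific CPU, as uv_setup_irq() does with
cpumask_of(cpu), fills in the mask; callers that pass NULL get the
default.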

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/platform/uv/uv_irq.c |   27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/arch/x86/platform/uv/uv_irq.c b/arch/x86/platform/uv/uv_irq.c
index 0ce673645432..74871ead5a30 100644
--- a/arch/x86/platform/uv/uv_irq.c
+++ b/arch/x86/platform/uv/uv_irq.c
@@ -12,6 +12,7 @@
 #include <linux/rbtree.h>
 #include <linux/slab.h>
 #include <linux/irq.h>
+#include <linux/irqdomain.h>
 
 #include <asm/apic.h>
 #include <asm/uv/uv_irq.h>
@@ -130,24 +131,14 @@ static int
 arch_enable_uv_irq(char *irq_name, unsigned int irq, int cpu, int mmr_blade,
 		       unsigned long mmr_offset, int limit)
 {
-	const struct cpumask *eligible_cpu = cpumask_of(cpu);
 	struct irq_cfg *cfg = irq_cfg(irq);
 	unsigned long mmr_value;
 	struct uv_IO_APIC_route_entry *entry;
-	int mmr_pnode, err;
-	unsigned int dest;
+	int mmr_pnode;
 
 	BUILD_BUG_ON(sizeof(struct uv_IO_APIC_route_entry) !=
 			sizeof(unsigned long));
 
-	err = assign_irq_vector(irq, cfg, eligible_cpu);
-	if (err != 0)
-		return err;
-
-	err = apic->cpu_mask_to_apicid_and(eligible_cpu, eligible_cpu, &dest);
-	if (err != 0)
-		return err;
-
 	if (limit == UV_AFFINITY_CPU)
 		irq_set_status_flags(irq, IRQ_NO_BALANCING);
 	else
@@ -164,7 +155,7 @@ arch_enable_uv_irq(char *irq_name, unsigned int irq, int cpu, int mmr_blade,
 	entry->polarity		= 0;
 	entry->trigger		= 0;
 	entry->mask		= 0;
-	entry->dest		= dest;
+	entry->dest		= cfg->dest_apicid;
 
 	mmr_pnode = uv_blade_to_pnode(mmr_blade);
 	uv_write_global_mmr64(mmr_pnode, mmr_offset, mmr_value);
@@ -238,9 +229,13 @@ uv_set_irq_affinity(struct irq_data *data, const struct cpumask *mask,
 int uv_setup_irq(char *irq_name, int cpu, int mmr_blade,
 		 unsigned long mmr_offset, int limit)
 {
-	int ret, irq = irq_alloc_hwirq(uv_blade_to_memory_nid(mmr_blade));
+	int ret, irq;
+	struct irq_alloc_info info;
 
-	if (!irq)
+	init_irq_alloc_info(&info, cpumask_of(cpu));
+	irq = irq_domain_alloc_irqs(NULL, -1, 1,
+				    uv_blade_to_memory_nid(mmr_blade), &info);
+	if (irq <= 0)
 		return -EBUSY;
 
 	ret = arch_enable_uv_irq(irq_name, irq, cpu, mmr_blade, mmr_offset,
@@ -248,7 +243,7 @@ int uv_setup_irq(char *irq_name, int cpu, int mmr_blade,
 	if (ret == irq)
 		uv_set_irq_2_mmr_info(irq, mmr_offset, mmr_blade);
 	else
-		irq_free_hwirq(irq);
+		irq_domain_free_irqs(irq, 1);
 
 	return ret;
 }
@@ -283,6 +278,6 @@ void uv_teardown_irq(unsigned int irq)
 			n = n->rb_right;
 	}
 	spin_unlock_irqrestore(&uv_irq_lock, irqflags);
-	irq_free_hwirq(irq);
+	irq_domain_free_irqs(irq, 1);
 }
 EXPORT_SYMBOL_GPL(uv_teardown_irq);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 08/21] x86, htirq: Use new irqdomain interfaces to allocate/free IRQ
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, Jiang Liu, linux-arm-kernel

Use the new irqdomain interfaces to allocate and free IRQs for HTIRQ, so
that GENERIC_IRQ_LEGACY_ALLOC_HWIRQ can be removed later.

This patch changes the interface between the arch-independent PCI driver
and the arch-specific code. HT_IRQ is currently enabled only on x86, so
this should not break other architectures.
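
Note for reviewers: the split between generic and arch code can be
modelled in user space as below. This is a sketch under stated
assumptions: struct model_pci_dev, the model_* functions and the starting
IRQ number are hypothetical, and the arch side is reduced to a counter
instead of a real irqdomain allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device stand-in; only the NUMA node is kept. */
struct model_pci_dev {
	int node;
};

static int model_next_irq = 16;	/* hypothetical first free IRQ number */
static int model_live_irqs;	/* IRQs currently allocated */

/* Arch side: stands in for arch_alloc_ht_irq(); <= 0 signals failure. */
static int model_arch_alloc_ht_irq(const struct model_pci_dev *dev)
{
	if (!dev)
		return -EINVAL;
	model_live_irqs++;
	return model_next_irq++;
}

/* Arch side: stands in for arch_free_ht_irq(). */
static void model_arch_free_ht_irq(int irq)
{
	assert(irq > 0);
	model_live_irqs--;
}

/* Generic side: mirrors the allocation check in __ht_create_irq(). */
static int model_ht_create_irq(const struct model_pci_dev *dev)
{
	int irq = model_arch_alloc_ht_irq(dev);

	if (irq <= 0)
		return -EBUSY;
	return irq;
}
```

The generic code only sees an IRQ number or a failure, so how the IRQ is
obtained stays an arch decision.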

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/kernel/apic/htirq.c |   26 +++++++++++++-------------
 drivers/pci/htirq.c          |    7 +++----
 include/linux/htirq.h        |    2 ++
 3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/apic/htirq.c b/arch/x86/kernel/apic/htirq.c
index 6f527b02ac4c..55cb061a95cb 100644
--- a/arch/x86/kernel/apic/htirq.c
+++ b/arch/x86/kernel/apic/htirq.c
@@ -14,6 +14,7 @@
 #include <linux/device.h>
 #include <linux/pci.h>
 #include <linux/htirq.h>
+#include <linux/irqdomain.h>
 #include <asm/hw_irq.h>
 #include <asm/apic.h>
 #include <asm/hypertransport.h>
@@ -60,31 +61,30 @@ static struct irq_chip ht_irq_chip = {
 	.irq_retrigger		= apic_retrigger_irq,
 };
 
+int arch_alloc_ht_irq(struct pci_dev *dev)
+{
+	return irq_domain_alloc_irqs(NULL, -1, 1, dev_to_node(&dev->dev), NULL);
+}
+
+void arch_free_ht_irq(int irq)
+{
+	irq_domain_free_irqs(irq, 1);
+}
+
 int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
 {
 	struct irq_cfg *cfg;
 	struct ht_irq_msg msg;
-	unsigned dest;
-	int err;
 
 	if (disable_apic)
 		return -ENXIO;
 
 	cfg = irq_cfg(irq);
-	err = assign_irq_vector(irq, cfg, apic->target_cpus());
-	if (err)
-		return err;
-
-	err = apic->cpu_mask_to_apicid_and(cfg->domain,
-					   apic->target_cpus(), &dest);
-	if (err)
-		return err;
-
-	msg.address_hi = HT_IRQ_HIGH_DEST_ID(dest);
+	msg.address_hi = HT_IRQ_HIGH_DEST_ID(cfg->dest_apicid);
 
 	msg.address_lo =
 		HT_IRQ_LOW_BASE |
-		HT_IRQ_LOW_DEST_ID(dest) |
+		HT_IRQ_LOW_DEST_ID(cfg->dest_apicid) |
 		HT_IRQ_LOW_VECTOR(cfg->vector) |
 		((apic->irq_dest_mode == 0) ?
 			HT_IRQ_LOW_DM_PHYSICAL :
diff --git a/drivers/pci/htirq.c b/drivers/pci/htirq.c
index a94dd2c4183a..ceb0ebeb7b5f 100644
--- a/drivers/pci/htirq.c
+++ b/drivers/pci/htirq.c
@@ -117,8 +117,8 @@ int __ht_create_irq(struct pci_dev *dev, int idx, ht_irq_update_t *update)
 	cfg->msg.address_lo = 0xffffffff;
 	cfg->msg.address_hi = 0xffffffff;
 
-	irq = irq_alloc_hwirq(dev_to_node(&dev->dev));
-	if (!irq) {
+	irq = arch_alloc_ht_irq(dev);
+	if (irq <= 0) {
 		kfree(cfg);
 		return -EBUSY;
 	}
@@ -163,8 +163,7 @@ void ht_destroy_irq(unsigned int irq)
 	cfg = irq_get_handler_data(irq);
 	irq_set_chip(irq, NULL);
 	irq_set_handler_data(irq, NULL);
-	irq_free_hwirq(irq);
-
+	arch_free_ht_irq(irq);
 	kfree(cfg);
 }
 EXPORT_SYMBOL(ht_destroy_irq);
diff --git a/include/linux/htirq.h b/include/linux/htirq.h
index 70a1dbbf2093..5caa51b7b95c 100644
--- a/include/linux/htirq.h
+++ b/include/linux/htirq.h
@@ -15,6 +15,8 @@ void unmask_ht_irq(struct irq_data *data);
 
 /* The arch hook for getting things started */
 int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev);
+int arch_alloc_ht_irq(struct pci_dev *dev);
+void arch_free_ht_irq(int irq);
 
 /* For drivers of buggy hardware */
 typedef void (ht_irq_update_t)(struct pci_dev *dev, int irq,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 09/21] x86, dmar: Use new irqdomain interfaces to allocate/free IRQ
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, Jiang Liu, linux-arm-kernel

Use the new irqdomain interfaces to allocate/free IRQs for DMAR and
interrupt remapping, so that we can remove GENERIC_IRQ_LEGACY_ALLOC_HWIRQ
later.

The private definitions of irq_alloc_hwirqs()/irq_free_hwirqs() are a
temporary solution; they will be removed once the interrupt remapping
driver has been converted to the irqdomain framework.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
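One detail of the temporary shims may be worth checking: the private
irq_alloc_hwirqs() returns unsigned int, and its MSI-X caller tests
"irq == 0". The user-space sketch below assumes that the underlying
allocator reports failure as a negative errno (an assumption, modelled by
the hypothetical model_domain_alloc_irqs()). It contrasts a direct
pass-through with a variant that folds every failure to 0.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for the irqdomain allocator: returns the first
 * allocated IRQ number, or a negative errno when `fail` is set.
 */
static int model_domain_alloc_irqs(int cnt, int fail)
{
	assert(cnt > 0);
	return fail ? -ENOSPC : 40;
}

/* Pass-through shim: a negative errno becomes a large non-zero value. */
static unsigned int model_alloc_passthrough(int cnt, int fail)
{
	return model_domain_alloc_irqs(cnt, fail);
}

/* Folding shim: every failure becomes 0, matching "irq == 0" callers. */
static unsigned int model_alloc_folded(int cnt, int fail)
{
	int irq = model_domain_alloc_irqs(cnt, fail);

	return irq > 0 ? (unsigned int)irq : 0;
}
```

Under that assumption, a caller that only tests for 0 would miss the
failure with the pass-through shim but not with the folding one.
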
 arch/x86/include/asm/irq_remapping.h |    4 ++--
 arch/x86/kernel/apic/msi.c           |   10 ++++++++++
 drivers/iommu/irq_remapping.c        |   17 +++++++++++++++--
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index b7747c4c2cf2..230dde9b695e 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -103,7 +103,7 @@ static inline bool setup_remapped_irq(int irq,
 }
 #endif /* CONFIG_IRQ_REMAP */
 
-#define dmar_alloc_hwirq()	irq_alloc_hwirq(-1)
-#define dmar_free_hwirq		irq_free_hwirq
+extern int dmar_alloc_hwirq(void);
+extern void dmar_free_hwirq(int irq);
 
 #endif /* __X86_IRQ_REMAPPING_H */
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 2439d383c10c..ad2d624a0800 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -221,6 +221,16 @@ int arch_setup_dmar_msi(unsigned int irq)
 				      "edge");
 	return 0;
 }
+
+int dmar_alloc_hwirq(void)
+{
+	return irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+}
+
+void dmar_free_hwirq(int irq)
+{
+	irq_domain_free_irqs(irq, 1);
+}
 #endif
 
 /*
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 34d01de91bd7..7dd893ee70be 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -6,6 +6,7 @@
 #include <linux/msi.h>
 #include <linux/irq.h>
 #include <linux/pci.h>
+#include <linux/irqdomain.h>
 
 #include <asm/hw_irq.h>
 #include <asm/irq_remapping.h>
@@ -49,6 +50,18 @@ static void irq_remapping_disable_io_apic(void)
 		disconnect_bsp_APIC(0);
 }
 
+#ifndef CONFIG_GENERIC_IRQ_LEGACY_ALLOC_HWIRQ
+static unsigned int irq_alloc_hwirqs(int cnt, int node)
+{
+	return irq_domain_alloc_irqs(NULL, -1, cnt, node, NULL);
+}
+
+static void irq_free_hwirqs(unsigned int from, int cnt)
+{
+	irq_domain_free_irqs(from, cnt);
+}
+#endif
+
 static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
 {
 	int ret, sub_handle, nvec_pow2, index = 0;
@@ -112,7 +125,7 @@ static int do_setup_msix_irqs(struct pci_dev *dev, int nvec)
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
 
-		irq = irq_alloc_hwirq(node);
+		irq = irq_alloc_hwirqs(1, node);
 		if (irq == 0)
 			return -1;
 
@@ -135,7 +148,7 @@ static int do_setup_msix_irqs(struct pci_dev *dev, int nvec)
 	return 0;
 
 error:
-	irq_free_hwirq(irq);
+	irq_free_hwirqs(irq, 1);
 	return ret;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 10/21] x86: irq_remapping: Introduce new interfaces to support hierarchy irqdomain
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Introduce new interfaces for interrupt remapping drivers to support
hierarchy irqdomain:
1) irq_remapping_get_ir_irq_domain(): get irqdomain associated with an
   interrupt remapping unit. IOAPIC/HPET drivers use this interface to
   get parent interrupt remapping irqdomain.
2) irq_remapping_get_irq_domain(): get irqdomain for an IRQ allocation.
   This is mainly used to support MSI irqdomain. We must build one MSI
   irqdomain for each interrupt remapping unit. MSI driver calls this
   interface to get MSI irqdomain associated with an IR irqdomain which
   manages the PCI devices.
3) irq_remapping_get_ioapic_entry(): get IOAPIC entry content rewritten
   by the interrupt remapping driver for remapped IOAPIC interrupt.
4) irq_remapping_get_msi_entry(): get MSI/HPET entry content rewritten
   by the interrupt remapping driver for remapped MSI/HPET interrupt.

Architecture-specific code needs to implement two hooks:
1) arch_get_ir_parent_domain(): get parent irqdomain for IR irqdomain,
   which is x86_vector_domain on x86 platforms.
2) arch_create_msi_irq_domain(): create an MSI irqdomain associated with
   the interrupt remapping unit.

We also add the following callbacks to struct irq_remap_ops:
	struct irq_domain *(*get_ir_irq_domain)(struct irq_alloc_info *);
	struct irq_domain *(*get_irq_domain)(struct irq_alloc_info *);
	int (*get_ioapic_entry)(struct irq_data *,
				struct IR_IO_APIC_route_entry *);
	int (*get_msi_entry)(struct irq_data *, struct msi_msg *);

Once all clients of IR have been converted to new hierarchy irqdomain
interfaces, we will:
1) Remove set_ioapic_entry, set_affinity, free_irq, compose_msi_msg,
   msi_alloc_irq, msi_setup_irq, setup_hpet_msi from struct irq_remap_ops.
2) Kill setup_ioapic_remapped_entry, free_remapped_irq,
   compose_remapped_msi_msg, setup_hpet_msi_remapped, setup_remapped_irq.
3) Simplify x86_io_apic_ops and x86_msi.

With all these changes applied, the architecture becomes much clearer.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
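The two kinds of wrapper introduced here behave differently when no
remapping driver is present, which the following user-space sketch tries
to make explicit. All model_* names and the trimmed-down types are
hypothetical. The domain lookups tolerate a missing driver or callback and
return NULL, while the entry getter relies on the caller having checked
that the interrupt is remapped (modelled here with an assert).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel types. */
struct model_irq_domain { const char *name; };
struct model_alloc_info { int type; };

struct model_remap_ops {
	struct model_irq_domain *(*get_ir_irq_domain)(struct model_alloc_info *);
	struct model_irq_domain *(*get_irq_domain)(struct model_alloc_info *);
	int (*get_msi_entry)(int irq, unsigned int *data);
};

/* Stays NULL until a remapping driver registers itself. */
static struct model_remap_ops *model_ops;

/* Lookup wrappers: a missing driver or callback yields NULL. */
static struct model_irq_domain *
model_get_ir_irq_domain(struct model_alloc_info *info)
{
	if (!model_ops || !model_ops->get_ir_irq_domain)
		return NULL;
	return model_ops->get_ir_irq_domain(info);
}

static struct model_irq_domain *
model_get_irq_domain(struct model_alloc_info *info)
{
	if (!model_ops || !model_ops->get_irq_domain)
		return NULL;
	return model_ops->get_irq_domain(info);
}

/* Entry getter: the caller must already know the IRQ is remapped. */
static int model_get_msi_entry(int irq, unsigned int *data)
{
	assert(model_ops && model_ops->get_msi_entry);
	return model_ops->get_msi_entry(irq, data);
}

/* Example driver that implements only two of the three callbacks. */
static struct model_irq_domain model_ir_domain = { "IR" };

static struct model_irq_domain *model_drv_get_ir(struct model_alloc_info *info)
{
	(void)info;
	return &model_ir_domain;
}

static int model_drv_get_msi_entry(int irq, unsigned int *data)
{
	*data = 0x1000u + (unsigned int)irq;	/* made-up "rewritten" value */
	return 0;
}

static struct model_remap_ops model_drv_ops = {
	.get_ir_irq_domain = model_drv_get_ir,
	.get_msi_entry     = model_drv_get_msi_entry,
	/* .get_irq_domain is left NULL on purpose */
};
```

A client such as an IOAPIC or HPET driver would call the lookup first and
fall back to the non-remapped path when it gets NULL.
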
 arch/x86/include/asm/hw_irq.h        |   36 +++++++++++++++-
 arch/x86/include/asm/irq_remapping.h |   53 +++++++++++++++++++++++
 drivers/iommu/irq_remapping.c        |   78 +++++++++++++++++++++++++++++++++-
 drivers/iommu/irq_remapping.h        |   17 ++++++++
 4 files changed, 182 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 313ae21a0784..57f81f5a9686 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -112,9 +112,43 @@ struct irq_2_irte {
 
 #ifdef	CONFIG_X86_LOCAL_APIC
 struct irq_data;
+struct pci_dev;
+struct msi_desc;
+
+enum irq_alloc_type {
+	X86_IRQ_ALLOC_TYPE_IOAPIC = 1,
+	X86_IRQ_ALLOC_TYPE_HPET,
+	X86_IRQ_ALLOC_TYPE_MSI,
+	X86_IRQ_ALLOC_TYPE_MSIX,
+};
 
 struct irq_alloc_info {
-	const struct cpumask *mask;	/* CPU mask for vector allocation */
+	const struct cpumask	*mask;	/* CPU mask for vector allocation */
+	enum irq_alloc_type	type;
+	union {
+		int		unused;
+#ifdef	CONFIG_HPET_TIMER
+		struct {
+			int		hpet_id;
+			int		hpet_index;
+			void		*hpet_data;
+		};
+#endif
+#ifdef	CONFIG_PCI_MSI
+		struct {
+			struct pci_dev	*msi_dev;
+			struct msi_desc *msi_desc;
+		};
+#endif
+#ifdef	CONFIG_X86_IO_APIC
+		struct {
+			int		ioapic_id;
+			int		ioapic_pin;
+			u32		ioapic_trigger : 1;
+			u32		ioapic_polarity : 1;
+		};
+#endif
+	};
 };
 
 struct irq_cfg {
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 230dde9b695e..428b4e6d637c 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -30,6 +30,7 @@ struct irq_chip;
 struct msi_msg;
 struct pci_dev;
 struct irq_cfg;
+struct irq_alloc_info;
 
 #ifdef CONFIG_IRQ_REMAP
 
@@ -58,6 +59,32 @@ extern bool setup_remapped_irq(int irq,
 
 void irq_remap_modify_chip_defaults(struct irq_chip *chip);
 
+extern struct irq_domain *irq_remapping_get_ir_irq_domain(
+				struct irq_alloc_info *info);
+extern struct irq_domain *irq_remapping_get_irq_domain(
+				struct irq_alloc_info *info);
+extern int irq_remapping_get_ioapic_entry(struct irq_data *irq_data,
+					  struct IR_IO_APIC_route_entry *entry);
+extern int irq_remapping_get_msi_entry(struct irq_data *irq_data,
+				       struct msi_msg *entry);
+extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
+
+/*
+ * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
+ * parent irqdomain.
+ */
+static inline struct irq_domain *
+arch_create_msi_irq_domain(struct irq_domain *parent)
+{
+	return NULL;
+}
+
+/* Get parent irqdomain for interrupt remapping irqdomain */
+static inline struct irq_domain *arch_get_ir_parent_domain(void)
+{
+	return x86_vector_domain;
+}
+
 #else  /* CONFIG_IRQ_REMAP */
 
 static inline void setup_irq_remapping_ops(void) { }
@@ -101,6 +128,32 @@ static inline bool setup_remapped_irq(int irq,
 {
 	return false;
 }
+
+static inline struct irq_domain *
+irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info)
+{
+	return NULL;
+}
+
+static inline struct irq_domain *
+irq_remapping_get_irq_domain(struct irq_alloc_info *info)
+{
+	return NULL;
+}
+
+static inline int irq_remapping_get_ioapic_entry(struct irq_data *irq_data,
+				struct IR_IO_APIC_route_entry *entry)
+{
+	return -ENOSYS;
+}
+
+static inline int irq_remapping_get_msi_entry(struct irq_data *irq_data,
+					      struct msi_msg *entry)
+{
+	return -ENOSYS;
+}
+
+#define	irq_remapping_print_chip	NULL
 #endif /* CONFIG_IRQ_REMAP */
 
 extern int dmar_alloc_hwirq(void);
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7dd893ee70be..7ac44a464be0 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -370,7 +370,7 @@ void panic_if_irq_remap(const char *msg)
 		panic(msg);
 }
 
-static void ir_ack_apic_edge(struct irq_data *data)
+void ir_ack_apic_edge(struct irq_data *data)
 {
 	ack_APIC_irq();
 }
@@ -381,6 +381,19 @@ static void ir_ack_apic_level(struct irq_data *data)
 	eoi_ioapic_irq(data->irq, irqd_cfg(data));
 }
 
+void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p)
+{
+	/*
+	 * Assume interrupt is remapped if the parent irqdomain isn't the
+	 * vector domain, which is true for MSI, HPET and IOAPIC on x86
+	 * platforms.
+	 */
+	if (data->domain && data->domain->parent != arch_get_ir_parent_domain())
+		seq_printf(p, " IR-%s", data->chip->name);
+	else
+		seq_printf(p, " %s", data->chip->name);
+}
+
 static void ir_print_prefix(struct irq_data *data, struct seq_file *p)
 {
 	seq_printf(p, " IR-%s", data->chip->name);
@@ -402,3 +415,66 @@ bool setup_remapped_irq(int irq, struct irq_cfg *cfg, struct irq_chip *chip)
 	irq_remap_modify_chip_defaults(chip);
 	return true;
 }
+
+/**
+ * irq_remapping_get_ir_irq_domain - Get the irqdomain associated with the IOMMU
+ *				     device serving @info
+ * @info: interrupt allocation information, used to find the IOMMU device
+ *
+ * It's used to get parent irqdomain for HPET and IOAPIC domains.
+ * Returns pointer to IRQ domain, or NULL on failure.
+ */
+struct irq_domain *
+irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info)
+{
+	if (!remap_ops || !remap_ops->get_ir_irq_domain)
+		return NULL;
+
+	return remap_ops->get_ir_irq_domain(info);
+}
+
+/**
+ * irq_remapping_get_irq_domain - Get the irqdomain serving the MSI interrupt
+ * @info: interrupt allocation information, used to find the IOMMU device
+ *
+ * It's used to get irqdomain for MSI/MSIx interrupt allocation.
+ * Returns pointer to IRQ domain, or NULL on failure.
+ */
+struct irq_domain *
+irq_remapping_get_irq_domain(struct irq_alloc_info *info)
+{
+	if (!remap_ops || !remap_ops->get_irq_domain)
+		return NULL;
+
+	return remap_ops->get_irq_domain(info);
+}
+
+/**
+ * irq_remapping_get_ioapic_entry - Get IOAPIC entry content rewritten by
+ *				    interrupt remapping driver
+ * @irq_data: irq_data associated with interrupt remapping irqdomain
+ * @entry: host returned data
+ *
+ * Caller must make sure that the interrupt is remapped.
+ * Return 0 on success, otherwise return error code
+ */
+int irq_remapping_get_ioapic_entry(struct irq_data *irq_data,
+				   struct IR_IO_APIC_route_entry *entry)
+{
+	return remap_ops->get_ioapic_entry(irq_data, entry);
+}
+
+/**
+ * irq_remapping_get_msi_entry - Get MSI data rewritten by interrupt
+ *				 remapping driver
+ * @irq_data: irq_data associated with interrupt remapping irqdomain
+ * @entry: host returned data
+ *
+ * Caller must make sure that the interrupt is remapped.
+ * Return 0 on success, otherwise return error code
+ */
+int irq_remapping_get_msi_entry(struct irq_data *irq_data,
+				struct msi_msg *entry)
+{
+	return remap_ops->get_msi_entry(irq_data, entry);
+}
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 90c4dae5a46b..6e46074f06d0 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -30,6 +30,8 @@ struct irq_data;
 struct cpumask;
 struct pci_dev;
 struct msi_msg;
+struct irq_domain;
+struct irq_alloc_info;
 
 extern int disable_irq_remap;
 extern int irq_remap_broken;
@@ -81,11 +83,26 @@ struct irq_remap_ops {
 
 	/* Setup interrupt remapping for an HPET MSI */
 	int (*setup_hpet_msi)(unsigned int, unsigned int);
+
+	/* Get the irqdomain associated with the IOMMU device */
+	struct irq_domain *(*get_ir_irq_domain)(struct irq_alloc_info *);
+
+	/* Get the MSI irqdomain associated with the IOMMU device */
+	struct irq_domain *(*get_irq_domain)(struct irq_alloc_info *);
+
+	/* Get IOAPIC entry content rewritten by interrupt remapping driver */
+	int (*get_ioapic_entry)(struct irq_data *,
+				struct IR_IO_APIC_route_entry *);
+
+	/*  Get MSI data rewritten by interrupt remapping driver */
+	int (*get_msi_entry)(struct irq_data *, struct msi_msg *);
 };
 
 extern struct irq_remap_ops intel_irq_remap_ops;
 extern struct irq_remap_ops amd_iommu_irq_ops;
 
+extern void ir_ack_apic_edge(struct irq_data *data);
+
 #else  /* CONFIG_IRQ_REMAP */
 
 #define irq_remapping_enabled 0
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 10/21] x86: irq_remapping: Introduce new interfaces to support hierarchy irqdomain
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

Introduce new interfaces for interrupt remapping drivers to support
hierarchy irqdomain:
1) irq_remapping_get_ir_irq_domain(): get irqdomain associated with an
   interrupt remapping unit. IOAPIC/HPET drivers use this interface to
   get parent interrupt remapping irqdomain.
2) irq_remapping_get_irq_domain(): get irqdomain for an IRQ allocation.
   This is mainly used to support MSI irqdomain. We must build one MSI
   irqdomain for each interrupt remapping unit. MSI driver calls this
   interface to get MSI irqdomain associated with an IR irqdomain which
   manages the PCI devices.
3) irq_remapping_get_ioapic_entry(): get IOAPIC entry content rewritten
   by the interrupt remapping driver for remapped IOAPIC interrupt.
4) irq_remapping_get_msi_entry(): get MSI/HPET entry content rewritten
   by the interrupt remapping driver for remapped MSI/HPET interrupt.

Architecture-specific code needs to implement two hooks:
1) arch_get_ir_parent_domain(): get parent irqdomain for IR irqdomain,
   which is x86_vector_domain on x86 platforms.
2) arch_create_msi_irq_domain(): create an MSI irqdomain associated with
   the interrupt remapping unit.

We also add the following callbacks to struct irq_remap_ops:
	struct irq_domain *(*get_ir_irq_domain)(struct irq_alloc_info *);
	struct irq_domain *(*get_irq_domain)(struct irq_alloc_info *);
	int (*get_ioapic_entry)(struct irq_data *,
				struct IR_IO_APIC_route_entry *);
	int (*get_msi_entry)(struct irq_data *, struct msi_msg *);

Once all clients of IR have been converted to the new hierarchy irqdomain
interfaces, we will:
1) Remove set_ioapic_entry, set_affinity, free_irq, compose_msi_msg,
   msi_alloc_irq, msi_setup_irq, setup_hpet_msi from struct irq_remap_ops.
2) Kill setup_ioapic_remapped_entry, free_remapped_irq,
   compose_remapped_msi_msg, setup_hpet_msi_remapped, setup_remapped_irq.
3) Simplify x86_io_apic_ops and x86_msi.

We could achieve a much clearer architecture with all these changes
applied.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h        |   36 +++++++++++++++-
 arch/x86/include/asm/irq_remapping.h |   53 +++++++++++++++++++++++
 drivers/iommu/irq_remapping.c        |   78 +++++++++++++++++++++++++++++++++-
 drivers/iommu/irq_remapping.h        |   17 ++++++++
 4 files changed, 182 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 313ae21a0784..57f81f5a9686 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -112,9 +112,43 @@ struct irq_2_irte {
 
 #ifdef	CONFIG_X86_LOCAL_APIC
 struct irq_data;
+struct pci_dev;
+struct msi_desc;
+
+enum irq_alloc_type {
+	X86_IRQ_ALLOC_TYPE_IOAPIC = 1,
+	X86_IRQ_ALLOC_TYPE_HPET,
+	X86_IRQ_ALLOC_TYPE_MSI,
+	X86_IRQ_ALLOC_TYPE_MSIX,
+};
 
 struct irq_alloc_info {
-	const struct cpumask *mask;	/* CPU mask for vector allocation */
+	const struct cpumask	*mask;	/* CPU mask for vector allocation */
+	enum irq_alloc_type	type;
+	union {
+		int		unused;
+#ifdef	CONFIG_HPET_TIMER
+		struct {
+			int		hpet_id;
+			int		hpet_index;
+			void		*hpet_data;
+		};
+#endif
+#ifdef	CONFIG_PCI_MSI
+		struct {
+			struct pci_dev	*msi_dev;
+			struct msi_desc *msi_desc;
+		};
+#endif
+#ifdef	CONFIG_X86_IO_APIC
+		struct {
+			int		ioapic_id;
+			int		ioapic_pin;
+			u32		ioapic_trigger : 1;
+			u32		ioapic_polarity : 1;
+		};
+#endif
+	};
 };
 
 struct irq_cfg {
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 230dde9b695e..428b4e6d637c 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -30,6 +30,7 @@ struct irq_chip;
 struct msi_msg;
 struct pci_dev;
 struct irq_cfg;
+struct irq_alloc_info;
 
 #ifdef CONFIG_IRQ_REMAP
 
@@ -58,6 +59,32 @@ extern bool setup_remapped_irq(int irq,
 
 void irq_remap_modify_chip_defaults(struct irq_chip *chip);
 
+extern struct irq_domain *irq_remapping_get_ir_irq_domain(
+				struct irq_alloc_info *info);
+extern struct irq_domain *irq_remapping_get_irq_domain(
+				struct irq_alloc_info *info);
+extern int irq_remapping_get_ioapic_entry(struct irq_data *irq_data,
+					  struct IR_IO_APIC_route_entry *entry);
+extern int irq_remapping_get_msi_entry(struct irq_data *irq_data,
+				       struct msi_msg *entry);
+extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
+
+/*
+ * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
+ * parent irqdomain.
+ */
+static inline struct irq_domain *
+arch_create_msi_irq_domain(struct irq_domain *parent)
+{
+	return NULL;
+}
+
+/* Get parent irqdomain for interrupt remapping irqdomain */
+static inline struct irq_domain *arch_get_ir_parent_domain(void)
+{
+	return x86_vector_domain;
+}
+
 #else  /* CONFIG_IRQ_REMAP */
 
 static inline void setup_irq_remapping_ops(void) { }
@@ -101,6 +128,32 @@ static inline bool setup_remapped_irq(int irq,
 {
 	return false;
 }
+
+static inline struct irq_domain *
+irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info)
+{
+	return NULL;
+}
+
+static inline struct irq_domain *
+irq_remapping_get_irq_domain(struct irq_alloc_info *info)
+{
+	return NULL;
+}
+
+static inline int irq_remapping_get_ioapic_entry(struct irq_data *irq_data,
+				struct IR_IO_APIC_route_entry *entry)
+{
+	return -ENOSYS;
+}
+
+static inline int irq_remapping_get_msi_entry(struct irq_data *irq_data,
+					      struct msi_msg *entry)
+{
+	return -ENOSYS;
+}
+
+#define	irq_remapping_print_chip	NULL
 #endif /* CONFIG_IRQ_REMAP */
 
 extern int dmar_alloc_hwirq(void);
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7dd893ee70be..7ac44a464be0 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -370,7 +370,7 @@ void panic_if_irq_remap(const char *msg)
 		panic(msg);
 }
 
-static void ir_ack_apic_edge(struct irq_data *data)
+void ir_ack_apic_edge(struct irq_data *data)
 {
 	ack_APIC_irq();
 }
@@ -381,6 +381,19 @@ static void ir_ack_apic_level(struct irq_data *data)
 	eoi_ioapic_irq(data->irq, irqd_cfg(data));
 }
 
+void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p)
+{
+	/*
+	 * Assume interrupt is remapped if the parent irqdomain isn't the
+	 * vector domain, which is true for MSI, HPET and IOAPIC on x86
+	 * platforms.
+	 */
+	if (data->domain && data->domain->parent != arch_get_ir_parent_domain())
+		seq_printf(p, " IR-%s", data->chip->name);
+	else
+		seq_printf(p, " %s", data->chip->name);
+}
+
 static void ir_print_prefix(struct irq_data *data, struct seq_file *p)
 {
 	seq_printf(p, " IR-%s", data->chip->name);
@@ -402,3 +415,66 @@ bool setup_remapped_irq(int irq, struct irq_cfg *cfg, struct irq_chip *chip)
 	irq_remap_modify_chip_defaults(chip);
 	return true;
 }
+
+/**
+ * irq_remapping_get_ir_irq_domain - Get the irqdomain associated with the IOMMU
+ *				     device serving @info
+ * @info: interrupt allocation information, used to find the IOMMU device
+ *
+ * It's used to get parent irqdomain for HPET and IOAPIC domains.
+ * Returns pointer to IRQ domain, or NULL on failure.
+ */
+struct irq_domain *
+irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info)
+{
+	if (!remap_ops || !remap_ops->get_ir_irq_domain)
+		return NULL;
+
+	return remap_ops->get_ir_irq_domain(info);
+}
+
+/**
+ * irq_remapping_get_irq_domain - Get the irqdomain serving the MSI interrupt
+ * @info: interrupt allocation information, used to find the IOMMU device
+ *
+ * It's used to get irqdomain for MSI/MSIx interrupt allocation.
+ * Returns pointer to IRQ domain, or NULL on failure.
+ */
+struct irq_domain *
+irq_remapping_get_irq_domain(struct irq_alloc_info *info)
+{
+	if (!remap_ops || !remap_ops->get_irq_domain)
+		return NULL;
+
+	return remap_ops->get_irq_domain(info);
+}
+
+/**
+ * irq_remapping_get_ioapic_entry - Get IOAPIC entry content rewritten by
+ *				    interrupt remapping driver
+ * @irq_data: irq_data associated with interrupt remapping irqdomain
+ * @entry: host returned data
+ *
+ * Caller must make sure that the interrupt is remapped.
+ * Return 0 on success, otherwise return error code
+ */
+int irq_remapping_get_ioapic_entry(struct irq_data *irq_data,
+				   struct IR_IO_APIC_route_entry *entry)
+{
+	return remap_ops->get_ioapic_entry(irq_data, entry);
+}
+
+/**
+ * irq_remapping_get_msi_entry - Get MSI data rewritten by interrupt
+ *				 remapping driver
+ * @irq_data: irq_data associated with interrupt remapping irqdomain
+ * @entry: host returned data
+ *
+ * Caller must make sure that the interrupt is remapped.
+ * Return 0 on success, otherwise return error code
+ */
+int irq_remapping_get_msi_entry(struct irq_data *irq_data,
+				struct msi_msg *entry)
+{
+	return remap_ops->get_msi_entry(irq_data, entry);
+}
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 90c4dae5a46b..6e46074f06d0 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -30,6 +30,8 @@ struct irq_data;
 struct cpumask;
 struct pci_dev;
 struct msi_msg;
+struct irq_domain;
+struct irq_alloc_info;
 
 extern int disable_irq_remap;
 extern int irq_remap_broken;
@@ -81,11 +83,26 @@ struct irq_remap_ops {
 
 	/* Setup interrupt remapping for an HPET MSI */
 	int (*setup_hpet_msi)(unsigned int, unsigned int);
+
+	/* Get the irqdomain associated with the IOMMU device */
+	struct irq_domain *(*get_ir_irq_domain)(struct irq_alloc_info *);
+
+	/* Get the MSI irqdomain associated with the IOMMU device */
+	struct irq_domain *(*get_irq_domain)(struct irq_alloc_info *);
+
+	/* Get IOAPIC entry content rewritten by interrupt remapping driver */
+	int (*get_ioapic_entry)(struct irq_data *,
+				struct IR_IO_APIC_route_entry *);
+
+	/*  Get MSI data rewritten by interrupt remapping driver */
+	int (*get_msi_entry)(struct irq_data *, struct msi_msg *);
 };
 
 extern struct irq_remap_ops intel_irq_remap_ops;
 extern struct irq_remap_ops amd_iommu_irq_ops;
 
+extern void ir_ack_apic_edge(struct irq_data *data);
+
 #else  /* CONFIG_IRQ_REMAP */
 
 #define irq_remapping_enabled 0
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 11/21] iommu/vt-d: Change prototypes to prepare for enabling hierarchy irqdomain
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, Jiang Liu, linux-arm-kernel

Prepare for supporting hierarchy irqdomain by changing function prototypes.
There should be no functional changes.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 drivers/iommu/intel_irq_remapping.c |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 319b39edcf7e..0c679369e08a 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -82,10 +82,10 @@ static int get_irte(int irq, struct irte *entry)
 	return 0;
 }
 
-static int alloc_irte(struct intel_iommu *iommu, int irq, u16 count)
+static int alloc_irte(struct intel_iommu *iommu, int irq,
+		      struct irq_2_iommu *irq_iommu, u16 count)
 {
 	struct ir_table *table = iommu->ir_table;
-	struct irq_2_iommu *irq_iommu = irq_2_iommu(irq);
 	struct irq_cfg *cfg = irq_cfg(irq);
 	unsigned int mask = 0;
 	unsigned long flags;
@@ -173,9 +173,9 @@ static int set_irte_irq(int irq, struct intel_iommu *iommu, u16 index, u16 subha
 	return 0;
 }
 
-static int modify_irte(int irq, struct irte *irte_modified)
+static int modify_irte(struct irq_2_iommu *irq_iommu,
+		       struct irte *irte_modified)
 {
-	struct irq_2_iommu *irq_iommu = irq_2_iommu(irq);
 	struct intel_iommu *iommu;
 	unsigned long flags;
 	struct irte *irte;
@@ -242,7 +242,7 @@ static int clear_entries(struct irq_2_iommu *irq_iommu)
 		return 0;
 
 	iommu = irq_iommu->iommu;
-	index = irq_iommu->irte_index + irq_iommu->sub_handle;
+	index = irq_iommu->irte_index;
 
 	start = iommu->ir_table->base + index;
 	end = start + (1 << irq_iommu->irte_mask);
@@ -938,7 +938,7 @@ static int intel_setup_ioapic_entry(int irq,
 		pr_warn("No mapping iommu for ioapic %d\n", ioapic_id);
 		index = -ENODEV;
 	} else {
-		index = alloc_irte(iommu, irq, 1);
+		index = alloc_irte(iommu, irq, irq_2_iommu(irq), 1);
 		if (index < 0) {
 			pr_warn("Failed to allocate IRTE for ioapic %d\n",
 				ioapic_id);
@@ -954,7 +954,7 @@ static int intel_setup_ioapic_entry(int irq,
 	/* Set source-id of interrupt request */
 	set_ioapic_sid(&irte, ioapic_id);
 
-	modify_irte(irq, &irte);
+	modify_irte(irq_2_iommu(irq), &irte);
 
 	apic_printk(APIC_VERBOSE, KERN_DEBUG "IOAPIC[%d]: "
 		"Set IRTE entry (P:%d FPD:%d Dst_Mode:%d "
@@ -1041,7 +1041,7 @@ intel_ioapic_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	 * Atomically updates the IRTE with the new destination, vector
 	 * and flushes the interrupt entry cache.
 	 */
-	modify_irte(irq, &irte);
+	modify_irte(irq_2_iommu(irq), &irte);
 
 	/*
 	 * After this point, all the interrupts will start arriving
@@ -1077,7 +1077,7 @@ static void intel_compose_msi_msg(struct pci_dev *pdev,
 	else
 		set_hpet_sid(&irte, hpet_id);
 
-	modify_irte(irq, &irte);
+	modify_irte(irq_2_iommu(irq), &irte);
 
 	msg->address_hi = MSI_ADDR_BASE_HI;
 	msg->data = sub_handle;
@@ -1104,7 +1104,7 @@ static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
 		       "Unable to map PCI %s to iommu\n", pci_name(dev));
 		index = -ENOENT;
 	} else {
-		index = alloc_irte(iommu, irq, nvec);
+		index = alloc_irte(iommu, irq, irq_2_iommu(irq), nvec);
 		if (index < 0) {
 			printk(KERN_ERR
 			       "Unable to allocate %d IRTE for PCI %s\n",
@@ -1148,7 +1148,7 @@ static int intel_setup_hpet_msi(unsigned int irq, unsigned int id)
 	down_read(&dmar_global_lock);
 	iommu = map_hpet_to_ir(id);
 	if (iommu) {
-		index = alloc_irte(iommu, irq, 1);
+		index = alloc_irte(iommu, irq, irq_2_iommu(irq), 1);
 		if (index >= 0)
 			ret = 0;
 	}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 12/21] iommu/vt-d: Enhance Intel IR driver to support hierarchy irqdomain
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Enhance the Intel interrupt remapping driver to support hierarchy irqdomain,
which will eventually simplify the code. Also implement intel_ir_chip
to support stacked irq_chip.
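
To make the stacked irq_chip idea concrete, here is a small user-space model
of the set_affinity path. It is a sketch under simplifying assumptions, not
the driver code. The affinity is a single CPU number rather than a cpumask.
struct vector_state, struct remap_entry, the toy vector formula and
NR_CPUS_MODEL are all invented for illustration. The child chip reads the
result from the parent's chip_data, whereas the patch reads it from the
shared irq_cfg.

```c
#include <assert.h>
#include <stddef.h>

struct irq_data;

/* Reduced irq_chip: only the callback this sketch needs. */
struct irq_chip {
	int (*irq_set_affinity)(struct irq_data *data, unsigned int cpu);
};

/* Reduced irq_data: one instance per level of the hierarchy. */
struct irq_data {
	struct irq_chip *chip;
	void *chip_data;
	struct irq_data *parent_data;
};

/* Hypothetical per-level state. */
struct vector_state {
	unsigned int cpu;
	unsigned int vector;
};

struct remap_entry {
	unsigned int dest;
	unsigned int vector;
	int writes;		/* how often the entry was rewritten */
};

#define NR_CPUS_MODEL	4

/* Parent level: pick a CPU and a (toy) vector on that CPU. */
static int vector_set_affinity(struct irq_data *data, unsigned int cpu)
{
	struct vector_state *state = data->chip_data;

	if (cpu >= NR_CPUS_MODEL)
		return -1;

	state->cpu = cpu;
	state->vector = 0x20 + cpu;

	return 0;
}

/*
 * Child level: let the parent choose CPU and vector first, then copy the
 * result into the remapping entry. On parent failure the entry is left
 * untouched.
 */
static int remap_set_affinity(struct irq_data *data, unsigned int cpu)
{
	struct irq_data *parent = data->parent_data;
	struct remap_entry *entry = data->chip_data;
	struct vector_state *state;
	int ret;

	assert(parent && parent->chip && parent->chip->irq_set_affinity);

	ret = parent->chip->irq_set_affinity(parent, cpu);
	if (ret < 0)
		return ret;

	state = parent->chip_data;
	entry->vector = state->vector;
	entry->dest = state->cpu;
	entry->writes++;

	return ret;
}

static struct irq_chip vector_chip = { vector_set_affinity };
static struct irq_chip remap_chip = { remap_set_affinity };
```

The point of the ordering is that the remapping level never picks a vector
itself. It only mirrors what the parent decided, so each level of the stack
keeps ownership of its own hardware state.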

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 drivers/iommu/intel_irq_remapping.c |  354 +++++++++++++++++++++++++++++++++--
 include/linux/intel-iommu.h         |    4 +
 2 files changed, 340 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 0c679369e08a..8bac5935e0d5 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -8,6 +8,7 @@
 #include <linux/irq.h>
 #include <linux/intel-iommu.h>
 #include <linux/acpi.h>
+#include <linux/irqdomain.h>
 #include <asm/io_apic.h>
 #include <asm/smp.h>
 #include <asm/cpu.h>
@@ -31,12 +32,22 @@ struct hpet_scope {
 	unsigned int devfn;
 };
 
+struct intel_ir_data {
+	struct irq_2_iommu			irq_2_iommu;
+	struct irte				irte_entry;
+	union {
+		struct msi_msg			msi_entry;
+		struct IR_IO_APIC_route_entry	ioapic_entry;
+	};
+};
+
 #define IR_X2APIC_MODE(mode) (mode ? (1 << 11) : 0)
 #define IRTE_DEST(dest) ((x2apic_mode) ? dest : dest << 8)
 
 static struct ioapic_scope ir_ioapic[MAX_IO_APICS];
 static struct hpet_scope ir_hpet[MAX_HPET_TBS];
 static int ir_ioapic_num, ir_hpet_num;
+static struct irq_domain_ops intel_ir_domain_ops;
 
 /*
  * Lock ordering:
@@ -263,7 +274,7 @@ static int free_irte(int irq)
 	unsigned long flags;
 	int rc;
 
-	if (!irq_iommu)
+	if (!irq_iommu || irq_iommu->iommu == NULL)
 		return -1;
 
 	raw_spin_lock_irqsave(&irq_2_ir_lock, flags);
@@ -481,36 +492,48 @@ static int intel_setup_irq_remapping(struct intel_iommu *iommu, int mode)
 	struct page *pages;
 	unsigned long *bitmap;
 
-	ir_table = iommu->ir_table = kzalloc(sizeof(struct ir_table),
-					     GFP_ATOMIC);
-
-	if (!iommu->ir_table)
+	ir_table = kzalloc(sizeof(struct ir_table), GFP_ATOMIC);
+	if (!ir_table)
 		return -ENOMEM;
 
 	pages = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
 				 INTR_REMAP_PAGE_ORDER);
-
 	if (!pages) {
 		pr_err("IR%d: failed to allocate pages of order %d\n",
 		       iommu->seq_id, INTR_REMAP_PAGE_ORDER);
-		kfree(iommu->ir_table);
-		return -ENOMEM;
+		goto out_free_table;
 	}
 
 	bitmap = kcalloc(BITS_TO_LONGS(INTR_REMAP_TABLE_ENTRIES),
 			 sizeof(long), GFP_ATOMIC);
 	if (bitmap == NULL) {
 		pr_err("IR%d: failed to allocate bitmap\n", iommu->seq_id);
-		__free_pages(pages, INTR_REMAP_PAGE_ORDER);
-		kfree(ir_table);
-		return -ENOMEM;
+		goto out_free_pages;
+	}
+
+	iommu->ir_domain = irq_domain_add_linear(NULL, INTR_REMAP_TABLE_ENTRIES,
+						 &intel_ir_domain_ops, iommu);
+	if (!iommu->ir_domain) {
+		pr_err("IR%d: failed to allocate irqdomain\n", iommu->seq_id);
+		goto out_free_bitmap;
 	}
+	iommu->ir_domain->parent = arch_get_ir_parent_domain();
+	iommu->ir_msi_domain = arch_create_msi_irq_domain(iommu->ir_domain);
 
 	ir_table->base = page_address(pages);
 	ir_table->bitmap = bitmap;
-
+	iommu->ir_table = ir_table;
 	iommu_set_irq_remapping(iommu, mode);
+
 	return 0;
+
+out_free_bitmap:
+	kfree(bitmap);
+out_free_pages:
+	__free_pages(pages, INTR_REMAP_PAGE_ORDER);
+out_free_table:
+	kfree(ir_table);
+	return -ENOMEM;
 }
 
 /*
@@ -1014,12 +1037,6 @@ intel_ioapic_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	struct irte irte;
 	int err;
 
-	if (!config_enabled(CONFIG_SMP))
-		return -EINVAL;
-
-	if (!cpumask_intersects(mask, cpu_online_mask))
-		return -EINVAL;
-
 	if (get_irte(irq, &irte))
 		return -EBUSY;
 
@@ -1157,6 +1174,69 @@ static int intel_setup_hpet_msi(unsigned int irq, unsigned int id)
 	return ret;
 }
 
+static struct irq_domain *intel_get_ir_irq_domain(struct irq_alloc_info *info)
+{
+	struct intel_iommu *iommu = NULL;
+
+	if (!info)
+		return NULL;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_IOAPIC:
+		iommu = map_ioapic_to_ir(info->ioapic_id);
+		break;
+	case X86_IRQ_ALLOC_TYPE_HPET:
+		iommu = map_hpet_to_ir(info->hpet_id);
+		break;
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		iommu = map_dev_to_ir(info->msi_dev);
+		break;
+	}
+
+	return iommu ? iommu->ir_domain : NULL;
+}
+
+static struct irq_domain *intel_get_irq_domain(struct irq_alloc_info *info)
+{
+	struct intel_iommu *iommu;
+
+	if (!info)
+		return NULL;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		iommu = map_dev_to_ir(info->msi_dev);
+		if (iommu)
+			return iommu->ir_msi_domain;
+		break;
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+static int intel_get_ioapic_entry(struct irq_data *irq_data,
+				  struct IR_IO_APIC_route_entry *entry)
+{
+	struct intel_ir_data *ir_data = irq_data->chip_data;
+
+	*entry = ir_data->ioapic_entry;
+
+	return 0;
+}
+
+static int intel_get_msi_entry(struct irq_data *irq_data, struct msi_msg *msg)
+{
+	struct intel_ir_data *ir_data = irq_data->chip_data;
+
+	*msg = ir_data->msi_entry;
+
+	return 0;
+}
+
 struct irq_remap_ops intel_irq_remap_ops = {
 	.supported		= intel_irq_remapping_supported,
 	.prepare		= dmar_table_init,
@@ -1171,4 +1251,242 @@ struct irq_remap_ops intel_irq_remap_ops = {
 	.msi_alloc_irq		= intel_msi_alloc_irq,
 	.msi_setup_irq		= intel_msi_setup_irq,
 	.setup_hpet_msi		= intel_setup_hpet_msi,
+	.get_ir_irq_domain	= intel_get_ir_irq_domain,
+	.get_irq_domain		= intel_get_irq_domain,
+	.get_ioapic_entry	= intel_get_ioapic_entry,
+	.get_msi_entry		= intel_get_msi_entry,
+};
+
+/*
+ * Migrate the IO-APIC irq in the presence of intr-remapping.
+ *
+ * For both level and edge triggered, irq migration is a simple atomic
+ * update(of vector and cpu destination) of IRTE and flush the hardware cache.
+ *
+ * For level triggered, we eliminate the io-apic RTE modification (with the
+ * updated vector information), by using a virtual vector (io-apic pin number).
+ * Real vector that is used for interrupting cpu will be coming from
+ * the interrupt-remapping table entry.
+ *
+ * As the migration is a simple atomic update of IRTE, the same mechanism
+ * is used to migrate MSI irq's in the presence of interrupt-remapping.
+ */
+static int
+intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask,
+		      bool force)
+{
+	struct intel_ir_data *ir_data = data->chip_data;
+	struct irte *irte = &ir_data->irte_entry;
+	struct irq_cfg *cfg = irqd_cfg(data);
+	struct irq_data *parent = data->parent_data;
+	int ret;
+
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Atomically updates the IRTE with the new destination, vector
+	 * and flushes the interrupt entry cache.
+	 */
+	irte->vector = cfg->vector;
+	irte->dest_id = IRTE_DEST(cfg->dest_apicid);
+	modify_irte(&ir_data->irq_2_iommu, irte);
+
+	/*
+	 * After this point, all the interrupts will start arriving
+	 * at the new destination. So, time to cleanup the previous
+	 * vector allocation.
+	 */
+	if (cfg->move_in_progress)
+		send_cleanup_vector(cfg);
+
+	return ret;
+}
+
+static struct irq_chip intel_ir_chip = {
+	.irq_ack = ir_ack_apic_edge,
+	.irq_set_affinity = intel_ir_set_affinity,
+};
+
+static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data,
+					     struct irq_cfg *irq_cfg,
+					     struct irq_alloc_info *info,
+					     int index, int sub_handle)
+{
+	struct irte *irte = &data->irte_entry;
+	struct IR_IO_APIC_route_entry *entry = &data->ioapic_entry;
+	struct msi_msg *msg = &data->msi_entry;
+
+	prepare_irte(irte, irq_cfg->vector, irq_cfg->dest_apicid);
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_IOAPIC:
+		/* Set source-id of interrupt request */
+		set_ioapic_sid(irte, info->ioapic_id);
+		apic_printk(APIC_VERBOSE, KERN_DEBUG "IOAPIC[%d]: Set IRTE entry (P:%d FPD:%d Dst_Mode:%d Redir_hint:%d Trig_Mode:%d Dlvry_Mode:%X Avail:%X Vector:%02X Dest:%08X SID:%04X SQ:%X SVT:%X)\n",
+			info->ioapic_id, irte->present, irte->fpd,
+			irte->dst_mode, irte->redir_hint,
+			irte->trigger_mode, irte->dlvry_mode,
+			irte->avail, irte->vector, irte->dest_id,
+			irte->sid, irte->sq, irte->svt);
+
+		memset(entry, 0, sizeof(*entry));
+		entry->index2	= (index >> 15) & 0x1;
+		entry->zero	= 0;
+		entry->format	= 1;
+		entry->index	= (index & 0x7fff);
+		/*
+		 * IO-APIC RTE will be configured with virtual vector.
+		 * irq handler will do the explicit EOI to the io-apic.
+		 */
+		entry->vector	= info->ioapic_pin;
+		entry->mask	= 0;			/* enable IRQ */
+		entry->trigger	= info->ioapic_trigger;
+		entry->polarity	= info->ioapic_polarity;
+		if (info->ioapic_trigger)
+			entry->mask = 1; /* Mask level triggered irqs. */
+		break;
+
+	case X86_IRQ_ALLOC_TYPE_HPET:
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		if (info->type == X86_IRQ_ALLOC_TYPE_HPET)
+			set_hpet_sid(irte, info->hpet_id);
+		else
+			set_msi_sid(irte, info->msi_dev);
+
+		msg->address_hi = MSI_ADDR_BASE_HI;
+		msg->data = sub_handle;
+		msg->address_lo = MSI_ADDR_BASE_LO | MSI_ADDR_IR_EXT_INT |
+				  MSI_ADDR_IR_SHV |
+				  MSI_ADDR_IR_INDEX1(index) |
+				  MSI_ADDR_IR_INDEX2(index);
+		break;
+
+	default:
+		BUG_ON(1);
+		break;
+	}
+}
+
+static void intel_free_irq_resources(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs)
+{
+	struct irq_data *irq_data;
+	struct intel_ir_data *data;
+	struct irq_2_iommu *irq_iommu;
+	unsigned long flags;
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(domain, virq  + i);
+		if (irq_data && irq_data->chip_data) {
+			data = irq_data->chip_data;
+			irq_iommu = &data->irq_2_iommu;
+			raw_spin_lock_irqsave(&irq_2_ir_lock, flags);
+			clear_entries(irq_iommu);
+			raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);
+			irq_domain_reset_irq_data(irq_data);
+			kfree(data);
+		}
+	}
+}
+
+static int intel_irq_remapping_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *arg)
+{
+	struct intel_iommu *iommu = domain->host_data;
+	struct irq_alloc_info *info = arg;
+	struct intel_ir_data *data;
+	struct irq_data *irq_data;
+	struct irq_cfg *irq_cfg;
+	int i, ret, index;
+
+	if (!info || !iommu)
+		return -EINVAL;
+	if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_MSI &&
+	    info->type != X86_IRQ_ALLOC_TYPE_MSIX)
+		return -EINVAL;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+	if (ret < 0)
+		return ret;
+
+	ret = -ENOMEM;
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		goto out_free_parent;
+
+	down_read(&dmar_global_lock);
+	index = alloc_irte(iommu, virq, &data->irq_2_iommu, nr_irqs);
+	up_read(&dmar_global_lock);
+	if (index < 0) {
+		pr_warn("Failed to allocate IRTE\n");
+		kfree(data);
+		goto out_free_parent;
+	}
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		irq_cfg = irqd_cfg(irq_data);
+		if (!irq_data || !irq_cfg) {
+			ret = -EINVAL;
+			goto out_free_data;
+		}
+
+		if (i > 0) {
+			data = kzalloc(sizeof(*data), GFP_KERNEL);
+			if (!data)
+				goto out_free_data;
+		}
+		irq_data->hwirq = (index << 16) + i;
+		irq_data->chip_data = data;
+		irq_data->chip = &intel_ir_chip;
+		intel_irq_remapping_prepare_irte(data, irq_cfg, info, index, i);
+		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
+	}
+	return 0;
+
+out_free_data:
+	intel_free_irq_resources(domain, virq, i);
+out_free_parent:
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+	return ret;
+}
+
+static void intel_irq_remapping_free(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs)
+{
+	intel_free_irq_resources(domain, virq, nr_irqs);
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int intel_irq_remapping_activate(struct irq_domain *domain,
+					struct irq_data *irq_data)
+{
+	struct intel_ir_data *data = irq_data->chip_data;
+
+	modify_irte(&data->irq_2_iommu, &data->irte_entry);
+
+	return 0;
+}
+
+static int intel_irq_remapping_deactivate(struct irq_domain *domain,
+					  struct irq_data *irq_data)
+{
+	struct intel_ir_data *data = irq_data->chip_data;
+	struct irte entry;
+
+	memset(&entry, 0, sizeof(entry));
+	modify_irte(&data->irq_2_iommu, &entry);
+
+	return 0;
+}
+
+static struct irq_domain_ops intel_ir_domain_ops = {
+	.alloc = intel_irq_remapping_alloc,
+	.free = intel_irq_remapping_free,
+	.activate = intel_irq_remapping_activate,
+	.deactivate = intel_irq_remapping_deactivate,
 };
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index a65208a8fe18..ecaf3a937845 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -286,6 +286,8 @@ struct q_inval {
 
 #define INTR_REMAP_TABLE_ENTRIES	65536
 
+struct irq_domain;
+
 struct ir_table {
 	struct irte *base;
 	unsigned long *bitmap;
@@ -335,6 +337,8 @@ struct intel_iommu {
 
 #ifdef CONFIG_IRQ_REMAP
 	struct ir_table *ir_table;	/* Interrupt remapping info */
+	struct irq_domain *ir_domain;
+	struct irq_domain *ir_msi_domain;
 #endif
 	struct device	*iommu_dev; /* IOMMU-sysfs device */
 	int		node;
-- 
1.7.10.4
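
In the IO-APIC branch of intel_irq_remapping_prepare_irte() above, the IRTE index is split across two RTE fields (index holds the low 15 bits, index2 holds bit 15), and intel_irq_remapping_alloc() packs the index and the sub-handle into hwirq as (index << 16) + i. The following is a minimal user-space sketch of that arithmetic only. struct rte_index and the helper names are hypothetical stand-ins and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors INTR_REMAP_TABLE_ENTRIES from include/linux/intel-iommu.h. */
#define IRTE_TABLE_ENTRIES 65536u

/* Hypothetical stand-in for the two index fields of a remappable RTE. */
struct rte_index {
	uint16_t index;		/* low 15 bits of the IRTE index */
	uint8_t index2;		/* bit 15 of the IRTE index */
};

/* Split an IRTE index the same way the patch fills entry->index/index2. */
static struct rte_index split_irte_index(unsigned int index)
{
	struct rte_index r;

	assert(index < IRTE_TABLE_ENTRIES);
	r.index = index & 0x7fff;
	r.index2 = (index >> 15) & 0x1;
	return r;
}

/* Inverse of split_irte_index(): rebuild the 16-bit table index. */
static unsigned int join_irte_index(struct rte_index r)
{
	return ((unsigned int)r.index2 << 15) | r.index;
}

/* hwirq layout used by the alloc callback: (index << 16) + sub_handle. */
static unsigned long make_hwirq(unsigned int index, unsigned int sub_handle)
{
	return ((unsigned long)index << 16) + sub_handle;
}
```

Under these assumptions, splitting and re-joining any index below 65536 returns the original value, and the sub-handle occupies the low 16 bits of hwirq.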


* [RFC Part2 v1 13/21] iommu/amd: Enhance AMD IR driver to support hierarchy irqdomain
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Enhance the AMD interrupt remapping driver to support hierarchy irqdomain,
which will eventually simplify the code.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 drivers/iommu/amd_iommu.c       |  338 ++++++++++++++++++++++++++++++++++++++-
 drivers/iommu/amd_iommu_init.c  |    4 +
 drivers/iommu/amd_iommu_proto.h |    9 ++
 drivers/iommu/amd_iommu_types.h |    5 +
 4 files changed, 350 insertions(+), 6 deletions(-)
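
The alloc_irq_index() hunks in this patch show only the tail of the search loop: once count consecutive free slots have been seen, they are marked IRTE_ALLOCATED and the index of the first slot is recorded in irte_info. A minimal user-space sketch of that contiguous-run search follows. It assumes the part of the loop not visible in the hunk simply counts consecutive zero entries, uses a plain array in place of the per-device remap table, and omits all locking. alloc_run, ENTRY_FREE and ENTRY_ALLOCATED are hypothetical names.

```c
#include <assert.h>

#define ENTRY_FREE	0u
#define ENTRY_ALLOCATED	1u	/* stand-in for IRTE_ALLOCATED */

/*
 * Find 'count' consecutive free entries in 'table', mark them allocated
 * and return the index of the first one, or -1 if no such run exists.
 */
static int alloc_run(unsigned int *table, int size, int count)
{
	int index, c = 0;

	assert(table != 0);
	if (count <= 0 || count > size)
		return -1;

	for (index = 0; index < size; index++) {
		if (table[index] == ENTRY_FREE)
			c++;
		else
			c = 0;	/* run broken, start counting again */

		if (c == count) {
			/* Mark the run ending at 'index', as the patch does. */
			for (; c != 0; --c)
				table[index - c + 1] = ENTRY_ALLOCATED;
			return index - (count - 1);
		}
	}

	return -1;
}
```

The returned value corresponds to what the patch stores in irte_info->index; each interrupt of a multi-MSI block then uses index + sub_handle.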

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e0def6249284..71ab03949599 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -33,6 +33,7 @@
 #include <linux/export.h>
 #include <linux/irq.h>
 #include <linux/msi.h>
+#include <linux/irqdomain.h>
 #include <asm/irq_remapping.h>
 #include <asm/io_apic.h>
 #include <asm/apic.h>
@@ -3835,6 +3836,17 @@ union irte {
 	} fields;
 };
 
+struct amd_ir_data {
+	struct irq_2_irte			irq_2_irte;
+	union irte				irte_entry;
+	union {
+		struct msi_msg			msi_entry;
+		struct IR_IO_APIC_route_entry	ioapic_entry;
+	};
+};
+
+static struct irq_chip amd_ir_chip;
+
 #define DTE_IRQ_PHYS_ADDR_MASK	(((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL    (2ULL << 60)
 #define DTE_IRQ_TABLE_LEN       (8ULL << 1)
@@ -3928,7 +3940,8 @@ out_unlock:
 	return table;
 }
 
-static int alloc_irq_index(struct irq_cfg *cfg, u16 devid, int count)
+static int alloc_irq_index(struct irq_cfg *cfg, struct irq_2_irte *irte_info,
+			   u16 devid, int count)
 {
 	struct irq_remap_table *table;
 	unsigned long flags;
@@ -3950,15 +3963,12 @@ static int alloc_irq_index(struct irq_cfg *cfg, u16 devid, int count)
 			c = 0;
 
 		if (c == count)	{
-			struct irq_2_irte *irte_info;
-
 			for (; c != 0; --c)
 				table->table[index - c + 1] = IRTE_ALLOCATED;
 
 			index -= count - 1;
 
 			cfg->remapped	      = 1;
-			irte_info             = &cfg->irq_2_irte;
 			irte_info->devid      = devid;
 			irte_info->index      = index;
 
@@ -4203,7 +4213,7 @@ static int msi_alloc_irq(struct pci_dev *pdev, int irq, int nvec)
 		return -EINVAL;
 
 	devid = get_device_id(&pdev->dev);
-	index = alloc_irq_index(cfg, devid, nvec);
+	index = alloc_irq_index(cfg, &cfg->irq_2_irte, devid, nvec);
 
 	return index < 0 ? MAX_IRQS_PER_TABLE : index;
 }
@@ -4250,7 +4260,7 @@ static int setup_hpet_msi(unsigned int irq, unsigned int id)
 	if (devid < 0)
 		return devid;
 
-	index = alloc_irq_index(cfg, devid, 1);
+	index = alloc_irq_index(cfg, &cfg->irq_2_irte, devid, 1);
 	if (index < 0)
 		return index;
 
@@ -4261,6 +4271,88 @@ static int setup_hpet_msi(unsigned int irq, unsigned int id)
 	return 0;
 }
 
+static int get_devid(struct irq_alloc_info *info)
+{
+	int devid = -1;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_IOAPIC:
+		devid     = get_ioapic_devid(info->ioapic_id);
+		break;
+	case X86_IRQ_ALLOC_TYPE_HPET:
+		devid     = get_hpet_devid(info->hpet_id);
+		break;
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		devid = get_device_id(&info->msi_dev->dev);
+		break;
+	}
+
+	return devid;
+}
+
+static struct irq_domain *get_ir_irq_domain(struct irq_alloc_info *info)
+{
+	int devid;
+	struct amd_iommu *iommu;
+
+	if (!info)
+		return NULL;
+
+	devid = get_devid(info);
+	if (devid >= 0) {
+		iommu = amd_iommu_rlookup_table[devid];
+		if (iommu)
+			return iommu->ir_domain;
+	}
+
+	return NULL;
+}
+
+static struct irq_domain *get_irq_domain(struct irq_alloc_info *info)
+{
+	int devid;
+	struct amd_iommu *iommu;
+
+	if (!info)
+		return NULL;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		devid = get_device_id(&info->msi_dev->dev);
+		if (devid >= 0) {
+			iommu = amd_iommu_rlookup_table[devid];
+			if (iommu)
+				return iommu->msi_domain;
+		}
+		break;
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+static int get_ioapic_entry(struct irq_data *irq_data,
+			    struct IR_IO_APIC_route_entry *entry)
+{
+	struct amd_ir_data *ir_data = irq_data->chip_data;
+
+	*entry = ir_data->ioapic_entry;
+
+	return 0;
+}
+
+static int get_msi_entry(struct irq_data *irq_data, struct msi_msg *msg)
+{
+	struct amd_ir_data *ir_data = irq_data->chip_data;
+
+	*msg = ir_data->msi_entry;
+
+	return 0;
+}
+
 struct irq_remap_ops amd_iommu_irq_ops = {
 	.supported		= amd_iommu_supported,
 	.prepare		= amd_iommu_prepare,
@@ -4275,5 +4367,239 @@ struct irq_remap_ops amd_iommu_irq_ops = {
 	.msi_alloc_irq		= msi_alloc_irq,
 	.msi_setup_irq		= msi_setup_irq,
 	.setup_hpet_msi		= setup_hpet_msi,
+	.get_ir_irq_domain	= get_ir_irq_domain,
+	.get_irq_domain		= get_irq_domain,
+	.get_ioapic_entry	= get_ioapic_entry,
+	.get_msi_entry		= get_msi_entry,
 };
+
+static void irq_remapping_prepare_irte(struct amd_ir_data *data,
+				       struct irq_cfg *irq_cfg,
+				       struct irq_alloc_info *info,
+				       int devid, int index, int sub_handle)
+{
+	union irte *irte = &data->irte_entry;
+	struct irq_2_irte *irte_info = &data->irq_2_irte;
+	struct IR_IO_APIC_route_entry *entry = &data->ioapic_entry;
+	struct msi_msg *msg = &data->msi_entry;
+
+	irq_cfg->remapped = 1;
+	data->irq_2_irte.devid = devid;
+	data->irq_2_irte.index = index + sub_handle;
+
+	/* Setup IRTE for IOMMU */
+	irte->val = 0;
+	irte->fields.vector      = irq_cfg->vector;
+	irte->fields.int_type    = apic->irq_delivery_mode;
+	irte->fields.destination = irq_cfg->dest_apicid;
+	irte->fields.dm          = apic->irq_dest_mode;
+	irte->fields.valid       = 1;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_IOAPIC:
+		/* Setup IOAPIC entry */
+		memset(entry, 0, sizeof(*entry));
+		entry->vector        = index;
+		entry->mask          = 0;
+		entry->trigger       = info->ioapic_trigger;
+		entry->polarity      = info->ioapic_polarity;
+		/* Mask level triggered irqs. */
+		if (info->ioapic_trigger)
+			entry->mask = 1;
+		break;
+
+	case X86_IRQ_ALLOC_TYPE_HPET:
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		msg->address_hi = MSI_ADDR_BASE_HI;
+		msg->address_lo = MSI_ADDR_BASE_LO;
+		msg->data = irte_info->index;
+		break;
+
+	default:
+		BUG_ON(1);
+		break;
+	}
+}
+
+static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
+			       unsigned int nr_irqs, void *arg)
+{
+	struct irq_alloc_info *info = arg;
+	struct amd_ir_data *data;
+	struct irq_data *irq_data;
+	struct irq_cfg *cfg;
+	int i, ret, devid;
+	int index = -1;
+
+	if (!info)
+		return -EINVAL;
+	if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_MSI &&
+	    info->type != X86_IRQ_ALLOC_TYPE_MSIX)
+		return -EINVAL;
+
+	devid = get_devid(info);
+	if (devid < 0)
+		return -EINVAL;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+	if (ret < 0)
+		return ret;
+
+	ret = -ENOMEM;
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		goto out_free_parent;
+
+	if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
+		if (get_irq_table(devid, true))
+			index = info->ioapic_pin;
+		else
+			ret = -ENOMEM;
+	} else {
+		cfg = irq_cfg(virq);
+		index = alloc_irq_index(cfg, &data->irq_2_irte, devid, nr_irqs);
+	}
+	if (index < 0) {
+		pr_warn("Failed to allocate IRTE\n");
+		kfree(data);
+		goto out_free_parent;
+	}
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		cfg = irqd_cfg(irq_data);
+		if (!irq_data || !cfg) {
+			ret = -EINVAL;
+			goto out_free_data;
+		}
+
+		if (i > 0) {
+			data = kzalloc(sizeof(*data), GFP_KERNEL);
+			if (!data)
+				goto out_free_data;
+		}
+		irq_data->hwirq = (devid << 16) + i;
+		irq_data->chip_data = data;
+		irq_data->chip = &amd_ir_chip;
+		irq_remapping_prepare_irte(data, cfg, info, devid, index, i);
+		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
+	}
+	return 0;
+
+out_free_data:
+	for (i--; i >= 0; i--) {
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		if (irq_data->chip_data) {
+			kfree(irq_data->chip_data);
+			irq_domain_reset_irq_data(irq_data);
+		}
+	}
+	for (i = 0; i < nr_irqs; i++)
+		free_irte(devid, index + i);
+out_free_parent:
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+	return ret;
+}
+
+static void irq_remapping_free(struct irq_domain *domain, unsigned int virq,
+			       unsigned int nr_irqs)
+{
+	struct irq_data *irq_data;
+	struct amd_ir_data *data;
+	struct irq_2_irte *irte_info;
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		if (irq_data && irq_data->chip_data) {
+			data = irq_data->chip_data;
+			irte_info = &data->irq_2_irte;
+			free_irte(irte_info->devid, irte_info->index);
+			irq_domain_reset_irq_data(irq_data);
+			kfree(data);
+		}
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int irq_remapping_activate(struct irq_domain *domain,
+				  struct irq_data *irq_data)
+{
+	struct amd_ir_data *data = irq_data->chip_data;
+	struct irq_2_irte *irte_info = &data->irq_2_irte;
+
+	modify_irte(irte_info->devid, irte_info->index, data->irte_entry);
+
+	return 0;
+}
+
+static int irq_remapping_deactivate(struct irq_domain *domain,
+				    struct irq_data *irq_data)
+{
+	struct amd_ir_data *data = irq_data->chip_data;
+	struct irq_2_irte *irte_info = &data->irq_2_irte;
+	union irte entry;
+
+	entry.val = 0;
+	modify_irte(irte_info->devid, irte_info->index, entry);
+
+	return 0;
+}
+
+static struct irq_domain_ops amd_ir_domain_ops = {
+	.alloc = irq_remapping_alloc,
+	.free = irq_remapping_free,
+	.activate = irq_remapping_activate,
+	.deactivate = irq_remapping_deactivate,
+};
+
+static int amd_ir_set_affinity(struct irq_data *data,
+			       const struct cpumask *mask, bool force)
+{
+	struct amd_ir_data *ir_data = data->chip_data;
+	struct irq_2_irte *irte_info = &ir_data->irq_2_irte;
+	struct irq_cfg *cfg = irqd_cfg(data);
+	struct irq_data *parent = data->parent_data;
+	int ret;
+
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Atomically updates the IRTE with the new destination, vector
+	 * and flushes the interrupt entry cache.
+	 */
+	ir_data->irte_entry.fields.vector = cfg->vector;
+	ir_data->irte_entry.fields.destination = cfg->dest_apicid;
+	modify_irte(irte_info->devid, irte_info->index, ir_data->irte_entry);
+
+	/*
+	 * After this point, all the interrupts will start arriving
+	 * at the new destination. So, time to cleanup the previous
+	 * vector allocation.
+	 */
+	if (cfg->move_in_progress)
+		send_cleanup_vector(cfg);
+
+	return ret;
+}
+
+static struct irq_chip amd_ir_chip = {
+	.irq_ack = ir_ack_apic_edge,
+	.irq_set_affinity = amd_ir_set_affinity,
+};
+
+int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
+{
+	iommu->ir_domain = irq_domain_add_tree(NULL, &amd_ir_domain_ops, iommu);
+	if (!iommu->ir_domain)
+		return -ENOMEM;
+
+	iommu->ir_domain->parent = arch_get_ir_parent_domain();
+	iommu->msi_domain = arch_create_msi_irq_domain(iommu->ir_domain);
+
+	return 0;
+}
 #endif
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 3783e0b44df6..e9b3b91d45ff 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1115,6 +1115,10 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 	if (ret)
 		return ret;
 
+	ret = amd_iommu_create_irq_domain(iommu);
+	if (ret)
+		return ret;
+
 	/*
 	 * Make sure IOMMU is not considered to translate itself. The IVRS
 	 * table tells us so, but this is a lie!
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index 95ed6deae47f..612a22192fa0 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -63,6 +63,15 @@ extern u8 amd_iommu_pc_get_max_counters(u16 devid);
 extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
 				    u64 *value, bool is_write);
 
+#ifdef CONFIG_IRQ_REMAP
+extern int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
+#else
+static inline int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
+{
+	return 0;
+}
+#endif
+
 #define PPR_SUCCESS			0x0
 #define PPR_INVALID			0x1
 #define PPR_FAILURE			0xf
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 8e43b7cba133..ccb84d7491ed 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -392,6 +392,7 @@ struct amd_iommu_fault {
 
 
 struct iommu_domain;
+struct irq_domain;
 
 /*
  * This structure contains generic data for  IOMMU protection domains
@@ -595,6 +596,10 @@ struct amd_iommu {
 	/* The maximum PC banks and counters/bank (PCSup=1) */
 	u8 max_banks;
 	u8 max_counters;
+#ifdef CONFIG_IRQ_REMAP
+	struct irq_domain *ir_domain;
+	struct irq_domain *msi_domain;
+#endif
 };
 
 struct devid_map {
-- 
1.7.10.4
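
amd_ir_set_affinity() above follows the stacked irq_chip pattern: it first asks the parent chip to choose the new vector and destination, and only then copies the result into its own remapping entry. The sketch below models that delegation in user space. The structs reuse kernel field names for readability but are simplified, hypothetical stand-ins (the real irq_set_affinity callback takes a cpumask and a force flag), and the vector numbering in vector_set_affinity() is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct irq_data;

/* Simplified chip: one callback, target CPU passed as a plain number. */
struct irq_chip {
	int (*irq_set_affinity)(struct irq_data *d, unsigned int cpu);
};

struct irq_data {
	struct irq_chip *chip;
	struct irq_data *parent_data;	/* next level up in the hierarchy */
	void *chip_data;
};

struct vector_cfg { unsigned int vector, dest; };	/* parent state */
struct remap_entry { unsigned int vector, dest; int valid; };	/* child state */

/* Root level: pick a vector for the requested CPU (made-up numbering). */
static int vector_set_affinity(struct irq_data *d, unsigned int cpu)
{
	struct vector_cfg *cfg = d->chip_data;

	if (cpu >= 64)
		return -1;
	cfg->dest = cpu;
	cfg->vector = 0x20 + cpu;
	return 0;
}

/* Remapping level: delegate to the parent, then refresh the own entry. */
static int remap_set_affinity(struct irq_data *d, unsigned int cpu)
{
	struct irq_data *parent = d->parent_data;
	struct remap_entry *e = d->chip_data;
	struct vector_cfg *cfg;
	int ret;

	assert(parent != NULL && parent->chip != NULL);
	ret = parent->chip->irq_set_affinity(parent, cpu);
	if (ret < 0)
		return ret;	/* entry left untouched on failure */

	cfg = parent->chip_data;
	e->vector = cfg->vector;
	e->dest = cfg->dest;
	e->valid = 1;
	return ret;
}

static struct irq_chip vector_chip = { .irq_set_affinity = vector_set_affinity };
static struct irq_chip remap_chip = { .irq_set_affinity = remap_set_affinity };
```

The design point this illustrates is that the remapping level never computes a vector itself; it only mirrors what the parent level decided, which is why a parent failure leaves the remapping entry unchanged.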


* [RFC Part2 v1 13/21] iommu/amd: Enhance AMD IR driver to support hierarchy irqdomain
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

Enhance the AMD interrupt remapping driver to support hierarchy irqdomain,
which will eventually simplify the code.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 drivers/iommu/amd_iommu.c       |  338 ++++++++++++++++++++++++++++++++++++++-
 drivers/iommu/amd_iommu_init.c  |    4 +
 drivers/iommu/amd_iommu_proto.h |    9 ++
 drivers/iommu/amd_iommu_types.h |    5 +
 4 files changed, 350 insertions(+), 6 deletions(-)
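
The out_free_data path of irq_remapping_alloc() in this patch walks back over the entries that were already set up, using for (i--; i >= 0; i--), before the parent allocation is released. A minimal user-space sketch of that unwind pattern is given here. It assumes a simple fault-injection knob in place of a real allocation failure; alloc_slot, free_slot, setup_slots, fail_at and live_allocs are all hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

static int fail_at = -1;	/* hypothetical knob: slot whose allocation fails */
static int live_allocs;		/* number of slots currently allocated */

static void *alloc_slot(int i)
{
	void *p;

	if (i == fail_at)
		return NULL;	/* simulated allocation failure */
	p = calloc(1, 32);
	if (p)
		live_allocs++;
	return p;
}

static void free_slot(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Set up 'nr' slots; on failure, undo only what was already set up. */
static int setup_slots(void **slots, int nr)
{
	int i;

	assert(slots != NULL);
	for (i = 0; i < nr; i++) {
		slots[i] = alloc_slot(i);
		if (!slots[i])
			goto out_free;
	}
	return 0;

out_free:
	/* 'i' is the slot that failed; step back over the completed ones. */
	for (i--; i >= 0; i--) {
		free_slot(slots[i]);
		slots[i] = NULL;
	}
	return -1;
}
```

Starting the unwind at i - 1 matters because the failing slot itself holds nothing to release.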

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e0def6249284..71ab03949599 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -33,6 +33,7 @@
 #include <linux/export.h>
 #include <linux/irq.h>
 #include <linux/msi.h>
+#include <linux/irqdomain.h>
 #include <asm/irq_remapping.h>
 #include <asm/io_apic.h>
 #include <asm/apic.h>
@@ -3835,6 +3836,17 @@ union irte {
 	} fields;
 };
 
+struct amd_ir_data {
+	struct irq_2_irte			irq_2_irte;
+	union irte				irte_entry;
+	union {
+		struct msi_msg			msi_entry;
+		struct IR_IO_APIC_route_entry	ioapic_entry;
+	};
+};
+
+static struct irq_chip amd_ir_chip;
+
 #define DTE_IRQ_PHYS_ADDR_MASK	(((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL    (2ULL << 60)
 #define DTE_IRQ_TABLE_LEN       (8ULL << 1)
@@ -3928,7 +3940,8 @@ out_unlock:
 	return table;
 }
 
-static int alloc_irq_index(struct irq_cfg *cfg, u16 devid, int count)
+static int alloc_irq_index(struct irq_cfg *cfg, struct irq_2_irte *irte_info,
+			   u16 devid, int count)
 {
 	struct irq_remap_table *table;
 	unsigned long flags;
@@ -3950,15 +3963,12 @@ static int alloc_irq_index(struct irq_cfg *cfg, u16 devid, int count)
 			c = 0;
 
 		if (c == count)	{
-			struct irq_2_irte *irte_info;
-
 			for (; c != 0; --c)
 				table->table[index - c + 1] = IRTE_ALLOCATED;
 
 			index -= count - 1;
 
 			cfg->remapped	      = 1;
-			irte_info             = &cfg->irq_2_irte;
 			irte_info->devid      = devid;
 			irte_info->index      = index;
 
@@ -4203,7 +4213,7 @@ static int msi_alloc_irq(struct pci_dev *pdev, int irq, int nvec)
 		return -EINVAL;
 
 	devid = get_device_id(&pdev->dev);
-	index = alloc_irq_index(cfg, devid, nvec);
+	index = alloc_irq_index(cfg, &cfg->irq_2_irte, devid, nvec);
 
 	return index < 0 ? MAX_IRQS_PER_TABLE : index;
 }
@@ -4250,7 +4260,7 @@ static int setup_hpet_msi(unsigned int irq, unsigned int id)
 	if (devid < 0)
 		return devid;
 
-	index = alloc_irq_index(cfg, devid, 1);
+	index = alloc_irq_index(cfg, &cfg->irq_2_irte, devid, 1);
 	if (index < 0)
 		return index;
 
@@ -4261,6 +4271,88 @@ static int setup_hpet_msi(unsigned int irq, unsigned int id)
 	return 0;
 }
 
+static int get_devid(struct irq_alloc_info *info)
+{
+	int devid = -1;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_IOAPIC:
+		devid     = get_ioapic_devid(info->ioapic_id);
+		break;
+	case X86_IRQ_ALLOC_TYPE_HPET:
+		devid     = get_hpet_devid(info->hpet_id);
+		break;
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		devid = get_device_id(&info->msi_dev->dev);
+		break;
+	}
+
+	return devid;
+}
+
+static struct irq_domain *get_ir_irq_domain(struct irq_alloc_info *info)
+{
+	int devid;
+	struct amd_iommu *iommu;
+
+	if (!info)
+		return NULL;
+
+	devid = get_devid(info);
+	if (devid >= 0) {
+		iommu = amd_iommu_rlookup_table[devid];
+		if (iommu)
+			return iommu->ir_domain;
+	}
+
+	return NULL;
+}
+
+static struct irq_domain *get_irq_domain(struct irq_alloc_info *info)
+{
+	int devid;
+	struct amd_iommu *iommu;
+
+	if (!info)
+		return NULL;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		devid = get_device_id(&info->msi_dev->dev);
+		if (devid >= 0) {
+			iommu = amd_iommu_rlookup_table[devid];
+			if (iommu)
+				return iommu->msi_domain;
+		}
+		break;
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+static int get_ioapic_entry(struct irq_data *irq_data,
+				  struct IR_IO_APIC_route_entry *entry)
+{
+	struct amd_ir_data *ir_data = irq_data->chip_data;
+
+	*entry = ir_data->ioapic_entry;
+
+	return 0;
+}
+
+static int get_msi_entry(struct irq_data *irq_data, struct msi_msg *msg)
+{
+	struct amd_ir_data *ir_data = irq_data->chip_data;
+
+	*msg = ir_data->msi_entry;
+
+	return 0;
+}
+
 struct irq_remap_ops amd_iommu_irq_ops = {
 	.supported		= amd_iommu_supported,
 	.prepare		= amd_iommu_prepare,
@@ -4275,5 +4367,239 @@ struct irq_remap_ops amd_iommu_irq_ops = {
 	.msi_alloc_irq		= msi_alloc_irq,
 	.msi_setup_irq		= msi_setup_irq,
 	.setup_hpet_msi		= setup_hpet_msi,
+	.get_ir_irq_domain	= get_ir_irq_domain,
+	.get_irq_domain		= get_irq_domain,
+	.get_ioapic_entry	= get_ioapic_entry,
+	.get_msi_entry		= get_msi_entry,
 };
+
+static void irq_remapping_prepare_irte(struct amd_ir_data *data,
+				       struct irq_cfg *irq_cfg,
+				       struct irq_alloc_info *info,
+				       int devid, int index, int sub_handle)
+{
+	union irte *irte = &data->irte_entry;
+	struct irq_2_irte *irte_info = &data->irq_2_irte;
+	struct IR_IO_APIC_route_entry *entry = &data->ioapic_entry;
+	struct msi_msg *msg = &data->msi_entry;
+
+	irq_cfg->remapped = 1;
+	data->irq_2_irte.devid = devid;
+	data->irq_2_irte.index = index + sub_handle;
+
+	/* Setup IRTE for IOMMU */
+	irte->val = 0;
+	irte->fields.vector      = irq_cfg->vector;
+	irte->fields.int_type    = apic->irq_delivery_mode;
+	irte->fields.destination = irq_cfg->dest_apicid;
+	irte->fields.dm          = apic->irq_dest_mode;
+	irte->fields.valid       = 1;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_IOAPIC:
+		/* Setup IOAPIC entry */
+		memset(entry, 0, sizeof(*entry));
+		entry->vector        = index;
+		entry->mask          = 0;
+		entry->trigger       = info->ioapic_trigger;
+		entry->polarity      = info->ioapic_polarity;
+		/* Mask level triggered irqs. */
+		if (info->ioapic_trigger)
+			entry->mask = 1;
+		break;
+
+	case X86_IRQ_ALLOC_TYPE_HPET:
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		msg->address_hi = MSI_ADDR_BASE_HI;
+		msg->address_lo = MSI_ADDR_BASE_LO;
+		msg->data = irte_info->index;
+		break;
+
+	default:
+		BUG_ON(1);
+		break;
+	}
+}
+
+static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
+			       unsigned int nr_irqs, void *arg)
+{
+	struct irq_alloc_info *info = arg;
+	struct amd_ir_data *data;
+	struct irq_data *irq_data;
+	struct irq_cfg *cfg;
+	int i, ret, devid;
+	int index = -1;
+
+	if (!info)
+		return -EINVAL;
+	if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_MSI &&
+	    info->type != X86_IRQ_ALLOC_TYPE_MSIX)
+		return -EINVAL;
+
+	devid = get_devid(info);
+	if (devid < 0)
+		return -EINVAL;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+	if (ret < 0)
+		return ret;
+
+	ret = -ENOMEM;
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		goto out_free_parent;
+
+	if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
+		if (get_irq_table(devid, true))
+			index = info->ioapic_pin;
+		else
+			ret = -ENOMEM;
+	} else {
+		cfg = irq_cfg(virq);
+		index = alloc_irq_index(cfg, &data->irq_2_irte, devid, nr_irqs);
+	}
+	if (index < 0) {
+		pr_warn("Failed to allocate IRTE\n");
+		kfree(data);
+		goto out_free_parent;
+	}
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		cfg = irqd_cfg(irq_data);
+		if (!irq_data || !cfg) {
+			ret = -EINVAL;
+			goto out_free_data;
+		}
+
+		if (i > 0) {
+			data = kzalloc(sizeof(*data), GFP_KERNEL);
+			if (!data)
+				goto out_free_data;
+		}
+		irq_data->hwirq = (devid << 16) + i;
+		irq_data->chip_data = data;
+		irq_data->chip = &amd_ir_chip;
+		irq_remapping_prepare_irte(data, cfg, info, devid, index, i);
+		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
+	}
+	return 0;
+
+out_free_data:
+	for (i--; i >= 0; i--) {
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		if (irq_data->chip_data) {
+			kfree(irq_data->chip_data);
+			irq_domain_reset_irq_data(irq_data);
+		}
+	}
+	for (i = 0; i < nr_irqs; i++)
+		free_irte(devid, index + i);
+out_free_parent:
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+	return ret;
+}
+
+static void irq_remapping_free(struct irq_domain *domain, unsigned int virq,
+			       unsigned int nr_irqs)
+{
+	struct irq_data *irq_data;
+	struct amd_ir_data *data;
+	struct irq_2_irte *irte_info;
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_domain_get_irq_data(domain, virq  + i);
+		if (irq_data && irq_data->chip_data) {
+			data = irq_data->chip_data;
+			irte_info = &data->irq_2_irte;
+			free_irte(irte_info->devid, irte_info->index);
+			irq_domain_reset_irq_data(irq_data);
+			kfree(data);
+		}
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int irq_remapping_activate(struct irq_domain *domain,
+				  struct irq_data *irq_data)
+{
+	struct amd_ir_data *data = irq_data->chip_data;
+	struct irq_2_irte *irte_info = &data->irq_2_irte;
+
+	modify_irte(irte_info->devid, irte_info->index, data->irte_entry);
+
+	return 0;
+}
+
+static int irq_remapping_deactivate(struct irq_domain *domain,
+				    struct irq_data *irq_data)
+{
+	struct amd_ir_data *data = irq_data->chip_data;
+	struct irq_2_irte *irte_info = &data->irq_2_irte;
+	union irte entry;
+
+	entry.val = 0;
+	modify_irte(irte_info->devid, irte_info->index, entry);
+
+	return 0;
+}
+
+static struct irq_domain_ops amd_ir_domain_ops = {
+	.alloc = irq_remapping_alloc,
+	.free = irq_remapping_free,
+	.activate = irq_remapping_activate,
+	.deactivate = irq_remapping_deactivate,
+};
+
+static int amd_ir_set_affinity(struct irq_data *data,
+			       const struct cpumask *mask, bool force)
+{
+	struct amd_ir_data *ir_data = data->chip_data;
+	struct irq_2_irte *irte_info = &ir_data->irq_2_irte;
+	struct irq_cfg *cfg = irqd_cfg(data);
+	struct irq_data *parent = data->parent_data;
+	int ret;
+
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Atomically updates the IRTE with the new destination, vector
+	 * and flushes the interrupt entry cache.
+	 */
+	ir_data->irte_entry.fields.vector = cfg->vector;
+	ir_data->irte_entry.fields.destination = cfg->dest_apicid;
+	modify_irte(irte_info->devid, irte_info->index, ir_data->irte_entry);
+
+	/*
+	 * After this point, all the interrupts will start arriving
+	 * at the new destination. So, time to cleanup the previous
+	 * vector allocation.
+	 */
+	if (cfg->move_in_progress)
+		send_cleanup_vector(cfg);
+
+	return ret;
+}
+
+static struct irq_chip amd_ir_chip = {
+	.irq_ack = ir_ack_apic_edge,
+	.irq_set_affinity = amd_ir_set_affinity,
+};
+
+int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
+{
+	iommu->ir_domain = irq_domain_add_tree(NULL, &amd_ir_domain_ops, iommu);
+	if (!iommu->ir_domain)
+		return -ENOMEM;
+
+	iommu->ir_domain->parent = arch_get_ir_parent_domain();
+	iommu->msi_domain = arch_create_msi_irq_domain(iommu->ir_domain);
+
+	return 0;
+}
 #endif
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 3783e0b44df6..e9b3b91d45ff 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1115,6 +1115,10 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 	if (ret)
 		return ret;
 
+	ret = amd_iommu_create_irq_domain(iommu);
+	if (ret)
+		return ret;
+
 	/*
 	 * Make sure IOMMU is not considered to translate itself. The IVRS
 	 * table tells us so, but this is a lie!
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index 95ed6deae47f..612a22192fa0 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -63,6 +63,15 @@ extern u8 amd_iommu_pc_get_max_counters(u16 devid);
 extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
 				    u64 *value, bool is_write);
 
+#ifdef CONFIG_IRQ_REMAP
+extern int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
+#else
+static inline int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
+{
+	return 0;
+}
+#endif
+
 #define PPR_SUCCESS			0x0
 #define PPR_INVALID			0x1
 #define PPR_FAILURE			0xf
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 8e43b7cba133..ccb84d7491ed 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -392,6 +392,7 @@ struct amd_iommu_fault {
 
 
 struct iommu_domain;
+struct irq_domain;
 
 /*
  * This structure contains generic data for  IOMMU protection domains
@@ -595,6 +596,10 @@ struct amd_iommu {
 	/* The maximum PC banks and counters/bank (PCSup=1) */
 	u8 max_banks;
 	u8 max_counters;
+#ifdef CONFIG_IRQ_REMAP
+	struct irq_domain *ir_domain;
+	struct irq_domain *msi_domain;
+#endif
 };
 
 struct devid_map {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 14/21] x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, Jiang Liu, linux-arm-kernel

Enhance the HPET code to support hierarchy irqdomain, which helps to
make the code and architecture clearer.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
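Note for readers: the HPET irqdomain in this patch stores two things in
domain->host_data, the HPET block id and a flag saying whether the
domain sits on top of an interrupt remapping domain. The following is a
minimal user-space sketch of that packing, assuming the id never uses
bit 31 (the patch enforces this with BUG_ON). The helper names
hpet_pack_host_data, hpet_unpack_id and hpet_unpack_remapped are
hypothetical; the patch itself uses hpet_dev_id() and hpet_remapped().

```c
#include <stdbool.h>

/* Same flag value as HPET_DOMAIN_REMAPPED in the patch. */
#define HPET_DOMAIN_REMAPPED	0x80000000UL

/* Hypothetical helper: combine an HPET block id with the remapped flag. */
static unsigned long hpet_pack_host_data(unsigned int hpet_id, bool remapped)
{
	unsigned long host_data = hpet_id;

	if (remapped)
		host_data |= HPET_DOMAIN_REMAPPED;

	return host_data;
}

/* Hypothetical helper: recover the id by masking off the flag bit. */
static unsigned int hpet_unpack_id(unsigned long host_data)
{
	return (unsigned int)(host_data & ~HPET_DOMAIN_REMAPPED);
}

/* Hypothetical helper: test the flag bit. */
static bool hpet_unpack_remapped(unsigned long host_data)
{
	return (host_data & HPET_DOMAIN_REMAPPED) != 0;
}
```

Packing both values into the pointer-sized host_data avoids a separate
allocation per domain, at the cost of reserving one bit of the id space.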
 arch/x86/include/asm/hpet.h |    7 +-
 arch/x86/kernel/apic/msi.c  |  156 ++++++++++++++++++++++++++++++++++++++-----
 arch/x86/kernel/hpet.c      |   57 ++++------------
 3 files changed, 160 insertions(+), 60 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 36f7125945e3..e87e9faf87a9 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -74,11 +74,16 @@ extern unsigned int hpet_readl(unsigned int a);
 extern void force_hpet_resume(void);
 
 struct irq_data;
+struct hpet_dev;
+struct irq_domain;
+
 extern void hpet_msi_unmask(struct irq_data *data);
 extern void hpet_msi_mask(struct irq_data *data);
-struct hpet_dev;
 extern void hpet_msi_write(struct hpet_dev *hdev, struct msi_msg *msg);
 extern void hpet_msi_read(struct hpet_dev *hdev, struct msi_msg *msg);
+extern struct irq_domain *hpet_create_irq_domain(int hpet_id);
+extern int hpet_assign_irq(struct irq_domain *domain,
+			   struct hpet_dev *dev, int dev_num);
 
 #ifdef CONFIG_PCI_MSI
 extern int default_setup_hpet_msi(unsigned int irq, unsigned int id);
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index ad2d624a0800..709fedab44f2 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -237,38 +237,48 @@ void dmar_free_hwirq(int irq)
  * MSI message composition
  */
 #ifdef CONFIG_HPET_TIMER
+#define	HPET_DOMAIN_REMAPPED		0x80000000
+
+static inline int hpet_dev_id(struct irq_domain *domain)
+{
+	return (int)((long)domain->host_data & ~HPET_DOMAIN_REMAPPED);
+}
+
+static inline bool hpet_remapped(struct irq_domain *domain)
+{
+	return (bool)((long)domain->host_data & HPET_DOMAIN_REMAPPED);
+}
 
 static int hpet_msi_set_affinity(struct irq_data *data,
 				 const struct cpumask *mask, bool force)
 {
+	struct irq_data *parent = data->parent_data;
 	struct irq_cfg *cfg = irqd_cfg(data);
 	struct msi_msg msg;
-	unsigned int dest;
 	int ret;
 
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
-
-	hpet_msi_read(data->handler_data, &msg);
-
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
-
-	hpet_msi_write(data->handler_data, &msg);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to rewrite HPET registers if interrupt is remapped */
+	if (ret >= 0 && !hpet_remapped(data->domain)) {
+		hpet_msi_read(data->handler_data, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		hpet_msi_write(data->handler_data, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 static struct irq_chip hpet_msi_type = {
 	.name = "HPET_MSI",
 	.irq_unmask = hpet_msi_unmask,
 	.irq_mask = hpet_msi_mask,
-	.irq_ack = apic_ack_edge,
+	.irq_ack = irq_chip_ack_parent,
 	.irq_set_affinity = hpet_msi_set_affinity,
-	.irq_retrigger = apic_retrigger_irq,
+	.irq_retrigger = irq_chip_retrigger_hierarchy,
+	.irq_print_chip = irq_remapping_print_chip,
 };
 
 int default_setup_hpet_msi(unsigned int irq, unsigned int id)
@@ -288,4 +298,118 @@ int default_setup_hpet_msi(unsigned int irq, unsigned int id)
 	irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "edge");
 	return 0;
 }
+
+static int hpet_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			     unsigned int nr_irqs, void *arg)
+{
+	struct irq_alloc_info *info = arg;
+	int ret;
+
+	if (nr_irqs > 1 || !info || info->type != X86_IRQ_ALLOC_TYPE_HPET)
+		return -EINVAL;
+	if (irq_find_mapping(domain, info->hpet_index)) {
+		pr_warn("IRQ for HPET%d already exists.\n", info->hpet_index);
+		return -EEXIST;
+	}
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+	if (ret >= 0) {
+		irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
+		irq_domain_set_hwirq_and_chip(domain, virq, info->hpet_index,
+					      &hpet_msi_type, NULL);
+		irq_set_handler_data(virq, info->hpet_data);
+		__irq_set_handler(virq, handle_edge_irq, 0, "edge");
+	}
+
+	return ret;
+}
+
+static void hpet_domain_free(struct irq_domain *domain, unsigned int virq,
+			     unsigned int nr_irqs)
+{
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+		irq_set_handler_data(virq + i, NULL);
+		irq_set_handler(virq + i, NULL);
+		irq_clear_status_flags(virq + i, IRQ_MOVE_PCNTXT);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int hpet_domain_activate(struct irq_domain *domain,
+				struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	if (hpet_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, hpet_dev_id(domain));
+	hpet_msi_write(irq_get_handler_data(irq_data->irq), &msg);
+
+	return 0;
+}
+
+static int hpet_domain_deactivate(struct irq_domain *domain,
+				  struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	memset(&msg, 0, sizeof(msg));
+	hpet_msi_write(irq_get_handler_data(irq_data->irq), &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops hpet_domain_ops = {
+	.alloc = hpet_domain_alloc,
+	.free = hpet_domain_free,
+	.activate = hpet_domain_activate,
+	.deactivate = hpet_domain_deactivate,
+};
+
+struct irq_domain *hpet_create_irq_domain(int hpet_id)
+{
+	struct irq_domain *parent, *domain;
+	struct irq_alloc_info info;
+	long host_data;
+
+	BUG_ON(hpet_id & HPET_DOMAIN_REMAPPED);
+	host_data = hpet_id;
+
+	init_irq_alloc_info(&info, NULL);
+	info.type = X86_IRQ_ALLOC_TYPE_HPET;
+	info.hpet_id = hpet_id;
+	parent = irq_remapping_get_ir_irq_domain(&info);
+	if (!parent)
+		parent = x86_vector_domain;
+	else
+		host_data |= HPET_DOMAIN_REMAPPED;
+	if (!parent)
+		return NULL;
+
+	domain = irq_domain_add_tree(NULL, &hpet_domain_ops, (void *)host_data);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+int hpet_assign_irq(struct irq_domain *domain, struct hpet_dev *dev,
+		    int dev_num)
+{
+	struct irq_alloc_info info;
+
+	init_irq_alloc_info(&info, NULL);
+	info.type = X86_IRQ_ALLOC_TYPE_HPET;
+	info.hpet_data = dev;
+	info.hpet_id = hpet_dev_id(domain);
+	info.hpet_index = dev_num;
+
+	return irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE, &info);
+}
 #endif
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index cb60652f59a3..c5559c293773 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -43,6 +43,7 @@ u8					hpet_msi_disable;
 static unsigned long			hpet_num_timers;
 #endif
 static void __iomem			*hpet_virt_address;
+static struct irq_domain		*hpet_domain;
 
 struct hpet_dev {
 	struct clock_event_device	evt;
@@ -306,8 +307,6 @@ static void hpet_legacy_clockevent_register(void)
 	printk(KERN_DEBUG "hpet clockevent registered\n");
 }
 
-static int hpet_setup_msi_irq(unsigned int irq);
-
 static void hpet_set_mode(enum clock_event_mode mode,
 			  struct clock_event_device *evt, int timer)
 {
@@ -358,7 +357,7 @@ static void hpet_set_mode(enum clock_event_mode mode,
 			hpet_enable_legacy_int();
 		} else {
 			struct hpet_dev *hdev = EVT_TO_HPET_DEV(evt);
-			hpet_setup_msi_irq(hdev->irq);
+			irq_domain_activate_irq(irq_get_irq_data(hdev->irq));
 			disable_irq(hdev->irq);
 			irq_set_affinity(hdev->irq, cpumask_of(hdev->cpu));
 			enable_irq(hdev->irq);
@@ -474,32 +473,6 @@ static int hpet_msi_next_event(unsigned long delta,
 	return hpet_next_event(delta, evt, hdev->num);
 }
 
-static int hpet_setup_msi_irq(unsigned int irq)
-{
-	if (x86_msi.setup_hpet_msi(irq, hpet_blockid)) {
-		irq_domain_free_irqs(irq, 1);
-		return -EINVAL;
-	}
-	return 0;
-}
-
-static int hpet_assign_irq(struct hpet_dev *dev)
-{
-	int irq;
-
-	irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
-	if (irq <= 0)
-		return -EINVAL;
-
-	irq_set_handler_data(irq, dev);
-
-	if (hpet_setup_msi_irq(irq))
-		return -EINVAL;
-
-	dev->irq = irq;
-	return 0;
-}
-
 static irqreturn_t hpet_interrupt_handler(int irq, void *data)
 {
 	struct hpet_dev *dev = (struct hpet_dev *)data;
@@ -542,9 +515,6 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
 	if (!(hdev->flags & HPET_DEV_VALID))
 		return;
 
-	if (hpet_setup_msi_irq(hdev->irq))
-		return;
-
 	hdev->cpu = cpu;
 	per_cpu(cpu_hpet_dev, cpu) = hdev;
 	evt->name = hdev->name;
@@ -576,7 +546,7 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
 	unsigned int id;
 	unsigned int num_timers;
 	unsigned int num_timers_used = 0;
-	int i;
+	int i, irq;
 
 	if (hpet_msi_disable)
 		return;
@@ -589,6 +559,10 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
 	num_timers++; /* Value read out starts from 0 */
 	hpet_print_config();
 
+	hpet_domain = hpet_create_irq_domain(hpet_blockid);
+	if (!hpet_domain)
+		return;
+
 	hpet_devs = kzalloc(sizeof(struct hpet_dev) * num_timers, GFP_KERNEL);
 	if (!hpet_devs)
 		return;
@@ -603,15 +577,16 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
 		if (!(cfg & HPET_TN_FSB_CAP))
 			continue;
 
+		irq = hpet_assign_irq(hpet_domain, hdev, i);
+		if (irq < 0)
+			continue;
+
+		sprintf(hdev->name, "hpet%d", i);
+		hdev->num = i;
+		hdev->irq = irq;
 		hdev->flags = 0;
 		if (cfg & HPET_TN_PERIODIC_CAP)
 			hdev->flags |= HPET_DEV_PERI_CAP;
-		hdev->num = i;
-
-		sprintf(hdev->name, "hpet%d", i);
-		if (hpet_assign_irq(hdev))
-			continue;
-
 		hdev->flags |= HPET_DEV_FSB_CAP;
 		hdev->flags |= HPET_DEV_VALID;
 		num_timers_used++;
@@ -711,10 +686,6 @@ static int hpet_cpuhp_notify(struct notifier_block *n,
 }
 #else
 
-static int hpet_setup_msi_irq(unsigned int irq)
-{
-	return 0;
-}
 static void hpet_msi_capability_lookup(unsigned int start_timer)
 {
 	return;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 14/21] x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

Enhance the HPET code to support hierarchy irqdomain, which helps to
make the architecture clearer.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
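Reviewer note, not part of the patch: the HPET domain stores two things in
the pointer-sized host_data field, the HPET block id in the low bits and a
"remapped" flag in bit 31. Below is a minimal userspace sketch of that
packing. The names struct fake_domain and pack_host_data() are hypothetical
stand-ins for illustration only; the two accessors mirror hpet_dev_id() and
hpet_remapped() from the diff but use uintptr_t instead of long.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HPET_DOMAIN_REMAPPED	0x80000000UL

/* Hypothetical stand-in for struct irq_domain; only host_data is modeled. */
struct fake_domain {
	void *host_data;
};

/* Pack the HPET block id and the remapped flag into one pointer value. */
static void *pack_host_data(int hpet_id, bool remapped)
{
	uintptr_t v = (uintptr_t)hpet_id;

	if (remapped)
		v |= HPET_DOMAIN_REMAPPED;
	return (void *)v;
}

/* Recover the HPET block id by masking off the flag bit. */
static int hpet_dev_id(const struct fake_domain *d)
{
	return (int)((uintptr_t)d->host_data & ~HPET_DOMAIN_REMAPPED);
}

/* Test the flag bit; true means the parent is the remapping domain. */
static bool hpet_remapped(const struct fake_domain *d)
{
	return ((uintptr_t)d->host_data & HPET_DOMAIN_REMAPPED) != 0;
}
```

The id must not already have bit 31 set, which is what the BUG_ON() in
hpet_create_irq_domain() guards against.
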
 arch/x86/include/asm/hpet.h |    7 +-
 arch/x86/kernel/apic/msi.c  |  156 ++++++++++++++++++++++++++++++++++++++-----
 arch/x86/kernel/hpet.c      |   57 ++++------------
 3 files changed, 160 insertions(+), 60 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 36f7125945e3..e87e9faf87a9 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -74,11 +74,16 @@ extern unsigned int hpet_readl(unsigned int a);
 extern void force_hpet_resume(void);
 
 struct irq_data;
+struct hpet_dev;
+struct irq_domain;
+
 extern void hpet_msi_unmask(struct irq_data *data);
 extern void hpet_msi_mask(struct irq_data *data);
-struct hpet_dev;
 extern void hpet_msi_write(struct hpet_dev *hdev, struct msi_msg *msg);
 extern void hpet_msi_read(struct hpet_dev *hdev, struct msi_msg *msg);
+extern struct irq_domain *hpet_create_irq_domain(int hpet_id);
+extern int hpet_assign_irq(struct irq_domain *domain,
+			   struct hpet_dev *dev, int dev_num);
 
 #ifdef CONFIG_PCI_MSI
 extern int default_setup_hpet_msi(unsigned int irq, unsigned int id);
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index ad2d624a0800..709fedab44f2 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -237,38 +237,48 @@ void dmar_free_hwirq(int irq)
  * MSI message composition
  */
 #ifdef CONFIG_HPET_TIMER
+#define	HPET_DOMAIN_REMAPPED		0x80000000
+
+static inline int hpet_dev_id(struct irq_domain *domain)
+{
+	return (int)((long)domain->host_data & ~HPET_DOMAIN_REMAPPED);
+}
+
+static inline bool hpet_remapped(struct irq_domain *domain)
+{
+	return (bool)((long)domain->host_data & HPET_DOMAIN_REMAPPED);
+}
 
 static int hpet_msi_set_affinity(struct irq_data *data,
 				 const struct cpumask *mask, bool force)
 {
+	struct irq_data *parent = data->parent_data;
 	struct irq_cfg *cfg = irqd_cfg(data);
 	struct msi_msg msg;
-	unsigned int dest;
 	int ret;
 
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
-
-	hpet_msi_read(data->handler_data, &msg);
-
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
-
-	hpet_msi_write(data->handler_data, &msg);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to rewrite HPET registers if interrupt is remapped */
+	if (ret >= 0 && !hpet_remapped(data->domain)) {
+		hpet_msi_read(data->handler_data, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		hpet_msi_write(data->handler_data, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 static struct irq_chip hpet_msi_type = {
 	.name = "HPET_MSI",
 	.irq_unmask = hpet_msi_unmask,
 	.irq_mask = hpet_msi_mask,
-	.irq_ack = apic_ack_edge,
+	.irq_ack = irq_chip_ack_parent,
 	.irq_set_affinity = hpet_msi_set_affinity,
-	.irq_retrigger = apic_retrigger_irq,
+	.irq_retrigger = irq_chip_retrigger_hierarchy,
+	.irq_print_chip = irq_remapping_print_chip,
 };
 
 int default_setup_hpet_msi(unsigned int irq, unsigned int id)
@@ -288,4 +298,118 @@ int default_setup_hpet_msi(unsigned int irq, unsigned int id)
 	irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "edge");
 	return 0;
 }
+
+static int hpet_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			     unsigned int nr_irqs, void *arg)
+{
+	struct irq_alloc_info *info = arg;
+	int ret;
+
+	if (nr_irqs > 1 || !info || info->type != X86_IRQ_ALLOC_TYPE_HPET)
+		return -EINVAL;
+	if (irq_find_mapping(domain, info->hpet_index)) {
+		pr_warn("IRQ for HPET%d already exists.\n", info->hpet_index);
+		return -EEXIST;
+	}
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+	if (ret >= 0) {
+		irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
+		irq_domain_set_hwirq_and_chip(domain, virq, info->hpet_index,
+					      &hpet_msi_type, NULL);
+		irq_set_handler_data(virq, info->hpet_data);
+		__irq_set_handler(virq, handle_edge_irq, 0, "edge");
+	}
+
+	return ret;
+}
+
+static void hpet_domain_free(struct irq_domain *domain, unsigned int virq,
+			     unsigned int nr_irqs)
+{
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+		irq_set_handler_data(virq + i, NULL);
+		irq_set_handler(virq + i, NULL);
+		irq_clear_status_flags(virq + i, IRQ_MOVE_PCNTXT);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int hpet_domain_activate(struct irq_domain *domain,
+				struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	if (hpet_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, hpet_dev_id(domain));
+	hpet_msi_write(irq_get_handler_data(irq_data->irq), &msg);
+
+	return 0;
+}
+
+static int hpet_domain_deactivate(struct irq_domain *domain,
+				  struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	memset(&msg, 0, sizeof(msg));
+	hpet_msi_write(irq_get_handler_data(irq_data->irq), &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops hpet_domain_ops = {
+	.alloc = hpet_domain_alloc,
+	.free = hpet_domain_free,
+	.activate = hpet_domain_activate,
+	.deactivate = hpet_domain_deactivate,
+};
+
+struct irq_domain *hpet_create_irq_domain(int hpet_id)
+{
+	struct irq_domain *parent, *domain;
+	struct irq_alloc_info info;
+	long host_data;
+
+	BUG_ON(hpet_id & HPET_DOMAIN_REMAPPED);
+	host_data = hpet_id;
+
+	init_irq_alloc_info(&info, NULL);
+	info.type = X86_IRQ_ALLOC_TYPE_HPET;
+	info.hpet_id = hpet_id;
+	parent = irq_remapping_get_ir_irq_domain(&info);
+	if (!parent)
+		parent = x86_vector_domain;
+	else
+		host_data |= HPET_DOMAIN_REMAPPED;
+	if (!parent)
+		return NULL;
+
+	domain = irq_domain_add_tree(NULL, &hpet_domain_ops, (void *)host_data);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+int hpet_assign_irq(struct irq_domain *domain, struct hpet_dev *dev,
+		    int dev_num)
+{
+	struct irq_alloc_info info;
+
+	init_irq_alloc_info(&info, NULL);
+	info.type = X86_IRQ_ALLOC_TYPE_HPET;
+	info.hpet_data = dev;
+	info.hpet_id = hpet_dev_id(domain);
+	info.hpet_index = dev_num;
+
+	return irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE, &info);
+}
 #endif
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index cb60652f59a3..c5559c293773 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -43,6 +43,7 @@ u8					hpet_msi_disable;
 static unsigned long			hpet_num_timers;
 #endif
 static void __iomem			*hpet_virt_address;
+static struct irq_domain		*hpet_domain;
 
 struct hpet_dev {
 	struct clock_event_device	evt;
@@ -306,8 +307,6 @@ static void hpet_legacy_clockevent_register(void)
 	printk(KERN_DEBUG "hpet clockevent registered\n");
 }
 
-static int hpet_setup_msi_irq(unsigned int irq);
-
 static void hpet_set_mode(enum clock_event_mode mode,
 			  struct clock_event_device *evt, int timer)
 {
@@ -358,7 +357,7 @@ static void hpet_set_mode(enum clock_event_mode mode,
 			hpet_enable_legacy_int();
 		} else {
 			struct hpet_dev *hdev = EVT_TO_HPET_DEV(evt);
-			hpet_setup_msi_irq(hdev->irq);
+			irq_domain_activate_irq(irq_get_irq_data(hdev->irq));
 			disable_irq(hdev->irq);
 			irq_set_affinity(hdev->irq, cpumask_of(hdev->cpu));
 			enable_irq(hdev->irq);
@@ -474,32 +473,6 @@ static int hpet_msi_next_event(unsigned long delta,
 	return hpet_next_event(delta, evt, hdev->num);
 }
 
-static int hpet_setup_msi_irq(unsigned int irq)
-{
-	if (x86_msi.setup_hpet_msi(irq, hpet_blockid)) {
-		irq_domain_free_irqs(irq, 1);
-		return -EINVAL;
-	}
-	return 0;
-}
-
-static int hpet_assign_irq(struct hpet_dev *dev)
-{
-	int irq;
-
-	irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
-	if (irq <= 0)
-		return -EINVAL;
-
-	irq_set_handler_data(irq, dev);
-
-	if (hpet_setup_msi_irq(irq))
-		return -EINVAL;
-
-	dev->irq = irq;
-	return 0;
-}
-
 static irqreturn_t hpet_interrupt_handler(int irq, void *data)
 {
 	struct hpet_dev *dev = (struct hpet_dev *)data;
@@ -542,9 +515,6 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
 	if (!(hdev->flags & HPET_DEV_VALID))
 		return;
 
-	if (hpet_setup_msi_irq(hdev->irq))
-		return;
-
 	hdev->cpu = cpu;
 	per_cpu(cpu_hpet_dev, cpu) = hdev;
 	evt->name = hdev->name;
@@ -576,7 +546,7 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
 	unsigned int id;
 	unsigned int num_timers;
 	unsigned int num_timers_used = 0;
-	int i;
+	int i, irq;
 
 	if (hpet_msi_disable)
 		return;
@@ -589,6 +559,10 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
 	num_timers++; /* Value read out starts from 0 */
 	hpet_print_config();
 
+	hpet_domain = hpet_create_irq_domain(hpet_blockid);
+	if (!hpet_domain)
+		return;
+
 	hpet_devs = kzalloc(sizeof(struct hpet_dev) * num_timers, GFP_KERNEL);
 	if (!hpet_devs)
 		return;
@@ -603,15 +577,16 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
 		if (!(cfg & HPET_TN_FSB_CAP))
 			continue;
 
+		irq = hpet_assign_irq(hpet_domain, hdev, i);
+		if (irq < 0)
+			continue;
+
+		sprintf(hdev->name, "hpet%d", i);
+		hdev->num = i;
+		hdev->irq = irq;
 		hdev->flags = 0;
 		if (cfg & HPET_TN_PERIODIC_CAP)
 			hdev->flags |= HPET_DEV_PERI_CAP;
-		hdev->num = i;
-
-		sprintf(hdev->name, "hpet%d", i);
-		if (hpet_assign_irq(hdev))
-			continue;
-
 		hdev->flags |= HPET_DEV_FSB_CAP;
 		hdev->flags |= HPET_DEV_VALID;
 		num_timers_used++;
@@ -711,10 +686,6 @@ static int hpet_cpuhp_notify(struct notifier_block *n,
 }
 #else
 
-static int hpet_setup_msi_irq(unsigned int irq)
-{
-	return 0;
-}
 static void hpet_msi_capability_lookup(unsigned int start_timer)
 {
 	return;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, Jiang Liu, linux-arm-kernel

Enhance the MSI code to support hierarchy irqdomain, which helps to
make the architecture clearer.


Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
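Reviewer note, not part of the patch: for multi-vector MSI,
native_setup_msi_irqs() records the vector count in msi_attrib.multiple as
the log2 of nvec rounded up to a power of two. Below is a minimal userspace
sketch of that arithmetic. The helpers roundup_pow2(), log2_pow2() and
msi_multiple() are hypothetical stand-ins for the kernel's
__roundup_pow_of_two() and ilog2(), written out so the example builds
outside the kernel.

```c
#include <assert.h>

/* Stand-in for __roundup_pow_of_two(): smallest power of two >= n. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Stand-in for ilog2(), valid here because p is a power of two. */
static unsigned int log2_pow2(unsigned int p)
{
	unsigned int l = 0;

	while (p > 1) {
		p >>= 1;
		l++;
	}
	return l;
}

/* Value that would be stored in msi_attrib.multiple for nvec vectors. */
static unsigned int msi_multiple(unsigned int nvec)
{
	return log2_pow2(roundup_pow2(nvec));
}
```

A request for 3 vectors is therefore recorded as a block of 4, while
nvec_used keeps the count that was actually asked for.
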
 arch/x86/include/asm/hw_irq.h        |    6 +
 arch/x86/include/asm/irq_remapping.h |    6 +-
 arch/x86/kernel/apic/msi.c           |  225 +++++++++++++++++++++++++++++-----
 arch/x86/kernel/apic/vector.c        |    2 +
 drivers/iommu/irq_remapping.c        |    1 -
 5 files changed, 204 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 57f81f5a9686..9f705c49f850 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -199,6 +199,12 @@ static inline void lock_vector_lock(void) {}
 static inline void unlock_vector_lock(void) {}
 #endif	/* CONFIG_X86_LOCAL_APIC */
 
+#ifdef	CONFIG_PCI_MSI
+extern void arch_init_msi_domain(struct irq_domain *domain);
+#else
+static inline void arch_init_msi_domain(struct irq_domain *domain) { }
+#endif
+
 /* Statistics */
 extern atomic_t irq_err_count;
 extern atomic_t irq_mis_count;
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 428b4e6d637c..440053ca7515 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -73,11 +73,7 @@ extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
  * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
  * parent irqdomain.
  */
-static inline struct irq_domain *
-arch_create_msi_irq_domain(struct irq_domain *parent)
-{
-	return NULL;
-}
+extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent);
 
 /* Get parent irqdomain for interrupt remapping irqdomain */
 static inline struct irq_domain *arch_get_ir_parent_domain(void)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 709fedab44f2..5696703271af 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -21,6 +23,8 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 
+static struct irq_domain *msi_default_domain;
+
 void native_compose_msi_msg(struct pci_dev *pdev,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id)
@@ -76,28 +80,32 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 	return 0;
 }
 
-static int
-msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
+static bool msi_remapped(struct irq_domain *domain)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	struct msi_msg msg;
-	unsigned int dest;
-	int ret;
-
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
+	return domain->host_data != NULL;
+}
 
-	__get_cached_msi_msg(data->msi_desc, &msg);
+static int msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
+			    bool force)
+{
+	struct irq_data *parent = data->parent_data;
+	int ret;
 
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to reprogram MSI registers if interrupt is remapped */
+	if (ret >= 0 && !msi_remapped(data->domain)) {
+		struct irq_cfg *cfg = irqd_cfg(data);
+		struct msi_msg msg;
 
-	__write_msi_msg(data->msi_desc, &msg);
+		__get_cached_msi_msg(data->msi_desc, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		__write_msi_msg(data->msi_desc, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 /*
@@ -108,9 +116,105 @@ static struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= msi_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_print_chip		= irq_remapping_print_chip,
+};
+
+static inline irq_hw_number_t
+get_hwirq_from_pcidev(struct pci_dev *pdev, struct msi_desc *msidesc)
+{
+	return (irq_hw_number_t)msidesc->msi_attrib.entry_nr |
+		PCI_DEVID(pdev->bus->number, pdev->devfn) << 11 |
+		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27;
+}
+
+static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs, void *arg)
+{
+	int i, ret;
+	irq_hw_number_t hwirq;
+	struct irq_alloc_info *info = arg;
+
+	hwirq = get_hwirq_from_pcidev(info->msi_dev, info->msi_desc);
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_msi_desc_off(virq, i, info->msi_desc);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &msi_chip, (void *)(long)i);
+		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
+		dev_dbg(&info->msi_dev->dev, "irq %d for MSI/MSI-X\n",
+			virq + i);
+	}
+
+	return ret;
+}
+
+static void msi_domain_free(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs)
+{
+	int i;
+	struct msi_desc *msidesc = irq_get_msi_desc(virq);
+
+	if (msidesc)
+		msidesc->irq = 0;
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_handler(virq + i, NULL);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int msi_domain_activate(struct irq_domain *domain,
+			       struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	/*
+	 * irq_data->chip_data is MSI/MSIx offset.
+	 * MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (irq_data->chip_data)
+		return 0;
+
+	if (msi_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, 0);
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static int msi_domain_deactivate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	if (irq_data->chip_data)
+		return 0;
+
+	memset(&msg, 0, sizeof(msg));
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops msi_domain_ops = {
+	.alloc = msi_domain_alloc,
+	.free = msi_domain_free,
+	.activate = msi_domain_activate,
+	.deactivate = msi_domain_deactivate,
 };
 
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
@@ -145,25 +249,56 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
+	int irq, cnt, nvec_pow2;
+	struct irq_domain *domain;
 	struct msi_desc *msidesc;
-	int irq, ret;
+	struct irq_alloc_info info;
+	int node = dev_to_node(&dev->dev);
+
+	if (disable_apic)
+		return -ENOSYS;
 
-	/* Multiple MSI vectors only supported with interrupt remapping */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
+	init_irq_alloc_info(&info, NULL);
+	info.msi_dev = dev;
+	if (type == PCI_CAP_ID_MSI) {
+		msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+		WARN_ON(!list_is_singular(&dev->msi_list));
+		WARN_ON(msidesc->irq);
+		WARN_ON(msidesc->msi_attrib.multiple);
+		WARN_ON(msidesc->nvec_used);
+		info.type = X86_IRQ_ALLOC_TYPE_MSI;
+		cnt = nvec;
+	} else {
+		info.type = X86_IRQ_ALLOC_TYPE_MSIX;
+		cnt = 1;
+	}
+
+	domain = irq_remapping_get_irq_domain(&info);
+	if (domain == NULL) {
+		/*
+		 * Multiple MSI vectors only supported with interrupt
+		 * remapping
+		 */
+		if (type == PCI_CAP_ID_MSI && nvec > 1)
+			return 1;
+		domain = msi_default_domain;
+	}
+	if (domain == NULL)
+		return -ENOSYS;
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+		info.msi_desc = msidesc;
+		irq = irq_domain_alloc_irqs(domain, -1, cnt, node, &info);
 		if (irq <= 0)
 			return -ENOSPC;
+	}
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
-		if (ret < 0) {
-			irq_domain_free_irqs(irq, 1);
-			return ret;
-		}
-
+	if (type == PCI_CAP_ID_MSI) {
+		nvec_pow2 = __roundup_pow_of_two(nvec);
+		msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
+		msidesc->nvec_used = nvec;
 	}
+
 	return 0;
 }
 
@@ -172,6 +307,36 @@ void native_teardown_msi_irq(unsigned int irq)
 	irq_domain_free_irqs(irq, 1);
 }
 
+static struct irq_domain *msi_create_domain(struct irq_domain *parent,
+					    int remapped)
+{
+	struct irq_domain *domain;
+
+	domain = irq_domain_add_tree(NULL, &msi_domain_ops,
+				     (void *)(long)remapped);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+void arch_init_msi_domain(struct irq_domain *parent)
+{
+	if (disable_apic)
+		return;
+
+	msi_default_domain = msi_create_domain(parent, 0);
+	if (!msi_default_domain)
+		pr_warn("failed to initialize irqdomain for MSI/MSI-X.\n");
+}
+
+#ifdef CONFIG_IRQ_REMAP
+struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent)
+{
+	return msi_create_domain(parent, 1);
+}
+#endif
+
 #ifdef CONFIG_DMAR_TABLE
 static int
 dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 774ab5ba95f2..e9329fc28c63 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -357,6 +357,8 @@ int __init arch_early_irq_init(void)
 	BUG_ON(x86_vector_domain == NULL);
 	irq_set_default_host(x86_vector_domain);
 
+	arch_init_msi_domain(x86_vector_domain);
+
 	return arch_early_ioapic_init();
 }
 
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7ac44a464be0..bda0d8e73fde 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -178,7 +178,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Enhance MSI code to support hierarchy irqdomain, it helps to make
the and and architecture more clear.


Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h        |    6 +
 arch/x86/include/asm/irq_remapping.h |    6 +-
 arch/x86/kernel/apic/msi.c           |  225 +++++++++++++++++++++++++++++-----
 arch/x86/kernel/apic/vector.c        |    2 +
 drivers/iommu/irq_remapping.c        |    1 -
 5 files changed, 204 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 57f81f5a9686..9f705c49f850 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -199,6 +199,12 @@ static inline void lock_vector_lock(void) {}
 static inline void unlock_vector_lock(void) {}
 #endif	/* CONFIG_X86_LOCAL_APIC */
 
+#ifdef	CONFIG_PCI_MSI
+extern void arch_init_msi_domain(struct irq_domain *domain);
+#else
+static inline void arch_init_msi_domain(struct irq_domain *domain) { }
+#endif
+
 /* Statistics */
 extern atomic_t irq_err_count;
 extern atomic_t irq_mis_count;
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 428b4e6d637c..440053ca7515 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -73,11 +73,7 @@ extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
  * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
  * parent irqdomain.
  */
-static inline struct irq_domain *
-arch_create_msi_irq_domain(struct irq_domain *parent)
-{
-	return NULL;
-}
+extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent);
 
 /* Get parent irqdomain for interrupt remapping irqdomain */
 static inline struct irq_domain *arch_get_ir_parent_domain(void)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 709fedab44f2..5696703271af 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -21,6 +23,8 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 
+static struct irq_domain *msi_default_domain;
+
 void native_compose_msi_msg(struct pci_dev *pdev,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id)
@@ -76,28 +80,32 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 	return 0;
 }
 
-static int
-msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
+static bool msi_remapped(struct irq_domain *domain)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	struct msi_msg msg;
-	unsigned int dest;
-	int ret;
-
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
+	return domain->host_data != NULL;
+}
 
-	__get_cached_msi_msg(data->msi_desc, &msg);
+static int msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
+			    bool force)
+{
+	struct irq_data *parent = data->parent_data;
+	int ret;
 
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to reprogram MSI registers if interrupt is remapped */
+	if (ret >= 0 && !msi_remapped(data->domain)) {
+		struct irq_cfg *cfg = irqd_cfg(data);
+		struct msi_msg msg;
 
-	__write_msi_msg(data->msi_desc, &msg);
+		__get_cached_msi_msg(data->msi_desc, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		__write_msi_msg(data->msi_desc, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 /*
@@ -108,9 +116,105 @@ static struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= msi_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_print_chip		= irq_remapping_print_chip,
+};
+
+static inline irq_hw_number_t
+get_hwirq_from_pcidev(struct pci_dev *pdev, struct msi_desc *msidesc)
+{
+	return (irq_hw_number_t)msidesc->msi_attrib.entry_nr |
+		PCI_DEVID(pdev->bus->number, pdev->devfn) << 11 |
+		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27;
+}
+
+static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs, void *arg)
+{
+	int i, ret;
+	irq_hw_number_t hwirq;
+	struct irq_alloc_info *info = arg;
+
+	hwirq = get_hwirq_from_pcidev(info->msi_dev, info->msi_desc);
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_msi_desc_off(virq, i, info->msi_desc);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &msi_chip, (void *)(long)i);
+		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
+		dev_dbg(&info->msi_dev->dev, "irq %d for MSI/MSI-X\n",
+			virq + i);
+	}
+
+	return ret;
+}
+
+static void msi_domain_free(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs)
+{
+	int i;
+	struct msi_desc *msidesc = irq_get_msi_desc(virq);
+
+	if (msidesc)
+		msidesc->irq = 0;
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_handler(virq + i, NULL);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int msi_domain_activate(struct irq_domain *domain,
+			       struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	/*
+	 * irq_data->chip_data is MSI/MSIx offset.
+	 * MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (irq_data->chip_data)
+		return 0;
+
+	if (msi_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, 0);
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static int msi_domain_deactivate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	if (irq_data->chip_data)
+		return 0;
+
+	memset(&msg, 0, sizeof(msg));
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops msi_domain_ops = {
+	.alloc = msi_domain_alloc,
+	.free = msi_domain_free,
+	.activate = msi_domain_activate,
+	.deactivate = msi_domain_deactivate,
 };
 
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
@@ -145,25 +249,56 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
+	int irq, cnt, nvec_pow2;
+	struct irq_domain *domain;
 	struct msi_desc *msidesc;
-	int irq, ret;
+	struct irq_alloc_info info;
+	int node = dev_to_node(&dev->dev);
+
+	if (disable_apic)
+		return -ENOSYS;
 
-	/* Multiple MSI vectors only supported with interrupt remapping */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
+	init_irq_alloc_info(&info, NULL);
+	info.msi_dev = dev;
+	if (type == PCI_CAP_ID_MSI) {
+		msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+		WARN_ON(!list_is_singular(&dev->msi_list));
+		WARN_ON(msidesc->irq);
+		WARN_ON(msidesc->msi_attrib.multiple);
+		WARN_ON(msidesc->nvec_used);
+		info.type = X86_IRQ_ALLOC_TYPE_MSI;
+		cnt = nvec;
+	} else {
+		info.type = X86_IRQ_ALLOC_TYPE_MSIX;
+		cnt = 1;
+	}
+
+	domain = irq_remapping_get_irq_domain(&info);
+	if (domain == NULL) {
+		/*
+		 * Multiple MSI vectors only supported with interrupt
+		 * remapping
+		 */
+		if (type == PCI_CAP_ID_MSI && nvec > 1)
+			return 1;
+		domain = msi_default_domain;
+	}
+	if (domain == NULL)
+		return -ENOSYS;
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+		info.msi_desc = msidesc;
+		irq = irq_domain_alloc_irqs(domain, -1, cnt, node, &info);
 		if (irq <= 0)
 			return -ENOSPC;
+	}
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
-		if (ret < 0) {
-			irq_domain_free_irqs(irq, 1);
-			return ret;
-		}
-
+	if (type == PCI_CAP_ID_MSI) {
+		nvec_pow2 = __roundup_pow_of_two(nvec);
+		msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
+		msidesc->nvec_used = nvec;
 	}
+
 	return 0;
 }
 
@@ -172,6 +307,36 @@ void native_teardown_msi_irq(unsigned int irq)
 	irq_domain_free_irqs(irq, 1);
 }
 
+static struct irq_domain *msi_create_domain(struct irq_domain *parent,
+					    int remapped)
+{
+	struct irq_domain *domain;
+
+	domain = irq_domain_add_tree(NULL, &msi_domain_ops,
+				     (void *)(long)remapped);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+void arch_init_msi_domain(struct irq_domain *parent)
+{
+	if (disable_apic)
+		return;
+
+	msi_default_domain = msi_create_domain(parent, 0);
+	if (!msi_default_domain)
+		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
+}
+
+#ifdef CONFIG_IRQ_REMAP
+struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent)
+{
+	return msi_create_domain(parent, 1);
+}
+#endif
+
 #ifdef CONFIG_DMAR_TABLE
 static int
 dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 774ab5ba95f2..e9329fc28c63 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -357,6 +357,8 @@ int __init arch_early_irq_init(void)
 	BUG_ON(x86_vector_domain == NULL);
 	irq_set_default_host(x86_vector_domain);
 
+	arch_init_msi_domain(x86_vector_domain);
+
 	return arch_early_ioapic_init();
 }
 
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7ac44a464be0..bda0d8e73fde 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -178,7 +178,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 }
-- 
1.7.10.4


* [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

Enhance the MSI code to support hierarchy irqdomain, which helps to
make the architecture clearer.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
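Note for readers following the hwirq numbering: below is a minimal
userspace sketch, not part of the patch, of how get_hwirq_from_pcidev()
packs a PCI device and an MSI entry number into one lookup key. The
names pci_devid() and msi_hwirq() are hypothetical stand-ins, and the
sketch assumes a 64-bit key throughout, which irq_hw_number_t does not
guarantee on every build.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PCI_DEVID(): bus in bits 15:8, devfn in bits 7:0. */
static uint16_t pci_devid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/*
 * Pack (PCI domain, bus/devfn, MSI entry number) into one key using the
 * same shifts as get_hwirq_from_pcidev(): entry number in bits 10:0,
 * device ID in bits 26:11, PCI domain number from bit 27 upward.
 */
static uint64_t msi_hwirq(uint32_t pci_domain, uint8_t bus, uint8_t devfn,
			  uint16_t entry_nr)
{
	/* An MSI-X table holds at most 2048 entries, so 11 bits suffice. */
	assert(entry_nr < (1u << 11));

	return (uint64_t)entry_nr |
	       ((uint64_t)pci_devid(bus, devfn) << 11) |
	       ((uint64_t)pci_domain << 27);
}
```

Because the three fields occupy disjoint bit ranges, two different
(device, entry) pairs cannot collide, which is what lets
msi_domain_alloc() use irq_find_mapping() as a duplicate check.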
 arch/x86/include/asm/hw_irq.h        |    6 +
 arch/x86/include/asm/irq_remapping.h |    6 +-
 arch/x86/kernel/apic/msi.c           |  225 +++++++++++++++++++++++++++++-----
 arch/x86/kernel/apic/vector.c        |    2 +
 drivers/iommu/irq_remapping.c        |    1 -
 5 files changed, 204 insertions(+), 36 deletions(-)
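
The tail of native_setup_msi_irqs() stores log2 of the rounded-up
vector count in msi_attrib.multiple. A small userspace sketch of that
arithmetic follows; roundup_pow2(), log2_pow2() and
msi_multiple_field() are hypothetical helpers standing in for
__roundup_pow_of_two() and ilog2(), and are not part of the patch.

```c
#include <assert.h>

/* Smallest power of two that is >= n. MSI allows at most 32 vectors. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n >= 1 && n <= 32);
	while (p < n)
		p <<= 1;
	return p;
}

/* log2 of a value that is already a power of two. */
static unsigned int log2_pow2(unsigned int p)
{
	unsigned int l = 0;

	while (p > 1) {
		p >>= 1;
		l++;
	}
	return l;
}

/* Value that would be stored in msi_attrib.multiple for nvec vectors. */
static unsigned int msi_multiple_field(unsigned int nvec)
{
	return log2_pow2(roundup_pow2(nvec));
}
```

For example, a request for 3 vectors is rounded up to 4 and encoded as
2, while nvec_used still records the 3 vectors actually requested.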

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 57f81f5a9686..9f705c49f850 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -199,6 +199,12 @@ static inline void lock_vector_lock(void) {}
 static inline void unlock_vector_lock(void) {}
 #endif	/* CONFIG_X86_LOCAL_APIC */
 
+#ifdef	CONFIG_PCI_MSI
+extern void arch_init_msi_domain(struct irq_domain *domain);
+#else
+static inline void arch_init_msi_domain(struct irq_domain *domain) { }
+#endif
+
 /* Statistics */
 extern atomic_t irq_err_count;
 extern atomic_t irq_mis_count;
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 428b4e6d637c..440053ca7515 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -73,11 +73,7 @@ extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
  * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
  * parent irqdomain.
  */
-static inline struct irq_domain *
-arch_create_msi_irq_domain(struct irq_domain *parent)
-{
-	return NULL;
-}
+extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent);
 
 /* Get parent irqdomain for interrupt remapping irqdomain */
 static inline struct irq_domain *arch_get_ir_parent_domain(void)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 709fedab44f2..5696703271af 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -21,6 +23,8 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 
+static struct irq_domain *msi_default_domain;
+
 void native_compose_msi_msg(struct pci_dev *pdev,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id)
@@ -76,28 +80,32 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 	return 0;
 }
 
-static int
-msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
+static bool msi_remapped(struct irq_domain *domain)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	struct msi_msg msg;
-	unsigned int dest;
-	int ret;
-
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
+	return domain->host_data != NULL;
+}
 
-	__get_cached_msi_msg(data->msi_desc, &msg);
+static int msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
+			    bool force)
+{
+	struct irq_data *parent = data->parent_data;
+	int ret;
 
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to reprogram MSI registers if interrupt is remapped */
+	if (ret >= 0 && !msi_remapped(data->domain)) {
+		struct irq_cfg *cfg = irqd_cfg(data);
+		struct msi_msg msg;
 
-	__write_msi_msg(data->msi_desc, &msg);
+		__get_cached_msi_msg(data->msi_desc, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		__write_msi_msg(data->msi_desc, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 /*
@@ -108,9 +116,105 @@ static struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= msi_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_print_chip		= irq_remapping_print_chip,
+};
+
+static inline irq_hw_number_t
+get_hwirq_from_pcidev(struct pci_dev *pdev, struct msi_desc *msidesc)
+{
+	return (irq_hw_number_t)msidesc->msi_attrib.entry_nr |
+		PCI_DEVID(pdev->bus->number, pdev->devfn) << 11 |
+		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27;
+}
+
+static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs, void *arg)
+{
+	int i, ret;
+	irq_hw_number_t hwirq;
+	struct irq_alloc_info *info = arg;
+
+	hwirq = get_hwirq_from_pcidev(info->msi_dev, info->msi_desc);
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_msi_desc_off(virq, i, info->msi_desc);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &msi_chip, (void *)(long)i);
+		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
+		dev_dbg(&info->msi_dev->dev, "irq %d for MSI/MSI-X\n",
+			virq + i);
+	}
+
+	return ret;
+}
+
+static void msi_domain_free(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs)
+{
+	int i;
+	struct msi_desc *msidesc = irq_get_msi_desc(virq);
+
+	if (msidesc)
+		msidesc->irq = 0;
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_handler(virq + i, NULL);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int msi_domain_activate(struct irq_domain *domain,
+			       struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	/*
+	 * irq_data->chip_data is MSI/MSIx offset.
+	 * MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (irq_data->chip_data)
+		return 0;
+
+	if (msi_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, 0);
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static int msi_domain_deactivate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	if (irq_data->chip_data)
+		return 0;
+
+	memset(&msg, 0, sizeof(msg));
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops msi_domain_ops = {
+	.alloc = msi_domain_alloc,
+	.free = msi_domain_free,
+	.activate = msi_domain_activate,
+	.deactivate = msi_domain_deactivate,
 };
 
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
@@ -145,25 +249,56 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
+	int irq, cnt, nvec_pow2;
+	struct irq_domain *domain;
 	struct msi_desc *msidesc;
-	int irq, ret;
+	struct irq_alloc_info info;
+	int node = dev_to_node(&dev->dev);
+
+	if (disable_apic)
+		return -ENOSYS;
 
-	/* Multiple MSI vectors only supported with interrupt remapping */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
+	init_irq_alloc_info(&info, NULL);
+	info.msi_dev = dev;
+	if (type == PCI_CAP_ID_MSI) {
+		msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+		WARN_ON(!list_is_singular(&dev->msi_list));
+		WARN_ON(msidesc->irq);
+		WARN_ON(msidesc->msi_attrib.multiple);
+		WARN_ON(msidesc->nvec_used);
+		info.type = X86_IRQ_ALLOC_TYPE_MSI;
+		cnt = nvec;
+	} else {
+		info.type = X86_IRQ_ALLOC_TYPE_MSIX;
+		cnt = 1;
+	}
+
+	domain = irq_remapping_get_irq_domain(&info);
+	if (domain == NULL) {
+		/*
+		 * Multiple MSI vectors only supported with interrupt
+		 * remapping
+		 */
+		if (type == PCI_CAP_ID_MSI && nvec > 1)
+			return 1;
+		domain = msi_default_domain;
+	}
+	if (domain == NULL)
+		return -ENOSYS;
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+		info.msi_desc = msidesc;
+		irq = irq_domain_alloc_irqs(domain, -1, cnt, node, &info);
 		if (irq <= 0)
 			return -ENOSPC;
+	}
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
-		if (ret < 0) {
-			irq_domain_free_irqs(irq, 1);
-			return ret;
-		}
-
+	if (type == PCI_CAP_ID_MSI) {
+		nvec_pow2 = __roundup_pow_of_two(nvec);
+		msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
+		msidesc->nvec_used = nvec;
 	}
+
 	return 0;
 }
 
@@ -172,6 +307,36 @@ void native_teardown_msi_irq(unsigned int irq)
 	irq_domain_free_irqs(irq, 1);
 }
 
+static struct irq_domain *msi_create_domain(struct irq_domain *parent,
+					    int remapped)
+{
+	struct irq_domain *domain;
+
+	domain = irq_domain_add_tree(NULL, &msi_domain_ops,
+				     (void *)(long)remapped);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+void arch_init_msi_domain(struct irq_domain *parent)
+{
+	if (disable_apic)
+		return;
+
+	msi_default_domain = msi_create_domain(parent, 0);
+	if (!msi_default_domain)
+		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
+}
+
+#ifdef CONFIG_IRQ_REMAP
+struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent)
+{
+	return msi_create_domain(parent, 1);
+}
+#endif
+
 #ifdef CONFIG_DMAR_TABLE
 static int
 dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 774ab5ba95f2..e9329fc28c63 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -357,6 +357,8 @@ int __init arch_early_irq_init(void)
 	BUG_ON(x86_vector_domain == NULL);
 	irq_set_default_host(x86_vector_domain);
 
+	arch_init_msi_domain(x86_vector_domain);
+
 	return arch_early_ioapic_init();
 }
 
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7ac44a464be0..bda0d8e73fde 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -178,7 +178,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 }
-- 
1.7.10.4

* [RFC Part2 v1 16/21] x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

DMAR interrupts are not remapped by the interrupt remapping hardware,
so call native_compose_msi_msg() directly to compose the MSI message
for the DMAR IRQ. This will help to simplify the MSI code later.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
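For illustration only, a minimal userspace sketch of what composing a
native (non-remapped) x86 MSI message amounts to: the destination APIC
ID goes into bits 19:12 of the address and the vector into bits 7:0 of
the data. The sk_ names are local to this sketch, not kernel
identifiers. It assumes an 8-bit APIC ID and leaves the remaining
address and data bits (destination mode, delivery mode, trigger) at
zero, which the real native_compose_msi_msg() does not.

```c
#include <stdint.h>

/* Illustrative constants for the x86 MSI layout; names are local to this sketch. */
#define SK_MSI_ADDR_BASE	0xfee00000u	/* fixed interrupt address window */
#define SK_MSI_DEST_SHIFT	12		/* destination ID lives in bits 19:12 */

struct sk_msi_msg {
	uint32_t address_lo;
	uint32_t data;
};

/* Compose a fresh message for a given destination APIC ID and vector. */
static struct sk_msi_msg sk_compose(uint8_t dest_apicid, uint8_t vector)
{
	struct sk_msi_msg msg;

	msg.address_lo = SK_MSI_ADDR_BASE |
			 ((uint32_t)dest_apicid << SK_MSI_DEST_SHIFT);
	msg.data = vector;	/* all other data bits left at 0 in this sketch */
	return msg;
}
```

Since composing the message cannot fail, the caller needs no error
path, which matches the removal of the ret check in this patch.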
 arch/x86/kernel/apic/msi.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 5696703271af..4f2a349ccef0 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -375,12 +375,10 @@ static struct irq_chip dmar_msi_type = {
 
 int arch_setup_dmar_msi(unsigned int irq)
 {
-	int ret;
 	struct msi_msg msg;
+	struct irq_cfg *cfg = irq_cfg(irq);
 
-	ret = msi_compose_msg(NULL, irq, &msg, -1);
-	if (ret < 0)
-		return ret;
+	native_compose_msi_msg(NULL, irq, cfg->dest_apicid, &msg, -1);
 	dmar_msi_write(irq, &msg);
 	irq_set_chip_and_handler_name(irq, &dmar_msi_type, handle_edge_irq,
 				      "edge");
-- 
1.7.10.4

* [RFC Part2 v1 16/21] x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

DMAR interrupts are not remapped by the interrupt remapping hardware,
so call native_compose_msi_msg() directly to compose the MSI message
for the DMAR IRQ. This will help to simplify the MSI code later.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/kernel/apic/msi.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 5696703271af..4f2a349ccef0 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -375,12 +375,10 @@ static struct irq_chip dmar_msi_type = {
 
 int arch_setup_dmar_msi(unsigned int irq)
 {
-	int ret;
 	struct msi_msg msg;
+	struct irq_cfg *cfg = irq_cfg(irq);
 
-	ret = msi_compose_msg(NULL, irq, &msg, -1);
-	if (ret < 0)
-		return ret;
+	native_compose_msi_msg(NULL, irq, cfg->dest_apicid, &msg, -1);
 	dmar_msi_write(irq, &msg);
 	irq_set_chip_and_handler_name(irq, &dmar_msi_type, handle_edge_irq,
 				      "edge");
-- 
1.7.10.4

* [RFC Part2 v1 17/21] x86, htirq: Use hierarchy irqdomain to manage Hypertransport interrupts
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Use hierarchy irqdomain to manage Hypertransport interrupts.
This slightly changes the architecture interfaces used by the htirq
PCI driver. That should be safe because Hypertransport interrupts are
currently only enabled on x86 platforms.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
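As background for the 0xffffffff initialisation that moves into
htirq_domain_alloc(): write_ht_irq_msg() keeps a cached copy of the
last message and only touches config space for a half that changed.
Below is a minimal userspace sketch of that idea, not part of the
patch. The sk_ names are hypothetical, the config-space write is
replaced by a counter, and the handling of the high half is assumed to
mirror the low half.

```c
#include <stdint.h>

struct sk_ht_msg {
	uint32_t address_lo;
	uint32_t address_hi;
};

struct sk_ht_cfg {
	struct sk_ht_msg cached;	/* last message written to the device */
	unsigned int hw_writes;		/* counts simulated config-space writes */
};

/* Seed the cache with a value the first real message is not expected to match. */
static void sk_ht_cfg_init(struct sk_ht_cfg *cfg)
{
	cfg->cached.address_lo = 0xffffffffu;
	cfg->cached.address_hi = 0xffffffffu;
	cfg->hw_writes = 0;
}

/* Write only the halves that differ from the cached copy, then update the cache. */
static void sk_ht_write(struct sk_ht_cfg *cfg, const struct sk_ht_msg *msg)
{
	if (cfg->cached.address_lo != msg->address_lo)
		cfg->hw_writes++;
	if (cfg->cached.address_hi != msg->address_hi)
		cfg->hw_writes++;
	cfg->cached = *msg;
}
```

With this seeding, the first call writes both halves, and repeating
the same message afterwards causes no further device accesses.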
 arch/x86/include/asm/hw_irq.h |   13 ++++
 arch/x86/kernel/apic/htirq.c  |  167 +++++++++++++++++++++++++++++++----------
 arch/x86/kernel/apic/vector.c |    1 +
 drivers/pci/htirq.c           |   47 ++----------
 include/linux/htirq.h         |   24 ++++--
 5 files changed, 165 insertions(+), 87 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 9f705c49f850..1913607f6422 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -148,6 +148,14 @@ struct irq_alloc_info {
 			u32		ioapic_polarity : 1;
 		};
 #endif
+#ifdef	CONFIG_HT_IRQ
+		struct {
+			int		ht_pos;
+			int		ht_idx;
+			struct pci_dev	*ht_dev;
+			void		*ht_update;
+		};
+#endif
 	};
 };
 
@@ -204,6 +212,11 @@ extern void arch_init_msi_domain(struct irq_domain *domain);
 #else
 static inline void arch_init_msi_domain(struct irq_domain *domain) { }
 #endif
+#ifdef	CONFIG_HT_IRQ
+extern void arch_init_htirq_domain(struct irq_domain *domain);
+#else
+static inline void arch_init_htirq_domain(struct irq_domain *domain) { }
+#endif
 
 /* Statistics */
 extern atomic_t irq_err_count;
diff --git a/arch/x86/kernel/apic/htirq.c b/arch/x86/kernel/apic/htirq.c
index 55cb061a95cb..a67da63996df 100644
--- a/arch/x86/kernel/apic/htirq.c
+++ b/arch/x86/kernel/apic/htirq.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -19,69 +21,107 @@
 #include <asm/apic.h>
 #include <asm/hypertransport.h>
 
+static struct irq_domain *htirq_domain;
+
 /*
  * Hypertransport interrupt support
  */
-static void target_ht_irq(unsigned int irq, unsigned int dest, u8 vector)
-{
-	struct ht_irq_msg msg;
-
-	fetch_ht_irq_msg(irq, &msg);
-
-	msg.address_lo &= ~(HT_IRQ_LOW_VECTOR_MASK | HT_IRQ_LOW_DEST_ID_MASK);
-	msg.address_hi &= ~(HT_IRQ_HIGH_DEST_ID_MASK);
-
-	msg.address_lo |= HT_IRQ_LOW_VECTOR(vector) | HT_IRQ_LOW_DEST_ID(dest);
-	msg.address_hi |= HT_IRQ_HIGH_DEST_ID(dest);
-
-	write_ht_irq_msg(irq, &msg);
-}
-
 static int
 ht_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	unsigned int dest;
+	struct irq_data *parent = data->parent_data;
 	int ret;
 
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
-
-	target_ht_irq(data->irq, dest, cfg->vector);
-	return IRQ_SET_MASK_OK_NOCOPY;
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	if (ret >= 0) {
+		struct ht_irq_msg msg;
+		struct irq_cfg *cfg = irqd_cfg(data);
+
+		fetch_ht_irq_msg(data->irq, &msg);
+		msg.address_lo &= ~(HT_IRQ_LOW_VECTOR_MASK |
+				    HT_IRQ_LOW_DEST_ID_MASK);
+		msg.address_lo |= HT_IRQ_LOW_VECTOR(cfg->vector) |
+				  HT_IRQ_LOW_DEST_ID(cfg->dest_apicid);
+		msg.address_hi &= ~(HT_IRQ_HIGH_DEST_ID_MASK);
+		msg.address_hi |= HT_IRQ_HIGH_DEST_ID(cfg->dest_apicid);
+		write_ht_irq_msg(data->irq, &msg);
+	}
+
+	return ret;
 }
 
 static struct irq_chip ht_irq_chip = {
 	.name			= "PCI-HT",
 	.irq_mask		= mask_ht_irq,
 	.irq_unmask		= unmask_ht_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= ht_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 };
 
-int arch_alloc_ht_irq(struct pci_dev *dev)
+static int htirq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			      unsigned int nr_irqs, void *arg)
 {
-	return irq_domain_alloc_irqs(NULL, -1, 1, dev_to_node(&dev->dev), NULL);
+	struct ht_irq_cfg *ht_cfg;
+	struct irq_alloc_info *info = arg;
+	struct pci_dev *dev;
+	irq_hw_number_t hwirq;
+	int ret;
+
+	if (nr_irqs > 1 || !info)
+		return -EINVAL;
+
+	dev = info->ht_dev;
+	hwirq = (info->ht_idx & 0xFF) |
+		PCI_DEVID(dev->bus->number, dev->devfn) << 8 |
+		(pci_domain_nr(dev->bus) & 0xFFFFFFFF) << 24;
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ht_cfg = kmalloc(sizeof(*ht_cfg), GFP_KERNEL);
+	if (!ht_cfg)
+		return -ENOMEM;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0) {
+		kfree(ht_cfg);
+		return ret;
+	}
+
+	/* Initialize msg to a value that will never match the first write. */
+	ht_cfg->msg.address_lo = 0xffffffff;
+	ht_cfg->msg.address_hi = 0xffffffff;
+	ht_cfg->dev = info->ht_dev;
+	ht_cfg->update = info->ht_update;
+	ht_cfg->pos = info->ht_pos;
+	ht_cfg->idx = 0x10 + (info->ht_idx * 2);
+	irq_domain_set_hwirq_and_chip(domain, virq, hwirq, &ht_irq_chip,
+				      ht_cfg);
+	__irq_set_handler(virq, handle_edge_irq, 0, "edge");
+
+	return 0;
 }
 
-void arch_free_ht_irq(int irq)
+static void htirq_domain_free(struct irq_domain *domain, unsigned int virq,
+			      unsigned int nr_irqs)
 {
-	irq_domain_free_irqs(irq, 1);
+	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
+
+	BUG_ON(nr_irqs != 1);
+	if (irq_data && irq_data->chip_data)
+		kfree(irq_data->chip_data);
+	irq_domain_set_hwirq_and_chip(domain, virq, 0, NULL, NULL);
+	irq_set_handler(virq, NULL);
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
 }
 
-int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
+static int htirq_domain_activate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
 {
-	struct irq_cfg *cfg;
 	struct ht_irq_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
 
-	if (disable_apic)
-		return -ENXIO;
-
-	cfg = irq_cfg(irq);
 	msg.address_hi = HT_IRQ_HIGH_DEST_ID(cfg->dest_apicid);
-
 	msg.address_lo =
 		HT_IRQ_LOW_BASE |
 		HT_IRQ_LOW_DEST_ID(cfg->dest_apicid) |
@@ -94,13 +134,60 @@ int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
 			HT_IRQ_LOW_MT_FIXED :
 			HT_IRQ_LOW_MT_ARBITRATED) |
 		HT_IRQ_LOW_IRQ_MASKED;
+	write_ht_irq_msg(irq_data->irq, &msg);
 
-	write_ht_irq_msg(irq, &msg);
+	return 0;
+}
 
-	irq_set_chip_and_handler_name(irq, &ht_irq_chip,
-				      handle_edge_irq, "edge");
+static int htirq_domain_deactivate(struct irq_domain *domain,
+				   struct irq_data *irq_data)
+{
+	struct ht_irq_msg msg;
 
-	dev_dbg(&dev->dev, "irq %d for HT\n", irq);
+	memset(&msg, 0, sizeof(msg));
+	write_ht_irq_msg(irq_data->irq, &msg);
 
 	return 0;
 }
+
+static struct irq_domain_ops htirq_domain_ops = {
+	.alloc = htirq_domain_alloc,
+	.free = htirq_domain_free,
+	.activate = htirq_domain_activate,
+	.deactivate = htirq_domain_deactivate,
+};
+
+void arch_init_htirq_domain(struct irq_domain *parent)
+{
+	if (disable_apic)
+		return;
+
+	htirq_domain = irq_domain_add_tree(NULL, &htirq_domain_ops, NULL);
+	if (!htirq_domain)
+		pr_warn("failed to initialize irqdomain for HTIRQ.\n");
+	else
+		htirq_domain->parent = parent;
+}
+
+int arch_setup_ht_irq(int idx, int pos, struct pci_dev *dev,
+		      ht_irq_update_t *update)
+{
+	struct irq_alloc_info info;
+
+	if (!htirq_domain)
+		return -ENOSYS;
+
+	init_irq_alloc_info(&info, NULL);
+	info.ht_idx = idx;
+	info.ht_pos = pos;
+	info.ht_dev = dev;
+	info.ht_update = update;
+
+	return irq_domain_alloc_irqs(htirq_domain, -1, 1,
+				     dev_to_node(&dev->dev), &info);
+}
+
+void arch_teardown_ht_irq(unsigned int irq)
+{
+	irq_domain_free_irqs(irq, 1);
+}
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index e9329fc28c63..1ddede9d5be7 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -358,6 +358,7 @@ int __init arch_early_irq_init(void)
 	irq_set_default_host(x86_vector_domain);
 
 	arch_init_msi_domain(x86_vector_domain);
+	arch_init_htirq_domain(x86_vector_domain);
 
 	return arch_early_ioapic_init();
 }
diff --git a/drivers/pci/htirq.c b/drivers/pci/htirq.c
index ceb0ebeb7b5f..7eb4109a3df4 100644
--- a/drivers/pci/htirq.c
+++ b/drivers/pci/htirq.c
@@ -23,20 +23,11 @@
  */
 static DEFINE_SPINLOCK(ht_irq_lock);
 
-struct ht_irq_cfg {
-	struct pci_dev *dev;
-	 /* Update callback used to cope with buggy hardware */
-	ht_irq_update_t *update;
-	unsigned pos;
-	unsigned idx;
-	struct ht_irq_msg msg;
-};
-
-
 void write_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg)
 {
 	struct ht_irq_cfg *cfg = irq_get_handler_data(irq);
 	unsigned long flags;
+
 	spin_lock_irqsave(&ht_irq_lock, flags);
 	if (cfg->msg.address_lo != msg->address_lo) {
 		pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx);
@@ -55,6 +46,7 @@ void write_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg)
 void fetch_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg)
 {
 	struct ht_irq_cfg *cfg = irq_get_handler_data(irq);
+
 	*msg = cfg->msg;
 }
 
@@ -86,7 +78,6 @@ void unmask_ht_irq(struct irq_data *data)
  */
 int __ht_create_irq(struct pci_dev *dev, int idx, ht_irq_update_t *update)
 {
-	struct ht_irq_cfg *cfg;
 	int max_irq, pos, irq;
 	unsigned long flags;
 	u32 data;
@@ -105,29 +96,9 @@ int __ht_create_irq(struct pci_dev *dev, int idx, ht_irq_update_t *update)
 	if (idx > max_irq)
 		return -EINVAL;
 
-	cfg = kmalloc(sizeof(*cfg), GFP_KERNEL);
-	if (!cfg)
-		return -ENOMEM;
-
-	cfg->dev = dev;
-	cfg->update = update;
-	cfg->pos = pos;
-	cfg->idx = 0x10 + (idx * 2);
-	/* Initialize msg to a value that will never match the first write. */
-	cfg->msg.address_lo = 0xffffffff;
-	cfg->msg.address_hi = 0xffffffff;
-
-	irq = arch_alloc_ht_irq(dev);
-	if (irq <= 0) {
-		kfree(cfg);
-		return -EBUSY;
-	}
-	irq_set_handler_data(irq, cfg);
-
-	if (arch_setup_ht_irq(irq, dev) < 0) {
-		ht_destroy_irq(irq);
-		return -EBUSY;
-	}
+	irq = arch_setup_ht_irq(idx, pos, dev, update);
+	if (irq > 0)
+		dev_dbg(&dev->dev, "irq %d for HT\n", irq);
 
 	return irq;
 }
@@ -158,12 +129,6 @@ EXPORT_SYMBOL(ht_create_irq);
  */
 void ht_destroy_irq(unsigned int irq)
 {
-	struct ht_irq_cfg *cfg;
-
-	cfg = irq_get_handler_data(irq);
-	irq_set_chip(irq, NULL);
-	irq_set_handler_data(irq, NULL);
-	arch_free_ht_irq(irq);
-	kfree(cfg);
+	arch_teardown_ht_irq(irq);
 }
 EXPORT_SYMBOL(ht_destroy_irq);
diff --git a/include/linux/htirq.h b/include/linux/htirq.h
index 5caa51b7b95c..d4a527e58434 100644
--- a/include/linux/htirq.h
+++ b/include/linux/htirq.h
@@ -1,26 +1,38 @@
 #ifndef LINUX_HTIRQ_H
 #define LINUX_HTIRQ_H
 
+struct pci_dev;
+struct irq_data;
+
 struct ht_irq_msg {
 	u32	address_lo;	/* low 32 bits of the ht irq message */
 	u32	address_hi;	/* high 32 bits of the it irq message */
 };
 
+typedef void (ht_irq_update_t)(struct pci_dev *dev, int irq,
+			       struct ht_irq_msg *msg);
+
+struct ht_irq_cfg {
+	struct pci_dev *dev;
+	 /* Update callback used to cope with buggy hardware */
+	ht_irq_update_t *update;
+	unsigned pos;
+	unsigned idx;
+	struct ht_irq_msg msg;
+};
+
 /* Helper functions.. */
 void fetch_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg);
 void write_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg);
-struct irq_data;
 void mask_ht_irq(struct irq_data *data);
 void unmask_ht_irq(struct irq_data *data);
 
 /* The arch hook for getting things started */
-int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev);
-int arch_alloc_ht_irq(struct pci_dev *dev);
-void arch_free_ht_irq(int irq);
+int arch_setup_ht_irq(int idx, int pos, struct pci_dev *dev,
+		      ht_irq_update_t *update);
+void arch_teardown_ht_irq(unsigned int irq);
 
 /* For drivers of buggy hardware */
-typedef void (ht_irq_update_t)(struct pci_dev *dev, int irq,
-			       struct ht_irq_msg *msg);
 int __ht_create_irq(struct pci_dev *dev, int idx, ht_irq_update_t *update);
 
 #endif /* LINUX_HTIRQ_H */
-- 
1.7.10.4

* [RFC Part2 v1 17/21] x86, htirq: Use hierarchy irqdomain to manage Hypertransport interrupts
@ 2014-09-11 14:03   ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

Use hierarchy irqdomain to manage Hypertransport interrupts.
This slightly changes the architecture interfaces used by the htirq
PCI driver. That should be safe because Hypertransport interrupts are
currently only enabled on x86 platforms.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h |   13 ++++
 arch/x86/kernel/apic/htirq.c  |  167 +++++++++++++++++++++++++++++++----------
 arch/x86/kernel/apic/vector.c |    1 +
 drivers/pci/htirq.c           |   47 ++----------
 include/linux/htirq.h         |   24 ++++--
 5 files changed, 165 insertions(+), 87 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 9f705c49f850..1913607f6422 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -148,6 +148,14 @@ struct irq_alloc_info {
 			u32		ioapic_polarity : 1;
 		};
 #endif
+#ifdef	CONFIG_HT_IRQ
+		struct {
+			int		ht_pos;
+			int		ht_idx;
+			struct pci_dev	*ht_dev;
+			void		*ht_update;
+		};
+#endif
 	};
 };
 
@@ -204,6 +212,11 @@ extern void arch_init_msi_domain(struct irq_domain *domain);
 #else
 static inline void arch_init_msi_domain(struct irq_domain *domain) { }
 #endif
+#ifdef	CONFIG_HT_IRQ
+extern void arch_init_htirq_domain(struct irq_domain *domain);
+#else
+static inline void arch_init_htirq_domain(struct irq_domain *domain) { }
+#endif
 
 /* Statistics */
 extern atomic_t irq_err_count;
diff --git a/arch/x86/kernel/apic/htirq.c b/arch/x86/kernel/apic/htirq.c
index 55cb061a95cb..a67da63996df 100644
--- a/arch/x86/kernel/apic/htirq.c
+++ b/arch/x86/kernel/apic/htirq.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -19,69 +21,107 @@
 #include <asm/apic.h>
 #include <asm/hypertransport.h>
 
+static struct irq_domain *htirq_domain;
+
 /*
  * Hypertransport interrupt support
  */
-static void target_ht_irq(unsigned int irq, unsigned int dest, u8 vector)
-{
-	struct ht_irq_msg msg;
-
-	fetch_ht_irq_msg(irq, &msg);
-
-	msg.address_lo &= ~(HT_IRQ_LOW_VECTOR_MASK | HT_IRQ_LOW_DEST_ID_MASK);
-	msg.address_hi &= ~(HT_IRQ_HIGH_DEST_ID_MASK);
-
-	msg.address_lo |= HT_IRQ_LOW_VECTOR(vector) | HT_IRQ_LOW_DEST_ID(dest);
-	msg.address_hi |= HT_IRQ_HIGH_DEST_ID(dest);
-
-	write_ht_irq_msg(irq, &msg);
-}
-
 static int
 ht_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	unsigned int dest;
+	struct irq_data *parent = data->parent_data;
 	int ret;
 
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
-
-	target_ht_irq(data->irq, dest, cfg->vector);
-	return IRQ_SET_MASK_OK_NOCOPY;
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	if (ret >= 0) {
+		struct ht_irq_msg msg;
+		struct irq_cfg *cfg = irqd_cfg(data);
+
+		fetch_ht_irq_msg(data->irq, &msg);
+		msg.address_lo &= ~(HT_IRQ_LOW_VECTOR_MASK |
+				    HT_IRQ_LOW_DEST_ID_MASK);
+		msg.address_lo |= HT_IRQ_LOW_VECTOR(cfg->vector) |
+				  HT_IRQ_LOW_DEST_ID(cfg->dest_apicid);
+		msg.address_hi &= ~(HT_IRQ_HIGH_DEST_ID_MASK);
+		msg.address_hi |= HT_IRQ_HIGH_DEST_ID(cfg->dest_apicid);
+		write_ht_irq_msg(data->irq, &msg);
+	}
+
+	return ret;
 }
 
 static struct irq_chip ht_irq_chip = {
 	.name			= "PCI-HT",
 	.irq_mask		= mask_ht_irq,
 	.irq_unmask		= unmask_ht_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= ht_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 };
 
-int arch_alloc_ht_irq(struct pci_dev *dev)
+static int htirq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			      unsigned int nr_irqs, void *arg)
 {
-	return irq_domain_alloc_irqs(NULL, -1, 1, dev_to_node(&dev->dev), NULL);
+	struct ht_irq_cfg *ht_cfg;
+	struct irq_alloc_info *info = arg;
+	struct pci_dev *dev;
+	irq_hw_number_t hwirq;
+	int ret;
+
+	if (nr_irqs > 1 || !info)
+		return -EINVAL;
+
+	dev = info->ht_dev;
+	hwirq = (info->ht_idx & 0xFF) |
+		PCI_DEVID(dev->bus->number, dev->devfn) << 8 |
+		(pci_domain_nr(dev->bus) & 0xFFFFFFFF) << 24;
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ht_cfg = kmalloc(sizeof(*ht_cfg), GFP_KERNEL);
+	if (!ht_cfg)
+		return -ENOMEM;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0) {
+		kfree(ht_cfg);
+		return ret;
+	}
+
+	/* Initialize msg to a value that will never match the first write. */
+	ht_cfg->msg.address_lo = 0xffffffff;
+	ht_cfg->msg.address_hi = 0xffffffff;
+	ht_cfg->dev = info->ht_dev;
+	ht_cfg->update = info->ht_update;
+	ht_cfg->pos = info->ht_pos;
+	ht_cfg->idx = 0x10 + (info->ht_idx * 2);
+	irq_domain_set_hwirq_and_chip(domain, virq, hwirq, &ht_irq_chip,
+				      ht_cfg);
+	__irq_set_handler(virq, handle_edge_irq, 0, "edge");
+
+	return 0;
 }
 
-void arch_free_ht_irq(int irq)
+static void htirq_domain_free(struct irq_domain *domain, unsigned int virq,
+			      unsigned int nr_irqs)
 {
-	irq_domain_free_irqs(irq, 1);
+	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
+
+	BUG_ON(nr_irqs != 1);
+	if (irq_data && irq_data->chip_data)
+		kfree(irq_data->chip_data);
+	irq_domain_set_hwirq_and_chip(domain, virq, 0, NULL, NULL);
+	irq_set_handler(virq, NULL);
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
 }
 
-int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
+static int htirq_domain_activate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
 {
-	struct irq_cfg *cfg;
 	struct ht_irq_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
 
-	if (disable_apic)
-		return -ENXIO;
-
-	cfg = irq_cfg(irq);
 	msg.address_hi = HT_IRQ_HIGH_DEST_ID(cfg->dest_apicid);
-
 	msg.address_lo =
 		HT_IRQ_LOW_BASE |
 		HT_IRQ_LOW_DEST_ID(cfg->dest_apicid) |
@@ -94,13 +134,60 @@ int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
 			HT_IRQ_LOW_MT_FIXED :
 			HT_IRQ_LOW_MT_ARBITRATED) |
 		HT_IRQ_LOW_IRQ_MASKED;
+	write_ht_irq_msg(irq_data->irq, &msg);
 
-	write_ht_irq_msg(irq, &msg);
+	return 0;
+}
 
-	irq_set_chip_and_handler_name(irq, &ht_irq_chip,
-				      handle_edge_irq, "edge");
+static int htirq_domain_deactivate(struct irq_domain *domain,
+				   struct irq_data *irq_data)
+{
+	struct ht_irq_msg msg;
 
-	dev_dbg(&dev->dev, "irq %d for HT\n", irq);
+	memset(&msg, 0, sizeof(msg));
+	write_ht_irq_msg(irq_data->irq, &msg);
 
 	return 0;
 }
+
+static struct irq_domain_ops htirq_domain_ops = {
+	.alloc = htirq_domain_alloc,
+	.free = htirq_domain_free,
+	.activate = htirq_domain_activate,
+	.deactivate = htirq_domain_deactivate,
+};
+
+void arch_init_htirq_domain(struct irq_domain *parent)
+{
+	if (disable_apic)
+		return;
+
+	htirq_domain = irq_domain_add_tree(NULL, &htirq_domain_ops, NULL);
+	if (!htirq_domain)
+		pr_warn("failed to initialize irqdomain for HTIRQ.\n");
+	else
+		htirq_domain->parent = parent;
+}
+
+int arch_setup_ht_irq(int idx, int pos, struct pci_dev *dev,
+		      ht_irq_update_t *update)
+{
+	struct irq_alloc_info info;
+
+	if (!htirq_domain)
+		return -ENOSYS;
+
+	init_irq_alloc_info(&info, NULL);
+	info.ht_idx = idx;
+	info.ht_pos = pos;
+	info.ht_dev = dev;
+	info.ht_update = update;
+
+	return irq_domain_alloc_irqs(htirq_domain, -1, 1,
+				     dev_to_node(&dev->dev), &info);
+}
+
+void arch_teardown_ht_irq(unsigned int irq)
+{
+	irq_domain_free_irqs(irq, 1);
+}
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index e9329fc28c63..1ddede9d5be7 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -358,6 +358,7 @@ int __init arch_early_irq_init(void)
 	irq_set_default_host(x86_vector_domain);
 
 	arch_init_msi_domain(x86_vector_domain);
+	arch_init_htirq_domain(x86_vector_domain);
 
 	return arch_early_ioapic_init();
 }
diff --git a/drivers/pci/htirq.c b/drivers/pci/htirq.c
index ceb0ebeb7b5f..7eb4109a3df4 100644
--- a/drivers/pci/htirq.c
+++ b/drivers/pci/htirq.c
@@ -23,20 +23,11 @@
  */
 static DEFINE_SPINLOCK(ht_irq_lock);
 
-struct ht_irq_cfg {
-	struct pci_dev *dev;
-	 /* Update callback used to cope with buggy hardware */
-	ht_irq_update_t *update;
-	unsigned pos;
-	unsigned idx;
-	struct ht_irq_msg msg;
-};
-
-
 void write_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg)
 {
 	struct ht_irq_cfg *cfg = irq_get_handler_data(irq);
 	unsigned long flags;
+
 	spin_lock_irqsave(&ht_irq_lock, flags);
 	if (cfg->msg.address_lo != msg->address_lo) {
 		pci_write_config_byte(cfg->dev, cfg->pos + 2, cfg->idx);
@@ -55,6 +46,7 @@ void write_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg)
 void fetch_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg)
 {
 	struct ht_irq_cfg *cfg = irq_get_handler_data(irq);
+
 	*msg = cfg->msg;
 }
 
@@ -86,7 +78,6 @@ void unmask_ht_irq(struct irq_data *data)
  */
 int __ht_create_irq(struct pci_dev *dev, int idx, ht_irq_update_t *update)
 {
-	struct ht_irq_cfg *cfg;
 	int max_irq, pos, irq;
 	unsigned long flags;
 	u32 data;
@@ -105,29 +96,9 @@ int __ht_create_irq(struct pci_dev *dev, int idx, ht_irq_update_t *update)
 	if (idx > max_irq)
 		return -EINVAL;
 
-	cfg = kmalloc(sizeof(*cfg), GFP_KERNEL);
-	if (!cfg)
-		return -ENOMEM;
-
-	cfg->dev = dev;
-	cfg->update = update;
-	cfg->pos = pos;
-	cfg->idx = 0x10 + (idx * 2);
-	/* Initialize msg to a value that will never match the first write. */
-	cfg->msg.address_lo = 0xffffffff;
-	cfg->msg.address_hi = 0xffffffff;
-
-	irq = arch_alloc_ht_irq(dev);
-	if (irq <= 0) {
-		kfree(cfg);
-		return -EBUSY;
-	}
-	irq_set_handler_data(irq, cfg);
-
-	if (arch_setup_ht_irq(irq, dev) < 0) {
-		ht_destroy_irq(irq);
-		return -EBUSY;
-	}
+	irq = arch_setup_ht_irq(idx, pos, dev, update);
+	if (irq > 0)
+		dev_dbg(&dev->dev, "irq %d for HT\n", irq);
 
 	return irq;
 }
@@ -158,12 +129,6 @@ EXPORT_SYMBOL(ht_create_irq);
  */
 void ht_destroy_irq(unsigned int irq)
 {
-	struct ht_irq_cfg *cfg;
-
-	cfg = irq_get_handler_data(irq);
-	irq_set_chip(irq, NULL);
-	irq_set_handler_data(irq, NULL);
-	arch_free_ht_irq(irq);
-	kfree(cfg);
+	arch_teardown_ht_irq(irq);
 }
 EXPORT_SYMBOL(ht_destroy_irq);
diff --git a/include/linux/htirq.h b/include/linux/htirq.h
index 5caa51b7b95c..d4a527e58434 100644
--- a/include/linux/htirq.h
+++ b/include/linux/htirq.h
@@ -1,26 +1,38 @@
 #ifndef LINUX_HTIRQ_H
 #define LINUX_HTIRQ_H
 
+struct pci_dev;
+struct irq_data;
+
 struct ht_irq_msg {
 	u32	address_lo;	/* low 32 bits of the ht irq message */
 	u32	address_hi;	/* high 32 bits of the it irq message */
 };
 
+typedef void (ht_irq_update_t)(struct pci_dev *dev, int irq,
+			       struct ht_irq_msg *msg);
+
+struct ht_irq_cfg {
+	struct pci_dev *dev;
+	 /* Update callback used to cope with buggy hardware */
+	ht_irq_update_t *update;
+	unsigned pos;
+	unsigned idx;
+	struct ht_irq_msg msg;
+};
+
 /* Helper functions.. */
 void fetch_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg);
 void write_ht_irq_msg(unsigned int irq, struct ht_irq_msg *msg);
-struct irq_data;
 void mask_ht_irq(struct irq_data *data);
 void unmask_ht_irq(struct irq_data *data);
 
 /* The arch hook for getting things started */
-int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev);
-int arch_alloc_ht_irq(struct pci_dev *dev);
-void arch_free_ht_irq(int irq);
+int arch_setup_ht_irq(int idx, int pos, struct pci_dev *dev,
+		      ht_irq_update_t *update);
+void arch_teardown_ht_irq(unsigned int irq);
 
 /* For drivers of buggy hardware */
-typedef void (ht_irq_update_t)(struct pci_dev *dev, int irq,
-			       struct ht_irq_msg *msg);
 int __ht_create_irq(struct pci_dev *dev, int idx, ht_irq_update_t *update);
 
 #endif /* LINUX_HTIRQ_H */
-- 
1.7.10.4
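
For readers following the stacked irq_chip idea in ht_set_affinity() above,
here is a minimal userspace sketch of the delegation pattern: the child chip
first asks its parent to pick a destination and vector, and only on success
rewrites its own hardware message. Everything prefixed mock_/MOCK_ is
hypothetical and exists only for illustration; the field layout is an
assumption, not a copy of asm/hypertransport.h, and the CPU-to-vector rule in
the parent is invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative field layout only; the real masks live in asm/hypertransport.h. */
#define MOCK_LOW_VECTOR_MASK	0x00ff0000u
#define MOCK_LOW_VECTOR(v)	(((uint32_t)(v) << 16) & MOCK_LOW_VECTOR_MASK)
#define MOCK_LOW_DEST_ID_MASK	0x0000ff00u
#define MOCK_LOW_DEST_ID(d)	(((uint32_t)(d) << 8) & MOCK_LOW_DEST_ID_MASK)
#define MOCK_NR_CPUS		4u

struct mock_irq_data;

struct mock_irq_chip {
	int (*set_affinity)(struct mock_irq_data *d, unsigned int cpu);
};

struct mock_irq_data {
	const struct mock_irq_chip *chip;
	struct mock_irq_data *parent_data;	/* next level up, NULL at the root */
	unsigned int vector;			/* owned by the root (vector) level */
	unsigned int dest_apicid;		/* owned by the root (vector) level */
	uint32_t msg_lo;			/* owned by the child (HT) level */
};

/* Root level: choose a destination and a vector, or fail. */
static int mock_vector_set_affinity(struct mock_irq_data *d, unsigned int cpu)
{
	if (cpu >= MOCK_NR_CPUS)
		return -1;
	d->dest_apicid = cpu;
	d->vector = 0x30 + cpu;		/* invented rule, for the sketch only */
	return 0;
}

/* Child level: delegate first, then reprogram only the fields we own. */
static int mock_ht_set_affinity(struct mock_irq_data *d, unsigned int cpu)
{
	struct mock_irq_data *parent = d->parent_data;
	int ret;

	assert(parent != NULL && parent->chip != NULL);

	ret = parent->chip->set_affinity(parent, cpu);
	if (ret >= 0) {
		d->msg_lo &= ~(MOCK_LOW_VECTOR_MASK | MOCK_LOW_DEST_ID_MASK);
		d->msg_lo |= MOCK_LOW_VECTOR(parent->vector) |
			     MOCK_LOW_DEST_ID(parent->dest_apicid);
	}
	return ret;
}

static const struct mock_irq_chip mock_vector_chip = {
	.set_affinity = mock_vector_set_affinity,
};

static const struct mock_irq_chip mock_ht_chip = {
	.set_affinity = mock_ht_set_affinity,
};
```

If the parent refuses the request, the child leaves its message untouched,
which matches the "if (ret >= 0)" guard in the patch.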

^ permalink raw reply related	[flat|nested] 110+ messages in thread
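
htirq_domain_alloc() in the patch above builds a hwirq key out of the HT
interrupt index, the PCI bus/devfn pair and the PCI domain number. The sketch
below shows one way to pack and unpack such a key in userspace C. The helper
names are hypothetical. pci_devid() mirrors the kernel's PCI_DEVID() macro
(bus in bits 15..8, devfn in bits 7..0). Note one deliberate difference: the
sketch widens every field to 64 bits before shifting so that the whole domain
number survives, which is an assumption about the intent rather than a
description of what the posted expression evaluates to.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as the kernel's PCI_DEVID(): bus in bits 15..8, devfn in 7..0. */
static inline uint16_t pci_devid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/*
 * Hypothetical helper: pack (PCI domain, bus/devfn, HT index) into one key.
 * Bits 7..0 hold the index, bits 23..8 the devid, bits 55..24 the domain.
 */
static inline uint64_t ht_pack_hwirq(uint32_t pci_domain, uint8_t bus,
				     uint8_t devfn, unsigned int ht_idx)
{
	assert(ht_idx <= 0xff);	/* only the low 8 bits of the index fit */

	return (uint64_t)(ht_idx & 0xff) |
	       ((uint64_t)pci_devid(bus, devfn) << 8) |
	       ((uint64_t)pci_domain << 24);
}

/* Hypothetical inverse helpers, handy when debugging a mapping. */
static inline unsigned int ht_hwirq_idx(uint64_t hwirq)
{
	return (unsigned int)(hwirq & 0xff);
}

static inline uint16_t ht_hwirq_devid(uint64_t hwirq)
{
	return (uint16_t)((hwirq >> 8) & 0xffff);
}

static inline uint32_t ht_hwirq_domain(uint64_t hwirq)
{
	return (uint32_t)(hwirq >> 24);
}
```

Because the three fields never overlap, two different (domain, device, index)
triples cannot collide, which is what the irq_find_mapping() check in the
patch relies on to report -EEXIST.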

* [RFC Part2 v1 18/21] iommu/vt-d: Clean up unused MSI related code
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Now that MSI interrupts have been converted to the new hierarchy irqdomain
interfaces, remove the legacy MSI-related code.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 drivers/iommu/intel_irq_remapping.c |  144 -----------------------------------
 1 file changed, 144 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 8bac5935e0d5..840db5fee320 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -146,44 +146,6 @@ static int qi_flush_iec(struct intel_iommu *iommu, int index, int mask)
 	return qi_submit_sync(&desc, iommu);
 }
 
-static int map_irq_to_irte_handle(int irq, u16 *sub_handle)
-{
-	struct irq_2_iommu *irq_iommu = irq_2_iommu(irq);
-	unsigned long flags;
-	int index;
-
-	if (!irq_iommu)
-		return -1;
-
-	raw_spin_lock_irqsave(&irq_2_ir_lock, flags);
-	*sub_handle = irq_iommu->sub_handle;
-	index = irq_iommu->irte_index;
-	raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);
-	return index;
-}
-
-static int set_irte_irq(int irq, struct intel_iommu *iommu, u16 index, u16 subhandle)
-{
-	struct irq_2_iommu *irq_iommu = irq_2_iommu(irq);
-	struct irq_cfg *cfg = irq_cfg(irq);
-	unsigned long flags;
-
-	if (!irq_iommu)
-		return -1;
-
-	raw_spin_lock_irqsave(&irq_2_ir_lock, flags);
-
-	cfg->remapped = 1;
-	irq_iommu->iommu = iommu;
-	irq_iommu->irte_index = index;
-	irq_iommu->sub_handle = subhandle;
-	irq_iommu->irte_mask = 0;
-
-	raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);
-
-	return 0;
-}
-
 static int modify_irte(struct irq_2_iommu *irq_iommu,
 		       struct irte *irte_modified)
 {
@@ -1072,108 +1034,6 @@ intel_ioapic_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	return 0;
 }
 
-static void intel_compose_msi_msg(struct pci_dev *pdev,
-				  unsigned int irq, unsigned int dest,
-				  struct msi_msg *msg, u8 hpet_id)
-{
-	struct irq_cfg *cfg;
-	struct irte irte;
-	u16 sub_handle = 0;
-	int ir_index;
-
-	cfg = irq_cfg(irq);
-
-	ir_index = map_irq_to_irte_handle(irq, &sub_handle);
-	BUG_ON(ir_index == -1);
-
-	prepare_irte(&irte, cfg->vector, dest);
-
-	/* Set source-id of interrupt request */
-	if (pdev)
-		set_msi_sid(&irte, pdev);
-	else
-		set_hpet_sid(&irte, hpet_id);
-
-	modify_irte(irq_2_iommu(irq), &irte);
-
-	msg->address_hi = MSI_ADDR_BASE_HI;
-	msg->data = sub_handle;
-	msg->address_lo = MSI_ADDR_BASE_LO | MSI_ADDR_IR_EXT_INT |
-			  MSI_ADDR_IR_SHV |
-			  MSI_ADDR_IR_INDEX1(ir_index) |
-			  MSI_ADDR_IR_INDEX2(ir_index);
-}
-
-/*
- * Map the PCI dev to the corresponding remapping hardware unit
- * and allocate 'nvec' consecutive interrupt-remapping table entries
- * in it.
- */
-static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
-{
-	struct intel_iommu *iommu;
-	int index;
-
-	down_read(&dmar_global_lock);
-	iommu = map_dev_to_ir(dev);
-	if (!iommu) {
-		printk(KERN_ERR
-		       "Unable to map PCI %s to iommu\n", pci_name(dev));
-		index = -ENOENT;
-	} else {
-		index = alloc_irte(iommu, irq, irq_2_iommu(irq), nvec);
-		if (index < 0) {
-			printk(KERN_ERR
-			       "Unable to allocate %d IRTE for PCI %s\n",
-			       nvec, pci_name(dev));
-			index = -ENOSPC;
-		}
-	}
-	up_read(&dmar_global_lock);
-
-	return index;
-}
-
-static int intel_msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
-			       int index, int sub_handle)
-{
-	struct intel_iommu *iommu;
-	int ret = -ENOENT;
-
-	down_read(&dmar_global_lock);
-	iommu = map_dev_to_ir(pdev);
-	if (iommu) {
-		/*
-		 * setup the mapping between the irq and the IRTE
-		 * base index, the sub_handle pointing to the
-		 * appropriate interrupt remap table entry.
-		 */
-		set_irte_irq(irq, iommu, index, sub_handle);
-		ret = 0;
-	}
-	up_read(&dmar_global_lock);
-
-	return ret;
-}
-
-static int intel_setup_hpet_msi(unsigned int irq, unsigned int id)
-{
-	int ret = -1;
-	struct intel_iommu *iommu;
-	int index;
-
-	down_read(&dmar_global_lock);
-	iommu = map_hpet_to_ir(id);
-	if (iommu) {
-		index = alloc_irte(iommu, irq, irq_2_iommu(irq), 1);
-		if (index >= 0)
-			ret = 0;
-	}
-	up_read(&dmar_global_lock);
-
-	return ret;
-}
-
 static struct irq_domain *intel_get_ir_irq_domain(struct irq_alloc_info *info)
 {
 	struct intel_iommu *iommu = NULL;
@@ -1247,10 +1107,6 @@ struct irq_remap_ops intel_irq_remap_ops = {
 	.setup_ioapic_entry	= intel_setup_ioapic_entry,
 	.set_affinity		= intel_ioapic_set_affinity,
 	.free_irq		= free_irte,
-	.compose_msi_msg	= intel_compose_msi_msg,
-	.msi_alloc_irq		= intel_msi_alloc_irq,
-	.msi_setup_irq		= intel_msi_setup_irq,
-	.setup_hpet_msi		= intel_setup_hpet_msi,
 	.get_ir_irq_domain	= intel_get_ir_irq_domain,
 	.get_irq_domain		= intel_get_irq_domain,
 	.get_ioapic_entry	= intel_get_ioapic_entry,
-- 
1.7.10.4
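
As background on what the removed intel_compose_msi_msg() produced, the sketch
below composes a remappable-format MSI message in userspace C. The SK_ macros
are local stand-ins whose values are written from memory of the x86 MSI
definitions (bit 4 selects the remappable format, bit 3 is the sub-handle-valid
flag, and the 16-bit table index is split across bit 2 and bits 19..5); treat
them as assumptions and check them against the tree before relying on them.
The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for MSI_ADDR_*; values are assumptions written from memory. */
#define SK_MSI_ADDR_BASE_HI		0u
#define SK_MSI_ADDR_BASE_LO		0xfee00000u
#define SK_MSI_ADDR_IR_EXT_INT		(1u << 4)	/* remappable format */
#define SK_MSI_ADDR_IR_SHV		(1u << 3)	/* sub-handle valid */
#define SK_MSI_ADDR_IR_INDEX1(i)	(((uint32_t)(i) & 0x8000u) >> 13)
#define SK_MSI_ADDR_IR_INDEX2(i)	(((uint32_t)(i) & 0x7fffu) << 5)

struct sk_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/*
 * Hypothetical helper: with remapping, the message no longer carries a
 * vector or destination. It carries the table index in the address and the
 * sub-handle in the data word; the table entry holds the real target.
 */
static inline void sk_compose_remapped_msi(struct sk_msi_msg *msg,
					   uint16_t ir_index,
					   uint16_t sub_handle)
{
	assert(msg != NULL);

	msg->address_hi = SK_MSI_ADDR_BASE_HI;
	msg->data = sub_handle;
	msg->address_lo = SK_MSI_ADDR_BASE_LO | SK_MSI_ADDR_IR_EXT_INT |
			  SK_MSI_ADDR_IR_SHV |
			  SK_MSI_ADDR_IR_INDEX1(ir_index) |
			  SK_MSI_ADDR_IR_INDEX2(ir_index);
}
```

The point of the sketch is that the device-visible message depends only on the
index and sub-handle, so this composition can live in the remapping irqdomain
instead of a per-vendor compose_msi_msg callback.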

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 19/21] iommu/amd: Clean up unused MSI related code
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Now that MSI interrupts have been converted to the new hierarchy irqdomain
interfaces, remove the legacy MSI-related code.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 drivers/iommu/amd_iommu.c |  115 +--------------------------------------------
 1 file changed, 2 insertions(+), 113 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 71ab03949599..e34587a9d317 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3940,8 +3940,7 @@ out_unlock:
 	return table;
 }
 
-static int alloc_irq_index(struct irq_cfg *cfg, struct irq_2_irte *irte_info,
-			   u16 devid, int count)
+static int alloc_irq_index(u16 devid, int count)
 {
 	struct irq_remap_table *table;
 	unsigned long flags;
@@ -3967,11 +3966,6 @@ static int alloc_irq_index(struct irq_cfg *cfg, struct irq_2_irte *irte_info,
 				table->table[index - c + 1] = IRTE_ALLOCATED;
 
 			index -= count - 1;
-
-			cfg->remapped	      = 1;
-			irte_info->devid      = devid;
-			irte_info->index      = index;
-
 			goto out;
 		}
 	}
@@ -4171,106 +4165,6 @@ static int free_irq(int irq)
 	return 0;
 }
 
-static void compose_msi_msg(struct pci_dev *pdev,
-			    unsigned int irq, unsigned int dest,
-			    struct msi_msg *msg, u8 hpet_id)
-{
-	struct irq_2_irte *irte_info;
-	struct irq_cfg *cfg;
-	union irte irte;
-
-	cfg = irq_cfg(irq);
-	if (!cfg)
-		return;
-
-	irte_info = &cfg->irq_2_irte;
-
-	irte.val		= 0;
-	irte.fields.vector	= cfg->vector;
-	irte.fields.int_type    = apic->irq_delivery_mode;
-	irte.fields.destination	= dest;
-	irte.fields.dm		= apic->irq_dest_mode;
-	irte.fields.valid	= 1;
-
-	modify_irte(irte_info->devid, irte_info->index, irte);
-
-	msg->address_hi = MSI_ADDR_BASE_HI;
-	msg->address_lo = MSI_ADDR_BASE_LO;
-	msg->data       = irte_info->index;
-}
-
-static int msi_alloc_irq(struct pci_dev *pdev, int irq, int nvec)
-{
-	struct irq_cfg *cfg;
-	int index;
-	u16 devid;
-
-	if (!pdev)
-		return -EINVAL;
-
-	cfg = irq_cfg(irq);
-	if (!cfg)
-		return -EINVAL;
-
-	devid = get_device_id(&pdev->dev);
-	index = alloc_irq_index(cfg, &cfg->irq_2_irte, devid, nvec);
-
-	return index < 0 ? MAX_IRQS_PER_TABLE : index;
-}
-
-static int msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
-			 int index, int offset)
-{
-	struct irq_2_irte *irte_info;
-	struct irq_cfg *cfg;
-	u16 devid;
-
-	if (!pdev)
-		return -EINVAL;
-
-	cfg = irq_cfg(irq);
-	if (!cfg)
-		return -EINVAL;
-
-	if (index >= MAX_IRQS_PER_TABLE)
-		return 0;
-
-	devid		= get_device_id(&pdev->dev);
-	irte_info	= &cfg->irq_2_irte;
-
-	cfg->remapped	      = 1;
-	irte_info->devid      = devid;
-	irte_info->index      = index + offset;
-
-	return 0;
-}
-
-static int setup_hpet_msi(unsigned int irq, unsigned int id)
-{
-	struct irq_2_irte *irte_info;
-	struct irq_cfg *cfg;
-	int index, devid;
-
-	cfg = irq_cfg(irq);
-	if (!cfg)
-		return -EINVAL;
-
-	irte_info = &cfg->irq_2_irte;
-	devid     = get_hpet_devid(id);
-	if (devid < 0)
-		return devid;
-
-	index = alloc_irq_index(cfg, &cfg->irq_2_irte, devid, 1);
-	if (index < 0)
-		return index;
-
-	cfg->remapped	      = 1;
-	irte_info->devid      = devid;
-	irte_info->index      = index;
-
-	return 0;
-}
-
 static int get_devid(struct irq_alloc_info *info)
 {
 	int devid = -1;
@@ -4363,10 +4257,6 @@ struct irq_remap_ops amd_iommu_irq_ops = {
 	.setup_ioapic_entry	= setup_ioapic_entry,
 	.set_affinity		= set_affinity,
 	.free_irq		= free_irq,
-	.compose_msi_msg	= compose_msi_msg,
-	.msi_alloc_irq		= msi_alloc_irq,
-	.msi_setup_irq		= msi_setup_irq,
-	.setup_hpet_msi		= setup_hpet_msi,
 	.get_ir_irq_domain	= get_ir_irq_domain,
 	.get_irq_domain		= get_irq_domain,
 	.get_ioapic_entry	= get_ioapic_entry,
@@ -4457,8 +4347,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
 		else
 			ret = -ENOMEM;
 	} else {
-		cfg = irq_cfg(virq);
-		index = alloc_irq_index(cfg, &data->irq_2_irte, devid, nr_irqs);
+		index = alloc_irq_index(devid, nr_irqs);
 	}
 	if (index < 0) {
 		pr_warn("Failed to allocate IRTE\n");
-- 
1.7.10.4
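
After this patch alloc_irq_index() only searches the remapping table and
returns an index, leaving the bookkeeping to its caller. The search itself can
be sketched in userspace C as a scan for "count" consecutive free slots. The
function name, the table representation and the marker value below are
hypothetical, and locking plus the per-device table lookup are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SK_IRTE_ALLOCATED	0xfffffffeu	/* hypothetical "in use" marker */

/*
 * Hypothetical sketch: find `count` consecutive zero entries, mark them
 * allocated and return the index of the first one, or -ENOSPC.
 */
static inline int sk_alloc_irq_index(uint32_t *table, size_t nr_entries,
				     int count)
{
	size_t index;
	int c = 0;	/* length of the current run of free entries */

	assert(table != NULL);
	if (count <= 0)
		return -EINVAL;

	for (index = 0; index < nr_entries; index++) {
		if (table[index] == 0)
			c++;
		else
			c = 0;

		if (c == count) {
			/* index is the last slot of the run; claim all of it */
			for (; c != 0; --c)
				table[index - c + 1] = SK_IRTE_ALLOCATED;
			return (int)(index - count + 1);
		}
	}

	return -ENOSPC;
}
```

Keeping the allocator free of irq_cfg and irq_2_irte side effects is what
allows irq_remapping_alloc() to call it with just a device id and a count.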

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 20/21] x86: irq_remapping: Clean up unused MSI related code
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Now that MSI interrupts have been converted to the new hierarchy irqdomain
interfaces, remove the legacy MSI-related code and interfaces.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/irq_remapping.h |   13 ---
 arch/x86/include/asm/pci.h           |    5 --
 arch/x86/kernel/x86_init.c           |    2 -
 drivers/iommu/irq_remapping.c        |  153 ----------------------------------
 drivers/iommu/irq_remapping.h        |   15 ----
 5 files changed, 188 deletions(-)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 440053ca7515..8dc1d06f74d9 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -48,10 +48,6 @@ extern int setup_ioapic_remapped_entry(int irq,
 				       int vector,
 				       struct io_apic_irq_attr *attr);
 extern void free_remapped_irq(int irq);
-extern void compose_remapped_msi_msg(struct pci_dev *pdev,
-				     unsigned int irq, unsigned int dest,
-				     struct msi_msg *msg, u8 hpet_id);
-extern int setup_hpet_msi_remapped(unsigned int irq, unsigned int id);
 extern void panic_if_irq_remap(const char *msg);
 extern bool setup_remapped_irq(int irq,
 			       struct irq_cfg *cfg,
@@ -100,15 +96,6 @@ static inline int setup_ioapic_remapped_entry(int irq,
 	return -ENODEV;
 }
 static inline void free_remapped_irq(int irq) { }
-static inline void compose_remapped_msi_msg(struct pci_dev *pdev,
-					    unsigned int irq, unsigned int dest,
-					    struct msi_msg *msg, u8 hpet_id)
-{
-}
-static inline int setup_hpet_msi_remapped(unsigned int irq, unsigned int id)
-{
-	return -ENODEV;
-}
 
 static inline void panic_if_irq_remap(const char *msg)
 {
diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 4e370a5d8117..d8c80ff32e8c 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -96,15 +96,10 @@ extern void pci_iommu_alloc(void);
 #ifdef CONFIG_PCI_MSI
 /* implemented in arch/x86/kernel/apic/io_apic. */
 struct msi_desc;
-void native_compose_msi_msg(struct pci_dev *pdev, unsigned int irq,
-			    unsigned int dest, struct msi_msg *msg, u8 hpet_id);
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void native_teardown_msi_irq(unsigned int irq);
 void native_restore_msi_irqs(struct pci_dev *dev);
-int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
-		  unsigned int irq_base, unsigned int irq_offset);
 #else
-#define native_compose_msi_msg		NULL
 #define native_setup_msi_irqs		NULL
 #define native_teardown_msi_irq		NULL
 #endif
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index e48b674639cc..814fcbadaad1 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -111,11 +111,9 @@ EXPORT_SYMBOL_GPL(x86_platform);
 #if defined(CONFIG_PCI_MSI)
 struct x86_msi_ops x86_msi = {
 	.setup_msi_irqs		= native_setup_msi_irqs,
-	.compose_msi_msg	= native_compose_msi_msg,
 	.teardown_msi_irq	= native_teardown_msi_irq,
 	.teardown_msi_irqs	= default_teardown_msi_irqs,
 	.restore_msi_irqs	= default_restore_msi_irqs,
-	.setup_hpet_msi		= default_setup_hpet_msi,
 	.msi_mask_irq		= default_msi_mask_irq,
 	.msix_mask_irq		= default_msix_mask_irq,
 };
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index bda0d8e73fde..d72094e01dec 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -25,9 +25,6 @@ int no_x2apic_optout;
 
 static struct irq_remap_ops *remap_ops;
 
-static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec);
-static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
-				  int index, int sub_handle);
 static int set_remapped_irq_affinity(struct irq_data *data,
 				     const struct cpumask *mask,
 				     bool force);
@@ -50,117 +47,6 @@ static void irq_remapping_disable_io_apic(void)
 		disconnect_bsp_APIC(0);
 }
 
-#ifndef CONFIG_GENERIC_IRQ_LEGACY_ALLOC_HWIRQ
-static unsigned int irq_alloc_hwirqs(int cnt, int node)
-{
-	return irq_domain_alloc_irqs(NULL, -1, cnt, node, NULL);
-}
-
-static void irq_free_hwirqs(unsigned int from, int cnt)
-{
-	irq_domain_free_irqs(from, cnt);
-}
-#endif
-
-static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
-{
-	int ret, sub_handle, nvec_pow2, index = 0;
-	unsigned int irq;
-	struct msi_desc *msidesc;
-
-	WARN_ON(!list_is_singular(&dev->msi_list));
-	msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
-	WARN_ON(msidesc->irq);
-	WARN_ON(msidesc->msi_attrib.multiple);
-	WARN_ON(msidesc->nvec_used);
-
-	irq = irq_alloc_hwirqs(nvec, dev_to_node(&dev->dev));
-	if (irq == 0)
-		return -ENOSPC;
-
-	nvec_pow2 = __roundup_pow_of_two(nvec);
-	msidesc->nvec_used = nvec;
-	msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
-	for (sub_handle = 0; sub_handle < nvec; sub_handle++) {
-		if (!sub_handle) {
-			index = msi_alloc_remapped_irq(dev, irq, nvec_pow2);
-			if (index < 0) {
-				ret = index;
-				goto error;
-			}
-		} else {
-			ret = msi_setup_remapped_irq(dev, irq + sub_handle,
-						     index, sub_handle);
-			if (ret < 0)
-				goto error;
-		}
-		ret = setup_msi_irq(dev, msidesc, irq, sub_handle);
-		if (ret < 0)
-			goto error;
-	}
-	return 0;
-
-error:
-	irq_free_hwirqs(irq, nvec);
-
-	/*
-	 * Restore altered MSI descriptor fields and prevent just destroyed
-	 * IRQs from tearing down again in default_teardown_msi_irqs()
-	 */
-	msidesc->irq = 0;
-	msidesc->nvec_used = 0;
-	msidesc->msi_attrib.multiple = 0;
-
-	return ret;
-}
-
-static int do_setup_msix_irqs(struct pci_dev *dev, int nvec)
-{
-	int node, ret, sub_handle, index = 0;
-	struct msi_desc *msidesc;
-	unsigned int irq;
-
-	node		= dev_to_node(&dev->dev);
-	sub_handle	= 0;
-
-	list_for_each_entry(msidesc, &dev->msi_list, list) {
-
-		irq = irq_alloc_hwirqs(1, node);
-		if (irq == 0)
-			return -1;
-
-		if (sub_handle == 0)
-			ret = index = msi_alloc_remapped_irq(dev, irq, nvec);
-		else
-			ret = msi_setup_remapped_irq(dev, irq, index, sub_handle);
-
-		if (ret < 0)
-			goto error;
-
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
-		if (ret < 0)
-			goto error;
-
-		sub_handle += 1;
-		irq        += 1;
-	}
-
-	return 0;
-
-error:
-	irq_free_hwirqs(irq, 1);
-	return ret;
-}
-
-static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
-					int nvec, int type)
-{
-	if (type == PCI_CAP_ID_MSI)
-		return do_setup_msi_irqs(dev, nvec);
-	else
-		return do_setup_msix_irqs(dev, nvec);
-}
-
 static void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
 {
 	/*
@@ -178,8 +64,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
-	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 }
 
 static __init int setup_nointremap(char *str)
@@ -326,43 +210,6 @@ void free_remapped_irq(int irq)
 		remap_ops->free_irq(irq);
 }
 
-void compose_remapped_msi_msg(struct pci_dev *pdev,
-			      unsigned int irq, unsigned int dest,
-			      struct msi_msg *msg, u8 hpet_id)
-{
-	struct irq_cfg *cfg = irq_cfg(irq);
-
-	if (!irq_remapped(cfg))
-		native_compose_msi_msg(pdev, irq, dest, msg, hpet_id);
-	else if (remap_ops && remap_ops->compose_msi_msg)
-		remap_ops->compose_msi_msg(pdev, irq, dest, msg, hpet_id);
-}
-
-static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec)
-{
-	if (!remap_ops || !remap_ops->msi_alloc_irq)
-		return -ENODEV;
-
-	return remap_ops->msi_alloc_irq(pdev, irq, nvec);
-}
-
-static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
-				  int index, int sub_handle)
-{
-	if (!remap_ops || !remap_ops->msi_setup_irq)
-		return -ENODEV;
-
-	return remap_ops->msi_setup_irq(pdev, irq, index, sub_handle);
-}
-
-int setup_hpet_msi_remapped(unsigned int irq, unsigned int id)
-{
-	if (!remap_ops || !remap_ops->setup_hpet_msi)
-		return -ENODEV;
-
-	return remap_ops->setup_hpet_msi(irq, id);
-}
-
 void panic_if_irq_remap(const char *msg)
 {
 	if (irq_remapping_enabled)
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 6e46074f06d0..474e20be528f 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -70,21 +70,6 @@ struct irq_remap_ops {
 	/* Free an IRQ */
 	int (*free_irq)(int);
 
-	/* Create MSI msg to use for interrupt remapping */
-	void (*compose_msi_msg)(struct pci_dev *,
-				unsigned int, unsigned int,
-				struct msi_msg *, u8);
-
-	/* Allocate remapping resources for MSI */
-	int (*msi_alloc_irq)(struct pci_dev *, int, int);
-
-	/* Setup the remapped MSI irq */
-	int (*msi_setup_irq)(struct pci_dev *, unsigned int, int, int);
-
-	/* Setup interrupt remapping for an HPET MSI */
-	int (*setup_hpet_msi)(unsigned int, unsigned int);
-
-	/* Get the irqdomain associated the IOMMU device */
 	struct irq_domain *(*get_ir_irq_domain)(struct irq_alloc_info *);
 
 	/* Get the MSI irqdomain associated with the IOMMU device */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RFC Part2 v1 21/21] x86, irq: Clean up unused MSI related code and interfaces
  2014-09-11 14:03 ` Jiang Liu
@ 2014-09-11 14:03   ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

Now that MSI interrupts have been converted to the new hierarchy irqdomain
interfaces, remove the legacy MSI-related code and interfaces.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
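A minimal user-space sketch of the interface change this patch makes in
arch/x86/kernel/apic/msi.c: the compose helper now takes the irq_cfg
directly and reads the destination APIC ID and vector from it, instead of
being passed pdev, irq, dest and hpet_id. The struct layouts, the function
name compose_msi_msg_sketch() and the macro values below are simplified
stand-ins chosen for illustration (assumptions, not the kernel
definitions); the x2apic and delivery-mode bits are left out.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel's struct irq_cfg and struct msi_msg. */
struct irq_cfg {
	unsigned int dest_apicid;	/* destination APIC ID chosen by the vector domain */
	uint8_t vector;			/* CPU vector assigned to this interrupt */
};

struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Illustrative constants modelled on the x86 MSI address/data layout. */
#define SKETCH_MSI_ADDR_BASE_LO		0xfee00000u
#define SKETCH_MSI_ADDR_DEST_ID(d)	(((uint32_t)(d) << 12) & 0x000ff000u)
#define SKETCH_MSI_DATA_VECTOR(v)	((uint32_t)(v) & 0xffu)

/*
 * New-style helper: everything needed to build the message comes from cfg,
 * so callers no longer pass pdev, irq, dest or hpet_id.
 */
static void compose_msi_msg_sketch(const struct irq_cfg *cfg,
				   struct msi_msg *msg)
{
	msg->address_hi = 0;
	msg->address_lo = SKETCH_MSI_ADDR_BASE_LO |
			  SKETCH_MSI_ADDR_DEST_ID(cfg->dest_apicid);
	msg->data = SKETCH_MSI_DATA_VECTOR(cfg->vector);
}
```

With dest_apicid = 3 and vector = 0x41, the sketch produces
address_lo = 0xfee03000 and data = 0x41. Because the PCI MSI, DMAR and HPET
callers all build the message from the same irq_cfg, a single two-argument
helper covers all three.
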
 arch/x86/include/asm/hpet.h     |    9 ----
 arch/x86/include/asm/x86_init.h |    4 --
 arch/x86/kernel/apic/msi.c      |   91 +++------------------------------------
 3 files changed, 6 insertions(+), 98 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index e87e9faf87a9..5fa9fb0f8809 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -85,15 +85,6 @@ extern struct irq_domain *hpet_create_irq_domain(int hpet_id);
 extern int hpet_assign_irq(struct irq_domain *domain,
 			   struct hpet_dev *dev, int dev_num);
 
-#ifdef CONFIG_PCI_MSI
-extern int default_setup_hpet_msi(unsigned int irq, unsigned int id);
-#else
-static inline int default_setup_hpet_msi(unsigned int irq, unsigned int id)
-{
-	return -EINVAL;
-}
-#endif
-
 #ifdef CONFIG_HPET_EMULATE_RTC
 
 #include <linux/interrupt.h>
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index e45e4da96bf1..9b53cb2acfbb 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -176,13 +176,9 @@ struct msi_desc;
 
 struct x86_msi_ops {
 	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
-	void (*compose_msi_msg)(struct pci_dev *dev, unsigned int irq,
-				unsigned int dest, struct msi_msg *msg,
-			       u8 hpet_id);
 	void (*teardown_msi_irq)(unsigned int irq);
 	void (*teardown_msi_irqs)(struct pci_dev *dev);
 	void (*restore_msi_irqs)(struct pci_dev *dev);
-	int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
 	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
 	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
 };
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 4f2a349ccef0..69129dbb4604 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -25,16 +25,12 @@
 
 static struct irq_domain *msi_default_domain;
 
-void native_compose_msi_msg(struct pci_dev *pdev,
-			    unsigned int irq, unsigned int dest,
-			    struct msi_msg *msg, u8 hpet_id)
+static void native_compose_msi_msg(struct irq_cfg *cfg, struct msi_msg *msg)
 {
-	struct irq_cfg *cfg = irq_cfg(irq);
-
 	msg->address_hi = MSI_ADDR_BASE_HI;
 
 	if (x2apic_enabled())
-		msg->address_hi |= MSI_ADDR_EXT_DEST_ID(dest);
+		msg->address_hi |= MSI_ADDR_EXT_DEST_ID(cfg->dest_apicid);
 
 	msg->address_lo =
 		MSI_ADDR_BASE_LO |
@@ -44,7 +40,7 @@ void native_compose_msi_msg(struct pci_dev *pdev,
 		((apic->irq_delivery_mode != dest_LowestPrio) ?
 			MSI_ADDR_REDIRECTION_CPU :
 			MSI_ADDR_REDIRECTION_LOWPRI) |
-		MSI_ADDR_DEST_ID(dest);
+		MSI_ADDR_DEST_ID(cfg->dest_apicid);
 
 	msg->data =
 		MSI_DATA_TRIGGER_EDGE |
@@ -55,31 +51,6 @@ void native_compose_msi_msg(struct pci_dev *pdev,
 		MSI_DATA_VECTOR(cfg->vector);
 }
 
-static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
-			   struct msi_msg *msg, u8 hpet_id)
-{
-	struct irq_cfg *cfg;
-	int err;
-	unsigned dest;
-
-	if (disable_apic)
-		return -ENXIO;
-
-	cfg = irq_cfg(irq);
-	err = assign_irq_vector(irq, cfg, apic->target_cpus());
-	if (err)
-		return err;
-
-	err = apic->cpu_mask_to_apicid_and(cfg->domain,
-					   apic->target_cpus(), &dest);
-	if (err)
-		return err;
-
-	x86_msi.compose_msi_msg(pdev, irq, dest, msg, hpet_id);
-
-	return 0;
-}
-
 static bool msi_remapped(struct irq_domain *domain)
 {
 	return domain->host_data != NULL;
@@ -189,8 +160,7 @@ static int msi_domain_activate(struct irq_domain *domain,
 	if (msi_remapped(domain))
 		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
 	else
-		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
-				       &msg, 0);
+		native_compose_msi_msg(cfg, &msg);
 	write_msi_msg(irq_data->irq, &msg);
 
 	return 0;
@@ -217,36 +187,6 @@ static struct irq_domain_ops msi_domain_ops = {
 	.deactivate = msi_domain_deactivate,
 };
 
-int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
-		  unsigned int irq_base, unsigned int irq_offset)
-{
-	struct irq_chip *chip = &msi_chip;
-	struct msi_msg msg;
-	unsigned int irq = irq_base + irq_offset;
-	int ret;
-
-	ret = msi_compose_msg(dev, irq, &msg, -1);
-	if (ret < 0)
-		return ret;
-
-	irq_set_msi_desc_off(irq_base, irq_offset, msidesc);
-
-	/*
-	 * MSI-X message is written per-IRQ, the offset is always 0.
-	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
-	 */
-	if (!irq_offset)
-		write_msi_msg(irq, &msg);
-
-	setup_remapped_irq(irq, irq_cfg(irq), chip);
-
-	irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "edge");
-
-	dev_dbg(&dev->dev, "irq %d for MSI/MSI-X\n", irq);
-
-	return 0;
-}
-
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	int irq, cnt, nvec_pow2;
@@ -378,7 +318,7 @@ int arch_setup_dmar_msi(unsigned int irq)
 	struct msi_msg msg;
 	struct irq_cfg *cfg = irq_cfg(irq);
 
-	native_compose_msi_msg(NULL, irq, cfg->dest_apicid, &msg, -1);
+	native_compose_msi_msg(cfg, &msg);
 	dmar_msi_write(irq, &msg);
 	irq_set_chip_and_handler_name(irq, &dmar_msi_type, handle_edge_irq,
 				      "edge");
@@ -444,24 +384,6 @@ static struct irq_chip hpet_msi_type = {
 	.irq_print_chip = irq_remapping_print_chip,
 };
 
-int default_setup_hpet_msi(unsigned int irq, unsigned int id)
-{
-	struct irq_chip *chip = &hpet_msi_type;
-	struct msi_msg msg;
-	int ret;
-
-	ret = msi_compose_msg(NULL, irq, &msg, id);
-	if (ret < 0)
-		return ret;
-
-	hpet_msi_write(irq_get_handler_data(irq), &msg);
-	irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
-	setup_remapped_irq(irq, irq_cfg(irq), chip);
-
-	irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "edge");
-	return 0;
-}
-
 static int hpet_domain_alloc(struct irq_domain *domain, unsigned int virq,
 			     unsigned int nr_irqs, void *arg)
 {
@@ -510,8 +432,7 @@ static int hpet_domain_activate(struct irq_domain *domain,
 	if (hpet_remapped(domain))
 		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
 	else
-		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
-				       &msg, hpet_dev_id(domain));
+		native_compose_msi_msg(cfg, &msg);
 	hpet_msi_write(irq_get_handler_data(irq_data->irq), &msg);
 
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* RE: [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
  2014-09-11 14:03   ` Jiang Liu
  (?)
@ 2014-09-11 14:17     ` Ni, Xun
  -1 siblings, 0 replies; 110+ messages in thread
From: Ni, Xun @ 2014-09-11 14:17 UTC (permalink / raw)
  To: Jiang Liu, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Luck, Tony, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

The wording in your commit message is unclear: "helps to make the and and architecture" ...

Thanks
Xun

-----Original Message-----
From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org] On Behalf Of Jiang Liu
Sent: Thursday, September 11, 2014 10:04 PM
To: Benjamin Herrenschmidt; Thomas Gleixner; Ingo Molnar; H. Peter Anvin; Rafael J. Wysocki; Bjorn Helgaas; Randy Dunlap; Yinghai Lu; Borislav Petkov; Grant Likely; Marc Zyngier
Cc: Jiang Liu; Konrad Rzeszutek Wilk; Andrew Morton; Luck, Tony; Joerg Roedel; Greg Kroah-Hartman; x86@kernel.org; linux-kernel@vger.kernel.org; linux-pci@vger.kernel.org; linux-acpi@vger.kernel.org; linux-arm-kernel@lists.infradead.org
Subject: [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts

Enhance MSI code to support hierarchy irqdomain, it helps to make the and and architecture more clear.


Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h        |    6 +
 arch/x86/include/asm/irq_remapping.h |    6 +-
 arch/x86/kernel/apic/msi.c           |  225 +++++++++++++++++++++++++++++-----
 arch/x86/kernel/apic/vector.c        |    2 +
 drivers/iommu/irq_remapping.c        |    1 -
 5 files changed, 204 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 57f81f5a9686..9f705c49f850 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -199,6 +199,12 @@ static inline void lock_vector_lock(void) {}
 static inline void unlock_vector_lock(void) {}
 #endif	/* CONFIG_X86_LOCAL_APIC */
 
+#ifdef	CONFIG_PCI_MSI
+extern void arch_init_msi_domain(struct irq_domain *domain);
+#else
+static inline void arch_init_msi_domain(struct irq_domain *domain) { }
+#endif
+
 /* Statistics */
 extern atomic_t irq_err_count;
 extern atomic_t irq_mis_count;
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 428b4e6d637c..440053ca7515 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -73,11 +73,7 @@ extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
  * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
  * parent irqdomain.
  */
-static inline struct irq_domain *
-arch_create_msi_irq_domain(struct irq_domain *parent)
-{
-	return NULL;
-}
+extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent);
 
 /* Get parent irqdomain for interrupt remapping irqdomain */
 static inline struct irq_domain *arch_get_ir_parent_domain(void)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 709fedab44f2..5696703271af 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -21,6 +23,8 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 
+static struct irq_domain *msi_default_domain;
+
 void native_compose_msi_msg(struct pci_dev *pdev,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id)
@@ -76,28 +80,32 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 	return 0;
 }
 
-static int
-msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
+static bool msi_remapped(struct irq_domain *domain)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	struct msi_msg msg;
-	unsigned int dest;
-	int ret;
-
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
+	return domain->host_data != NULL;
+}
 
-	__get_cached_msi_msg(data->msi_desc, &msg);
+static int msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
+			    bool force)
+{
+	struct irq_data *parent = data->parent_data;
+	int ret;
 
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to reprogram MSI registers if interrupt is remapped */
+	if (ret >= 0 && !msi_remapped(data->domain)) {
+		struct irq_cfg *cfg = irqd_cfg(data);
+		struct msi_msg msg;
 
-	__write_msi_msg(data->msi_desc, &msg);
+		__get_cached_msi_msg(data->msi_desc, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		__write_msi_msg(data->msi_desc, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 /*
@@ -108,9 +116,105 @@ static struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= msi_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_print_chip		= irq_remapping_print_chip,
+};
+
+static inline irq_hw_number_t
+get_hwirq_from_pcidev(struct pci_dev *pdev, struct msi_desc *msidesc)
+{
+	return (irq_hw_number_t)msidesc->msi_attrib.entry_nr |
+		PCI_DEVID(pdev->bus->number, pdev->devfn) << 11 |
+		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27;
+}
+
+static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs, void *arg)
+{
+	int i, ret;
+	irq_hw_number_t hwirq;
+	struct irq_alloc_info *info = arg;
+
+	hwirq = get_hwirq_from_pcidev(info->msi_dev, info->msi_desc);
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_msi_desc_off(virq, i, info->msi_desc);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &msi_chip, (void *)(long)i);
+		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
+		dev_dbg(&info->msi_dev->dev, "irq %d for MSI/MSI-X\n",
+			virq + i);
+	}
+
+	return ret;
+}
+
+static void msi_domain_free(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs)
+{
+	int i;
+	struct msi_desc *msidesc = irq_get_msi_desc(virq);
+
+	if (msidesc)
+		msidesc->irq = 0;
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_handler(virq + i, NULL);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int msi_domain_activate(struct irq_domain *domain,
+			       struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	/*
+	 * irq_data->chip_data is MSI/MSIx offset.
+	 * MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (irq_data->chip_data)
+		return 0;
+
+	if (msi_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, 0);
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static int msi_domain_deactivate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	if (irq_data->chip_data)
+		return 0;
+
+	memset(&msg, 0, sizeof(msg));
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops msi_domain_ops = {
+	.alloc = msi_domain_alloc,
+	.free = msi_domain_free,
+	.activate = msi_domain_activate,
+	.deactivate = msi_domain_deactivate,
 };
 
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
@@ -145,25 +249,56 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
+	int irq, cnt, nvec_pow2;
+	struct irq_domain *domain;
 	struct msi_desc *msidesc;
-	int irq, ret;
+	struct irq_alloc_info info;
+	int node = dev_to_node(&dev->dev);
+
+	if (disable_apic)
+		return -ENOSYS;
 
-	/* Multiple MSI vectors only supported with interrupt remapping */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
+	init_irq_alloc_info(&info, NULL);
+	info.msi_dev = dev;
+	if (type == PCI_CAP_ID_MSI) {
+		msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+		WARN_ON(!list_is_singular(&dev->msi_list));
+		WARN_ON(msidesc->irq);
+		WARN_ON(msidesc->msi_attrib.multiple);
+		WARN_ON(msidesc->nvec_used);
+		info.type = X86_IRQ_ALLOC_TYPE_MSI;
+		cnt = nvec;
+	} else {
+		info.type = X86_IRQ_ALLOC_TYPE_MSIX;
+		cnt = 1;
+	}
+
+	domain = irq_remapping_get_irq_domain(&info);
+	if (domain == NULL) {
+		/*
+		 * Multiple MSI vectors only supported with interrupt
+		 * remapping
+		 */
+		if (type == PCI_CAP_ID_MSI && nvec > 1)
+			return 1;
+		domain = msi_default_domain;
+	}
+	if (domain == NULL)
+		return -ENOSYS;
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+		info.msi_desc = msidesc;
+		irq = irq_domain_alloc_irqs(domain, -1, cnt, node, &info);
 		if (irq <= 0)
 			return -ENOSPC;
+	}
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
-		if (ret < 0) {
-			irq_domain_free_irqs(irq, 1);
-			return ret;
-		}
-
+	if (type == PCI_CAP_ID_MSI) {
+		nvec_pow2 = __roundup_pow_of_two(nvec);
+		msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
+		msidesc->nvec_used = nvec;
 	}
+
 	return 0;
 }
 
@@ -172,6 +307,36 @@ void native_teardown_msi_irq(unsigned int irq)
 	irq_domain_free_irqs(irq, 1);
 }
 
+static struct irq_domain *msi_create_domain(struct irq_domain *parent,
+					    int remapped)
+{
+	struct irq_domain *domain;
+
+	domain = irq_domain_add_tree(NULL, &msi_domain_ops,
+				     (void *)(long)remapped);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+void arch_init_msi_domain(struct irq_domain *parent)
+{
+	if (disable_apic)
+		return;
+
+	msi_default_domain = msi_create_domain(parent, 0);
+	if (!msi_default_domain)
+		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
+}
+
+#ifdef CONFIG_IRQ_REMAP
+struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent)
+{
+	return msi_create_domain(parent, 1);
+}
+#endif
+
 #ifdef CONFIG_DMAR_TABLE
 static int
 dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 774ab5ba95f2..e9329fc28c63 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -357,6 +357,8 @@ int __init arch_early_irq_init(void)
 	BUG_ON(x86_vector_domain == NULL);
 	irq_set_default_host(x86_vector_domain);
 
+	arch_init_msi_domain(x86_vector_domain);
+
 	return arch_early_ioapic_init();
 }
 
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7ac44a464be0..bda0d8e73fde 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -178,7 +178,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 }
--
1.7.10.4
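
A note on the hwirq encoding in get_hwirq_from_pcidev() above: the value
packs the MSI/MSI-X entry number into bits 0-10, the 16-bit PCI_DEVID() (bus
number and devfn) into bits 11-26, and the PCI domain number from bit 27
upward. The following is a minimal user-space sketch of that arithmetic, not
part of the patch. pci_devid() and msi_hwirq() are hypothetical stand-ins,
and uint64_t is assumed in place of irq_hw_number_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for the kernel's PCI_DEVID(): bus number in bits 15..8,
 * devfn in bits 7..0.
 */
static uint16_t pci_devid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/*
 * Hypothetical user-space equivalent of get_hwirq_from_pcidev().
 * Assumes entry_nr fits in 11 bits, so it cannot spill into the
 * devid field that starts at bit 11.
 */
static uint64_t msi_hwirq(uint32_t pci_domain, uint8_t bus, uint8_t devfn,
			  uint16_t entry_nr)
{
	assert(entry_nr < 2048);

	return (uint64_t)entry_nr |
	       ((uint64_t)pci_devid(bus, devfn) << 11) |
	       ((uint64_t)pci_domain << 27);
}
```

Under these assumptions, msi_hwirq(0, 0x03, 0x01, 5) yields 0x180805: the
devid 0x0301 shifted left by 11 is 0x180800, OR-ed with entry number 5.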

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* RE: [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
@ 2014-09-11 14:17     ` Ni, Xun
  0 siblings, 0 replies; 110+ messages in thread
From: Ni, Xun @ 2014-09-11 14:17 UTC (permalink / raw)
  To: Jiang Liu, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Luck, Tony, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

The wording in your changelog is confusing: "helps to make the and and architecture" ...

Thanks
Xun

-----Original Message-----
From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org] On Behalf Of Jiang Liu
Sent: Thursday, September 11, 2014 10:04 PM
To: Benjamin Herrenschmidt; Thomas Gleixner; Ingo Molnar; H. Peter Anvin; Rafael J. Wysocki; Bjorn Helgaas; Randy Dunlap; Yinghai Lu; Borislav Petkov; Grant Likely; Marc Zyngier
Cc: Jiang Liu; Konrad Rzeszutek Wilk; Andrew Morton; Luck, Tony; Joerg Roedel; Greg Kroah-Hartman; x86@kernel.org; linux-kernel@vger.kernel.org; linux-pci@vger.kernel.org; linux-acpi@vger.kernel.org; linux-arm-kernel@lists.infradead.org
Subject: [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts

Enhance MSI code to support hierarchy irqdomain, it helps to make the and and architecture more clear.


Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h        |    6 +
 arch/x86/include/asm/irq_remapping.h |    6 +-
 arch/x86/kernel/apic/msi.c           |  225 +++++++++++++++++++++++++++++-----
 arch/x86/kernel/apic/vector.c        |    2 +
 drivers/iommu/irq_remapping.c        |    1 -
 5 files changed, 204 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 57f81f5a9686..9f705c49f850 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -199,6 +199,12 @@ static inline void lock_vector_lock(void) {}
 static inline void unlock_vector_lock(void) {}
 #endif	/* CONFIG_X86_LOCAL_APIC */
 
+#ifdef	CONFIG_PCI_MSI
+extern void arch_init_msi_domain(struct irq_domain *domain);
+#else
+static inline void arch_init_msi_domain(struct irq_domain *domain) { }
+#endif
+
 /* Statistics */
 extern atomic_t irq_err_count;
 extern atomic_t irq_mis_count;
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 428b4e6d637c..440053ca7515 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -73,11 +73,7 @@ extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
  * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
  * parent irqdomain.
  */
-static inline struct irq_domain *
-arch_create_msi_irq_domain(struct irq_domain *parent)
-{
-	return NULL;
-}
+extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent);
 
 /* Get parent irqdomain for interrupt remapping irqdomain */
 static inline struct irq_domain *arch_get_ir_parent_domain(void)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 709fedab44f2..5696703271af 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -21,6 +23,8 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 
+static struct irq_domain *msi_default_domain;
+
 void native_compose_msi_msg(struct pci_dev *pdev,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id)
@@ -76,28 +80,32 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 	return 0;
 }
 
-static int
-msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
+static bool msi_remapped(struct irq_domain *domain)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	struct msi_msg msg;
-	unsigned int dest;
-	int ret;
-
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
+	return domain->host_data != NULL;
+}
 
-	__get_cached_msi_msg(data->msi_desc, &msg);
+static int msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
+			    bool force)
+{
+	struct irq_data *parent = data->parent_data;
+	int ret;
 
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to reprogram MSI registers if interrupt is remapped */
+	if (ret >= 0 && !msi_remapped(data->domain)) {
+		struct irq_cfg *cfg = irqd_cfg(data);
+		struct msi_msg msg;
 
-	__write_msi_msg(data->msi_desc, &msg);
+		__get_cached_msi_msg(data->msi_desc, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		__write_msi_msg(data->msi_desc, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 /*
@@ -108,9 +116,105 @@ static struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= msi_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_print_chip		= irq_remapping_print_chip,
+};
+
+static inline irq_hw_number_t
+get_hwirq_from_pcidev(struct pci_dev *pdev, struct msi_desc *msidesc)
+{
+	return (irq_hw_number_t)msidesc->msi_attrib.entry_nr |
+		PCI_DEVID(pdev->bus->number, pdev->devfn) << 11 |
+		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27;
+}
+
+static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs, void *arg)
+{
+	int i, ret;
+	irq_hw_number_t hwirq;
+	struct irq_alloc_info *info = arg;
+
+	hwirq = get_hwirq_from_pcidev(info->msi_dev, info->msi_desc);
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_msi_desc_off(virq, i, info->msi_desc);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &msi_chip, (void *)(long)i);
+		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
+		dev_dbg(&info->msi_dev->dev, "irq %d for MSI/MSI-X\n",
+			virq + i);
+	}
+
+	return ret;
+}
+
+static void msi_domain_free(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs)
+{
+	int i;
+	struct msi_desc *msidesc = irq_get_msi_desc(virq);
+
+	if (msidesc)
+		msidesc->irq = 0;
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_handler(virq + i, NULL);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int msi_domain_activate(struct irq_domain *domain,
+			       struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	/*
+	 * irq_data->chip_data is MSI/MSIx offset.
+	 * MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (irq_data->chip_data)
+		return 0;
+
+	if (msi_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, 0);
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static int msi_domain_deactivate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	if (irq_data->chip_data)
+		return 0;
+
+	memset(&msg, 0, sizeof(msg));
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops msi_domain_ops = {
+	.alloc = msi_domain_alloc,
+	.free = msi_domain_free,
+	.activate = msi_domain_activate,
+	.deactivate = msi_domain_deactivate,
 };
 
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
@@ -145,25 +249,56 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
+	int irq, cnt, nvec_pow2;
+	struct irq_domain *domain;
 	struct msi_desc *msidesc;
-	int irq, ret;
+	struct irq_alloc_info info;
+	int node = dev_to_node(&dev->dev);
+
+	if (disable_apic)
+		return -ENOSYS;
 
-	/* Multiple MSI vectors only supported with interrupt remapping */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
+	init_irq_alloc_info(&info, NULL);
+	info.msi_dev = dev;
+	if (type == PCI_CAP_ID_MSI) {
+		msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+		WARN_ON(!list_is_singular(&dev->msi_list));
+		WARN_ON(msidesc->irq);
+		WARN_ON(msidesc->msi_attrib.multiple);
+		WARN_ON(msidesc->nvec_used);
+		info.type = X86_IRQ_ALLOC_TYPE_MSI;
+		cnt = nvec;
+	} else {
+		info.type = X86_IRQ_ALLOC_TYPE_MSIX;
+		cnt = 1;
+	}
+
+	domain = irq_remapping_get_irq_domain(&info);
+	if (domain == NULL) {
+		/*
+		 * Multiple MSI vectors only supported with interrupt
+		 * remapping
+		 */
+		if (type == PCI_CAP_ID_MSI && nvec > 1)
+			return 1;
+		domain = msi_default_domain;
+	}
+	if (domain == NULL)
+		return -ENOSYS;
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+		info.msi_desc = msidesc;
+		irq = irq_domain_alloc_irqs(domain, -1, cnt, node, &info);
 		if (irq <= 0)
 			return -ENOSPC;
+	}
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
-		if (ret < 0) {
-			irq_domain_free_irqs(irq, 1);
-			return ret;
-		}
-
+	if (type == PCI_CAP_ID_MSI) {
+		nvec_pow2 = __roundup_pow_of_two(nvec);
+		msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
+		msidesc->nvec_used = nvec;
 	}
+
 	return 0;
 }
 
@@ -172,6 +307,36 @@ void native_teardown_msi_irq(unsigned int irq)
 	irq_domain_free_irqs(irq, 1);
 }
 
+static struct irq_domain *msi_create_domain(struct irq_domain *parent,
+					    int remapped)
+{
+	struct irq_domain *domain;
+
+	domain = irq_domain_add_tree(NULL, &msi_domain_ops,
+				     (void *)(long)remapped);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+void arch_init_msi_domain(struct irq_domain *parent)
+{
+	if (disable_apic)
+		return;
+
+	msi_default_domain = msi_create_domain(parent, 0);
+	if (!msi_default_domain)
+		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
+}
+
+#ifdef CONFIG_IRQ_REMAP
+struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent)
+{
+	return msi_create_domain(parent, 1);
+}
+#endif
+
 #ifdef CONFIG_DMAR_TABLE
 static int
 dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 774ab5ba95f2..e9329fc28c63 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -357,6 +357,8 @@ int __init arch_early_irq_init(void)
 	BUG_ON(x86_vector_domain == NULL);
 	irq_set_default_host(x86_vector_domain);
 
+	arch_init_msi_domain(x86_vector_domain);
+
 	return arch_early_ioapic_init();
 }
 
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7ac44a464be0..bda0d8e73fde 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -178,7 +178,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 }
--
1.7.10.4
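
A note on the last hunk of native_setup_msi_irqs() above: for plain MSI it
stores log2 of the vector count, rounded up to a power of two, in
msi_attrib.multiple. The following is a small user-space sketch of that
arithmetic, not part of the patch. roundup_pow2(), ilog2_floor() and
msi_multiple() are hypothetical stand-ins for __roundup_pow_of_two() and
ilog2(), and the 1..32 range check is an assumption based on the MSI limit of
32 vectors per function.

```c
#include <assert.h>

/* Smallest power of two >= n (stand-in for __roundup_pow_of_two()). */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	/* Assumption: MSI allows 1..32 vectors per function. */
	assert(n >= 1 && n <= 32);
	while (p < n)
		p <<= 1;
	return p;
}

/* floor(log2(n)) for n >= 1 (stand-in for ilog2()). */
static unsigned int ilog2_floor(unsigned int n)
{
	unsigned int l = 0;

	while (n >>= 1)
		l++;
	return l;
}

/* Value the patch would store in msi_attrib.multiple for nvec vectors. */
static unsigned int msi_multiple(unsigned int nvec)
{
	return ilog2_floor(roundup_pow2(nvec));
}
```

For example, nvec = 3 rounds up to 4, so the stored value is 2, while
nvec_used keeps the requested count of 3. nvec = 1 gives 0.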



* [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
@ 2014-09-11 14:17     ` Ni, Xun
  0 siblings, 0 replies; 110+ messages in thread
From: Ni, Xun @ 2014-09-11 14:17 UTC (permalink / raw)
  To: linux-arm-kernel

It has mis-understandings in your word" helps to make the and and architecture" ...

Thanks
Xun

-----Original Message-----
From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner at vger.kernel.org] On Behalf Of Jiang Liu
Sent: Thursday, September 11, 2014 10:04 PM
To: Benjamin Herrenschmidt; Thomas Gleixner; Ingo Molnar; H. Peter Anvin; Rafael J. Wysocki; Bjorn Helgaas; Randy Dunlap; Yinghai Lu; Borislav Petkov; Grant Likely; Marc Zyngier
Cc: Jiang Liu; Konrad Rzeszutek Wilk; Andrew Morton; Luck, Tony; Joerg Roedel; Greg Kroah-Hartman; x86 at kernel.org; linux-kernel at vger.kernel.org; linux-pci at vger.kernel.org; linux-acpi at vger.kernel.org; linux-arm-kernel at lists.infradead.org
Subject: [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts

Enhance MSI code to support hierarchy irqdomain, it helps to make the and and architecture more clear.


Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h        |    6 +
 arch/x86/include/asm/irq_remapping.h |    6 +-
 arch/x86/kernel/apic/msi.c           |  225 +++++++++++++++++++++++++++++-----
 arch/x86/kernel/apic/vector.c        |    2 +
 drivers/iommu/irq_remapping.c        |    1 -
 5 files changed, 204 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h index 57f81f5a9686..9f705c49f850 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -199,6 +199,12 @@ static inline void lock_vector_lock(void) {}  static inline void unlock_vector_lock(void) {}
 #endif	/* CONFIG_X86_LOCAL_APIC */
 
+#ifdef	CONFIG_PCI_MSI
+extern void arch_init_msi_domain(struct irq_domain *domain); #else 
+static inline void arch_init_msi_domain(struct irq_domain *domain) { } 
+#endif
+
 /* Statistics */
 extern atomic_t irq_err_count;
 extern atomic_t irq_mis_count;
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 428b4e6d637c..440053ca7515 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -73,11 +73,7 @@ extern void irq_remapping_print_chip(struct irq_data *data, struct seq_file *p);
  * Create MSI/MSIx irqdomain for interrupt remapping device, use @parent as
  * parent irqdomain.
  */
-static inline struct irq_domain *
-arch_create_msi_irq_domain(struct irq_domain *parent) -{
-	return NULL;
-}
+extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain 
+*parent);
 
 /* Get parent irqdomain for interrupt remapping irqdomain */  static inline struct irq_domain *arch_get_ir_parent_domain(void) diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c index 709fedab44f2..5696703271af 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -3,6 +3,8 @@
  *
  * Copyright (C) 1997, 1998, 1999, 2000, 2009 Ingo Molnar, Hajnalka Szabo
  *	Moved from arch/x86/kernel/apic/io_apic.c.
+ * Jiang Liu <jiang.liu@linux.intel.com>
+ *	Add support of hierarchy irqdomain
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as @@ -21,6 +23,8 @@  #include <asm/apic.h>  #include <asm/irq_remapping.h>
 
+static struct irq_domain *msi_default_domain;
+
 void native_compose_msi_msg(struct pci_dev *pdev,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id) @@ -76,28 +80,32 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 	return 0;
 }
 
-static int
-msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
+static bool msi_remapped(struct irq_domain *domain)
 {
-	struct irq_cfg *cfg = irqd_cfg(data);
-	struct msi_msg msg;
-	unsigned int dest;
-	int ret;
-
-	ret = apic_set_affinity(data, mask, &dest);
-	if (ret)
-		return ret;
+	return domain->host_data != NULL;
+}
 
-	__get_cached_msi_msg(data->msi_desc, &msg);
+static int msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
+			    bool force)
+{
+	struct irq_data *parent = data->parent_data;
+	int ret;
 
-	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
-	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
-	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	ret = parent->chip->irq_set_affinity(parent, mask, force);
+	/* No need to reprogram MSI registers if interrupt is remapped */
+	if (ret >= 0 && !msi_remapped(data->domain)) {
+		struct irq_cfg *cfg = irqd_cfg(data);
+		struct msi_msg msg;
 
-	__write_msi_msg(data->msi_desc, &msg);
+		__get_cached_msi_msg(data->msi_desc, &msg);
+		msg.data &= ~MSI_DATA_VECTOR_MASK;
+		msg.data |= MSI_DATA_VECTOR(cfg->vector);
+		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
+		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
+		__write_msi_msg(data->msi_desc, &msg);
+	}
 
-	return IRQ_SET_MASK_OK_NOCOPY;
+	return ret;
 }
 
 /*
@@ -108,9 +116,105 @@ static struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
-	.irq_ack		= apic_ack_edge,
+	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= msi_set_affinity,
-	.irq_retrigger		= apic_retrigger_irq,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_print_chip		= irq_remapping_print_chip,
+};
+
+static inline irq_hw_number_t
+get_hwirq_from_pcidev(struct pci_dev *pdev, struct msi_desc *msidesc) {
+	return (irq_hw_number_t)msidesc->msi_attrib.entry_nr |
+		PCI_DEVID(pdev->bus->number, pdev->devfn) << 11 |
+		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27; }
+
+static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs, void *arg) {
+	int i, ret;
+	irq_hw_number_t hwirq;
+	struct irq_alloc_info *info = arg;
+
+	hwirq = get_hwirq_from_pcidev(info->msi_dev, info->msi_desc);
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_msi_desc_off(virq, i, info->msi_desc);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &msi_chip, (void *)(long)i);
+		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
+		dev_dbg(&info->msi_dev->dev, "irq %d for MSI/MSI-X\n",
+			virq + i);
+	}
+
+	return ret;
+}
+
+static void msi_domain_free(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs)
+{
+	int i;
+	struct msi_desc *msidesc = irq_get_msi_desc(virq);
+
+	if (msidesc)
+		msidesc->irq = 0;
+	for (i = 0; i < nr_irqs; i++) {
+		irq_set_handler(virq + i, NULL);
+		irq_domain_set_hwirq_and_chip(domain, virq + i, 0, NULL, NULL);
+	}
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs); }
+
+static int msi_domain_activate(struct irq_domain *domain,
+			       struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
+
+	/*
+	 * irq_data->chip_data is MSI/MSIx offset.
+	 * MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (irq_data->chip_data)
+		return 0;
+
+	if (msi_remapped(domain))
+		irq_remapping_get_msi_entry(irq_data->parent_data, &msg);
+	else
+		native_compose_msi_msg(NULL, irq_data->irq, cfg->dest_apicid,
+				       &msg, 0);
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static int msi_domain_deactivate(struct irq_domain *domain,
+				 struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	if (irq_data->chip_data)
+		return 0;
+
+	memset(&msg, 0, sizeof(msg));
+	write_msi_msg(irq_data->irq, &msg);
+
+	return 0;
+}
+
+static struct irq_domain_ops msi_domain_ops = {
+	.alloc = msi_domain_alloc,
+	.free = msi_domain_free,
+	.activate = msi_domain_activate,
+	.deactivate = msi_domain_deactivate,
 };
 
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, @@ -145,25 +249,56 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)  {
+	int irq, cnt, nvec_pow2;
+	struct irq_domain *domain;
 	struct msi_desc *msidesc;
-	int irq, ret;
+	struct irq_alloc_info info;
+	int node = dev_to_node(&dev->dev);
+
+	if (disable_apic)
+		return -ENOSYS;
 
-	/* Multiple MSI vectors only supported with interrupt remapping */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
+	init_irq_alloc_info(&info, NULL);
+	info.msi_dev = dev;
+	if (type == PCI_CAP_ID_MSI) {
+		msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+		WARN_ON(!list_is_singular(&dev->msi_list));
+		WARN_ON(msidesc->irq);
+		WARN_ON(msidesc->msi_attrib.multiple);
+		WARN_ON(msidesc->nvec_used);
+		info.type = X86_IRQ_ALLOC_TYPE_MSI;
+		cnt = nvec;
+	} else {
+		info.type = X86_IRQ_ALLOC_TYPE_MSIX;
+		cnt = 1;
+	}
+
+	domain = irq_remapping_get_irq_domain(&info);
+	if (domain == NULL) {
+		/*
+		 * Multiple MSI vectors only supported with interrupt
+		 * remapping
+		 */
+		if (type == PCI_CAP_ID_MSI && nvec > 1)
+			return 1;
+		domain = msi_default_domain;
+	}
+	if (domain == NULL)
+		return -ENOSYS;
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_domain_alloc_irqs(NULL, -1, 1, NUMA_NO_NODE, NULL);
+		info.msi_desc = msidesc;
+		irq = irq_domain_alloc_irqs(domain, -1, cnt, node, &info);
 		if (irq <= 0)
 			return -ENOSPC;
+	}
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
-		if (ret < 0) {
-			irq_domain_free_irqs(irq, 1);
-			return ret;
-		}
-
+	if (type == PCI_CAP_ID_MSI) {
+		nvec_pow2 = __roundup_pow_of_two(nvec);
+		msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
+		msidesc->nvec_used = nvec;
 	}
+
 	return 0;
 }
 
@@ -172,6 +307,36 @@ void native_teardown_msi_irq(unsigned int irq)
 	irq_domain_free_irqs(irq, 1);
 }
 
+static struct irq_domain *msi_create_domain(struct irq_domain *parent,
+					    int remapped)
+{
+	struct irq_domain *domain;
+
+	domain = irq_domain_add_tree(NULL, &msi_domain_ops,
+				     (void *)(long)remapped);
+	if (domain)
+		domain->parent = parent;
+
+	return domain;
+}
+
+void arch_init_msi_domain(struct irq_domain *parent) {
+	if (disable_apic)
+		return;
+
+	msi_default_domain = msi_create_domain(parent, 0);
+	if (!msi_default_domain)
+		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n"); }
+
+#ifdef CONFIG_IRQ_REMAP
+struct irq_domain *arch_create_msi_irq_domain(struct irq_domain 
+*parent) {
+	return msi_create_domain(parent, 1);
+}
+#endif
+
 #ifdef CONFIG_DMAR_TABLE
 static int
 dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask, diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 774ab5ba95f2..e9329fc28c63 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -357,6 +357,8 @@ int __init arch_early_irq_init(void)
 	BUG_ON(x86_vector_domain == NULL);
 	irq_set_default_host(x86_vector_domain);
 
+	arch_init_msi_domain(x86_vector_domain);
+
 	return arch_early_ioapic_init();
 }
 
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c index 7ac44a464be0..bda0d8e73fde 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -178,7 +178,6 @@ static void __init irq_remapping_modify_x86_ops(void)
 	x86_io_apic_ops.set_affinity	= set_remapped_irq_affinity;
 	x86_io_apic_ops.setup_entry	= setup_ioapic_remapped_entry;
 	x86_io_apic_ops.eoi_ioapic_pin	= eoi_ioapic_pin_remapped;
-	x86_msi.setup_msi_irqs		= irq_remapping_setup_msi_irqs;
 	x86_msi.setup_hpet_msi		= setup_hpet_msi_remapped;
 	x86_msi.compose_msi_msg		= compose_remapped_msi_msg;
 }
--
1.7.10.4

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
  2014-09-11 14:17     ` Ni, Xun
  (?)
@ 2014-09-11 14:29       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-11 14:29 UTC (permalink / raw)
  To: Ni, Xun, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Luck, Tony, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

On 2014/9/11 22:17, Ni, Xun wrote:
> It has mis-understandings in your word" helps to make the and and architecture" ...
Hi Xun,
	Thanks, will fix it in next version.
Regards!
Gerry
> 
> Thanks
> Xun
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-11 14:03   ` Jiang Liu
@ 2014-09-16 17:43     ` Thomas Gleixner
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Gleixner @ 2014-09-16 17:43 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

Jiang,

On Thu, 11 Sep 2014, Jiang Liu wrote:
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>  	/* Create mapping */
> -	virq = irq_create_mapping(domain, hwirq);
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	if (domain->ops->alloc)
> +		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
> +					     irq_data);
> +	else
> +#endif
> +		virq = irq_create_mapping(domain, hwirq);

I'd prefer to get rid of the #ifdef CONFIG...s in the code. So this
can be written:

	if (irq_domain_has_hierarchy(domain))
		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
					     irq_data);
	else
		virq = irq_create_mapping(domain, hwirq);

>  	if (!virq)
>  		return virq;
>  
> @@ -540,7 +542,11 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
>  		return 0;
>  
>  	if (hwirq < domain->revmap_direct_max_irq) {
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +		data = irq_domain_get_irq_data(domain, hwirq);
> +#else
>  		data = irq_get_irq_data(hwirq);
> +#endif

Similar here. Make irq_domain_get_irq_data() map to irq_get_irq_data() in
non-hierarchy mode so you end up with a single line:

-  		data = irq_get_irq_data(hwirq);
+		data = irq_domain_get_irq_data(domain, hwirq);
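
As a rough illustration of the pattern suggested above, here is a user-space
mock, not kernel code. The struct layouts, DEMO_IRQ_DOMAIN_HIERARCHY and the
demo_* functions are made up for the example; only the idea of a helper with
a stub for the non-hierarchy build comes from the review comment. The same
approach would apply to irq_domain_get_irq_data().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct irq_domain;

struct irq_domain_ops {
	int (*alloc)(struct irq_domain *domain, unsigned int hwirq);
};

struct irq_domain {
	const struct irq_domain_ops *ops;
};

/* Defined here to mimic CONFIG_IRQ_DOMAIN_HIERARCHY=y (assumption). */
#define DEMO_IRQ_DOMAIN_HIERARCHY 1

#ifdef DEMO_IRQ_DOMAIN_HIERARCHY
static inline bool irq_domain_has_hierarchy(const struct irq_domain *domain)
{
	return domain->ops && domain->ops->alloc;
}
#else
/* Stub: callers compile unchanged and the hierarchy branch is dead code. */
static inline bool irq_domain_has_hierarchy(const struct irq_domain *domain)
{
	(void)domain;
	return false;
}
#endif

/* Demo-only allocator standing in for the hierarchical allocation path. */
static int demo_alloc(struct irq_domain *domain, unsigned int hwirq)
{
	(void)domain;
	return 100 + (int)hwirq;
}

/* The call site needs no #ifdef at all. */
static int demo_create_mapping(struct irq_domain *domain, unsigned int hwirq)
{
	if (irq_domain_has_hierarchy(domain))
		return domain->ops->alloc(domain, hwirq);

	return (int)hwirq + 1;	/* stand-in for the legacy mapping path */
}
```

With the stub variant selected, the compiler can drop the hierarchy branch
entirely, so the single call site costs nothing in the non-hierarchy build.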


> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +/**
> + * irq_domain_alloc_irqs - Allocate IRQs from domain
> + * @domain: domain to allocate from
> + * @irq_base: allocate specified IRQ nubmer if irq_base >= 0
> + * @nr_irqs: number of IRQs to allocate
> + * @node: NUMA node id for memory allocation
> + * @arg: domain specific argument
> + * @realloc: IRQ descriptors have already been allocated if true
> + *
> + * Allocate IRQ numbers and initialized all data structures to support
> + * hiearchy IRQ domains.
> + * Parameter @realloc is mainly to support legacy IRQs.

What's the issue with the legacy irqs? So this has the interrupt
descriptors allocated already. Are they already wired up for serving
interrupts and what's the state of those lines?

> + * Returns error code or allocated IRQ number

Can you please add some documentation on how the hierarchical allocation
is supposed to work and how the domains are connected? That should
probably go into Documentation/IRQ-domains.txt.

Other than that this looks pretty good! Nice work!

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 02/21] genirq: Introduce helper functions to support stacked irq_chip
  2014-09-11 14:03   ` Jiang Liu
@ 2014-09-16 17:45     ` Thomas Gleixner
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Gleixner @ 2014-09-16 17:45 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

On Thu, 11 Sep 2014, Jiang Liu wrote:
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +void irq_chip_ack_parent(struct irq_data *data)
> +{
> +	data = data->parent_data;
> +	if (data && data->chip && data->chip->irq_ack)
> +		data->chip->irq_ack(data);

Why is this restricted to a single parent level and does not go down
the whole stack?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 03/21] x86, irq: Save destination CPU ID in irq_cfg
  2014-09-11 14:03   ` Jiang Liu
@ 2014-09-16 17:47     ` Thomas Gleixner
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Gleixner @ 2014-09-16 17:47 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel



On Thu, 11 Sep 2014, Jiang Liu wrote:

> Cache destination CPU APIC ID into struct irq_cfg when assigning vector
> for interrupt. Upper layer just needs to read the cached APIC ID instead
> of calling apic->cpu_mask_to_apicid_and(), it helps to hide APIC driver
> details from IOAPIC/HPET/MSI drivers..
> 
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> ---
>  arch/x86/include/asm/hw_irq.h |    1 +
>  arch/x86/kernel/apic/vector.c |    4 ++++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
> index 7624fffc2822..3d51d74d6c01 100644
> --- a/arch/x86/include/asm/hw_irq.h
> +++ b/arch/x86/include/asm/hw_irq.h
> @@ -116,6 +116,7 @@ struct irq_data;
>  struct irq_cfg {
>  	cpumask_var_t		domain;
>  	cpumask_var_t		old_domain;
> +	unsigned int		dest_apicid;
>  	u8			vector;
>  	u8			move_in_progress : 1;
>  #ifdef CONFIG_IRQ_REMAP
> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index 7562cb15b3bd..287ae4e8d500 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -188,6 +188,10 @@ next:
>  	}
>  	free_cpumask_var(tmp_mask);

This lacks a comment explaining what the call is actually doing.
  
> +	if (!err)
> +		err = apic->cpu_mask_to_apicid_and(mask, cfg->domain,
> +						   &cfg->dest_apicid);
> +

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 14/21] x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
  2014-09-11 14:03   ` Jiang Liu
@ 2014-09-16 18:31     ` Thomas Gleixner
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Gleixner @ 2014-09-16 18:31 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

On Thu, 11 Sep 2014, Jiang Liu wrote:
>  #ifdef CONFIG_HPET_TIMER
> +#define	HPET_DOMAIN_REMAPPED		0x80000000
> +
> +static inline int hpet_dev_id(struct irq_domain *domain)
> +{
> +	return (int)((long)domain->host_data & ~HPET_DOMAIN_REMAPPED);
> +}
> +
> +static inline bool hpet_remapped(struct irq_domain *domain)
> +{
> +	return (bool)((long)domain->host_data & HPET_DOMAIN_REMAPPED);
> +}

It's kinda odd to have this encoded in domain->host_data.

>  static int hpet_msi_set_affinity(struct irq_data *data,
>  				 const struct cpumask *mask, bool force)
>  {
> +	struct irq_data *parent = data->parent_data;
>  	struct irq_cfg *cfg = irqd_cfg(data);
>  	struct msi_msg msg;
> -	unsigned int dest;
>  	int ret;
>  
> -	ret = apic_set_affinity(data, mask, &dest);
> -	if (ret)
> -		return ret;
> -
> -	hpet_msi_read(data->handler_data, &msg);
> -
> -	msg.data &= ~MSI_DATA_VECTOR_MASK;
> -	msg.data |= MSI_DATA_VECTOR(cfg->vector);
> -	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
> -	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
> -
> -	hpet_msi_write(data->handler_data, &msg);
> +	ret = parent->chip->irq_set_affinity(parent, mask, force);
> +	/* No need to rewrite HPET registers if interrupt is remapped */
> +	if (ret >= 0 && !hpet_remapped(data->domain)) {

So we really should use irq_data->chip_data for this, i.e. storing

struct hpet_msi {
       struct msi_msg msg;
       bool remapped;
       /* whatever you need here */
};
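
A minimal sketch of what reading the flag from chip_data could look like,
written as a user-space mock. The struct definitions are simplified
stand-ins and hpet_irq_remapped() is a hypothetical accessor name, not
something from the patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; the real kernel structures carry more fields. */
struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct hpet_msi {
	struct msi_msg msg;
	bool remapped;
};

struct irq_data {
	void *chip_data;
};

/* Hypothetical accessor: the flag lives in per-interrupt chip_data. */
static inline bool hpet_irq_remapped(const struct irq_data *data)
{
	const struct hpet_msi *hm = data->chip_data;

	return hm != NULL && hm->remapped;
}
```

Compared with packing a bit into domain->host_data, this needs no casts
through long and leaves room for further per-interrupt state.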

> +		hpet_msi_read(data->handler_data, &msg);
> +		msg.data &= ~MSI_DATA_VECTOR_MASK;
> +		msg.data |= MSI_DATA_VECTOR(cfg->vector);
> +		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
> +		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);

We need the same thing for MSI so this should be a helper function

   msi_update_msg(struct msi_msg *msg, struct irq_cfg *cfg)
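
Such a helper might look roughly like the sketch below. It is a user-space
mock: the macro values are illustrative (vector in data bits 7:0,
destination ID in address bits 19:12) and the two structs are cut down to
the fields used here, so treat all of them as assumptions rather than the
kernel's definitions.

```c
#include <stdint.h>

/* Illustrative constants, assumed for this example only. */
#define MSI_DATA_VECTOR_MASK	0x000000ffu
#define MSI_DATA_VECTOR(v)	((uint32_t)(v) & MSI_DATA_VECTOR_MASK)
#define MSI_ADDR_DEST_ID_SHIFT	12
#define MSI_ADDR_DEST_ID_MASK	0x000ff000u
#define MSI_ADDR_DEST_ID(dest)	\
	(((uint32_t)(dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)

/* Cut-down stand-ins for the kernel structures. */
struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct irq_cfg {
	unsigned int dest_apicid;
	uint8_t vector;
};

/* Rewrite only the vector and destination ID, keeping all other bits. */
static void msi_update_msg(struct msi_msg *msg, struct irq_cfg *cfg)
{
	msg->data &= ~MSI_DATA_VECTOR_MASK;
	msg->data |= MSI_DATA_VECTOR(cfg->vector);
	msg->address_lo &= ~MSI_ADDR_DEST_ID_MASK;
	msg->address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
}
```

Both the HPET and the PCI MSI set_affinity paths could then call the helper
between their own read and write of the message registers.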

> +		hpet_msi_write(data->handler_data, &msg);
> +	}
>  
> -	return IRQ_SET_MASK_OK_NOCOPY;
> +	return ret;
>  }
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 03/21] x86, irq: Save destination CPU ID in irq_cfg
  2014-09-16 17:47     ` Thomas Gleixner
@ 2014-09-17  2:24       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-17  2:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel



On 2014/9/17 1:47, Thomas Gleixner wrote:
> 
> 
> On Thu, 11 Sep 2014, Jiang Liu wrote:
> 
>> Cache destination CPU APIC ID into struct irq_cfg when assigning vector
>> for interrupt. Upper layer just needs to read the cached APIC ID instead
>> of calling apic->cpu_mask_to_apicid_and(), it helps to hide APIC driver
>> details from IOAPIC/HPET/MSI drivers..
>>
>> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
>> ---
>>  arch/x86/include/asm/hw_irq.h |    1 +
>>  arch/x86/kernel/apic/vector.c |    4 ++++
>>  2 files changed, 5 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
>> index 7624fffc2822..3d51d74d6c01 100644
>> --- a/arch/x86/include/asm/hw_irq.h
>> +++ b/arch/x86/include/asm/hw_irq.h
>> @@ -116,6 +116,7 @@ struct irq_data;
>>  struct irq_cfg {
>>  	cpumask_var_t		domain;
>>  	cpumask_var_t		old_domain;
>> +	unsigned int		dest_apicid;
>>  	u8			vector;
>>  	u8			move_in_progress : 1;
>>  #ifdef CONFIG_IRQ_REMAP
>> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
>> index 7562cb15b3bd..287ae4e8d500 100644
>> --- a/arch/x86/kernel/apic/vector.c
>> +++ b/arch/x86/kernel/apic/vector.c
>> @@ -188,6 +188,10 @@ next:
>>  	}
>>  	free_cpumask_var(tmp_mask);
> 
> Lacks a comment what this call is actually doing.
How about this?
/* cache destination APIC IDs into cfg->dest_apicid */
Regards!
Gerry
>   
>> +	if (!err)
>> +		err = apic->cpu_mask_to_apicid_and(mask, cfg->domain,
>> +						   &cfg->dest_apicid);
>> +
> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 02/21] genirq: Introduce helper functions to support stacked irq_chip
  2014-09-16 17:45     ` Thomas Gleixner
@ 2014-09-17  3:07       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-17  3:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel



On 2014/9/17 1:45, Thomas Gleixner wrote:
> On Thu, 11 Sep 2014, Jiang Liu wrote:
>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>> +void irq_chip_ack_parent(struct irq_data *data)
>> +{
>> +	data = data->parent_data;
>> +	if (data && data->chip && data->chip->irq_ack)
>> +		data->chip->irq_ack(data);
> 
> Why is this restricted to a single parent level and does not go down
> the whole stack?
Hi Thomas,
	It happens to work on x86, and we wanted a small performance
advantage from not walking down the whole stack.
If preferred, I will change it to walk the whole stack.
Regards!
Gerry

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 14/21] x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
  2014-09-16 18:31     ` Thomas Gleixner
@ 2014-09-17  5:16       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-17  5:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel



On 2014/9/17 2:31, Thomas Gleixner wrote:
> On Thu, 11 Sep 2014, Jiang Liu wrote:
>>  #ifdef CONFIG_HPET_TIMER
>> +#define	HPET_DOMAIN_REMAPPED		0x80000000
>> +
>> +static inline int hpet_dev_id(struct irq_domain *domain)
>> +{
>> +	return (int)((long)domain->host_data & ~HPET_DOMAIN_REMAPPED);
>> +}
>> +
>> +static inline bool hpet_remapped(struct irq_domain *domain)
>> +{
>> +	return (bool)((long)domain->host_data & HPET_DOMAIN_REMAPPED);
>> +}
> 
> It's kinda odd to have this encoded in domain->host_data.
Hi Thomas,
	I have thought about adding a "domain_flags" field to struct
irq_domain to hold the remapping flag. But the remapping flag is not
common to all architectures, so I adopted the dirty solution of
hiding the remapped flag in x86 arch-specific code.
How about adding a domain_flags field and defining IRQ_DOMAIN_FLAG_ARCH1?

> 
>>  static int hpet_msi_set_affinity(struct irq_data *data,
>>  				 const struct cpumask *mask, bool force)
>>  {
>> +	struct irq_data *parent = data->parent_data;
>>  	struct irq_cfg *cfg = irqd_cfg(data);
>>  	struct msi_msg msg;
>> -	unsigned int dest;
>>  	int ret;
>>  
>> -	ret = apic_set_affinity(data, mask, &dest);
>> -	if (ret)
>> -		return ret;
>> -
>> -	hpet_msi_read(data->handler_data, &msg);
>> -
>> -	msg.data &= ~MSI_DATA_VECTOR_MASK;
>> -	msg.data |= MSI_DATA_VECTOR(cfg->vector);
>> -	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
>> -	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
>> -
>> -	hpet_msi_write(data->handler_data, &msg);
>> +	ret = parent->chip->irq_set_affinity(parent, mask, force);
>> +	/* No need to rewrite HPET registers if interrupt is remapped */
>> +	if (ret >= 0 && !hpet_remapped(data->domain)) {
> 
> So we really should use irq_data->chip_data for this, i.e. storing
> 
> struct hpet_msi {
>        struct msi_msg msg;
>        bool remapped;
>        /* whatever you need here */
> };
OK, we will add this flag for HPET, MSI and IOAPIC.

> 
>> +		hpet_msi_read(data->handler_data, &msg);
>> +		msg.data &= ~MSI_DATA_VECTOR_MASK;
>> +		msg.data |= MSI_DATA_VECTOR(cfg->vector);
>> +		msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
>> +		msg.address_lo |= MSI_ADDR_DEST_ID(cfg->dest_apicid);
> 
> We need the same thing for MSI so this should be a helper function
> 
>    msi_update_msg(struct msi_msg *msg, struct irq_cfg *cfg)
Good suggestion, will do it in the next version.
Regards!
Gerry

> 
>> +		hpet_msi_write(data->handler_data, &msg);
>> +	}
>>  
>> -	return IRQ_SET_MASK_OK_NOCOPY;
>> +	return ret;
>>  }
>  
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 02/21] genirq: Introduce helper functions to support stacked irq_chip
  2014-09-17  3:07       ` Jiang Liu
@ 2014-09-17 20:58         ` Thomas Gleixner
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Gleixner @ 2014-09-17 20:58 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

On Wed, 17 Sep 2014, Jiang Liu wrote:
> On 2014/9/17 1:45, Thomas Gleixner wrote:
> > On Thu, 11 Sep 2014, Jiang Liu wrote:
> >> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> >> +void irq_chip_ack_parent(struct irq_data *data)
> >> +{
> >> +	data = data->parent_data;
> >> +	if (data && data->chip && data->chip->irq_ack)
> >> +		data->chip->irq_ack(data);
> > 
> > Why is this restricted to a single parent level and does not go down
> > the whole stack?
> Hi Thomas,
> 	It happens to work on x86, and we want to achieve a bit
> performance advantage by not walking down the whole stack.
> If preferred, I will change it to walk the whole stack.

Happens to work on my machine is always a bad argument :)

Now, I can see why you want to do that, but if we do an optimization
like that then we should really get rid of the conditional.

You surely need a conditional on data->chip and data->chip->callback
for a full stack walk, but for an explicit request to use the parent's
ack the parent had better have a chip with an ack function, right?

void irq_chip_ack_parent(struct irq_data *data)
{
	data = data->parent_data;
	data->chip->irq_ack(data);
}
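
To make the difference concrete, here is a user-space sketch of both
variants. The structs are cut-down stand-ins, and the full-walk helper
irq_chip_ack_all_parents() and the 'acked' counter are hypothetical
names used only for this illustration:

```c
#include <stddef.h>

struct irq_data;

/* Cut-down stand-ins for the kernel structures. */
struct irq_chip {
	void (*irq_ack)(struct irq_data *data);
};

struct irq_data {
	struct irq_chip *chip;
	struct irq_data *parent_data;
	int acked;	/* illustration only, not in the kernel struct */
};

/* Explicit one-level delegation: the parent must provide irq_ack. */
static void irq_chip_ack_parent(struct irq_data *data)
{
	data = data->parent_data;
	data->chip->irq_ack(data);
}

/* Hypothetical full walk: any level may lack a chip or an ack callback. */
static void irq_chip_ack_all_parents(struct irq_data *data)
{
	for (data = data->parent_data; data; data = data->parent_data)
		if (data->chip && data->chip->irq_ack)
			data->chip->irq_ack(data);
}

/* Sample ack callback that just counts invocations. */
static void count_ack(struct irq_data *data)
{
	data->acked++;
}
```

The unconditional version will crash on a misconfigured hierarchy
instead of silently doing nothing, which is the point.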

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 02/21] genirq: Introduce helper functions to support stacked irq_chip
  2014-09-17 20:58         ` Thomas Gleixner
@ 2014-09-18  6:14           ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-18  6:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel



On 2014/9/18 4:58, Thomas Gleixner wrote:
> On Wed, 17 Sep 2014, Jiang Liu wrote:
>> On 2014/9/17 1:45, Thomas Gleixner wrote:
>>> On Thu, 11 Sep 2014, Jiang Liu wrote:
>>>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>>>> +void irq_chip_ack_parent(struct irq_data *data)
>>>> +{
>>>> +	data = data->parent_data;
>>>> +	if (data && data->chip && data->chip->irq_ack)
>>>> +		data->chip->irq_ack(data);
>>>
>>> Why is this restricted to a single parent level and does not go down
>>> the whole stack?
>> Hi Thomas,
>> 	It happens to work on x86, and we want to achieve a bit
>> performance advantage by not walking down the whole stack.
>> If preferred, I will change it to walk the whole stack.
> 
> Happens to work on my machine is always a bad argument :)
> 
> Now, I can see why you want to do that, but if we do an optimization
> like that then we should really get rid of the conditional.
> 
> You surely need a conditional on data->chip and data->chip->callback
> for a full stack walk, but for an explicit request to use the parent's
> ack the parent had better have a chip with an ack function, right?
> 
> void irq_chip_ack_parent(struct irq_data *data)
> {
> 	data = data->parent_data;
> 	data->chip->irq_ack(data);
> }
Sure, will optimize it further as in the code above.
Regards!
Gerry
> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-16 17:43     ` Thomas Gleixner
@ 2014-09-18  7:28       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-18  7:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

On 2014/9/17 1:43, Thomas Gleixner wrote:
> Jiang,
> 
> On Thu, 11 Sep 2014, Jiang Liu wrote:
>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>>  	/* Create mapping */
>> -	virq = irq_create_mapping(domain, hwirq);
>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>> +	if (domain->ops->alloc)
>> +		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
>> +					     irq_data);
>> +	else
>> +#endif
>> +		virq = irq_create_mapping(domain, hwirq);
> 
> I'd prefer to get rid of the #ifdef CONFIG...s in the code. So this
> can be written:
> 
>         if (irq_domain_has_hierarchy(domain))
> 		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
> 					     irq_data);
> 	else
> 		virq = irq_create_mapping(domain, hwirq);
Sure, will kill the ifdef.
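
A minimal user-space sketch of the pattern, assuming the helper keeps
the name irq_domain_has_hierarchy() from the suggestion above and tests
ops->alloc as the original hunk did. The structs, the simplified alloc
signature, pick_mapping_path() and dummy_alloc() are made up for the
example:

```c
#include <stdbool.h>
#include <stddef.h>

/* Models the Kconfig symbol; remove this line to build the stub variant. */
#define CONFIG_IRQ_DOMAIN_HIERARCHY 1

struct irq_domain_ops {
	int (*alloc)(void);	/* simplified signature, model only */
};

struct irq_domain {
	const struct irq_domain_ops *ops;
};

#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
static inline bool irq_domain_has_hierarchy(const struct irq_domain *domain)
{
	return domain->ops->alloc != NULL;
}
#else
/* Stub: compiles to a constant, so the hierarchy branch is dropped. */
static inline bool irq_domain_has_hierarchy(const struct irq_domain *domain)
{
	(void)domain;
	return false;
}
#endif

enum mapping_path { PATH_LEGACY, PATH_HIERARCHY };

/* The caller needs no #ifdef at all. */
static enum mapping_path pick_mapping_path(const struct irq_domain *domain)
{
	if (irq_domain_has_hierarchy(domain))
		return PATH_HIERARCHY;
	return PATH_LEGACY;
}

static int dummy_alloc(void)
{
	return 0;
}
```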

> 	   
> 
>>  	if (!virq)
>>  		return virq;
>>  
>> @@ -540,7 +542,11 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
>>  		return 0;
>>  
>>  	if (hwirq < domain->revmap_direct_max_irq) {
>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>> +		data = irq_domain_get_irq_data(domain, hwirq);
>> +#else
>>  		data = irq_get_irq_data(hwirq);
>> +#endif
> 
> Similar here. Make irq_domain_get_irq_data() map to irq_get_irq_data() for
> the non hierarchy mode so you end up with a single line:
> 
> -  		data = irq_get_irq_data(hwirq);
> +		data = irq_domain_get_irq_data(domain, hwirq);
Sure.

>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>> +/**
>> + * irq_domain_alloc_irqs - Allocate IRQs from domain
>> + * @domain: domain to allocate from
>> + * @irq_base: allocate specified IRQ number if irq_base >= 0
>> + * @nr_irqs: number of IRQs to allocate
>> + * @node: NUMA node id for memory allocation
>> + * @arg: domain specific argument
>> + * @realloc: IRQ descriptors have already been allocated if true
>> + *
>> + * Allocate IRQ numbers and initialize all data structures to support
>> + * hierarchy IRQ domains.
>> + * Parameter @realloc is mainly to support legacy IRQs.
> 
> What's the issue with the legacy irqs? So this has the interrupt
> descriptors allocated already. Are they already wired up for serving
> interrupts and what's the state of those lines?
Function arch_early_ioapic_init() allocates irq descriptors and
irq_cfg structures for all legacy IRQs, for three purposes:
1) To support ISA IRQs managed by the 8259.
2) To reserve vectors on all CPUs for legacy IRQs.
3) To prepare data structures to support pre_init_apic_IRQ0().
We will kill pre_init_apic_IRQ0() soon, so item 3 above won't be needed
anymore.

When __irq_domain_alloc_irqs() is called, only the irq descriptor and
irq_cfg have been allocated; the interrupt controller hardware
should still be untouched.

> 
>> + * Returns error code or allocated IRQ number
> 
> Can you please add some documentation how the hierarchical allocation
> is supposed to work and how the domains are connected. That should
> probably go to Documentation/IRQ-domains.txt.
Sure, I will do my best to add documentation for it.
> 
> Other than that this looks pretty good! Nice work!
Thanks!
Gerry
> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-11 14:03   ` Jiang Liu
@ 2014-09-18  8:48     ` Joe.C
  -1 siblings, 0 replies; 110+ messages in thread
From: Joe.C @ 2014-09-18  8:48 UTC (permalink / raw)
  To: Jiang Liu
  Cc: x86, Tony Luck, linux-acpi, Konrad Rzeszutek Wilk, Marc Zyngier,
	Benjamin Herrenschmidt, Joerg Roedel, Randy Dunlap,
	Rafael J. Wysocki, Greg Kroah-Hartman, linux-pci, linux-kernel,
	Grant Likely, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Bjorn Helgaas, Thomas Gleixner, Yinghai Lu, Andrew Morton,
	linux-arm-kernel

On Thu, 2014-09-11 at 22:03 +0800, Jiang Liu wrote:
> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +

I get the following build warnings on these lines:

../include/linux/irqdomain.h:75:47: warning: 'struct irq_data' declared
inside parameter list [enabled by default]
../include/linux/irqdomain.h:75:47: warning: its scope is only this
definition or declaration, which is probably not what you want [enabled
by default]
../include/linux/irqdomain.h:76:49: warning: 'struct irq_data' declared
inside parameter list [enabled by default]

Joe.C

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-18  8:48     ` Joe.C
@ 2014-09-18  8:58       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-18  8:58 UTC (permalink / raw)
  To: Joe.C
  Cc: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier,
	Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, linux-arm-kernel

On 2014/9/18 16:48, Joe.C wrote:
> On Thu, 2014-09-11 at 22:03 +0800, Jiang Liu wrote:
>> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
>> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
>> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>> +
> 
> I get the following build warnings on these lines:
> 
> ../include/linux/irqdomain.h:75:47: warning: 'struct irq_data' declared
> inside parameter list [enabled by default]
> ../include/linux/irqdomain.h:75:47: warning: its scope is only this
> definition or declaration, which is probably not what you want [enabled
> by default]
> ../include/linux/irqdomain.h:76:49: warning: 'struct irq_data' declared
> inside parameter list [enabled by default]
Hi Joe,
	Thanks for testing. We should add a forward declaration of
struct irq_data to the top of include/linux/irqdomain.h.
Will fix it in the next version.
Regards!
Gerry
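
For reference, a small stand-alone sketch of why the forward
declaration is enough. The one-field struct irq_data at the end is a
placeholder, not the real definition:

```c
/*
 * Forward declaration at file scope. Without it, 'struct irq_data' in
 * the parameter lists below would declare a new type local to each
 * prototype, which is what the compiler warns about.
 */
struct irq_data;

static inline int irq_domain_activate_irq(struct irq_data *data)
{
	(void)data;
	return 0;
}

static inline int irq_domain_deactivate_irq(struct irq_data *data)
{
	(void)data;
	return 0;
}

/* The complete type may come later, e.g. from another header. */
struct irq_data {
	unsigned int irq;	/* placeholder member for the example */
};
```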
> 
> Joe.C
> 
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-16 17:43     ` Thomas Gleixner
@ 2014-09-22  8:17       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-22  8:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi, linux-arm-kernel

We plan to use hierarchy irqdomain to support CPU vector assignment,
interrupt remapping controller, IO-APIC controller, MSI interrupt
and HyperTransport interrupt etc. on x86 platforms. So extend irqdomain
interfaces to support hierarchy irqdomain.

There are already many clients of current irqdomain interfaces.
To minimize the changes, we choose to introduce new version 2 interfaces
to support hierarchy instead of extending existing irqdomain interfaces.

According to Thomas's suggestion, the most important design decision is
to build hierarchy struct irq_data to support hierarchy irqdomain, so
hierarchy irqdomain related data could be saved in struct irq_data.
With support of hierarchy irq_data, we could also support stacked
irq_chips. This is most useful in case of set_affinity().

The new hierarchy irqdomain introduces the following interfaces:
1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQ
   and related resources.
2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
   interrupt controllers to activate/deactivate interrupt.

There are also several helper functions to ease irqdomain implementations:
1) irq_domain_get_irq_data(): get irq_data associated with a specific
   irqdomain.
2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
   irq_data.
3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
   parent irqdomain's alloc/free callbacks.

We also changed irq_startup()/irq_shutdown() to invoke
irq_domain_activate_irq()/irq_domain_deactivate_irq() to program the
interrupt controller when starting/stopping interrupts.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
Hi Thomas,
	I have refined the patch by:
1) add documentation for hierarchy irq_domain into IRQ-domain.txt
2) add 'flags' field into struct irq_domain
3) introduce irq_domain_is_hierarchy() to hide details
4) refine irq_domain_get_irq_data() to support non-hierarchy irqdomain
Regards!
Gerry
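
As a rough illustration of how allocation is meant to flow through the
hierarchy, here is a user-space model. Only the shape of
irq_domain_alloc_irqs_parent() follows the patch; the cut-down structs,
the 'allocated' counter and the root_alloc()/child_alloc() callbacks are
assumptions made up for this sketch:

```c
#include <errno.h>
#include <stddef.h>

struct irq_domain;

struct irq_domain_ops {
	int (*alloc)(struct irq_domain *d, unsigned int virq,
		     unsigned int nr_irqs, void *arg);
};

struct irq_domain {
	const struct irq_domain_ops *ops;
	struct irq_domain *parent;
	unsigned int allocated;	/* model-only bookkeeping */
};

/* Same shape as the helper in the patch. */
static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
				int irq_base, unsigned int nr_irqs, void *arg)
{
	if (domain->parent && domain->parent->ops->alloc)
		return domain->parent->ops->alloc(domain->parent, irq_base,
						  nr_irqs, arg);
	return -ENOSYS;
}

/* Root domain (e.g. CPU vectors): there is nothing above it to ask. */
static int root_alloc(struct irq_domain *d, unsigned int virq,
		      unsigned int nr_irqs, void *arg)
{
	(void)virq;
	(void)arg;
	d->allocated += nr_irqs;
	return 0;
}

/* Child domain: get the parent's resources first, then claim its own. */
static int child_alloc(struct irq_domain *d, unsigned int virq,
		       unsigned int nr_irqs, void *arg)
{
	int ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, arg);

	if (ret < 0)
		return ret;
	d->allocated += nr_irqs;
	return 0;
}

static const struct irq_domain_ops root_ops = { .alloc = root_alloc };
static const struct irq_domain_ops child_ops = { .alloc = child_alloc };
```

With an IOAPIC -> remapping -> vector chain built from these, a single
alloc call on the IOAPIC domain reaches every level.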
---
 Documentation/IRQ-domain.txt |   71 +++++++++
 include/linux/irq.h          |    3 +
 include/linux/irqdomain.h    |   86 ++++++++++
 kernel/irq/Kconfig           |    3 +
 kernel/irq/chip.c            |    3 +
 kernel/irq/irqdomain.c       |  360 ++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 510 insertions(+), 16 deletions(-)

diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
index 8a8b82c9ca53..062f6b6088b4 100644
--- a/Documentation/IRQ-domain.txt
+++ b/Documentation/IRQ-domain.txt
@@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure
 that the driver using the simple domain call irq_create_mapping()
 before any irq_find_mapping() since the latter will actually work
 for the static IRQ assignment case.
+
+==== Hierarchy IRQ domain ====
+On some architectures, there may be multiple interrupt controllers
+involved in delivering an interrupt from the device to the target CPU.
+Let's look at a typical interrupt delivering path on x86 platforms:
+
+Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
+
+There are three interrupt controllers involved:
+1) IOAPIC controller
+2) Interrupt remapping controller
+3) Local APIC controller
+
+To support such a hardware topology and make software architecture match
+hardware architecture, an irq_domain data structure is built for each
+interrupt controller and those irq_domains are organized into hierarchy.
+When building the irq_domain hierarchy, the irq_domain nearer to the device
+is the child and the irq_domain nearer to the CPU is the parent. So a hierarchy
+structure as below will be built for the example above.
+	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
+		^
+		|
+	Interrupt Remapping irq_domain (manage irq_remapping entries)
+		^
+		|
+	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
+
+There are four major interfaces to use hierarchy irq_domain:
+1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
+   controller related resources to deliver these interrupts.
+2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controller
+   related resources associated with these interrupts.
+3) irq_domain_activate_irq(): activate interrupt controller hardware to
+   deliver the interrupt.
+4) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
+   to stop delivering the interrupt.
+
+The following changes are needed to support hierarchy irq_domain.
+1) a new field 'parent' is added to struct irq_domain, it's used to
+   maintain irq_domain hierarchy information.
+2) a new field 'parent_data' is added to struct irq_data, it's used to
+   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
+   is used to store irq_domain pointer and hardware irq number.
+3) new callbacks are added to struct irq_domain_ops to support hierarchy
+   irq_domain operations.
+
+With support of hierarchy irq_domain and hierarchy irq_data ready, an
+irq_domain structure is built for each interrupt controller, and an
+irq_data structure is allocated for each irq_domain associated with an
+IRQ. Now we could go one step further to support stacked (hierarchy)
+irq_chip. That is, an irq_chip is associated with each irq_data along
+the hierarchy. A child irq_chip may implement a required action by
+itself or by cooperating with its parent irq_chip.
+
+With stacked irq_chip, interrupt controller driver only needs to deal
+with the hardware managed by itself and may ask for services from its
+parent irq_chip when needed. So we could achieve a much cleaner
+software architecture.
+
+For an interrupt controller driver to support hierarchy irq_domain, it
+needs to:
+1) Implement irq_domain_ops.alloc and irq_domain_ops.free
+2) Optionally implement irq_domain_ops.activate and
+   irq_domain_ops.deactivate.
+3) Optionally implement an irq_chip to manage the interrupt controller
+   hardware.
+4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
+   they are unused with hierarchy irq_domain.
+
+Hierarchy irq_domain may also be used to support other architectures,
+such as ARM, ARM64 etc.
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62af59242ddc..4b74565690ce 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -151,6 +151,9 @@ struct irq_data {
 	unsigned int		state_use_accessors;
 	struct irq_chip		*chip;
 	struct irq_domain	*domain;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_data		*parent_data;
+#endif
 	void			*handler_data;
 	void			*chip_data;
 	struct msi_desc		*msi_desc;
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index b0f9d16e48f6..46e047c414bc 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -38,6 +38,8 @@
 struct device_node;
 struct irq_domain;
 struct of_device_id;
+struct irq_chip;
+struct irq_data;
 
 /* Number of irqs reserved for a legacy isa controller */
 #define NUM_ISA_INTERRUPTS	16
@@ -64,6 +66,16 @@ struct irq_domain_ops {
 	int (*xlate)(struct irq_domain *d, struct device_node *node,
 		     const u32 *intspec, unsigned int intsize,
 		     unsigned long *out_hwirq, unsigned int *out_type);
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	/* extended V2 interfaces to support hierarchy irqdomains */
+	int (*alloc)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs, void *arg);
+	void (*free)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs);
+	int (*activate)(struct irq_domain *d, struct irq_data *irq_data);
+	int (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
+#endif
 };
 
 extern struct irq_domain_ops irq_generic_chip_ops;
@@ -77,6 +89,7 @@ struct irq_domain_chip_generic;
  * @ops: pointer to irq_domain methods
  * @host_data: private data pointer for use by owner.  Not touched by irq_domain
  *             core code.
+ * @flags: host per irqdomain flags
  *
  * Optional elements
  * @of_node: Pointer to device tree nodes associated with the irq_domain. Used
@@ -84,6 +97,7 @@ struct irq_domain_chip_generic;
  * @gc: Pointer to a list of generic chips. There is a helper function for
  *      setting up one or more generic chips for interrupt controllers
  *      drivers using the generic chip library which uses this pointer.
+ * @parent: Pointer to parent irqdomain to support hierarchy irqdomains
  *
  * Revmap data, used internally by irq_domain
  * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that
@@ -97,10 +111,14 @@ struct irq_domain {
 	const char *name;
 	const struct irq_domain_ops *ops;
 	void *host_data;
+	unsigned int flags;
 
 	/* Optional data */
 	struct device_node *of_node;
 	struct irq_domain_chip_generic *gc;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_domain *parent;
+#endif
 
 	/* reverse map data. The linear map gets appended to the irq_domain */
 	irq_hw_number_t hwirq_max;
@@ -110,6 +128,9 @@ struct irq_domain {
 	unsigned int linear_revmap[];
 };
 
+#define	IRQ_DOMAIN_FLAG_HIERARCHY	0x1
+#define	IRQ_DOMAIN_FLAG_ARCH1		0x10000
+
 #ifdef CONFIG_IRQ_DOMAIN
 struct irq_domain *__irq_domain_add(struct device_node *of_node, int size,
 				    irq_hw_number_t hwirq_max, int direct_max,
@@ -220,8 +241,73 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d, struct device_node *ctrlr,
 			const u32 *intspec, unsigned int intsize,
 			irq_hw_number_t *out_hwirq, unsigned int *out_type);
 
+/* V2 interfaces to support hierarchy IRQ domains. */
+extern struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+						unsigned int virq);
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+extern int irq_domain_set_hwirq_and_chip(struct irq_domain *domain,
+					 unsigned int virq,
+					 irq_hw_number_t hwirq,
+					 struct irq_chip *chip,
+					 void *chip_data);
+extern void irq_domain_reset_irq_data(struct irq_data *irq_data);
+extern int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+				   unsigned int nr_irqs, int node, void *arg,
+				   bool realloc);
+extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
+extern int irq_domain_activate_irq(struct irq_data *irq_data);
+extern int irq_domain_deactivate_irq(struct irq_data *irq_data);
+
+static inline int irq_domain_alloc_irqs(struct irq_domain *domain,
+			unsigned int nr_irqs, int node, void *arg)
+{
+	return __irq_domain_alloc_irqs(domain, -1, nr_irqs, node, arg, false);
+}
+
+static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
+				int irq_base, unsigned int nr_irqs, void *arg)
+{
+	if (domain->parent && domain->parent->ops->alloc)
+		return domain->parent->ops->alloc(domain->parent, irq_base,
+						  nr_irqs, arg);
+	return -ENOSYS;
+}
+
+static inline void irq_domain_free_irqs_parent(struct irq_domain *domain,
+					int irq_base, unsigned int nr_irqs)
+{
+	if (domain->parent && domain->parent->ops->free)
+		domain->parent->ops->free(domain->parent, irq_base, nr_irqs);
+}
+
+static inline bool irq_domain_is_hierarchy(struct irq_domain *domain)
+{
+	return domain->flags & IRQ_DOMAIN_FLAG_HIERARCHY;
+}
+
+static inline void irq_domain_set_hierarchy(struct irq_domain *domain)
+{
+	domain->flags |= IRQ_DOMAIN_FLAG_HIERARCHY;
+}
+#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_alloc_irqs(struct irq_domain *domain,
+			unsigned int nr_irqs, int node, void *arg)
+{
+	return -1;
+}
+
+static inline bool irq_domain_is_hierarchy(struct irq_domain *domain)
+{
+	return false;
+}
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+
 #else /* CONFIG_IRQ_DOMAIN */
 static inline void irq_dispose_mapping(unsigned int virq) { }
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
 #endif /* !CONFIG_IRQ_DOMAIN */
 
 #endif /* _LINUX_IRQDOMAIN_H */
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index d269cecdfbf0..dc1f3d08892e 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -55,6 +55,9 @@ config GENERIC_IRQ_CHIP
 config IRQ_DOMAIN
 	bool
 
+config IRQ_DOMAIN_HIERARCHY
+	bool
+
 config IRQ_DOMAIN_DEBUG
 	bool "Expose hardware/virtual IRQ mapping via debugfs"
 	depends on IRQ_DOMAIN && DEBUG_FS
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6223fab9a9d2..46bd5e2190c3 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
+#include <linux/irqdomain.h>
 
 #include <trace/events/irq.h>
 
@@ -178,6 +179,7 @@ int irq_startup(struct irq_desc *desc, bool resend)
 	irq_state_clr_disabled(desc);
 	desc->depth = 0;
 
+	irq_domain_activate_irq(&desc->irq_data);
 	if (desc->irq_data.chip->irq_startup) {
 		ret = desc->irq_data.chip->irq_startup(&desc->irq_data);
 		irq_state_clr_masked(desc);
@@ -199,6 +201,7 @@ void irq_shutdown(struct irq_desc *desc)
 		desc->irq_data.chip->irq_disable(&desc->irq_data);
 	else
 		desc->irq_data.chip->irq_mask(&desc->irq_data);
+	irq_domain_deactivate_irq(&desc->irq_data);
 	irq_state_set_masked(desc);
 }
 
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 6534ff6ce02e..26628239088c 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -23,6 +23,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
 static DEFINE_MUTEX(revmap_trees_mutex);
 static struct irq_domain *irq_default_domain;
 
+static int irq_domain_alloc_descs(int virq, unsigned int cnt,
+				  irq_hw_number_t hwirq, int node);
+
 /**
  * __irq_domain_add() - Allocate a new irq_domain data structure
  * @of_node: optional device-tree node of the interrupt controller
@@ -30,7 +33,7 @@ static struct irq_domain *irq_default_domain;
  * @hwirq_max: Maximum number of interrupts supported by controller
  * @direct_max: Maximum value of direct maps; Use ~0 for no limit; 0 for no
  *              direct mapping
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates and initialize and irq_domain structure.
@@ -109,7 +112,7 @@ EXPORT_SYMBOL_GPL(irq_domain_remove);
  * @first_irq: first number of irq block assigned to the domain,
  *	pass zero to assign irqs on-the-fly. If first_irq is non-zero, then
  *	pre-map all of the irqs in the domain to virqs starting at first_irq.
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates an irq_domain, and optionally if first_irq is positive then also
@@ -174,10 +177,8 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
 
 	domain = __irq_domain_add(of_node, first_hwirq + size,
 				  first_hwirq + size, 0, ops, host_data);
-	if (!domain)
-		return NULL;
-
-	irq_domain_associate_many(domain, first_irq, first_hwirq, size);
+	if (domain)
+		irq_domain_associate_many(domain, first_irq, first_hwirq, size);
 
 	return domain;
 }
@@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
 unsigned int irq_create_mapping(struct irq_domain *domain,
 				irq_hw_number_t hwirq)
 {
-	unsigned int hint;
 	int virq;
 
 	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
@@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 	}
 
 	/* Allocate a virtual interrupt number */
-	hint = hwirq % nr_irqs;
-	if (hint == 0)
-		hint++;
-	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
-	if (virq <= 0)
-		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
+	virq = irq_domain_alloc_descs(-1, 1, hwirq,
+				      of_node_to_nid(domain->of_node));
 	if (virq <= 0) {
 		pr_debug("-> virq allocation failed\n");
 		return 0;
@@ -490,7 +486,10 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
 	}
 
 	/* Create mapping */
-	virq = irq_create_mapping(domain, hwirq);
+	if (irq_domain_is_hierarchy(domain))
+		virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, irq_data);
+	else
+		virq = irq_create_mapping(domain, hwirq);
 	if (!virq)
 		return virq;
 
@@ -540,8 +539,8 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
 		return 0;
 
 	if (hwirq < domain->revmap_direct_max_irq) {
-		data = irq_get_irq_data(hwirq);
-		if (data && (data->domain == domain) && (data->hwirq == hwirq))
+		data = irq_domain_get_irq_data(domain, hwirq);
+		if (data && data->hwirq == hwirq)
 			return hwirq;
 	}
 
@@ -709,3 +708,332 @@ const struct irq_domain_ops irq_domain_simple_ops = {
 	.xlate = irq_domain_xlate_onetwocell,
 };
 EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
+
+static int irq_domain_alloc_descs(int virq, unsigned int cnt,
+				  irq_hw_number_t hwirq, int node)
+{
+	unsigned int hint;
+
+	if (virq >= 0) {
+		virq = irq_alloc_descs(virq, virq, cnt, node);
+	} else {
+		hint = hwirq % nr_irqs;	/* global nr_irqs, not the count */
+		if (hint == 0)
+			hint++;
+		virq = irq_alloc_descs_from(hint, cnt, node);
+		if (virq <= 0 && hint > 1)
+			virq = irq_alloc_descs_from(1, cnt, node);
+	}
+
+	return virq;
+}
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_free_desc(virq + i);
+}
+
+static void irq_domain_insert_irq(int virq)
+{
+	struct irq_data *data;
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = virq;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_insert(&domain->revmap_tree, hwirq, data);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+
+		/* If not already assigned, give the domain the chip's name */
+		if (!domain->name && data->chip)
+			domain->name = data->chip->name;
+	}
+
+	irq_clear_status_flags(virq, IRQ_NOREQUEST);
+}
+
+static void irq_domain_remove_irq(int virq)
+{
+	struct irq_data *data;
+
+	irq_set_status_flags(virq, IRQ_NOREQUEST);
+	irq_set_chip_and_handler(virq, NULL, NULL);
+	synchronize_irq(virq);
+	smp_mb();
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = 0;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_delete(&domain->revmap_tree, hwirq);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+	}
+}
+
+static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
+						   struct irq_data *child)
+{
+	struct irq_data *irq_data;
+
+	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
+	if (irq_data) {
+		child->parent_data = irq_data;
+		irq_data->irq = child->irq;
+		irq_data->node = child->node;
+		irq_data->domain = domain;
+	}
+
+	return irq_data;
+}
+
+static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data, *tmp;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		tmp = irq_data->parent_data;
+		irq_data->parent_data = NULL;
+		irq_data->domain = NULL;
+
+		while (tmp) {
+			irq_data = tmp;
+			tmp = tmp->parent_data;
+			kfree(irq_data);
+		}
+	}
+}
+
+static int irq_domain_alloc_irq_data(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data;
+	struct irq_domain *parent;
+
+	/* The outermost irq_data is embedded in struct irq_desc */
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		irq_data->domain = domain;
+
+		for (parent = domain->parent; parent; parent = parent->parent) {
+			irq_data = irq_domain_insert_irq_data(parent, irq_data);
+			if (!irq_data) {
+				irq_domain_free_irq_data(virq, i + 1);
+				return -ENOMEM;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
+ * @domain: domain to match
+ * @virq: IRQ number to get irq_data
+ */
+struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+					 unsigned int virq)
+{
+	struct irq_data *irq_data;
+
+	for (irq_data = irq_get_irq_data(virq); irq_data;
+	     irq_data = irq_data->parent_data)
+		if (irq_data->domain == domain)
+			return irq_data;
+
+	return NULL;
+}
+
+int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
+				  irq_hw_number_t hwirq, struct irq_chip *chip,
+				  void *chip_data)
+{
+	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
+
+	if (!irq_data)
+		return -ENOENT;
+
+	irq_data->hwirq = hwirq;
+	irq_data->chip = chip;
+	irq_data->chip_data = chip_data;
+
+	return 0;
+}
+
+void irq_domain_reset_irq_data(struct irq_data *irq_data)
+{
+	irq_data->hwirq = 0;
+	irq_data->chip = NULL;
+	irq_data->chip_data = NULL;
+}
+
+/**
+ * __irq_domain_alloc_irqs - Allocate IRQs from domain
+ * @domain: domain to allocate from
+ * @irq_base: allocate specified IRQ number if irq_base >= 0
+ * @nr_irqs: number of IRQs to allocate
+ * @node: NUMA node id for memory allocation
+ * @arg: domain specific argument
+ * @realloc: IRQ descriptors have already been allocated if true
+ *
+ * Allocate IRQ numbers and initialize all data structures to support
+ * hierarchy IRQ domains.
+ * Parameter @realloc is mainly to support legacy IRQs.
+ * Returns error code or allocated IRQ number
+ */
+int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+			    unsigned int nr_irqs, int node, void *arg,
+			    bool realloc)
+{
+	int i, ret, virq;
+
+	if (domain == NULL) {
+		domain = irq_default_domain;
+		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
+			return -EINVAL;
+	}
+
+	if (!domain->ops->alloc) {
+		pr_debug("domain->ops->alloc() is NULL\n");
+		return -ENOSYS;
+	}
+
+	if (realloc && irq_base >= 0) {
+		virq =  irq_base;
+	} else {
+		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
+		if (virq < 0) {
+			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
+				 irq_base, nr_irqs);
+			return virq;
+		}
+	}
+
+	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
+		pr_debug("cannot allocate memory for IRQ%d\n", virq);
+		ret = -ENOMEM;
+		goto out_free_desc;
+	}
+
+	mutex_lock(&irq_domain_mutex);
+	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
+	if (ret < 0) {
+		mutex_unlock(&irq_domain_mutex);
+		goto out_free_irq_data;
+	}
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_insert_irq(virq + i);
+	mutex_unlock(&irq_domain_mutex);
+
+	return virq;
+
+out_free_irq_data:
+	irq_domain_free_irq_data(virq, nr_irqs);
+out_free_desc:
+	irq_domain_free_descs(virq, nr_irqs);
+	return ret;
+}
+
+/**
+ * irq_domain_free_irqs - Free IRQ number and associated data structures
+ * @virq: base IRQ number
+ * @nr_irqs: number of IRQs to free
+ */
+void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *data = irq_get_irq_data(virq);
+
+	if (WARN(!data || !data->domain || !data->domain->ops->free,
+		 "NULL pointer, cannot free irq\n"))
+		return;
+
+	mutex_lock(&irq_domain_mutex);
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_remove_irq(virq + i);
+	data->domain->ops->free(data->domain, virq, nr_irqs);
+	mutex_unlock(&irq_domain_mutex);
+
+	irq_domain_free_irq_data(virq, nr_irqs);
+	irq_domain_free_descs(virq, nr_irqs);
+}
+
+/**
+ * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
+ *			     interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->activate to program interrupt controllers, so the
+ * interrupt can actually be delivered.
+ */
+int irq_domain_activate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (irq_data->parent_data)
+			ret = irq_domain_activate_irq(irq_data->parent_data);
+		if (ret == 0 && domain->ops->activate)
+			ret = domain->ops->activate(domain, irq_data);
+	}
+
+	return ret;
+}
+
+/**
+ * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
+ *			       deactivate interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->deactivate to program interrupt controllers to disable
+ * interrupt delivery.
+ */
+int irq_domain_deactivate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (domain->ops->deactivate)
+			ret = domain->ops->deactivate(domain, irq_data);
+		if (ret == 0 && irq_data->parent_data)
+			ret = irq_domain_deactivate_irq(irq_data->parent_data);
+	}
+
+	return ret;
+}
+#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+/**
+ * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
+ * @domain: domain to match
+ * @virq: IRQ number to get irq_data
+ */
+struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+					 unsigned int virq)
+{
+	struct irq_data *irq_data = irq_get_irq_data(virq);
+
+	return (irq_data && irq_data->domain == domain) ? irq_data : NULL;
+}
+
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
-- 
1.7.10.4


* [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
@ 2014-09-22  8:17       ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-22  8:17 UTC (permalink / raw)
  To: linux-arm-kernel

We plan to use hierarchy irqdomain to support CPU vector assignment,
interrupt remapping controller, IO-APIC controller, MSI interrupt
and HyperTransport interrupt etc on x86 platforms. So extend irqdomain
interfaces to support hierarchy irqdomain.

There are already many clients of current irqdomain interfaces.
To minimize the changes, we choose to introduce new version 2 interfaces
to support hierarchy instead of extending existing irqdomain interfaces.

Following Thomas's suggestion, the most important design decision is to
build a hierarchy of struct irq_data to match the irqdomain hierarchy,
so that per-irqdomain data can be saved in struct irq_data. With
hierarchy irq_data in place, we can also support stacked irq_chips,
which is most useful for set_affinity().

The new hierarchy irqdomain introduces the following interfaces:
1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQ
   and related resources.
2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
   interrupt controllers to activate/deactivate interrupt.

There are also several helper functions to ease irqdomain implementations:
1) irq_domain_get_irq_data(): get irq_data associated with a specific
   irqdomain.
2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
   irq_data.
3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
   parent irqdomain's alloc/free callbacks.
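
As an illustration only, the lookup performed by irq_domain_get_irq_data()
and the store performed by irq_domain_set_hwirq_and_chip() can be modelled
in user space. This is a minimal sketch, not kernel code: the structs are
cut-down stand-ins, the model_*() names are hypothetical, and the hwirq
values 19 and 0x41 are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-ins; these are NOT the kernel definitions. */
struct irq_domain {
	const char *name;
	struct irq_domain *parent;
};

struct irq_data {
	unsigned long hwirq;
	struct irq_domain *domain;
	struct irq_data *parent_data;
};

/* Walk from the outermost irq_data towards the root until @domain matches. */
static struct irq_data *model_get_irq_data(struct irq_data *outermost,
					   struct irq_domain *domain)
{
	struct irq_data *d;

	for (d = outermost; d; d = d->parent_data)
		if (d->domain == domain)
			return d;

	return NULL;
}

/* Store a per-domain hwirq in the irq_data that belongs to @domain. */
static int model_set_hwirq(struct irq_data *outermost,
			   struct irq_domain *domain, unsigned long hwirq)
{
	struct irq_data *d = model_get_irq_data(outermost, domain);

	if (!d)
		return -1;	/* the patch returns -ENOENT in this case */

	d->hwirq = hwirq;
	return 0;
}

/* Build IOAPIC -> remapping -> vector for one IRQ and use the helpers. */
static int model_lookup_demo(void)
{
	struct irq_domain vector = { "vector", NULL };
	struct irq_domain remap = { "remap", &vector };
	struct irq_domain ioapic = { "ioapic", &remap };
	struct irq_domain unrelated = { "unrelated", NULL };
	struct irq_data d_vector = { 0, &vector, NULL };
	struct irq_data d_remap = { 0, &remap, &d_vector };
	struct irq_data d_ioapic = { 0, &ioapic, &d_remap };
	int ret;

	/* Each domain keeps its own hwirq for the same Linux IRQ. */
	ret = model_set_hwirq(&d_ioapic, &ioapic, 19);
	assert(ret == 0);
	ret = model_set_hwirq(&d_ioapic, &vector, 0x41);
	assert(ret == 0);
	assert(d_ioapic.hwirq == 19 && d_vector.hwirq == 0x41);
	assert(d_remap.hwirq == 0);

	/* A domain outside the hierarchy has no irq_data on this chain. */
	ret = model_set_hwirq(&d_ioapic, &unrelated, 1);
	assert(ret == -1);
	assert(model_get_irq_data(&d_ioapic, &unrelated) == NULL);

	return 0;
}
```

The point of the model is that one Linux IRQ number owns a chain of
irq_data, one per irqdomain, so a driver always names its own domain when
reading or writing its part of the chain.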

We also changed irq_startup()/irq_shutdown() to invoke
irq_domain_activate_irq()/irq_domain_deactivate_irq() to program
interrupt controllers when starting/stopping interrupts.
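
The call order matters here: activation in this patch recurses to the
parent before calling the domain's own activate callback, while
deactivation calls the domain's own callback first. A minimal user-space
sketch of that ordering follows. It assumes cut-down structs and
hypothetical model_*() names; the one-letter tags (V, R, I for vector,
remapping, IOAPIC) exist only to record the order of the calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct irq_data;

/* Simplified stand-in: only the two callbacks relevant to the ordering. */
struct irq_domain {
	char tag;	/* one-letter name used only for the trace */
	int (*activate)(struct irq_domain *d, struct irq_data *data);
	int (*deactivate)(struct irq_domain *d, struct irq_data *data);
};

struct irq_data {
	struct irq_domain *domain;
	struct irq_data *parent_data;
};

static char trace[16];
static size_t trace_len;

static void trace_add(char c)
{
	if (trace_len < sizeof(trace) - 1)
		trace[trace_len++] = c;
}

/* Shared callback: just record which domain was called. */
static int record(struct irq_domain *d, struct irq_data *data)
{
	(void)data;
	trace_add(d->tag);
	return 0;
}

/* Parent first, so the root domain is programmed before the leaf. */
static int model_activate(struct irq_data *data)
{
	int ret = 0;

	if (data && data->domain) {
		if (data->parent_data)
			ret = model_activate(data->parent_data);
		if (ret == 0 && data->domain->activate)
			ret = data->domain->activate(data->domain, data);
	}

	return ret;
}

/* Child first, so the leaf is shut off before its parents. */
static int model_deactivate(struct irq_data *data)
{
	int ret = 0;

	if (data && data->domain) {
		if (data->domain->deactivate)
			ret = data->domain->deactivate(data->domain, data);
		if (ret == 0 && data->parent_data)
			ret = model_deactivate(data->parent_data);
	}

	return ret;
}

/* Activate, then deactivate, one IRQ and return the recorded order. */
static const char *model_trace_demo(void)
{
	struct irq_domain vector = { 'V', record, record };
	struct irq_domain remap = { 'R', record, record };
	struct irq_domain ioapic = { 'I', record, record };
	struct irq_data d_vector = { &vector, NULL };
	struct irq_data d_remap = { &remap, &d_vector };
	struct irq_data d_ioapic = { &ioapic, &d_remap };
	int ret;

	memset(trace, 0, sizeof(trace));
	trace_len = 0;

	ret = model_activate(&d_ioapic);
	assert(ret == 0);
	trace_add('|');
	ret = model_deactivate(&d_ioapic);
	assert(ret == 0);

	return trace;
}
```

In this model the recorded trace is "VRI|IRV": vector, remapping, IOAPIC
on activation, and the reverse on deactivation.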

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
Hi Thomas,
	I have refined the patch by:
1) adding documentation for hierarchy irq_domain to IRQ-domain.txt
2) adding a 'flags' field to struct irq_domain
3) introducing irq_domain_is_hierarchy() to hide details
4) refining irq_domain_get_irq_data() to support non-hierarchy irqdomain
Regards!
Gerry
---
 Documentation/IRQ-domain.txt |   71 +++++++++
 include/linux/irq.h          |    3 +
 include/linux/irqdomain.h    |   86 ++++++++++
 kernel/irq/Kconfig           |    3 +
 kernel/irq/chip.c            |    3 +
 kernel/irq/irqdomain.c       |  360 ++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 510 insertions(+), 16 deletions(-)

diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
index 8a8b82c9ca53..062f6b6088b4 100644
--- a/Documentation/IRQ-domain.txt
+++ b/Documentation/IRQ-domain.txt
@@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure
 that the driver using the simple domain call irq_create_mapping()
 before any irq_find_mapping() since the latter will actually work
 for the static IRQ assignment case.
+
+==== Hierarchy IRQ domain ====
+On some architectures, there may be multiple interrupt controllers
+involved in delivering an interrupt from the device to the target CPU.
+Let's look at a typical interrupt delivering path on x86 platforms:
+
+Device -> IOAPIC -> Interrupt remapping controller -> Local APIC -> CPU
+
+There are three interrupt controllers involved:
+1) IOAPIC controller
+2) Interrupt remapping controller
+3) Local APIC controller
+
+To support such a hardware topology and make software architecture match
+hardware architecture, an irq_domain data structure is built for each
+interrupt controller and those irq_domains are organized into hierarchy.
+When building the irq_domain hierarchy, the irq_domain nearer to the
+device is the child and the irq_domain nearer to the CPU is the parent.
+So the hierarchy below will be built for the example above.
+	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
+		^
+		|
+	Interrupt Remapping irq_domain (manage irq_remapping entries)
+		^
+		|
+	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
+
+There are four major interfaces to use hierarchy irq_domain:
+1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
+   controller related resources to deliver these interrupts.
+2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controller
+   related resources associated with these interrupts.
+3) irq_domain_activate_irq(): activate interrupt controller hardware to
+   deliver the interrupt.
+4) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
+   to stop delivering the interrupt.
+
+The following changes are needed to support hierarchy irq_domain:
+1) a new field 'parent' is added to struct irq_domain; it's used to
+   maintain irq_domain hierarchy information.
+2) a new field 'parent_data' is added to struct irq_data; it's used to
+   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
+   is used to store irq_domain pointer and hardware irq number.
+3) new callbacks are added to struct irq_domain_ops to support hierarchy
+   irq_domain operations.
+
+With support of hierarchy irq_domain and hierarchy irq_data ready, an
+irq_domain structure is built for each interrupt controller, and an
+irq_data structure is allocated for each irq_domain associated with an
+IRQ. Now we can go one step further to support stacked (hierarchy)
+irq_chip. That is, an irq_chip is associated with each irq_data along
+the hierarchy. A child irq_chip may implement a required action by
+itself or by cooperating with its parent irq_chip.
+
+With stacked irq_chip, an interrupt controller driver only needs to deal
+with the hardware managed by itself and may ask for services from its
+parent irq_chip when needed. So we could achieve a much cleaner
+software architecture.
+
+For an interrupt controller driver to support hierarchy irq_domain, it
+needs to:
+1) Implement irq_domain_ops.alloc and irq_domain_ops.free
+2) Optionally implement irq_domain_ops.activate and
+   irq_domain_ops.deactivate.
+3) Optionally implement an irq_chip to manage the interrupt controller
+   hardware.
+4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
+   they are unused with hierarchy irq_domain.
+
+Hierarchy irq_domain may also be used to support other architectures,
+such as ARM, ARM64 etc.
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62af59242ddc..4b74565690ce 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -151,6 +151,9 @@ struct irq_data {
 	unsigned int		state_use_accessors;
 	struct irq_chip		*chip;
 	struct irq_domain	*domain;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_data		*parent_data;
+#endif
 	void			*handler_data;
 	void			*chip_data;
 	struct msi_desc		*msi_desc;
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index b0f9d16e48f6..46e047c414bc 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -38,6 +38,8 @@
 struct device_node;
 struct irq_domain;
 struct of_device_id;
+struct irq_chip;
+struct irq_data;
 
 /* Number of irqs reserved for a legacy isa controller */
 #define NUM_ISA_INTERRUPTS	16
@@ -64,6 +66,16 @@ struct irq_domain_ops {
 	int (*xlate)(struct irq_domain *d, struct device_node *node,
 		     const u32 *intspec, unsigned int intsize,
 		     unsigned long *out_hwirq, unsigned int *out_type);
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	/* extended V2 interfaces to support hierarchy irqdomains */
+	int (*alloc)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs, void *arg);
+	void (*free)(struct irq_domain *d, unsigned int virq,
+		     unsigned int nr_irqs);
+	int (*activate)(struct irq_domain *d, struct irq_data *irq_data);
+	int (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
+#endif
 };
 
 extern struct irq_domain_ops irq_generic_chip_ops;
@@ -77,6 +89,7 @@ struct irq_domain_chip_generic;
  * @ops: pointer to irq_domain methods
  * @host_data: private data pointer for use by owner.  Not touched by irq_domain
  *             core code.
+ * @flags: host per irqdomain flags
  *
  * Optional elements
  * @of_node: Pointer to device tree nodes associated with the irq_domain. Used
@@ -84,6 +97,7 @@ struct irq_domain_chip_generic;
  * @gc: Pointer to a list of generic chips. There is a helper function for
  *      setting up one or more generic chips for interrupt controllers
  *      drivers using the generic chip library which uses this pointer.
+ * @parent: Pointer to parent irqdomain to support hierarchy irqdomains
  *
  * Revmap data, used internally by irq_domain
  * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that
@@ -97,10 +111,14 @@ struct irq_domain {
 	const char *name;
 	const struct irq_domain_ops *ops;
 	void *host_data;
+	unsigned int flags;
 
 	/* Optional data */
 	struct device_node *of_node;
 	struct irq_domain_chip_generic *gc;
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+	struct irq_domain *parent;
+#endif
 
 	/* reverse map data. The linear map gets appended to the irq_domain */
 	irq_hw_number_t hwirq_max;
@@ -110,6 +128,9 @@ struct irq_domain {
 	unsigned int linear_revmap[];
 };
 
+#define	IRQ_DOMAIN_FLAG_HIERARCHY	0x1
+#define	IRQ_DOMAIN_FLAG_ARCH1		0x10000
+
 #ifdef CONFIG_IRQ_DOMAIN
 struct irq_domain *__irq_domain_add(struct device_node *of_node, int size,
 				    irq_hw_number_t hwirq_max, int direct_max,
@@ -220,8 +241,73 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d, struct device_node *ctrlr,
 			const u32 *intspec, unsigned int intsize,
 			irq_hw_number_t *out_hwirq, unsigned int *out_type);
 
+/* V2 interfaces to support hierarchy IRQ domains. */
+extern struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+						unsigned int virq);
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+extern int irq_domain_set_hwirq_and_chip(struct irq_domain *domain,
+					 unsigned int virq,
+					 irq_hw_number_t hwirq,
+					 struct irq_chip *chip,
+					 void *chip_data);
+extern void irq_domain_reset_irq_data(struct irq_data *irq_data);
+extern int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+				   unsigned int nr_irqs, int node, void *arg,
+				   bool realloc);
+extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
+extern int irq_domain_activate_irq(struct irq_data *irq_data);
+extern int irq_domain_deactivate_irq(struct irq_data *irq_data);
+
+static inline int irq_domain_alloc_irqs(struct irq_domain *domain,
+			unsigned int nr_irqs, int node, void *arg)
+{
+	return __irq_domain_alloc_irqs(domain, -1, nr_irqs, node, arg, false);
+}
+
+static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
+				int irq_base, unsigned int nr_irqs, void *arg)
+{
+	if (domain->parent && domain->parent->ops->alloc)
+		return domain->parent->ops->alloc(domain->parent, irq_base,
+						  nr_irqs, arg);
+	return -ENOSYS;
+}
+
+static inline void irq_domain_free_irqs_parent(struct irq_domain *domain,
+					int irq_base, unsigned int nr_irqs)
+{
+	if (domain->parent && domain->parent->ops->free)
+		domain->parent->ops->free(domain->parent, irq_base, nr_irqs);
+}
+
+static inline bool irq_domain_is_hierarchy(struct irq_domain *domain)
+{
+	return domain->flags & IRQ_DOMAIN_FLAG_HIERARCHY;
+}
+
+static inline void irq_domain_set_hierarchy(struct irq_domain *domain)
+{
+	domain->flags |= IRQ_DOMAIN_FLAG_HIERARCHY;
+}
+#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_alloc_irqs(struct irq_domain *domain,
+			unsigned int nr_irqs, int node, void *arg)
+{
+	return -1;
+}
+
+static inline bool irq_domain_is_hierarchy(struct irq_domain *domain)
+{
+	return false;
+}
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+
 #else /* CONFIG_IRQ_DOMAIN */
 static inline void irq_dispose_mapping(unsigned int virq) { }
+static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
+static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
 #endif /* !CONFIG_IRQ_DOMAIN */
 
 #endif /* _LINUX_IRQDOMAIN_H */
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index d269cecdfbf0..dc1f3d08892e 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -55,6 +55,9 @@ config GENERIC_IRQ_CHIP
 config IRQ_DOMAIN
 	bool
 
+config IRQ_DOMAIN_HIERARCHY
+	bool
+
 config IRQ_DOMAIN_DEBUG
 	bool "Expose hardware/virtual IRQ mapping via debugfs"
 	depends on IRQ_DOMAIN && DEBUG_FS
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6223fab9a9d2..46bd5e2190c3 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
+#include <linux/irqdomain.h>
 
 #include <trace/events/irq.h>
 
@@ -178,6 +179,7 @@ int irq_startup(struct irq_desc *desc, bool resend)
 	irq_state_clr_disabled(desc);
 	desc->depth = 0;
 
+	irq_domain_activate_irq(&desc->irq_data);
 	if (desc->irq_data.chip->irq_startup) {
 		ret = desc->irq_data.chip->irq_startup(&desc->irq_data);
 		irq_state_clr_masked(desc);
@@ -199,6 +201,7 @@ void irq_shutdown(struct irq_desc *desc)
 		desc->irq_data.chip->irq_disable(&desc->irq_data);
 	else
 		desc->irq_data.chip->irq_mask(&desc->irq_data);
+	irq_domain_deactivate_irq(&desc->irq_data);
 	irq_state_set_masked(desc);
 }
 
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 6534ff6ce02e..26628239088c 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -23,6 +23,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
 static DEFINE_MUTEX(revmap_trees_mutex);
 static struct irq_domain *irq_default_domain;
 
+static int irq_domain_alloc_descs(int virq, unsigned int cnt,
+				  irq_hw_number_t hwirq, int node);
+
 /**
  * __irq_domain_add() - Allocate a new irq_domain data structure
  * @of_node: optional device-tree node of the interrupt controller
@@ -30,7 +33,7 @@ static struct irq_domain *irq_default_domain;
  * @hwirq_max: Maximum number of interrupts supported by controller
  * @direct_max: Maximum value of direct maps; Use ~0 for no limit; 0 for no
  *              direct mapping
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates and initialize and irq_domain structure.
@@ -109,7 +112,7 @@ EXPORT_SYMBOL_GPL(irq_domain_remove);
  * @first_irq: first number of irq block assigned to the domain,
  *	pass zero to assign irqs on-the-fly. If first_irq is non-zero, then
  *	pre-map all of the irqs in the domain to virqs starting at first_irq.
- * @ops: map/unmap domain callbacks
+ * @ops: domain callbacks
  * @host_data: Controller private data pointer
  *
  * Allocates an irq_domain, and optionally if first_irq is positive then also
@@ -174,10 +177,8 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
 
 	domain = __irq_domain_add(of_node, first_hwirq + size,
 				  first_hwirq + size, 0, ops, host_data);
-	if (!domain)
-		return NULL;
-
-	irq_domain_associate_many(domain, first_irq, first_hwirq, size);
+	if (domain)
+		irq_domain_associate_many(domain, first_irq, first_hwirq, size);
 
 	return domain;
 }
@@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
 unsigned int irq_create_mapping(struct irq_domain *domain,
 				irq_hw_number_t hwirq)
 {
-	unsigned int hint;
 	int virq;
 
 	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
@@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 	}
 
 	/* Allocate a virtual interrupt number */
-	hint = hwirq % nr_irqs;
-	if (hint == 0)
-		hint++;
-	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
-	if (virq <= 0)
-		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
+	virq = irq_domain_alloc_descs(-1, 1, hwirq,
+				      of_node_to_nid(domain->of_node));
 	if (virq <= 0) {
 		pr_debug("-> virq allocation failed\n");
 		return 0;
@@ -490,7 +486,10 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
 	}
 
 	/* Create mapping */
-	virq = irq_create_mapping(domain, hwirq);
+	if (irq_domain_is_hierarchy(domain))
+		virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, irq_data);
+	else
+		virq = irq_create_mapping(domain, hwirq);
 	if (!virq)
 		return virq;
 
@@ -540,8 +539,8 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
 		return 0;
 
 	if (hwirq < domain->revmap_direct_max_irq) {
-		data = irq_get_irq_data(hwirq);
-		if (data && (data->domain == domain) && (data->hwirq == hwirq))
+		data = irq_domain_get_irq_data(domain, hwirq);
+		if (data && data->hwirq == hwirq)
 			return hwirq;
 	}
 
@@ -709,3 +708,332 @@ const struct irq_domain_ops irq_domain_simple_ops = {
 	.xlate = irq_domain_xlate_onetwocell,
 };
 EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
+
+static int irq_domain_alloc_descs(int virq, unsigned int cnt,
+				  irq_hw_number_t hwirq, int node)
+{
+	unsigned int hint;
+
+	if (virq >= 0) {
+		virq = irq_alloc_descs(virq, virq, cnt, node);
+	} else {
+		hint = hwirq % nr_irqs;	/* global nr_irqs, not the count */
+		if (hint == 0)
+			hint++;
+		virq = irq_alloc_descs_from(hint, cnt, node);
+		if (virq <= 0 && hint > 1)
+			virq = irq_alloc_descs_from(1, cnt, node);
+	}
+
+	return virq;
+}
+
+#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
+static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_free_desc(virq + i);
+}
+
+static void irq_domain_insert_irq(int virq)
+{
+	struct irq_data *data;
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = virq;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_insert(&domain->revmap_tree, hwirq, data);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+
+		/* If not already assigned, give the domain the chip's name */
+		if (!domain->name && data->chip)
+			domain->name = data->chip->name;
+	}
+
+	irq_clear_status_flags(virq, IRQ_NOREQUEST);
+}
+
+static void irq_domain_remove_irq(int virq)
+{
+	struct irq_data *data;
+
+	irq_set_status_flags(virq, IRQ_NOREQUEST);
+	irq_set_chip_and_handler(virq, NULL, NULL);
+	synchronize_irq(virq);
+	smp_mb();
+
+	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
+		struct irq_domain *domain = data->domain;
+		irq_hw_number_t hwirq = data->hwirq;
+
+		if (hwirq < domain->revmap_size) {
+			domain->linear_revmap[hwirq] = 0;
+		} else {
+			mutex_lock(&revmap_trees_mutex);
+			radix_tree_delete(&domain->revmap_tree, hwirq);
+			mutex_unlock(&revmap_trees_mutex);
+		}
+	}
+}
+
+static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
+						   struct irq_data *child)
+{
+	struct irq_data *irq_data;
+
+	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
+	if (irq_data) {
+		child->parent_data = irq_data;
+		irq_data->irq = child->irq;
+		irq_data->node = child->node;
+		irq_data->domain = domain;
+	}
+
+	return irq_data;
+}
+
+static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data, *tmp;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		tmp = irq_data->parent_data;
+		irq_data->parent_data = NULL;
+		irq_data->domain = NULL;
+
+		while (tmp) {
+			irq_data = tmp;
+			tmp = tmp->parent_data;
+			kfree(irq_data);
+		}
+	}
+}
+
+static int irq_domain_alloc_irq_data(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *irq_data;
+	struct irq_domain *parent;
+
+	/* The outermost irq_data is embedded in struct irq_desc */
+	for (i = 0; i < nr_irqs; i++) {
+		irq_data = irq_get_irq_data(virq + i);
+		irq_data->domain = domain;
+
+		for (parent = domain->parent; parent; parent = parent->parent) {
+			irq_data = irq_domain_insert_irq_data(parent, irq_data);
+			if (!irq_data) {
+				irq_domain_free_irq_data(virq, i + 1);
+				return -ENOMEM;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
+ * @domain: domain to match
+ * @virq: IRQ number to get irq_data
+ */
+struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+					 unsigned int virq)
+{
+	struct irq_data *irq_data;
+
+	for (irq_data = irq_get_irq_data(virq); irq_data;
+	     irq_data = irq_data->parent_data)
+		if (irq_data->domain == domain)
+			return irq_data;
+
+	return NULL;
+}
+
+int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
+				  irq_hw_number_t hwirq, struct irq_chip *chip,
+				  void *chip_data)
+{
+	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
+
+	if (!irq_data)
+		return -ENOENT;
+
+	irq_data->hwirq = hwirq;
+	irq_data->chip = chip;
+	irq_data->chip_data = chip_data;
+
+	return 0;
+}
+
+void irq_domain_reset_irq_data(struct irq_data *irq_data)
+{
+	irq_data->hwirq = 0;
+	irq_data->chip = NULL;
+	irq_data->chip_data = NULL;
+}
+
+/**
+ * __irq_domain_alloc_irqs - Allocate IRQs from domain
+ * @domain: domain to allocate from
+ * @irq_base: allocate specified IRQ number if irq_base >= 0
+ * @nr_irqs: number of IRQs to allocate
+ * @node: NUMA node id for memory allocation
+ * @arg: domain specific argument
+ * @realloc: IRQ descriptors have already been allocated if true
+ *
+ * Allocate IRQ numbers and initialize all data structures to support
+ * hierarchy IRQ domains.
+ * Parameter @realloc is mainly to support legacy IRQs.
+ * Returns error code or allocated IRQ number
+ */
+int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
+			    unsigned int nr_irqs, int node, void *arg,
+			    bool realloc)
+{
+	int i, ret, virq;
+
+	if (domain == NULL) {
+		domain = irq_default_domain;
+		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
+			return -EINVAL;
+	}
+
+	if (!domain->ops->alloc) {
+		pr_debug("domain->ops->alloc() is NULL\n");
+		return -ENOSYS;
+	}
+
+	if (realloc && irq_base >= 0) {
+		virq =  irq_base;
+	} else {
+		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
+		if (virq < 0) {
+			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
+				 irq_base, nr_irqs);
+			return virq;
+		}
+	}
+
+	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
+		pr_debug("cannot allocate memory for IRQ%d\n", virq);
+		ret = -ENOMEM;
+		goto out_free_desc;
+	}
+
+	mutex_lock(&irq_domain_mutex);
+	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
+	if (ret < 0) {
+		mutex_unlock(&irq_domain_mutex);
+		goto out_free_irq_data;
+	}
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_insert_irq(virq + i);
+	mutex_unlock(&irq_domain_mutex);
+
+	return virq;
+
+out_free_irq_data:
+	irq_domain_free_irq_data(virq, nr_irqs);
+out_free_desc:
+	irq_domain_free_descs(virq, nr_irqs);
+	return ret;
+}
+
+/**
+ * irq_domain_free_irqs - Free IRQ number and associated data structures
+ * @virq: base IRQ number
+ * @nr_irqs: number of IRQs to free
+ */
+void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
+{
+	int i;
+	struct irq_data *data = irq_get_irq_data(virq);
+
+	if (WARN(!data || !data->domain || !data->domain->ops->free,
+		 "NULL pointer, cannot free irq\n"))
+		return;
+
+	mutex_lock(&irq_domain_mutex);
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_remove_irq(virq + i);
+	data->domain->ops->free(data->domain, virq, nr_irqs);
+	mutex_unlock(&irq_domain_mutex);
+
+	irq_domain_free_irq_data(virq, nr_irqs);
+	irq_domain_free_descs(virq, nr_irqs);
+}
+
+/**
+ * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
+ *			     interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->activate to program interrupt controllers, so the
+ * interrupt can actually be delivered.
+ */
+int irq_domain_activate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (irq_data->parent_data)
+			ret = irq_domain_activate_irq(irq_data->parent_data);
+		if (ret == 0 && domain->ops->activate)
+			ret = domain->ops->activate(domain, irq_data);
+	}
+
+	return ret;
+}
+
+/**
+ * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
+ *			       deactivate interrupt
+ * @irq_data: outermost irq_data associated with interrupt
+ *
+ * It calls domain_ops->deactivate to program interrupt controllers to disable
+ * interrupt delivery.
+ */
+int irq_domain_deactivate_irq(struct irq_data *irq_data)
+{
+	int ret = 0;
+
+	if (irq_data && irq_data->domain) {
+		struct irq_domain *domain = irq_data->domain;
+
+		if (domain->ops->deactivate)
+			ret = domain->ops->deactivate(domain, irq_data);
+		if (ret == 0 && irq_data->parent_data)
+			ret = irq_domain_deactivate_irq(irq_data->parent_data);
+	}
+
+	return ret;
+}
+#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
+/**
+ * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
+ * @domain: domain to match
+ * @virq: IRQ number to get irq_data
+ */
+struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
+					 unsigned int virq)
+{
+	struct irq_data *irq_data = irq_get_irq_data(virq);
+
+	return (irq_data && irq_data->domain == domain) ? irq_data : NULL;
+}
+
+#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
-- 
1.7.10.4


* Re: [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-22  8:17       ` Jiang Liu
@ 2014-09-22 17:30         ` Randy Dunlap
  -1 siblings, 0 replies; 110+ messages in thread
From: Randy Dunlap @ 2014-09-22 17:30 UTC (permalink / raw)
  To: Jiang Liu, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

On 09/22/14 01:17, Jiang Liu wrote:
> ---
>  Documentation/IRQ-domain.txt |   71 +++++++++
>  include/linux/irq.h          |    3 +
>  include/linux/irqdomain.h    |   86 ++++++++++
>  kernel/irq/Kconfig           |    3 +
>  kernel/irq/chip.c            |    3 +
>  kernel/irq/irqdomain.c       |  360 ++++++++++++++++++++++++++++++++++++++++--
>  6 files changed, 510 insertions(+), 16 deletions(-)
> 
> diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
> index 8a8b82c9ca53..062f6b6088b4 100644
> --- a/Documentation/IRQ-domain.txt
> +++ b/Documentation/IRQ-domain.txt
> @@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure
>  that the driver using the simple domain call irq_create_mapping()
>  before any irq_find_mapping() since the latter will actually work
>  for the static IRQ assignment case.
> +
> +==== Hierarchy IRQ domain ====
> +On some architectures, there may be multiple interrupt controllers
> +involved in delivering an interrupt from the device to the target CPU.
> +Let's look at a typical interrupt delivering path on x86 platforms:
> +
> +Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
> +
> +There are three interrupt controllers involved:
> +1) IOAPIC controller
> +2) Interrupt remapping controller
> +3) Local APIC controller
> +
> +To support such a hardware topology and make software architecture match
> +hardware architecture, an irq_domain data structure is built for each
> +interrupt controller and those irq_domains are organized into hierarchy.
> +When building irq_domain hierarchy, the irq_domain near to the device is
> +child and the irq_domain near to CPU is parent. So a hierarchy structure
> +as below will be built for the example above.
> +	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
> +		^
> +		|
> +	Interrupt Remapping irq_domain (manage irq_remapping entries)
> +		^
> +		|
> +	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
> +
> +There are four major interfaces to use hierarchy irq_domain:
> +1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
> +   controller related resources to deliver these interrupts.
> +2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controler

                                                                 controller

> +   related resources associated with these interrupts.
> +3) irq_domain_activate_irq(): activate interrupt controller hardware to
> +   deliver the interrupt.
> +3) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
> +   to stopping delivering the interrupt.

      to stop

> +
> +Following changes are needed to support hierarchy irq_domain.
> +1) a new field 'parent' is added to struct irq_domain, it's used to

                                              irq_domain;

> +   maintain irq_domain hierarchy information.
> +2) a new field 'parent_data' is added to struct irq_data, it's used to

                                                   irq_data;

> +   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
> +   is used to store irq_domain pointer and hardware irq number.
> +3) new callbacks are added to struct irq_domain_ops to support hierarchy
> +   irq_domain operations.
> +
> +With support of hierarchy irq_domain and hierarchy irq_data ready, an
> +irq_domain structure is built for each interrupt controller, and an
> +irq_data structure is allocated for each irq_domain associated with an
> +IRQ. Now we could go one step further to support stacked(hierarchy)
> +irq_chip. That is, an irq_chip is associated with each irq_data along
> +the hierarchy. A child irq_chip may implement a required action by
> +itself or by cooperating with its parent irq_chip.
> +
> +With stacked irq_chip, interrupt controller driver only needs to deal
> +with the hardware managed by itself and may ask for services from its
> +parent irq_chip when needed. So we could achieve a much more cleaner

                                                    a much cleaner

> +software architecture.
> +
> +For an interrupt controller driver to support hierarchy irq_domain, it
> +needs to:
> +1) Implement irq_domain_ops.alloc and irq_domain_ops.free
> +2) Optionally implement irq_domain_ops.activate and
> +   irq_domain_ops.deactivate.
> +3) Optionally implement an irq_chip to manage the interrupt controller
> +   hardware.
> +4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
> +   they are unused with hierarchy irq_domain.
> +
> +Hierarchy irq_domain may also be used to support other architectures,
> +such as ARM, ARM64 etc.
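
The parent/parent_data linkage described in the text above can be
modelled outside the kernel. What follows is only a minimal userspace
sketch, assuming simplified stand-in types: model_domain,
model_irq_data, model_alloc_chain(), model_free_chain() and
model_get_irq_data() are hypothetical names for illustration and are
not part of the patch. It mirrors the idea that one irq_data is
allocated per ancestor domain and that lookup walks from the outermost
level towards the root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for struct irq_domain / struct irq_data. */
struct model_domain {
	const char *name;
	struct model_domain *parent;		/* domain closer to the CPU */
};

struct model_irq_data {
	unsigned long hwirq;
	struct model_domain *domain;
	struct model_irq_data *parent_data;	/* irq_data of the parent domain */
};

/* Free every level below the outermost one, which the caller owns. */
static void model_free_chain(struct model_irq_data *outer)
{
	struct model_irq_data *tmp = outer->parent_data;

	outer->parent_data = NULL;
	outer->domain = NULL;
	while (tmp) {
		struct model_irq_data *next = tmp->parent_data;

		free(tmp);
		tmp = next;
	}
}

/* Allocate one irq_data per ancestor of @domain and link them together. */
static int model_alloc_chain(struct model_domain *domain,
			     struct model_irq_data *outer)
{
	struct model_irq_data *child = outer;
	struct model_domain *parent;

	assert(domain && outer);
	outer->domain = domain;
	outer->parent_data = NULL;

	for (parent = domain->parent; parent; parent = parent->parent) {
		struct model_irq_data *data = calloc(1, sizeof(*data));

		if (!data) {
			/* Undo the partially built chain before failing. */
			model_free_chain(outer);
			return -1;
		}
		data->domain = parent;
		child->parent_data = data;
		child = data;
	}
	return 0;
}

/* Walk from the outermost level towards the root looking for @domain. */
static struct model_irq_data *model_get_irq_data(struct model_domain *domain,
						 struct model_irq_data *outer)
{
	struct model_irq_data *data;

	for (data = outer; data; data = data->parent_data)
		if (data->domain == domain)
			return data;
	return NULL;
}
```

With an IOAPIC -> remapping -> vector chain of model domains, the
lookup for the IOAPIC domain returns the outermost entry, and the
lookup for a domain outside the chain returns NULL.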

> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index b0f9d16e48f6..46e047c414bc 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -77,6 +89,7 @@ struct irq_domain_chip_generic;
>   * @ops: pointer to irq_domain methods
>   * @host_data: private data pointer for use by owner.  Not touched by irq_domain
>   *             core code.
> + * @flags: host per irqdomain flags

                       irq_domain ?

>   *
>   * Optional elements
>   * @of_node: Pointer to device tree nodes associated with the irq_domain. Used
> @@ -84,6 +97,7 @@ struct irq_domain_chip_generic;
>   * @gc: Pointer to a list of generic chips. There is a helper function for
>   *      setting up one or more generic chips for interrupt controllers
>   *      drivers using the generic chip library which uses this pointer.
> + * @parent: Pointer to parent irqdomain to support hierarchy irqdomains

                                 irq_domain ?                   irq_domains ?

>   *
>   * Revmap data, used internally by irq_domain
>   * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that

> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 6534ff6ce02e..26628239088c 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -709,3 +708,332 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>  	.xlate = irq_domain_xlate_onetwocell,
>  };
>  EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
> +
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node)
> +{
> +	unsigned int hint;
> +
> +	if (virq >= 0) {
> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
> +	} else {
> +		hint = hwirq % nr_irqs;
> +		if (hint == 0)
> +			hint++;
> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
> +		if (virq <= 0 && hint > 1)
> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
> +	}
> +
> +	return virq;
> +}
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_free_desc(virq + i);
> +}
> +
> +static void irq_domain_insert_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = virq;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +
> +		/* If not already assigned, give the domain the chip's name */
> +		if (!domain->name && data->chip)
> +			domain->name = data->chip->name;
> +	}
> +
> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
> +}
> +
> +static void irq_domain_remove_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
> +	irq_set_chip_and_handler(virq, NULL, NULL);
> +	synchronize_irq(virq);
> +	smp_mb();
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = 0;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_delete(&domain->revmap_tree, hwirq);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +	}
> +}
> +
> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
> +						   struct irq_data *child)
> +{
> +	struct irq_data *irq_data;
> +
> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
> +	if (irq_data) {
> +		child->parent_data = irq_data;
> +		irq_data->irq = child->irq;
> +		irq_data->node = child->node;
> +		irq_data->domain = domain;
> +	}
> +
> +	return irq_data;
> +}
> +
> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data, *tmp;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_data = irq_get_irq_data(virq + i);
> +		tmp = irq_data->parent_data;
> +		irq_data->parent_data = NULL;
> +		irq_data->domain = NULL;
> +
> +		while (tmp) {
> +			irq_data = tmp;
> +			tmp = tmp->parent_data;
> +			kfree(irq_data);
> +		}
> +	}
> +}
> +
> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
> +				     unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data;
> +	struct irq_domain *parent;
> +
> +	/* The outmost irq_data is embedded in struct irq_desc */

	       outermost

> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_data = irq_get_irq_data(virq + i);
> +		irq_data->domain = domain;
> +
> +		for (parent = domain->parent; parent; parent = parent->parent) {
> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
> +			if (!irq_data) {
> +				irq_domain_free_irq_data(virq, i + 1);
> +				return -ENOMEM;
> +			}
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain

                                             associated

> + * @domain: domain to match
> + * @virq: IRQ number to get irq_data
> + */
> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +					 unsigned int virq)
> +{
> +	struct irq_data *irq_data;
> +
> +	for (irq_data = irq_get_irq_data(virq); irq_data;
> +	     irq_data = irq_data->parent_data)
> +		if (irq_data->domain == domain)
> +			return irq_data;
> +
> +	return NULL;
> +}
> +
> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
> +				  void *chip_data)
> +{
> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
> +
> +	if (!irq_data)
> +		return -ENOENT;
> +
> +	irq_data->hwirq = hwirq;
> +	irq_data->chip = chip;
> +	irq_data->chip_data = chip_data;
> +
> +	return 0;
> +}
> +
> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
> +{
> +	irq_data->hwirq = 0;
> +	irq_data->chip = NULL;
> +	irq_data->chip_data = NULL;
> +}
> +
> +/**
> + * __irq_domain_alloc_irqs - Allocate IRQs from domain
> + * @domain: domain to allocate from
> + * @irq_base: allocate specified IRQ nubmer if irq_base >= 0
> + * @nr_irqs: number of IRQs to allocate
> + * @node: NUMA node id for memory allocation
> + * @arg: domain specific argument
> + * @realloc: IRQ descriptors have already been allocated if true
> + *
> + * Allocate IRQ numbers and initialized all data structures to support
> + * hiearchy IRQ domains.
> + * Parameter @realloc is mainly to support legacy IRQs.
> + * Returns error code or allocated IRQ number
> + */
> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +			    unsigned int nr_irqs, int node, void *arg,
> +			    bool realloc)
> +{
> +	int i, ret, virq;
> +
> +	if (domain == NULL) {
> +		domain = irq_default_domain;
> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
> +			return -EINVAL;
> +	}
> +
> +	if (!domain->ops->alloc) {
> +		pr_debug("domain->ops->alloc() is NULL\n");
> +		return -ENOSYS;
> +	}
> +
> +	if (realloc && irq_base >= 0) {
> +		virq =  irq_base;
> +	} else {
> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
> +		if (virq < 0) {
> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
> +				 irq_base, nr_irqs);
> +			return virq;
> +		}
> +	}
> +
> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
> +		ret = -ENOMEM;
> +		goto out_free_desc;
> +	}
> +
> +	mutex_lock(&irq_domain_mutex);
> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
> +	if (ret < 0) {
> +		mutex_unlock(&irq_domain_mutex);
> +		goto out_free_irq_data;
> +	}
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_insert_irq(virq + i);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	return virq;
> +
> +out_free_irq_data:
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +out_free_desc:
> +	irq_domain_free_descs(virq, nr_irqs);
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_free_irqs - Free IRQ number and assoicated data structures

                                                 associated

> + * @virq: base IRQ number
> + * @nr_irqs: number of IRQs to free
> + */
> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *data = irq_get_irq_data(virq);
> +
> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
> +		 "NULL pointer, cannot free irq\n"))
> +		return;
> +
> +	mutex_lock(&irq_domain_mutex);
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_remove_irq(virq + i);
> +	data->domain->ops->free(data->domain, virq, nr_irqs);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +	irq_domain_free_descs(virq, nr_irqs);
> +}
> +
> +/**
> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
> + *			     interrupt
> + * @irq_data: out most irq_data associated with interrupt

                 outermost

> + *
> + * It calls domain_ops->activate to program interrupt controllers, so the
> + * interrupt could actually delivered.
> + */
> +int irq_domain_activate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (irq_data->parent_data)
> +			ret = irq_domain_activate_irq(irq_data->parent_data);
> +		if (ret == 0 && domain->ops->activate)
> +			ret = domain->ops->activate(domain, irq_data);
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
> + *			       deactivate interrupt
> + * @irq_data: out most irq_data associated with interrupt

                 outermost

> + *
> + * It calls domain_ops->deactivate to program interrupt controllers to disable
> + * interrupt delivery.
> + */
> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (domain->ops->deactivate)
> +			ret = domain->ops->deactivate(domain, irq_data);
> +		if (ret == 0 && irq_data->parent_data)
> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
> +	}
> +
> +	return ret;
> +}
> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +/**
> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain

                                             associated

> + * @domain: domain to match
> + * @virq: IRQ number to get irq_data
> + */
> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +					 unsigned int virq)
> +{
> +	struct irq_data *irq_data = irq_get_irq_data(virq);
> +
> +	return (irq_data && irq_data->domain == domain) ? irq_data : NULL;
> +}
> +
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> 
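
The ordering used by the two functions quoted last (parents are
activated before the child, the child is deactivated before its
parents) can be illustrated with a small userspace sketch. This is a
simplified model, not the kernel code: model_level, model_trace(),
model_activate() and model_deactivate() are hypothetical names, the
per-level callback is replaced by appending to a trace string, and
unlike the quoted irq_domain_deactivate_irq() the model's deactivate
step cannot fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one irq_data level in the hierarchy. */
struct model_level {
	const char *name;
	struct model_level *parent;	/* next level towards the CPU */
	int fail_activate;		/* make this level's activate step fail */
};

/* Append "<prefix><name> " to a caller-provided, NUL-terminated buffer. */
static void model_trace(char *log, size_t len, const char *prefix,
			const char *name)
{
	size_t used = strlen(log);

	if (used < len)
		snprintf(log + used, len - used, "%s%s ", prefix, name);
}

/* Activate the parents first, then this level. */
static int model_activate(struct model_level *level, char *log, size_t len)
{
	int ret = 0;

	if (!level)
		return 0;
	if (level->parent)
		ret = model_activate(level->parent, log, len);
	if (ret == 0) {
		if (level->fail_activate)
			return -1;
		model_trace(log, len, "+", level->name);
	}
	return ret;
}

/* Deactivate this level first, then its parents. */
static int model_deactivate(struct model_level *level, char *log, size_t len)
{
	if (!level)
		return 0;
	model_trace(log, len, "-", level->name);
	return model_deactivate(level->parent, log, len);
}
```

In this model, a failure at an intermediate level stops the walk but
leaves the levels that were already activated as they are; whether a
real caller needs to roll those back is outside what the sketch shows.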


-- 
~Randy


* [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
@ 2014-09-22 17:30         ` Randy Dunlap
  0 siblings, 0 replies; 110+ messages in thread
From: Randy Dunlap @ 2014-09-22 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/22/14 01:17, Jiang Liu wrote:
> ---
>  Documentation/IRQ-domain.txt |   71 +++++++++
>  include/linux/irq.h          |    3 +
>  include/linux/irqdomain.h    |   86 ++++++++++
>  kernel/irq/Kconfig           |    3 +
>  kernel/irq/chip.c            |    3 +
>  kernel/irq/irqdomain.c       |  360 ++++++++++++++++++++++++++++++++++++++++--
>  6 files changed, 510 insertions(+), 16 deletions(-)
> 
> diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
> index 8a8b82c9ca53..062f6b6088b4 100644
> --- a/Documentation/IRQ-domain.txt
> +++ b/Documentation/IRQ-domain.txt
> @@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure
>  that the driver using the simple domain call irq_create_mapping()
>  before any irq_find_mapping() since the latter will actually work
>  for the static IRQ assignment case.
> +
> +==== Hierarchy IRQ domain ====
> +On some architectures, there may be multiple interrupt controllers
> +involved in delivering an interrupt from the device to the target CPU.
> +Let's look at a typical interrupt delivering path on x86 platforms:
> +
> +Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
> +
> +There are three interrupt controllers involved:
> +1) IOAPIC controller
> +2) Interrupt remapping controller
> +3) Local APIC controller
> +
> +To support such a hardware topology and make software architecture match
> +hardware architecture, an irq_domain data structure is built for each
> +interrupt controller and those irq_domains are organized into hierarchy.
> +When building irq_domain hierarchy, the irq_domain near to the device is
> +child and the irq_domain near to CPU is parent. So a hierarchy structure
> +as below will be built for the example above.
> +	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
> +		^
> +		|
> +	Interrupt Remapping irq_domain (manage irq_remapping entries)
> +		^
> +		|
> +	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
> +
> +There are four major interfaces to use hierarchy irq_domain:
> +1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
> +   controller related resources to deliver these interrupts.
> +2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controler

                                                                 controller

> +   related resources associated with these interrupts.
> +3) irq_domain_activate_irq(): activate interrupt controller hardware to
> +   deliver the interrupt.
> +3) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
> +   to stopping delivering the interrupt.

      to stop

> +
> +Following changes are needed to support hierarchy irq_domain.
> +1) a new field 'parent' is added to struct irq_domain, it's used to

                                              irq_domain;

> +   maintain irq_domain hierarchy information.
> +2) a new field 'parent_data' is added to struct irq_data, it's used to

                                                   irq_data;

> +   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
> +   is used to store irq_domain pointer and hardware irq number.
> +3) new callbacks are added to struct irq_domain_ops to support hierarchy
> +   irq_domain operations.
> +
> +With support of hierarchy irq_domain and hierarchy irq_data ready, an
> +irq_domain structure is built for each interrupt controller, and an
> +irq_data structure is allocated for each irq_domain associated with an
> +IRQ. Now we could go one step further to support stacked(hierarchy)
> +irq_chip. That is, an irq_chip is associated with each irq_data along
> +the hierarchy. A child irq_chip may implement a required action by
> +itself or by cooperating with its parent irq_chip.
> +
> +With stacked irq_chip, interrupt controller driver only needs to deal
> +with the hardware managed by itself and may ask for services from its
> +parent irq_chip when needed. So we could achieve a much more cleaner

                                                    a much cleaner

> +software architecture.
> +
> +For an interrupt controller driver to support hierarchy irq_domain, it
> +needs to:
> +1) Implement irq_domain_ops.alloc and irq_domain_ops.free
> +2) Optionally implement irq_domain_ops.activate and
> +   irq_domain_ops.deactivate.
> +3) Optionally implement an irq_chip to manage the interrupt controller
> +   hardware.
> +4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
> +   they are unused with hierarchy irq_domain.
> +
> +Hierarchy irq_domain may also be used to support other architectures,
> +such as ARM, ARM64 etc.

> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index b0f9d16e48f6..46e047c414bc 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -77,6 +89,7 @@ struct irq_domain_chip_generic;
>   * @ops: pointer to irq_domain methods
>   * @host_data: private data pointer for use by owner.  Not touched by irq_domain
>   *             core code.
> + * @flags: host per irqdomain flags

                       irq_domain ?

>   *
>   * Optional elements
>   * @of_node: Pointer to device tree nodes associated with the irq_domain. Used
> @@ -84,6 +97,7 @@ struct irq_domain_chip_generic;
>   * @gc: Pointer to a list of generic chips. There is a helper function for
>   *      setting up one or more generic chips for interrupt controllers
>   *      drivers using the generic chip library which uses this pointer.
> + * @parent: Pointer to parent irqdomain to support hierarchy irqdomains

                                 irq_domain ?                   irq_domains ?

>   *
>   * Revmap data, used internally by irq_domain
>   * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that

> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 6534ff6ce02e..26628239088c 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -709,3 +708,332 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>  	.xlate = irq_domain_xlate_onetwocell,
>  };
>  EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
> +
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node)
> +{
> +	unsigned int hint;
> +
> +	if (virq >= 0) {
> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
> +	} else {
> +		hint = hwirq % nr_irqs;
> +		if (hint == 0)
> +			hint++;
> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
> +		if (virq <= 0 && hint > 1)
> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
> +	}
> +
> +	return virq;
> +}
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_free_desc(virq + i);
> +}
> +
> +static void irq_domain_insert_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = virq;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +
> +		/* If not already assigned, give the domain the chip's name */
> +		if (!domain->name && data->chip)
> +			domain->name = data->chip->name;
> +	}
> +
> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
> +}
> +
> +static void irq_domain_remove_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
> +	irq_set_chip_and_handler(virq, NULL, NULL);
> +	synchronize_irq(virq);
> +	smp_mb();
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = 0;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_delete(&domain->revmap_tree, hwirq);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +	}
> +}
> +
> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
> +						   struct irq_data *child)
> +{
> +	struct irq_data *irq_data;
> +
> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
> +	if (irq_data) {
> +		child->parent_data = irq_data;
> +		irq_data->irq = child->irq;
> +		irq_data->node = child->node;
> +		irq_data->domain = domain;
> +	}
> +
> +	return irq_data;
> +}
> +
> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data, *tmp;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_data = irq_get_irq_data(virq + i);
> +		tmp = irq_data->parent_data;
> +		irq_data->parent_data = NULL;
> +		irq_data->domain = NULL;
> +
> +		while (tmp) {
> +			irq_data = tmp;
> +			tmp = tmp->parent_data;
> +			kfree(irq_data);
> +		}
> +	}
> +}
> +
> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
> +				     unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data;
> +	struct irq_domain *parent;
> +
> +	/* The outmost irq_data is embedded in struct irq_desc */

	       outermost

> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_data = irq_get_irq_data(virq + i);
> +		irq_data->domain = domain;
> +
> +		for (parent = domain->parent; parent; parent = parent->parent) {
> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
> +			if (!irq_data) {
> +				irq_domain_free_irq_data(virq, i + 1);
> +				return -ENOMEM;
> +			}
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain

                                             associated

> + * @domain: domain to match
> + * @virq: IRQ number to get irq_data
> + */
> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +					 unsigned int virq)
> +{
> +	struct irq_data *irq_data;
> +
> +	for (irq_data = irq_get_irq_data(virq); irq_data;
> +	     irq_data = irq_data->parent_data)
> +		if (irq_data->domain == domain)
> +			return irq_data;
> +
> +	return NULL;
> +}
> +
> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
> +				  void *chip_data)
> +{
> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
> +
> +	if (!irq_data)
> +		return -ENOENT;
> +
> +	irq_data->hwirq = hwirq;
> +	irq_data->chip = chip;
> +	irq_data->chip_data = chip_data;
> +
> +	return 0;
> +}
> +
> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
> +{
> +	irq_data->hwirq = 0;
> +	irq_data->chip = NULL;
> +	irq_data->chip_data = NULL;
> +}
> +
> +/**
> + * __irq_domain_alloc_irqs - Allocate IRQs from domain
> + * @domain: domain to allocate from
> + * @irq_base: allocate specified IRQ nubmer if irq_base >= 0
> + * @nr_irqs: number of IRQs to allocate
> + * @node: NUMA node id for memory allocation
> + * @arg: domain specific argument
> + * @realloc: IRQ descriptors have already been allocated if true
> + *
> + * Allocate IRQ numbers and initialized all data structures to support
> + * hiearchy IRQ domains.
> + * Parameter @realloc is mainly to support legacy IRQs.
> + * Returns error code or allocated IRQ number
> + */
> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +			    unsigned int nr_irqs, int node, void *arg,
> +			    bool realloc)
> +{
> +	int i, ret, virq;
> +
> +	if (domain == NULL) {
> +		domain = irq_default_domain;
> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
> +			return -EINVAL;
> +	}
> +
> +	if (!domain->ops->alloc) {
> +		pr_debug("domain->ops->alloc() is NULL\n");
> +		return -ENOSYS;
> +	}
> +
> +	if (realloc && irq_base >= 0) {
> +		virq =  irq_base;
> +	} else {
> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
> +		if (virq < 0) {
> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
> +				 irq_base, nr_irqs);
> +			return virq;
> +		}
> +	}
> +
> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
> +		ret = -ENOMEM;
> +		goto out_free_desc;
> +	}
> +
> +	mutex_lock(&irq_domain_mutex);
> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
> +	if (ret < 0) {
> +		mutex_unlock(&irq_domain_mutex);
> +		goto out_free_irq_data;
> +	}
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_insert_irq(virq + i);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	return virq;
> +
> +out_free_irq_data:
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +out_free_desc:
> +	irq_domain_free_descs(virq, nr_irqs);
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_free_irqs - Free IRQ number and assoicated data structures

                                                 associated

> + * @virq: base IRQ number
> + * @nr_irqs: number of IRQs to free
> + */
> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *data = irq_get_irq_data(virq);
> +
> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
> +		 "NULL pointer, cannot free irq\n"))
> +		return;
> +
> +	mutex_lock(&irq_domain_mutex);
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_remove_irq(virq + i);
> +	data->domain->ops->free(data->domain, virq, nr_irqs);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +	irq_domain_free_descs(virq, nr_irqs);
> +}
> +
> +/**
> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
> + *			     interrupt
> + * @irq_data: out most irq_data associated with interrupt

                 outermost

> + *
> + * It calls domain_ops->activate to program interrupt controllers, so the
> + * interrupt could actually delivered.
> + */
> +int irq_domain_activate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (irq_data->parent_data)
> +			ret = irq_domain_activate_irq(irq_data->parent_data);
> +		if (ret == 0 && domain->ops->activate)
> +			ret = domain->ops->activate(domain, irq_data);
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
> + *			       deactivate interrupt
> + * @irq_data: out most irq_data associated with interrupt

                 outermost

> + *
> + * It calls domain_ops->deactivate to program interrupt controllers to disable
> + * interrupt delivery.
> + */
> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (domain->ops->deactivate)
> +			ret = domain->ops->deactivate(domain, irq_data);
> +		if (ret == 0 && irq_data->parent_data)
> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
> +	}
> +
> +	return ret;
> +}
> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +/**
> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain

                                             associated

> + * @domain: domain to match
> + * @virq: IRQ number to get irq_data
> + */
> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +					 unsigned int virq)
> +{
> +	struct irq_data *irq_data = irq_get_irq_data(virq);
> +
> +	return (irq_data && irq_data->domain == domain) ? irq_data : NULL;
> +}
> +
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> 


-- 
~Randy

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-22  8:17       ` Jiang Liu
@ 2014-09-23  9:43         ` Joe.C
  -1 siblings, 0 replies; 110+ messages in thread
From: Joe.C @ 2014-09-23  9:43 UTC (permalink / raw)
  To: Jiang Liu
  Cc: x86, Tony Luck, linux-acpi, Konrad Rzeszutek Wilk, Marc Zyngier,
	Benjamin Herrenschmidt, Joerg Roedel, Randy Dunlap,
	Rafael J. Wysocki, Greg Kroah-Hartman, linux-pci, linux-kernel,
	Grant Likely, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Bjorn Helgaas, Thomas Gleixner, Yinghai Lu, Andrew Morton,
	linux-arm-kernel

On Mon, 2014-09-22 at 16:17 +0800, Jiang Liu wrote:
> @@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
>  unsigned int irq_create_mapping(struct irq_domain *domain,
>  				irq_hw_number_t hwirq)
>  {
> -	unsigned int hint;
>  	int virq;
>  
>  	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
> @@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>  	}
>  
>  	/* Allocate a virtual interrupt number */
> -	hint = hwirq % nr_irqs;
> -	if (hint == 0)
> -		hint++;
> -	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
> -	if (virq <= 0)
> -		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
> +	virq = irq_domain_alloc_descs(-1, 1, hwirq,
> +				      of_node_to_nid(domain->of_node));

If I read this correctly, the resulting virq is different after your
change.
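
One possible reading of this remark, sketched in userspace C: inside the new irq_domain_alloc_descs() helper (quoted elsewhere in this thread) the count parameter is itself named nr_irqs, so it hides the kernel-wide nr_irqs that the removed code used. For a single-descriptor request the hint then becomes hwirq % 1, which is always 0 and is bumped to 1. The names old_hint(), new_hint() and global_nr_irqs, and the value 256, are illustrative assumptions and not kernel code.

```c
#include <assert.h>

/* Stand-in for the kernel-wide nr_irqs; 256 is an arbitrary assumed value. */
static unsigned int global_nr_irqs = 256;

/* Hint as computed by the code removed from irq_create_mapping(). */
static unsigned int old_hint(unsigned long hwirq)
{
	unsigned int hint = hwirq % global_nr_irqs;

	return hint ? hint : 1;
}

/*
 * Hint as computed in the new helper, where the count parameter is also
 * called nr_irqs and therefore hides the global of the same name.
 */
static unsigned int new_hint(unsigned long hwirq, unsigned int nr_irqs)
{
	unsigned int hint;

	assert(nr_irqs > 0);	/* a zero count would divide by zero */
	hint = hwirq % nr_irqs;

	return hint ? hint : 1;
}
```

Under these assumptions old_hint(42) yields 42 while new_hint(42, 1) yields 1, so the descriptor search would start at virq 1 instead of at a hwirq-derived number.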

>  	if (virq <= 0) {
>  		pr_debug("-> virq allocation failed\n");
>  		return 0;
> @@ -490,7 +486,10 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
>  	}
>  
>  	/* Create mapping */
> -	virq = irq_create_mapping(domain, hwirq);
> +	if (irq_domain_is_hierarchy(domain))
> +		virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, irq_data);
> +	else
> +		virq = irq_create_mapping(domain, hwirq);
>  	if (!virq)
>  		return virq;

The hwirq returned from xlate above is lost. Without hwirq or virq, how do
we know which IRQ we are working on?
Also, if the irq_desc/irq_data was already created, this will create
another one. Should we call irq_find_mapping() first, just as
irq_create_mapping() does?
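
A userspace sketch of the lookup-before-allocate idea raised here. The toy_domain type, its fixed-size table and both function names are hypothetical; only the pattern (check for an existing hwirq-to-virq mapping before allocating a new one) is taken from the discussion.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HWIRQ 16

/* Hypothetical miniature domain: revmap[hwirq] holds the virq, 0 = unmapped. */
struct toy_domain {
	unsigned int revmap[TOY_MAX_HWIRQ];
	unsigned int next_virq;	/* next free virq; virq 0 is never handed out */
};

static unsigned int toy_find_mapping(const struct toy_domain *d,
				     unsigned long hwirq)
{
	return hwirq < TOY_MAX_HWIRQ ? d->revmap[hwirq] : 0;
}

/* Return the existing virq for hwirq, or allocate one; 0 on failure. */
static unsigned int toy_create_mapping(struct toy_domain *d,
				       unsigned long hwirq)
{
	unsigned int virq;

	assert(d != NULL);
	if (hwirq >= TOY_MAX_HWIRQ)
		return 0;

	virq = toy_find_mapping(d, hwirq);
	if (virq)
		return virq;	/* already mapped: do not allocate a second one */

	virq = d->next_virq++;
	d->revmap[hwirq] = virq;
	return virq;
}
```

Calling toy_create_mapping() twice with the same hwirq returns the same virq, which is the behaviour the question above asks the hierarchy path to preserve.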

> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +			    unsigned int nr_irqs, int node, void *arg,
> +			    bool realloc)
> +{
> +	int i, ret, virq;
> +
> +	if (domain == NULL) {
> +		domain = irq_default_domain;
> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
> +			return -EINVAL;
> +	}
> +
> +	if (!domain->ops->alloc) {
> +		pr_debug("domain->ops->alloc() is NULL\n");
> +		return -ENOSYS;
> +	}
> +
> +	if (realloc && irq_base >= 0) {
> +		virq =  irq_base;
                         ^
extra space here.

Joe.C

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-22 17:30         ` Randy Dunlap
  (?)
@ 2014-09-24  5:26           ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-24  5:26 UTC (permalink / raw)
  To: Randy Dunlap, Benjamin Herrenschmidt, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, linux-arm-kernel

Thanks, Randy! I will fix these issues in the next version.
Regards!
Gerry

On 2014/9/23 1:30, Randy Dunlap wrote:
> On 09/22/14 01:17, Jiang Liu wrote:
>> ---
>>  Documentation/IRQ-domain.txt |   71 +++++++++
>>  include/linux/irq.h          |    3 +
>>  include/linux/irqdomain.h    |   86 ++++++++++
>>  kernel/irq/Kconfig           |    3 +
>>  kernel/irq/chip.c            |    3 +
>>  kernel/irq/irqdomain.c       |  360 ++++++++++++++++++++++++++++++++++++++++--
>>  6 files changed, 510 insertions(+), 16 deletions(-)
>>
>> diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
>> index 8a8b82c9ca53..062f6b6088b4 100644
>> --- a/Documentation/IRQ-domain.txt
>> +++ b/Documentation/IRQ-domain.txt
>> @@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure
>>  that the driver using the simple domain call irq_create_mapping()
>>  before any irq_find_mapping() since the latter will actually work
>>  for the static IRQ assignment case.
>> +
>> +==== Hierarchy IRQ domain ====
>> +On some architectures, there may be multiple interrupt controllers
>> +involved in delivering an interrupt from the device to the target CPU.
>> +Let's look at a typical interrupt delivering path on x86 platforms:
>> +
>> +Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
>> +
>> +There are three interrupt controllers involved:
>> +1) IOAPIC controller
>> +2) Interrupt remapping controller
>> +3) Local APIC controller
>> +
>> +To support such a hardware topology and make software architecture match
>> +hardware architecture, an irq_domain data structure is built for each
>> +interrupt controller and those irq_domains are organized into hierarchy.
>> +When building irq_domain hierarchy, the irq_domain near to the device is
>> +child and the irq_domain near to CPU is parent. So a hierarchy structure
>> +as below will be built for the example above.
>> +	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
>> +		^
>> +		|
>> +	Interrupt Remapping irq_domain (manage irq_remapping entries)
>> +		^
>> +		|
>> +	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
>> +
>> +There are four major interfaces to use hierarchy irq_domain:
>> +1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
>> +   controller related resources to deliver these interrupts.
>> +2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controler
> 
>                                                                  controller
> 
>> +   related resources associated with these interrupts.
>> +3) irq_domain_activate_irq(): activate interrupt controller hardware to
>> +   deliver the interrupt.
>> +3) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
>> +   to stopping delivering the interrupt.
> 
>       to stop
> 
>> +
>> +Following changes are needed to support hierarchy irq_domain.
>> +1) a new field 'parent' is added to struct irq_domain, it's used to
> 
>                                               irq_domain;
> 
>> +   maintain irq_domain hierarchy information.
>> +2) a new field 'parent_data' is added to struct irq_data, it's used to
> 
>                                                    irq_data;
> 
>> +   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
>> +   is used to store irq_domain pointer and hardware irq number.
>> +3) new callbacks are added to struct irq_domain_ops to support hierarchy
>> +   irq_domain operations.
>> +
>> +With support of hierarchy irq_domain and hierarchy irq_data ready, an
>> +irq_domain structure is built for each interrupt controller, and an
>> +irq_data structure is allocated for each irq_domain associated with an
>> +IRQ. Now we could go one step further to support stacked(hierarchy)
>> +irq_chip. That is, an irq_chip is associated with each irq_data along
>> +the hierarchy. A child irq_chip may implement a required action by
>> +itself or by cooperating with its parent irq_chip.
>> +
>> +With stacked irq_chip, interrupt controller driver only needs to deal
>> +with the hardware managed by itself and may ask for services from its
>> +parent irq_chip when needed. So we could achieve a much more cleaner
> 
>                                                     a much cleaner
> 
>> +software architecture.
>> +
>> +For an interrupt controller driver to support hierarchy irq_domain, it
>> +needs to:
>> +1) Implement irq_domain_ops.alloc and irq_domain_ops.free
>> +2) Optionally implement irq_domain_ops.activate and
>> +   irq_domain_ops.deactivate.
>> +3) Optionally implement an irq_chip to manage the interrupt controller
>> +   hardware.
>> +4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
>> +   they are unused with hierarchy irq_domain.
>> +
>> +Hierarchy irq_domain may also be used to support other architectures,
>> +such as ARM, ARM64 etc.
> 
>> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
>> index b0f9d16e48f6..46e047c414bc 100644
>> --- a/include/linux/irqdomain.h
>> +++ b/include/linux/irqdomain.h
>> @@ -77,6 +89,7 @@ struct irq_domain_chip_generic;
>>   * @ops: pointer to irq_domain methods
>>   * @host_data: private data pointer for use by owner.  Not touched by irq_domain
>>   *             core code.
>> + * @flags: host per irqdomain flags
> 
>                        irq_domain ?
> 
>>   *
>>   * Optional elements
>>   * @of_node: Pointer to device tree nodes associated with the irq_domain. Used
>> @@ -84,6 +97,7 @@ struct irq_domain_chip_generic;
>>   * @gc: Pointer to a list of generic chips. There is a helper function for
>>   *      setting up one or more generic chips for interrupt controllers
>>   *      drivers using the generic chip library which uses this pointer.
>> + * @parent: Pointer to parent irqdomain to support hierarchy irqdomains
> 
>                                  irq_domain ?                   irq_domains ?
> 
>>   *
>>   * Revmap data, used internally by irq_domain
>>   * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that
> 
>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>> index 6534ff6ce02e..26628239088c 100644
>> --- a/kernel/irq/irqdomain.c
>> +++ b/kernel/irq/irqdomain.c
>> @@ -709,3 +708,332 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>>  	.xlate = irq_domain_xlate_onetwocell,
>>  };
>>  EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
>> +
>> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
>> +				  irq_hw_number_t hwirq, int node)
>> +{
>> +	unsigned int hint;
>> +
>> +	if (virq >= 0) {
>> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
>> +	} else {
>> +		hint = hwirq % nr_irqs;
>> +		if (hint == 0)
>> +			hint++;
>> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
>> +		if (virq <= 0 && hint > 1)
>> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
>> +	}
>> +
>> +	return virq;
>> +}
>> +
>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_free_desc(virq + i);
>> +}
>> +
>> +static void irq_domain_insert_irq(int virq)
>> +{
>> +	struct irq_data *data;
>> +
>> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
>> +		struct irq_domain *domain = data->domain;
>> +		irq_hw_number_t hwirq = data->hwirq;
>> +
>> +		if (hwirq < domain->revmap_size) {
>> +			domain->linear_revmap[hwirq] = virq;
>> +		} else {
>> +			mutex_lock(&revmap_trees_mutex);
>> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
>> +			mutex_unlock(&revmap_trees_mutex);
>> +		}
>> +
>> +		/* If not already assigned, give the domain the chip's name */
>> +		if (!domain->name && data->chip)
>> +			domain->name = data->chip->name;
>> +	}
>> +
>> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
>> +}
>> +
>> +static void irq_domain_remove_irq(int virq)
>> +{
>> +	struct irq_data *data;
>> +
>> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
>> +	irq_set_chip_and_handler(virq, NULL, NULL);
>> +	synchronize_irq(virq);
>> +	smp_mb();
>> +
>> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
>> +		struct irq_domain *domain = data->domain;
>> +		irq_hw_number_t hwirq = data->hwirq;
>> +
>> +		if (hwirq < domain->revmap_size) {
>> +			domain->linear_revmap[hwirq] = 0;
>> +		} else {
>> +			mutex_lock(&revmap_trees_mutex);
>> +			radix_tree_delete(&domain->revmap_tree, hwirq);
>> +			mutex_unlock(&revmap_trees_mutex);
>> +		}
>> +	}
>> +}
>> +
>> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
>> +						   struct irq_data *child)
>> +{
>> +	struct irq_data *irq_data;
>> +
>> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
>> +	if (irq_data) {
>> +		child->parent_data = irq_data;
>> +		irq_data->irq = child->irq;
>> +		irq_data->node = child->node;
>> +		irq_data->domain = domain;
>> +	}
>> +
>> +	return irq_data;
>> +}
>> +
>> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data, *tmp;
>> +
>> +	for (i = 0; i < nr_irqs; i++) {
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		tmp = irq_data->parent_data;
>> +		irq_data->parent_data = NULL;
>> +		irq_data->domain = NULL;
>> +
>> +		while (tmp) {
>> +			irq_data = tmp;
>> +			tmp = tmp->parent_data;
>> +			kfree(irq_data);
>> +		}
>> +	}
>> +}
>> +
>> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
>> +				     unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data;
>> +	struct irq_domain *parent;
>> +
>> +	/* The outmost irq_data is embedded in struct irq_desc */
> 
> 	       outermost
> 
>> +	for (i = 0; i < nr_irqs; i++) {
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		irq_data->domain = domain;
>> +
>> +		for (parent = domain->parent; parent; parent = parent->parent) {
>> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
>> +			if (!irq_data) {
>> +				irq_domain_free_irq_data(virq, i + 1);
>> +				return -ENOMEM;
>> +			}
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain
> 
>                                              associated
> 
>> + * @domain: domain to match
>> + * @virq: IRQ number to get irq_data
>> + */
>> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
>> +					 unsigned int virq)
>> +{
>> +	struct irq_data *irq_data;
>> +
>> +	for (irq_data = irq_get_irq_data(virq); irq_data;
>> +	     irq_data = irq_data->parent_data)
>> +		if (irq_data->domain == domain)
>> +			return irq_data;
>> +
>> +	return NULL;
>> +}
>> +
>> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
>> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
>> +				  void *chip_data)
>> +{
>> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
>> +
>> +	if (!irq_data)
>> +		return -ENOENT;
>> +
>> +	irq_data->hwirq = hwirq;
>> +	irq_data->chip = chip;
>> +	irq_data->chip_data = chip_data;
>> +
>> +	return 0;
>> +}
>> +
>> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
>> +{
>> +	irq_data->hwirq = 0;
>> +	irq_data->chip = NULL;
>> +	irq_data->chip_data = NULL;
>> +}
>> +
>> +/**
>> + * __irq_domain_alloc_irqs - Allocate IRQs from domain
>> + * @domain: domain to allocate from
>> + * @irq_base: allocate specified IRQ nubmer if irq_base >= 0
>> + * @nr_irqs: number of IRQs to allocate
>> + * @node: NUMA node id for memory allocation
>> + * @arg: domain specific argument
>> + * @realloc: IRQ descriptors have already been allocated if true
>> + *
>> + * Allocate IRQ numbers and initialized all data structures to support
>> + * hiearchy IRQ domains.
>> + * Parameter @realloc is mainly to support legacy IRQs.
>> + * Returns error code or allocated IRQ number
>> + */
>> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
>> +			    unsigned int nr_irqs, int node, void *arg,
>> +			    bool realloc)
>> +{
>> +	int i, ret, virq;
>> +
>> +	if (domain == NULL) {
>> +		domain = irq_default_domain;
>> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
>> +			return -EINVAL;
>> +	}
>> +
>> +	if (!domain->ops->alloc) {
>> +		pr_debug("domain->ops->alloc() is NULL\n");
>> +		return -ENOSYS;
>> +	}
>> +
>> +	if (realloc && irq_base >= 0) {
>> +		virq =  irq_base;
>> +	} else {
>> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
>> +		if (virq < 0) {
>> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
>> +				 irq_base, nr_irqs);
>> +			return virq;
>> +		}
>> +	}
>> +
>> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
>> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
>> +		ret = -ENOMEM;
>> +		goto out_free_desc;
>> +	}
>> +
>> +	mutex_lock(&irq_domain_mutex);
>> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
>> +	if (ret < 0) {
>> +		mutex_unlock(&irq_domain_mutex);
>> +		goto out_free_irq_data;
>> +	}
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_domain_insert_irq(virq + i);
>> +	mutex_unlock(&irq_domain_mutex);
>> +
>> +	return virq;
>> +
>> +out_free_irq_data:
>> +	irq_domain_free_irq_data(virq, nr_irqs);
>> +out_free_desc:
>> +	irq_domain_free_descs(virq, nr_irqs);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * irq_domain_free_irqs - Free IRQ number and assoicated data structures
> 
>                                                  associated
> 
>> + * @virq: base IRQ number
>> + * @nr_irqs: number of IRQs to free
>> + */
>> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *data = irq_get_irq_data(virq);
>> +
>> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
>> +		 "NULL pointer, cannot free irq\n"))
>> +		return;
>> +
>> +	mutex_lock(&irq_domain_mutex);
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_domain_remove_irq(virq + i);
>> +	data->domain->ops->free(data->domain, virq, nr_irqs);
>> +	mutex_unlock(&irq_domain_mutex);
>> +
>> +	irq_domain_free_irq_data(virq, nr_irqs);
>> +	irq_domain_free_descs(virq, nr_irqs);
>> +}
>> +
>> +/**
>> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
>> + *			     interrupt
>> + * @irq_data: out most irq_data associated with interrupt
> 
>                  outermost
> 
>> + *
>> + * It calls domain_ops->activate to program interrupt controllers, so the
>> + * interrupt could actually delivered.
>> + */
>> +int irq_domain_activate_irq(struct irq_data *irq_data)
>> +{
>> +	int ret = 0;
>> +
>> +	if (irq_data && irq_data->domain) {
>> +		struct irq_domain *domain = irq_data->domain;
>> +
>> +		if (irq_data->parent_data)
>> +			ret = irq_domain_activate_irq(irq_data->parent_data);
>> +		if (ret == 0 && domain->ops->activate)
>> +			ret = domain->ops->activate(domain, irq_data);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +/**
>> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
>> + *			       deactivate interrupt
>> + * @irq_data: out most irq_data associated with interrupt
> 
>                  outermost
> 
>> + *
>> + * It calls domain_ops->deactivate to program interrupt controllers to disable
>> + * interrupt delivery.
>> + */
>> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
>> +{
>> +	int ret = 0;
>> +
>> +	if (irq_data && irq_data->domain) {
>> +		struct irq_domain *domain = irq_data->domain;
>> +
>> +		if (domain->ops->deactivate)
>> +			ret = domain->ops->deactivate(domain, irq_data);
>> +		if (ret == 0 && irq_data->parent_data)
>> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
>> +	}
>> +
>> +	return ret;
>> +}
>> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>> +/**
>> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain
> 
>                                              associated
> 
>> + * @domain: domain to match
>> + * @virq: IRQ number to get irq_data
>> + */
>> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
>> +					 unsigned int virq)
>> +{
>> +	struct irq_data *irq_data = irq_get_irq_data(virq);
>> +
>> +	return (irq_data && irq_data->domain == domain) ? irq_data : NULL;
>> +}
>> +
>> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>>
> 
> 
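
For readers following the quoted patch, the call order of irq_domain_activate_irq() and irq_domain_deactivate_irq() can be modelled in userspace. This is only a sketch: the toy_irq_data type, the trace buffer and all toy_* names are hypothetical, and error propagation from the real callbacks is left out. It shows that activation walks to the root of the parent_data chain before touching the current level, while deactivation handles the current level first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one level of the irq_data chain. */
struct toy_irq_data {
	int id;					/* identifies the level in the trace */
	struct toy_irq_data *parent_data;	/* next level towards the CPU */
};

#define TOY_TRACE_MAX 8
static int toy_trace[TOY_TRACE_MAX];
static size_t toy_trace_len;

static void toy_record(int id)
{
	assert(toy_trace_len < TOY_TRACE_MAX);
	toy_trace[toy_trace_len++] = id;
}

/* Parents first, then this level, as in irq_domain_activate_irq(). */
static void toy_activate(struct toy_irq_data *data)
{
	if (!data)
		return;
	toy_activate(data->parent_data);
	toy_record(data->id);
}

/* This level first, then parents, as in irq_domain_deactivate_irq(). */
static void toy_deactivate(struct toy_irq_data *data)
{
	if (!data)
		return;
	toy_record(data->id);
	toy_deactivate(data->parent_data);
}
```

With a three-level chain standing in for IOAPIC, interrupt remapping and CPU vector, toy_activate() on the IOAPIC level records the vector level first and the IOAPIC level last; toy_deactivate() records them in the opposite order.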

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
@ 2014-09-24  5:26           ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-24  5:26 UTC (permalink / raw)
  To: Randy Dunlap, Benjamin Herrenschmidt, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

Thanks, Randy! I will fix these issues in the next version.
Regards!
Gerry

On 2014/9/23 1:30, Randy Dunlap wrote:
> On 09/22/14 01:17, Jiang Liu wrote:
>> ---
>>  Documentation/IRQ-domain.txt |   71 +++++++++
>>  include/linux/irq.h          |    3 +
>>  include/linux/irqdomain.h    |   86 ++++++++++
>>  kernel/irq/Kconfig           |    3 +
>>  kernel/irq/chip.c            |    3 +
>>  kernel/irq/irqdomain.c       |  360 ++++++++++++++++++++++++++++++++++++++++--
>>  6 files changed, 510 insertions(+), 16 deletions(-)
>>
>> diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
>> index 8a8b82c9ca53..062f6b6088b4 100644
>> --- a/Documentation/IRQ-domain.txt
>> +++ b/Documentation/IRQ-domain.txt
>> @@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure
>>  that the driver using the simple domain call irq_create_mapping()
>>  before any irq_find_mapping() since the latter will actually work
>>  for the static IRQ assignment case.
>> +
>> +==== Hierarchy IRQ domain ====
>> +On some architectures, there may be multiple interrupt controllers
>> +involved in delivering an interrupt from the device to the target CPU.
>> +Let's look at a typical interrupt delivering path on x86 platforms:
>> +
>> +Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
>> +
>> +There are three interrupt controllers involved:
>> +1) IOAPIC controller
>> +2) Interrupt remapping controller
>> +3) Local APIC controller
>> +
>> +To support such a hardware topology and make software architecture match
>> +hardware architecture, an irq_domain data structure is built for each
>> +interrupt controller and those irq_domains are organized into hierarchy.
>> +When building irq_domain hierarchy, the irq_domain near to the device is
>> +child and the irq_domain near to CPU is parent. So a hierarchy structure
>> +as below will be built for the example above.
>> +	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
>> +		^
>> +		|
>> +	Interrupt Remapping irq_domain (manage irq_remapping entries)
>> +		^
>> +		|
>> +	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
>> +
>> +There are four major interfaces to use hierarchy irq_domain:
>> +1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
>> +   controller related resources to deliver these interrupts.
>> +2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controler
> 
>                                                                  controller
> 
>> +   related resources associated with these interrupts.
>> +3) irq_domain_activate_irq(): activate interrupt controller hardware to
>> +   deliver the interrupt.
>> +3) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
>> +   to stopping delivering the interrupt.
> 
>       to stop
> 
>> +
>> +Following changes are needed to support hierarchy irq_domain.
>> +1) a new field 'parent' is added to struct irq_domain, it's used to
> 
>                                               irq_domain;
> 
>> +   maintain irq_domain hierarchy information.
>> +2) a new field 'parent_data' is added to struct irq_data, it's used to
> 
>                                                    irq_data;
> 
>> +   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
>> +   is used to store irq_domain pointer and hardware irq number.
>> +3) new callbacks are added to struct irq_domain_ops to support hierarchy
>> +   irq_domain operations.
>> +
>> +With support of hierarchy irq_domain and hierarchy irq_data ready, an
>> +irq_domain structure is built for each interrupt controller, and an
>> +irq_data structure is allocated for each irq_domain associated with an
>> +IRQ. Now we could go one step further to support stacked(hierarchy)
>> +irq_chip. That is, an irq_chip is associated with each irq_data along
>> +the hierarchy. A child irq_chip may implement a required action by
>> +itself or by cooperating with its parent irq_chip.
>> +
>> +With stacked irq_chip, interrupt controller driver only needs to deal
>> +with the hardware managed by itself and may ask for services from its
>> +parent irq_chip when needed. So we could achieve a much more cleaner
> 
>                                                     a much cleaner
> 
>> +software architecture.
>> +
>> +For an interrupt controller driver to support hierarchy irq_domain, it
>> +needs to:
>> +1) Implement irq_domain_ops.alloc and irq_domain_ops.free
>> +2) Optionally implement irq_domain_ops.activate and
>> +   irq_domain_ops.deactivate.
>> +3) Optionally implement an irq_chip to manage the interrupt controller
>> +   hardware.
>> +4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
>> +   they are unused with hierarchy irq_domain.
>> +
>> +Hierarchy irq_domain may also be used to support other architectures,
>> +such as ARM, ARM64 etc.
> 
>> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
>> index b0f9d16e48f6..46e047c414bc 100644
>> --- a/include/linux/irqdomain.h
>> +++ b/include/linux/irqdomain.h
>> @@ -77,6 +89,7 @@ struct irq_domain_chip_generic;
>>   * @ops: pointer to irq_domain methods
>>   * @host_data: private data pointer for use by owner.  Not touched by irq_domain
>>   *             core code.
>> + * @flags: host per irqdomain flags
> 
>                        irq_domain ?
> 
>>   *
>>   * Optional elements
>>   * @of_node: Pointer to device tree nodes associated with the irq_domain. Used
>> @@ -84,6 +97,7 @@ struct irq_domain_chip_generic;
>>   * @gc: Pointer to a list of generic chips. There is a helper function for
>>   *      setting up one or more generic chips for interrupt controllers
>>   *      drivers using the generic chip library which uses this pointer.
>> + * @parent: Pointer to parent irqdomain to support hierarchy irqdomains
> 
>                                  irq_domain ?                   irq_domains ?
> 
>>   *
>>   * Revmap data, used internally by irq_domain
>>   * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that
> 
>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>> index 6534ff6ce02e..26628239088c 100644
>> --- a/kernel/irq/irqdomain.c
>> +++ b/kernel/irq/irqdomain.c
>> @@ -709,3 +708,332 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>>  	.xlate = irq_domain_xlate_onetwocell,
>>  };
>>  EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
>> +
>> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
>> +				  irq_hw_number_t hwirq, int node)
>> +{
>> +	unsigned int hint;
>> +
>> +	if (virq >= 0) {
>> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
>> +	} else {
>> +		hint = hwirq % nr_irqs;
>> +		if (hint == 0)
>> +			hint++;
>> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
>> +		if (virq <= 0 && hint > 1)
>> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
>> +	}
>> +
>> +	return virq;
>> +}
>> +
>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_free_desc(virq + i);
>> +}
>> +
>> +static void irq_domain_insert_irq(int virq)
>> +{
>> +	struct irq_data *data;
>> +
>> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
>> +		struct irq_domain *domain = data->domain;
>> +		irq_hw_number_t hwirq = data->hwirq;
>> +
>> +		if (hwirq < domain->revmap_size) {
>> +			domain->linear_revmap[hwirq] = virq;
>> +		} else {
>> +			mutex_lock(&revmap_trees_mutex);
>> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
>> +			mutex_unlock(&revmap_trees_mutex);
>> +		}
>> +
>> +		/* If not already assigned, give the domain the chip's name */
>> +		if (!domain->name && data->chip)
>> +			domain->name = data->chip->name;
>> +	}
>> +
>> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
>> +}
>> +
>> +static void irq_domain_remove_irq(int virq)
>> +{
>> +	struct irq_data *data;
>> +
>> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
>> +	irq_set_chip_and_handler(virq, NULL, NULL);
>> +	synchronize_irq(virq);
>> +	smp_mb();
>> +
>> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
>> +		struct irq_domain *domain = data->domain;
>> +		irq_hw_number_t hwirq = data->hwirq;
>> +
>> +		if (hwirq < domain->revmap_size) {
>> +			domain->linear_revmap[hwirq] = 0;
>> +		} else {
>> +			mutex_lock(&revmap_trees_mutex);
>> +			radix_tree_delete(&domain->revmap_tree, hwirq);
>> +			mutex_unlock(&revmap_trees_mutex);
>> +		}
>> +	}
>> +}
>> +
>> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
>> +						   struct irq_data *child)
>> +{
>> +	struct irq_data *irq_data;
>> +
>> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
>> +	if (irq_data) {
>> +		child->parent_data = irq_data;
>> +		irq_data->irq = child->irq;
>> +		irq_data->node = child->node;
>> +		irq_data->domain = domain;
>> +	}
>> +
>> +	return irq_data;
>> +}
>> +
>> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data, *tmp;
>> +
>> +	for (i = 0; i < nr_irqs; i++) {
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		tmp = irq_data->parent_data;
>> +		irq_data->parent_data = NULL;
>> +		irq_data->domain = NULL;
>> +
>> +		while (tmp) {
>> +			irq_data = tmp;
>> +			tmp = tmp->parent_data;
>> +			kfree(irq_data);
>> +		}
>> +	}
>> +}
>> +
>> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
>> +				     unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data;
>> +	struct irq_domain *parent;
>> +
>> +	/* The outmost irq_data is embedded in struct irq_desc */
> 
> 	       outermost
> 
>> +	for (i = 0; i < nr_irqs; i++) {
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		irq_data->domain = domain;
>> +
>> +		for (parent = domain->parent; parent; parent = parent->parent) {
>> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
>> +			if (!irq_data) {
>> +				irq_domain_free_irq_data(virq, i + 1);
>> +				return -ENOMEM;
>> +			}
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain
> 
>                                              associated
> 
>> + * @domain: domain to match
>> + * @virq: IRQ number to get irq_data
>> + */
>> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
>> +					 unsigned int virq)
>> +{
>> +	struct irq_data *irq_data;
>> +
>> +	for (irq_data = irq_get_irq_data(virq); irq_data;
>> +	     irq_data = irq_data->parent_data)
>> +		if (irq_data->domain == domain)
>> +			return irq_data;
>> +
>> +	return NULL;
>> +}
>> +
>> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
>> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
>> +				  void *chip_data)
>> +{
>> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
>> +
>> +	if (!irq_data)
>> +		return -ENOENT;
>> +
>> +	irq_data->hwirq = hwirq;
>> +	irq_data->chip = chip;
>> +	irq_data->chip_data = chip_data;
>> +
>> +	return 0;
>> +}
>> +
>> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
>> +{
>> +	irq_data->hwirq = 0;
>> +	irq_data->chip = NULL;
>> +	irq_data->chip_data = NULL;
>> +}
>> +
>> +/**
>> + * __irq_domain_alloc_irqs - Allocate IRQs from domain
>> + * @domain: domain to allocate from
>> + * @irq_base: allocate specified IRQ nubmer if irq_base >= 0
>> + * @nr_irqs: number of IRQs to allocate
>> + * @node: NUMA node id for memory allocation
>> + * @arg: domain specific argument
>> + * @realloc: IRQ descriptors have already been allocated if true
>> + *
>> + * Allocate IRQ numbers and initialized all data structures to support
>> + * hiearchy IRQ domains.
>> + * Parameter @realloc is mainly to support legacy IRQs.
>> + * Returns error code or allocated IRQ number
>> + */
>> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
>> +			    unsigned int nr_irqs, int node, void *arg,
>> +			    bool realloc)
>> +{
>> +	int i, ret, virq;
>> +
>> +	if (domain == NULL) {
>> +		domain = irq_default_domain;
>> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
>> +			return -EINVAL;
>> +	}
>> +
>> +	if (!domain->ops->alloc) {
>> +		pr_debug("domain->ops->alloc() is NULL\n");
>> +		return -ENOSYS;
>> +	}
>> +
>> +	if (realloc && irq_base >= 0) {
>> +		virq =  irq_base;
>> +	} else {
>> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
>> +		if (virq < 0) {
>> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
>> +				 irq_base, nr_irqs);
>> +			return virq;
>> +		}
>> +	}
>> +
>> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
>> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
>> +		ret = -ENOMEM;
>> +		goto out_free_desc;
>> +	}
>> +
>> +	mutex_lock(&irq_domain_mutex);
>> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
>> +	if (ret < 0) {
>> +		mutex_unlock(&irq_domain_mutex);
>> +		goto out_free_irq_data;
>> +	}
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_domain_insert_irq(virq + i);
>> +	mutex_unlock(&irq_domain_mutex);
>> +
>> +	return virq;
>> +
>> +out_free_irq_data:
>> +	irq_domain_free_irq_data(virq, nr_irqs);
>> +out_free_desc:
>> +	irq_domain_free_descs(virq, nr_irqs);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * irq_domain_free_irqs - Free IRQ number and assoicated data structures
> 
>                                                  associated
> 
>> + * @virq: base IRQ number
>> + * @nr_irqs: number of IRQs to free
>> + */
>> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *data = irq_get_irq_data(virq);
>> +
>> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
>> +		 "NULL pointer, cannot free irq\n"))
>> +		return;
>> +
>> +	mutex_lock(&irq_domain_mutex);
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_domain_remove_irq(virq + i);
>> +	data->domain->ops->free(data->domain, virq, nr_irqs);
>> +	mutex_unlock(&irq_domain_mutex);
>> +
>> +	irq_domain_free_irq_data(virq, nr_irqs);
>> +	irq_domain_free_descs(virq, nr_irqs);
>> +}
>> +
>> +/**
>> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
>> + *			     interrupt
>> + * @irq_data: out most irq_data associated with interrupt
> 
>                  outermost
> 
>> + *
>> + * It calls domain_ops->activate to program interrupt controllers, so the
>> + * interrupt could actually delivered.
>> + */
>> +int irq_domain_activate_irq(struct irq_data *irq_data)
>> +{
>> +	int ret = 0;
>> +
>> +	if (irq_data && irq_data->domain) {
>> +		struct irq_domain *domain = irq_data->domain;
>> +
>> +		if (irq_data->parent_data)
>> +			ret = irq_domain_activate_irq(irq_data->parent_data);
>> +		if (ret == 0 && domain->ops->activate)
>> +			ret = domain->ops->activate(domain, irq_data);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +/**
>> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
>> + *			       deactivate interrupt
>> + * @irq_data: out most irq_data associated with interrupt
> 
>                  outermost
> 
>> + *
>> + * It calls domain_ops->deactivate to program interrupt controllers to disable
>> + * interrupt delivery.
>> + */
>> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
>> +{
>> +	int ret = 0;
>> +
>> +	if (irq_data && irq_data->domain) {
>> +		struct irq_domain *domain = irq_data->domain;
>> +
>> +		if (domain->ops->deactivate)
>> +			ret = domain->ops->deactivate(domain, irq_data);
>> +		if (ret == 0 && irq_data->parent_data)
>> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
>> +	}
>> +
>> +	return ret;
>> +}
>> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>> +/**
>> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain
> 
>                                              associated
> 
>> + * @domain: domain to match
>> + * @virq: IRQ number to get irq_data
>> + */
>> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
>> +					 unsigned int virq)
>> +{
>> +	struct irq_data *irq_data = irq_get_irq_data(virq);
>> +
>> +	return (irq_data && irq_data->domain == domain) ? irq_data : NULL;
>> +}
>> +
>> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>>
> 
> 


* [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
@ 2014-09-24  5:26           ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-24  5:26 UTC (permalink / raw)
  To: linux-arm-kernel

Thanks Randy! I will fix these issues in next version.
Regards!
Gerry

On 2014/9/23 1:30, Randy Dunlap wrote:
> On 09/22/14 01:17, Jiang Liu wrote:
>> ---
>>  Documentation/IRQ-domain.txt |   71 +++++++++
>>  include/linux/irq.h          |    3 +
>>  include/linux/irqdomain.h    |   86 ++++++++++
>>  kernel/irq/Kconfig           |    3 +
>>  kernel/irq/chip.c            |    3 +
>>  kernel/irq/irqdomain.c       |  360 ++++++++++++++++++++++++++++++++++++++++--
>>  6 files changed, 510 insertions(+), 16 deletions(-)
>>
>> diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
>> index 8a8b82c9ca53..062f6b6088b4 100644
>> --- a/Documentation/IRQ-domain.txt
>> +++ b/Documentation/IRQ-domain.txt
>> @@ -151,3 +151,74 @@ used and no descriptor gets allocated it is very important to make sure
>>  that the driver using the simple domain call irq_create_mapping()
>>  before any irq_find_mapping() since the latter will actually work
>>  for the static IRQ assignment case.
>> +
>> +==== Hierarchy IRQ domain ====
>> +On some architectures, there may be multiple interrupt controllers
>> +involved in delivering an interrupt from the device to the target CPU.
>> +Let's look at a typical interrupt delivering path on x86 platforms:
>> +
>> +Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
>> +
>> +There are three interrupt controllers involved:
>> +1) IOAPIC controller
>> +2) Interrupt remapping controller
>> +3) Local APIC controller
>> +
>> +To support such a hardware topology and make software architecture match
>> +hardware architecture, an irq_domain data structure is built for each
>> +interrupt controller and those irq_domains are organized into hierarchy.
>> +When building irq_domain hierarchy, the irq_domain near to the device is
>> +child and the irq_domain near to CPU is parent. So a hierarchy structure
>> +as below will be built for the example above.
>> +	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
>> +		^
>> +		|
>> +	Interrupt Remapping irq_domain (manage irq_remapping entries)
>> +		^
>> +		|
>> +	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
>> +
>> +There are four major interfaces to use hierarchy irq_domain:
>> +1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
>> +   controller related resources to deliver these interrupts.
>> +2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controler
> 
>                                                                  controller
> 
>> +   related resources associated with these interrupts.
>> +3) irq_domain_activate_irq(): activate interrupt controller hardware to
>> +   deliver the interrupt.
>> +3) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
>> +   to stopping delivering the interrupt.
> 
>       to stop
> 
>> +
>> +Following changes are needed to support hierarchy irq_domain.
>> +1) a new field 'parent' is added to struct irq_domain, it's used to
> 
>                                               irq_domain;
> 
>> +   maintain irq_domain hierarchy information.
>> +2) a new field 'parent_data' is added to struct irq_data, it's used to
> 
>                                                    irq_data;
> 
>> +   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
>> +   is used to store irq_domain pointer and hardware irq number.
>> +3) new callbacks are added to struct irq_domain_ops to support hierarchy
>> +   irq_domain operations.
>> +
>> +With support of hierarchy irq_domain and hierarchy irq_data ready, an
>> +irq_domain structure is built for each interrupt controller, and an
>> +irq_data structure is allocated for each irq_domain associated with an
>> +IRQ. Now we could go one step further to support stacked(hierarchy)
>> +irq_chip. That is, an irq_chip is associated with each irq_data along
>> +the hierarchy. A child irq_chip may implement a required action by
>> +itself or by cooperating with its parent irq_chip.
>> +
>> +With stacked irq_chip, interrupt controller driver only needs to deal
>> +with the hardware managed by itself and may ask for services from its
>> +parent irq_chip when needed. So we could achieve a much more cleaner
> 
>                                                     a much cleaner
> 
>> +software architecture.
>> +
>> +For an interrupt controller driver to support hierarchy irq_domain, it
>> +needs to:
>> +1) Implement irq_domain_ops.alloc and irq_domain_ops.free
>> +2) Optionally implement irq_domain_ops.activate and
>> +   irq_domain_ops.deactivate.
>> +3) Optionally implement an irq_chip to manage the interrupt controller
>> +   hardware.
>> +4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
>> +   they are unused with hierarchy irq_domain.
>> +
>> +Hierarchy irq_domain may also be used to support other architectures,
>> +such as ARM, ARM64 etc.
> 
>> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
>> index b0f9d16e48f6..46e047c414bc 100644
>> --- a/include/linux/irqdomain.h
>> +++ b/include/linux/irqdomain.h
>> @@ -77,6 +89,7 @@ struct irq_domain_chip_generic;
>>   * @ops: pointer to irq_domain methods
>>   * @host_data: private data pointer for use by owner.  Not touched by irq_domain
>>   *             core code.
>> + * @flags: host per irqdomain flags
> 
>                        irq_domain ?
> 
>>   *
>>   * Optional elements
>>   * @of_node: Pointer to device tree nodes associated with the irq_domain. Used
>> @@ -84,6 +97,7 @@ struct irq_domain_chip_generic;
>>   * @gc: Pointer to a list of generic chips. There is a helper function for
>>   *      setting up one or more generic chips for interrupt controllers
>>   *      drivers using the generic chip library which uses this pointer.
>> + * @parent: Pointer to parent irqdomain to support hierarchy irqdomains
> 
>                                  irq_domain ?                   irq_domains ?
> 
>>   *
>>   * Revmap data, used internally by irq_domain
>>   * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that
> 
>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>> index 6534ff6ce02e..26628239088c 100644
>> --- a/kernel/irq/irqdomain.c
>> +++ b/kernel/irq/irqdomain.c
>> @@ -709,3 +708,332 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>>  	.xlate = irq_domain_xlate_onetwocell,
>>  };
>>  EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
>> +
>> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
>> +				  irq_hw_number_t hwirq, int node)
>> +{
>> +	unsigned int hint;
>> +
>> +	if (virq >= 0) {
>> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
>> +	} else {
>> +		hint = hwirq % nr_irqs;
>> +		if (hint == 0)
>> +			hint++;
>> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
>> +		if (virq <= 0 && hint > 1)
>> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
>> +	}
>> +
>> +	return virq;
>> +}
>> +
>> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
>> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_free_desc(virq + i);
>> +}
>> +
>> +static void irq_domain_insert_irq(int virq)
>> +{
>> +	struct irq_data *data;
>> +
>> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
>> +		struct irq_domain *domain = data->domain;
>> +		irq_hw_number_t hwirq = data->hwirq;
>> +
>> +		if (hwirq < domain->revmap_size) {
>> +			domain->linear_revmap[hwirq] = virq;
>> +		} else {
>> +			mutex_lock(&revmap_trees_mutex);
>> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
>> +			mutex_unlock(&revmap_trees_mutex);
>> +		}
>> +
>> +		/* If not already assigned, give the domain the chip's name */
>> +		if (!domain->name && data->chip)
>> +			domain->name = data->chip->name;
>> +	}
>> +
>> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
>> +}
>> +
>> +static void irq_domain_remove_irq(int virq)
>> +{
>> +	struct irq_data *data;
>> +
>> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
>> +	irq_set_chip_and_handler(virq, NULL, NULL);
>> +	synchronize_irq(virq);
>> +	smp_mb();
>> +
>> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
>> +		struct irq_domain *domain = data->domain;
>> +		irq_hw_number_t hwirq = data->hwirq;
>> +
>> +		if (hwirq < domain->revmap_size) {
>> +			domain->linear_revmap[hwirq] = 0;
>> +		} else {
>> +			mutex_lock(&revmap_trees_mutex);
>> +			radix_tree_delete(&domain->revmap_tree, hwirq);
>> +			mutex_unlock(&revmap_trees_mutex);
>> +		}
>> +	}
>> +}
>> +
>> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
>> +						   struct irq_data *child)
>> +{
>> +	struct irq_data *irq_data;
>> +
>> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
>> +	if (irq_data) {
>> +		child->parent_data = irq_data;
>> +		irq_data->irq = child->irq;
>> +		irq_data->node = child->node;
>> +		irq_data->domain = domain;
>> +	}
>> +
>> +	return irq_data;
>> +}
>> +
>> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data, *tmp;
>> +
>> +	for (i = 0; i < nr_irqs; i++) {
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		tmp = irq_data->parent_data;
>> +		irq_data->parent_data = NULL;
>> +		irq_data->domain = NULL;
>> +
>> +		while (tmp) {
>> +			irq_data = tmp;
>> +			tmp = tmp->parent_data;
>> +			kfree(irq_data);
>> +		}
>> +	}
>> +}
>> +
>> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
>> +				     unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data;
>> +	struct irq_domain *parent;
>> +
>> +	/* The outmost irq_data is embedded in struct irq_desc */
> 
> 	       outermost
> 
>> +	for (i = 0; i < nr_irqs; i++) {
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		irq_data->domain = domain;
>> +
>> +		for (parent = domain->parent; parent; parent = parent->parent) {
>> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
>> +			if (!irq_data) {
>> +				irq_domain_free_irq_data(virq, i + 1);
>> +				return -ENOMEM;
>> +			}
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain
> 
>                                              associated
> 
>> + * @domain: domain to match
>> + * @virq: IRQ number to get irq_data
>> + */
>> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
>> +					 unsigned int virq)
>> +{
>> +	struct irq_data *irq_data;
>> +
>> +	for (irq_data = irq_get_irq_data(virq); irq_data;
>> +	     irq_data = irq_data->parent_data)
>> +		if (irq_data->domain == domain)
>> +			return irq_data;
>> +
>> +	return NULL;
>> +}
>> +
>> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
>> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
>> +				  void *chip_data)
>> +{
>> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
>> +
>> +	if (!irq_data)
>> +		return -ENOENT;
>> +
>> +	irq_data->hwirq = hwirq;
>> +	irq_data->chip = chip;
>> +	irq_data->chip_data = chip_data;
>> +
>> +	return 0;
>> +}
>> +
>> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
>> +{
>> +	irq_data->hwirq = 0;
>> +	irq_data->chip = NULL;
>> +	irq_data->chip_data = NULL;
>> +}
>> +
>> +/**
>> + * __irq_domain_alloc_irqs - Allocate IRQs from domain
>> + * @domain: domain to allocate from
>> + * @irq_base: allocate specified IRQ nubmer if irq_base >= 0
>> + * @nr_irqs: number of IRQs to allocate
>> + * @node: NUMA node id for memory allocation
>> + * @arg: domain specific argument
>> + * @realloc: IRQ descriptors have already been allocated if true
>> + *
>> + * Allocate IRQ numbers and initialized all data structures to support
>> + * hiearchy IRQ domains.
>> + * Parameter @realloc is mainly to support legacy IRQs.
>> + * Returns error code or allocated IRQ number
>> + */
>> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
>> +			    unsigned int nr_irqs, int node, void *arg,
>> +			    bool realloc)
>> +{
>> +	int i, ret, virq;
>> +
>> +	if (domain == NULL) {
>> +		domain = irq_default_domain;
>> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
>> +			return -EINVAL;
>> +	}
>> +
>> +	if (!domain->ops->alloc) {
>> +		pr_debug("domain->ops->alloc() is NULL\n");
>> +		return -ENOSYS;
>> +	}
>> +
>> +	if (realloc && irq_base >= 0) {
>> +		virq =  irq_base;
>> +	} else {
>> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
>> +		if (virq < 0) {
>> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
>> +				 irq_base, nr_irqs);
>> +			return virq;
>> +		}
>> +	}
>> +
>> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
>> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
>> +		ret = -ENOMEM;
>> +		goto out_free_desc;
>> +	}
>> +
>> +	mutex_lock(&irq_domain_mutex);
>> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
>> +	if (ret < 0) {
>> +		mutex_unlock(&irq_domain_mutex);
>> +		goto out_free_irq_data;
>> +	}
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_domain_insert_irq(virq + i);
>> +	mutex_unlock(&irq_domain_mutex);
>> +
>> +	return virq;
>> +
>> +out_free_irq_data:
>> +	irq_domain_free_irq_data(virq, nr_irqs);
>> +out_free_desc:
>> +	irq_domain_free_descs(virq, nr_irqs);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * irq_domain_free_irqs - Free IRQ number and assoicated data structures
> 
>                                                  associated
> 
>> + * @virq: base IRQ number
>> + * @nr_irqs: number of IRQs to free
>> + */
>> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *data = irq_get_irq_data(virq);
>> +
>> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
>> +		 "NULL pointer, cannot free irq\n"))
>> +		return;
>> +
>> +	mutex_lock(&irq_domain_mutex);
>> +	for (i = 0; i < nr_irqs; i++)
>> +		irq_domain_remove_irq(virq + i);
>> +	data->domain->ops->free(data->domain, virq, nr_irqs);
>> +	mutex_unlock(&irq_domain_mutex);
>> +
>> +	irq_domain_free_irq_data(virq, nr_irqs);
>> +	irq_domain_free_descs(virq, nr_irqs);
>> +}
>> +
>> +/**
>> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
>> + *			     interrupt
>> + * @irq_data: out most irq_data associated with interrupt
> 
>                  outermost
> 
>> + *
>> + * It calls domain_ops->activate to program interrupt controllers, so the
>> + * interrupt could actually delivered.
>> + */
>> +int irq_domain_activate_irq(struct irq_data *irq_data)
>> +{
>> +	int ret = 0;
>> +
>> +	if (irq_data && irq_data->domain) {
>> +		struct irq_domain *domain = irq_data->domain;
>> +
>> +		if (irq_data->parent_data)
>> +			ret = irq_domain_activate_irq(irq_data->parent_data);
>> +		if (ret == 0 && domain->ops->activate)
>> +			ret = domain->ops->activate(domain, irq_data);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +/**
>> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
>> + *			       deactivate interrupt
>> + * @irq_data: out most irq_data associated with interrupt
> 
>                  outermost
> 
>> + *
>> + * It calls domain_ops->deactivate to program interrupt controllers to disable
>> + * interrupt delivery.
>> + */
>> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
>> +{
>> +	int ret = 0;
>> +
>> +	if (irq_data && irq_data->domain) {
>> +		struct irq_domain *domain = irq_data->domain;
>> +
>> +		if (domain->ops->deactivate)
>> +			ret = domain->ops->deactivate(domain, irq_data);
>> +		if (ret == 0 && irq_data->parent_data)
>> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
>> +	}
>> +
>> +	return ret;
>> +}
>> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>> +/**
>> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and @domain
> 
>                                              associated
> 
>> + * @domain: domain to match
>> + * @virq: IRQ number to get irq_data
>> + */
>> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
>> +					 unsigned int virq)
>> +{
>> +	struct irq_data *irq_data = irq_get_irq_data(virq);
>> +
>> +	return (irq_data && irq_data->domain == domain) ? irq_data : NULL;
>> +}
>> +
>> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>>
> 
> 
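
For reference, the call ordering in the quoted irq_domain_activate_irq()
and irq_domain_deactivate_irq() can be sketched in userspace as below.
This is only an illustration: struct mock_irq_data, the mock_* helpers
and the trace buffer are hypothetical stand-ins, not kernel interfaces,
and the error propagation of the real functions is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct irq_data. */
struct mock_irq_data {
	const char *name;
	struct mock_irq_data *parent_data;
};

#define MOCK_MAX_DEPTH 8

/* Records the order in which the (mock) domains are visited. */
static const char *mock_trace[MOCK_MAX_DEPTH];
static size_t mock_trace_len;

static void mock_record(const char *name)
{
	if (mock_trace_len < MOCK_MAX_DEPTH)
		mock_trace[mock_trace_len++] = name;
}

/* Parent first, then self: the root (CPU vector) domain is visited first. */
static void mock_activate(struct mock_irq_data *d)
{
	if (!d)
		return;
	mock_activate(d->parent_data);
	mock_record(d->name);
}

/* Self first, then parent: the outermost domain is visited first. */
static void mock_deactivate(struct mock_irq_data *d)
{
	if (!d)
		return;
	mock_record(d->name);
	mock_deactivate(d->parent_data);
}
```

With a chain IOAPIC -> remapping -> vector, activation visits vector,
remapping, IOAPIC in that order, and deactivation visits them in the
reverse order.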


* Re: [Patch] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-23  9:43         ` Joe.C
@ 2014-09-24  5:55           ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-24  5:55 UTC (permalink / raw)
  To: Joe.C
  Cc: Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier,
	Tony Luck, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Joerg Roedel, x86, linux-kernel, linux-acpi, linux-pci,
	Andrew Morton, linux-arm-kernel



On 2014/9/23 17:43, Joe.C wrote:
> On Mon, 2014-09-22 at 16:17 +0800, Jiang Liu wrote:
>> @@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
>>  unsigned int irq_create_mapping(struct irq_domain *domain,
>>  				irq_hw_number_t hwirq)
>>  {
>> -	unsigned int hint;
>>  	int virq;
>>  
>>  	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
>> @@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>>  	}
>>  
>>  	/* Allocate a virtual interrupt number */
>> -	hint = hwirq % nr_irqs;
>> -	if (hint == 0)
>> -		hint++;
>> -	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
>> -	if (virq <= 0)
>> -		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
>> +	virq = irq_domain_alloc_descs(-1, 1, hwirq,
>> +				      of_node_to_nid(domain->of_node));
> 
> If I read this correctly, the resulting virq is different after your
> change.
It should have the same effect. We just factored the original code out
into a function, so it could be reused.
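
For comparison, a minimal userspace sketch of the two hint calculations.
It is only an illustration under stated assumptions: global_nr_irqs is a
hypothetical stand-in for the kernel-wide nr_irqs (its value is
arbitrary), and both function names are made up for this example. One
thing worth checking in the helper as posted: its nr_irqs parameter hides
the global of the same name, so with a count of 1 the hint would always
come out as 1.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel-wide nr_irqs; value is arbitrary. */
static const unsigned int global_nr_irqs = 256;

/* Hint as computed by the original irq_create_mapping() code. */
static unsigned int hint_original(unsigned long hwirq)
{
	unsigned int hint = hwirq % global_nr_irqs;

	if (hint == 0)
		hint++;
	return hint;
}

/* Hint when the modulus is the per-call count instead of the global. */
static unsigned int hint_with_count(unsigned long hwirq, unsigned int count)
{
	unsigned int hint;

	assert(count > 0);
	hint = hwirq % count;
	if (hint == 0)
		hint++;
	return hint;
}
```

If that reading is right, giving the helper's count parameter a name
other than nr_irqs would keep the original behavior.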

> 
>>  	if (virq <= 0) {
>>  		pr_debug("-> virq allocation failed\n");
>>  		return 0;
>> @@ -490,7 +486,10 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
>>  	}
>>  
>>  	/* Create mapping */
>> -	virq = irq_create_mapping(domain, hwirq);
>> +	if (irq_domain_is_hierarchy(domain))
>> +		virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, irq_data);
>> +	else
>> +		virq = irq_create_mapping(domain, hwirq);
>>  	if (!virq)
>>  		return virq;
> 
> The hwirq returned from xlate above is lost. Without hwirq or virq, how do
> we know which irq we are working on?
> Also, if the irq_desc/irq_data was already created, this will create
> another one. Should we call irq_find_mapping() first, just like
> irq_create_mapping() does?
When irq_create_of_mapping() is called, the IRQ desc/irq_data has not
been allocated yet. The irq_data parameter here is of type
struct of_phandle_args, not struct irq_data :)

We pass irq_data to irq_domain_alloc_irqs(), so we can reconstruct
hwirq from irq_data there if needed.
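
As a rough illustration of that idea, the sketch below models a
domain's alloc callback that receives the interrupt specifier as an
opaque argument and translates it into a hwirq itself. It is plain
userspace C, not kernel code: struct spec_args, demo_xlate() and
demo_alloc() are hypothetical stand-ins, and the "cell 0 is the hwirq,
cell 1 is the trigger type" layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 4

/* Hypothetical stand-in for struct of_phandle_args. */
struct spec_args {
	int args_count;
	unsigned int args[MAX_ARGS];
};

/* Assumed layout: cell 0 is the hwirq, optional cell 1 the trigger type. */
static int demo_xlate(const struct spec_args *spec,
		      unsigned long *out_hwirq, unsigned int *out_type)
{
	if (spec == NULL || spec->args_count < 1)
		return -1;

	*out_hwirq = spec->args[0];
	*out_type = spec->args_count > 1 ? spec->args[1] : 0;
	return 0;
}

/*
 * Model of an ->alloc() callback: the specifier arrives as void *arg and
 * the hwirq is reconstructed here instead of in the generic caller.
 */
static int demo_alloc(unsigned int virq, void *arg, unsigned long *hwirq_slot)
{
	unsigned long hwirq;
	unsigned int type;

	assert(hwirq_slot != NULL);
	if (demo_xlate(arg, &hwirq, &type))
		return -1;

	*hwirq_slot = hwirq;	/* the kernel would record this in irq_data */
	(void)virq;
	(void)type;
	return 0;
}
```

The point of the design is that the generic code no longer needs to
know the hwirq up front; each domain decodes whatever argument it was
handed.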

To make the code clearer, I plan to change the code as below:
unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
{
        struct irq_domain *domain;
        irq_hw_number_t hwirq;
        unsigned int type = IRQ_TYPE_NONE;
        unsigned int virq;

        domain = irq_data->np ? irq_find_host(irq_data->np) :
                                irq_default_domain;
        if (!domain) {
                pr_warn("no irq domain found for %s !\n",
                        of_node_full_name(irq_data->np));
                return 0;
        }

+        if (irq_domain_is_hierarchy(domain))
+                return irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE,
+                                             irq_data);
+
        /* If domain has no translation, then we assume interrupt line */
        if (domain->ops->xlate == NULL)
                hwirq = irq_data->args[0];
        else {
                if (domain->ops->xlate(domain, irq_data->np, irq_data->args,
                                        irq_data->args_count, &hwirq,
                                        &type))
                        return 0;
        }

        /* Create mapping */
        virq = irq_create_mapping(domain, hwirq);
        if (!virq)
                return virq;

        /* Set type if specified and different than the current one */
        if (type != IRQ_TYPE_NONE &&
            type != irq_get_trigger_type(virq))
                irq_set_irq_type(virq, type);
        return virq;
}
EXPORT_SYMBOL_GPL(irq_create_of_mapping);
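
One detail that may be worth checking in the draft above: the patch
documents __irq_domain_alloc_irqs() as returning an error code or the
allocated IRQ number, while irq_create_of_mapping() returns unsigned
int and uses 0 for "no mapping". Returning the value directly could
therefore turn a negative error into a large positive number. The
sketch below shows one possible way to fold the two conventions
together; demo_alloc_irqs() and demo_create_mapping() are hypothetical
userspace stand-ins, and 17 is an arbitrary virq.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for irq_domain_alloc_irqs(): returns a negative
 * error code on failure, or the allocated virq (17 here) on success.
 */
static int demo_alloc_irqs(int simulate_error)
{
	assert(simulate_error <= 0);
	return simulate_error ? simulate_error : 17;
}

/* Wrapper that follows the "0 means no mapping" return convention. */
static unsigned int demo_create_mapping(int simulate_error)
{
	int virq = demo_alloc_irqs(simulate_error);

	return virq <= 0 ? 0 : (unsigned int)virq;
}
```

With this shape, a caller that only tests for 0 still sees allocation
failures.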

>> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
>> +			    unsigned int nr_irqs, int node, void *arg,
>> +			    bool realloc)
>> +{
>> +	int i, ret, virq;
>> +
>> +	if (domain == NULL) {
>> +		domain = irq_default_domain;
>> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
>> +			return -EINVAL;
>> +	}
>> +
>> +	if (!domain->ops->alloc) {
>> +		pr_debug("domain->ops->alloc() is NULL\n");
>> +		return -ENOSYS;
>> +	}
>> +
>> +	if (realloc && irq_base >= 0) {
>> +		virq =  irq_base;
>                          ^
> extra space here.
Will fix it in the next version.
Thanks, Joe!
Gerry
> 
> Joe.C
> 
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-11 14:03   ` Jiang Liu
  (?)
@ 2014-09-24  6:55     ` Yasuaki Ishimatsu
  -1 siblings, 0 replies; 110+ messages in thread
From: Yasuaki Ishimatsu @ 2014-09-24  6:55 UTC (permalink / raw)
  To: Jiang Liu, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

(2014/09/11 23:03), Jiang Liu wrote:
> We plan to use hierarchy irqdomain to support CPU vector assignment,
> interrupt remapping controller, IO-APIC controller, MSI interrupt
> and hypertransport interrupt etc on x86 platforms. So extend irqdomain
> interfaces to support hierarchy irqdomain.
> 
> There are already many clients of current irqdomain interfaces.
> To minimize the changes, we choose to introduce new version 2 interfaces
> to support hierarchy instead of extending existing irqdomain interfaces.
> 
> According to Thomas's suggestion, the most important design decision is
> to build hierarchy struct irq_data to support hierarchy irqdomain, so
> hierarchy irqdomain related data could be saved in struct irq_data.
> With support of hierarchy irq_data, we could also support stacked
> irq_chips. This is most useful in case of set_affinity().
> 
> The new hierarchy irqdomain introduces following interfaces:
> 1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQ
>     and related resources.
> 2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
> 3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
>     interrupt controllers to activate/deactivate interrupt.
> 
> There are also several helper functions to ease irqdomain implementations:
> 1) irq_domain_get_irq_data(): get irq_data associated with a specific
>     irqdomain.
> 2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
>     irq_data.
> 3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
>     parent irqdomain's alloc/free callbacks.
> 
> We also changed irq_startup()/irq_shutdown() to invoke
> irq_domain_activate_irq()/irq_domain_deactivate_irq() to program
> interrupt controller when start/stop interrupts.
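
For readers following the ordering: in the functions quoted further
down, irq_domain_activate_irq() recurses into parent_data before
calling the domain's own activate callback, and
irq_domain_deactivate_irq() does the reverse. The userspace sketch
below only models that ordering; struct demo_data, the log array and
the numeric ids are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of an irq_data chain: each level points to its parent. */
struct demo_data {
	int id;
	struct demo_data *parent_data;
};

static int demo_log[8];
static int demo_log_len;

static void demo_record(int id)
{
	assert(demo_log_len < 8);
	demo_log[demo_log_len++] = id;
}

/* Activate from the root of the hierarchy down to the outermost level. */
static void demo_activate(struct demo_data *d)
{
	if (d == NULL)
		return;
	demo_activate(d->parent_data);
	demo_record(d->id);
}

/* Deactivate in the opposite order: outermost level first, root last. */
static void demo_deactivate(struct demo_data *d)
{
	if (d == NULL)
		return;
	demo_record(d->id);
	demo_deactivate(d->parent_data);
}
```

For a three-level chain (say vector = 0, remapping = 1, IOAPIC = 2),
activation would log 0, 1, 2 and deactivation 2, 1, 0, so a controller
is only programmed once everything above it is ready.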
> 
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> ---
>   include/linux/irq.h       |    3 +
>   include/linux/irqdomain.h |   60 ++++++++
>   kernel/irq/Kconfig        |    3 +
>   kernel/irq/chip.c         |    3 +
>   kernel/irq/irqdomain.c    |  349 +++++++++++++++++++++++++++++++++++++++++++--
>   5 files changed, 404 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 62af59242ddc..4b74565690ce 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -151,6 +151,9 @@ struct irq_data {
>   	unsigned int		state_use_accessors;
>   	struct irq_chip		*chip;
>   	struct irq_domain	*domain;
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	struct irq_data		*parent_data;
> +#endif
>   	void			*handler_data;
>   	void			*chip_data;
>   	struct msi_desc		*msi_desc;
> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index b0f9d16e48f6..a9ddc8534c63 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -38,6 +38,7 @@
>   struct device_node;
>   struct irq_domain;
>   struct of_device_id;
> +struct irq_chip;
>   
>   /* Number of irqs reserved for a legacy isa controller */
>   #define NUM_ISA_INTERRUPTS	16
> @@ -64,6 +65,16 @@ struct irq_domain_ops {
>   	int (*xlate)(struct irq_domain *d, struct device_node *node,
>   		     const u32 *intspec, unsigned int intsize,
>   		     unsigned long *out_hwirq, unsigned int *out_type);
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	/* extended V2 interfaces to support hierarchy irqdomains */
> +	int (*alloc)(struct irq_domain *d, unsigned int virq,
> +		     unsigned int nr_irqs, void *arg);
> +	void (*free)(struct irq_domain *d, unsigned int virq,
> +		     unsigned int nr_irqs);
> +	int (*activate)(struct irq_domain *d, struct irq_data *irq_data);
> +	int (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
> +#endif
>   };
>   
>   extern struct irq_domain_ops irq_generic_chip_ops;
> @@ -101,6 +112,9 @@ struct irq_domain {
>   	/* Optional data */
>   	struct device_node *of_node;
>   	struct irq_domain_chip_generic *gc;
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	struct irq_domain *parent;
> +#endif
>   
>   	/* reverse map data. The linear map gets appended to the irq_domain */
>   	irq_hw_number_t hwirq_max;
> @@ -220,8 +234,54 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d, struct device_node *ctrlr,
>   			const u32 *intspec, unsigned int intsize,
>   			irq_hw_number_t *out_hwirq, unsigned int *out_type);
>   
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +/* V2 interfaces to support hierarchy IRQ domains. */
> +extern struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +						unsigned int virq);
> +extern int irq_domain_set_hwirq_and_chip(struct irq_domain *domain,
> +					 unsigned int virq,
> +					 irq_hw_number_t hwirq,
> +					 struct irq_chip *chip,
> +					 void *chip_data);
> +extern void irq_domain_reset_irq_data(struct irq_data *irq_data);
> +extern int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +				   unsigned int nr_irqs, int node, void *arg,
> +				   bool realloc);
> +extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
> +extern int irq_domain_activate_irq(struct irq_data *irq_data);
> +extern int irq_domain_deactivate_irq(struct irq_data *irq_data);
> +
> +static inline int irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +				unsigned int nr_irqs, int node, void *arg)
> +{
> +	return __irq_domain_alloc_irqs(domain, irq_base, nr_irqs, node,
> +				       arg, false);
> +}
> +
> +static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
> +				int irq_base, unsigned int nr_irqs, void *arg)
> +{
> +	if (domain->parent && domain->parent->ops->alloc)
> +		return domain->parent->ops->alloc(domain->parent, irq_base,
> +						  nr_irqs, arg);
> +	return -ENOSYS;
> +}
> +
> +static inline void irq_domain_free_irqs_parent(struct irq_domain *domain,
> +					int irq_base, unsigned int nr_irqs)
> +{
> +	if (domain->parent && domain->parent->ops->free)
> +		domain->parent->ops->free(domain->parent, irq_base, nr_irqs);
> +}
> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +
>   #else /* CONFIG_IRQ_DOMAIN */
>   static inline void irq_dispose_mapping(unsigned int virq) { }
> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
>   #endif /* !CONFIG_IRQ_DOMAIN */
>   
>   #endif /* _LINUX_IRQDOMAIN_H */
> diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
> index d269cecdfbf0..dc1f3d08892e 100644
> --- a/kernel/irq/Kconfig
> +++ b/kernel/irq/Kconfig
> @@ -55,6 +55,9 @@ config GENERIC_IRQ_CHIP
>   config IRQ_DOMAIN
>   	bool
>   
> +config IRQ_DOMAIN_HIERARCHY
> +	bool
> +
>   config IRQ_DOMAIN_DEBUG
>   	bool "Expose hardware/virtual IRQ mapping via debugfs"
>   	depends on IRQ_DOMAIN && DEBUG_FS
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 6223fab9a9d2..46bd5e2190c3 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -15,6 +15,7 @@
>   #include <linux/module.h>
>   #include <linux/interrupt.h>
>   #include <linux/kernel_stat.h>
> +#include <linux/irqdomain.h>
>   
>   #include <trace/events/irq.h>
>   
> @@ -178,6 +179,7 @@ int irq_startup(struct irq_desc *desc, bool resend)
>   	irq_state_clr_disabled(desc);
>   	desc->depth = 0;
>   
> +	irq_domain_activate_irq(&desc->irq_data);
>   	if (desc->irq_data.chip->irq_startup) {
>   		ret = desc->irq_data.chip->irq_startup(&desc->irq_data);
>   		irq_state_clr_masked(desc);
> @@ -199,6 +201,7 @@ void irq_shutdown(struct irq_desc *desc)
>   		desc->irq_data.chip->irq_disable(&desc->irq_data);
>   	else
>   		desc->irq_data.chip->irq_mask(&desc->irq_data);
> +	irq_domain_deactivate_irq(&desc->irq_data);
>   	irq_state_set_masked(desc);
>   }
>   
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 6534ff6ce02e..e285f3abc595 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -23,6 +23,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
>   static DEFINE_MUTEX(revmap_trees_mutex);
>   static struct irq_domain *irq_default_domain;
>   
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node);
> +
>   /**
>    * __irq_domain_add() - Allocate a new irq_domain data structure
>    * @of_node: optional device-tree node of the interrupt controller
> @@ -30,7 +33,7 @@ static struct irq_domain *irq_default_domain;
>    * @hwirq_max: Maximum number of interrupts supported by controller
>    * @direct_max: Maximum value of direct maps; Use ~0 for no limit; 0 for no
>    *              direct mapping
> - * @ops: map/unmap domain callbacks
> + * @ops: domain callbacks
>    * @host_data: Controller private data pointer
>    *
>    * Allocates and initialize and irq_domain structure.
> @@ -109,7 +112,7 @@ EXPORT_SYMBOL_GPL(irq_domain_remove);
>    * @first_irq: first number of irq block assigned to the domain,
>    *	pass zero to assign irqs on-the-fly. If first_irq is non-zero, then
>    *	pre-map all of the irqs in the domain to virqs starting at first_irq.
> - * @ops: map/unmap domain callbacks
> + * @ops: domain callbacks
>    * @host_data: Controller private data pointer
>    *
>    * Allocates an irq_domain, and optionally if first_irq is positive then also
> @@ -174,10 +177,8 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
>   
>   	domain = __irq_domain_add(of_node, first_hwirq + size,
>   				  first_hwirq + size, 0, ops, host_data);
> -	if (!domain)
> -		return NULL;
> -
> -	irq_domain_associate_many(domain, first_irq, first_hwirq, size);
> +	if (domain)
> +		irq_domain_associate_many(domain, first_irq, first_hwirq, size);
>   
>   	return domain;
>   }
> @@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
>   unsigned int irq_create_mapping(struct irq_domain *domain,
>   				irq_hw_number_t hwirq)
>   {
> -	unsigned int hint;
>   	int virq;
>   
>   	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
> @@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>   	}
>   
>   	/* Allocate a virtual interrupt number */
> -	hint = hwirq % nr_irqs;
> -	if (hint == 0)
> -		hint++;
> -	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
> -	if (virq <= 0)
> -		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
> +	virq = irq_domain_alloc_descs(-1, 1, hwirq,
> +				      of_node_to_nid(domain->of_node));
>   	if (virq <= 0) {
>   		pr_debug("-> virq allocation failed\n");
>   		return 0;
> @@ -490,7 +486,13 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
>   	}
>   
>   	/* Create mapping */
> -	virq = irq_create_mapping(domain, hwirq);
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	if (domain->ops->alloc)
> +		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
> +					     irq_data);
> +	else
> +#endif
> +		virq = irq_create_mapping(domain, hwirq);
>   	if (!virq)
>   		return virq;
>   
> @@ -540,7 +542,11 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
>   		return 0;
>   
>   	if (hwirq < domain->revmap_direct_max_irq) {
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +		data = irq_domain_get_irq_data(domain, hwirq);
> +#else
>   		data = irq_get_irq_data(hwirq);
> +#endif
>   		if (data && (data->domain == domain) && (data->hwirq == hwirq))
>   			return hwirq;
>   	}
> @@ -709,3 +715,318 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>   	.xlate = irq_domain_xlate_onetwocell,
>   };
>   EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
> +
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node)
> +{
> +	unsigned int hint;
> +
> +	if (virq >= 0) {
> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
> +	} else {
> +		hint = hwirq % nr_irqs;
> +		if (hint == 0)
> +			hint++;
> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
> +		if (virq <= 0 && hint > 1)
> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
> +	}
> +
> +	return virq;
> +}
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_free_desc(virq + i);
> +}
> +
> +static void irq_domain_insert_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = virq;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +
> +		/* If not already assigned, give the domain the chip's name */
> +		if (!domain->name && data->chip)
> +			domain->name = data->chip->name;
> +	}
> +
> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
> +}
> +
> +static void irq_domain_remove_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
> +	irq_set_chip_and_handler(virq, NULL, NULL);
> +	synchronize_irq(virq);
> +	smp_mb();
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = 0;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_delete(&domain->revmap_tree, hwirq);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +	}
> +}
> +
> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
> +						   struct irq_data *child)
> +{
> +	struct irq_data *irq_data;
> +
> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
> +	if (irq_data) {
> +		child->parent_data = irq_data;
> +		irq_data->irq = child->irq;
> +		irq_data->node = child->node;
> +		irq_data->domain = domain;
> +	}
> +
> +	return irq_data;
> +}
> +
> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data, *tmp;
> +
> +	for (i = 0; i < nr_irqs; i++) {

> +		irq_data = irq_get_irq_data(virq + i);
> +		tmp = irq_data->parent_data;

Why don't you handle the NULL case here?
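
To make the concern concrete: irq_get_irq_data() can return NULL when
no descriptor exists for the given number, so dereferencing the result
unconditionally is risky. Below is a minimal userspace sketch of the
same loop with a guard added. struct demo_data, demo_lookup() and
demo_free_parents() are hypothetical, and whether to skip or warn on a
missing entry is a design choice left open here.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal model of irq_data: only the parent link matters here. */
struct demo_data {
	struct demo_data *parent_data;
};

/* Hypothetical lookup: returns NULL when the slot has no descriptor. */
static struct demo_data *demo_lookup(struct demo_data **table, size_t n,
				     size_t idx)
{
	assert(table != NULL);
	return idx < n ? table[idx] : NULL;
}

/*
 * Free the parent chain of each entry in [first, first + count) and
 * return how many parent nodes were freed. Missing entries are skipped.
 */
static int demo_free_parents(struct demo_data **table, size_t n,
			     size_t first, size_t count)
{
	int freed = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		struct demo_data *d = demo_lookup(table, n, first + i);
		struct demo_data *tmp;

		if (d == NULL)
			continue;	/* the guard asked about in the review */

		tmp = d->parent_data;
		d->parent_data = NULL;
		while (tmp) {
			struct demo_data *next = tmp->parent_data;

			free(tmp);
			tmp = next;
			freed++;
		}
	}

	return freed;
}
```

The outermost entry is never freed in this model, mirroring the patch
comment that the outermost irq_data is embedded in struct irq_desc.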

> +		irq_data->parent_data = NULL;
> +		irq_data->domain = NULL;
> +
> +		while (tmp) {
> +			irq_data = tmp;
> +			tmp = tmp->parent_data;
> +			kfree(irq_data);
> +		}
> +	}
> +}
> +
> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
> +				     unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data;
> +	struct irq_domain *parent;
> +
> +	/* The outmost irq_data is embedded in struct irq_desc */
> +	for (i = 0; i < nr_irqs; i++) {


> +		irq_data = irq_get_irq_data(virq + i);
> +		irq_data->domain = domain;

ditto.

Thanks,
Yasuaki Ishimatsu

> +
> +		for (parent = domain->parent; parent; parent = parent->parent) {
> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
> +			if (!irq_data) {
> +				irq_domain_free_irq_data(virq, i + 1);
> +				return -ENOMEM;
> +			}
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * irq_domain_get_irq_data - Get irq_data assoicated with @virq and  @domain
> + * @domain: domain to match
> + * @virq: IRQ number to get irq_data
> + */
> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +					 unsigned int virq)
> +{
> +	struct irq_data *irq_data;
> +
> +	for (irq_data = irq_get_irq_data(virq); irq_data;
> +	     irq_data = irq_data->parent_data)
> +		if (irq_data->domain == domain)
> +			return irq_data;
> +
> +	return NULL;
> +}
> +
> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
> +				  void *chip_data)
> +{
> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
> +
> +	if (!irq_data)
> +		return -ENOENT;
> +
> +	irq_data->hwirq = hwirq;
> +	irq_data->chip = chip;
> +	irq_data->chip_data = chip_data;
> +
> +	return 0;
> +}
> +
> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
> +{
> +	irq_data->hwirq = 0;
> +	irq_data->chip = NULL;
> +	irq_data->chip_data = NULL;
> +}
> +
> +/**
> + * irq_domain_alloc_irqs - Allocate IRQs from domain
> + * @domain: domain to allocate from
> + * @irq_base: allocate specified IRQ nubmer if irq_base >= 0
> + * @nr_irqs: number of IRQs to allocate
> + * @node: NUMA node id for memory allocation
> + * @arg: domain specific argument
> + * @realloc: IRQ descriptors have already been allocated if true
> + *
> + * Allocate IRQ numbers and initialized all data structures to support
> + * hiearchy IRQ domains.
> + * Parameter @realloc is mainly to support legacy IRQs.
> + * Returns error code or allocated IRQ number
> + */
> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +			    unsigned int nr_irqs, int node, void *arg,
> +			    bool realloc)
> +{
> +	int i, ret, virq;
> +
> +	if (domain == NULL) {
> +		domain = irq_default_domain;
> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
> +			return -EINVAL;
> +	}
> +
> +	if (!domain->ops->alloc) {
> +		pr_debug("domain->ops->alloc() is NULL\n");
> +		return -ENOSYS;
> +	}
> +
> +	if (realloc && irq_base >= 0) {
> +		virq =  irq_base;
> +	} else {
> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
> +		if (virq < 0) {
> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
> +				 irq_base, nr_irqs);
> +			return virq;
> +		}
> +	}
> +
> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
> +		ret = -ENOMEM;
> +		goto out_free_desc;
> +	}
> +
> +	mutex_lock(&irq_domain_mutex);
> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
> +	if (ret < 0) {
> +		mutex_unlock(&irq_domain_mutex);
> +		goto out_free_irq_data;
> +	}
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_insert_irq(virq + i);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	return virq;
> +
> +out_free_irq_data:
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +out_free_desc:
> +	irq_domain_free_descs(virq, nr_irqs);
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_free_irqs - Free IRQ number and assoicated data structures
> + * @virq: base IRQ number
> + * @nr_irqs: number of IRQs to free
> + */
> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *data = irq_get_irq_data(virq);
> +
> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
> +		 "NULL pointer, cannot free irq\n"))
> +		return;
> +
> +	mutex_lock(&irq_domain_mutex);
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_remove_irq(virq + i);
> +	data->domain->ops->free(data->domain, virq, nr_irqs);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +	irq_domain_free_descs(virq, nr_irqs);
> +}
> +
> +/**
> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
> + *			     interrupt
> + * @irq_data: out most irq_data associated with interrupt
> + *
> + * It calls domain_ops->activate to program interrupt controllers, so the
> + * interrupt could actually delivered.
> + */
> +int irq_domain_activate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (irq_data->parent_data)
> +			ret = irq_domain_activate_irq(irq_data->parent_data);
> +		if (ret == 0 && domain->ops->activate)
> +			ret = domain->ops->activate(domain, irq_data);
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
> + *			       deactivate interrupt
> + * @irq_data: out most irq_data associated with interrupt
> + *
> + * It calls domain_ops->deactivate to program interrupt controllers to disable
> + * interrupt delivery.
> + */
> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (domain->ops->deactivate)
> +			ret = domain->ops->deactivate(domain, irq_data);
> +		if (ret == 0 && irq_data->parent_data)
> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
> +	}
> +
> +	return ret;
> +}
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> 



^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
@ 2014-09-24  6:55     ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 110+ messages in thread
From: Yasuaki Ishimatsu @ 2014-09-24  6:55 UTC (permalink / raw)
  To: Jiang Liu, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

(2014/09/11 23:03), Jiang Liu wrote:
> We plan to use hierarchy irqdomain to support CPU vector assignment,
> interrupt remapping controller, IO-APIC controller, MSI interrupt
> and hypertransport interrupt etc on x86 platforms. So extend irqdomain
> interfaces to support hierarchy irqdomain.
> 
> There are already many clients of current irqdomain interfaces.
> To minimize the changes, we choose to introduce new version 2 interfaces
> to support hierarchy instead of extending existing irqdomain interfaces.
> 
> According to Thomas's suggestion, the most important design decision is
> to build hierarchy struct irq_data to support hierarchy irqdomain, so
> hierarchy irqdomain related data could be saved in struct irq_data.
> With support of hierarchy irq_data, we could also support stacked
> irq_chips. This is most useful in case of set_affinity().
> 
> The new hierarchy irqdomain introduces following interfaces:
> 1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQ
>     and related resources.
> 2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
> 3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
>     interrupt controllers to activate/deactivate interrupt.
> 
> There are also several helper functions to ease irqdomain implementations:
> 1) irq_domain_get_irq_data(): get irq_data associated with a specific
>     irqdomain.
> 2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
>     irq_data.
> 3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
>     parent irqdomain's alloc/free callbacks.
> 
> We also changed irq_startup()/irq_shutdown() to invoke
> irq_domain_activate_irq()/irq_domain_deactivate_irq() to program
> interrupt controller when start/stop interrupts.
> 
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> ---
>   include/linux/irq.h       |    3 +
>   include/linux/irqdomain.h |   60 ++++++++
>   kernel/irq/Kconfig        |    3 +
>   kernel/irq/chip.c         |    3 +
>   kernel/irq/irqdomain.c    |  349 +++++++++++++++++++++++++++++++++++++++++++--
>   5 files changed, 404 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 62af59242ddc..4b74565690ce 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -151,6 +151,9 @@ struct irq_data {
>   	unsigned int		state_use_accessors;
>   	struct irq_chip		*chip;
>   	struct irq_domain	*domain;
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	struct irq_data		*parent_data;
> +#endif
>   	void			*handler_data;
>   	void			*chip_data;
>   	struct msi_desc		*msi_desc;
> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index b0f9d16e48f6..a9ddc8534c63 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -38,6 +38,7 @@
>   struct device_node;
>   struct irq_domain;
>   struct of_device_id;
> +struct irq_chip;
>   
>   /* Number of irqs reserved for a legacy isa controller */
>   #define NUM_ISA_INTERRUPTS	16
> @@ -64,6 +65,16 @@ struct irq_domain_ops {
>   	int (*xlate)(struct irq_domain *d, struct device_node *node,
>   		     const u32 *intspec, unsigned int intsize,
>   		     unsigned long *out_hwirq, unsigned int *out_type);
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	/* extended V2 interfaces to support hierarchy irqdomains */
> +	int (*alloc)(struct irq_domain *d, unsigned int virq,
> +		     unsigned int nr_irqs, void *arg);
> +	void (*free)(struct irq_domain *d, unsigned int virq,
> +		     unsigned int nr_irqs);
> +	int (*activate)(struct irq_domain *d, struct irq_data *irq_data);
> +	int (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
> +#endif
>   };
>   
>   extern struct irq_domain_ops irq_generic_chip_ops;
> @@ -101,6 +112,9 @@ struct irq_domain {
>   	/* Optional data */
>   	struct device_node *of_node;
>   	struct irq_domain_chip_generic *gc;
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	struct irq_domain *parent;
> +#endif
>   
>   	/* reverse map data. The linear map gets appended to the irq_domain */
>   	irq_hw_number_t hwirq_max;
> @@ -220,8 +234,54 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d, struct device_node *ctrlr,
>   			const u32 *intspec, unsigned int intsize,
>   			irq_hw_number_t *out_hwirq, unsigned int *out_type);
>   
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +/* V2 interfaces to support hierarchy IRQ domains. */
> +extern struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +						unsigned int virq);
> +extern int irq_domain_set_hwirq_and_chip(struct irq_domain *domain,
> +					 unsigned int virq,
> +					 irq_hw_number_t hwirq,
> +					 struct irq_chip *chip,
> +					 void *chip_data);
> +extern void irq_domain_reset_irq_data(struct irq_data *irq_data);
> +extern int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +				   unsigned int nr_irqs, int node, void *arg,
> +				   bool realloc);
> +extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
> +extern int irq_domain_activate_irq(struct irq_data *irq_data);
> +extern int irq_domain_deactivate_irq(struct irq_data *irq_data);
> +
> +static inline int irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +				unsigned int nr_irqs, int node, void *arg)
> +{
> +	return __irq_domain_alloc_irqs(domain, irq_base, nr_irqs, node,
> +				       arg, false);
> +}
> +
> +static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
> +				int irq_base, unsigned int nr_irqs, void *arg)
> +{
> +	if (domain->parent && domain->parent->ops->alloc)
> +		return domain->parent->ops->alloc(domain->parent, irq_base,
> +						  nr_irqs, arg);
> +	return -ENOSYS;
> +}
> +
> +static inline void irq_domain_free_irqs_parent(struct irq_domain *domain,
> +					int irq_base, unsigned int nr_irqs)
> +{
> +	if (domain->parent && domain->parent->ops->free)
> +		domain->parent->ops->free(domain->parent, irq_base, nr_irqs);
> +}
> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +
>   #else /* CONFIG_IRQ_DOMAIN */
>   static inline void irq_dispose_mapping(unsigned int virq) { }
> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
>   #endif /* !CONFIG_IRQ_DOMAIN */
>   
>   #endif /* _LINUX_IRQDOMAIN_H */
> diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
> index d269cecdfbf0..dc1f3d08892e 100644
> --- a/kernel/irq/Kconfig
> +++ b/kernel/irq/Kconfig
> @@ -55,6 +55,9 @@ config GENERIC_IRQ_CHIP
>   config IRQ_DOMAIN
>   	bool
>   
> +config IRQ_DOMAIN_HIERARCHY
> +	bool
> +
>   config IRQ_DOMAIN_DEBUG
>   	bool "Expose hardware/virtual IRQ mapping via debugfs"
>   	depends on IRQ_DOMAIN && DEBUG_FS
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 6223fab9a9d2..46bd5e2190c3 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -15,6 +15,7 @@
>   #include <linux/module.h>
>   #include <linux/interrupt.h>
>   #include <linux/kernel_stat.h>
> +#include <linux/irqdomain.h>
>   
>   #include <trace/events/irq.h>
>   
> @@ -178,6 +179,7 @@ int irq_startup(struct irq_desc *desc, bool resend)
>   	irq_state_clr_disabled(desc);
>   	desc->depth = 0;
>   
> +	irq_domain_activate_irq(&desc->irq_data);
>   	if (desc->irq_data.chip->irq_startup) {
>   		ret = desc->irq_data.chip->irq_startup(&desc->irq_data);
>   		irq_state_clr_masked(desc);
> @@ -199,6 +201,7 @@ void irq_shutdown(struct irq_desc *desc)
>   		desc->irq_data.chip->irq_disable(&desc->irq_data);
>   	else
>   		desc->irq_data.chip->irq_mask(&desc->irq_data);
> +	irq_domain_deactivate_irq(&desc->irq_data);
>   	irq_state_set_masked(desc);
>   }
>   
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 6534ff6ce02e..e285f3abc595 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -23,6 +23,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
>   static DEFINE_MUTEX(revmap_trees_mutex);
>   static struct irq_domain *irq_default_domain;
>   
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node);
> +
>   /**
>    * __irq_domain_add() - Allocate a new irq_domain data structure
>    * @of_node: optional device-tree node of the interrupt controller
> @@ -30,7 +33,7 @@ static struct irq_domain *irq_default_domain;
>    * @hwirq_max: Maximum number of interrupts supported by controller
>    * @direct_max: Maximum value of direct maps; Use ~0 for no limit; 0 for no
>    *              direct mapping
> - * @ops: map/unmap domain callbacks
> + * @ops: domain callbacks
>    * @host_data: Controller private data pointer
>    *
>    * Allocates and initialize and irq_domain structure.
> @@ -109,7 +112,7 @@ EXPORT_SYMBOL_GPL(irq_domain_remove);
>    * @first_irq: first number of irq block assigned to the domain,
>    *	pass zero to assign irqs on-the-fly. If first_irq is non-zero, then
>    *	pre-map all of the irqs in the domain to virqs starting at first_irq.
> - * @ops: map/unmap domain callbacks
> + * @ops: domain callbacks
>    * @host_data: Controller private data pointer
>    *
>    * Allocates an irq_domain, and optionally if first_irq is positive then also
> @@ -174,10 +177,8 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
>   
>   	domain = __irq_domain_add(of_node, first_hwirq + size,
>   				  first_hwirq + size, 0, ops, host_data);
> -	if (!domain)
> -		return NULL;
> -
> -	irq_domain_associate_many(domain, first_irq, first_hwirq, size);
> +	if (domain)
> +		irq_domain_associate_many(domain, first_irq, first_hwirq, size);
>   
>   	return domain;
>   }
> @@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
>   unsigned int irq_create_mapping(struct irq_domain *domain,
>   				irq_hw_number_t hwirq)
>   {
> -	unsigned int hint;
>   	int virq;
>   
>   	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
> @@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>   	}
>   
>   	/* Allocate a virtual interrupt number */
> -	hint = hwirq % nr_irqs;
> -	if (hint == 0)
> -		hint++;
> -	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
> -	if (virq <= 0)
> -		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
> +	virq = irq_domain_alloc_descs(-1, 1, hwirq,
> +				      of_node_to_nid(domain->of_node));
>   	if (virq <= 0) {
>   		pr_debug("-> virq allocation failed\n");
>   		return 0;
> @@ -490,7 +486,13 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
>   	}
>   
>   	/* Create mapping */
> -	virq = irq_create_mapping(domain, hwirq);
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	if (domain->ops->alloc)
> +		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
> +					     irq_data);
> +	else
> +#endif
> +		virq = irq_create_mapping(domain, hwirq);
>   	if (!virq)
>   		return virq;
>   
> @@ -540,7 +542,11 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
>   		return 0;
>   
>   	if (hwirq < domain->revmap_direct_max_irq) {
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +		data = irq_domain_get_irq_data(domain, hwirq);
> +#else
>   		data = irq_get_irq_data(hwirq);
> +#endif
>   		if (data && (data->domain == domain) && (data->hwirq == hwirq))
>   			return hwirq;
>   	}
> @@ -709,3 +715,318 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>   	.xlate = irq_domain_xlate_onetwocell,
>   };
>   EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
> +
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node)
> +{
> +	unsigned int hint;
> +
> +	if (virq >= 0) {
> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
> +	} else {
> +		hint = hwirq % nr_irqs;
> +		if (hint == 0)
> +			hint++;
> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
> +		if (virq <= 0 && hint > 1)
> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
> +	}
> +
> +	return virq;
> +}
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_free_desc(virq + i);
> +}
> +
> +static void irq_domain_insert_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = virq;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +
> +		/* If not already assigned, give the domain the chip's name */
> +		if (!domain->name && data->chip)
> +			domain->name = data->chip->name;
> +	}
> +
> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
> +}
> +
> +static void irq_domain_remove_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
> +	irq_set_chip_and_handler(virq, NULL, NULL);
> +	synchronize_irq(virq);
> +	smp_mb();
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = 0;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_delete(&domain->revmap_tree, hwirq);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +	}
> +}
> +
> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
> +						   struct irq_data *child)
> +{
> +	struct irq_data *irq_data;
> +
> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
> +	if (irq_data) {
> +		child->parent_data = irq_data;
> +		irq_data->irq = child->irq;
> +		irq_data->node = child->node;
> +		irq_data->domain = domain;
> +	}
> +
> +	return irq_data;
> +}
> +
> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data, *tmp;
> +
> +	for (i = 0; i < nr_irqs; i++) {

> +		irq_data = irq_get_irq_data(virq + i);
> +		tmp = irq_data->parent_data;

Why is there no NULL check on irq_data here?

> +		irq_data->parent_data = NULL;
> +		irq_data->domain = NULL;
> +
> +		while (tmp) {
> +			irq_data = tmp;
> +			tmp = tmp->parent_data;
> +			kfree(irq_data);
> +		}
> +	}
> +}
> +
> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
> +				     unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data;
> +	struct irq_domain *parent;
> +
> +	/* The outermost irq_data is embedded in struct irq_desc */
> +	for (i = 0; i < nr_irqs; i++) {


> +		irq_data = irq_get_irq_data(virq + i);
> +		irq_data->domain = domain;

Same here: irq_data is not checked for NULL.

Thanks,
Yasuaki Ishimatsu

> +
> +		for (parent = domain->parent; parent; parent = parent->parent) {
> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
> +			if (!irq_data) {
> +				irq_domain_free_irq_data(virq, i + 1);
> +				return -ENOMEM;
> +			}
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
> + * @domain: domain to match
> + * @virq: IRQ number to get irq_data
> + */
> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +					 unsigned int virq)
> +{
> +	struct irq_data *irq_data;
> +
> +	for (irq_data = irq_get_irq_data(virq); irq_data;
> +	     irq_data = irq_data->parent_data)
> +		if (irq_data->domain == domain)
> +			return irq_data;
> +
> +	return NULL;
> +}
> +
> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
> +				  void *chip_data)
> +{
> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
> +
> +	if (!irq_data)
> +		return -ENOENT;
> +
> +	irq_data->hwirq = hwirq;
> +	irq_data->chip = chip;
> +	irq_data->chip_data = chip_data;
> +
> +	return 0;
> +}
> +
> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
> +{
> +	irq_data->hwirq = 0;
> +	irq_data->chip = NULL;
> +	irq_data->chip_data = NULL;
> +}
> +
> +/**
> + * __irq_domain_alloc_irqs - Allocate IRQs from domain
> + * @domain: domain to allocate from
> + * @irq_base: allocate specified IRQ number if irq_base >= 0
> + * @nr_irqs: number of IRQs to allocate
> + * @node: NUMA node id for memory allocation
> + * @arg: domain specific argument
> + * @realloc: IRQ descriptors have already been allocated if true
> + *
> + * Allocate IRQ numbers and initialize all data structures to support
> + * hierarchy IRQ domains.
> + * Parameter @realloc is mainly to support legacy IRQs.
> + * Returns error code or allocated IRQ number
> + */
> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +			    unsigned int nr_irqs, int node, void *arg,
> +			    bool realloc)
> +{
> +	int i, ret, virq;
> +
> +	if (domain == NULL) {
> +		domain = irq_default_domain;
> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
> +			return -EINVAL;
> +	}
> +
> +	if (!domain->ops->alloc) {
> +		pr_debug("domain->ops->alloc() is NULL\n");
> +		return -ENOSYS;
> +	}
> +
> +	if (realloc && irq_base >= 0) {
> +		virq =  irq_base;
> +	} else {
> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
> +		if (virq < 0) {
> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
> +				 irq_base, nr_irqs);
> +			return virq;
> +		}
> +	}
> +
> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
> +		ret = -ENOMEM;
> +		goto out_free_desc;
> +	}
> +
> +	mutex_lock(&irq_domain_mutex);
> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
> +	if (ret < 0) {
> +		mutex_unlock(&irq_domain_mutex);
> +		goto out_free_irq_data;
> +	}
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_insert_irq(virq + i);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	return virq;
> +
> +out_free_irq_data:
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +out_free_desc:
> +	irq_domain_free_descs(virq, nr_irqs);
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_free_irqs - Free IRQ number and associated data structures
> + * @virq: base IRQ number
> + * @nr_irqs: number of IRQs to free
> + */
> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *data = irq_get_irq_data(virq);
> +
> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
> +		 "NULL pointer, cannot free irq\n"))
> +		return;
> +
> +	mutex_lock(&irq_domain_mutex);
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_remove_irq(virq + i);
> +	data->domain->ops->free(data->domain, virq, nr_irqs);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +	irq_domain_free_descs(virq, nr_irqs);
> +}
> +
> +/**
> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
> + *			     interrupt
> + * @irq_data: outermost irq_data associated with interrupt
> + *
> + * It calls domain_ops->activate to program interrupt controllers, so the
> + * interrupt can actually be delivered.
> + */
> +int irq_domain_activate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (irq_data->parent_data)
> +			ret = irq_domain_activate_irq(irq_data->parent_data);
> +		if (ret == 0 && domain->ops->activate)
> +			ret = domain->ops->activate(domain, irq_data);
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
> + *			       deactivate interrupt
> + * @irq_data: outermost irq_data associated with interrupt
> + *
> + * It calls domain_ops->deactivate to program interrupt controllers to disable
> + * interrupt delivery.
> + */
> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (domain->ops->deactivate)
> +			ret = domain->ops->deactivate(domain, irq_data);
> +		if (ret == 0 && irq_data->parent_data)
> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
> +	}
> +
> +	return ret;
> +}
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> 
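
To make the two NULL questions above concrete, here is a minimal userspace
sketch of the free path with a guard added. It is an illustration under
stated assumptions, not a proposed patch: model_descs, model_get_irq_data()
and the other model_* names are hypothetical stand-ins, and the lookup only
mimics the fact that irq_get_irq_data() returns NULL when no descriptor
exists for the given number. Whether the real callers can ever reach that
case is for the patch author to confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for struct irq_data. */
struct model_irq_data {
	struct model_irq_data *parent_data;
	const void *domain;
};

#define MODEL_NR_IRQS 4

/* Stand-in for the descriptor table; a NULL slot means "no descriptor". */
static struct model_irq_data *model_descs[MODEL_NR_IRQS];

/* Mimics irq_get_irq_data(): NULL when the IRQ has no descriptor. */
static struct model_irq_data *model_get_irq_data(unsigned int virq)
{
	return virq < MODEL_NR_IRQS ? model_descs[virq] : NULL;
}

/*
 * Same walk as irq_domain_free_irq_data() in the patch, plus a NULL guard.
 * Returns the number of parent levels freed so the result can be checked.
 */
static unsigned int model_free_irq_data(unsigned int virq,
					unsigned int nr_irqs)
{
	unsigned int i, freed = 0;

	for (i = 0; i < nr_irqs; i++) {
		struct model_irq_data *irq_data = model_get_irq_data(virq + i);
		struct model_irq_data *tmp;

		if (!irq_data)
			continue;	/* nothing to tear down for this IRQ */

		tmp = irq_data->parent_data;
		irq_data->parent_data = NULL;
		irq_data->domain = NULL;

		/* The outermost level is embedded; only parents are freed. */
		while (tmp) {
			irq_data = tmp;
			tmp = tmp->parent_data;
			free(irq_data);
			freed++;
		}
	}

	return freed;
}

/* One populated slot with two parent levels; the other slots stay NULL. */
static unsigned int model_free_demo(void)
{
	static struct model_irq_data outer;
	struct model_irq_data *p1 = calloc(1, sizeof(*p1));
	struct model_irq_data *p2 = calloc(1, sizeof(*p2));

	assert(p1 && p2);
	p1->parent_data = p2;
	outer.parent_data = p1;
	model_descs[1] = &outer;

	return model_free_irq_data(0, MODEL_NR_IRQS);
}
```

In this model the empty slots are skipped instead of dereferenced, and a
second call over the same range frees nothing because parent_data has
already been cleared.
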




* [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
@ 2014-09-24  6:55     ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 110+ messages in thread
From: Yasuaki Ishimatsu @ 2014-09-24  6:55 UTC (permalink / raw)
  To: linux-arm-kernel

(2014/09/11 23:03), Jiang Liu wrote:
> We plan to use hierarchy irqdomain to support CPU vector assignment,
> interrupt remapping controller, IO-APIC controller, MSI interrupt
> and hypertransport interrupt etc on x86 platforms. So extend irqdomain
> interfaces to support hierarchy irqdomain.
> 
> There are already many clients of current irqdomain interfaces.
> To minimize the changes, we choose to introduce new version 2 interfaces
> to support hierarchy instead of extending existing irqdomain interfaces.
> 
> According to Thomas's suggestion, the most important design decision is
> to build hierarchy struct irq_data to support hierarchy irqdomain, so
> hierarchy irqdomain related data could be saved in struct irq_data.
> With support of hierarchy irq_data, we could also support stacked
> irq_chips. This is most useful in case of set_affinity().
> 
> The new hierarchy irqdomain introduces following interfaces:
> 1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQ
>     and related resources.
> 2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
> 3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
>     interrupt controllers to activate/deactivate interrupt.
> 
> There are also several helper functions to ease irqdomain implementations:
> 1) irq_domain_get_irq_data(): get irq_data associated with a specific
>     irqdomain.
> 2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
>     irq_data.
> 3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
>     parent irqdomain's alloc/free callbacks.
> 
> We also changed irq_startup()/irq_shutdown() to invoke
> irq_domain_activate_irq()/irq_domain_deactivate_irq() to program the
> interrupt controllers when interrupts are started or stopped.
> 
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> ---
>   include/linux/irq.h       |    3 +
>   include/linux/irqdomain.h |   60 ++++++++
>   kernel/irq/Kconfig        |    3 +
>   kernel/irq/chip.c         |    3 +
>   kernel/irq/irqdomain.c    |  349 +++++++++++++++++++++++++++++++++++++++++++--
>   5 files changed, 404 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 62af59242ddc..4b74565690ce 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -151,6 +151,9 @@ struct irq_data {
>   	unsigned int		state_use_accessors;
>   	struct irq_chip		*chip;
>   	struct irq_domain	*domain;
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	struct irq_data		*parent_data;
> +#endif
>   	void			*handler_data;
>   	void			*chip_data;
>   	struct msi_desc		*msi_desc;
> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index b0f9d16e48f6..a9ddc8534c63 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -38,6 +38,7 @@
>   struct device_node;
>   struct irq_domain;
>   struct of_device_id;
> +struct irq_chip;
>   
>   /* Number of irqs reserved for a legacy isa controller */
>   #define NUM_ISA_INTERRUPTS	16
> @@ -64,6 +65,16 @@ struct irq_domain_ops {
>   	int (*xlate)(struct irq_domain *d, struct device_node *node,
>   		     const u32 *intspec, unsigned int intsize,
>   		     unsigned long *out_hwirq, unsigned int *out_type);
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	/* extended V2 interfaces to support hierarchy irqdomains */
> +	int (*alloc)(struct irq_domain *d, unsigned int virq,
> +		     unsigned int nr_irqs, void *arg);
> +	void (*free)(struct irq_domain *d, unsigned int virq,
> +		     unsigned int nr_irqs);
> +	int (*activate)(struct irq_domain *d, struct irq_data *irq_data);
> +	int (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
> +#endif
>   };
>   
>   extern struct irq_domain_ops irq_generic_chip_ops;
> @@ -101,6 +112,9 @@ struct irq_domain {
>   	/* Optional data */
>   	struct device_node *of_node;
>   	struct irq_domain_chip_generic *gc;
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	struct irq_domain *parent;
> +#endif
>   
>   	/* reverse map data. The linear map gets appended to the irq_domain */
>   	irq_hw_number_t hwirq_max;
> @@ -220,8 +234,54 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d, struct device_node *ctrlr,
>   			const u32 *intspec, unsigned int intsize,
>   			irq_hw_number_t *out_hwirq, unsigned int *out_type);
>   
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +/* V2 interfaces to support hierarchy IRQ domains. */
> +extern struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +						unsigned int virq);
> +extern int irq_domain_set_hwirq_and_chip(struct irq_domain *domain,
> +					 unsigned int virq,
> +					 irq_hw_number_t hwirq,
> +					 struct irq_chip *chip,
> +					 void *chip_data);
> +extern void irq_domain_reset_irq_data(struct irq_data *irq_data);
> +extern int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +				   unsigned int nr_irqs, int node, void *arg,
> +				   bool realloc);
> +extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
> +extern int irq_domain_activate_irq(struct irq_data *irq_data);
> +extern int irq_domain_deactivate_irq(struct irq_data *irq_data);
> +
> +static inline int irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +				unsigned int nr_irqs, int node, void *arg)
> +{
> +	return __irq_domain_alloc_irqs(domain, irq_base, nr_irqs, node,
> +				       arg, false);
> +}
> +
> +static inline int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
> +				int irq_base, unsigned int nr_irqs, void *arg)
> +{
> +	if (domain->parent && domain->parent->ops->alloc)
> +		return domain->parent->ops->alloc(domain->parent, irq_base,
> +						  nr_irqs, arg);
> +	return -ENOSYS;
> +}
> +
> +static inline void irq_domain_free_irqs_parent(struct irq_domain *domain,
> +					int irq_base, unsigned int nr_irqs)
> +{
> +	if (domain->parent && domain->parent->ops->free)
> +		domain->parent->ops->free(domain->parent, irq_base, nr_irqs);
> +}
> +#else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> +
>   #else /* CONFIG_IRQ_DOMAIN */
>   static inline void irq_dispose_mapping(unsigned int virq) { }
> +static inline int irq_domain_activate_irq(struct irq_data *data) { return 0; }
> +static inline int irq_domain_deactivate_irq(struct irq_data *data) { return 0; }
>   #endif /* !CONFIG_IRQ_DOMAIN */
>   
>   #endif /* _LINUX_IRQDOMAIN_H */
> diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
> index d269cecdfbf0..dc1f3d08892e 100644
> --- a/kernel/irq/Kconfig
> +++ b/kernel/irq/Kconfig
> @@ -55,6 +55,9 @@ config GENERIC_IRQ_CHIP
>   config IRQ_DOMAIN
>   	bool
>   
> +config IRQ_DOMAIN_HIERARCHY
> +	bool
> +
>   config IRQ_DOMAIN_DEBUG
>   	bool "Expose hardware/virtual IRQ mapping via debugfs"
>   	depends on IRQ_DOMAIN && DEBUG_FS
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 6223fab9a9d2..46bd5e2190c3 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -15,6 +15,7 @@
>   #include <linux/module.h>
>   #include <linux/interrupt.h>
>   #include <linux/kernel_stat.h>
> +#include <linux/irqdomain.h>
>   
>   #include <trace/events/irq.h>
>   
> @@ -178,6 +179,7 @@ int irq_startup(struct irq_desc *desc, bool resend)
>   	irq_state_clr_disabled(desc);
>   	desc->depth = 0;
>   
> +	irq_domain_activate_irq(&desc->irq_data);
>   	if (desc->irq_data.chip->irq_startup) {
>   		ret = desc->irq_data.chip->irq_startup(&desc->irq_data);
>   		irq_state_clr_masked(desc);
> @@ -199,6 +201,7 @@ void irq_shutdown(struct irq_desc *desc)
>   		desc->irq_data.chip->irq_disable(&desc->irq_data);
>   	else
>   		desc->irq_data.chip->irq_mask(&desc->irq_data);
> +	irq_domain_deactivate_irq(&desc->irq_data);
>   	irq_state_set_masked(desc);
>   }
>   
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 6534ff6ce02e..e285f3abc595 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -23,6 +23,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
>   static DEFINE_MUTEX(revmap_trees_mutex);
>   static struct irq_domain *irq_default_domain;
>   
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node);
> +
>   /**
>    * __irq_domain_add() - Allocate a new irq_domain data structure
>    * @of_node: optional device-tree node of the interrupt controller
> @@ -30,7 +33,7 @@ static struct irq_domain *irq_default_domain;
>    * @hwirq_max: Maximum number of interrupts supported by controller
>    * @direct_max: Maximum value of direct maps; Use ~0 for no limit; 0 for no
>    *              direct mapping
> - * @ops: map/unmap domain callbacks
> + * @ops: domain callbacks
>    * @host_data: Controller private data pointer
>    *
>    * Allocates and initialize and irq_domain structure.
> @@ -109,7 +112,7 @@ EXPORT_SYMBOL_GPL(irq_domain_remove);
>    * @first_irq: first number of irq block assigned to the domain,
>    *	pass zero to assign irqs on-the-fly. If first_irq is non-zero, then
>    *	pre-map all of the irqs in the domain to virqs starting at first_irq.
> - * @ops: map/unmap domain callbacks
> + * @ops: domain callbacks
>    * @host_data: Controller private data pointer
>    *
>    * Allocates an irq_domain, and optionally if first_irq is positive then also
> @@ -174,10 +177,8 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
>   
>   	domain = __irq_domain_add(of_node, first_hwirq + size,
>   				  first_hwirq + size, 0, ops, host_data);
> -	if (!domain)
> -		return NULL;
> -
> -	irq_domain_associate_many(domain, first_irq, first_hwirq, size);
> +	if (domain)
> +		irq_domain_associate_many(domain, first_irq, first_hwirq, size);
>   
>   	return domain;
>   }
> @@ -388,7 +389,6 @@ EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
>   unsigned int irq_create_mapping(struct irq_domain *domain,
>   				irq_hw_number_t hwirq)
>   {
> -	unsigned int hint;
>   	int virq;
>   
>   	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);
> @@ -410,12 +410,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
>   	}
>   
>   	/* Allocate a virtual interrupt number */
> -	hint = hwirq % nr_irqs;
> -	if (hint == 0)
> -		hint++;
> -	virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
> -	if (virq <= 0)
> -		virq = irq_alloc_desc_from(1, of_node_to_nid(domain->of_node));
> +	virq = irq_domain_alloc_descs(-1, 1, hwirq,
> +				      of_node_to_nid(domain->of_node));
>   	if (virq <= 0) {
>   		pr_debug("-> virq allocation failed\n");
>   		return 0;
> @@ -490,7 +486,13 @@ unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
>   	}
>   
>   	/* Create mapping */
> -	virq = irq_create_mapping(domain, hwirq);
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +	if (domain->ops->alloc)
> +		virq = irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
> +					     irq_data);
> +	else
> +#endif
> +		virq = irq_create_mapping(domain, hwirq);
>   	if (!virq)
>   		return virq;
>   
> @@ -540,7 +542,11 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
>   		return 0;
>   
>   	if (hwirq < domain->revmap_direct_max_irq) {
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +		data = irq_domain_get_irq_data(domain, hwirq);
> +#else
>   		data = irq_get_irq_data(hwirq);
> +#endif
>   		if (data && (data->domain == domain) && (data->hwirq == hwirq))
>   			return hwirq;
>   	}
> @@ -709,3 +715,318 @@ const struct irq_domain_ops irq_domain_simple_ops = {
>   	.xlate = irq_domain_xlate_onetwocell,
>   };
>   EXPORT_SYMBOL_GPL(irq_domain_simple_ops);
> +
> +static int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
> +				  irq_hw_number_t hwirq, int node)
> +{
> +	unsigned int hint;
> +
> +	if (virq >= 0) {
> +		virq = irq_alloc_descs(virq, virq, nr_irqs, node);
> +	} else {
> +		hint = hwirq % nr_irqs;
> +		if (hint == 0)
> +			hint++;
> +		virq = irq_alloc_descs_from(hint, nr_irqs, node);
> +		if (virq <= 0 && hint > 1)
> +			virq = irq_alloc_descs_from(1, nr_irqs, node);
> +	}
> +
> +	return virq;
> +}
> +
> +#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> +static void irq_domain_free_descs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_free_desc(virq + i);
> +}
> +
> +static void irq_domain_insert_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = virq;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_insert(&domain->revmap_tree, hwirq, data);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +
> +		/* If not already assigned, give the domain the chip's name */
> +		if (!domain->name && data->chip)
> +			domain->name = data->chip->name;
> +	}
> +
> +	irq_clear_status_flags(virq, IRQ_NOREQUEST);
> +}
> +
> +static void irq_domain_remove_irq(int virq)
> +{
> +	struct irq_data *data;
> +
> +	irq_set_status_flags(virq, IRQ_NOREQUEST);
> +	irq_set_chip_and_handler(virq, NULL, NULL);
> +	synchronize_irq(virq);
> +	smp_mb();
> +
> +	for (data = irq_get_irq_data(virq); data; data = data->parent_data) {
> +		struct irq_domain *domain = data->domain;
> +		irq_hw_number_t hwirq = data->hwirq;
> +
> +		if (hwirq < domain->revmap_size) {
> +			domain->linear_revmap[hwirq] = 0;
> +		} else {
> +			mutex_lock(&revmap_trees_mutex);
> +			radix_tree_delete(&domain->revmap_tree, hwirq);
> +			mutex_unlock(&revmap_trees_mutex);
> +		}
> +	}
> +}
> +
> +static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
> +						   struct irq_data *child)
> +{
> +	struct irq_data *irq_data;
> +
> +	irq_data = kzalloc_node(sizeof(*irq_data), GFP_KERNEL, child->node);
> +	if (irq_data) {
> +		child->parent_data = irq_data;
> +		irq_data->irq = child->irq;
> +		irq_data->node = child->node;
> +		irq_data->domain = domain;
> +	}
> +
> +	return irq_data;
> +}
> +
> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data, *tmp;
> +
> +	for (i = 0; i < nr_irqs; i++) {

> +		irq_data = irq_get_irq_data(virq + i);
> +		tmp = irq_data->parent_data;

Why is there no NULL check on irq_data here?

> +		irq_data->parent_data = NULL;
> +		irq_data->domain = NULL;
> +
> +		while (tmp) {
> +			irq_data = tmp;
> +			tmp = tmp->parent_data;
> +			kfree(irq_data);
> +		}
> +	}
> +}
> +
> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
> +				     unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *irq_data;
> +	struct irq_domain *parent;
> +
> +	/* The outmost irq_data is embedded in struct irq_desc */
> +	for (i = 0; i < nr_irqs; i++) {


> +		irq_data = irq_get_irq_data(virq + i);
> +		irq_data->domain = domain;

ditto.

Thanks,
Yasuaki Ishimatsu

> +
> +		for (parent = domain->parent; parent; parent = parent->parent) {
> +			irq_data = irq_domain_insert_irq_data(parent, irq_data);
> +			if (!irq_data) {
> +				irq_domain_free_irq_data(virq, i + 1);
> +				return -ENOMEM;
> +			}
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
> + * @domain: domain to match
> + * @virq: IRQ number to get irq_data
> + */
> +struct irq_data *irq_domain_get_irq_data(struct irq_domain *domain,
> +					 unsigned int virq)
> +{
> +	struct irq_data *irq_data;
> +
> +	for (irq_data = irq_get_irq_data(virq); irq_data;
> +	     irq_data = irq_data->parent_data)
> +		if (irq_data->domain == domain)
> +			return irq_data;
> +
> +	return NULL;
> +}
> +
> +int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
> +				  irq_hw_number_t hwirq, struct irq_chip *chip,
> +				  void *chip_data)
> +{
> +	struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);
> +
> +	if (!irq_data)
> +		return -ENOENT;
> +
> +	irq_data->hwirq = hwirq;
> +	irq_data->chip = chip;
> +	irq_data->chip_data = chip_data;
> +
> +	return 0;
> +}
> +
> +void irq_domain_reset_irq_data(struct irq_data *irq_data)
> +{
> +	irq_data->hwirq = 0;
> +	irq_data->chip = NULL;
> +	irq_data->chip_data = NULL;
> +}
> +
> +/**
> + * irq_domain_alloc_irqs - Allocate IRQs from domain
> + * @domain: domain to allocate from
> + * @irq_base: allocate specified IRQ number if irq_base >= 0
> + * @nr_irqs: number of IRQs to allocate
> + * @node: NUMA node id for memory allocation
> + * @arg: domain specific argument
> + * @realloc: IRQ descriptors have already been allocated if true
> + *
> + * Allocate IRQ numbers and initialize all data structures to support
> + * hierarchy IRQ domains.
> + * Parameter @realloc is mainly to support legacy IRQs.
> + * Returns error code or allocated IRQ number
> + */
> +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
> +			    unsigned int nr_irqs, int node, void *arg,
> +			    bool realloc)
> +{
> +	int i, ret, virq;
> +
> +	if (domain == NULL) {
> +		domain = irq_default_domain;
> +		if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n"))
> +			return -EINVAL;
> +	}
> +
> +	if (!domain->ops->alloc) {
> +		pr_debug("domain->ops->alloc() is NULL\n");
> +		return -ENOSYS;
> +	}
> +
> +	if (realloc && irq_base >= 0) {
> +		virq =  irq_base;
> +	} else {
> +		virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);
> +		if (virq < 0) {
> +			pr_debug("cannot allocate IRQ(base %d, count %d)\n",
> +				 irq_base, nr_irqs);
> +			return virq;
> +		}
> +	}
> +
> +	if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {
> +		pr_debug("cannot allocate memory for IRQ%d\n", virq);
> +		ret = -ENOMEM;
> +		goto out_free_desc;
> +	}
> +
> +	mutex_lock(&irq_domain_mutex);
> +	ret = domain->ops->alloc(domain, virq, nr_irqs, arg);
> +	if (ret < 0) {
> +		mutex_unlock(&irq_domain_mutex);
> +		goto out_free_irq_data;
> +	}
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_insert_irq(virq + i);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	return virq;
> +
> +out_free_irq_data:
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +out_free_desc:
> +	irq_domain_free_descs(virq, nr_irqs);
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_free_irqs - Free IRQ number and associated data structures
> + * @virq: base IRQ number
> + * @nr_irqs: number of IRQs to free
> + */
> +void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
> +{
> +	int i;
> +	struct irq_data *data = irq_get_irq_data(virq);
> +
> +	if (WARN(!data || !data->domain || !data->domain->ops->free,
> +		 "NULL pointer, cannot free irq\n"))
> +		return;
> +
> +	mutex_lock(&irq_domain_mutex);
> +	for (i = 0; i < nr_irqs; i++)
> +		irq_domain_remove_irq(virq + i);
> +	data->domain->ops->free(data->domain, virq, nr_irqs);
> +	mutex_unlock(&irq_domain_mutex);
> +
> +	irq_domain_free_irq_data(virq, nr_irqs);
> +	irq_domain_free_descs(virq, nr_irqs);
> +}
> +
> +/**
> + * irq_domain_activate_irq - Call domain_ops->activate recursively to activate
> + *			     interrupt
> + * @irq_data: outermost irq_data associated with interrupt
> + *
> + * It calls domain_ops->activate to program interrupt controllers, so the
> + * interrupt can actually be delivered.
> + */
> +int irq_domain_activate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (irq_data->parent_data)
> +			ret = irq_domain_activate_irq(irq_data->parent_data);
> +		if (ret == 0 && domain->ops->activate)
> +			ret = domain->ops->activate(domain, irq_data);
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * irq_domain_deactivate_irq - Call domain_ops->deactivate recursively to
> + *			       deactivate interrupt
> + * @irq_data: outermost irq_data associated with interrupt
> + *
> + * It calls domain_ops->deactivate to program interrupt controllers to disable
> + * interrupt delivery.
> + */
> +int irq_domain_deactivate_irq(struct irq_data *irq_data)
> +{
> +	int ret = 0;
> +
> +	if (irq_data && irq_data->domain) {
> +		struct irq_domain *domain = irq_data->domain;
> +
> +		if (domain->ops->deactivate)
> +			ret = domain->ops->deactivate(domain, irq_data);
> +		if (ret == 0 && irq_data->parent_data)
> +			ret = irq_domain_deactivate_irq(irq_data->parent_data);
> +	}
> +
> +	return ret;
> +}
> +#endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
> 
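
The ordering used by irq_domain_activate_irq() and irq_domain_deactivate_irq()
above can be sketched in user space. This is only a minimal model, assuming a
three-level chain; struct model_irq_data, struct call_log and the model_*
helpers are hypothetical names, not kernel interfaces, and the error
propagation of the real functions is left out. Activation programs the parent
level before the child, and deactivation runs in the opposite order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's struct irq_data. */
#define MODEL_MAX_LOG 8

struct model_irq_data {
	int level;                          /* 0 = outermost level */
	struct model_irq_data *parent_data; /* next level toward the CPU */
};

struct call_log {
	int order[MODEL_MAX_LOG];           /* levels in the order they ran */
	size_t count;
};

static void log_call(struct call_log *log, int level)
{
	assert(log->count < MODEL_MAX_LOG);
	log->order[log->count++] = level;
}

/* Parent first, then this level: same order as irq_domain_activate_irq(). */
static void model_activate(struct model_irq_data *d, struct call_log *log)
{
	if (!d)
		return;
	model_activate(d->parent_data, log);
	log_call(log, d->level);
}

/* This level first, then parent: same order as irq_domain_deactivate_irq(). */
static void model_deactivate(struct model_irq_data *d, struct call_log *log)
{
	if (!d)
		return;
	log_call(log, d->level);
	model_deactivate(d->parent_data, log);
}
```

With a chain of levels 0 -> 1 -> 2, model_activate() on level 0 records
2, 1, 0, and model_deactivate() records 0, 1, 2.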

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains
  2014-09-24  6:55     ` Yasuaki Ishimatsu
@ 2014-09-24  7:23       ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-24  7:23 UTC (permalink / raw)
  To: Yasuaki Ishimatsu, Benjamin Herrenschmidt, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas,
	Randy Dunlap, Yinghai Lu, Borislav Petkov, Grant Likely,
	Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel



On 2014/9/24 14:55, Yasuaki Ishimatsu wrote:
> (2014/09/11 23:03), Jiang Liu wrote:
>> +static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data, *tmp;
>> +
>> +	for (i = 0; i < nr_irqs; i++) {
> 
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		tmp = irq_data->parent_data;
> 
> Why don't you check for NULL here?
Yeah, there's an explicit assumption that irq_get_irq_data()
always returns a valid pointer once we have allocated the irq number
and the associated irq_desc. If preferred, I will add a check here.
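
For illustration only, a defensive variant of the free loop could look like
the user-space sketch below. It assumes a lookup that may return NULL;
model_get_irq_data(), model_descs, model_add_parent() and struct
model_irq_data are hypothetical stand-ins for irq_get_irq_data() and the
kernel structures, and the return value (a count of freed parent levels) is
added only so the behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct irq_data. */
struct model_irq_data {
	struct model_irq_data *parent_data;
	void *domain;
};

#define MODEL_NR_IRQS 4
/* Outermost irq_data per IRQ number; NULL means "no descriptor". */
static struct model_irq_data *model_descs[MODEL_NR_IRQS];

/* Stand-in for irq_get_irq_data(); unlike the assumption above, may be NULL. */
static struct model_irq_data *model_get_irq_data(unsigned int virq)
{
	return virq < MODEL_NR_IRQS ? model_descs[virq] : NULL;
}

/* Attach one heap-allocated parent level below @child. */
static struct model_irq_data *model_add_parent(struct model_irq_data *child)
{
	struct model_irq_data *parent;

	assert(child != NULL);
	parent = calloc(1, sizeof(*parent));
	if (parent)
		child->parent_data = parent;
	return parent;
}

/* Free all parent levels of each IRQ, skipping IRQs without irq_data. */
static unsigned int model_free_irq_data(unsigned int virq, unsigned int nr_irqs)
{
	unsigned int i, freed = 0;

	for (i = 0; i < nr_irqs; i++) {
		struct model_irq_data *irq_data = model_get_irq_data(virq + i);
		struct model_irq_data *tmp;

		if (!irq_data)	/* the check asked for in the review */
			continue;

		tmp = irq_data->parent_data;
		irq_data->parent_data = NULL;
		irq_data->domain = NULL;

		while (tmp) {
			irq_data = tmp;
			tmp = tmp->parent_data;
			free(irq_data);
			freed++;
		}
	}

	return freed;
}
```

Whether the real code should skip silently or WARN in this case is a separate
question; the sketch only shows where the check would sit.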

> 
>> +		irq_data->parent_data = NULL;
>> +		irq_data->domain = NULL;
>> +
>> +		while (tmp) {
>> +			irq_data = tmp;
>> +			tmp = tmp->parent_data;
>> +			kfree(irq_data);
>> +		}
>> +	}
>> +}
>> +
>> +static int irq_domain_alloc_irq_data(struct irq_domain *domain,
>> +				     unsigned int virq, unsigned int nr_irqs)
>> +{
>> +	int i;
>> +	struct irq_data *irq_data;
>> +	struct irq_domain *parent;
>> +
>> +	/* The outmost irq_data is embedded in struct irq_desc */
>> +	for (i = 0; i < nr_irqs; i++) {
> 
> 
>> +		irq_data = irq_get_irq_data(virq + i);
>> +		irq_data->domain = domain;
> 
> ditto.
Same as above.
Regards!
Gerry

> 
> Thanks,
> Yasuaki Ishimatsu
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms
  2014-09-11 14:03 ` Jiang Liu
  (?)
@ 2014-09-24  7:59   ` Yasuaki Ishimatsu
  -1 siblings, 0 replies; 110+ messages in thread
From: Yasuaki Ishimatsu @ 2014-09-24  7:59 UTC (permalink / raw)
  To: Jiang Liu, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

(2014/09/11 23:03), Jiang Liu wrote:
> We plan to restructure x86 interrupt code based on hierarchy irqdomain,
> that is to build irqdomains for CPU vector, interrupt remapping unit,
> IOAPIC, MSI and HPET etc and organize those irqdomains in hierarchy mode.
> Each irqdomain manages corresponding interrupt controller and talks to
> parent interrupt controller through public irqdomain interfaces. We also
> support stacked irq_chip based on hierarchy irqdomain. It will make the
> x86 interrupt architecture much more clear and more easy to maintain
> with hierarchy irqdomain and stacked irq_chip. It may also help ARM
> interrupt management architecture too.

Do you have documentation with more detailed information?
I'm interested in this feature and would like to understand it in more detail.

From this description alone, I cannot see why the feature makes the x86 irq
architecture much clearer and easier to maintain. Of course, I have read the
following thread:

https://lkml.org/lkml/2014/9/11/101

Thanks,
Yasuaki Ishimatsu

> 
> This is the second patch set to enable support of hierarchy irqdomain
> on x86 platforms. It depends on the first part at:
> https://lkml.org/lkml/2014/9/11/101
> And you may access it at:
> https://github.com/jiangliu/linux.git irqdomain/p2v1
> 
> And there will be a third patch set to convert IOAPIC driver to support
> hierarchy irqdomain and clean up code.
> 
> The first patch extends irqdomain interfaces to support hierarchy
> irqdomain. Hope this interface could be used by other architectures too,
> such as ARM/ARM64.
> The second patch introduces two helper functions to support stacked
> irq_chip.
> Patch 3-9 implements an irqdomain to manage CPU interrupt vectors, and
> it's the root irqdomain for x86 platforms.
> Patch 10-13 converts Intel and AMD interrupt remapping drivers to
> support hierarchy irqdomain.
> Patch 14-17 converts HPET, MSI and HT_IRQ drivers to support hierarchy
> irqdomain.
> Patch 18-21 cleans up unused code in x86 arch and interrupt remapping
> drivers.
> 
> We have tested this patchset on Intel 32-bit and 64-bit systems. But we
> have only done compilation tests for HT_IRQ and AMD interrupt remapping
> drivers due to hardware resource limitation. Tests on AMD platforms are
> warmly welcomed!
> 
> Jiang Liu (21):
>    irqdomain: Introduce new interfaces to support hierarchy irqdomains
>    genirq: Introduce helper functions to support stacked irq_chip
>    x86, irq: Save destination CPU ID in irq_cfg
>    x86, irq: Use hierarchy irqdomain to manage CPU interrupt vectors
>    x86, hpet: Use new irqdomain interfaces to allocate/free IRQ
>    x86, MSI: Use new irqdomain interfaces to allocate/free IRQ
>    x86, uv: Use new irqdomain interfaces to allocate/free IRQ
>    x86, htirq: Use new irqdomain interfaces to allocate/free IRQ
>    x86, dmar: Use new irqdomain interfaces to allocate/free IRQ
>    x86: irq_remapping: Introduce new interfaces to support hierarchy
>      irqdomain
>    iommu/vt-d: Change prototypes to prepare for enabling hierarchy
>      irqdomain
>    iommu/vt-d: Enhance Intel IR driver to suppport hierarchy irqdomain
>    iommu/amd: Enhance AMD IR driver to suppport hierarchy irqdomain
>    x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
>    x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
>    x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ
>    x86, htirq: Use hierarchy irqdomain to manage Hypertransport
>      interrupts
>    iommu/vt-d: Clean up unused MSI related code
>    iommu/amd: Clean up unused MSI related code
>    x86: irq_remapping: Clean up unused MSI related code
>    x86, irq: Clean up unused MSI related code and interfaces
> 
>   arch/x86/Kconfig                     |    3 +-
>   arch/x86/include/asm/hpet.h          |   16 +-
>   arch/x86/include/asm/hw_irq.h        |   64 +++++
>   arch/x86/include/asm/irq_remapping.h |   66 +++--
>   arch/x86/include/asm/pci.h           |    5 -
>   arch/x86/include/asm/x86_init.h      |    4 -
>   arch/x86/kernel/apic/htirq.c         |  179 +++++++++----
>   arch/x86/kernel/apic/io_apic.c       |    3 -
>   arch/x86/kernel/apic/msi.c           |  430 +++++++++++++++++++++++--------
>   arch/x86/kernel/apic/vector.c        |  158 +++++++++++-
>   arch/x86/kernel/hpet.c               |   57 ++---
>   arch/x86/kernel/x86_init.c           |    2 -
>   arch/x86/platform/uv/uv_irq.c        |   27 +-
>   drivers/iommu/amd_iommu.c            |  385 ++++++++++++++++++++++------
>   drivers/iommu/amd_iommu_init.c       |    4 +
>   drivers/iommu/amd_iommu_proto.h      |    9 +
>   drivers/iommu/amd_iommu_types.h      |    5 +
>   drivers/iommu/intel_irq_remapping.c  |  468 +++++++++++++++++++++++-----------
>   drivers/iommu/irq_remapping.c        |  221 ++++++----------
>   drivers/iommu/irq_remapping.h        |   22 +-
>   drivers/pci/htirq.c                  |   48 +---
>   include/linux/htirq.h                |   22 +-
>   include/linux/intel-iommu.h          |    4 +
>   include/linux/irq.h                  |    8 +
>   include/linux/irqdomain.h            |   60 +++++
>   kernel/irq/Kconfig                   |    3 +
>   kernel/irq/chip.c                    |   21 ++
>   kernel/irq/irqdomain.c               |  349 ++++++++++++++++++++++++-
>   28 files changed, 1934 insertions(+), 709 deletions(-)
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms
  2014-09-24  7:59   ` Yasuaki Ishimatsu
@ 2014-09-24  8:10     ` Jiang Liu
  -1 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-24  8:10 UTC (permalink / raw)
  To: Yasuaki Ishimatsu, Benjamin Herrenschmidt, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas,
	Randy Dunlap, Yinghai Lu, Borislav Petkov, Grant Likely,
	Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel



On 2014/9/24 15:59, Yasuaki Ishimatsu wrote:
> (2014/09/11 23:03), Jiang Liu wrote:
>> We plan to restructure x86 interrupt code based on hierarchy irqdomain,
>> that is to build irqdomains for CPU vector, interrupt remapping unit,
>> IOAPIC, MSI and HPET etc and organize those irqdomains in hierarchy mode.
>> Each irqdomain manages corresponding interrupt controller and talks to
>> parent interrupt controller through public irqdomain interfaces. We also
>> support stacked irq_chip based on hierarchy irqdomain. It will make the
>> x86 interrupt architecture much more clear and more easy to maintain
>> with hierarchy irqdomain and stacked irq_chip. It may also help ARM
>> interrupt management architecture too.
> 
> Do you have documentation with more detailed information?
> I'm interested in this feature and would like to understand it in more detail.
> 
> From this description alone, I cannot see why the feature makes the x86 irq
> architecture much clearer and easier to maintain. Of course, I have read the
> following thread:
> 
> https://lkml.org/lkml/2014/9/11/101
Hi Yasuaki,

Do these help?
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg729924.html
https://lkml.org/lkml/2014/8/1/67
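
As a rough picture of the idea, here is a user-space sketch, assuming a
three-level chain such as MSI -> interrupt remapping -> CPU vector. All names
(struct model_domain, struct model_irq_data, model_get_irq_data(),
model_set_hwirq()) are hypothetical and only mirror the shape of
irq_domain_get_irq_data() and irq_domain_set_hwirq_and_chip() from patch 01.
One IRQ number has one irq_data per level, and each domain finds and updates
only the level it owns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct irq_domain and struct irq_data. */
struct model_domain {
	const char *name;
	struct model_domain *parent;
};

struct model_irq_data {
	unsigned long hwirq;                /* number private to this level */
	struct model_domain *domain;        /* domain owning this level */
	struct model_irq_data *parent_data; /* next level toward the CPU */
};

/* Walk from the outermost level toward the root; return @domain's level. */
static struct model_irq_data *
model_get_irq_data(struct model_irq_data *outermost,
		   const struct model_domain *domain)
{
	struct model_irq_data *d;

	assert(domain != NULL);
	for (d = outermost; d; d = d->parent_data)
		if (d->domain == domain)
			return d;

	return NULL;
}

/* Set the hwirq of the level owned by @domain; 0 on success, -1 if absent. */
static int model_set_hwirq(struct model_irq_data *outermost,
			   const struct model_domain *domain,
			   unsigned long hwirq)
{
	struct model_irq_data *d = model_get_irq_data(outermost, domain);

	if (!d)
		return -1;

	d->hwirq = hwirq;
	return 0;
}
```

In this model the vector level can record its vector number without touching
the MSI or remapping levels, which is the separation the cover letter
describes.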

Regards!
Gerry

> 
> Thanks,
> Yasuaki Ishimatsu
> 
>>
>> This is the second patch set to enable support of hierarchy irqdomain
>> on x86 platforms. It depends on the first part at:
>> https://lkml.org/lkml/2014/9/11/101
>> And you may access it at:
>> https://github.com/jiangliu/linux.git irqdomain/p2v1
>>
>> And there will be a third patch set to convert IOAPIC driver to support
>> hierarchy irqdomain and clean up code.
>>
>> The first patch extends irqdomain interfaces to support hierarchy
>> irqdomain. Hope this interface could be used by other architectures too,
>> such as ARM/ARM64.
>> The second patch introduces two helper functions to support stacked
>> irq_chip.
>> Patch 3-9 implements an irqdomain to manage CPU interrupt vectors, and
>> it's the root irqdomain for x86 platforms.
>> Patch 10-13 converts Intel and AMD interrupt remapping drivers to
>> support hierarchy irqdomain.
>> Patch 14-17 converts HPET, MSI and HT_IRQ drivers to support hierarchy
>> irqdomain.
>> Patch 18-21 cleans up unused code in x86 arch and interrupt remapping
>> drivers.
>>
>> We have tested this patchset on Intel 32-bit and 64-bit systems. But we
>> have only done compilation tests for HT_IRQ and AMD interrupt remapping
>> drivers due to hardware resource limitation. Tests on AMD platforms are
>> warmly welcomed!
>>
>> Jiang Liu (21):
>>    irqdomain: Introduce new interfaces to support hierarchy irqdomains
>>    genirq: Introduce helper functions to support stacked irq_chip
>>    x86, irq: Save destination CPU ID in irq_cfg
>>    x86, irq: Use hierarchy irqdomain to manage CPU interrupt vectors
>>    x86, hpet: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, MSI: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, uv: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, htirq: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, dmar: Use new irqdomain interfaces to allocate/free IRQ
>>    x86: irq_remapping: Introduce new interfaces to support hierarchy
>>      irqdomain
>>    iommu/vt-d: Change prototypes to prepare for enabling hierarchy
>>      irqdomain
>>    iommu/vt-d: Enhance Intel IR driver to suppport hierarchy irqdomain
>>    iommu/amd: Enhance AMD IR driver to suppport hierarchy irqdomain
>>    x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
>>    x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
>>    x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ
>>    x86, htirq: Use hierarchy irqdomain to manage Hypertransport
>>      interrupts
>>    iommu/vt-d: Clean up unused MSI related code
>>    iommu/amd: Clean up unused MSI related code
>>    x86: irq_remapping: Clean up unused MSI related code
>>    x86, irq: Clean up unused MSI related code and interfaces
>>
>>   arch/x86/Kconfig                     |    3 +-
>>   arch/x86/include/asm/hpet.h          |   16 +-
>>   arch/x86/include/asm/hw_irq.h        |   64 +++++
>>   arch/x86/include/asm/irq_remapping.h |   66 +++--
>>   arch/x86/include/asm/pci.h           |    5 -
>>   arch/x86/include/asm/x86_init.h      |    4 -
>>   arch/x86/kernel/apic/htirq.c         |  179 +++++++++----
>>   arch/x86/kernel/apic/io_apic.c       |    3 -
>>   arch/x86/kernel/apic/msi.c           |  430 +++++++++++++++++++++++--------
>>   arch/x86/kernel/apic/vector.c        |  158 +++++++++++-
>>   arch/x86/kernel/hpet.c               |   57 ++---
>>   arch/x86/kernel/x86_init.c           |    2 -
>>   arch/x86/platform/uv/uv_irq.c        |   27 +-
>>   drivers/iommu/amd_iommu.c            |  385 ++++++++++++++++++++++------
>>   drivers/iommu/amd_iommu_init.c       |    4 +
>>   drivers/iommu/amd_iommu_proto.h      |    9 +
>>   drivers/iommu/amd_iommu_types.h      |    5 +
>>   drivers/iommu/intel_irq_remapping.c  |  468 +++++++++++++++++++++++-----------
>>   drivers/iommu/irq_remapping.c        |  221 ++++++----------
>>   drivers/iommu/irq_remapping.h        |   22 +-
>>   drivers/pci/htirq.c                  |   48 +---
>>   include/linux/htirq.h                |   22 +-
>>   include/linux/intel-iommu.h          |    4 +
>>   include/linux/irq.h                  |    8 +
>>   include/linux/irqdomain.h            |   60 +++++
>>   kernel/irq/Kconfig                   |    3 +
>>   kernel/irq/chip.c                    |   21 ++
>>   kernel/irq/irqdomain.c               |  349 ++++++++++++++++++++++++-
>>   28 files changed, 1934 insertions(+), 709 deletions(-)
>>
> 
> 

* [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms
@ 2014-09-24  8:10     ` Jiang Liu
  0 siblings, 0 replies; 110+ messages in thread
From: Jiang Liu @ 2014-09-24  8:10 UTC (permalink / raw)
  To: linux-arm-kernel



On 2014/9/24 15:59, Yasuaki Ishimatsu wrote:
> (2014/09/11 23:03), Jiang Liu wrote:
>> We plan to restructure x86 interrupt code based on hierarchy irqdomain,
>> that is to build irqdomains for CPU vector, interrupt remapping unit,
>> IOAPIC, MSI and HPET etc and organize those irqdomains in hierarchy mode.
>> Each irqdomain manages corresponding interrupt controller and talks to
>> parent interrupt controller through public irqdomain interfaces. We also
>> support stacked irq_chip based on hierarchy irqdomain. It will make the
>> x86 interrupt architecture much more clear and more easy to maintain
>> with hierarchy irqdomain and stacked irq_chip. It may also help ARM
>> interrupt management architecture too.
> 
> Do you have documentation with more detailed information? I'm interested
> in this feature and would like to know more about it.
> 
> I cannot imagine why the feature makes x86 irq architecture much more clear
> and more easy to maintain in this description. Of course, I have read the
> following threads:
> 
> https://lkml.org/lkml/2014/9/11/101
Hi Yasuaki,

Do these help?
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg729924.html
https://lkml.org/lkml/2014/8/1/67

Regards!
Gerry

> 
> Thanks,
> Yasuaki Ishimatsu
> 
>>
>> This is the second patch set to enable support of hierarchy irqdomain
>> on x86 platforms. It depends on the first part at:
>> https://lkml.org/lkml/2014/9/11/101
>> And you may access it at:
>> https://github.com/jiangliu/linux.git irqdomain/p2v1
>>
>> And there will be a third patch set to convert IOAPIC driver to support
>> hierarchy irqdomain and clean up code.
>>
>> The first patch extends irqdomain interfaces to support hierarchy
>> irqdomain. Hope this interface could be used by other architectures too,
>> such as ARM/ARM64.
>> The second patch introduces two helper functions to support stacked
>> irq_chip.
>> Patch 3-9 implements an irqdomain to manage CPU interrupt vectors, and
>> it's the root irqdomain for x86 platforms.
>> Patch 10-13 converts Intel and AMD interrupt remapping drivers to
>> support hierarchy irqdomain.
>> Patch 14-17 converts HPET, MSI and HT_IRQ drivers to support hierarchy
>> irqdomain.
>> Patch 18-21 cleans up unused code in x86 arch and interrupt remapping
>> drivers.
>>
>> We have tested this patchset on Intel 32-bit and 64-bit systems. But we
>> have only done compilation tests for HT_IRQ and AMD interrupt remapping
>> drivers due to hardware resource limitation. Tests on AMD platforms are
>> warmly welcomed!
>>
>> Jiang Liu (21):
>>    irqdomain: Introduce new interfaces to support hierarchy irqdomains
>>    genirq: Introduce helper functions to support stacked irq_chip
>>    x86, irq: Save destination CPU ID in irq_cfg
>>    x86, irq: Use hierarchy irqdomain to manage CPU interrupt vectors
>>    x86, hpet: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, MSI: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, uv: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, htirq: Use new irqdomain interfaces to allocate/free IRQ
>>    x86, dmar: Use new irqdomain interfaces to allocate/free IRQ
>>    x86: irq_remapping: Introduce new interfaces to support hierarchy
>>      irqdomain
>>    iommu/vt-d: Change prototypes to prepare for enabling hierarchy
>>      irqdomain
>>    iommu/vt-d: Enhance Intel IR driver to suppport hierarchy irqdomain
>>    iommu/amd: Enhance AMD IR driver to suppport hierarchy irqdomain
>>    x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
>>    x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
>>    x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ
>>    x86, htirq: Use hierarchy irqdomain to manage Hypertransport
>>      interrupts
>>    iommu/vt-d: Clean up unused MSI related code
>>    iommu/amd: Clean up unused MSI related code
>>    x86: irq_remapping: Clean up unused MSI related code
>>    x86, irq: Clean up unused MSI related code and interfaces
>>
>>   arch/x86/Kconfig                     |    3 +-
>>   arch/x86/include/asm/hpet.h          |   16 +-
>>   arch/x86/include/asm/hw_irq.h        |   64 +++++
>>   arch/x86/include/asm/irq_remapping.h |   66 +++--
>>   arch/x86/include/asm/pci.h           |    5 -
>>   arch/x86/include/asm/x86_init.h      |    4 -
>>   arch/x86/kernel/apic/htirq.c         |  179 +++++++++----
>>   arch/x86/kernel/apic/io_apic.c       |    3 -
>>   arch/x86/kernel/apic/msi.c           |  430 +++++++++++++++++++++++--------
>>   arch/x86/kernel/apic/vector.c        |  158 +++++++++++-
>>   arch/x86/kernel/hpet.c               |   57 ++---
>>   arch/x86/kernel/x86_init.c           |    2 -
>>   arch/x86/platform/uv/uv_irq.c        |   27 +-
>>   drivers/iommu/amd_iommu.c            |  385 ++++++++++++++++++++++------
>>   drivers/iommu/amd_iommu_init.c       |    4 +
>>   drivers/iommu/amd_iommu_proto.h      |    9 +
>>   drivers/iommu/amd_iommu_types.h      |    5 +
>>   drivers/iommu/intel_irq_remapping.c  |  468 +++++++++++++++++++++++-----------
>>   drivers/iommu/irq_remapping.c        |  221 ++++++----------
>>   drivers/iommu/irq_remapping.h        |   22 +-
>>   drivers/pci/htirq.c                  |   48 +---
>>   include/linux/htirq.h                |   22 +-
>>   include/linux/intel-iommu.h          |    4 +
>>   include/linux/irq.h                  |    8 +
>>   include/linux/irqdomain.h            |   60 +++++
>>   kernel/irq/Kconfig                   |    3 +
>>   kernel/irq/chip.c                    |   21 ++
>>   kernel/irq/irqdomain.c               |  349 ++++++++++++++++++++++++-
>>   28 files changed, 1934 insertions(+), 709 deletions(-)
>>
> 
> 

* Re: [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms
  2014-09-24  8:10     ` Jiang Liu
  (?)
@ 2014-09-24  8:12       ` Yasuaki Ishimatsu
  -1 siblings, 0 replies; 110+ messages in thread
From: Yasuaki Ishimatsu @ 2014-09-24  8:12 UTC (permalink / raw)
  To: Jiang Liu, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Grant Likely, Marc Zyngier
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

(2014/09/24 17:10), Jiang Liu wrote:
> 
> 
> On 2014/9/24 15:59, Yasuaki Ishimatsu wrote:
>> (2014/09/11 23:03), Jiang Liu wrote:
>>> We plan to restructure x86 interrupt code based on hierarchy irqdomain,
>>> that is to build irqdomains for CPU vector, interrupt remapping unit,
>>> IOAPIC, MSI and HPET etc and organize those irqdomains in hierarchy mode.
>>> Each irqdomain manages corresponding interrupt controller and talks to
>>> parent interrupt controller through public irqdomain interfaces. We also
>>> support stacked irq_chip based on hierarchy irqdomain. It will make the
>>> x86 interrupt architecture much more clear and more easy to maintain
>>> with hierarchy irqdomain and stacked irq_chip. It may also help ARM
>>> interrupt management architecture too.
>>
>> Do you have documentation with more detailed information? I'm interested
>> in this feature and would like to know more about it.
>>
>> I cannot imagine why the feature makes x86 irq architecture much more clear
>> and more easy to maintain in this description. Of course, I have read the
>> following threads:
>>
>> https://lkml.org/lkml/2014/9/11/101
> Hi Yasuaki,
> 
> Do these help?
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg729924.html
> https://lkml.org/lkml/2014/8/1/67

Thank you for the information. I'll read it.

Thanks,
Yasuaki Ishimatsu

> 
> Regards!
> Gerry
> 
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>>
>>> This is the second patch set to enable support of hierarchy irqdomain
>>> on x86 platforms. It depends on the first part at:
>>> https://lkml.org/lkml/2014/9/11/101
>>> And you may access it at:
>>> https://github.com/jiangliu/linux.git irqdomain/p2v1
>>>
>>> And there will be a third patch set to convert IOAPIC driver to support
>>> hierarchy irqdomain and clean up code.
>>>
>>> The first patch extends irqdomain interfaces to support hierarchy
>>> irqdomain. Hope this interface could be used by other architectures too,
>>> such as ARM/ARM64.
>>> The second patch introduces two helper functions to support stacked
>>> irq_chip.
>>> Patch 3-9 implements an irqdomain to manage CPU interrupt vectors, and
>>> it's the root irqdomain for x86 platforms.
>>> Patch 10-13 converts Intel and AMD interrupt remapping drivers to
>>> support hierarchy irqdomain.
>>> Patch 14-17 converts HPET, MSI and HT_IRQ drivers to support hierarchy
>>> irqdomain.
>>> Patch 18-21 cleans up unused code in x86 arch and interrupt remapping
>>> drivers.
>>>
>>> We have tested this patchset on Intel 32-bit and 64-bit systems. But we
>>> have only done compilation tests for HT_IRQ and AMD interrupt remapping
>>> drivers due to hardware resource limitation. Tests on AMD platforms are
>>> warmly welcomed!
>>>
>>> Jiang Liu (21):
>>>     irqdomain: Introduce new interfaces to support hierarchy irqdomains
>>>     genirq: Introduce helper functions to support stacked irq_chip
>>>     x86, irq: Save destination CPU ID in irq_cfg
>>>     x86, irq: Use hierarchy irqdomain to manage CPU interrupt vectors
>>>     x86, hpet: Use new irqdomain interfaces to allocate/free IRQ
>>>     x86, MSI: Use new irqdomain interfaces to allocate/free IRQ
>>>     x86, uv: Use new irqdomain interfaces to allocate/free IRQ
>>>     x86, htirq: Use new irqdomain interfaces to allocate/free IRQ
>>>     x86, dmar: Use new irqdomain interfaces to allocate/free IRQ
>>>     x86: irq_remapping: Introduce new interfaces to support hierarchy
>>>       irqdomain
>>>     iommu/vt-d: Change prototypes to prepare for enabling hierarchy
>>>       irqdomain
>>>     iommu/vt-d: Enhance Intel IR driver to suppport hierarchy irqdomain
>>>     iommu/amd: Enhance AMD IR driver to suppport hierarchy irqdomain
>>>     x86, hpet: Enhance HPET IRQ to support hierarchy irqdomain
>>>     x86, MSI: Use hierarchy irqdomain to manage MSI interrupts
>>>     x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ
>>>     x86, htirq: Use hierarchy irqdomain to manage Hypertransport
>>>       interrupts
>>>     iommu/vt-d: Clean up unused MSI related code
>>>     iommu/amd: Clean up unused MSI related code
>>>     x86: irq_remapping: Clean up unused MSI related code
>>>     x86, irq: Clean up unused MSI related code and interfaces
>>>
>>>    arch/x86/Kconfig                     |    3 +-
>>>    arch/x86/include/asm/hpet.h          |   16 +-
>>>    arch/x86/include/asm/hw_irq.h        |   64 +++++
>>>    arch/x86/include/asm/irq_remapping.h |   66 +++--
>>>    arch/x86/include/asm/pci.h           |    5 -
>>>    arch/x86/include/asm/x86_init.h      |    4 -
>>>    arch/x86/kernel/apic/htirq.c         |  179 +++++++++----
>>>    arch/x86/kernel/apic/io_apic.c       |    3 -
>>>    arch/x86/kernel/apic/msi.c           |  430 +++++++++++++++++++++++--------
>>>    arch/x86/kernel/apic/vector.c        |  158 +++++++++++-
>>>    arch/x86/kernel/hpet.c               |   57 ++---
>>>    arch/x86/kernel/x86_init.c           |    2 -
>>>    arch/x86/platform/uv/uv_irq.c        |   27 +-
>>>    drivers/iommu/amd_iommu.c            |  385 ++++++++++++++++++++++------
>>>    drivers/iommu/amd_iommu_init.c       |    4 +
>>>    drivers/iommu/amd_iommu_proto.h      |    9 +
>>>    drivers/iommu/amd_iommu_types.h      |    5 +
>>>    drivers/iommu/intel_irq_remapping.c  |  468 +++++++++++++++++++++++-----------
>>>    drivers/iommu/irq_remapping.c        |  221 ++++++----------
>>>    drivers/iommu/irq_remapping.h        |   22 +-
>>>    drivers/pci/htirq.c                  |   48 +---
>>>    include/linux/htirq.h                |   22 +-
>>>    include/linux/intel-iommu.h          |    4 +
>>>    include/linux/irq.h                  |    8 +
>>>    include/linux/irqdomain.h            |   60 +++++
>>>    kernel/irq/Kconfig                   |    3 +
>>>    kernel/irq/chip.c                    |   21 ++
>>>    kernel/irq/irqdomain.c               |  349 ++++++++++++++++++++++++-
>>>    28 files changed, 1934 insertions(+), 709 deletions(-)
>>>
>>
>>

* Re: [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms
  2014-09-24  7:59   ` Yasuaki Ishimatsu
@ 2014-09-24 19:25     ` Thomas Gleixner
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Gleixner @ 2014-09-24 19:25 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: Jiang Liu, Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

On Wed, 24 Sep 2014, Yasuaki Ishimatsu wrote:
> I cannot imagine why the feature makes x86 irq architecture much more clear
> and more easy to maintain in this description. Of course, I have read the
> following threads:
> 
> https://lkml.org/lkml/2014/9/11/101

Here is the long version of the idea:

     https://lkml.org/lkml/2014/8/26/707

Short version is:

Separate the irq handling entities in a clear layered way instead of
having vector/[io]apic specific knowledge/data in places like msi,
remap etc.

Thanks,

	tglx
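
The layering described in the short version above can be illustrated
with a small user-space model. This is only a sketch, not kernel code:
struct toy_domain, struct irq_slot, toy_depth() and toy_alloc() are
hypothetical names made up for illustration, and the actual irqdomain
interfaces added by this series are not reproduced here. The point it
tries to show is that each level allocates from its parent first and
then records only its own hardware number, so an MSI-like level never
has to look at vector data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of stacked interrupt domains (not the kernel API). */
#define MAX_LEVELS 4

/* Per-level data for one interrupt; hwirq is private to that level. */
struct irq_slot {
	const char *domain_name;
	unsigned int hwirq;
};

struct toy_domain {
	const char *name;
	struct toy_domain *parent;	/* NULL for the root (vector) level */
	unsigned int next_hwirq;	/* trivial allocator state */
};

/* Number of levels from d up to and including the root. */
static int toy_depth(const struct toy_domain *d)
{
	int n = 0;

	for (; d != NULL; d = d->parent)
		n++;
	return n;
}

/*
 * Allocate one interrupt through the whole chain. The parent is asked
 * first, so slots[0] always belongs to the root level. Returns the
 * number of slots filled, or -1 (allocating nothing) if the chain is
 * deeper than cap.
 */
static int toy_alloc(struct toy_domain *d, struct irq_slot *slots, int cap)
{
	int n;

	assert(d != NULL && slots != NULL);
	if (toy_depth(d) > cap)
		return -1;

	n = d->parent ? toy_alloc(d->parent, slots, cap) : 0;
	slots[n].domain_name = d->name;
	slots[n].hwirq = d->next_hwirq++;
	return n + 1;
}
```

With a three-level chain such as vector <- remap <- msi (again, made-up
names), one toy_alloc() call on the msi level fills three slots, root
first, and the msi level's own code only ever writes its own slot.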

* Re: [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms
  2014-09-24 19:25     ` Thomas Gleixner
  (?)
@ 2014-09-25  8:15       ` Yasuaki Ishimatsu
  -1 siblings, 0 replies; 110+ messages in thread
From: Yasuaki Ishimatsu @ 2014-09-25  8:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jiang Liu, Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
	Rafael J. Wysocki, Bjorn Helgaas, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Grant Likely, Marc Zyngier,
	Konrad Rzeszutek Wilk, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	linux-arm-kernel

(2014/09/25 4:25), Thomas Gleixner wrote:
> On Wed, 24 Sep 2014, Yasuaki Ishimatsu wrote:
>> I cannot imagine why the feature makes x86 irq architecture much more clear
>> and more easy to maintain in this description. Of course, I have read the
>> following threads:
>>
>> https://lkml.org/lkml/2014/9/11/101
>

> Here is the long version of the idea:
>
>       https://lkml.org/lkml/2014/8/26/707

This is very helpful for understanding this feature.

Thanks,
Yasuaki Ishimatsu

>
> Short version is:
>
> Separate the irq handling entities in a clear layered way instead of
> having vector/[io]apic specific knowledge/data in places like msi,
> remap etc.
>
> Thanks,
>
> 	tglx
>

end of thread, other threads:[~2014-09-25  8:17 UTC | newest]

Thread overview: 110+ messages
2014-09-11 14:03 [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms Jiang Liu
2014-09-11 14:03 ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 01/21] irqdomain: Introduce new interfaces to support hierarchy irqdomains Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-16 17:43   ` Thomas Gleixner
2014-09-16 17:43     ` Thomas Gleixner
2014-09-18  7:28     ` Jiang Liu
2014-09-18  7:28       ` Jiang Liu
2014-09-22  8:17     ` [Patch] " Jiang Liu
2014-09-22  8:17       ` Jiang Liu
2014-09-22 17:30       ` Randy Dunlap
2014-09-22 17:30         ` Randy Dunlap
2014-09-24  5:26         ` Jiang Liu
2014-09-24  5:26           ` Jiang Liu
2014-09-24  5:26           ` Jiang Liu
2014-09-23  9:43       ` Joe.C
2014-09-23  9:43         ` Joe.C
2014-09-24  5:55         ` Jiang Liu
2014-09-24  5:55           ` Jiang Liu
2014-09-18  8:48   ` [RFC Part2 v1 01/21] " Joe.C
2014-09-18  8:48     ` Joe.C
2014-09-18  8:58     ` Jiang Liu
2014-09-18  8:58       ` Jiang Liu
2014-09-24  6:55   ` Yasuaki Ishimatsu
2014-09-24  6:55     ` Yasuaki Ishimatsu
2014-09-24  6:55     ` Yasuaki Ishimatsu
2014-09-24  7:23     ` Jiang Liu
2014-09-24  7:23       ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 02/21] genirq: Introduce helper functions to support stacked irq_chip Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-16 17:45   ` Thomas Gleixner
2014-09-16 17:45     ` Thomas Gleixner
2014-09-17  3:07     ` Jiang Liu
2014-09-17  3:07       ` Jiang Liu
2014-09-17 20:58       ` Thomas Gleixner
2014-09-17 20:58         ` Thomas Gleixner
2014-09-18  6:14         ` Jiang Liu
2014-09-18  6:14           ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 03/21] x86, irq: Save destination CPU ID in irq_cfg Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-16 17:47   ` Thomas Gleixner
2014-09-16 17:47     ` Thomas Gleixner
2014-09-17  2:24     ` Jiang Liu
2014-09-17  2:24       ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 04/21] x86, irq: Use hierarchy irqdomain to manage CPU interrupt vectors Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 05/21] x86, hpet: Use new irqdomain interfaces to allocate/free IRQ Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 06/21] x86, MSI: " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 07/21] x86, uv: " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 08/21] x86, htirq: " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 09/21] x86, dmar: " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 10/21] x86: irq_remapping: Introduce new interfaces to support hierarchy irqdomain Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 11/21] iommu/vt-d: Change prototypes to prepare for enabling " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 12/21] iommu/vt-d: Enhance Intel IR driver to suppport " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 13/21] iommu/amd: Enhance AMD " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 14/21] x86, hpet: Enhance HPET IRQ to support " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-16 18:31   ` Thomas Gleixner
2014-09-16 18:31     ` Thomas Gleixner
2014-09-17  5:16     ` Jiang Liu
2014-09-17  5:16       ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 15/21] x86, MSI: Use hierarchy irqdomain to manage MSI interrupts Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:17   ` Ni, Xun
2014-09-11 14:17     ` Ni, Xun
2014-09-11 14:17     ` Ni, Xun
2014-09-11 14:29     ` Jiang Liu
2014-09-11 14:29       ` Jiang Liu
2014-09-11 14:29       ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 16/21] x86, irq: Directly call native_compose_msi_msg() for DMAR IRQ Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 17/21] x86, htirq: Use hierarchy irqdomain to manage Hypertransport interrupts Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 18/21] iommu/vt-d: Clean up unused MSI related code Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 19/21] iommu/amd: " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 20/21] x86: irq_remapping: " Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-11 14:03 ` [RFC Part2 v1 21/21] x86, irq: Clean up unused MSI related code and interfaces Jiang Liu
2014-09-11 14:03   ` Jiang Liu
2014-09-24  7:59 ` [RFC Part2 v1 00/21] Enable hierarchy irqdomian on x86 platforms Yasuaki Ishimatsu
2014-09-24  7:59   ` Yasuaki Ishimatsu
2014-09-24  7:59   ` Yasuaki Ishimatsu
2014-09-24  8:10   ` Jiang Liu
2014-09-24  8:10     ` Jiang Liu
2014-09-24  8:12     ` Yasuaki Ishimatsu
2014-09-24  8:12       ` Yasuaki Ishimatsu
2014-09-24  8:12       ` Yasuaki Ishimatsu
2014-09-24 19:25   ` Thomas Gleixner
2014-09-24 19:25     ` Thomas Gleixner
2014-09-25  8:15     ` Yasuaki Ishimatsu
2014-09-25  8:15       ` Yasuaki Ishimatsu
2014-09-25  8:15       ` Yasuaki Ishimatsu
