linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/27] x86, irq: support ioapic device hotplug
@ 2013-06-07 22:30 Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 01/27] x86, irq: Change irq_remap_modify_chip_defaults to static Yinghai Lu
                   ` (26 more replies)
  0 siblings, 27 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu

Hi,

Intel cpu (from IVB) include cpu, mem controller, and IIO.
When hotadd cpu physically, it will involve cpu hotplug, mem hotplug,
and pci root hotplug.

IIO includes pci host bridge, ioapic controller and iommu.
pci devices will need to use ioapic and iommu.
So to make pci root bus hotplug working, we need to add ioapic hotplug support.

Now ioapics are with ACPI MADT table. During booting, kernel will parse
MADT and put info into ioapic arrays.
Also Bjorn added one pci device based driver, but it is not wired in yet,
as it need to call acpi_register_ioapic, and that is TBD.

This patchset will
1. extend genirq to support reserve/realloc method.
   because we want irqs for one ioapic controller to be linear mapping
   to the gsi range.
2. change ioapic array operation code so we could insert new ioapic and
   remove one leave the blank slot.
3. record irq_base in gsi_config in ioapic struct, and use it to convert gsi
   to irq for pci device using that ioapic controller
4. make static ioapic path (from MADT) use same code as hot-add path,
   with reserve/realloc.
5. change ioapic add/removing to use acpi way, as that is only needed when
   pci root bus hotplug. Also make it support the case that ioapic
   controller is hiding in pci config space or ioapic address is not
   managed by pci reallocation subsys.

Also include some cleanups
1. print MSI-X clearly in /proc/interrupts and dmesg.
2. convert irq_2_pin to be generic list.

It is based on linus's tree of 2013/06/07 aka v3.10-rc3+.

could get them from:
        git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-irq

-v3: split last patch to 4 patches per request from Bjorn.
     drop one patch that will put pci device name in proc/interrupts
     update changelog.

Yinghai Lu (27):
  x86, irq: Change irq_remap_modify_chip_defaults to static
  x86, irq: Modify irq chip once for irq remapping
  x86, irq: Print out MSI/MSI-X clearly
  x86, irq: Show MSI-X in /proc/interrupt
  x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent
  ia64, irq: Add dummy create_irq_nr()
  iommu, irq: Allocate irq_desc for dmar_msi with local node
  x86, irq: kill create_irq()
  x86, irq: Convert irq_2_pin list to generic list
  genirq: Split __irq_reserve_irqs from irq_alloc_descs
  x86, irq: Add realloc_irq_and_cfg_at()
  x86, irq: Move down arch_early_irq_init()
  x86, irq: Split out alloc_ioapic_save_registers()
  xen, irq: call irq_realloc_desc_at() at first
  x86, irq: pre-reserve irq range/realloc for booting path
  x86, irq: Add ioapic_gsi_to_irq
  genirq: Bail out early in free_desc()
  x86, irq: More strict checking about registering ioapic
  x86, irq: Make mp_register_ioapic handle hot-added ioapic
  x86, irq: Add mp_unregister_ioapic to handle hot-remove ioapic
  x86, irq: Make ioapics loop skip blank slots
  x86, ioapic: Find usable ioapic id for 64bit.
  x86: Move declaration for mp_register_ioapic()
  PCI, x86: Make ioapic hotplug support built-in
  PCI, x86, ACPI: Link acpi ioapic register to ioapic
  PCI, x86, ACPI: Enable ioapic hotplug support with acpi host bridge.
  PCI, x86, ACPI: get ioapic address from acpi device

 arch/ia64/kernel/irq_ia64.c          |  10 +
 arch/x86/include/asm/hw_irq.h        |   2 +-
 arch/x86/include/asm/io_apic.h       |   7 +
 arch/x86/include/asm/irq_remapping.h |   6 -
 arch/x86/include/asm/mpspec.h        |  21 +-
 arch/x86/kernel/acpi/boot.c          |  32 +--
 arch/x86/kernel/apic/apic.c          |   9 +-
 arch/x86/kernel/apic/io_apic.c       | 441 ++++++++++++++++++++++++++---------
 drivers/acpi/pci_root.c              |   4 +
 drivers/iommu/dmar.c                 |   2 +-
 drivers/iommu/irq_remapping.c        |  16 +-
 drivers/pci/Kconfig                  |   3 +-
 drivers/pci/ioapic.c                 | 232 ++++++++++++------
 drivers/xen/events.c                 |   8 +-
 include/linux/irq.h                  |   6 +
 include/linux/pci-acpi.h             |   8 +
 kernel/irq/irqdesc.c                 |  83 +++++--
 17 files changed, 650 insertions(+), 240 deletions(-)

-- 
1.8.1.4


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v3 01/27] x86, irq: Change irq_remap_modify_chip_defaults to static
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 02/27] x86, irq: Modify irq chip once for irq remapping Yinghai Lu
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

Change irq_remap_modify_chip_defaults() to static, as we have no
outside user.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/include/asm/irq_remapping.h | 6 ------
 drivers/iommu/irq_remapping.c        | 2 +-
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index d806b22..606f42e 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -56,8 +56,6 @@ extern bool setup_remapped_irq(int irq,
 			       struct irq_cfg *cfg,
 			       struct irq_chip *chip);
 
-void irq_remap_modify_chip_defaults(struct irq_chip *chip);
-
 #else  /* CONFIG_IRQ_REMAP */
 
 static inline void setup_irq_remapping_ops(void) { }
@@ -91,10 +89,6 @@ static inline void panic_if_irq_remap(const char *msg)
 {
 }
 
-static inline void irq_remap_modify_chip_defaults(struct irq_chip *chip)
-{
-}
-
 static inline bool setup_remapped_irq(int irq,
 				      struct irq_cfg *cfg,
 				      struct irq_chip *chip)
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index dcfea4e..07ce86a 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -373,7 +373,7 @@ static void ir_print_prefix(struct irq_data *data, struct seq_file *p)
 	seq_printf(p, " IR-%s", data->chip->name);
 }
 
-void irq_remap_modify_chip_defaults(struct irq_chip *chip)
+static void irq_remap_modify_chip_defaults(struct irq_chip *chip)
 {
 	chip->irq_print_chip = ir_print_prefix;
 	chip->irq_ack = ir_ack_apic_edge;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 02/27] x86, irq: Modify irq chip once for irq remapping
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 01/27] x86, irq: Change irq_remap_modify_chip_defaults to static Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-09 14:54   ` Sebastian Andrzej Siewior
  2013-06-07 22:30 ` [PATCH v3 03/27] x86, irq: Print out MSI/MSI-X clearly Yinghai Lu
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Konrad Rzeszutek Wilk,
	Sebastian Andrzej Siewior, Joerg Roedel

Current code: after irq remapping is enabled, irq_chip fields are modified
during every irq setup.
	mp_register_gsi
		io_apic_set_pci_routing
			io_apic_setup_irq_pin
				setup_ioapic_irq
					ioapic_register_intr
						setup_remapped_irq
	native_setup_msi_irqs
		setup_msi_irq
			setup_remapped_irq
	default_setup_hpet_msi
		setup_remapped_irq
that is not efficient.

We only need to modify those irq chip one time just after we enable
irq mapping.

Change irq_remap_modify_chip_defaults() to __init as it only gets
called during booting stage, via irq_remap_modify_chips().

Affected irq_chip: ioapic_chip, msi_chip, hpet_msi_type.
We don't need to use #ifdef in irq_remap_modify_chips():
IRQ_REMAP only support x86_64 and X86_IO_APIC and PCI_MSI.
HPET_TIMER is set when x86_64 is set.
When we have IRQ_REMAP enabled, al three chips are defined and
used.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Joerg Roedel <joro@8bytes.org>
---
 arch/x86/include/asm/io_apic.h |  5 +++++
 arch/x86/kernel/apic/apic.c    |  9 ++++++++-
 arch/x86/kernel/apic/io_apic.c |  8 +++-----
 drivers/iommu/irq_remapping.c  | 11 +++++++++--
 4 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 459e50a..eaff3ad 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -215,6 +215,11 @@ static inline void io_apic_modify(unsigned int apic, unsigned int reg, unsigned
 
 extern void io_apic_eoi(unsigned int apic, unsigned int vector);
 
+extern struct irq_chip ioapic_chip;
+extern struct irq_chip msi_chip;
+extern struct irq_chip hpet_msi_type;
+void irq_remap_modify_chips(void);
+
 #else  /* !CONFIG_X86_IO_APIC */
 
 #define io_apic_assign_pci_irqs 0
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 904611b..ff50b90 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1552,6 +1552,8 @@ void enable_x2apic(void)
 int __init enable_IR(void)
 {
 #ifdef CONFIG_IRQ_REMAP
+	int ret;
+
 	if (!irq_remapping_supported()) {
 		pr_debug("intr-remapping not supported\n");
 		return -1;
@@ -1563,7 +1565,12 @@ int __init enable_IR(void)
 		return -1;
 	}
 
-	return irq_remapping_enable();
+	ret = irq_remapping_enable();
+
+	if (ret >= 0)
+		irq_remap_modify_chips();
+
+	return ret;
 #endif
 	return -1;
 }
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 9ed796c..f78f7f6 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1238,8 +1238,6 @@ void __setup_vector_irq(int cpu)
 	raw_spin_unlock(&vector_lock);
 }
 
-static struct irq_chip ioapic_chip;
-
 #ifdef CONFIG_X86_32
 static inline int IO_APIC_irq_trigger(int irq)
 {
@@ -2511,7 +2509,7 @@ static void ack_apic_level(struct irq_data *data)
 	ioapic_irqd_unmask(data, cfg, masked);
 }
 
-static struct irq_chip ioapic_chip __read_mostly = {
+struct irq_chip ioapic_chip __read_mostly = {
 	.name			= "IO-APIC",
 	.irq_startup		= startup_ioapic_irq,
 	.irq_mask		= mask_ioapic_irq,
@@ -3093,7 +3091,7 @@ msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
  * IRQ Chip for MSI PCI/PCI-X/PCI-Express Devices,
  * which implement the MSI or MSI-X Capability Structure.
  */
-static struct irq_chip msi_chip = {
+struct irq_chip msi_chip = {
 	.name			= "PCI-MSI",
 	.irq_unmask		= unmask_msi_irq,
 	.irq_mask		= mask_msi_irq,
@@ -3240,7 +3238,7 @@ static int hpet_msi_set_affinity(struct irq_data *data,
 	return IRQ_SET_MASK_OK_NOCOPY;
 }
 
-static struct irq_chip hpet_msi_type = {
+struct irq_chip hpet_msi_type = {
 	.name = "HPET_MSI",
 	.irq_unmask = hpet_msi_unmask,
 	.irq_mask = hpet_msi_mask,
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 07ce86a..21ef344 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -373,19 +373,26 @@ static void ir_print_prefix(struct irq_data *data, struct seq_file *p)
 	seq_printf(p, " IR-%s", data->chip->name);
 }
 
-static void irq_remap_modify_chip_defaults(struct irq_chip *chip)
+static void __init irq_remap_modify_chip_defaults(struct irq_chip *chip)
 {
+	printk(KERN_DEBUG "irq_chip: %s ==> IR-%s", chip->name, chip->name);
 	chip->irq_print_chip = ir_print_prefix;
 	chip->irq_ack = ir_ack_apic_edge;
 	chip->irq_eoi = ir_ack_apic_level;
 	chip->irq_set_affinity = x86_io_apic_ops.set_affinity;
 }
 
+void __init irq_remap_modify_chips(void)
+{
+	irq_remap_modify_chip_defaults(&ioapic_chip);
+	irq_remap_modify_chip_defaults(&msi_chip);
+	irq_remap_modify_chip_defaults(&hpet_msi_type);
+}
+
 bool setup_remapped_irq(int irq, struct irq_cfg *cfg, struct irq_chip *chip)
 {
 	if (!irq_remapped(cfg))
 		return false;
 	irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
-	irq_remap_modify_chip_defaults(chip);
 	return true;
 }
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 03/27] x86, irq: Print out MSI/MSI-X clearly
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 01/27] x86, irq: Change irq_remap_modify_chip_defaults to static Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 02/27] x86, irq: Modify irq chip once for irq remapping Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 04/27] x86, irq: Show MSI-X in /proc/interrupt Yinghai Lu
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

Print out exact MSI or MSI-X instead of MSI/MSI-X.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index f78f7f6..059589f 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3125,7 +3125,8 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 
 	irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "edge");
 
-	dev_printk(KERN_DEBUG, &dev->dev, "irq %d for MSI/MSI-X\n", irq);
+	dev_printk(KERN_DEBUG, &dev->dev, "irq %d for MSI%s\n", irq,
+			 msidesc->msi_attrib.is_msix ? "-X" : "");
 
 	return 0;
 }
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 04/27] x86, irq: Show MSI-X in /proc/interrupt
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (2 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 03/27] x86, irq: Print out MSI/MSI-X clearly Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 05/27] x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent Yinghai Lu
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

Now MSI-X is shown as MSI in /proc/interrupt.

We could use new added irq_print_chip() interface to append -X for MSI-X.

After this patch, we will have
        PCI-MSI-edge
        PCI-MSI-X-edge
        IR-PCI-MSI-edge
        IR-PCI-MSI-X-edge
in the /proc/interrupt

-v2: do not need to check if msi_desc is null in msi_irq_print_chip().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 7 +++++++
 drivers/iommu/irq_remapping.c  | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 059589f..f8d4d8d 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3087,6 +3087,12 @@ msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 	return IRQ_SET_MASK_OK_NOCOPY;
 }
 
+static void msi_irq_print_chip(struct irq_data *data, struct seq_file *p)
+{
+	seq_printf(p, " %s%s", data->chip->name,
+			data->msi_desc->msi_attrib.is_msix ? "-X" : "");
+}
+
 /*
  * IRQ Chip for MSI PCI/PCI-X/PCI-Express Devices,
  * which implement the MSI or MSI-X Capability Structure.
@@ -3098,6 +3104,7 @@ struct irq_chip msi_chip = {
 	.irq_ack		= ack_apic_edge,
 	.irq_set_affinity	= msi_set_affinity,
 	.irq_retrigger		= ioapic_retrigger_irq,
+	.irq_print_chip		= msi_irq_print_chip,
 };
 
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 21ef344..05496bd 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -370,7 +370,10 @@ static void ir_ack_apic_level(struct irq_data *data)
 
 static void ir_print_prefix(struct irq_data *data, struct seq_file *p)
 {
-	seq_printf(p, " IR-%s", data->chip->name);
+	seq_printf(p, " IR-%s%s", data->chip->name,
+			data->msi_desc ?
+			 (data->msi_desc->msi_attrib.is_msix ? "-X" : "")
+			 : "");
 }
 
 static void __init irq_remap_modify_chip_defaults(struct irq_chip *chip)
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 05/27] x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (3 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 04/27] x86, irq: Show MSI-X in /proc/interrupt Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-09 15:16   ` Sebastian Andrzej Siewior
  2013-06-07 22:30 ` [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr() Yinghai Lu
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

All others are using "-" instead of "_".

Change dmar_msi and hpet_msi to use "-".

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index f8d4d8d..436fb38 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3199,7 +3199,7 @@ dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
 }
 
 static struct irq_chip dmar_msi_type = {
-	.name			= "DMAR_MSI",
+	.name			= "DMAR-MSI",
 	.irq_unmask		= dmar_msi_unmask,
 	.irq_mask		= dmar_msi_mask,
 	.irq_ack		= ack_apic_edge,
@@ -3247,7 +3247,7 @@ static int hpet_msi_set_affinity(struct irq_data *data,
 }
 
 struct irq_chip hpet_msi_type = {
-	.name = "HPET_MSI",
+	.name = "HPET-MSI",
 	.irq_unmask = hpet_msi_unmask,
 	.irq_mask = hpet_msi_mask,
 	.irq_ack = ack_apic_edge,
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr()
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (4 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 05/27] x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-09 15:22   ` Sebastian Andrzej Siewior
  2013-06-07 22:30 ` [PATCH v3 07/27] iommu, irq: Allocate irq_desc for dmar_msi with local node Yinghai Lu
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Tony Luck, Fenghua Yu, linux-ia64

create_irq() will return -1 when fail to allocate.
create_irq_nr() will return 0 when fail to allocate.

Will use it to fix one return value checking for dmar_msi irq.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
---
 arch/ia64/kernel/irq_ia64.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/ia64/kernel/irq_ia64.c b/arch/ia64/kernel/irq_ia64.c
index 1034884..38e46df 100644
--- a/arch/ia64/kernel/irq_ia64.c
+++ b/arch/ia64/kernel/irq_ia64.c
@@ -429,6 +429,16 @@ int create_irq(void)
 	return irq;
 }
 
+unsigned int create_irq_nr(unsigned int from, int node)
+{
+	int irq = create_irq();
+
+	if (irq < 0)
+		irq = 0;
+
+	return irq;
+}
+
 void destroy_irq(unsigned int irq)
 {
 	dynamic_irq_cleanup(irq);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 07/27] iommu, irq: Allocate irq_desc for dmar_msi with local node
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (5 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr() Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-09 15:31   ` Sebastian Andrzej Siewior
  2013-06-07 22:30 ` [PATCH v3 08/27] x86, irq: kill create_irq() Yinghai Lu
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel, Donald Dutile

iommu irq's irq_desc should be on local node ram.

Fix the return value checking problem.

create_irq() will return -1 when fail to allocate.
create_irq_nr() will return 0 when fail to allocate.

here only check !irq, so need to change it to use create_irq_nr instead.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Donald Dutile <ddutile@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
---
 drivers/iommu/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index a7967ce..c09e60d 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1273,7 +1273,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
 	if (iommu->irq)
 		return 0;
 
-	irq = create_irq();
+	irq = create_irq_nr(0, iommu->node);
 	if (!irq) {
 		pr_err("IOMMU: no free vectors\n");
 		return -EINVAL;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 08/27] x86, irq: kill create_irq()
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (6 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 07/27] iommu, irq: Allocate irq_desc for dmar_msi with local node Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-09 15:35   ` Sebastian Andrzej Siewior
  2013-06-07 22:30 ` [PATCH v3 09/27] x86, irq: Convert irq_2_pin list to generic list Yinghai Lu
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

create_irq() will return -1 when failing to allocate.
create_irq_nr() will return 0 when failing to allocate.

It only causes confusion.

Now we have no user for create_irq(), so remove create_irq() for x86.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 436fb38..70b6617 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2968,21 +2968,6 @@ unsigned int create_irq_nr(unsigned int from, int node)
 	return __create_irqs(from, 1, node);
 }
 
-int create_irq(void)
-{
-	int node = cpu_to_node(0);
-	unsigned int irq_want;
-	int irq;
-
-	irq_want = nr_irqs_gsi;
-	irq = create_irq_nr(irq_want, node);
-
-	if (irq == 0)
-		irq = -1;
-
-	return irq;
-}
-
 void destroy_irq(unsigned int irq)
 {
 	struct irq_cfg *cfg = irq_get_chip_data(irq);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 09/27] x86, irq: Convert irq_2_pin list to generic list
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (7 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 08/27] x86, irq: kill create_irq() Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-09 15:52   ` Sebastian Andrzej Siewior
  2013-06-07 22:30 ` [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs Yinghai Lu
                   ` (17 subsequent siblings)
  26 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Konrad Rzeszutek Wilk,
	Sebastian Andrzej Siewior, Joerg Roedel

Now irq_2_pin list is own grown list.
We can use generic list to replace it so we could use generic helper
functions to operate it.

Also make free_irq_cfg() free irq_2_pin list to support coming ioapic
hotplug.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Joerg Roedel <joro@8bytes.org>
---
 arch/x86/include/asm/hw_irq.h  |  2 +-
 arch/x86/kernel/apic/io_apic.c | 22 ++++++++++++----------
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 1da97ef..dc3942c 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -122,7 +122,7 @@ struct irq_2_irte {
  * Most irqs are mapped 1:1 with pins.
  */
 struct irq_cfg {
-	struct irq_pin_list	*irq_2_pin;
+	struct list_head	irq_2_pin;
 	cpumask_var_t		domain;
 	cpumask_var_t		old_domain;
 	u8			vector;
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 70b6617..670c538 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -66,7 +66,7 @@
 #define __apicdebuginit(type) static type __init
 
 #define for_each_irq_pin(entry, head) \
-	for (entry = head; entry; entry = entry->next)
+	list_for_each_entry(entry, &head, list)
 
 /*
  *      Is the SiS APIC rmw bug present ?
@@ -176,8 +176,8 @@ void mp_save_irq(struct mpc_intsrc *m)
 }
 
 struct irq_pin_list {
+	struct list_head list;
 	int apic, pin;
-	struct irq_pin_list *next;
 };
 
 static struct irq_pin_list *alloc_irq_pin_list(int node)
@@ -213,6 +213,7 @@ int __init arch_early_irq_init(void)
 	irq_reserve_irqs(0, legacy_pic->nr_legacy_irqs);
 
 	for (i = 0; i < count; i++) {
+		INIT_LIST_HEAD(&cfg[i].irq_2_pin);
 		irq_set_chip_data(i, &cfg[i]);
 		zalloc_cpumask_var_node(&cfg[i].domain, GFP_KERNEL, node);
 		zalloc_cpumask_var_node(&cfg[i].old_domain, GFP_KERNEL, node);
@@ -245,6 +246,7 @@ static struct irq_cfg *alloc_irq_cfg(unsigned int irq, int node)
 		goto out_cfg;
 	if (!zalloc_cpumask_var_node(&cfg->old_domain, GFP_KERNEL, node))
 		goto out_domain;
+	INIT_LIST_HEAD(&cfg->irq_2_pin);
 	return cfg;
 out_domain:
 	free_cpumask_var(cfg->domain);
@@ -255,11 +257,15 @@ out_cfg:
 
 static void free_irq_cfg(unsigned int at, struct irq_cfg *cfg)
 {
+	struct irq_pin_list *entry, *tmp;
+
 	if (!cfg)
 		return;
 	irq_set_chip_data(at, NULL);
 	free_cpumask_var(cfg->domain);
 	free_cpumask_var(cfg->old_domain);
+	list_for_each_entry_safe(entry, tmp, &cfg->irq_2_pin, list)
+		kfree(entry);
 	kfree(cfg);
 }
 
@@ -420,15 +426,12 @@ static void ioapic_mask_entry(int apic, int pin)
  */
 static int __add_pin_to_irq_node(struct irq_cfg *cfg, int node, int apic, int pin)
 {
-	struct irq_pin_list **last, *entry;
+	struct irq_pin_list *entry;
 
 	/* don't allow duplicates */
-	last = &cfg->irq_2_pin;
-	for_each_irq_pin(entry, cfg->irq_2_pin) {
+	for_each_irq_pin(entry, cfg->irq_2_pin)
 		if (entry->apic == apic && entry->pin == pin)
 			return 0;
-		last = &entry->next;
-	}
 
 	entry = alloc_irq_pin_list(node);
 	if (!entry) {
@@ -439,7 +442,7 @@ static int __add_pin_to_irq_node(struct irq_cfg *cfg, int node, int apic, int pi
 	entry->apic = apic;
 	entry->pin = pin;
 
-	*last = entry;
+	list_add_tail(&entry->list, &cfg->irq_2_pin);
 	return 0;
 }
 
@@ -1622,8 +1625,7 @@ __apicdebuginit(void) print_IO_APICs(void)
 		cfg = irq_get_chip_data(irq);
 		if (!cfg)
 			continue;
-		entry = cfg->irq_2_pin;
-		if (!entry)
+		if (list_empty(&cfg->irq_2_pin))
 			continue;
 		printk(KERN_DEBUG "IRQ%d ", irq);
 		for_each_irq_pin(entry, cfg->irq_2_pin)
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (8 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 09/27] x86, irq: Convert irq_2_pin list to generic list Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-10 13:51   ` Alexander Gordeev
  2013-06-10 19:39   ` Thomas Gleixner
  2013-06-07 22:30 ` [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at() Yinghai Lu
                   ` (16 subsequent siblings)
  26 siblings, 2 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Alexander Gordeev

irq_alloc_descs and irq_reserve_irqs are almost the same.
Separate code out to __irq_reserved_irqs, and other two reuse
__irq_reserve_irqs.

We will use __irq_reserve_irqs for coming ioapic hotplug support.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
---
 include/linux/irq.h  |  1 +
 kernel/irq/irqdesc.c | 54 ++++++++++++++++++++++++++++++----------------------
 2 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index bc4e066..4e0fcbb 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -600,6 +600,7 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 
 void irq_free_descs(unsigned int irq, unsigned int cnt);
 int irq_reserve_irqs(unsigned int from, unsigned int cnt);
+int __irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt);
 
 static inline void irq_free_desc(unsigned int irq)
 {
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 192a302..3b9fb92 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -340,18 +340,15 @@ void irq_free_descs(unsigned int from, unsigned int cnt)
 EXPORT_SYMBOL_GPL(irq_free_descs);
 
 /**
- * irq_alloc_descs - allocate and initialize a range of irq descriptors
- * @irq:	Allocate for specific irq number if irq >= 0
+ * __irq_reserve_descs - reserve and initialize a range of irq descriptors
+ * @irq:	Reserve for specific irq number if irq >= 0
  * @from:	Start the search from this irq number
- * @cnt:	Number of consecutive irqs to allocate.
- * @node:	Preferred node on which the irq descriptor should be allocated
- * @owner:	Owning module (can be NULL)
+ * @cnt:	Number of consecutive irqs to reserve.
  *
  * Returns the first irq number or error code
  */
 int __ref
-__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
-		  struct module *owner)
+__irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt)
 {
 	int start, ret;
 
@@ -369,7 +366,7 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 	start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
 					   from, cnt, 0);
 	ret = -EEXIST;
-	if (irq >=0 && start != irq)
+	if (irq >= 0 && start != irq)
 		goto err;
 
 	if (start + cnt > nr_irqs) {
@@ -380,12 +377,36 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 
 	bitmap_set(allocated_irqs, start, cnt);
 	mutex_unlock(&sparse_irq_lock);
-	return alloc_descs(start, cnt, node, owner);
+	return start;
 
 err:
 	mutex_unlock(&sparse_irq_lock);
 	return ret;
 }
+
+/**
+ * irq_alloc_descs - allocate and initialize a range of irq descriptors
+ * @irq:	Allocate for specific irq number if irq >= 0
+ * @from:	Start the search from this irq number
+ * @cnt:	Number of consecutive irqs to allocate.
+ * @node:	Preferred node on which the irq descriptor should be allocated
+ * @owner:	Owning module (can be NULL)
+ *
+ * Returns the first irq number or error code
+ */
+int __ref
+__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
+		  struct module *owner)
+{
+	int start;
+
+	start = __irq_reserve_irqs(irq, from, cnt);
+
+	if (start < 0)
+		return start;
+
+	return alloc_descs(start, cnt, node, owner);
+}
 EXPORT_SYMBOL_GPL(__irq_alloc_descs);
 
 /**
@@ -397,20 +418,7 @@ EXPORT_SYMBOL_GPL(__irq_alloc_descs);
  */
 int irq_reserve_irqs(unsigned int from, unsigned int cnt)
 {
-	unsigned int start;
-	int ret = 0;
-
-	if (!cnt || (from + cnt) > nr_irqs)
-		return -EINVAL;
-
-	mutex_lock(&sparse_irq_lock);
-	start = bitmap_find_next_zero_area(allocated_irqs, nr_irqs, from, cnt, 0);
-	if (start == from)
-		bitmap_set(allocated_irqs, start, cnt);
-	else
-		ret = -EEXIST;
-	mutex_unlock(&sparse_irq_lock);
-	return ret;
+	return __irq_reserve_irqs(from, from, cnt);
 }
 
 /**
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at()
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (9 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-09 19:13   ` Sebastian Andrzej Siewior
  2013-06-10 20:13   ` Thomas Gleixner
  2013-06-07 22:30 ` [PATCH v3 12/27] x86, irq: Move down arch_early_irq_init() Yinghai Lu
                   ` (15 subsequent siblings)
  26 siblings, 2 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

For ioapic hot-add support, it would be easy if we put all irqs
for that ioapic controller together.

We can reserve irq range at first, then reallocate those
pre-reserved one when it is needed.

Add realloc_irq_and_cfg_at() to really allocate irq_desc and cfg,
because pre-reserved only mark bits in allocate_irqs bit maps.

The reasons for not allocating them during reserving:
1. only several pins in ioapic are used, allocate for all pins, will
   waste memory for not used pins.
2. relocate later could make sure irq_desc is allocated on local node ram.
   as dev->node is set at that point.

-v2: update changelog by adding reasons, requested by Konrad.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 32 +++++++++++++++++++++++++++++++-
 include/linux/irq.h            |  5 +++++
 kernel/irq/irqdesc.c           | 26 ++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 670c538..a157a56 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -301,6 +301,36 @@ static void free_irq_at(unsigned int at, struct irq_cfg *cfg)
 	irq_free_desc(at);
 }
 
+static struct irq_cfg *realloc_irq_and_cfg_at(unsigned int at, int node)
+{
+	struct irq_desc *desc = irq_to_desc(at);
+	struct irq_cfg *cfg;
+	int res;
+
+	if (desc) {
+		if (irq_desc_get_irq_data(desc)->node == node)
+			return alloc_irq_and_cfg_at(at, node);
+
+		cfg = irq_desc_get_chip_data(desc);
+		if (cfg) {
+			/* shared irq */
+			if (!list_empty(&cfg->irq_2_pin))
+				return cfg;
+			free_irq_cfg(at, cfg);
+		}
+	}
+
+	res = irq_realloc_desc_at(at, node);
+	if (res >= 0) {
+		cfg = alloc_irq_cfg(at, node);
+		if (cfg) {
+			irq_set_chip_data(at, cfg);
+			return cfg;
+		}
+	}
+
+	return alloc_irq_and_cfg_at(at, node);
+}
 
 struct io_apic {
 	unsigned int index;
@@ -3352,7 +3382,7 @@ int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
 static int
 io_apic_setup_irq_pin(unsigned int irq, int node, struct io_apic_irq_attr *attr)
 {
-	struct irq_cfg *cfg = alloc_irq_and_cfg_at(irq, node);
+	struct irq_cfg *cfg = realloc_irq_and_cfg_at(irq, node);
 	int ret;
 
 	if (!cfg)
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 4e0fcbb..9c6c047 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -602,6 +602,11 @@ void irq_free_descs(unsigned int irq, unsigned int cnt);
 int irq_reserve_irqs(unsigned int from, unsigned int cnt);
 int __irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt);
 
+int __irq_realloc_desc(int at, int node, struct module *owner);
+/* use macros to avoid needing export.h for THIS_MODULE */
+#define irq_realloc_desc_at(at, node)	\
+	__irq_realloc_desc(at, node, THIS_MODULE)
+
 static inline void irq_free_desc(unsigned int irq)
 {
 	irq_free_descs(irq, 1);
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 3b9fb92..b48f65b 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -99,6 +99,11 @@ EXPORT_SYMBOL_GPL(nr_irqs);
 static DEFINE_MUTEX(sparse_irq_lock);
 static DECLARE_BITMAP(allocated_irqs, IRQ_BITMAP_BITS);
 
+static bool __irq_is_reserved(int irq)
+{
+	return !!test_bit(irq, allocated_irqs);
+}
+
 #ifdef CONFIG_SPARSE_IRQ
 
 static RADIX_TREE(irq_desc_tree, GFP_KERNEL);
@@ -410,6 +415,27 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 EXPORT_SYMBOL_GPL(__irq_alloc_descs);
 
 /**
+ * irq_realloc_desc - allocate irq descriptor for irq that is already reserved
+ * @irq:	Allocate for specific irq number if irq >= 0
+ * @node:	Preferred node on which the irq descriptor should be allocated
+ * @owner:	Owning module (can be NULL)
+ *
+ * Returns the irq number or error code
+ */
+int __ref
+__irq_realloc_desc(int irq, int node, struct module *owner)
+{
+	if (!__irq_is_reserved(irq))
+		return -EINVAL;
+
+	if (irq_to_desc(irq))
+		free_desc(irq);
+
+	return alloc_descs(irq, 1, node, owner);
+}
+EXPORT_SYMBOL_GPL(__irq_realloc_desc);
+
+/**
  * irq_reserve_irqs - mark irqs allocated
  * @from:	mark from irq number
  * @cnt:	number of irqs to mark
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 12/27] x86, irq: Move down arch_early_irq_init()
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (10 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at() Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-07 22:30 ` [PATCH v3 13/27] x86, irq: Split out alloc_ioapic_save_registers() Yinghai Lu
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

Change position only.

Prepare to update arch_early_irq_init(), it needs to call some static
functions.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 89 +++++++++++++++++++++---------------------
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index a157a56..2fcf813 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -185,51 +185,6 @@ static struct irq_pin_list *alloc_irq_pin_list(int node)
 	return kzalloc_node(sizeof(struct irq_pin_list), GFP_KERNEL, node);
 }
 
-
-/* irq_cfg is indexed by the sum of all RTEs in all I/O APICs. */
-static struct irq_cfg irq_cfgx[NR_IRQS_LEGACY];
-
-int __init arch_early_irq_init(void)
-{
-	struct irq_cfg *cfg;
-	int count, node, i;
-
-	if (!legacy_pic->nr_legacy_irqs)
-		io_apic_irqs = ~0UL;
-
-	for (i = 0; i < nr_ioapics; i++) {
-		ioapics[i].saved_registers =
-			kzalloc(sizeof(struct IO_APIC_route_entry) *
-				ioapics[i].nr_registers, GFP_KERNEL);
-		if (!ioapics[i].saved_registers)
-			pr_err("IOAPIC %d: suspend/resume impossible!\n", i);
-	}
-
-	cfg = irq_cfgx;
-	count = ARRAY_SIZE(irq_cfgx);
-	node = cpu_to_node(0);
-
-	/* Make sure the legacy interrupts are marked in the bitmap */
-	irq_reserve_irqs(0, legacy_pic->nr_legacy_irqs);
-
-	for (i = 0; i < count; i++) {
-		INIT_LIST_HEAD(&cfg[i].irq_2_pin);
-		irq_set_chip_data(i, &cfg[i]);
-		zalloc_cpumask_var_node(&cfg[i].domain, GFP_KERNEL, node);
-		zalloc_cpumask_var_node(&cfg[i].old_domain, GFP_KERNEL, node);
-		/*
-		 * For legacy IRQ's, start with assigning irq0 to irq15 to
-		 * IRQ0_VECTOR to IRQ15_VECTOR for all cpu's.
-		 */
-		if (i < legacy_pic->nr_legacy_irqs) {
-			cfg[i].vector = IRQ0_VECTOR + i;
-			cpumask_setall(cfg[i].domain);
-		}
-	}
-
-	return 0;
-}
-
 static struct irq_cfg *irq_cfg(unsigned int irq)
 {
 	return irq_get_chip_data(irq);
@@ -332,6 +287,50 @@ static struct irq_cfg *realloc_irq_and_cfg_at(unsigned int at, int node)
 	return alloc_irq_and_cfg_at(at, node);
 }
 
+/* irq_cfg is indexed by the sum of all RTEs in all I/O APICs. */
+static struct irq_cfg irq_cfgx[NR_IRQS_LEGACY];
+
+int __init arch_early_irq_init(void)
+{
+	struct irq_cfg *cfg;
+	int count, node, i;
+
+	if (!legacy_pic->nr_legacy_irqs)
+		io_apic_irqs = ~0UL;
+
+	for (i = 0; i < nr_ioapics; i++) {
+		ioapics[i].saved_registers =
+			kzalloc(sizeof(struct IO_APIC_route_entry) *
+				ioapics[i].nr_registers, GFP_KERNEL);
+		if (!ioapics[i].saved_registers)
+			pr_err("IOAPIC %d: suspend/resume impossible!\n", i);
+	}
+
+	cfg = irq_cfgx;
+	count = ARRAY_SIZE(irq_cfgx);
+	node = cpu_to_node(0);
+
+	/* Make sure the legacy interrupts are marked in the bitmap */
+	irq_reserve_irqs(0, legacy_pic->nr_legacy_irqs);
+
+	for (i = 0; i < count; i++) {
+		INIT_LIST_HEAD(&cfg[i].irq_2_pin);
+		irq_set_chip_data(i, &cfg[i]);
+		zalloc_cpumask_var_node(&cfg[i].domain, GFP_KERNEL, node);
+		zalloc_cpumask_var_node(&cfg[i].old_domain, GFP_KERNEL, node);
+		/*
+		 * For legacy IRQ's, start with assigning irq0 to irq15 to
+		 * IRQ0_VECTOR to IRQ15_VECTOR for all cpu's.
+		 */
+		if (i < legacy_pic->nr_legacy_irqs) {
+			cfg[i].vector = IRQ0_VECTOR + i;
+			cpumask_setall(cfg[i].domain);
+		}
+	}
+
+	return 0;
+}
+
 struct io_apic {
 	unsigned int index;
 	unsigned int unused[3];
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 13/27] x86, irq: Split out alloc_ioapic_save_registers()
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (11 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 12/27] x86, irq: Move down arch_early_irq_init() Yinghai Lu
@ 2013-06-07 22:30 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 14/27] xen, irq: call irq_realloc_desc_at() at first Yinghai Lu
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

Split alloc_ioapic_save_registers() from early_irq_init(),
so it will be per ioapic.

Will call that later for hot-added ioapic controller.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 2fcf813..23b0be5 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -290,6 +290,19 @@ static struct irq_cfg *realloc_irq_and_cfg_at(unsigned int at, int node)
 /* irq_cfg is indexed by the sum of all RTEs in all I/O APICs. */
 static struct irq_cfg irq_cfgx[NR_IRQS_LEGACY];
 
+static void alloc_ioapic_saved_registers(int idx)
+{
+	if (ioapics[idx].saved_registers)
+		return;
+
+	ioapics[idx].saved_registers =
+			kzalloc(sizeof(struct IO_APIC_route_entry) *
+				ioapics[idx].nr_registers, GFP_KERNEL);
+
+	if (!ioapics[idx].saved_registers)
+		pr_err("IOAPIC %d: suspend/resume impossible!\n", idx);
+}
+
 int __init arch_early_irq_init(void)
 {
 	struct irq_cfg *cfg;
@@ -298,13 +311,8 @@ int __init arch_early_irq_init(void)
 	if (!legacy_pic->nr_legacy_irqs)
 		io_apic_irqs = ~0UL;
 
-	for (i = 0; i < nr_ioapics; i++) {
-		ioapics[i].saved_registers =
-			kzalloc(sizeof(struct IO_APIC_route_entry) *
-				ioapics[i].nr_registers, GFP_KERNEL);
-		if (!ioapics[i].saved_registers)
-			pr_err("IOAPIC %d: suspend/resume impossible!\n", i);
-	}
+	for (i = 0; i < nr_ioapics; i++)
+		alloc_ioapic_saved_registers(i);
 
 	cfg = irq_cfgx;
 	count = ARRAY_SIZE(irq_cfgx);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 14/27] xen, irq: call irq_realloc_desc_at() at first
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (12 preceding siblings ...)
  2013-06-07 22:30 ` [PATCH v3 13/27] x86, irq: Split out alloc_ioapic_save_registers() Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 15/27] x86, irq: pre-reserve irq range/realloc for booting path Yinghai Lu
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Konrad Rzeszutek Wilk, xen-devel

To make x86 irq allocation to be same with booting path and ioapic
hot add path, We will pre-reserve irq for all gsi at first.
We have to use realloc here, otherwise irq_alloc_desc_at will fail
because bit is already get marked for pre-reserved in irq bitmaps.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xensource.com
---
 drivers/xen/events.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 6a6bbe4..b0caa4e 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -505,8 +505,12 @@ static int __must_check xen_allocate_irq_gsi(unsigned gsi)
 	/* Legacy IRQ descriptors are already allocated by the arch. */
 	if (gsi < NR_IRQS_LEGACY)
 		irq = gsi;
-	else
-		irq = irq_alloc_desc_at(gsi, -1);
+	else {
+		/* for x86, irq already get reserved for gsi */
+		irq = irq_realloc_desc_at(gsi, -1);
+		if (irq < 0)
+			irq = irq_alloc_desc_at(gsi, -1);
+	}
 
 	xen_irq_init(irq);
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 15/27] x86, irq: pre-reserve irq range/realloc for booting path
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (13 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 14/27] xen, irq: call irq_realloc_desc_at() at first Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 16/27] x86, irq: Add ioapic_gsi_to_irq Yinghai Lu
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

We will use reserve/realloc_irq_and_cfg_at for hotplug ioapic path.

To make thing simple, we could make booting path use same code.

All gsi range will be reserved at first, and realloc will really
allocate those irq_desc/cfg when it is used.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/include/asm/io_apic.h |  1 +
 arch/x86/kernel/apic/io_apic.c | 78 ++++++++++++++++++++++++++++++------------
 2 files changed, 57 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index eaff3ad..8181fd8 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -176,6 +176,7 @@ extern void setup_ioapic_ids_from_mpc_nocheck(void);
 struct mp_ioapic_gsi{
 	u32 gsi_base;
 	u32 gsi_end;
+	u32 irq_base;
 };
 extern struct mp_ioapic_gsi  mp_gsi_routing[];
 extern u32 gsi_top;
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 23b0be5..65863bd 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -287,8 +287,47 @@ static struct irq_cfg *realloc_irq_and_cfg_at(unsigned int at, int node)
 	return alloc_irq_and_cfg_at(at, node);
 }
 
-/* irq_cfg is indexed by the sum of all RTEs in all I/O APICs. */
-static struct irq_cfg irq_cfgx[NR_IRQS_LEGACY];
+static int reserve_ioapic_gsi_irq_base(int idx)
+{
+	int irq;
+	struct mp_ioapic_gsi *gsi_cfg = mp_ioapic_gsi_routing(idx);
+	int cnt = gsi_cfg->gsi_end - gsi_cfg->gsi_base + 1;
+
+	irq = __irq_reserve_irqs(-1, gsi_cfg->gsi_base, cnt);
+	if (irq >= 0) {
+		gsi_cfg->irq_base = irq;
+		apic_printk(APIC_VERBOSE, KERN_INFO
+			"IOAPIC[%d]: apic_id %d, GSI %d-%d ==> irq %d-%d reserved\n",
+			idx, mpc_ioapic_id(idx),
+			gsi_cfg->gsi_base, gsi_cfg->gsi_end,
+			irq, irq + cnt - 1);
+	} else
+		apic_printk(APIC_VERBOSE, KERN_WARNING
+			"IOAPIC[%d]: apic_id %d, GSI %d-%d ==> irq reserve failed\n",
+			idx, mpc_ioapic_id(idx),
+			gsi_cfg->gsi_base, gsi_cfg->gsi_end);
+
+	return irq;
+}
+
+static void __init reserve_ioapic_gsi_irq_extra(void)
+{
+	int irq;
+
+	/* to prevent hot add ioapic taking those slots */
+	if (gsi_top) {
+		irq = irq_reserve_irqs(gsi_top, NR_IRQS_LEGACY);
+		if (irq >= 0)
+			apic_printk(APIC_VERBOSE, KERN_INFO
+				"IOAPIC[extra]: GSI %d-%d ==> irq %d-%d reserved\n",
+				gsi_top, gsi_top + NR_IRQS_LEGACY - 1,
+				irq, irq + NR_IRQS_LEGACY - 1);
+		else
+			apic_printk(APIC_VERBOSE, KERN_WARNING
+				"IOAPIC[extra]: GSI %d-%d ==> irq reserve failed\n",
+				gsi_top, gsi_top + NR_IRQS_LEGACY - 1);
+	}
+}
 
 static void alloc_ioapic_saved_registers(int idx)
 {
@@ -305,8 +344,9 @@ static void alloc_ioapic_saved_registers(int idx)
 
 int __init arch_early_irq_init(void)
 {
+	int node = cpu_to_node(0);
 	struct irq_cfg *cfg;
-	int count, node, i;
+	int i;
 
 	if (!legacy_pic->nr_legacy_irqs)
 		io_apic_irqs = ~0UL;
@@ -314,26 +354,19 @@ int __init arch_early_irq_init(void)
 	for (i = 0; i < nr_ioapics; i++)
 		alloc_ioapic_saved_registers(i);
 
-	cfg = irq_cfgx;
-	count = ARRAY_SIZE(irq_cfgx);
-	node = cpu_to_node(0);
+	for (i = 0; i < nr_ioapics; i++)
+		reserve_ioapic_gsi_irq_base(i);
 
-	/* Make sure the legacy interrupts are marked in the bitmap */
-	irq_reserve_irqs(0, legacy_pic->nr_legacy_irqs);
+	reserve_ioapic_gsi_irq_extra();
 
-	for (i = 0; i < count; i++) {
-		INIT_LIST_HEAD(&cfg[i].irq_2_pin);
-		irq_set_chip_data(i, &cfg[i]);
-		zalloc_cpumask_var_node(&cfg[i].domain, GFP_KERNEL, node);
-		zalloc_cpumask_var_node(&cfg[i].old_domain, GFP_KERNEL, node);
-		/*
-		 * For legacy IRQ's, start with assigning irq0 to irq15 to
-		 * IRQ0_VECTOR to IRQ15_VECTOR for all cpu's.
-		 */
-		if (i < legacy_pic->nr_legacy_irqs) {
-			cfg[i].vector = IRQ0_VECTOR + i;
-			cpumask_setall(cfg[i].domain);
-		}
+	/*
+	 * For legacy IRQ's, start with assigning irq0 to irq15 to
+	 * IRQ0_VECTOR to IRQ15_VECTOR for all cpu's.
+	 */
+	for (i = 0; i < legacy_pic->nr_legacy_irqs; i++) {
+		cfg = realloc_irq_and_cfg_at(i, node);
+		cfg->vector = IRQ0_VECTOR + i;
+		cpumask_setall(cfg->domain);
 	}
 
 	return 0;
@@ -3467,7 +3500,8 @@ int __init arch_probe_nr_irqs(void)
 	if (nr < nr_irqs)
 		nr_irqs = nr;
 
-	return NR_IRQS_LEGACY;
+	/* x86 arch code will allocate irq_desc/cfg */
+	return 0;
 }
 
 int io_apic_set_pci_routing(struct device *dev, int irq,
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 16/27] x86, irq: Add ioapic_gsi_to_irq
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (14 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 15/27] x86, irq: pre-reserve irq range/realloc for booting path Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 17/27] genirq: Bail out early in free_desc() Yinghai Lu
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Pavel Machek, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

For hot add ioapic, irq_base is not equal to gsi_base.
We need a way to do mapping between gsi and irq.

Also remove irq_to_gsi() that is confusing, just use that array
directly as only caller already check input irq before.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/include/asm/io_apic.h |  1 +
 arch/x86/kernel/acpi/boot.c    | 22 +++++-----------------
 arch/x86/kernel/apic/io_apic.c | 29 ++++++++++++++++++++++++++++-
 3 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 8181fd8..02ac411 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -180,6 +180,7 @@ struct mp_ioapic_gsi{
 };
 extern struct mp_ioapic_gsi  mp_gsi_routing[];
 extern u32 gsi_top;
+int ioapic_gsi_to_irq(u32 gsi);
 int mp_find_ioapic(u32 gsi);
 int mp_find_ioapic_pin(int ioapic, u32 gsi);
 void __init mp_register_ioapic(int id, u32 address, u32 gsi_base);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 230c8ea..7190d0d 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -111,6 +111,10 @@ static unsigned int gsi_to_irq(unsigned int gsi)
 		}
 	}
 
+#ifdef CONFIG_X86_IO_APIC
+	if (acpi_irq_model == ACPI_IRQ_MODEL_IOAPIC)
+		return ioapic_gsi_to_irq(gsi);
+#endif
 	/* Provide an identity mapping of gsi == irq
 	 * except on truly weird platforms that have
 	 * non isa irqs in the first 16 gsis.
@@ -123,22 +127,6 @@ static unsigned int gsi_to_irq(unsigned int gsi)
 	return irq;
 }
 
-static u32 irq_to_gsi(int irq)
-{
-	unsigned int gsi;
-
-	if (irq < NR_IRQS_LEGACY)
-		gsi = isa_irq_to_gsi[irq];
-	else if (irq < gsi_top)
-		gsi = irq;
-	else if (irq < (gsi_top + NR_IRQS_LEGACY))
-		gsi = irq - gsi_top;
-	else
-		gsi = 0xffffffff;
-
-	return gsi;
-}
-
 /*
  * Temporarily use the virtual area starting from FIX_IO_APIC_BASE_END,
  * to map the target physical address. The problem is that set_fixmap()
@@ -528,7 +516,7 @@ int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi)
 {
 	if (isa_irq >= 16)
 		return -1;
-	*gsi = irq_to_gsi(isa_irq);
+	*gsi = isa_irq_to_gsi[isa_irq];
 	return 0;
 }
 
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 65863bd..7f95bbe 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1044,13 +1044,16 @@ static int pin_2_irq(int idx, int apic, int pin)
 
 	if (test_bit(bus, mp_bus_not_pci)) {
 		irq = mp_irqs[idx].srcbusirq;
-	} else {
+	} else if (gsi_cfg->gsi_base == gsi_cfg->irq_base) {
 		u32 gsi = gsi_cfg->gsi_base + pin;
 
 		if (gsi >= NR_IRQS_LEGACY)
 			irq = gsi;
 		else
 			irq = gsi_top + gsi;
+	} else {
+		/* hotadd ioapic */
+		irq = gsi_cfg->irq_base + pin;
 	}
 
 #ifdef CONFIG_X86_32
@@ -1476,6 +1479,30 @@ static void __init setup_IO_APIC_irqs(void)
 		__io_apic_setup_irqs(ioapic_idx);
 }
 
+int ioapic_gsi_to_irq(u32 gsi)
+{
+	int ioapic_idx = 0, irq = gsi;
+	struct mp_ioapic_gsi *gsi_cfg;
+
+	ioapic_idx = mp_find_ioapic(gsi);
+	if (ioapic_idx < 0)
+		return -1;
+
+	gsi_cfg = mp_ioapic_gsi_routing(ioapic_idx);
+	if (gsi_cfg->gsi_base == gsi_cfg->irq_base) {
+		if (gsi < NR_IRQS_LEGACY)
+			irq = gsi_top + gsi;
+	} else {
+		int pin = mp_find_ioapic_pin(ioapic_idx, gsi);
+
+		if (pin < 0)
+			return -1;
+		/* hotadd ioapic */
+		irq = gsi_cfg->irq_base + pin;
+	}
+
+	return irq;
+}
 /*
  * for the gsit that is not in first ioapic
  * but could not use acpi_register_gsi()
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 17/27] genirq: Bail out early in free_desc()
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (15 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 16/27] x86, irq: Add ioapic_gsi_to_irq Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-10 20:43   ` Thomas Gleixner
  2013-06-07 22:31 ` [PATCH v3 18/27] x86, irq: More strict checking about registering ioapic Yinghai Lu
                   ` (9 subsequent siblings)
  26 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu

We pre-reserve irq range for hot-added ioapic, and later only
some are used via realloc.
So during hot-remove, we need to clear bits in allocated_irqs
for both case.

Check if the irq_desc is there at first, and bail out early if
irq_desc is not allocated yet.
We can use irq_free_descs to clear allocated_irqs bits for
preserved irqs only.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 kernel/irq/irqdesc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index b48f65b..32f099e 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -170,6 +170,9 @@ static void free_desc(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
 
+	if (!desc)
+		return;
+
 	unregister_irq_proc(irq, desc);
 
 	mutex_lock(&sparse_irq_lock);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 18/27] x86, irq: More strict checking about registering ioapic
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (16 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 17/27] genirq: Bail out early in free_desc() Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 19/27] x86, irq: Make mp_register_ioapic handle hot-added ioapic Yinghai Lu
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

1. check overlaping gsi range
for hot-add ioapic case, BIOS may have some entries in MADT
and also have setting in pci root bus with _GSB of DSDT.

2. make bad_ioapics check idx instead of nr_ioapics.
for hotadd ioapic could find spare slot in the middle later.

3. check if entries is in right range.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 7f95bbe..850d8e99 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3818,7 +3818,7 @@ void __init ioapic_insert_resources(void)
 	}
 }
 
-int mp_find_ioapic(u32 gsi)
+static int __mp_find_ioapic(u32 gsi, bool quiet)
 {
 	int i = 0;
 
@@ -3833,10 +3833,16 @@ int mp_find_ioapic(u32 gsi)
 			return i;
 	}
 
-	printk(KERN_ERR "ERROR: Unable to locate IOAPIC for GSI %d\n", gsi);
+	if (!quiet)
+		pr_err("ERROR: Unable to locate IOAPIC for GSI %d\n", gsi);
 	return -1;
 }
 
+int mp_find_ioapic(u32 gsi)
+{
+	return __mp_find_ioapic(gsi, false);
+}
+
 int mp_find_ioapic_pin(int ioapic, u32 gsi)
 {
 	struct mp_ioapic_gsi *gsi_cfg;
@@ -3851,11 +3857,11 @@ int mp_find_ioapic_pin(int ioapic, u32 gsi)
 	return gsi - gsi_cfg->gsi_base;
 }
 
-static __init int bad_ioapic(unsigned long address)
+static __init int bad_ioapic(int idx, unsigned long address)
 {
-	if (nr_ioapics >= MAX_IO_APICS) {
+	if (idx >= MAX_IO_APICS) {
 		pr_warn("WARNING: Max # of I/O APICs (%d) exceeded (found %d), skipping\n",
-			MAX_IO_APICS, nr_ioapics);
+			MAX_IO_APICS, idx);
 		return 1;
 	}
 	if (!address) {
@@ -3886,14 +3892,17 @@ static __init int bad_ioapic_register(int idx)
 
 void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 {
-	int idx = 0;
+	int idx;
 	int entries;
 	struct mp_ioapic_gsi *gsi_cfg;
 
-	if (bad_ioapic(address))
+	idx = __mp_find_ioapic(gsi_base, true);
+	if (idx >= 0)
 		return;
 
 	idx = nr_ioapics;
+	if (bad_ioapic(idx, address))
+		return;
 
 	ioapics[idx].mp_config.type = MP_IOAPIC;
 	ioapics[idx].mp_config.flags = MPC_APIC_USABLE;
@@ -3914,6 +3923,12 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 	 * and to prevent reprogramming of IOAPIC pins (PCI GSIs).
 	 */
 	entries = io_apic_get_redir_entries(idx);
+
+	if (!entries || entries > MP_MAX_IOAPIC_PIN) {
+		clear_fixmap(FIX_IO_APIC_BASE_0 + idx);
+		return;
+	}
+
 	gsi_cfg = mp_ioapic_gsi_routing(idx);
 	gsi_cfg->gsi_base = gsi_base;
 	gsi_cfg->gsi_end = gsi_base + entries - 1;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 19/27] x86, irq: Make mp_register_ioapic handle hot-added ioapic
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (17 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 18/27] x86, irq: More strict checking about registering ioapic Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 20/27] x86, irq: Add mp_unregister_ioapic to handle hot-remove ioapic Yinghai Lu
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Paul Gortmaker,
	Joerg Roedel, Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

It needs to reserve irq range in allocated_irqs bitmaps
and irq_base will be used to get right irq for ioapic/pin or gsi.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/include/asm/mpspec.h  |  1 +
 arch/x86/kernel/apic/io_apic.c | 59 ++++++++++++++++++++++++++++++++----------
 2 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index 3e2f42a..11bb6ea 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -96,6 +96,7 @@ static inline void early_reserve_e820_mpc_new(void) { }
 
 void __cpuinit generic_processor_info(int apicid, int version);
 #ifdef CONFIG_ACPI
+int __mp_register_ioapic(int id, u32 address, u32 gsi_base, bool hot);
 extern void mp_register_ioapic(int id, u32 address, u32 gsi_base);
 extern void mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger,
 				   u32 gsi);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 850d8e99..c581c4c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3890,7 +3890,7 @@ static __init int bad_ioapic_register(int idx)
 	return 0;
 }
 
-void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
+int __mp_register_ioapic(int id, u32 address, u32 gsi_base, bool hotadd)
 {
 	int idx;
 	int entries;
@@ -3898,11 +3898,19 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 
 	idx = __mp_find_ioapic(gsi_base, true);
 	if (idx >= 0)
-		return;
+		return -EINVAL;
 
 	idx = nr_ioapics;
+	if (hotadd) {
+		/* find free spot */
+		for (idx = 0; idx < nr_ioapics; idx++)
+			if (!ioapics[idx].nr_registers &&
+			    ioapics[idx].mp_config.apicid == 0xff)
+				break;
+	}
+
 	if (bad_ioapic(idx, address))
-		return;
+		return -EINVAL;
 
 	ioapics[idx].mp_config.type = MP_IOAPIC;
 	ioapics[idx].mp_config.flags = MPC_APIC_USABLE;
@@ -3910,10 +3918,8 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 
 	set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
 
-	if (bad_ioapic_register(idx)) {
-		clear_fixmap(FIX_IO_APIC_BASE_0 + idx);
-		return;
-	}
+	if (bad_ioapic_register(idx))
+		goto failed;
 
 	ioapics[idx].mp_config.apicid = io_apic_unique_id(id);
 	ioapics[idx].mp_config.apicver = io_apic_get_version(idx);
@@ -3924,10 +3930,8 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 	 */
 	entries = io_apic_get_redir_entries(idx);
 
-	if (!entries || entries > MP_MAX_IOAPIC_PIN) {
-		clear_fixmap(FIX_IO_APIC_BASE_0 + idx);
-		return;
-	}
+	if (!entries || entries > MP_MAX_IOAPIC_PIN)
+		goto failed;
 
 	gsi_cfg = mp_ioapic_gsi_routing(idx);
 	gsi_cfg->gsi_base = gsi_base;
@@ -3938,15 +3942,42 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 	 */
 	ioapics[idx].nr_registers = entries;
 
-	if (gsi_cfg->gsi_end >= gsi_top)
-		gsi_top = gsi_cfg->gsi_end + 1;
+	if (!hotadd) {
+		/*
+		 * irqs will be reserved in arch_early_irq_init()
+		 * don't need to update gsi_top for hot add case
+		 */
+		if (gsi_cfg->gsi_end >= gsi_top)
+			gsi_top = gsi_cfg->gsi_end + 1;
+	} else {
+		int irq = reserve_ioapic_gsi_irq_base(idx);
+
+		if (irq < 0)
+			goto failed;
+
+		alloc_ioapic_saved_registers(idx);
+	}
 
 	pr_info("IOAPIC[%d]: apic_id %d, version %d, address 0x%x, GSI %d-%d\n",
 		idx, mpc_ioapic_id(idx),
 		mpc_ioapic_ver(idx), mpc_ioapic_addr(idx),
 		gsi_cfg->gsi_base, gsi_cfg->gsi_end);
 
-	nr_ioapics++;
+	if (idx == nr_ioapics)
+		nr_ioapics++;
+
+	return 0;
+
+failed:
+	clear_fixmap(FIX_IO_APIC_BASE_0 + idx);
+	memset(&ioapics[idx], 0, sizeof(struct ioapic));
+	ioapics[idx].mp_config.apicid = 0xff;
+	return -EINVAL;
+}
+
+void mp_register_ioapic(int id, u32 address, u32 gsi_base)
+{
+	__mp_register_ioapic(id, address, gsi_base, false);
 }
 
 /* Enable IOAPIC early just for system timer */
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 20/27] x86, irq: Add mp_unregister_ioapic to handle hot-remove ioapic
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (18 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 19/27] x86, irq: Make mp_register_ioapic handle hot-added ioapic Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 21/27] x86, irq: Make ioapics loop skip blank slots Yinghai Lu
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

It will free ioapic related irq_desc and also clear allocated_irqs bits.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/include/asm/mpspec.h  |  1 +
 arch/x86/kernel/apic/io_apic.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index 11bb6ea..e70b7e5 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -97,6 +97,7 @@ static inline void early_reserve_e820_mpc_new(void) { }
 void __cpuinit generic_processor_info(int apicid, int version);
 #ifdef CONFIG_ACPI
 int __mp_register_ioapic(int id, u32 address, u32 gsi_base, bool hot);
+int mp_unregister_ioapic(u32 gsi_base);
 extern void mp_register_ioapic(int id, u32 address, u32 gsi_base);
 extern void mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger,
 				   u32 gsi);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index c581c4c..8a3fedf 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -256,6 +256,14 @@ static void free_irq_at(unsigned int at, struct irq_cfg *cfg)
 	irq_free_desc(at);
 }
 
+static void free_irqs(unsigned int from, int cnt)
+{
+	int i;
+
+	for (i = from; i < from + cnt; i++)
+		free_irq_at(i, irq_get_chip_data(i));
+}
+
 static struct irq_cfg *realloc_irq_and_cfg_at(unsigned int at, int node)
 {
 	struct irq_desc *desc = irq_to_desc(at);
@@ -329,6 +337,16 @@ static void __init reserve_ioapic_gsi_irq_extra(void)
 	}
 }
 
+static void free_ioapic_gsi_irq_base(int idx)
+{
+	struct mp_ioapic_gsi *gsi_cfg = mp_ioapic_gsi_routing(idx);
+	int irq_base = gsi_cfg->irq_base;
+	int irq_cnt = gsi_cfg->gsi_end - gsi_cfg->gsi_base + 1;
+
+	if (irq_base > 0)
+		free_irqs(irq_base, irq_cnt);
+}
+
 static void alloc_ioapic_saved_registers(int idx)
 {
 	if (ioapics[idx].saved_registers)
@@ -342,6 +360,11 @@ static void alloc_ioapic_saved_registers(int idx)
 		pr_err("IOAPIC %d: suspend/resume impossible!\n", idx);
 }
 
+static void free_ioapic_saved_registers(int idx)
+{
+	kfree(ioapics[idx].saved_registers);
+}
+
 int __init arch_early_irq_init(void)
 {
 	int node = cpu_to_node(0);
@@ -3980,6 +4003,25 @@ void mp_register_ioapic(int id, u32 address, u32 gsi_base)
 	__mp_register_ioapic(id, address, gsi_base, false);
 }
 
+int mp_unregister_ioapic(u32 gsi_base)
+{
+	int idx;
+
+	idx = mp_find_ioapic(gsi_base);
+	if (idx < 0)
+		return -EINVAL;
+
+	free_ioapic_saved_registers(idx);
+
+	free_ioapic_gsi_irq_base(idx);
+
+	clear_fixmap(FIX_IO_APIC_BASE_0 + idx);
+	memset(&ioapics[idx], 0, sizeof(struct ioapic));
+	ioapics[idx].mp_config.apicid = 0xff;
+
+	return 0;
+}
+
 /* Enable IOAPIC early just for system timer */
 void __init pre_init_apic_IRQ0(void)
 {
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 21/27] x86, irq: Make ioapics loop skip blank slots
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (19 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 20/27] x86, irq: Add mp_unregister_ioapic to handle hot-remove ioapic Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 22/27] x86, ioapic: Find usable ioapic id for 64bit Yinghai Lu
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Paul Gortmaker,
	Joerg Roedel, Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

When multiple ioapics get added and removed, we could have blank slots
in middle of ioapics array.

Add skip code for this case by checking nr_registers.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 8a3fedf..970c4fd 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -793,7 +793,8 @@ int save_ioapic_entries(void)
 	int err = 0;
 
 	for (apic = 0; apic < nr_ioapics; apic++) {
-		if (!ioapics[apic].saved_registers) {
+		if (!ioapics[apic].saved_registers &&
+		    ioapics[apic].nr_registers) {
 			err = -ENOMEM;
 			continue;
 		}
@@ -899,9 +900,12 @@ static int __init find_isa_irq_apic(int irq, int type)
 	if (i < mp_irq_entries) {
 		int ioapic_idx;
 
-		for (ioapic_idx = 0; ioapic_idx < nr_ioapics; ioapic_idx++)
+		for (ioapic_idx = 0; ioapic_idx < nr_ioapics; ioapic_idx++) {
+			if (!ioapics[ioapic_idx].nr_registers)
+				continue;
 			if (mpc_ioapic_id(ioapic_idx) == mp_irqs[i].dstapic)
 				return ioapic_idx;
+		}
 	}
 
 	return -1;
@@ -1121,10 +1125,13 @@ int IO_APIC_get_PCI_irq_vector(int bus, int slot, int pin,
 	for (i = 0; i < mp_irq_entries; i++) {
 		int lbus = mp_irqs[i].srcbus;
 
-		for (ioapic_idx = 0; ioapic_idx < nr_ioapics; ioapic_idx++)
+		for (ioapic_idx = 0; ioapic_idx < nr_ioapics; ioapic_idx++) {
+			if (!ioapics[ioapic_idx].nr_registers)
+				continue;
 			if (mpc_ioapic_id(ioapic_idx) == mp_irqs[i].dstapic ||
 			    mp_irqs[i].dstapic == MP_APIC_ALL)
 				break;
+		}
 
 		if (!test_bit(lbus, mp_bus_not_pci) &&
 		    !mp_irqs[i].irqtype &&
@@ -2084,6 +2091,9 @@ void __init setup_ioapic_ids_from_mpc_nocheck(void)
 	 * Set the IOAPIC ID to the value stored in the MPC table.
 	 */
 	for (ioapic_idx = 0; ioapic_idx < nr_ioapics; ioapic_idx++) {
+		if (!ioapics[ioapic_idx].nr_registers)
+			continue;
+
 		/* Read the register 0 value */
 		raw_spin_lock_irqsave(&ioapic_lock, flags);
 		reg_00.raw = io_apic_read(ioapic_idx, 0);
@@ -3013,8 +3023,12 @@ static void ioapic_resume(void)
 {
 	int ioapic_idx;
 
-	for (ioapic_idx = nr_ioapics - 1; ioapic_idx >= 0; ioapic_idx--)
+	for (ioapic_idx = nr_ioapics - 1; ioapic_idx >= 0; ioapic_idx--) {
+		if (!ioapics[ioapic_idx].nr_registers)
+			continue;
+
 		resume_ioapic_id(ioapic_idx);
+	}
 
 	restore_ioapic_entries();
 }
@@ -3851,8 +3865,11 @@ static int __mp_find_ioapic(u32 gsi, bool quiet)
 	/* Find the IOAPIC that manages this GSI. */
 	for (i = 0; i < nr_ioapics; i++) {
 		struct mp_ioapic_gsi *gsi_cfg = mp_ioapic_gsi_routing(i);
-		if ((gsi >= gsi_cfg->gsi_base)
-		    && (gsi <= gsi_cfg->gsi_end))
+
+		if (!ioapics[i].nr_registers)
+			continue;
+
+		if ((gsi >= gsi_cfg->gsi_base) && (gsi <= gsi_cfg->gsi_end))
 			return i;
 	}
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 22/27] x86, ioapic: Find usable ioapic id for 64bit.
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (20 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 21/27] x86, irq: Make ioapics loop skip blank slots Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 23/27] x86: Move declaration for mp_register_ioapic() Yinghai Lu
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

Checking the id in register, if that is duplicated, will pick one and
update id register.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 arch/x86/kernel/apic/io_apic.c | 41 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 970c4fd..8eb4cd3 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3660,27 +3660,54 @@ static int __init io_apic_get_unique_id(int ioapic, int apic_id)
 	return apic_id;
 }
 
-static u8 __init io_apic_unique_id(u8 id)
+static u8 __init io_apic_unique_id(int idx, u8 id)
 {
 	if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) &&
 	    !APIC_XAPIC(apic_version[boot_cpu_physical_apicid]))
-		return io_apic_get_unique_id(nr_ioapics, id);
+		return io_apic_get_unique_id(idx, id);
 	else
 		return id;
 }
 #else
-static u8 __init io_apic_unique_id(u8 id)
+static u8 io_apic_unique_id(int idx, u8 id)
 {
 	int i;
+	u8 new_id;
+	unsigned long flags;
 	DECLARE_BITMAP(used, 256);
+	union IO_APIC_reg_00 reg_00;
 
 	bitmap_zero(used, 256);
-	for (i = 0; i < nr_ioapics; i++) {
+	for (i = 0; i < nr_ioapics; i++)
 		__set_bit(mpc_ioapic_id(i), used);
-	}
 	if (!test_bit(id, used))
 		return id;
-	return find_first_zero_bit(used, 256);
+
+	/* check register at first */
+	raw_spin_lock_irqsave(&ioapic_lock, flags);
+	reg_00.raw = io_apic_read(idx, 0);
+	raw_spin_unlock_irqrestore(&ioapic_lock, flags);
+	new_id = reg_00.bits.ID;
+	if (!test_bit(new_id, used)) {
+		apic_printk(APIC_VERBOSE, KERN_INFO
+			"IOAPIC[%d]: Using reg apic_id %d instead of %d\n",
+			 idx, new_id, id);
+		return new_id;
+	}
+
+	new_id = find_first_zero_bit(used, 256);
+	reg_00.bits.ID = new_id;
+	raw_spin_lock_irqsave(&ioapic_lock, flags);
+	io_apic_write(idx, 0, reg_00.raw);
+	reg_00.raw = io_apic_read(idx, 0);
+	raw_spin_unlock_irqrestore(&ioapic_lock, flags);
+
+	/* Sanity check */
+	if (reg_00.bits.ID != new_id)
+		pr_warn("IOAPIC[%d]: Unable to change apic_id to %d!\n",
+			idx, new_id);
+
+	return new_id;
 }
 #endif
 
@@ -3961,7 +3988,7 @@ int __mp_register_ioapic(int id, u32 address, u32 gsi_base, bool hotadd)
 	if (bad_ioapic_register(idx))
 		goto failed;
 
-	ioapics[idx].mp_config.apicid = io_apic_unique_id(id);
+	ioapics[idx].mp_config.apicid = io_apic_unique_id(idx, id);
 	ioapics[idx].mp_config.apicver = io_apic_get_version(idx);
 
 	/*
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 23/27] x86: Move declaration for mp_register_ioapic()
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (21 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 22/27] x86, ioapic: Find usable ioapic id for 64bit Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 24/27] PCI, x86: Make ioapic hotplug support built-in Yinghai Lu
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu

Address compiling problem that Fengguang report.

Reported-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/mpspec.h | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index e70b7e5..13439c5 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -95,10 +95,27 @@ static inline void early_reserve_e820_mpc_new(void) { }
 #endif
 
 void __cpuinit generic_processor_info(int apicid, int version);
-#ifdef CONFIG_ACPI
+
+#ifdef CONFIG_X86_IO_APIC
 int __mp_register_ioapic(int id, u32 address, u32 gsi_base, bool hot);
 int mp_unregister_ioapic(u32 gsi_base);
 extern void mp_register_ioapic(int id, u32 address, u32 gsi_base);
+#else
+static inline int __mp_register_ioapic(int id, u32 address, u32 gsi_base,
+					 bool hot)
+{
+	return 0;
+}
+static inline int mp_unregister_ioapic(u32 gsi_base)
+{
+	return 0;
+}
+static inline void mp_register_ioapic(int id, u32 address, u32 gsi_base)
+{
+}
+#endif
+
+#ifdef CONFIG_ACPI
 extern void mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger,
 				   u32 gsi);
 extern void mp_config_acpi_legacy_irqs(void);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 24/27] PCI, x86: Make ioapic hotplug support built-in
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (22 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 23/27] x86: Move declaration for mp_register_ioapic() Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 25/27] PCI, x86, ACPI: Link acpi ioapic register to ioapic Yinghai Lu
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu

ioapic hotplug should be built-in like pci root bus hotplug.

Also need to make it depends on X86_IO_APIC.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 drivers/pci/Kconfig  | 3 ++-
 drivers/pci/ioapic.c | 7 -------
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 6d51aa6..720989f 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -110,10 +110,11 @@ config PCI_PASID
 	  If unsure, say N.
 
 config PCI_IOAPIC
-	tristate "PCI IO-APIC hotplug support" if X86
+	bool "PCI IO-APIC hotplug support" if X86
 	depends on PCI
 	depends on ACPI
 	depends on HOTPLUG
+	depends on X86_IO_APIC
 	default !X86
 
 config PCI_LABEL
diff --git a/drivers/pci/ioapic.c b/drivers/pci/ioapic.c
index 3c6bbdd..7d6b157 100644
--- a/drivers/pci/ioapic.c
+++ b/drivers/pci/ioapic.c
@@ -117,13 +117,6 @@ static int __init ioapic_init(void)
 {
 	return pci_register_driver(&ioapic_driver);
 }
-
-static void __exit ioapic_exit(void)
-{
-	pci_unregister_driver(&ioapic_driver);
-}
-
 module_init(ioapic_init);
-module_exit(ioapic_exit);
 
 MODULE_LICENSE("GPL");
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 25/27] PCI, x86, ACPI: Link acpi ioapic register to ioapic
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (23 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 24/27] PCI, x86: Make ioapic hotplug support built-in Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 26/27] PCI, x86, ACPI: Enable ioapic hotplug support with acpi host bridge Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 27/27] PCI, x86, ACPI: get ioapic address from acpi device Yinghai Lu
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu

During ioapic hotplug, acpi_register_ioapic will be called.
Now for x86, that function is blank.
Fill that will update __mp_register_ioapic to use those ioapic.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/acpi/boot.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 7190d0d..5dca557 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -700,16 +700,18 @@ EXPORT_SYMBOL(acpi_unmap_lsapic);
 
 int acpi_register_ioapic(acpi_handle handle, u64 phys_addr, u32 gsi_base)
 {
-	/* TBD */
-	return -EINVAL;
+	unsigned long long id = 0;
+
+	acpi_evaluate_integer(handle, "_UID", NULL, &id);
+
+	return	__mp_register_ioapic(id, phys_addr, gsi_base, true);
 }
 
 EXPORT_SYMBOL(acpi_register_ioapic);
 
 int acpi_unregister_ioapic(acpi_handle handle, u32 gsi_base)
 {
-	/* TBD */
-	return -EINVAL;
+	return mp_unregister_ioapic(gsi_base);
 }
 
 EXPORT_SYMBOL(acpi_unregister_ioapic);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 26/27] PCI, x86, ACPI: Enable ioapic hotplug support with acpi host bridge.
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (24 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 25/27] PCI, x86, ACPI: Link acpi ioapic register to ioapic Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  2013-06-07 22:31 ` [PATCH v3 27/27] PCI, x86, ACPI: get ioapic address from acpi device Yinghai Lu
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu

We need to have ioapic setup before normal pci drivers.
otherwise other pci driver can not setup irq.

So we should not treat them as normal pci devices.
Also we will need to support ioapic hotplug without pci device around.

We need to call ioapic add/remove during host-bridge add/remove.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 drivers/acpi/pci_root.c  |   4 ++
 drivers/pci/ioapic.c     | 147 ++++++++++++++++++++++++++++++-----------------
 include/linux/pci-acpi.h |   8 +++
 3 files changed, 106 insertions(+), 53 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index e427dc5..dadcbe9 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -542,6 +542,8 @@ static int acpi_pci_root_add(struct acpi_device *device,
 	if (system_state != SYSTEM_BOOTING)
 		pci_enable_bridges(root->bus);
 
+	acpi_pci_ioapic_add(root);
+
 	pci_bus_add_devices(root->bus);
 	return 1;
 
@@ -561,6 +563,8 @@ static void acpi_pci_root_remove(struct acpi_device *device)
 
 	pci_stop_root_bus(root->bus);
 
+	acpi_pci_ioapic_remove(root);
+
 	device_set_run_wake(root->bus->bridge, false);
 	pci_acpi_remove_bus_pm_notifier(device);
 
diff --git a/drivers/pci/ioapic.c b/drivers/pci/ioapic.c
index 7d6b157..60351b2 100644
--- a/drivers/pci/ioapic.c
+++ b/drivers/pci/ioapic.c
@@ -22,101 +22,142 @@
 #include <linux/slab.h>
 #include <acpi/acpi_bus.h>
 
-struct ioapic {
-	acpi_handle	handle;
+struct acpi_pci_ioapic {
+	acpi_handle	root_handle;
 	u32		gsi_base;
+	struct pci_dev *pdev;
+	struct list_head list;
 };
 
-static int ioapic_probe(struct pci_dev *dev, const struct pci_device_id *ent)
+static LIST_HEAD(ioapic_list);
+static DEFINE_MUTEX(ioapic_list_lock);
+
+static void handle_ioapic_add(acpi_handle handle, struct pci_dev **pdev,
+				 u32 *pgsi_base)
 {
-	acpi_handle handle;
 	acpi_status status;
 	unsigned long long gsb;
-	struct ioapic *ioapic;
+	struct pci_dev *dev;
+	u32 gsi_base;
 	int ret;
 	char *type;
-	struct resource *res;
+	struct resource r;
+	struct resource *res = &r;
+	char objname[64];
+	struct acpi_buffer buffer = {sizeof(objname), objname};
 
-	handle = DEVICE_ACPI_HANDLE(&dev->dev);
-	if (!handle)
-		return -EINVAL;
+	*pdev = NULL;
+	*pgsi_base = 0;
 
 	status = acpi_evaluate_integer(handle, "_GSB", NULL, &gsb);
-	if (ACPI_FAILURE(status))
-		return -EINVAL;
-
-	/*
-	 * The previous code in acpiphp evaluated _MAT if _GSB failed, but
-	 * ACPI spec 4.0 sec 6.2.2 requires _GSB for hot-pluggable I/O APICs.
-	 */
+	if (ACPI_FAILURE(status) || !gsb)
+		return;
 
-	ioapic = kzalloc(sizeof(*ioapic), GFP_KERNEL);
-	if (!ioapic)
-		return -ENOMEM;
+	dev = acpi_get_pci_dev(handle);
+	if (!dev)
+		return;
 
-	ioapic->handle = handle;
-	ioapic->gsi_base = (u32) gsb;
+	acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
 
-	if (dev->class == PCI_CLASS_SYSTEM_PIC_IOAPIC)
-		type = "IOAPIC";
-	else
-		type = "IOxAPIC";
+	gsi_base = gsb;
+	type = "IOxAPIC";
 
 	ret = pci_enable_device(dev);
 	if (ret < 0)
-		goto exit_free;
+		goto exit_put;
 
 	pci_set_master(dev);
 
+	if (dev->class == PCI_CLASS_SYSTEM_PIC_IOAPIC)
+		type = "IOAPIC";
+
 	if (pci_request_region(dev, 0, type))
 		goto exit_disable;
 
 	res = &dev->resource[0];
-	if (acpi_register_ioapic(ioapic->handle, res->start, ioapic->gsi_base))
+
+	if (acpi_register_ioapic(handle, res->start, gsi_base))
 		goto exit_release;
 
-	pci_set_drvdata(dev, ioapic);
-	dev_info(&dev->dev, "%s at %pR, GSI %u\n", type, res, ioapic->gsi_base);
-	return 0;
+	pr_info("%s %s %s at %pR, GSI %u\n",
+		dev_name(&dev->dev), objname, type,
+		res, gsi_base);
+
+	*pdev = dev;
+	*pgsi_base = gsi_base;
+	return;
 
 exit_release:
 	pci_release_region(dev, 0);
 exit_disable:
 	pci_disable_device(dev);
-exit_free:
-	kfree(ioapic);
-	return -ENODEV;
+exit_put:
+	pci_dev_put(dev);
 }
 
-static void ioapic_remove(struct pci_dev *dev)
+static void handle_ioapic_remove(acpi_handle handle, struct pci_dev *dev,
+				  u32 gsi_base)
 {
-	struct ioapic *ioapic = pci_get_drvdata(dev);
+	acpi_unregister_ioapic(handle, gsi_base);
 
-	acpi_unregister_ioapic(ioapic->handle, ioapic->gsi_base);
 	pci_release_region(dev, 0);
 	pci_disable_device(dev);
-	kfree(ioapic);
+	pci_dev_put(dev);
 }
 
+static acpi_status register_ioapic(acpi_handle handle, u32 lvl,
+					void *context, void **rv)
+{
+	acpi_handle root_handle = context;
+	struct pci_dev *pdev;
+	u32 gsi_base;
+	struct acpi_pci_ioapic *ioapic;
 
-static DEFINE_PCI_DEVICE_TABLE(ioapic_devices) = {
-	{ PCI_DEVICE_CLASS(PCI_CLASS_SYSTEM_PIC_IOAPIC, ~0) },
-	{ PCI_DEVICE_CLASS(PCI_CLASS_SYSTEM_PIC_IOXAPIC, ~0) },
-	{ }
-};
-MODULE_DEVICE_TABLE(pci, ioapic_devices);
+	handle_ioapic_add(handle, &pdev, &gsi_base);
+	if (!gsi_base)
+		return AE_OK;
 
-static struct pci_driver ioapic_driver = {
-	.name		= "ioapic",
-	.id_table	= ioapic_devices,
-	.probe		= ioapic_probe,
-	.remove		= ioapic_remove,
-};
+	ioapic = kzalloc(sizeof(*ioapic), GFP_KERNEL);
+	if (!ioapic) {
+		pr_err("%s: cannot allocate memory\n", __func__);
+		handle_ioapic_remove(root_handle, pdev, gsi_base);
+		return AE_OK;
+	}
+	ioapic->root_handle = root_handle;
+	ioapic->pdev = pdev;
+	ioapic->gsi_base = gsi_base;
+
+	mutex_lock(&ioapic_list_lock);
+	list_add(&ioapic->list, &ioapic_list);
+	mutex_unlock(&ioapic_list_lock);
+
+	return AE_OK;
+}
 
-static int __init ioapic_init(void)
+void acpi_pci_ioapic_add(struct acpi_pci_root *root)
 {
-	return pci_register_driver(&ioapic_driver);
+	acpi_status status;
+
+	status = acpi_walk_namespace(ACPI_TYPE_DEVICE, root->device->handle,
+				     (u32)1, register_ioapic, NULL,
+				     root->device->handle,
+				     NULL);
+	if (ACPI_FAILURE(status))
+		pr_err("%s: register_ioapic failure - %d", __func__, status);
 }
-module_init(ioapic_init);
 
-MODULE_LICENSE("GPL");
+void acpi_pci_ioapic_remove(struct acpi_pci_root *root)
+{
+	struct acpi_pci_ioapic *ioapic, *tmp;
+
+	mutex_lock(&ioapic_list_lock);
+	list_for_each_entry_safe(ioapic, tmp, &ioapic_list, list) {
+		if (root->device->handle != ioapic->root_handle)
+			continue;
+		list_del(&ioapic->list);
+		handle_ioapic_remove(ioapic->root_handle, ioapic->pdev,
+					ioapic->gsi_base);
+		kfree(ioapic);
+	}
+	mutex_unlock(&ioapic_list_lock);
+}
diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
index 1704479..b2a2ced 100644
--- a/include/linux/pci-acpi.h
+++ b/include/linux/pci-acpi.h
@@ -69,6 +69,14 @@ static inline void acpiphp_remove_slots(struct pci_bus *bus) { }
 static inline void acpiphp_check_host_bridge(acpi_handle handle) { }
 #endif
 
+#ifdef CONFIG_PCI_IOAPIC
+void acpi_pci_ioapic_add(struct acpi_pci_root *root);
+void acpi_pci_ioapic_remove(struct acpi_pci_root *root);
+#else
+static inline void acpi_pci_ioapic_add(struct acpi_pci_root *root) { }
+static inline void acpi_pci_ioapic_remove(struct acpi_pci_root *root) { }
+#endif
+
 #else	/* CONFIG_ACPI */
 static inline void acpi_pci_add_bus(struct pci_bus *bus) { }
 static inline void acpi_pci_remove_bus(struct pci_bus *bus) { }
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 27/27] PCI, x86, ACPI: get ioapic address from acpi device
  2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
                   ` (25 preceding siblings ...)
  2013-06-07 22:31 ` [PATCH v3 26/27] PCI, x86, ACPI: Enable ioapic hotplug support with acpi host bridge Yinghai Lu
@ 2013-06-07 22:31 ` Yinghai Lu
  26 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-07 22:31 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki
  Cc: linux-pci, linux-kernel, Yinghai Lu

Some ioapic controllers do not show up on pci config space,
or pci device is there but no bar is used and is set by firmware in
other non standard registers.

We can get ioapic address from ACPI0009's _CRS.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 drivers/pci/ioapic.c | 86 +++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/ioapic.c b/drivers/pci/ioapic.c
index 60351b2..41f7c69 100644
--- a/drivers/pci/ioapic.c
+++ b/drivers/pci/ioapic.c
@@ -32,6 +32,36 @@ struct acpi_pci_ioapic {
 static LIST_HEAD(ioapic_list);
 static DEFINE_MUTEX(ioapic_list_lock);
 
+static acpi_status setup_res(struct acpi_resource *acpi_res, void *data)
+{
+	struct resource *res;
+	struct acpi_resource_address64 addr;
+	acpi_status status;
+	unsigned long flags;
+	u64 start, end;
+
+	status = acpi_resource_to_address64(acpi_res, &addr);
+	if (!ACPI_SUCCESS(status))
+		return AE_OK;
+
+	if (addr.resource_type == ACPI_MEMORY_RANGE) {
+		if (addr.info.mem.caching == ACPI_PREFETCHABLE_MEMORY)
+			return AE_OK;
+		flags = IORESOURCE_MEM;
+	} else
+		return AE_OK;
+
+	start = addr.minimum + addr.translation_offset;
+	end = addr.maximum + addr.translation_offset;
+
+	res = data;
+	res->flags = flags;
+	res->start = start;
+	res->end = end;
+
+	return AE_OK;
+}
+
 static void handle_ioapic_add(acpi_handle handle, struct pci_dev **pdev,
 				 u32 *pgsi_base)
 {
@@ -54,33 +84,56 @@ static void handle_ioapic_add(acpi_handle handle, struct pci_dev **pdev,
 		return;
 
 	dev = acpi_get_pci_dev(handle);
-	if (!dev)
-		return;
+	if (!dev || !pci_resource_len(dev, 0)) {
+		struct acpi_device_info *info;
+		char *hid = NULL;
+
+		status = acpi_get_object_info(handle, &info);
+		if (ACPI_FAILURE(status))
+			goto exit_put;
+		if (info->valid & ACPI_VALID_HID)
+			hid = info->hardware_id.string;
+		if (!hid || strcmp(hid, "ACPI0009")) {
+			kfree(info);
+			goto exit_put;
+		}
+		kfree(info);
+		memset(res, 0, sizeof(*res));
+		acpi_walk_resources(handle, METHOD_NAME__CRS, setup_res, res);
+		if (!res->flags)
+			goto exit_put;
+	}
 
 	acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
 
 	gsi_base = gsb;
 	type = "IOxAPIC";
+	if (dev) {
+		ret = pci_enable_device(dev);
+		if (ret < 0)
+			goto exit_put;
 
-	ret = pci_enable_device(dev);
-	if (ret < 0)
-		goto exit_put;
-
-	pci_set_master(dev);
+		pci_set_master(dev);
 
-	if (dev->class == PCI_CLASS_SYSTEM_PIC_IOAPIC)
-		type = "IOAPIC";
+		if (dev->class == PCI_CLASS_SYSTEM_PIC_IOAPIC)
+			type = "IOAPIC";
 
-	if (pci_request_region(dev, 0, type))
-		goto exit_disable;
+		if (pci_resource_len(dev, 0)) {
+			if (pci_request_region(dev, 0, type))
+				goto exit_disable;
 
-	res = &dev->resource[0];
+			res = &dev->resource[0];
+		}
+	}
 
-	if (acpi_register_ioapic(handle, res->start, gsi_base))
-		goto exit_release;
+	if (acpi_register_ioapic(handle, res->start, gsi_base)) {
+		if (dev)
+			goto exit_release;
+		return;
+	}
 
 	pr_info("%s %s %s at %pR, GSI %u\n",
-		dev_name(&dev->dev), objname, type,
+		dev ? dev_name(&dev->dev) : "", objname, type,
 		res, gsi_base);
 
 	*pdev = dev;
@@ -100,6 +153,9 @@ static void handle_ioapic_remove(acpi_handle handle, struct pci_dev *dev,
 {
 	acpi_unregister_ioapic(handle, gsi_base);
 
+	if (!dev)
+		return;
+
 	pci_release_region(dev, 0);
 	pci_disable_device(dev);
 	pci_dev_put(dev);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 02/27] x86, irq: Modify irq chip once for irq remapping
  2013-06-07 22:30 ` [PATCH v3 02/27] x86, irq: Modify irq chip once for irq remapping Yinghai Lu
@ 2013-06-09 14:54   ` Sebastian Andrzej Siewior
  2013-06-10 23:17     ` Yinghai Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-09 14:54 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior, Joerg Roedel

On Fri, Jun 07, 2013 at 03:30:48PM -0700, Yinghai Lu wrote:
> Current code: after irq remapping is enabled, irq_chip fields are modified
> during every irq setup.
> 	mp_register_gsi
> 		io_apic_set_pci_routing
> 			io_apic_setup_irq_pin
> 				setup_ioapic_irq
> 					ioapic_register_intr
> 						setup_remapped_irq
> 	native_setup_msi_irqs
> 		setup_msi_irq
> 			setup_remapped_irq
> 	default_setup_hpet_msi
> 		setup_remapped_irq
> that is not efficient.
> 
> We only need to modify those irq chip one time just after we enable
> irq mapping.

The overhead you talk about is calling setup_remapped_irq() for every
interrupt from the mp_register_gsi() call chain? MSI & HPET should happen only
once, or do I miss something?

> Change irq_remap_modify_chip_defaults() to __init as it only gets
> called during booting stage, via irq_remap_modify_chips().
> 
> Affected irq_chip: ioapic_chip, msi_chip, hpet_msi_type.
> We don't need to use #ifdef in irq_remap_modify_chips():
> IRQ_REMAP only support x86_64 and X86_IO_APIC and PCI_MSI.
> HPET_TIMER is set when x86_64 is set.
> When we have IRQ_REMAP enabled, al three chips are defined and
> used.

all, not al

Still, the user could disable hpet or apic from the commandline but this
should cause any harm as the irq chips shouldn't be used then.

> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 904611b..ff50b90 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1552,6 +1552,8 @@ void enable_x2apic(void)
>  int __init enable_IR(void)
>  {
>  #ifdef CONFIG_IRQ_REMAP
> +	int ret;
> +
>  	if (!irq_remapping_supported()) {
>  		pr_debug("intr-remapping not supported\n");
>  		return -1;
> @@ -1563,7 +1565,12 @@ int __init enable_IR(void)
>  		return -1;
>  	}
>  
> -	return irq_remapping_enable();
> +	ret = irq_remapping_enable();
> +
> +	if (ret >= 0)
> +		irq_remap_modify_chips();

This looks like ehm, well not well.
Could you please change this to:
	ret = irq_remapping_enable();
	if (ret)
		return ret;

	irq_remap_modify_chips();

> +
> +	return ret;
>  #endif
>  	return -1;
>  }
> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
> index 07ce86a..21ef344 100644
> --- a/drivers/iommu/irq_remapping.c
> +++ b/drivers/iommu/irq_remapping.c
> @@ -373,19 +373,26 @@ static void ir_print_prefix(struct irq_data *data, struct seq_file *p)
>  	seq_printf(p, " IR-%s", data->chip->name);
>  }
>  
> -static void irq_remap_modify_chip_defaults(struct irq_chip *chip)
> +static void __init irq_remap_modify_chip_defaults(struct irq_chip *chip)
>  {
> +	printk(KERN_DEBUG "irq_chip: %s ==> IR-%s", chip->name, chip->name);

If you need this please use pr_debug and add a \n at the end.

>  	chip->irq_print_chip = ir_print_prefix;
>  	chip->irq_ack = ir_ack_apic_edge;
>  	chip->irq_eoi = ir_ack_apic_level;
>  	chip->irq_set_affinity = x86_io_apic_ops.set_affinity;
>  }
>  
> +void __init irq_remap_modify_chips(void)
> +{
> +	irq_remap_modify_chip_defaults(&ioapic_chip);
> +	irq_remap_modify_chip_defaults(&msi_chip);
> +	irq_remap_modify_chip_defaults(&hpet_msi_type);
> +}
> +
>  bool setup_remapped_irq(int irq, struct irq_cfg *cfg, struct irq_chip *chip)
>  {
>  	if (!irq_remapped(cfg))
>  		return false;
>  	irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
> -	irq_remap_modify_chip_defaults(chip);

chip is not required, and can be removed.

>  	return true;
>  }

Sebastian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 05/27] x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent
  2013-06-07 22:30 ` [PATCH v3 05/27] x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent Yinghai Lu
@ 2013-06-09 15:16   ` Sebastian Andrzej Siewior
  2013-06-10 23:40     ` Yinghai Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-09 15:16 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel, Joerg Roedel,
	Konrad Rzeszutek Wilk

On Fri, Jun 07, 2013 at 03:30:51PM -0700, Yinghai Lu wrote:
> All others are using "-" instead of "_".

Who are "all others"? According to my grep it is 121 vs 44. So even without
your two here we still 42 doing it the other way around so it is not all.

However it might make sense to use _ instead of " " within a chip's name and
- as a delimiter for modes like edge or level. However that is way beyond the
scope of this patch and I am not sure if anyone finds this usefull at all.

Saying this, I am fine with this patch :)

Sebastian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr()
  2013-06-07 22:30 ` [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr() Yinghai Lu
@ 2013-06-09 15:22   ` Sebastian Andrzej Siewior
  2013-06-10 23:41     ` Yinghai Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-09 15:22 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel, Tony Luck,
	Fenghua Yu, linux-ia64

On Fri, Jun 07, 2013 at 03:30:52PM -0700, Yinghai Lu wrote:
> create_irq() will return -1 when fail to allocate.
The ia64 code here will return -ENOSPC. 

> create_irq_nr() will return 0 when fail to allocate.
> 
> Will use it to fix one return value checking for dmar_msi irq.

What about to unify the interface? Using -1 is kinda bad.

Sebastian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 07/27] iommu, irq: Allocate irq_desc for dmar_msi with local node
  2013-06-07 22:30 ` [PATCH v3 07/27] iommu, irq: Allocate irq_desc for dmar_msi with local node Yinghai Lu
@ 2013-06-09 15:31   ` Sebastian Andrzej Siewior
  2013-06-10 23:43     ` Yinghai Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-09 15:31 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel, Joerg Roedel,
	Donald Dutile

On Fri, Jun 07, 2013 at 03:30:53PM -0700, Yinghai Lu wrote:
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index a7967ce..c09e60d 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1273,7 +1273,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
>  	if (iommu->irq)
>  		return 0;
>  
> -	irq = create_irq();
> +	irq = create_irq_nr(0, iommu->node);
>  	if (!irq) {

This is okay. However you fix a bug here so you might want first fix the bug
with a Cc: stable tag and then do this.

>  		pr_err("IOMMU: no free vectors\n");
>  		return -EINVAL;

Sebastian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/27] x86, irq: kill create_irq()
  2013-06-07 22:30 ` [PATCH v3 08/27] x86, irq: kill create_irq() Yinghai Lu
@ 2013-06-09 15:35   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 47+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-09 15:35 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

On Fri, Jun 07, 2013 at 03:30:54PM -0700, Yinghai Lu wrote:
> create_irq() will return -1 when failing to allocate.
> create_irq_nr() will return 0 when failing to allocate.
> 
> It only causes confusion.
> 
> Now we have no user for create_irq(), so remove create_irq() for x86.

Oh thank you so much. That is what I just asked for :)

Sebastian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 09/27] x86, irq: Convert irq_2_pin list to generic list
  2013-06-07 22:30 ` [PATCH v3 09/27] x86, irq: Convert irq_2_pin list to generic list Yinghai Lu
@ 2013-06-09 15:52   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 47+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-09 15:52 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior, Joerg Roedel

On Fri, Jun 07, 2013 at 03:30:55PM -0700, Yinghai Lu wrote:
> Now irq_2_pin list is own grown list.
> We can use generic list to replace it so we could use generic helper
> functions to operate it.

Nice.

Sebastian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at()
  2013-06-07 22:30 ` [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at() Yinghai Lu
@ 2013-06-09 19:13   ` Sebastian Andrzej Siewior
  2013-06-10 20:13   ` Thomas Gleixner
  1 sibling, 0 replies; 47+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-09 19:13 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel, Joerg Roedel,
	Konrad Rzeszutek Wilk, Sebastian Andrzej Siewior

On Fri, Jun 07, 2013 at 03:30:57PM -0700, Yinghai Lu wrote:
> For ioapic hot-add support, it would be easy if we put all irqs
> for that ioapic controller together.
> 
> We can reserve irq range at first, then reallocate those
> pre-reserved one when it is needed.
> 
> Add realloc_irq_and_cfg_at() to really allocate irq_desc and cfg,
> because pre-reserved only mark bits in allocate_irqs bit maps.
> 
> The reasons for not allocating them during reserving:
> 1. only several pins in ioapic are used, allocate for all pins, will
>    waste memory for not used pins.
> 2. relocate later could make sure irq_desc is allocated on local node ram.
>    as dev->node is set at that point.
> 
> -v2: update changelog by adding reasons, requested by Konrad.

I think it will be better to split out the gen irq changes out of x86 / apic
specific code. 

I don't what to say. You are worried about the extra unused memory in case
of irq_desc right? The OF code has more or less the same problem. They use
irq_of_parse_and_map() / irq_create_mapping() to create a mapping between
virq (the linux number) and hw-irq (pin on the irq chip). This mapping is
created once the irq chip is detected. They don't care about virqs (aka 
linux numbers) to be in a row and they allocate all of irqdesc at once.

Once you get irqdomain to be used within ioapic, then your "linux irq
number" vs "hw irq number" should disapper. Plus you can remove the whole
gsi_number thingy. And then you start working on getting irqdesc allocated
later, say at request_irq() time and free at free_irq().

However I am not sure if this is worth it. The advantage would be that you
would once one infrastcuture for linux-number vs hw-number mapping and the
delayed irqdesc allocate + numa node rellocating.

> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
>  arch/x86/kernel/apic/io_apic.c | 32 +++++++++++++++++++++++++++++++-
>  include/linux/irq.h            |  5 +++++
>  kernel/irq/irqdesc.c           | 26 ++++++++++++++++++++++++++
>  3 files changed, 62 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 670c538..a157a56 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -301,6 +301,36 @@ static void free_irq_at(unsigned int at, struct irq_cfg *cfg)
>  	irq_free_desc(at);
>  }
>  
> +static struct irq_cfg *realloc_irq_and_cfg_at(unsigned int at, int node)
> +{
> +	struct irq_desc *desc = irq_to_desc(at);
> +	struct irq_cfg *cfg;
> +	int res;
> +
> +	if (desc) {
> +		if (irq_desc_get_irq_data(desc)->node == node)
> +			return alloc_irq_and_cfg_at(at, node);
> +
> +		cfg = irq_desc_get_chip_data(desc);
> +		if (cfg) {
> +			/* shared irq */
> +			if (!list_empty(&cfg->irq_2_pin))
> +				return cfg;
> +			free_irq_cfg(at, cfg);
> +		}
> +	}
> +
> +	res = irq_realloc_desc_at(at, node);
> +	if (res >= 0) {
> +		cfg = alloc_irq_cfg(at, node);
> +		if (cfg) {
> +			irq_set_chip_data(at, cfg);
> +			return cfg;
> +		}
> +	}

This looks somehow hackish. If irqdesc exists on another node then it
looks here like you overwrite it with a new one but __irq_realloc_desc()
deallocates the old one. As I said, this looks very hackish.

> +
> +	return alloc_irq_and_cfg_at(at, node);
> +}
>  
>  struct io_apic {
>  	unsigned int index;
> @@ -3352,7 +3382,7 @@ int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
>  static int
>  io_apic_setup_irq_pin(unsigned int irq, int node, struct io_apic_irq_attr *attr)
>  {
> -	struct irq_cfg *cfg = alloc_irq_and_cfg_at(irq, node);
> +	struct irq_cfg *cfg = realloc_irq_and_cfg_at(irq, node);
>  	int ret;
>  
>  	if (!cfg)
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 4e0fcbb..9c6c047 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -602,6 +602,11 @@ void irq_free_descs(unsigned int irq, unsigned int cnt);
>  int irq_reserve_irqs(unsigned int from, unsigned int cnt);
>  int __irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt);
>  
> +int __irq_realloc_desc(int at, int node, struct module *owner);
> +/* use macros to avoid needing export.h for THIS_MODULE */
If you put this in line with the other functions, you wouldn't need to
copy the comment.

> +#define irq_realloc_desc_at(at, node)	\
> +	__irq_realloc_desc(at, node, THIS_MODULE)
> +
>  static inline void irq_free_desc(unsigned int irq)
>  {
>  	irq_free_descs(irq, 1);
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 3b9fb92..b48f65b 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -99,6 +99,11 @@ EXPORT_SYMBOL_GPL(nr_irqs);
>  static DEFINE_MUTEX(sparse_irq_lock);
>  static DECLARE_BITMAP(allocated_irqs, IRQ_BITMAP_BITS);
>  
> +static bool __irq_is_reserved(int irq)
> +{
> +	return !!test_bit(irq, allocated_irqs);

This is not reserved, this is allocated.

> +}
> +
>  #ifdef CONFIG_SPARSE_IRQ
>  
>  static RADIX_TREE(irq_desc_tree, GFP_KERNEL);
> @@ -410,6 +415,27 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  EXPORT_SYMBOL_GPL(__irq_alloc_descs);
>  
>  /**
> + * irq_realloc_desc - allocate irq descriptor for irq that is already reserved
> + * @irq:	Allocate for specific irq number if irq >= 0
> + * @node:	Preferred node on which the irq descriptor should be allocated
> + * @owner:	Owning module (can be NULL)
> + *
> + * Returns the irq number or error code
> + */
> +int __ref
> +__irq_realloc_desc(int irq, int node, struct module *owner)
> +{
> +	if (!__irq_is_reserved(irq))
> +		return -EINVAL;

I don't like the part where it is named reserved but means allocated

> +
> +	if (irq_to_desc(irq))
> +		free_desc(irq);

and free if it is already avaiable because it should not be allocated.

> +
> +	return alloc_descs(irq, 1, node, owner);
> +}
> +EXPORT_SYMBOL_GPL(__irq_realloc_desc);

Sebastian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs
  2013-06-07 22:30 ` [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs Yinghai Lu
@ 2013-06-10 13:51   ` Alexander Gordeev
  2013-06-10 19:16     ` Yinghai Lu
  2013-06-10 19:39   ` Thomas Gleixner
  1 sibling, 1 reply; 47+ messages in thread
From: Alexander Gordeev @ 2013-06-10 13:51 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, linux-kernel

On Fri, Jun 07, 2013 at 03:30:56PM -0700, Yinghai Lu wrote:
> irq_alloc_descs and irq_reserve_irqs are almost the same.
> Separate code out to __irq_reserved_irqs, and other two reuse
> __irq_reserve_irqs.
> 
> We will use __irq_reserve_irqs for coming ioapic hotplug support.
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Alexander Gordeev <agordeev@redhat.com>
> ---
>  include/linux/irq.h  |  1 +
>  kernel/irq/irqdesc.c | 54 ++++++++++++++++++++++++++++++----------------------
>  2 files changed, 32 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index bc4e066..4e0fcbb 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -600,6 +600,7 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  
>  void irq_free_descs(unsigned int irq, unsigned int cnt);
>  int irq_reserve_irqs(unsigned int from, unsigned int cnt);
> +int __irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt);
>  
>  static inline void irq_free_desc(unsigned int irq)
>  {
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 192a302..3b9fb92 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -340,18 +340,15 @@ void irq_free_descs(unsigned int from, unsigned int cnt)
>  EXPORT_SYMBOL_GPL(irq_free_descs);
>  
>  /**
> - * irq_alloc_descs - allocate and initialize a range of irq descriptors
> - * @irq:	Allocate for specific irq number if irq >= 0
> + * __irq_reserve_descs - reserve and initialize a range of irq descriptors
> + * @irq:	Reserve for specific irq number if irq >= 0
>   * @from:	Start the search from this irq number
> - * @cnt:	Number of consecutive irqs to allocate.
> - * @node:	Preferred node on which the irq descriptor should be allocated
> - * @owner:	Owning module (can be NULL)
> + * @cnt:	Number of consecutive irqs to reserve.
>   *
>   * Returns the first irq number or error code
>   */
>  int __ref
> -__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
> -		  struct module *owner)
> +__irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt)
>  {
>  	int start, ret;
>  
> @@ -369,7 +366,7 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  	start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
>  					   from, cnt, 0);
>  	ret = -EEXIST;
> -	if (irq >=0 && start != irq)
> +	if (irq >= 0 && start != irq)
>  		goto err;
>  
>  	if (start + cnt > nr_irqs) {
> @@ -380,12 +377,36 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  
>  	bitmap_set(allocated_irqs, start, cnt);
>  	mutex_unlock(&sparse_irq_lock);
> -	return alloc_descs(start, cnt, node, owner);
> +	return start;
>  
>  err:
>  	mutex_unlock(&sparse_irq_lock);
>  	return ret;
>  }
> +
> +/**
> + * irq_alloc_descs - allocate and initialize a range of irq descriptors
> + * @irq:	Allocate for specific irq number if irq >= 0
> + * @from:	Start the search from this irq number
> + * @cnt:	Number of consecutive irqs to allocate.
> + * @node:	Preferred node on which the irq descriptor should be allocated
> + * @owner:	Owning module (can be NULL)
> + *
> + * Returns the first irq number or error code
> + */
> +int __ref
> +__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
> +		  struct module *owner)
> +{
> +	int start;
> +
> +	start = __irq_reserve_irqs(irq, from, cnt);
> +
> +	if (start < 0)
> +		return start;
> +
> +	return alloc_descs(start, cnt, node, owner);

I think alloc_descs() fail path is needed before return.

> +}
>  EXPORT_SYMBOL_GPL(__irq_alloc_descs);
>  
>  /**
> @@ -397,20 +418,7 @@ EXPORT_SYMBOL_GPL(__irq_alloc_descs);
>   */
>  int irq_reserve_irqs(unsigned int from, unsigned int cnt)
>  {
> -	unsigned int start;
> -	int ret = 0;
> -
> -	if (!cnt || (from + cnt) > nr_irqs)
> -		return -EINVAL;
> -
> -	mutex_lock(&sparse_irq_lock);
> -	start = bitmap_find_next_zero_area(allocated_irqs, nr_irqs, from, cnt, 0);
> -	if (start == from)
> -		bitmap_set(allocated_irqs, start, cnt);
> -	else
> -		ret = -EEXIST;
> -	mutex_unlock(&sparse_irq_lock);
> -	return ret;
> +	return __irq_reserve_irqs(from, from, cnt);
>  }
>  
>  /**
> -- 
> 1.8.1.4
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs
  2013-06-10 13:51   ` Alexander Gordeev
@ 2013-06-10 19:16     ` Yinghai Lu
  2013-06-10 19:42       ` Alexander Gordeev
  0 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-10 19:16 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List

On Mon, Jun 10, 2013 at 6:51 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
> On Fri, Jun 07, 2013 at 03:30:56PM -0700, Yinghai Lu wrote:
>> irq_alloc_descs and irq_reserve_irqs are almost the same.
>> Separate code out to __irq_reserved_irqs, and other two reuse
>> __irq_reserve_irqs.
>>
>> We will use __irq_reserve_irqs for coming ioapic hotplug support.
>>
>> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>> Cc: Alexander Gordeev <agordeev@redhat.com>
>> ---
>>  include/linux/irq.h  |  1 +
>>  kernel/irq/irqdesc.c | 54 ++++++++++++++++++++++++++++++----------------------
>>  2 files changed, 32 insertions(+), 23 deletions(-)
>>
>> diff --git a/include/linux/irq.h b/include/linux/irq.h
>> index bc4e066..4e0fcbb 100644
>> --- a/include/linux/irq.h
>> +++ b/include/linux/irq.h
>> @@ -600,6 +600,7 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>>
>>  void irq_free_descs(unsigned int irq, unsigned int cnt);
>>  int irq_reserve_irqs(unsigned int from, unsigned int cnt);
>> +int __irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt);
>>
>>  static inline void irq_free_desc(unsigned int irq)
>>  {
>> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
>> index 192a302..3b9fb92 100644
>> --- a/kernel/irq/irqdesc.c
>> +++ b/kernel/irq/irqdesc.c
>> @@ -340,18 +340,15 @@ void irq_free_descs(unsigned int from, unsigned int cnt)
>>  EXPORT_SYMBOL_GPL(irq_free_descs);
>>
>>  /**
>> - * irq_alloc_descs - allocate and initialize a range of irq descriptors
>> - * @irq:     Allocate for specific irq number if irq >= 0
>> + * __irq_reserve_descs - reserve and initialize a range of irq descriptors
>> + * @irq:     Reserve for specific irq number if irq >= 0
>>   * @from:    Start the search from this irq number
>> - * @cnt:     Number of consecutive irqs to allocate.
>> - * @node:    Preferred node on which the irq descriptor should be allocated
>> - * @owner:   Owning module (can be NULL)
>> + * @cnt:     Number of consecutive irqs to reserve.
>>   *
>>   * Returns the first irq number or error code
>>   */
>>  int __ref
>> -__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>> -               struct module *owner)
>> +__irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt)
>>  {
>>       int start, ret;
>>
>> @@ -369,7 +366,7 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>>       start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
>>                                          from, cnt, 0);
>>       ret = -EEXIST;
>> -     if (irq >=0 && start != irq)
>> +     if (irq >= 0 && start != irq)
>>               goto err;
>>
>>       if (start + cnt > nr_irqs) {
>> @@ -380,12 +377,36 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>>
>>       bitmap_set(allocated_irqs, start, cnt);
>>       mutex_unlock(&sparse_irq_lock);
>> -     return alloc_descs(start, cnt, node, owner);
>> +     return start;
>>
>>  err:
>>       mutex_unlock(&sparse_irq_lock);
>>       return ret;
>>  }
>> +
>> +/**
>> + * irq_alloc_descs - allocate and initialize a range of irq descriptors
>> + * @irq:     Allocate for specific irq number if irq >= 0
>> + * @from:    Start the search from this irq number
>> + * @cnt:     Number of consecutive irqs to allocate.
>> + * @node:    Preferred node on which the irq descriptor should be allocated
>> + * @owner:   Owning module (can be NULL)
>> + *
>> + * Returns the first irq number or error code
>> + */
>> +int __ref
>> +__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>> +               struct module *owner)
>> +{
>> +     int start;
>> +
>> +     start = __irq_reserve_irqs(irq, from, cnt);
>> +
>> +     if (start < 0)
>> +             return start;
>> +
>> +     return alloc_descs(start, cnt, node, owner);
>
> I think alloc_descs() fail path is needed before return.

__irq_reserve_irqs already return -EEXIST etc,

old kernel is like:

>> @@ -380,12 +377,36 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,

       bitmap_set(allocated_irqs, start, cnt);
       mutex_unlock(&sparse_irq_lock);
     return alloc_descs(start, cnt, node, owner);

 err:
      mutex_unlock(&sparse_irq_lock);
      return ret;

so i don't change the fail path handling.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs
  2013-06-07 22:30 ` [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs Yinghai Lu
  2013-06-10 13:51   ` Alexander Gordeev
@ 2013-06-10 19:39   ` Thomas Gleixner
  2013-06-10 23:55     ` Yinghai Lu
  1 sibling, 1 reply; 47+ messages in thread
From: Thomas Gleixner @ 2013-06-10 19:39 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, H. Peter Anvin, Bjorn Helgaas, Rafael J. Wysocki,
	linux-pci, linux-kernel, Alexander Gordeev

On Fri, 7 Jun 2013, Yinghai Lu wrote:
>  /**
> - * irq_alloc_descs - allocate and initialize a range of irq descriptors
> - * @irq:	Allocate for specific irq number if irq >= 0
> + * __irq_reserve_descs - reserve and initialize a range of irq descriptors

Did you even bother to run docbook ?

No you didn't. Otherwise you'd have noticed that documentation for
__irq_reserve_descs is not the right thing for a function named
__irq_reserve_irqs.

> + * __irq_reserve_descs - reserve and initialize a range of irq descriptors

The function only reserves a range of irq descriptors and does not
initialize them.

> +
> +/**
> + * irq_alloc_descs - allocate and initialize a range of irq descriptors

Again you document non matching function names.

> +int __ref
> +__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
> +		  struct module *owner)

Sloppy as usual along with a sucky changelog as usual.....

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs
  2013-06-10 19:16     ` Yinghai Lu
@ 2013-06-10 19:42       ` Alexander Gordeev
  0 siblings, 0 replies; 47+ messages in thread
From: Alexander Gordeev @ 2013-06-10 19:42 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List

On Mon, Jun 10, 2013 at 12:16:16PM -0700, Yinghai Lu wrote:
> On Mon, Jun 10, 2013 at 6:51 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
> >> +/**
> >> + * irq_alloc_descs - allocate and initialize a range of irq descriptors
> >> + * @irq:     Allocate for specific irq number if irq >= 0
> >> + * @from:    Start the search from this irq number
> >> + * @cnt:     Number of consecutive irqs to allocate.
> >> + * @node:    Preferred node on which the irq descriptor should be allocated
> >> + * @owner:   Owning module (can be NULL)
> >> + *
> >> + * Returns the first irq number or error code
> >> + */
> >> +int __ref
> >> +__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
> >> +               struct module *owner)
> >> +{
> >> +     int start;
> >> +
> >> +     start = __irq_reserve_irqs(irq, from, cnt);
> >> +
> >> +     if (start < 0)
> >> +             return start;
> >> +
> >> +     return alloc_descs(start, cnt, node, owner);
> >
> > I think alloc_descs() fail path is needed before return.
> 
> __irq_reserve_irqs already return -EEXIST etc,
> 
> old kernel is like:
> 
> >> @@ -380,12 +377,36 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
> 
>        bitmap_set(allocated_irqs, start, cnt);
>        mutex_unlock(&sparse_irq_lock);
>      return alloc_descs(start, cnt, node, owner);
> 
>  err:
>       mutex_unlock(&sparse_irq_lock);
>       return ret;
> 
> so i don't change the fail path handling.

I rather meant the bits should be unset in case alloc_descs() failed.
But I failed to notice alloc_descs() does it. Therefore..

Reviewed-by: Alexander Gordeev <agordeev@redhat.com>

> Thanks
> 
> Yinghai

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at()
  2013-06-07 22:30 ` [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at() Yinghai Lu
  2013-06-09 19:13   ` Sebastian Andrzej Siewior
@ 2013-06-10 20:13   ` Thomas Gleixner
  1 sibling, 0 replies; 47+ messages in thread
From: Thomas Gleixner @ 2013-06-10 20:13 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, H. Peter Anvin, Bjorn Helgaas, Rafael J. Wysocki,
	linux-pci, linux-kernel, Joerg Roedel, Konrad Rzeszutek Wilk,
	Sebastian Andrzej Siewior

On Fri, 7 Jun 2013, Yinghai Lu wrote:

> For ioapic hot-add support, it would be easy if we put all irqs
> for that ioapic controller together.
> 
> We can reserve irq range at first, then reallocate those

No. We do not reallocate something which does not exist in the first
place.

> pre-reserved one when it is needed.
> 
> Add realloc_irq_and_cfg_at() to really allocate irq_desc and cfg,
> because pre-reserved only mark bits in allocate_irqs bit maps.
> 
> The reasons for not allocating them during reserving:
> 1. only several pins in ioapic are used, allocate for all pins, will
>    waste memory for not used pins.
> 2. relocate later could make sure irq_desc is allocated on local node ram.
>    as dev->node is set at that point.

This is not relocating. Your changelog sucks as much as your code.
 
> -v2: update changelog by adding reasons, requested by Konrad.
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
>  arch/x86/kernel/apic/io_apic.c | 32 +++++++++++++++++++++++++++++++-
>  include/linux/irq.h            |  5 +++++
>  kernel/irq/irqdesc.c           | 26 ++++++++++++++++++++++++++

No, we do not add new code to the core and use it in the same patch at
some random other place. The core code change wants to be separate and
have a separate changelog.

> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -602,6 +602,11 @@ void irq_free_descs(unsigned int irq, unsigned int cnt);
>  int irq_reserve_irqs(unsigned int from, unsigned int cnt);
>  int __irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt);
>  
> +int __irq_realloc_desc(int at, int node, struct module *owner);
> +/* use macros to avoid needing export.h for THIS_MODULE */

You must be kidding. export.h has been split out from module.h exactly
to avoid horrible comments like the above and nonsense like this:

> +#define irq_realloc_desc_at(at, node)	\
> +	__irq_realloc_desc(at, node, THIS_MODULE)
> +

> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 3b9fb92..b48f65b 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -99,6 +99,11 @@ EXPORT_SYMBOL_GPL(nr_irqs);
>  static DEFINE_MUTEX(sparse_irq_lock);
>  static DECLARE_BITMAP(allocated_irqs, IRQ_BITMAP_BITS);
>  
> +static bool __irq_is_reserved(int irq)
> +{
> +	return !!test_bit(irq, allocated_irqs);

What's the point of this? Why not use test_bit() directly in the code?

If we really want this to be a function, then it should be inline and
it could do without the pointless and !! nonsense.

>  static RADIX_TREE(irq_desc_tree, GFP_KERNEL);
> @@ -410,6 +415,27 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  EXPORT_SYMBOL_GPL(__irq_alloc_descs);
>  
>  /**
> + * irq_realloc_desc - allocate irq descriptor for irq that is already reserved

And of course you are documenting crap again.

> + * @irq:	Allocate for specific irq number if irq >= 0
> + * @node:	Preferred node on which the irq descriptor should be allocated
> + * @owner:	Owning module (can be NULL)
> + *
> + * Returns the irq number or error code
> + */
> +int __ref
> +__irq_realloc_desc(int irq, int node, struct module *owner)

What's the point of this line split ?

> +{
> +	if (!__irq_is_reserved(irq))
> +		return -EINVAL;

So this function can operate safely w/o holding sparse_irq_lock?

> +	if (irq_to_desc(irq))
> +		free_desc(irq);

You unconditionally throw away an existing irq descriptor? No, you
should bail out here. The function name is a misnomer as it does not
match the funciton description:

irq_realloc_desc - allocate irq descriptor for irq that is already reserved

You want to allocate an irq descriptor for a reserved irq. That's what
the function is about, not about reallocating an existing irq
descriptor. 

So what you want is:

irq_alloc_reserved_desc - allocate irq descriptor for irq that is already reserved

and then bail out if the irq descriptor already exists.

> +	return alloc_descs(irq, 1, node, owner);

> +EXPORT_SYMBOL_GPL(__irq_realloc_desc);

What's the point of exporting this?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 17/27] genirq: Bail out early in free_desc()
  2013-06-07 22:31 ` [PATCH v3 17/27] genirq: Bail out early in free_desc() Yinghai Lu
@ 2013-06-10 20:43   ` Thomas Gleixner
  0 siblings, 0 replies; 47+ messages in thread
From: Thomas Gleixner @ 2013-06-10 20:43 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, H. Peter Anvin, Bjorn Helgaas, Rafael J. Wysocki,
	linux-pci, linux-kernel

On Fri, 7 Jun 2013, Yinghai Lu wrote:

> We pre-reserve irq range for hot-added ioapic, and later only
> some are used via realloc.
> So during hot-remove, we need to clear bits in allocated_irqs
> for both case.
> 
> Check if the irq_desc is there at first, and bail out early if
> irq_desc is not allocated yet.
> We can use irq_free_descs to clear allocated_irqs bits for
> preserved irqs only.

This changelog is a nightmare as usual. 

It has nothing to do with pre reservation and hot-added ioapics. This
is generic code and does not care at all about x86 specific crappola.

Your change is adding a generic sanity check into free_desc().

So first of all the patch subject is bogus:

   genirq: Bail out early in free_desc()

That's missing _WHY_ it bails out early.

And then the changelog itself drivels about completely irrelevant
nonsense instead of explaining the change itself.

So what I want to see here is something like this:

"genirq: Do not free unallocated irq descriptors

 Hot-added interrupt controllers can reserve a range of interrupt
 numbers, but only allocate some of them. To simplify the release on
 hot-remove allow them to iterate over the reserved range and let the
 free_desc() code return early when the descriptor does not exist."

Can you see the difference?

I told you more than once, that I'm not accepting your sloppy crap
anymore. I'm simply not buying your claim that you think "chinese"
when writing "english". You are simply too lazy to give a rats ass
about it. That applies to your code and to your changelogs in the same
way.
 
I'm really tired of dealing with shit like this.

No thanks

   tglx

> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
>  kernel/irq/irqdesc.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index b48f65b..32f099e 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -170,6 +170,9 @@ static void free_desc(unsigned int irq)
>  {
>  	struct irq_desc *desc = irq_to_desc(irq);
>  
> +	if (!desc)
> +		return;
> +
>  	unregister_irq_proc(irq, desc);
>  
>  	mutex_lock(&sparse_irq_lock);
> -- 
> 1.8.1.4
> 
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 02/27] x86, irq: Modify irq chip once for irq remapping
  2013-06-09 14:54   ` Sebastian Andrzej Siewior
@ 2013-06-10 23:17     ` Yinghai Lu
  0 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-10 23:17 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List,
	Konrad Rzeszutek Wilk, Joerg Roedel

On Sun, Jun 9, 2013 at 7:54 AM, Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
> On Fri, Jun 07, 2013 at 03:30:48PM -0700, Yinghai Lu wrote:
>> Current code: after irq remapping is enabled, irq_chip fields are modified
>> during every irq setup.
>>       mp_register_gsi
>>               io_apic_set_pci_routing
>>                       io_apic_setup_irq_pin
>>                               setup_ioapic_irq
>>                                       ioapic_register_intr
>>                                               setup_remapped_irq
>>       native_setup_msi_irqs
>>               setup_msi_irq
>>                       setup_remapped_irq
>>       default_setup_hpet_msi
>>               setup_remapped_irq
>> that is not efficient.
>>
>> We only need to modify those irq chip one time just after we enable
>> irq mapping.
>
> The overhead you talk about is calling setup_remapped_irq() for every
> interrupt from the mp_register_gsi() call chain? MSI & HPET should happen only
> once, or do I miss something?

yes, and for every pci_enable_msi per pci device.

>
>> Change irq_remap_modify_chip_defaults() to __init as it only gets
>> called during booting stage, via irq_remap_modify_chips().
>>
>> Affected irq_chip: ioapic_chip, msi_chip, hpet_msi_type.
>> We don't need to use #ifdef in irq_remap_modify_chips():
>> IRQ_REMAP only support x86_64 and X86_IO_APIC and PCI_MSI.
>> HPET_TIMER is set when x86_64 is set.
>> When we have IRQ_REMAP enabled, al three chips are defined and
>> used.
>
> all, not al
>
> Still, the user could disable hpet or apic from the commandline but this
> should cause any harm as the irq chips shouldn't be used then.

should "not" ?

>
>> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
>> index 904611b..ff50b90 100644
>> --- a/arch/x86/kernel/apic/apic.c
>> +++ b/arch/x86/kernel/apic/apic.c
>> @@ -1552,6 +1552,8 @@ void enable_x2apic(void)
>>  int __init enable_IR(void)
>>  {
>>  #ifdef CONFIG_IRQ_REMAP
>> +     int ret;
>> +
>>       if (!irq_remapping_supported()) {
>>               pr_debug("intr-remapping not supported\n");
>>               return -1;
>> @@ -1563,7 +1565,12 @@ int __init enable_IR(void)
>>               return -1;
>>       }
>>
>> -     return irq_remapping_enable();
>> +     ret = irq_remapping_enable();
>> +
>> +     if (ret >= 0)
>> +             irq_remap_modify_chips();
>
> This looks like ehm, well not well.
> Could you please change this to:
>         ret = irq_remapping_enable();
>         if (ret)
>                 return ret;
>
>         irq_remap_modify_chips();
>
>> +
>> +     return ret;

So you want to use

        ret = irq_remapping_enable();
        if (ret < 0)
                return ret;

         irq_remap_modify_chips();

         return ret;

instead of:

        ret = irq_remapping_enable();
        if (ret >= 0)
            irq_remap_modify_chips();

         return ret;


>>  #endif

>>       return -1;
>>  }
>> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
>> index 07ce86a..21ef344 100644
>> --- a/drivers/iommu/irq_remapping.c
>> +++ b/drivers/iommu/irq_remapping.c
>> @@ -373,19 +373,26 @@ static void ir_print_prefix(struct irq_data *data, struct seq_file *p)
>>       seq_printf(p, " IR-%s", data->chip->name);
>>  }
>>
>> -static void irq_remap_modify_chip_defaults(struct irq_chip *chip)
>> +static void __init irq_remap_modify_chip_defaults(struct irq_chip *chip)
>>  {
>> +     printk(KERN_DEBUG "irq_chip: %s ==> IR-%s", chip->name, chip->name);
>
> If you need this please use pr_debug and add a \n at the end.

No, I hate pr_debug, as it is useless unless have DEBUG defined.
later even ask user to append "debug ignore_loglevel", we still get nothing
unless user recompile the kernel.

will add "\n", not sure how that get dropped.

>
>>       chip->irq_print_chip = ir_print_prefix;
>>       chip->irq_ack = ir_ack_apic_edge;
>>       chip->irq_eoi = ir_ack_apic_level;
>>       chip->irq_set_affinity = x86_io_apic_ops.set_affinity;
>>  }
>>
>> +void __init irq_remap_modify_chips(void)
>> +{
>> +     irq_remap_modify_chip_defaults(&ioapic_chip);
>> +     irq_remap_modify_chip_defaults(&msi_chip);
>> +     irq_remap_modify_chip_defaults(&hpet_msi_type);
>> +}
>> +
>>  bool setup_remapped_irq(int irq, struct irq_cfg *cfg, struct irq_chip *chip)
>>  {
>>       if (!irq_remapped(cfg))
>>               return false;
>>       irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
>> -     irq_remap_modify_chip_defaults(chip);
>
> chip is not required, and can be removed.

yes.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 05/27] x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent
  2013-06-09 15:16   ` Sebastian Andrzej Siewior
@ 2013-06-10 23:40     ` Yinghai Lu
  0 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-10 23:40 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List,
	Joerg Roedel, Konrad Rzeszutek Wilk

On Sun, Jun 9, 2013 at 8:16 AM, Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
> On Fri, Jun 07, 2013 at 03:30:51PM -0700, Yinghai Lu wrote:
>> All others are using "-" instead of "_".
>
> Who are "all others"? According to my grep it is 121 vs 44. So even without
> your two here we still 42 doing it the other way around so it is not all.

I mean in arch/x86

arch/x86/kernel/apic/io_apic.c:struct irq_chip ioapic_chip __read_mostly = {
arch/x86/kernel/apic/io_apic.c- .name                   = "IO-APIC",
--
arch/x86/kernel/apic/io_apic.c:static struct irq_chip lapic_chip
__read_mostly = {
arch/x86/kernel/apic/io_apic.c- .name           = "local-APIC",
--
arch/x86/kernel/apic/io_apic.c:struct irq_chip msi_chip = {
arch/x86/kernel/apic/io_apic.c- .name                   = "PCI-MSI",
--
arch/x86/kernel/apic/io_apic.c:static struct irq_chip dmar_msi_type = {
arch/x86/kernel/apic/io_apic.c- .name                   = "DMAR_MSI",
--
arch/x86/kernel/apic/io_apic.c:struct irq_chip hpet_msi_type = {
arch/x86/kernel/apic/io_apic.c- .name = "HPET_MSI",
--
arch/x86/kernel/apic/io_apic.c:static struct irq_chip ht_irq_chip = {
arch/x86/kernel/apic/io_apic.c- .name                   = "PCI-HT",
--
arch/x86/kernel/i8259.c:struct irq_chip i8259A_chip = {
arch/x86/kernel/i8259.c-        .name           = "XT-PIC",
--
arch/x86/lguest/boot.c:static struct irq_chip lguest_irq_controller = {
arch/x86/lguest/boot.c- .name           = "lguest",
--
arch/x86/platform/uv/uv_irq.c:static struct irq_chip uv_irq_chip = {
arch/x86/platform/uv/uv_irq.c-  .name                   = "UV-CORE",
--
arch/x86/platform/visws/visws_quirks.c:static struct irq_chip
cobalt_irq_type = {
arch/x86/platform/visws/visws_quirks.c- .name           = "Cobalt-APIC",
--
arch/x86/platform/visws/visws_quirks.c:static struct irq_chip
piix4_master_irq_type = {
arch/x86/platform/visws/visws_quirks.c- .name           = "PIIX4-master",
--
arch/x86/platform/visws/visws_quirks.c:static struct irq_chip
piix4_virtual_irq_type = {
arch/x86/platform/visws/visws_quirks.c- .name           = "PIIX4-virtual",
--


>
> However it might make sense to use _ instead of " " within a chip's name and
> - as a delimiter for modes like edge or level. However that is way beyond the
> scope of this patch and I am not sure if anyone finds this usefull at all.
>
> Saying this, I am fine with this patch :)

ok.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr()
  2013-06-09 15:22   ` Sebastian Andrzej Siewior
@ 2013-06-10 23:41     ` Yinghai Lu
  2013-06-11 21:52       ` Luck, Tony
  0 siblings, 1 reply; 47+ messages in thread
From: Yinghai Lu @ 2013-06-10 23:41 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List,
	Tony Luck, Fenghua Yu, linux-ia64

On Sun, Jun 9, 2013 at 8:22 AM, Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
> On Fri, Jun 07, 2013 at 03:30:52PM -0700, Yinghai Lu wrote:
>> create_irq() will return -1 when fail to allocate.
> The ia64 code here will return -ENOSPC.
>
>> create_irq_nr() will return 0 when fail to allocate.
>>
>> Will use it to fix one return value checking for dmar_msi irq.
>
> What about to unify the interface? Using -1 is kinda bad.

after some following patch, create_irq() in x86 get killed.

Still need ia64 guys to kill create_irq() in arch/ia64.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 07/27] iommu, irq: Allocate irq_desc for dmar_msi with local node
  2013-06-09 15:31   ` Sebastian Andrzej Siewior
@ 2013-06-10 23:43     ` Yinghai Lu
  0 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-10 23:43 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List,
	Joerg Roedel, Donald Dutile

On Sun, Jun 9, 2013 at 8:31 AM, Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
> On Fri, Jun 07, 2013 at 03:30:53PM -0700, Yinghai Lu wrote:
>> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
>> index a7967ce..c09e60d 100644
>> --- a/drivers/iommu/dmar.c
>> +++ b/drivers/iommu/dmar.c
>> @@ -1273,7 +1273,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
>>       if (iommu->irq)
>>               return 0;
>>
>> -     irq = create_irq();
>> +     irq = create_irq_nr(0, iommu->node);
>>       if (!irq) {
>
> This is okay. However you fix a bug here so you might want first fix the bug
> with a Cc: stable tag and then do this.

ok, will add cc to stable to two related patches.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs
  2013-06-10 19:39   ` Thomas Gleixner
@ 2013-06-10 23:55     ` Yinghai Lu
  0 siblings, 0 replies; 47+ messages in thread
From: Yinghai Lu @ 2013-06-10 23:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, Bjorn Helgaas, Rafael J. Wysocki,
	linux-pci, Linux Kernel Mailing List, Alexander Gordeev

On Mon, Jun 10, 2013 at 12:39 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 7 Jun 2013, Yinghai Lu wrote:
>>  /**
>> - * irq_alloc_descs - allocate and initialize a range of irq descriptors
>> - * @irq:     Allocate for specific irq number if irq >= 0
>> + * __irq_reserve_descs - reserve and initialize a range of irq descriptors
>
> Did you even bother to run docbook ?
>
> No you didn't. Otherwise you'd have noticed that documentation for
> __irq_reserve_descs is not the right thing for a function named
> __irq_reserve_irqs.

sorry, will them consistent.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 47+ messages in thread

* RE: [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr()
  2013-06-10 23:41     ` Yinghai Lu
@ 2013-06-11 21:52       ` Luck, Tony
  0 siblings, 0 replies; 47+ messages in thread
From: Luck, Tony @ 2013-06-11 21:52 UTC (permalink / raw)
  To: Yinghai Lu, Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Bjorn Helgaas,
	Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List, Yu,
	Fenghua, linux-ia64

> Still need ia64 guys to kill create_irq() in arch/ia64.

Was there already a patch to do that? I'm afraid my eyes tend to glaze over when I
see [part 64/87: ia64 ...] and assume that its general cleanup that will flow through
with all the other parts of the patch series.  Please point me at something that you
want me to apply.

If not - then what is the recommended replacement? There seem to be a half dozen
places in arch/ia64 where create_irq() is called.

-Tony

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2013-06-11 21:53 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-07 22:30 [PATCH v3 00/27] x86, irq: support ioapic device hotplug Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 01/27] x86, irq: Change irq_remap_modify_chip_defaults to static Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 02/27] x86, irq: Modify irq chip once for irq remapping Yinghai Lu
2013-06-09 14:54   ` Sebastian Andrzej Siewior
2013-06-10 23:17     ` Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 03/27] x86, irq: Print out MSI/MSI-X clearly Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 04/27] x86, irq: Show MSI-X in /proc/interrupt Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 05/27] x86, irq: Make dmar_msi/hpet_msi irq_chip name consistent Yinghai Lu
2013-06-09 15:16   ` Sebastian Andrzej Siewior
2013-06-10 23:40     ` Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr() Yinghai Lu
2013-06-09 15:22   ` Sebastian Andrzej Siewior
2013-06-10 23:41     ` Yinghai Lu
2013-06-11 21:52       ` Luck, Tony
2013-06-07 22:30 ` [PATCH v3 07/27] iommu, irq: Allocate irq_desc for dmar_msi with local node Yinghai Lu
2013-06-09 15:31   ` Sebastian Andrzej Siewior
2013-06-10 23:43     ` Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 08/27] x86, irq: kill create_irq() Yinghai Lu
2013-06-09 15:35   ` Sebastian Andrzej Siewior
2013-06-07 22:30 ` [PATCH v3 09/27] x86, irq: Convert irq_2_pin list to generic list Yinghai Lu
2013-06-09 15:52   ` Sebastian Andrzej Siewior
2013-06-07 22:30 ` [PATCH v3 10/27] genirq: Split __irq_reserve_irqs from irq_alloc_descs Yinghai Lu
2013-06-10 13:51   ` Alexander Gordeev
2013-06-10 19:16     ` Yinghai Lu
2013-06-10 19:42       ` Alexander Gordeev
2013-06-10 19:39   ` Thomas Gleixner
2013-06-10 23:55     ` Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at() Yinghai Lu
2013-06-09 19:13   ` Sebastian Andrzej Siewior
2013-06-10 20:13   ` Thomas Gleixner
2013-06-07 22:30 ` [PATCH v3 12/27] x86, irq: Move down arch_early_irq_init() Yinghai Lu
2013-06-07 22:30 ` [PATCH v3 13/27] x86, irq: Split out alloc_ioapic_save_registers() Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 14/27] xen, irq: call irq_realloc_desc_at() at first Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 15/27] x86, irq: pre-reserve irq range/realloc for booting path Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 16/27] x86, irq: Add ioapic_gsi_to_irq Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 17/27] genirq: Bail out early in free_desc() Yinghai Lu
2013-06-10 20:43   ` Thomas Gleixner
2013-06-07 22:31 ` [PATCH v3 18/27] x86, irq: More strict checking about registering ioapic Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 19/27] x86, irq: Make mp_register_ioapic handle hot-added ioapic Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 20/27] x86, irq: Add mp_unregister_ioapic to handle hot-remove ioapic Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 21/27] x86, irq: Make ioapics loop skip blank slots Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 22/27] x86, ioapic: Find usable ioapic id for 64bit Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 23/27] x86: Move declaration for mp_register_ioapic() Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 24/27] PCI, x86: Make ioapic hotplug support built-in Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 25/27] PCI, x86, ACPI: Link acpi ioapic register to ioapic Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 26/27] PCI, x86, ACPI: Enable ioapic hotplug support with acpi host bridge Yinghai Lu
2013-06-07 22:31 ` [PATCH v3 27/27] PCI, x86, ACPI: get ioapic address from acpi device Yinghai Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).