linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] x86, MSI: Support multiple MSIs in presence of IRQ remapping
From: Alexander Gordeev @ 2012-07-31 11:41 UTC
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, Bjorn Helgaas, Suresh Siddha,
	Yinghai Lu, Matthew Wilcox

Currently multiple MSI mode is limited to a single vector per device (at
least on x86 and PPC). This series breathes life into pci_enable_msi_block()
and makes it possible to set interrupt affinity for multiple IRQs, similarly
to MSI-X, although only on x86 and only when IOMMUs are present.

Although the IRQ and PCI subsystems are modified, the current behaviour is
left intact. Drivers can start using multiple MSIs simply by following
the existing documentation.
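
A minimal userspace sketch of the negotiation loop such a driver might run,
assuming the return convention the existing MSI documentation gives for
pci_enable_msi_block(): 0 on success, a negative errno on failure, and a
positive value naming how many vectors could have been allocated instead.
fake_enable_msi_block(), negotiate_msi() and the 'available' parameter are
hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <errno.h>

/*
 * Hypothetical model of pci_enable_msi_block(): 'available' is how many
 * vectors the platform could hand out right now (an assumption of this
 * sketch, not something the real call takes as a parameter).
 */
static int fake_enable_msi_block(int available, int nvec)
{
	if (nvec <= 0)
		return -EINVAL;
	if (available <= 0)
		return -ENOSPC;
	if (nvec > available)
		return available;	/* hint: retry with this many */
	return 0;
}

/*
 * Ask for 'wanted' vectors and fall back to whatever the platform offers,
 * but never accept fewer than 'min_vecs'.
 * Returns the number of vectors enabled, or a negative errno.
 */
static int negotiate_msi(int available, int wanted, int min_vecs)
{
	int nvec = wanted;

	while (nvec >= min_vecs) {
		int rc = fake_enable_msi_block(available, nvec);

		if (rc == 0)
			return nvec;	/* all 'nvec' vectors enabled */
		if (rc < 0)
			return rc;	/* hard failure */
		nvec = rc;		/* positive: shrink the request */
	}
	return -ENOSPC;
}
```

The loop terminates because a positive return is always smaller than the
count that was just requested.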

The patches are adapted to Ingo's -tip repository, x86/apic branch.

Alexander Gordeev (3):
  x86, MSI: Support multiple MSIs in presence of IRQ remapping
  x86, MSI: Allocate as many multiple IRQs as requested
  x86, MSI: Minor readability fixes

 arch/x86/kernel/apic/io_apic.c |  170 +++++++++++++++++++++++++++++++++++++---
 drivers/pci/msi.c              |   10 ++-
 include/linux/irq.h            |    6 ++
 include/linux/msi.h            |    1 +
 kernel/irq/chip.c              |   30 +++++--
 kernel/irq/irqdesc.c           |   31 +++++++
 6 files changed, 226 insertions(+), 22 deletions(-)

-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* [PATCH 1/3] x86, MSI: Support multiple MSIs in presence of IRQ remapping
From: Alexander Gordeev @ 2012-07-31 11:42 UTC
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, Bjorn Helgaas, Suresh Siddha,
	Yinghai Lu, Matthew Wilcox

The MSI specification has several constraints in comparison with MSI-X,
the most notable of which is the inability to configure MSIs independently.
As a result, it is impossible to dispatch interrupts from different
queues to different CPUs. This largely devalues the support of
multiple MSIs in SMP systems.

Also, the need to allocate a contiguous block of vector numbers for
devices capable of multiple MSIs might put considerable pressure on the
x86 interrupt vector allocator and could lead to fragmentation of the
interrupt vector space.

This patch overcomes both drawbacks in the presence of IRQ remapping and
lets devices take advantage of multiple queues and per-IRQ affinity
assignments.
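
The constraint can be illustrated with a small userspace model. It assumes
the usual multi-message MSI behaviour: a function is programmed with one
address/data pair and, with 2^k messages enabled, may only vary the low k
bits of the message data. All messages therefore share one destination, and
without remapping the vectors must form an aligned contiguous block. The
helper names below are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/*
 * Message data sent for message 'n' when 2^log2_count messages are
 * enabled: the low log2_count bits of the programmed data are replaced
 * by the message number, everything else stays as programmed.
 */
static uint16_t msi_data_for(uint16_t programmed, unsigned int log2_count,
			     unsigned int n)
{
	uint16_t mask = (uint16_t)((1u << log2_count) - 1);

	return (uint16_t)((programmed & ~mask) | (n & mask));
}

/*
 * Without remapping, the base vector must have its low log2_count bits
 * clear, otherwise the device would overwrite part of the base itself.
 */
static int msi_base_is_aligned(unsigned int base_vector,
			       unsigned int log2_count)
{
	return (base_vector & ((1u << log2_count) - 1)) == 0;
}
```

With IRQ remapping, the per-message value roughly selects a separate
remapping table entry rather than a CPU vector directly, which is what lets
each IRQ carry its own vector and affinity.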

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
 arch/x86/kernel/apic/io_apic.c |  166 +++++++++++++++++++++++++++++++++++++--
 include/linux/irq.h            |    6 ++
 kernel/irq/chip.c              |   30 +++++--
 kernel/irq/irqdesc.c           |   31 ++++++++
 4 files changed, 216 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index a951ef7..f083049 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -305,6 +305,11 @@ static int alloc_irq_from(unsigned int from, int node)
 	return irq_alloc_desc_from(from, node);
 }
 
+static int alloc_irqs_from(unsigned int from, unsigned int count, int node)
+{
+	return irq_alloc_descs_from(from, count, node);
+}
+
 static void free_irq_at(unsigned int at, struct irq_cfg *cfg)
 {
 	free_irq_cfg(at, cfg);
@@ -3030,6 +3035,55 @@ int create_irq(void)
 	return irq;
 }
 
+unsigned int create_irqs(unsigned int from, unsigned int count, int node)
+{
+	struct irq_cfg **cfg;
+	unsigned long flags;
+	int irq, i;
+
+	if (from < nr_irqs_gsi)
+		from = nr_irqs_gsi;
+
+	cfg = kzalloc_node(count * sizeof(cfg[0]), GFP_KERNEL, node);
+	if (!cfg)
+		return 0;
+
+	irq = alloc_irqs_from(from, count, node);
+	if (irq < 0)
+		goto out_cfgs;
+
+	for (i = 0; i < count; i++) {
+		cfg[i] = alloc_irq_cfg(irq + i, node);
+		if (!cfg[i])
+			goto out_irqs;
+	}
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	for (i = 0; i < count; i++)
+		if (__assign_irq_vector(irq + i, cfg[i], apic->target_cpus()))
+			goto out_vecs;
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+
+	for (i = 0; i < count; i++) {
+		irq_set_chip_data(irq + i, cfg[i]);
+		irq_clear_status_flags(irq + i, IRQ_NOREQUEST);
+	}
+
+	kfree(cfg);
+	return irq;
+
+out_vecs:
+	for (; i; i--)
+		__clear_irq_vector(irq + i - 1, cfg[i - 1]);
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+out_irqs:
+	for (i = 0; i < count; i++)
+		free_irq_at(irq + i, cfg[i]);
+out_cfgs:
+	kfree(cfg);
+	return 0;
+}
+
 void destroy_irq(unsigned int irq)
 {
 	struct irq_cfg *cfg = irq_get_chip_data(irq);
@@ -3045,6 +3099,27 @@ void destroy_irq(unsigned int irq)
 	free_irq_at(irq, cfg);
 }
 
+static inline void destroy_irqs(unsigned int irq, unsigned int count)
+{
+	unsigned int i;
+	for (i = 0; i < count; i++)
+		destroy_irq(irq + i);
+}
+
+static inline int
+can_create_pow_of_two_irqs(unsigned int from, unsigned int count)
+{
+	if ((count > 1) && (count % 2))
+		return -EINVAL;
+
+	for (; count; count = count / 2) {
+		if (!irq_can_alloc_irqs(from, count))
+			return count;
+	}
+
+	return -ENOSPC;
+}
+
 /*
  * MSI message composition
  */
@@ -3136,18 +3211,25 @@ static struct irq_chip msi_chip = {
 	.irq_retrigger		= ioapic_retrigger_irq,
 };
 
-static int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, int irq)
+static int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
+			 unsigned int irq_base, unsigned int irq_offset)
 {
 	struct irq_chip *chip = &msi_chip;
 	struct msi_msg msg;
+	unsigned int irq = irq_base + irq_offset;
 	int ret;
 
 	ret = msi_compose_msg(dev, irq, &msg, -1);
 	if (ret < 0)
 		return ret;
 
-	irq_set_msi_desc(irq, msidesc);
-	write_msi_msg(irq, &msg);
+	irq_set_msi_desc_off(irq_base, irq_offset, msidesc);
+
+	/* MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (!irq_offset)
+		write_msi_msg(irq, &msg);
 
 	if (irq_remapped(irq_get_chip_data(irq))) {
 		irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
@@ -3161,16 +3243,12 @@ static int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, int irq)
 	return 0;
 }
 
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+int setup_msix_irqs(struct pci_dev *dev, int nvec)
 {
 	int node, ret, sub_handle, index = 0;
 	unsigned int irq, irq_want;
 	struct msi_desc *msidesc;
 
-	/* x86 doesn't support multiple MSI yet */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
-		return 1;
-
 	node = dev_to_node(&dev->dev);
 	irq_want = nr_irqs_gsi;
 	sub_handle = 0;
@@ -3199,7 +3277,7 @@ int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 				goto error;
 		}
 no_ir:
-		ret = setup_msi_irq(dev, msidesc, irq);
+		ret = setup_msi_irq(dev, msidesc, irq, 0);
 		if (ret < 0)
 			goto error;
 		sub_handle++;
@@ -3211,6 +3289,76 @@ error:
 	return ret;
 }
 
+int setup_msi_irqs(struct pci_dev *dev, int nvec)
+{
+	int node, ret, sub_handle, index = 0;
+	unsigned int irq;
+	struct msi_desc *msidesc;
+
+	if (nvec > 1 && !irq_remapping_enabled)
+		return 1;
+
+	nvec = __roundup_pow_of_two(nvec);
+	ret = can_create_pow_of_two_irqs(nr_irqs_gsi, nvec);
+	if (ret != nvec)
+		return ret;
+
+	WARN_ON(!list_is_singular(&dev->msi_list));
+	msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+	WARN_ON(msidesc->irq);
+	WARN_ON(msidesc->msi_attrib.multiple);
+
+	node = dev_to_node(&dev->dev);
+	irq = create_irqs(nr_irqs_gsi, nvec, node);
+	if (irq == 0)
+		return -ENOSPC;
+
+	if (!irq_remapping_enabled) {
+		ret = setup_msi_irq(dev, msidesc, irq, 0);
+		if (ret < 0)
+			goto error;
+		return 0;
+	}
+
+	msidesc->msi_attrib.multiple = ilog2(nvec);
+	for (sub_handle = 0; sub_handle < nvec; sub_handle++) {
+		if (!sub_handle) {
+			index = msi_alloc_remapped_irq(dev, irq, nvec);
+			if (index < 0) {
+				ret = index;
+				goto error;
+			}
+		} else {
+			ret = msi_setup_remapped_irq(dev, irq + sub_handle,
+						     index, sub_handle);
+			if (ret < 0)
+				goto error;
+		}
+		ret = setup_msi_irq(dev, msidesc, irq, sub_handle);
+		if (ret < 0)
+			goto error;
+	}
+	return 0;
+
+error:
+	destroy_irqs(irq, nvec);
+
+	/* Restore altered MSI descriptor fields and prevent just destroyed
+	 * IRQs from tearing down again in default_teardown_msi_irqs()
+	 */
+	msidesc->irq = 0;
+	msidesc->msi_attrib.multiple = 0;
+
+	return ret;
+}
+
+int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+{
+	if (type == PCI_CAP_ID_MSI)
+		return setup_msi_irqs(dev, nvec);
+	return setup_msix_irqs(dev, nvec);
+}
+
 void native_teardown_msi_irq(unsigned int irq)
 {
 	destroy_irq(irq);
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 47a937c..bccddc0 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -523,6 +523,8 @@ extern int irq_set_handler_data(unsigned int irq, void *data);
 extern int irq_set_chip_data(unsigned int irq, void *data);
 extern int irq_set_irq_type(unsigned int irq, unsigned int type);
 extern int irq_set_msi_desc(unsigned int irq, struct msi_desc *entry);
+extern int irq_set_msi_desc_off(unsigned int irq_base, unsigned int irq_offset,
+				struct msi_desc *entry);
 extern struct irq_data *irq_get_irq_data(unsigned int irq);
 
 static inline struct irq_chip *irq_get_chip(unsigned int irq)
@@ -585,8 +587,12 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 #define irq_alloc_desc_from(from, node)		\
 	irq_alloc_descs(-1, from, 1, node)
 
+#define irq_alloc_descs_from(from, cnt, node)	\
+	irq_alloc_descs(-1, from, cnt, node)
+
 void irq_free_descs(unsigned int irq, unsigned int cnt);
 int irq_reserve_irqs(unsigned int from, unsigned int cnt);
+int irq_can_alloc_irqs(unsigned int from, unsigned int cnt);
 
 static inline void irq_free_desc(unsigned int irq)
 {
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index eebd6d5..dccbec1 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -90,27 +90,41 @@ int irq_set_handler_data(unsigned int irq, void *data)
 EXPORT_SYMBOL(irq_set_handler_data);
 
 /**
- *	irq_set_msi_desc - set MSI descriptor data for an irq
- *	@irq:	Interrupt number
- *	@entry:	Pointer to MSI descriptor data
+ *	irq_set_msi_desc_off - set MSI descriptor data for an irq at offset
+ *	@irq_base:	Interrupt number base
+ *	@irq_offset:	Interrupt number offset
+ *	@entry:		Pointer to MSI descriptor data
  *
- *	Set the MSI descriptor entry for an irq
+ *	Set the MSI descriptor entry for an irq at offset
  */
-int irq_set_msi_desc(unsigned int irq, struct msi_desc *entry)
+int irq_set_msi_desc_off(unsigned int irq_base, unsigned int irq_offset,
+			 struct msi_desc *entry)
 {
 	unsigned long flags;
-	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
+	struct irq_desc *desc = irq_get_desc_lock(irq_base + irq_offset, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
 
 	if (!desc)
 		return -EINVAL;
 	desc->irq_data.msi_desc = entry;
-	if (entry)
-		entry->irq = irq;
+	if (entry && !irq_offset)
+		entry->irq = irq_base;
 	irq_put_desc_unlock(desc, flags);
 	return 0;
 }
 
 /**
+ *	irq_set_msi_desc - set MSI descriptor data for an irq
+ *	@irq:	Interrupt number
+ *	@entry:	Pointer to MSI descriptor data
+ *
+ *	Set the MSI descriptor entry for an irq
+ */
+int irq_set_msi_desc(unsigned int irq, struct msi_desc *entry)
+{
+	return irq_set_msi_desc_off(irq, 0, entry);
+}
+
+/**
  *	irq_set_chip_data - set irq chip data for an irq
  *	@irq:	Interrupt number
  *	@data:	Pointer to chip specific data
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 192a302..8287b78 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -210,6 +210,13 @@ static int irq_expand_nr_irqs(unsigned int nr)
 	return 0;
 }
 
+static int irq_can_expand_nr_irqs(unsigned int nr)
+{
+	if (nr > IRQ_BITMAP_BITS)
+		return -ENOMEM;
+	return 0;
+}
+
 int __init early_irq_init(void)
 {
 	int i, initcnt, node = first_online_node;
@@ -414,6 +421,30 @@ int irq_reserve_irqs(unsigned int from, unsigned int cnt)
 }
 
 /**
+ * irq_can_alloc_irqs - checks if a range of irqs could be allocated
+ * @from:	check from irq number
+ * @cnt:	number of irqs to check
+ *
+ * Returns 0 on success or an appropriate error code
+ */
+int irq_can_alloc_irqs(unsigned int from, unsigned int cnt)
+{
+	unsigned int start;
+	int ret = 0;
+
+	if (!cnt)
+		return -EINVAL;
+
+	mutex_lock(&sparse_irq_lock);
+	start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
+					   from, cnt, 0);
+	mutex_unlock(&sparse_irq_lock);
+	if (start + cnt > nr_irqs)
+		ret = irq_can_expand_nr_irqs(start + cnt);
+	return ret;
+}
+
+/**
  * irq_get_next_irq - get next allocated irq number
  * @offset:	where to start the search
  *
-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* [PATCH 2/3] x86, MSI: Allocate as many multiple IRQs as requested
From: Alexander Gordeev @ 2012-07-31 11:43 UTC
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, Bjorn Helgaas, Suresh Siddha,
	Yinghai Lu, Matthew Wilcox

When multiple MSIs are enabled with pci_enable_msi_block(), the number of
allocated IRQs 'nvec' is rounded up to the nearest power of two.
This can leave the number of requested and used IRQs smaller than the
number of IRQs actually allocated.

This fix introduces the 'msi_desc::nvec' field to address the issue:
when non-zero, it holds the number of allocated IRQs. Otherwise, the old
method is used.
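
The arithmetic can be sketched in userspace as follows. The helpers are
hypothetical models of the patch's ilog2(__roundup_pow_of_two(nvec))
expression and of the teardown-side choice between 'nvec' and
'1 << multiple'; they are not the kernel functions themselves.

```c
/*
 * Value for the log2-encoded 'multiple' field: the smallest k with
 * (1 << k) >= nvec. Assumes 1 <= nvec <= 32, the MSI maximum.
 */
static unsigned int msi_multiple_for(unsigned int nvec)
{
	unsigned int k = 0;

	while ((1u << k) < nvec)
		k++;
	return k;
}

/*
 * Number of IRQs to tear down: the exact count when it was recorded,
 * otherwise the old power-of-two estimate.
 */
static unsigned int irqs_to_free(unsigned int nvec, unsigned int multiple)
{
	return nvec ? nvec : 1u << multiple;
}
```

For example, nine requested vectors are encoded as multiple = 4 (16
messages), yet only nine IRQs need to be allocated and later freed.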

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
 arch/x86/kernel/apic/io_apic.c |   16 +++++++---------
 drivers/pci/msi.c              |   10 ++++++++--
 include/linux/msi.h            |    1 +
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index f083049..5a5c92b 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3107,16 +3107,12 @@ static inline void destroy_irqs(unsigned int irq, unsigned int count)
 }
 
 static inline int
-can_create_pow_of_two_irqs(unsigned int from, unsigned int count)
+can_create_irqs(unsigned int from, unsigned int count)
 {
-	if ((count > 1) && (count % 2))
-		return -EINVAL;
-
-	for (; count; count = count / 2) {
+	for (; count; count = count - 1) {
 		if (!irq_can_alloc_irqs(from, count))
 			return count;
 	}
-
 	return -ENOSPC;
 }
 
@@ -3298,8 +3294,7 @@ int setup_msi_irqs(struct pci_dev *dev, int nvec)
 	if (nvec > 1 && !irq_remapping_enabled)
 		return 1;
 
-	nvec = __roundup_pow_of_two(nvec);
-	ret = can_create_pow_of_two_irqs(nr_irqs_gsi, nvec);
+	ret = can_create_irqs(nr_irqs_gsi, nvec);
 	if (ret != nvec)
 		return ret;
 
@@ -3307,11 +3302,13 @@ int setup_msi_irqs(struct pci_dev *dev, int nvec)
 	msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
 	WARN_ON(msidesc->irq);
 	WARN_ON(msidesc->msi_attrib.multiple);
+	WARN_ON(msidesc->nvec);
 
 	node = dev_to_node(&dev->dev);
 	irq = create_irqs(nr_irqs_gsi, nvec, node);
 	if (irq == 0)
 		return -ENOSPC;
+	msidesc->nvec = nvec;
 
 	if (!irq_remapping_enabled) {
 		ret = setup_msi_irq(dev, msidesc, irq, 0);
@@ -3320,7 +3317,7 @@ int setup_msi_irqs(struct pci_dev *dev, int nvec)
 		return 0;
 	}
 
-	msidesc->msi_attrib.multiple = ilog2(nvec);
+	msidesc->msi_attrib.multiple = ilog2(__roundup_pow_of_two(nvec));
 	for (sub_handle = 0; sub_handle < nvec; sub_handle++) {
 		if (!sub_handle) {
 			index = msi_alloc_remapped_irq(dev, irq, nvec);
@@ -3348,6 +3345,7 @@ error:
 	 */
 	msidesc->irq = 0;
 	msidesc->msi_attrib.multiple = 0;
+	msidesc->nvec = 0;
 
 	return ret;
 }
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a825d78..f0752d1 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -79,7 +79,10 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
 		int i, nvec;
 		if (entry->irq == 0)
 			continue;
-		nvec = 1 << entry->msi_attrib.multiple;
+		if (entry->nvec)
+			nvec = entry->nvec;
+		else
+			nvec = 1 << entry->msi_attrib.multiple;
 		for (i = 0; i < nvec; i++)
 			arch_teardown_msi_irq(entry->irq + i);
 	}
@@ -336,7 +339,10 @@ static void free_msi_irqs(struct pci_dev *dev)
 		int i, nvec;
 		if (!entry->irq)
 			continue;
-		nvec = 1 << entry->msi_attrib.multiple;
+		if (entry->nvec)
+			nvec = entry->nvec;
+		else
+			nvec = 1 << entry->msi_attrib.multiple;
 		for (i = 0; i < nvec; i++)
 			BUG_ON(irq_has_action(entry->irq + i));
 	}
diff --git a/include/linux/msi.h b/include/linux/msi.h
index ce93a34..6f4dfba 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -35,6 +35,7 @@ struct msi_desc {
 
 	u32 masked;			/* mask bits */
 	unsigned int irq;
+	unsigned int nvec;
 	struct list_head list;
 
 	union {
-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* [PATCH 3/3] x86, MSI: Minor readability fixes
From: Alexander Gordeev @ 2012-07-31 11:44 UTC
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, Bjorn Helgaas, Suresh Siddha,
	Yinghai Lu, Matthew Wilcox

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
 arch/x86/kernel/apic/io_apic.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 5a5c92b..888f3b9 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3142,7 +3142,7 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 
 	if (irq_remapped(cfg)) {
 		compose_remapped_msi_msg(pdev, irq, dest, msg, hpet_id);
-		return err;
+		return 0;
 	}
 
 	if (x2apic_enabled())
@@ -3169,7 +3169,7 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 			MSI_DATA_DELIVERY_LOWPRI) |
 		MSI_DATA_VECTOR(cfg->vector);
 
-	return err;
+	return 0;
 }
 
 static int
@@ -3251,7 +3251,7 @@ int setup_msix_irqs(struct pci_dev *dev, int nvec)
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
 		irq = create_irq_nr(irq_want, node);
 		if (irq == 0)
-			return -1;
+			return -ENOSPC;
 		irq_want = irq + 1;
 		if (!irq_remapping_enabled)
 			goto no_ir;
-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* Re: [PATCH 0/3] x86, MSI: Support multiple MSIs in presence of IRQ remapping
From: Suresh Siddha @ 2012-07-31 21:12 UTC
  To: Alexander Gordeev
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Bjorn Helgaas,
	Yinghai Lu, Matthew Wilcox

On Tue, 2012-07-31 at 13:41 +0200, Alexander Gordeev wrote:
> Currently multiple MSI mode is limited to a single vector per device (at
> least on x86 and PPC). This series breathes life into pci_enable_msi_block()
> and makes it possible to set interrupt affinity for multiple IRQs, similarly
> to MSI-X. Yet, only for x86 and only when IOMMUs are present.
> 
> Although IRQ and PCI subsystems are modified, the current behaviour left
> intact. The drivers could just start using multiple MSIs just by following
> the existing documentation.

So while I am ok with the proposed changes, I will hold off acking until
I see the corresponding driver changes (using pci_enable_msi_block()
etc) that take advantage of these changes ;)

Did you have a specific device in mind and are the driver changes
coming?

thanks,
suresh

* Re: [PATCH 0/3] x86, MSI: Support multiple MSIs in presence of IRQ remapping
From: Alexander Gordeev @ 2012-08-01  9:10 UTC
  To: Suresh Siddha
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Bjorn Helgaas,
	Yinghai Lu, Matthew Wilcox

On Tue, Jul 31, 2012 at 02:12:49PM -0700, Suresh Siddha wrote:
> On Tue, 2012-07-31 at 13:41 +0200, Alexander Gordeev wrote:
> > Currently multiple MSI mode is limited to a single vector per device (at
> > least on x86 and PPC). This series breathes life into pci_enable_msi_block()
> > and makes it possible to set interrupt affinity for multiple IRQs, similarly
> > to MSI-X. Yet, only for x86 and only when IOMMUs are present.
> > 
> > Although IRQ and PCI subsystems are modified, the current behaviour left
> > intact. The drivers could just start using multiple MSIs just by following
> > the existing documentation.
> 
> So while I am ok with the proposed changes, I will hold off acking until
> I see the corresponding driver changes (using pci_enable_msi_block()
> etc) that take advantage of these changes ;)

Well, I made the Broadcom driver believe it is running MSI-X while it is in
fact running multiple MSIs. This is not at all a decent patch; it is just
there to make debugging possible.

From 62b14a9e89e866f0883fc8bde17a3196740493b7 Mon Sep 17 00:00:00 2001
From: Alexander Gordeev <agordeev@redhat.com>
Date: Thu, 19 Jul 2012 14:07:00 -0400
Subject: [PATCH] bnx2: Force MSI being used as MSI-X

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
 drivers/net/ethernet/broadcom/bnx2.c |   62 ++++++++++++++++++++++++----------
 drivers/net/ethernet/broadcom/bnx2.h |    1 +
 2 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index ac7b744..84529ec 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -6183,7 +6183,7 @@ bnx2_free_irq(struct bnx2 *bp)
 {
 
 	__bnx2_free_irq(bp);
-	if (bp->flags & BNX2_FLAG_USING_MSI)
+	if (bp->flags & (BNX2_FLAG_USING_FORCED_MSI | BNX2_FLAG_USING_MSI))
 		pci_disable_msi(bp->pdev);
 	else if (bp->flags & BNX2_FLAG_USING_MSIX)
 		pci_disable_msix(bp->pdev);
@@ -6192,6 +6192,42 @@ bnx2_free_irq(struct bnx2 *bp)
 }
 
 static void
+bnx2_enable_msi(struct bnx2 *bp, int msix_vecs)
+{
+	int i, total_vecs, rc;
+	struct net_device *dev = bp->dev;
+	const int len = sizeof(bp->irq_tbl[0].name);
+
+	total_vecs = msix_vecs;
+#ifdef BCM_CNIC
+	total_vecs++;
+#endif
+	rc = -ENOSPC;
+	while (total_vecs >= BNX2_MIN_MSIX_VEC) {
+		rc = pci_enable_msi_block(bp->pdev, total_vecs);
+		if (rc <= 0)
+			break;
+		if (rc > 0)
+			total_vecs = rc;
+	}
+
+	if (rc != 0)
+		return;
+
+	msix_vecs = total_vecs;
+#ifdef BCM_CNIC
+	msix_vecs--;
+#endif
+	bp->irq_nvecs = msix_vecs;
+	bp->flags |= BNX2_FLAG_USING_FORCED_MSI | BNX2_FLAG_USING_MSIX | BNX2_FLAG_ONE_SHOT_MSI;
+	for (i = 0; i < total_vecs; i++) {
+		bp->irq_tbl[i].vector = bp->pdev->irq + i;
+		snprintf(bp->irq_tbl[i].name, len, "%s-%d", dev->name, i);
+		bp->irq_tbl[i].handler = bnx2_msi_1shot;
+	}
+}
+
+static void
 bnx2_enable_msix(struct bnx2 *bp, int msix_vecs)
 {
 	int i, total_vecs, rc;
@@ -6262,22 +6298,12 @@ bnx2_setup_int_mode(struct bnx2 *bp, int dis_msi)
 	bp->irq_nvecs = 1;
 	bp->irq_tbl[0].vector = bp->pdev->irq;
 
-	if ((bp->flags & BNX2_FLAG_MSIX_CAP) && !dis_msi)
-		bnx2_enable_msix(bp, msix_vecs);
+	if ((bp->flags & BNX2_FLAG_MSI_CAP) && !dis_msi)
+		bnx2_enable_msi(bp, msix_vecs);
 
-	if ((bp->flags & BNX2_FLAG_MSI_CAP) && !dis_msi &&
-	    !(bp->flags & BNX2_FLAG_USING_MSIX)) {
-		if (pci_enable_msi(bp->pdev) == 0) {
-			bp->flags |= BNX2_FLAG_USING_MSI;
-			if (CHIP_NUM(bp) == CHIP_NUM_5709) {
-				bp->flags |= BNX2_FLAG_ONE_SHOT_MSI;
-				bp->irq_tbl[0].handler = bnx2_msi_1shot;
-			} else
-				bp->irq_tbl[0].handler = bnx2_msi;
-
-			bp->irq_tbl[0].vector = bp->pdev->irq;
-		}
-	}
+	if ((bp->flags & BNX2_FLAG_MSIX_CAP) && !dis_msi &&
+	    !(bp->flags & BNX2_FLAG_USING_MSIX))
+		bnx2_enable_msix(bp, msix_vecs);
 
 	if (!bp->num_req_tx_rings)
 		bp->num_tx_rings = rounddown_pow_of_two(bp->irq_nvecs);
@@ -6359,7 +6385,9 @@ bnx2_open(struct net_device *dev)
 			bnx2_enable_int(bp);
 		}
 	}
-	if (bp->flags & BNX2_FLAG_USING_MSI)
+	if (bp->flags & BNX2_FLAG_USING_FORCED_MSI)
+		netdev_info(dev, "using forced MSI\n");
+	else if (bp->flags & BNX2_FLAG_USING_MSI)
 		netdev_info(dev, "using MSI\n");
 	else if (bp->flags & BNX2_FLAG_USING_MSIX)
 		netdev_info(dev, "using MSIX\n");
diff --git a/drivers/net/ethernet/broadcom/bnx2.h b/drivers/net/ethernet/broadcom/bnx2.h
index dc06bda..4a65a64 100644
--- a/drivers/net/ethernet/broadcom/bnx2.h
+++ b/drivers/net/ethernet/broadcom/bnx2.h
@@ -6757,6 +6757,7 @@ struct bnx2 {
 #define BNX2_FLAG_CAN_KEEP_VLAN		0x00001000
 #define BNX2_FLAG_BROKEN_STATS		0x00002000
 #define BNX2_FLAG_AER_ENABLED		0x00004000
+#define BNX2_FLAG_USING_FORCED_MSI	0x00008000
 
 	struct bnx2_napi	bnx2_napi[BNX2_MAX_MSIX_VEC];
 
-- 
1.7.10.4


# lspci -v -s 01:00.0
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709S Gigabit
Ethernet (rev 20)
	Subsystem: Dell Device 02dc
	Flags: bus master, fast devsel, latency 0, IRQ 81
	Memory at f2000000 (64-bit, non-prefetchable) [size=32M]
	Capabilities: [48] Power Management version 3
	Capabilities: [50] Vital Product Data
	Capabilities: [58] MSI: Enable+ Count=16/16 Maskable- 64bit+
	Capabilities: [a0] MSI-X: Enable- Count=9 Masked-
	Capabilities: [ac] Express Endpoint, MSI 00
	Capabilities: [100] Device Serial Number b8-ac-6f-ff-fe-d2-68-58
	Capabilities: [110] Advanced Error Reporting
	Capabilities: [150] Power Budgeting <?>
	Capabilities: [160] Virtual Channel
	Kernel driver in use: bnx2
	Kernel modules: bnx2

# dmesg | grep 01:00.0
[    2.614330] pci 0000:01:00.0: [14e4:163a] type 00 class 0x020000
[    2.620457] pci 0000:01:00.0: reg 10: [mem 0xf2000000-0xf3ffffff 64bit]
[    2.627267] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[    5.424713] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[   63.575032] bnx2 0000:01:00.0: eth0: Broadcom NetXtreme II BCM5709 1000Base-SX (C0) PCI Express found at mem f2000000, IRQ 36, node addr b8:ac:6f:d2:68:58
[  105.706316] bnx2 0000:01:00.0: irq 81 for MSI/MSI-X
[  105.706322] bnx2 0000:01:00.0: irq 82 for MSI/MSI-X
[  105.706327] bnx2 0000:01:00.0: irq 83 for MSI/MSI-X
[  105.706333] bnx2 0000:01:00.0: irq 84 for MSI/MSI-X
[  105.706337] bnx2 0000:01:00.0: irq 85 for MSI/MSI-X
[  105.706342] bnx2 0000:01:00.0: irq 86 for MSI/MSI-X
[  105.706347] bnx2 0000:01:00.0: irq 87 for MSI/MSI-X
[  105.706352] bnx2 0000:01:00.0: irq 88 for MSI/MSI-X
[  105.706357] bnx2 0000:01:00.0: irq 89 for MSI/MSI-X
[  105.763869] bnx2 0000:01:00.0: em1: using forced MSI
[  106.477183] bnx2 0000:01:00.0: em1: NIC Remote Copper Link is Up, 1000 Mbps full duplex
# for irq in {81..89}; do cat /proc/irq/$irq/smp_affinity ; done
0000,00000000,01000000
0000,00001111,11111111
0000,00000000,00000001
0000,00004444,44444444
0000,00004444,44444444
0000,00008888,88888888
0000,00001111,11111111
0000,00001111,11111111
cat: /proc/irq/89/smp_affinity: No such file or directory
# 
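
The masks printed above are comma-separated groups of 32-bit hexadecimal
words, most significant group first. A small userspace sketch for decoding
such a string is given below; the helper names are hypothetical and the
code is only an aid for reading the output, not part of the patches.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 for separators such as ',' or '\n'. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Number of CPUs set in an smp_affinity style mask string. */
static int mask_cpu_count(const char *mask)
{
	int count = 0;

	for (; *mask; mask++) {
		int v = hex_val(*mask);

		if (v < 0)
			continue;	/* skip commas and newlines */
		for (; v; v >>= 1)
			count += v & 1;
	}
	return count;
}

/* Lowest CPU number set in the mask, or -1 if the mask is empty. */
static int mask_lowest_cpu(const char *mask)
{
	size_t len = strlen(mask);
	int bit = 0;

	/* Walk from the least significant (rightmost) digit. */
	while (len--) {
		int v = hex_val(mask[len]);

		if (v < 0)
			continue;
		if (v) {
			while (!(v & 1)) {
				v >>= 1;
				bit++;
			}
			return bit;
		}
		bit += 4;	/* each hex digit covers four CPUs */
	}
	return -1;
}
```

Read this way, the first mask in the listing selects a single CPU (CPU 24),
while the second spreads over twelve CPUs.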

> Did you have a specific device in mind and are the driver changes
> coming?

Yes, I have in mind at least AHCI and some QLA chips which do not support
MSI-X, not to mention the MSI-X fallback paths many (most?) drivers have.
As for the coming driver changes, that depends on the fate of this series :)

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* Re: [PATCH 0/3] x86, MSI: Support multiple MSIs in presence of IRQ remapping
From: Alexander Gordeev @ 2012-08-16 15:50 UTC
  To: Suresh Siddha
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Bjorn Helgaas,
	Yinghai Lu, Matthew Wilcox

On Tue, Jul 31, 2012 at 02:12:49PM -0700, Suresh Siddha wrote:
> On Tue, 2012-07-31 at 13:41 +0200, Alexander Gordeev wrote:
> So while I am ok with the proposed changes, I will hold off acking until
> I see the corresponding driver changes (using pci_enable_msi_block()
> etc) that take advantage of these changes ;)
> 
> Did you have a specific device in mind and are the driver changes
> coming?

Hi Suresh,

I reposted the series + AHCI driver update.

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com
