[Patch 0/2] Optimize CPU vector allocation for NUMA systems

* [Patch 0/2] Optimize CPU vector allocation for NUMA systems
@ 2015-05-04  2:47 Jiang Liu
  2015-05-04  2:47 ` [Patch 1/2] irq_remapping/vt-d: Fix regression caused by commit b106ee63abcc Jiang Liu
  2015-05-04  2:47 ` [Patch 2/2] x86, irq: Support CPU vector allocation policies Jiang Liu
  0 siblings, 2 replies; 12+ messages in thread
From: Jiang Liu @ 2015-05-04  2:47 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Dimitri Sivanich
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, David Cohen,
	Sander Eikelenboom, David Vrabel, Andrew Morton, Tony Luck,
	Joerg Roedel, Greg Kroah-Hartman, x86, linux-kernel, linux-pci,
	linux-acpi

Hi all,
	This is a small patch set based on tip/x86/apic.
The first patch is a bugfix for tip/x86/apic, and the second patch
is an enhancement that optimizes CPU vector allocation on NUMA systems.
It introduces a mechanism to allocate CPU vectors from the device-local
NUMA node and a kernel parameter to enable/disable the optimization.

Thanks!
Gerry

Jiang Liu (2):
  irq_remapping/vt-d: Fix regression caused by commit b106ee63abcc
  x86, irq: Support CPU vector allocation policies

 Documentation/kernel-parameters.txt |    5 +++
 arch/x86/kernel/apic/vector.c       |   83 +++++++++++++++++++++++++++++++----
 drivers/iommu/intel_irq_remapping.c |   16 ++++---
 3 files changed, 90 insertions(+), 14 deletions(-)

-- 
1.7.10.4

* [Patch 1/2] irq_remapping/vt-d: Fix regression caused by commit b106ee63abcc
  2015-05-04  2:47 [Patch 0/2] Optimize CPU vector allocation for NUMA systems Jiang Liu
@ 2015-05-04  2:47 ` Jiang Liu
  2015-05-05  9:15   ` [tip:x86/apic] irq_remapping/vt-d: Init all MSI entries not just the first one tip-bot for Thomas Gleixner
  2015-05-04  2:47 ` [Patch 2/2] x86, irq: Support CPU vector allocation policies Jiang Liu
  1 sibling, 1 reply; 12+ messages in thread
From: Jiang Liu @ 2015-05-04  2:47 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Dimitri Sivanich, Joerg Roedel
  Cc: Jiang Liu, Konrad Rzeszutek Wilk, David Cohen,
	Sander Eikelenboom, David Vrabel, Andrew Morton, Tony Luck,
	Greg Kroah-Hartman, x86, linux-kernel, linux-pci, linux-acpi,
	iommu

Commit b106ee63abcc ("irq_remapping/vt-d: Enhance Intel IR driver to
support hierarchical irqdomains") caused a regression: when setting up
remapping entries for multiple MSIs, it initializes the remapping data
structure of the first entry only.

The code was written by Thomas and the commit message by Jiang.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
Hi Thomas,
	I missed this patch when rebasing my patch set. It may be
folded into commit b106ee63abcc ("irq_remapping/vt-d: Enhance Intel IR
driver to support hierarchical irqdomains").
Thanks!
Gerry
---
 drivers/iommu/intel_irq_remapping.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 14d95694fc1b..7ecc6b3180ba 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1113,7 +1113,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 {
 	struct intel_iommu *iommu = domain->host_data;
 	struct irq_alloc_info *info = arg;
-	struct intel_ir_data *data;
+	struct intel_ir_data *data, *ird;
 	struct irq_data *irq_data;
 	struct irq_cfg *irq_cfg;
 	int i, ret, index;
@@ -1158,14 +1158,20 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 		}
 
 		if (i > 0) {
-			data = kzalloc(sizeof(*data), GFP_KERNEL);
-			if (!data)
+			ird = kzalloc(sizeof(*ird), GFP_KERNEL);
+			if (!ird)
 				goto out_free_data;
+			/* Initialize the common data */
+			ird->irq_2_iommu = data->irq_2_iommu;
+			ird->irq_2_iommu.sub_handle = i;
+		} else {
+			ird = data;
 		}
+
 		irq_data->hwirq = (index << 16) + i;
-		irq_data->chip_data = data;
+		irq_data->chip_data = ird;
 		irq_data->chip = &intel_ir_chip;
-		intel_irq_remapping_prepare_irte(data, irq_cfg, info, index, i);
+		intel_irq_remapping_prepare_irte(ird, irq_cfg, info, index, i);
 		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 	return 0;
-- 
1.7.10.4
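
As a rough illustration of what the fix above does, the following
userspace sketch models the per-MSI loop. The *_model types and
model_alloc_entries() are hypothetical stand-ins invented for this
example, not kernel code; only the pattern of copying the common data
and then setting a per-entry sub_handle is taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct irq_2_iommu. */
struct irq_2_iommu_model {
	int iommu_id;	/* stands in for the iommu pointer */
	int irte_index;
	int sub_handle;
};

/* Hypothetical, simplified stand-in for struct intel_ir_data. */
struct ir_data_model {
	struct irq_2_iommu_model irq_2_iommu;
};

/*
 * Fill out[0..nr-1]. Entry 0 reuses the caller's already initialized
 * data; every further entry is freshly allocated, inherits the common
 * fields from entry 0 and gets its own sub_handle.
 * Returns 0 on success, -1 if an allocation fails.
 */
static int model_alloc_entries(struct ir_data_model *first,
			       struct ir_data_model **out, int nr)
{
	int i;

	assert(first && out && nr > 0);

	for (i = 0; i < nr; i++) {
		struct ir_data_model *ird;

		if (i > 0) {
			ird = calloc(1, sizeof(*ird));
			if (!ird)
				goto out_free;
			/* Initialize the common data */
			ird->irq_2_iommu = first->irq_2_iommu;
			ird->irq_2_iommu.sub_handle = i;
		} else {
			ird = first;
		}
		out[i] = ird;
	}
	return 0;

out_free:
	/* Free only what this function allocated (entries 1..i-1). */
	while (--i > 0)
		free(out[i]);
	return -1;
}
```

Before the fix, the equivalent of "ird" was written back into "data"
itself, so entries after the first carried zeroed common fields.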

* [Patch 2/2] x86, irq: Support CPU vector allocation policies
  2015-05-04  2:47 [Patch 0/2] Optimize CPU vector allocation for NUMA systems Jiang Liu
  2015-05-04  2:47 ` [Patch 1/2] irq_remapping/vt-d: Fix regression caused by commit b106ee63abcc Jiang Liu
@ 2015-05-04  2:47 ` Jiang Liu
  2015-05-05 19:25   ` Thomas Gleixner
  1 sibling, 1 reply; 12+ messages in thread
From: Jiang Liu @ 2015-05-04  2:47 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, Dimitri Sivanich, Jonathan Corbet,
	x86, Jiang Liu
  Cc: Konrad Rzeszutek Wilk, David Cohen, Sander Eikelenboom,
	David Vrabel, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, linux-kernel, linux-pci, linux-acpi,
	Daniel J Blueman, linux-doc

On NUMA systems, an IO device may be associated with a NUMA node.
Allocating resources, such as memory and interrupts, from the
device-local node may improve IO performance.

This patch introduces a mechanism to support CPU vector allocation
policies, so users may choose the most suitable CPU vector allocation
policy. Currently two allocation policies are supported:
1) allocate CPU vectors from CPUs on the device-local node
2) allocate CPU vectors from all online CPUs

This mechanism may be used on NumaConnect systems to allocate
CPU vectors from the device-local node.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Daniel J Blueman <daniel@numascale.com>
---
 Documentation/kernel-parameters.txt |    5 +++
 arch/x86/kernel/apic/vector.c       |   83 +++++++++++++++++++++++++++++++----
 2 files changed, 79 insertions(+), 9 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 274252f205b7..5e8b1c6f0677 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3840,6 +3840,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	vector=		[IA-64,SMP]
 			vector=percpu: enable percpu vector domain
 
+	vector_alloc=	[x86,SMP]
+			vector_alloc=node: try to allocate CPU vectors from CPUs on
+			device local node first, fallback to all online CPUs
+			vector_alloc=global: allocate CPU vector from all online CPUs
+
 	video=		[FB] Frame buffer configuration
 			See Documentation/fb/modedb.txt.
 
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 1c7dd42b98c1..96ce5068a926 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -28,6 +28,17 @@ struct apic_chip_data {
 	u8			move_in_progress : 1;
 };
 
+enum {
+	/* Allocate CPU vectors from CPUs on device local node */
+	X86_VECTOR_POL_NODE = 0x1,
+	/* Allocate CPU vectors from all online CPUs */
+	X86_VECTOR_POL_GLOBAL = 0x2,
+	/* Allocate CPU vectors from caller specified CPUs */
+	X86_VECTOR_POL_CALLER = 0x4,
+	X86_VECTOR_POL_MIN = X86_VECTOR_POL_NODE,
+	X86_VECTOR_POL_MAX = X86_VECTOR_POL_CALLER,
+};
+
 struct irq_domain *x86_vector_domain;
 static DEFINE_RAW_SPINLOCK(vector_lock);
 static cpumask_var_t vector_cpumask;
@@ -35,6 +46,9 @@ static struct irq_chip lapic_controller;
 #ifdef	CONFIG_X86_IO_APIC
 static struct apic_chip_data *legacy_irq_data[NR_IRQS_LEGACY];
 #endif
+static unsigned int vector_alloc_policy = X86_VECTOR_POL_NODE |
+					  X86_VECTOR_POL_GLOBAL |
+					  X86_VECTOR_POL_CALLER;
 
 void lock_vector_lock(void)
 {
@@ -258,12 +272,6 @@ void copy_irq_alloc_info(struct irq_alloc_info *dst, struct irq_alloc_info *src)
 		memset(dst, 0, sizeof(*dst));
 }
 
-static inline const struct cpumask *
-irq_alloc_info_get_mask(struct irq_alloc_info *info)
-{
-	return (!info || !info->mask) ? apic->target_cpus() : info->mask;
-}
-
 static void x86_vector_free_irqs(struct irq_domain *domain,
 				 unsigned int virq, unsigned int nr_irqs)
 {
@@ -284,12 +292,58 @@ static void x86_vector_free_irqs(struct irq_domain *domain,
 	}
 }
 
+static int assign_irq_vector_policy(int irq, int node,
+				    struct apic_chip_data *data,
+				    struct irq_alloc_info *info)
+{
+	int err = -EBUSY;
+	unsigned int policy;
+	const struct cpumask *mask;
+
+	if (info && info->mask)
+		policy = X86_VECTOR_POL_CALLER;
+	else
+		policy = X86_VECTOR_POL_MIN;
+
+	for (; policy <= X86_VECTOR_POL_MAX; policy <<= 1) {
+		if (!(vector_alloc_policy & policy))
+			continue;
+
+		switch (policy) {
+		case X86_VECTOR_POL_NODE:
+			if (node >= 0)
+				mask = cpumask_of_node(node);
+			else
+				mask = NULL;
+			break;
+		case X86_VECTOR_POL_GLOBAL:
+			mask = apic->target_cpus();
+			break;
+		case X86_VECTOR_POL_CALLER:
+			if (info && info->mask)
+				mask = info->mask;
+			else
+				mask = NULL;
+			break;
+		default:
+			mask = NULL;
+			break;
+		}
+		if (mask) {
+			err = assign_irq_vector(irq, data, mask);
+			if (!err)
+				return 0;
+		}
+	}
+
+	return err;
+}
+
 static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 				 unsigned int nr_irqs, void *arg)
 {
 	struct irq_alloc_info *info = arg;
 	struct apic_chip_data *data;
-	const struct cpumask *mask;
 	struct irq_data *irq_data;
 	int i, err;
 
@@ -300,7 +354,6 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 	if ((info->flags & X86_IRQ_ALLOC_CONTIGUOUS_VECTORS) && nr_irqs > 1)
 		return -ENOSYS;
 
-	mask = irq_alloc_info_get_mask(info);
 	for (i = 0; i < nr_irqs; i++) {
 		irq_data = irq_domain_get_irq_data(domain, virq + i);
 		BUG_ON(!irq_data);
@@ -318,7 +371,8 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 		irq_data->chip = &lapic_controller;
 		irq_data->chip_data = data;
 		irq_data->hwirq = virq + i;
-		err = assign_irq_vector(virq, data, mask);
+		err = assign_irq_vector_policy(virq, irq_data->node, data,
+					       info);
 		if (err)
 			goto error;
 	}
@@ -809,6 +863,17 @@ static __init int setup_show_lapic(char *arg)
 }
 __setup("show_lapic=", setup_show_lapic);
 
+static int __init apic_parse_vector_policy(char *str)
+{
+	if (!strncmp(str, "node", 4))
+		vector_alloc_policy |= X86_VECTOR_POL_NODE;
+	else if (!strncmp(str, "global", 6))
+		vector_alloc_policy &= ~X86_VECTOR_POL_NODE;
+
+	return 1;
+}
+__setup("vector_alloc=", apic_parse_vector_policy);
+
 static int __init print_ICs(void)
 {
 	if (apic_verbosity == APIC_QUIET)
-- 
1.7.10.4

* [tip:x86/apic] irq_remapping/vt-d: Init all MSI entries not just the first one
  2015-05-04  2:47 ` [Patch 1/2] irq_remapping/vt-d: Fix regression caused by commit b106ee63abcc Jiang Liu
@ 2015-05-05  9:15   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-05-05  9:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: joro, tglx, bp, sivanich, david.a.cohen, bhelgaas, mingo,
	konrad.wilk, benh, david.vrabel, linux-kernel, hpa, yinghai,
	tony.luck, rdunlap, rjw, jiang.liu, linux, gregkh

Commit-ID:  9d4c0313f24a05e5252e7106636bf3c5b6318f5d
Gitweb:     http://git.kernel.org/tip/9d4c0313f24a05e5252e7106636bf3c5b6318f5d
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 4 May 2015 10:47:40 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 5 May 2015 11:14:48 +0200

irq_remapping/vt-d: Init all MSI entries not just the first one

Commit b106ee63abcc ("irq_remapping/vt-d: Enhance Intel IR driver to
support hierarchical irqdomains") caused a regression: when setting up
remapping entries for multiple MSIs, it initializes the remapping data
structure of the first entry only.

[ Jiang: Commit message ]

Fixes: b106ee63abcc ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: iommu@lists.linux-foundation.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Joerg Roedel <joro@8bytes.org>
Link: http://lkml.kernel.org/r/1430707662-28598-2-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/iommu/intel_irq_remapping.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 14d9569..7ecc6b3 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1113,7 +1113,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 {
 	struct intel_iommu *iommu = domain->host_data;
 	struct irq_alloc_info *info = arg;
-	struct intel_ir_data *data;
+	struct intel_ir_data *data, *ird;
 	struct irq_data *irq_data;
 	struct irq_cfg *irq_cfg;
 	int i, ret, index;
@@ -1158,14 +1158,20 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 		}
 
 		if (i > 0) {
-			data = kzalloc(sizeof(*data), GFP_KERNEL);
-			if (!data)
+			ird = kzalloc(sizeof(*ird), GFP_KERNEL);
+			if (!ird)
 				goto out_free_data;
+			/* Initialize the common data */
+			ird->irq_2_iommu = data->irq_2_iommu;
+			ird->irq_2_iommu.sub_handle = i;
+		} else {
+			ird = data;
 		}
+
 		irq_data->hwirq = (index << 16) + i;
-		irq_data->chip_data = data;
+		irq_data->chip_data = ird;
 		irq_data->chip = &intel_ir_chip;
-		intel_irq_remapping_prepare_irte(data, irq_cfg, info, index, i);
+		intel_irq_remapping_prepare_irte(ird, irq_cfg, info, index, i);
 		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 	return 0;

* Re: [Patch 2/2] x86, irq: Support CPU vector allocation policies
  2015-05-04  2:47 ` [Patch 2/2] x86, irq: Support CPU vector allocation policies Jiang Liu
@ 2015-05-05 19:25   ` Thomas Gleixner
  2015-05-06  5:17     ` Jiang Liu
  2015-05-06  8:36     ` [Patch v2] " Jiang Liu
  0 siblings, 2 replies; 12+ messages in thread
From: Thomas Gleixner @ 2015-05-05 19:25 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Bjorn Helgaas, Benjamin Herrenschmidt, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Dimitri Sivanich, Jonathan Corbet, x86,
	Konrad Rzeszutek Wilk, David Cohen, Sander Eikelenboom,
	David Vrabel, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, linux-kernel, linux-pci, linux-acpi,
	Daniel J Blueman, linux-doc

On Mon, 4 May 2015, Jiang Liu wrote:
> +enum {
> +	/* Allocate CPU vectors from CPUs on device local node */
> +	X86_VECTOR_POL_NODE = 0x1,
> +	/* Allocate CPU vectors from all online CPUs */
> +	X86_VECTOR_POL_GLOBAL = 0x2,
> +	/* Allocate CPU vectors from caller specified CPUs */
> +	X86_VECTOR_POL_CALLER = 0x4,
> +	X86_VECTOR_POL_MIN = X86_VECTOR_POL_NODE,
> +	X86_VECTOR_POL_MAX = X86_VECTOR_POL_CALLER,
> +}

> +static unsigned int vector_alloc_policy = X86_VECTOR_POL_NODE |
> +					  X86_VECTOR_POL_GLOBAL |
> +					  X86_VECTOR_POL_CALLER;
  
> +static int __init apic_parse_vector_policy(char *str)
> +{
> +	if (!strncmp(str, "node", 4))
> +		vector_alloc_policy |= X86_VECTOR_POL_NODE;

This does not make sense. X86_VECTOR_POL_NODE is already set.

> +	else if (!strncmp(str, "global", 6))
> +		vector_alloc_policy &= ~X86_VECTOR_POL_NODE;

Why would one disable node aware allocation? We fall back to the
global allocation anyway, if the node aware allocation fails.

I'm completely missing the value of this command line option.

Thanks,

	tglx
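
The objection above can be checked with a small userspace model of the
parser. This is only a sketch: the POL_* names and
model_parse_vector_policy() are made up for the example and mirror the
bit layout and string checks of the quoted patch, with the policy word
passed in and returned instead of kept in a static variable.

```c
#include <string.h>

/* Same bit layout as the X86_VECTOR_POL_* enum in the patch. */
enum {
	POL_NODE	= 0x1,
	POL_GLOBAL	= 0x2,
	POL_CALLER	= 0x4,
};

/* Default value of vector_alloc_policy in the patch: all bits set. */
#define POL_DEFAULT	(POL_NODE | POL_GLOBAL | POL_CALLER)

/*
 * Model of apic_parse_vector_policy(): returns the policy word that
 * results from applying the "vector_alloc=" argument str.
 */
static unsigned int model_parse_vector_policy(unsigned int policy,
					      const char *str)
{
	if (!strncmp(str, "node", 4))
		policy |= POL_NODE;	/* no-op when POL_NODE is already set */
	else if (!strncmp(str, "global", 6))
		policy &= ~POL_NODE;	/* only removes the node-local attempt */

	return policy;
}
```

Starting from POL_DEFAULT, "node" returns POL_DEFAULT unchanged, and
"global" only clears the node bit, which skips an attempt that would
fall back to the global mask on failure anyway.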

* Re: [Patch 2/2] x86, irq: Support CPU vector allocation policies
  2015-05-05 19:25   ` Thomas Gleixner
@ 2015-05-06  5:17     ` Jiang Liu
  2015-05-06  8:36     ` [Patch v2] " Jiang Liu
  1 sibling, 0 replies; 12+ messages in thread
From: Jiang Liu @ 2015-05-06  5:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Bjorn Helgaas, Benjamin Herrenschmidt, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, Dimitri Sivanich, Jonathan Corbet, x86,
	Konrad Rzeszutek Wilk, David Cohen, Sander Eikelenboom,
	David Vrabel, Andrew Morton, Tony Luck, Joerg Roedel,
	Greg Kroah-Hartman, linux-kernel, linux-pci, linux-acpi,
	Daniel J Blueman, linux-doc

On 2015/5/6 3:25, Thomas Gleixner wrote:
> On Mon, 4 May 2015, Jiang Liu wrote:
>> +enum {
>> +	/* Allocate CPU vectors from CPUs on device local node */
>> +	X86_VECTOR_POL_NODE = 0x1,
>> +	/* Allocate CPU vectors from all online CPUs */
>> +	X86_VECTOR_POL_GLOBAL = 0x2,
>> +	/* Allocate CPU vectors from caller specified CPUs */
>> +	X86_VECTOR_POL_CALLER = 0x4,
>> +	X86_VECTOR_POL_MIN = X86_VECTOR_POL_NODE,
>> +	X86_VECTOR_POL_MAX = X86_VECTOR_POL_CALLER,
>> +}
> 
>> +static unsigned int vector_alloc_policy = X86_VECTOR_POL_NODE |
>> +					  X86_VECTOR_POL_GLOBAL |
>> +					  X86_VECTOR_POL_CALLER;
>   
>> +static int __init apic_parse_vector_policy(char *str)
>> +{
>> +	if (!strncmp(str, "node", 4))
>> +		vector_alloc_policy |= X86_VECTOR_POL_NODE;
> 
> This does not make sense. X86_VECTOR_POL_NODE is already set.
> 
>> +	else if (!strncmp(str, "global", 6))
>> +		vector_alloc_policy &= ~X86_VECTOR_POL_NODE;
> 
> Why would one disable node aware allocation? We fall back to the
> global allocation anyway, if the node aware allocation fails.
> 
> I'm completely missing the value of this command line option.
Hi Thomas,
	You are right. Originally I wanted a way to disable the
per-node allocation policy. On second thought, it seems unnecessary.
Enabling the per-node allocation policy by default shouldn't
cause serious issues, and users may change the irq affinity setting
if the default affinity isn't desired.
	So we don't need the kernel parameter at all. I will update
the patch.

Thanks!
Gerry

> 
> Thanks,
> 
> 	tglx

* [Patch v2] x86, irq: Support CPU vector allocation policies
  2015-05-05 19:25   ` Thomas Gleixner
  2015-05-06  5:17     ` Jiang Liu
@ 2015-05-06  8:36     ` Jiang Liu
  2015-05-06 10:22       ` Thomas Gleixner
  1 sibling, 1 reply; 12+ messages in thread
From: Jiang Liu @ 2015-05-06  8:36 UTC (permalink / raw)
  To: Thomas Gleixner, Bjorn Helgaas, Benjamin Herrenschmidt,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, x86, Jiang Liu
  Cc: Konrad Rzeszutek Wilk, Tony Luck, linux-kernel, linux-pci,
	linux-acpi, Daniel J Blueman

On NUMA systems, an IO device may be associated with a NUMA node.
Allocating resources, such as memory and interrupts, from the
device-local node may improve IO performance.

This patch introduces a mechanism to support CPU vector allocation
policies. It first tries to allocate CPU vectors from CPUs on the
device-local node, and then falls back to all online (global) CPUs.

This mechanism may be used on NumaConnect systems to allocate
CPU vectors from the device-local node.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Daniel J Blueman <daniel@numascale.com>
---
Hi Thomas,
	This is the simplified version, which removed the kernel parameter.
Seems much simpler:)

Thanks!
Gerry
---
 arch/x86/kernel/apic/vector.c |   66 +++++++++++++++++++++++++++++++++++------
 1 file changed, 57 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 1c7dd42b98c1..44363ccce9b5 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -28,6 +28,17 @@ struct apic_chip_data {
 	u8			move_in_progress : 1;
 };
 
+enum {
+	/* Allocate CPU vectors from CPUs on device local node */
+	X86_VECTOR_POL_NODE = 0x1,
+	/* Allocate CPU vectors from all online CPUs */
+	X86_VECTOR_POL_GLOBAL = 0x2,
+	/* Allocate CPU vectors from caller specified CPUs */
+	X86_VECTOR_POL_CALLER = 0x4,
+	X86_VECTOR_POL_MIN = X86_VECTOR_POL_NODE,
+	X86_VECTOR_POL_MAX = X86_VECTOR_POL_CALLER,
+};
+
 struct irq_domain *x86_vector_domain;
 static DEFINE_RAW_SPINLOCK(vector_lock);
 static cpumask_var_t vector_cpumask;
@@ -258,12 +269,6 @@ void copy_irq_alloc_info(struct irq_alloc_info *dst, struct irq_alloc_info *src)
 		memset(dst, 0, sizeof(*dst));
 }
 
-static inline const struct cpumask *
-irq_alloc_info_get_mask(struct irq_alloc_info *info)
-{
-	return (!info || !info->mask) ? apic->target_cpus() : info->mask;
-}
-
 static void x86_vector_free_irqs(struct irq_domain *domain,
 				 unsigned int virq, unsigned int nr_irqs)
 {
@@ -284,12 +289,55 @@ static void x86_vector_free_irqs(struct irq_domain *domain,
 	}
 }
 
+static int assign_irq_vector_policy(int irq, int node,
+				    struct apic_chip_data *data,
+				    struct irq_alloc_info *info)
+{
+	int err = -EBUSY;
+	unsigned int policy;
+	const struct cpumask *mask;
+
+	if (info && info->mask)
+		policy = X86_VECTOR_POL_CALLER;
+	else
+		policy = X86_VECTOR_POL_MIN;
+
+	for (; policy <= X86_VECTOR_POL_MAX; policy <<= 1) {
+		switch (policy) {
+		case X86_VECTOR_POL_NODE:
+			if (node >= 0)
+				mask = cpumask_of_node(node);
+			else
+				mask = NULL;
+			break;
+		case X86_VECTOR_POL_GLOBAL:
+			mask = apic->target_cpus();
+			break;
+		case X86_VECTOR_POL_CALLER:
+			if (info && info->mask)
+				mask = info->mask;
+			else
+				mask = NULL;
+			break;
+		default:
+			mask = NULL;
+			break;
+		}
+		if (mask) {
+			err = assign_irq_vector(irq, data, mask);
+			if (!err)
+				return 0;
+		}
+	}
+
+	return err;
+}
+
 static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 				 unsigned int nr_irqs, void *arg)
 {
 	struct irq_alloc_info *info = arg;
 	struct apic_chip_data *data;
-	const struct cpumask *mask;
 	struct irq_data *irq_data;
 	int i, err;
 
@@ -300,7 +348,6 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 	if ((info->flags & X86_IRQ_ALLOC_CONTIGUOUS_VECTORS) && nr_irqs > 1)
 		return -ENOSYS;
 
-	mask = irq_alloc_info_get_mask(info);
 	for (i = 0; i < nr_irqs; i++) {
 		irq_data = irq_domain_get_irq_data(domain, virq + i);
 		BUG_ON(!irq_data);
@@ -318,7 +365,8 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 		irq_data->chip = &lapic_controller;
 		irq_data->chip_data = data;
 		irq_data->hwirq = virq + i;
-		err = assign_irq_vector(virq, data, mask);
+		err = assign_irq_vector_policy(virq, irq_data->node, data,
+					       info);
 		if (err)
 			goto error;
 	}
-- 
1.7.10.4


* Re: [Patch v2] x86, irq: Support CPU vector allocation policies
  2015-05-06  8:36     ` [Patch v2] " Jiang Liu
@ 2015-05-06 10:22       ` Thomas Gleixner
  2015-05-07  2:53         ` [Patch v3] x86, irq: Allocate CPU vectors from device local CPUs if possible Jiang Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2015-05-06 10:22 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Bjorn Helgaas, Benjamin Herrenschmidt, Ingo Molnar,
	H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap, Yinghai Lu,
	Borislav Petkov, x86, Konrad Rzeszutek Wilk, Tony Luck,
	linux-kernel, linux-pci, linux-acpi, Daniel J Blueman

On Wed, 6 May 2015, Jiang Liu wrote:
> Hi Thomas,
> 	This is the simplified version, which removed the kernel parameter.
> Seems much simpler:)

But it can be made even simpler. :)
 
> +enum {
> +	/* Allocate CPU vectors from CPUs on device local node */
> +	X86_VECTOR_POL_NODE = 0x1,
> +	/* Allocate CPU vectors from all online CPUs */
> +	X86_VECTOR_POL_GLOBAL = 0x2,
> +	/* Allocate CPU vectors from caller specified CPUs */
> +	X86_VECTOR_POL_CALLER = 0x4,
> +	X86_VECTOR_POL_MIN = X86_VECTOR_POL_NODE,
> +	X86_VECTOR_POL_MAX = X86_VECTOR_POL_CALLER,
> +};

  
> +static int assign_irq_vector_policy(int irq, int node,
> +				    struct apic_chip_data *data,
> +				    struct irq_alloc_info *info)
> +{
> +	int err = -EBUSY;
> +	unsigned int policy;
> +	const struct cpumask *mask;
> +
> +	if (info && info->mask)
> +		policy = X86_VECTOR_POL_CALLER;
> +	else
> +		policy = X86_VECTOR_POL_MIN;
> +
> +	for (; policy <= X86_VECTOR_POL_MAX; policy <<= 1) {
> +		switch (policy) {
> +		case X86_VECTOR_POL_NODE:
> +			if (node >= 0)
> +				mask = cpumask_of_node(node);
> +			else
> +				mask = NULL;
> +			break;
> +		case X86_VECTOR_POL_GLOBAL:
> +			mask = apic->target_cpus();
> +			break;
> +		case X86_VECTOR_POL_CALLER:
> +			if (info && info->mask)
> +				mask = info->mask;
> +			else
> +				mask = NULL;
> +			break;
> +		default:
> +			mask = NULL;
> +			break;
> +		}
> +		if (mask) {
> +			err = assign_irq_vector(irq, data, mask);
> +			if (!err)
> +				return 0;
> +		}
> +	}

This looks pretty overengineered now that you don't have that parameter check.

	if (info && info->mask)
		return assign_irq_vector(irq, data, info->mask);

	if (node >= 0) {
		err = assign_irq_vector(irq, data, cpumask_of_node(node));
		if (!err)
			return 0;
	}

	return assign_irq_vector(irq, data, apic->target_cpus());

Should do the same, right?

Thanks,

	tglx
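
To check the "should do the same" question, here is a userspace sketch
that puts both shapes side by side. Everything in it is an assumption
made for illustration: CPU masks are modelled as 32-bit words, a zero
word plays the role of a NULL mask, and model_assign() is a made-up
stand-in for assign_irq_vector() that succeeds when the mask contains a
CPU with a free vector. The MODEL_* error values are placeholders.

```c
#include <stdint.h>

#define MODEL_EBUSY	16
#define MODEL_ENOSPC	28

struct model_sys {
	uint32_t free_cpus;	/* CPUs that still have a free vector */
	uint32_t node_mask[2];	/* CPUs belonging to node 0 and node 1 */
	uint32_t all_cpus;	/* stands in for apic->target_cpus() */
};

/* Stand-in for assign_irq_vector(): 0 on success, negative on failure. */
static int model_assign(const struct model_sys *s, uint32_t mask)
{
	return (s->free_cpus & mask) ? 0 : -MODEL_ENOSPC;
}

/* v2 shape: walk the policy bits NODE (0x1), GLOBAL (0x2), CALLER (0x4). */
static int model_policy_loop(const struct model_sys *s, int node,
			     uint32_t caller_mask)
{
	int err = -MODEL_EBUSY;
	unsigned int policy = caller_mask ? 0x4 : 0x1;

	for (; policy <= 0x4; policy <<= 1) {
		uint32_t mask = 0;

		switch (policy) {
		case 0x1:
			if (node >= 0)
				mask = s->node_mask[node];
			break;
		case 0x2:
			mask = s->all_cpus;
			break;
		case 0x4:
			mask = caller_mask;
			break;
		}
		if (mask) {
			err = model_assign(s, mask);
			if (!err)
				return 0;
		}
	}
	return err;
}

/* Suggested shape: caller mask, then node mask, then global mask. */
static int model_policy_simple(const struct model_sys *s, int node,
			       uint32_t caller_mask)
{
	if (caller_mask)
		return model_assign(s, caller_mask);
	if (node >= 0 && model_assign(s, s->node_mask[node]) == 0)
		return 0;
	return model_assign(s, s->all_cpus);
}
```

Within this model, both functions return the same result for every
combination of node and caller mask: a caller mask is tried alone with
no fallback, and otherwise the node mask is tried before the global one.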

* [Patch v3] x86, irq: Allocate CPU vectors from device local CPUs if possible
  2015-05-06 10:22       ` Thomas Gleixner
@ 2015-05-07  2:53         ` Jiang Liu
  2015-05-08  7:21             ` Daniel J Blueman
  2015-05-13  7:54           ` [tip:x86/apic] " tip-bot for Jiang Liu
  0 siblings, 2 replies; 12+ messages in thread
From: Jiang Liu @ 2015-05-07  2:53 UTC (permalink / raw)
  To: Thomas Gleixner, Bjorn Helgaas, Benjamin Herrenschmidt,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, x86, Jiang Liu
  Cc: Konrad Rzeszutek Wilk, Tony Luck, linux-kernel, linux-pci,
	linux-acpi, Daniel J Blueman

On NUMA systems, an IO device may be associated with a NUMA node.
Allocating resources, such as memory and interrupts, from the
device-local node may improve IO performance.

This patch introduces a mechanism to support CPU vector allocation
policies. It first tries to allocate CPU vectors from CPUs on the
device-local node, and then falls back to all online (global) CPUs.

This mechanism may be used on NumaConnect systems to allocate
CPU vectors from the device-local node.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Daniel J Blueman <daniel@numascale.com>
---
Hi Thomas,
	I feel this should be the simplest version now :)
Thanks!
Gerry
---
 arch/x86/kernel/apic/vector.c |   23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 1c7dd42b98c1..eb65c6b98de0 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -210,6 +210,18 @@ static int assign_irq_vector(int irq, struct apic_chip_data *data,
 	return err;
 }
 
+static int assign_irq_vector_policy(int irq, int node,
+				    struct apic_chip_data *data,
+				    struct irq_alloc_info *info)
+{
+	if (info && info->mask)
+		return assign_irq_vector(irq, data, info->mask);
+	if (node != NUMA_NO_NODE &&
+	    assign_irq_vector(irq, data, cpumask_of_node(node)) == 0)
+		return 0;
+	return assign_irq_vector(irq, data, apic->target_cpus());
+}
+
 static void clear_irq_vector(int irq, struct apic_chip_data *data)
 {
 	int cpu, vector;
@@ -258,12 +270,6 @@ void copy_irq_alloc_info(struct irq_alloc_info *dst, struct irq_alloc_info *src)
 		memset(dst, 0, sizeof(*dst));
 }
 
-static inline const struct cpumask *
-irq_alloc_info_get_mask(struct irq_alloc_info *info)
-{
-	return (!info || !info->mask) ? apic->target_cpus() : info->mask;
-}
-
 static void x86_vector_free_irqs(struct irq_domain *domain,
 				 unsigned int virq, unsigned int nr_irqs)
 {
@@ -289,7 +295,6 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 {
 	struct irq_alloc_info *info = arg;
 	struct apic_chip_data *data;
-	const struct cpumask *mask;
 	struct irq_data *irq_data;
 	int i, err;
 
@@ -300,7 +305,6 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 	if ((info->flags & X86_IRQ_ALLOC_CONTIGUOUS_VECTORS) && nr_irqs > 1)
 		return -ENOSYS;
 
-	mask = irq_alloc_info_get_mask(info);
 	for (i = 0; i < nr_irqs; i++) {
 		irq_data = irq_domain_get_irq_data(domain, virq + i);
 		BUG_ON(!irq_data);
@@ -318,7 +322,8 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 		irq_data->chip = &lapic_controller;
 		irq_data->chip_data = data;
 		irq_data->hwirq = virq + i;
-		err = assign_irq_vector(virq, data, mask);
+		err = assign_irq_vector_policy(virq, irq_data->node, data,
+					       info);
 		if (err)
 			goto error;
 	}
-- 
1.7.10.4

* Re: [Patch v3] x86, irq: Allocate CPU vectors from device local CPUs if possible
  2015-05-07  2:53         ` [Patch v3] x86, irq: Allocate CPU vectors from device local CPUs if possible Jiang Liu
@ 2015-05-08  7:21             ` Daniel J Blueman
  2015-05-13  7:54           ` [tip:x86/apic] " tip-bot for Jiang Liu
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel J Blueman @ 2015-05-08  7:21 UTC (permalink / raw)
  Cc: Thomas Gleixner, Bjorn Helgaas, Benjamin Herrenschmidt,
	Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
	Yinghai Lu, Borislav Petkov, x86, Jiang Liu,
	Konrad Rzeszutek Wilk, Tony Luck, linux-kernel, linux-pci,
	linux-acpi, Steffen Persvold

On Thu, May 7, 2015 at 10:53 AM, Jiang Liu <jiang.liu@linux.intel.com> 
wrote:
> On NUMA systems, an IO device may be associated with a NUMA node.
> Allocating resources, such as memory and interrupts, from the
> device-local node may improve IO performance.
> 
> This patch introduces a mechanism to support CPU vector allocation
> policies. It first tries to allocate CPU vectors from CPUs on the
> device-local node, and then falls back to all online (global) CPUs.
> 
> This mechanism may be used on NumaConnect systems to allocate
> CPU vectors from the device-local node.
> 
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> Cc: Daniel J Blueman <daniel@numascale.com>
> ---
> Hi Thomas,
> 	I feel this should be the simplest version now :)
> Thanks!
> Gerry
> ---
>  arch/x86/kernel/apic/vector.c |   23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/apic/vector.c 
> b/arch/x86/kernel/apic/vector.c
> index 1c7dd42b98c1..eb65c6b98de0 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -210,6 +210,18 @@ static int assign_irq_vector(int irq, struct 
> apic_chip_data *data,
>  	return err;
>  }
> 
> +static int assign_irq_vector_policy(int irq, int node,
> +				    struct apic_chip_data *data,
> +				    struct irq_alloc_info *info)
> +{
> +	if (info && info->mask)
> +		return assign_irq_vector(irq, data, info->mask);
> +	if (node != NUMA_NO_NODE &&
> +	    assign_irq_vector(irq, data, cpumask_of_node(node)) == 0)
> +		return 0;
> +	return assign_irq_vector(irq, data, apic->target_cpus());
> +}
> +
>  static void clear_irq_vector(int irq, struct apic_chip_data *data)
>  {
>  	int cpu, vector;
> @@ -258,12 +270,6 @@ void copy_irq_alloc_info(struct irq_alloc_info 
> *dst, struct irq_alloc_info *src)
>  		memset(dst, 0, sizeof(*dst));
>  }
> 
> -static inline const struct cpumask *
> -irq_alloc_info_get_mask(struct irq_alloc_info *info)
> -{
> -	return (!info || !info->mask) ? apic->target_cpus() : info->mask;
> -}
> -
>  static void x86_vector_free_irqs(struct irq_domain *domain,
>  				 unsigned int virq, unsigned int nr_irqs)
>  {
> @@ -289,7 +295,6 @@ static int x86_vector_alloc_irqs(struct 
> irq_domain *domain, unsigned int virq,
>  {
>  	struct irq_alloc_info *info = arg;
>  	struct apic_chip_data *data;
> -	const struct cpumask *mask;
>  	struct irq_data *irq_data;
>  	int i, err;
> 
> @@ -300,7 +305,6 @@ static int x86_vector_alloc_irqs(struct 
> irq_domain *domain, unsigned int virq,
>  	if ((info->flags & X86_IRQ_ALLOC_CONTIGUOUS_VECTORS) && nr_irqs > 1)
>  		return -ENOSYS;
> 
> -	mask = irq_alloc_info_get_mask(info);
>  	for (i = 0; i < nr_irqs; i++) {
>  		irq_data = irq_domain_get_irq_data(domain, virq + i);
>  		BUG_ON(!irq_data);
> @@ -318,7 +322,8 @@ static int x86_vector_alloc_irqs(struct 
> irq_domain *domain, unsigned int virq,
>  		irq_data->chip = &lapic_controller;
>  		irq_data->chip_data = data;
>  		irq_data->hwirq = virq + i;
> -		err = assign_irq_vector(virq, data, mask);
> +		err = assign_irq_vector_policy(virq, irq_data->node, data,
> +					       info);
>  		if (err)
>  			goto error;
>  	}

Testing x86/tip/apic with this patch on a 192-core/24-node NumaConnect
system shows all the PCIe bridge, GPU, SATA, NIC, etc. interrupts
allocated on the correct NUMA nodes, so it works great.

Tested-by: Daniel J Blueman <daniel@numascale.com>

Many thanks!
  Daniel
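
For readers following the thread, the selection order in the quoted
assign_irq_vector_policy() can be sketched outside the kernel. What follows
is a minimal userspace model, not the kernel implementation: CPU masks are
reduced to 64-bit bitmaps, per-CPU vector tables are reduced to a counter,
and every model_* / MODEL_* name is hypothetical and introduced only for
illustration. It assumes that a failed allocation is reported as -ENOSPC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model, not kernel code: a CPU mask is a 64-bit
 * bitmap and each CPU only has a counter of free vectors. */
#define MODEL_NR_CPUS   8
#define MODEL_NR_NODES  2
#define MODEL_NO_NODE   (-1)

struct model_state {
	int free_vectors[MODEL_NR_CPUS];     /* free vectors left per CPU */
	uint64_t node_mask[MODEL_NR_NODES];  /* CPUs belonging to each node */
	uint64_t online_mask;                /* all online CPUs */
};

/* Take one vector from the lowest-numbered CPU in mask that still has one.
 * On success store that CPU in *cpu_out and return 0, else return -ENOSPC. */
static int model_assign(struct model_state *s, uint64_t mask, int *cpu_out)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if ((mask & (UINT64_C(1) << cpu)) && s->free_vectors[cpu] > 0) {
			s->free_vectors[cpu]--;
			*cpu_out = cpu;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Same ordering as the quoted patch: a caller-supplied mask wins outright,
 * then the device-local node is tried, then all online CPUs. */
static int model_assign_policy(struct model_state *s, int node,
			       const uint64_t *explicit_mask, int *cpu_out)
{
	if (explicit_mask != NULL)
		return model_assign(s, *explicit_mask, cpu_out);
	if (node != MODEL_NO_NODE) {
		assert(node >= 0 && node < MODEL_NR_NODES);
		if (model_assign(s, s->node_mask[node], cpu_out) == 0)
			return 0;
	}
	return model_assign(s, s->online_mask, cpu_out);
}
```

In this model, as in the quoted function, a caller-supplied mask is never
widened: if it has no free vector the call fails instead of falling back to
the node or global mask. Only the node-local attempt is best-effort.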


* [tip:x86/apic] x86, irq: Allocate CPU vectors from device local CPUs if possible
  2015-05-07  2:53         ` [Patch v3] x86, irq: Allocate CPU vectors from device local CPUs if possible Jiang Liu
  2015-05-08  7:21             ` Daniel J Blueman
@ 2015-05-13  7:54           ` tip-bot for Jiang Liu
  1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Jiang Liu @ 2015-05-13  7:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: yinghai, linux-kernel, bp, daniel, tony.luck, rdunlap, jiang.liu,
	bhelgaas, benh, tglx, konrad.wilk, rjw, mingo, hpa

Commit-ID:  486ca539caa082c7f2929c207af1b3ce2a304489
Gitweb:     http://git.kernel.org/tip/486ca539caa082c7f2929c207af1b3ce2a304489
Author:     Jiang Liu <jiang.liu@linux.intel.com>
AuthorDate: Thu, 7 May 2015 10:53:56 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 13 May 2015 09:50:24 +0200

x86, irq: Allocate CPU vectors from device local CPUs if possible

On NUMA systems, an IO device may be associated with a NUMA node.
It may improve IO performance to allocate resources, such as memory
and interrupts, from device local node.

This patch introduces a mechanism to support CPU vector allocation
policies. It tries to allocate CPU vectors from CPUs on device local
node first, and then falls back to all online (global) CPUs.

This mechanism may be used to support NumaConnect systems to allocate
CPU vectors from device local node.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1430967244-28905-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 2766747..b590c9d 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -210,6 +210,18 @@ static int assign_irq_vector(int irq, struct apic_chip_data *data,
 	return err;
 }
 
+static int assign_irq_vector_policy(int irq, int node,
+				    struct apic_chip_data *data,
+				    struct irq_alloc_info *info)
+{
+	if (info && info->mask)
+		return assign_irq_vector(irq, data, info->mask);
+	if (node != NUMA_NO_NODE &&
+	    assign_irq_vector(irq, data, cpumask_of_node(node)) == 0)
+		return 0;
+	return assign_irq_vector(irq, data, apic->target_cpus());
+}
+
 static void clear_irq_vector(int irq, struct apic_chip_data *data)
 {
 	int cpu, vector;
@@ -258,12 +270,6 @@ void copy_irq_alloc_info(struct irq_alloc_info *dst, struct irq_alloc_info *src)
 		memset(dst, 0, sizeof(*dst));
 }
 
-static inline const struct cpumask *
-irq_alloc_info_get_mask(struct irq_alloc_info *info)
-{
-	return (!info || !info->mask) ? apic->target_cpus() : info->mask;
-}
-
 static void x86_vector_free_irqs(struct irq_domain *domain,
 				 unsigned int virq, unsigned int nr_irqs)
 {
@@ -289,7 +295,6 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 {
 	struct irq_alloc_info *info = arg;
 	struct apic_chip_data *data;
-	const struct cpumask *mask;
 	struct irq_data *irq_data;
 	int i, err;
 
@@ -300,7 +305,6 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 	if ((info->flags & X86_IRQ_ALLOC_CONTIGUOUS_VECTORS) && nr_irqs > 1)
 		return -ENOSYS;
 
-	mask = irq_alloc_info_get_mask(info);
 	for (i = 0; i < nr_irqs; i++) {
 		irq_data = irq_domain_get_irq_data(domain, virq + i);
 		BUG_ON(!irq_data);
@@ -318,7 +322,8 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 		irq_data->chip = &lapic_controller;
 		irq_data->chip_data = data;
 		irq_data->hwirq = virq + i;
-		err = assign_irq_vector(virq, data, mask);
+		err = assign_irq_vector_policy(virq, irq_data->node, data,
+					       info);
 		if (err)
 			goto error;
 	}


end of thread, other threads:[~2015-05-13  7:56 UTC | newest]

Thread overview: 12+ messages
2015-05-04  2:47 [Patch 0/2] Optimize CPU vector allocation for NUMA systems Jiang Liu
2015-05-04  2:47 ` [Patch 1/2] irq_remapping/vt-d: Fix regression caused by commit b106ee63abcc Jiang Liu
2015-05-05  9:15   ` [tip:x86/apic] irq_remapping/vt-d: Init all MSI entries not just the first one tip-bot for Thomas Gleixner
2015-05-04  2:47 ` [Patch 2/2] x86, irq: Support CPU vector allocation policies Jiang Liu
2015-05-05 19:25   ` Thomas Gleixner
2015-05-06  5:17     ` Jiang Liu
2015-05-06  8:36     ` [Patch v2] " Jiang Liu
2015-05-06 10:22       ` Thomas Gleixner
2015-05-07  2:53         ` [Patch v3] x86, irq: Allocate CPU vectors from device local CPUs if possible Jiang Liu
2015-05-08  7:21           ` Daniel J Blueman
2015-05-08  7:21             ` Daniel J Blueman
2015-05-13  7:54           ` [tip:x86/apic] " tip-bot for Jiang Liu
