linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] genirq/affinity: irq vector spread among online CPUs as far as possible
@ 2018-02-06 12:17 Ming Lei
  2018-02-06 12:17 ` [PATCH 1/5] genirq/affinity: rename *node_to_possible_cpumask as *node_to_cpumask Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Ming Lei @ 2018-02-06 12:17 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel
  Cc: linux-block, linux-nvme, Laurence Oberman, Ming Lei

Hi,

This patchset tries to spread irq vectors among online CPUs as far as
possible, so that we avoid ending up with too few irq vectors that have
online CPUs mapped.

For example, on an 8-core system where 4 CPU cores (4~7) are offline/not
present, with a device that has 4 queues:

1) before this patchset
	irq 39, cpu list 0-2
	irq 40, cpu list 3-4,6
	irq 41, cpu list 5
	irq 42, cpu list 7

2) after this patchset
	irq 39, cpu list 0,4
	irq 40, cpu list 1,6
	irq 41, cpu list 2,5
	irq 42, cpu list 3,7

Without this patchset, only two vectors (39, 40) can be active, whereas
all 4 irq vectors can be active after applying this patchset.
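
As a rough illustration of the counts above, here is a minimal userspace
sketch (not kernel code) that treats each cpu list as a bitmap, with bit n
standing for CPU n, and counts the vectors that have at least one online
CPU mapped. The helper name count_active_vectors() and the demo arrays are
made up for this example; only the cpu lists come from the listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: bit n of a mask stands for CPU n. */
#define ONLINE_CPUS_DEMO 0x0fu	/* CPUs 0-3 online, 4-7 offline */

/* cpu lists from case 1): 0-2 / 3-4,6 / 5 / 7 */
static const unsigned int before_masks[4] = { 0x07, 0x58, 0x20, 0x80 };
/* cpu lists from case 2): 0,4 / 1,6 / 2,5 / 3,7 */
static const unsigned int after_masks[4] = { 0x11, 0x42, 0x24, 0x88 };

/* Hypothetical helper: count vectors with at least one online CPU mapped. */
static int count_active_vectors(const unsigned int *vec_masks, size_t nvecs,
				unsigned int online)
{
	int active = 0;
	size_t i;

	assert(vec_masks != NULL);
	for (i = 0; i < nvecs; i++)
		if (vec_masks[i] & online)
			active++;
	return active;
}
```

Calling count_active_vectors(before_masks, 4, ONLINE_CPUS_DEMO) and the
same for after_masks gives the two counts that the paragraph above compares.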

One disadvantage is that CPUs from different NUMA nodes can be mapped to
the same irq vector. Given that one CPU is generally enough to handle
one irq vector, it shouldn't be a big deal. In particular, with the
current assignment more vectors have to be allocated, otherwise
performance can be hurt.

Thanks
Ming

Ming Lei (5):
  genirq/affinity: rename *node_to_possible_cpumask as *node_to_cpumask
  genirq/affinity: move actual irq vector spread into one helper
  genirq/affinity: support to do irq vectors spread starting from any
    vector
  genirq/affinity: irq vector spread among online CPUs as far as
    possible
  nvme: pci: pass max vectors as num_possible_cpus() to
    pci_alloc_irq_vectors

 drivers/nvme/host/pci.c |   2 +-
 kernel/irq/affinity.c   | 145 +++++++++++++++++++++++++++++++-----------------
 2 files changed, 95 insertions(+), 52 deletions(-)

-- 
2.9.5

* [PATCH 1/5] genirq/affinity: rename *node_to_possible_cpumask as *node_to_cpumask
  2018-02-06 12:17 [PATCH 0/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
@ 2018-02-06 12:17 ` Ming Lei
  2018-03-02 23:06   ` Christoph Hellwig
  2018-02-06 12:17 ` [PATCH 2/5] genirq/affinity: move actual irq vector spread into one helper Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2018-02-06 12:17 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel
  Cc: linux-block, linux-nvme, Laurence Oberman, Ming Lei, Christoph Hellwig

The following patches will introduce a two-stage irq spread to improve
the irq spread across all possible CPUs.

No functional change.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index a37a3b4b6342..4b1c4763212d 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -39,7 +39,7 @@ static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 	}
 }
 
-static cpumask_var_t *alloc_node_to_possible_cpumask(void)
+static cpumask_var_t *alloc_node_to_cpumask(void)
 {
 	cpumask_var_t *masks;
 	int node;
@@ -62,7 +62,7 @@ static cpumask_var_t *alloc_node_to_possible_cpumask(void)
 	return NULL;
 }
 
-static void free_node_to_possible_cpumask(cpumask_var_t *masks)
+static void free_node_to_cpumask(cpumask_var_t *masks)
 {
 	int node;
 
@@ -71,7 +71,7 @@ static void free_node_to_possible_cpumask(cpumask_var_t *masks)
 	kfree(masks);
 }
 
-static void build_node_to_possible_cpumask(cpumask_var_t *masks)
+static void build_node_to_cpumask(cpumask_var_t *masks)
 {
 	int cpu;
 
@@ -79,14 +79,14 @@ static void build_node_to_possible_cpumask(cpumask_var_t *masks)
 		cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
 }
 
-static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
+static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
 				const struct cpumask *mask, nodemask_t *nodemsk)
 {
 	int n, nodes = 0;
 
 	/* Calculate the number of nodes in the supplied affinity mask */
 	for_each_node(n) {
-		if (cpumask_intersects(mask, node_to_possible_cpumask[n])) {
+		if (cpumask_intersects(mask, node_to_cpumask[n])) {
 			node_set(n, *nodemsk);
 			nodes++;
 		}
@@ -109,7 +109,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	int last_affv = affv + affd->pre_vectors;
 	nodemask_t nodemsk = NODE_MASK_NONE;
 	struct cpumask *masks;
-	cpumask_var_t nmsk, *node_to_possible_cpumask;
+	cpumask_var_t nmsk, *node_to_cpumask;
 
 	/*
 	 * If there aren't any vectors left after applying the pre/post
@@ -125,8 +125,8 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	if (!masks)
 		goto out;
 
-	node_to_possible_cpumask = alloc_node_to_possible_cpumask();
-	if (!node_to_possible_cpumask)
+	node_to_cpumask = alloc_node_to_cpumask();
+	if (!node_to_cpumask)
 		goto out;
 
 	/* Fill out vectors at the beginning that don't need affinity */
@@ -135,8 +135,8 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 
 	/* Stabilize the cpumasks */
 	get_online_cpus();
-	build_node_to_possible_cpumask(node_to_possible_cpumask);
-	nodes = get_nodes_in_cpumask(node_to_possible_cpumask, cpu_possible_mask,
+	build_node_to_cpumask(node_to_cpumask);
+	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_possible_mask,
 				     &nodemsk);
 
 	/*
@@ -146,7 +146,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	if (affv <= nodes) {
 		for_each_node_mask(n, nodemsk) {
 			cpumask_copy(masks + curvec,
-				     node_to_possible_cpumask[n]);
+				     node_to_cpumask[n]);
 			if (++curvec == last_affv)
 				break;
 		}
@@ -160,7 +160,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 		vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
 
 		/* Get the cpus on this node which are in the mask */
-		cpumask_and(nmsk, cpu_possible_mask, node_to_possible_cpumask[n]);
+		cpumask_and(nmsk, cpu_possible_mask, node_to_cpumask[n]);
 
 		/* Calculate the number of cpus per vector */
 		ncpus = cpumask_weight(nmsk);
@@ -192,7 +192,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	/* Fill out vectors at the end that don't need affinity */
 	for (; curvec < nvecs; curvec++)
 		cpumask_copy(masks + curvec, irq_default_affinity);
-	free_node_to_possible_cpumask(node_to_possible_cpumask);
+	free_node_to_cpumask(node_to_cpumask);
 out:
 	free_cpumask_var(nmsk);
 	return masks;
-- 
2.9.5

* [PATCH 2/5] genirq/affinity: move actual irq vector spread into one helper
  2018-02-06 12:17 [PATCH 0/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
  2018-02-06 12:17 ` [PATCH 1/5] genirq/affinity: rename *node_to_possible_cpumask as *node_to_cpumask Ming Lei
@ 2018-02-06 12:17 ` Ming Lei
  2018-03-02 23:07   ` Christoph Hellwig
  2018-02-06 12:17 ` [PATCH 3/5] genirq/affinity: support to do irq vectors spread starting from any vector Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2018-02-06 12:17 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel
  Cc: linux-block, linux-nvme, Laurence Oberman, Ming Lei, Christoph Hellwig

No functional change, just prepare for converting to 2-stage
irq vector spread.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 99 +++++++++++++++++++++++++++++----------------------
 1 file changed, 56 insertions(+), 43 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 4b1c4763212d..6af3f6727f63 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -79,7 +79,7 @@ static void build_node_to_cpumask(cpumask_var_t *masks)
 		cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
 }
 
-static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
+static int get_nodes_in_cpumask(const cpumask_var_t *node_to_cpumask,
 				const struct cpumask *mask, nodemask_t *nodemsk)
 {
 	int n, nodes = 0;
@@ -94,50 +94,19 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
 	return nodes;
 }
 
-/**
- * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
- * @nvecs:	The total number of vectors
- * @affd:	Description of the affinity requirements
- *
- * Returns the masks pointer or NULL if allocation failed.
- */
-struct cpumask *
-irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
+int irq_build_affinity_masks(int nvecs, const struct irq_affinity *affd,
+			     const cpumask_var_t *node_to_cpumask,
+			     const struct cpumask *cpu_mask,
+			     struct cpumask *nmsk,
+			     struct cpumask *masks)
 {
-	int n, nodes, cpus_per_vec, extra_vecs, curvec;
 	int affv = nvecs - affd->pre_vectors - affd->post_vectors;
 	int last_affv = affv + affd->pre_vectors;
+	int curvec = affd->pre_vectors;
 	nodemask_t nodemsk = NODE_MASK_NONE;
-	struct cpumask *masks;
-	cpumask_var_t nmsk, *node_to_cpumask;
-
-	/*
-	 * If there aren't any vectors left after applying the pre/post
-	 * vectors don't bother with assigning affinity.
-	 */
-	if (!affv)
-		return NULL;
-
-	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
-		return NULL;
-
-	masks = kcalloc(nvecs, sizeof(*masks), GFP_KERNEL);
-	if (!masks)
-		goto out;
+	int n, nodes, cpus_per_vec, extra_vecs;
 
-	node_to_cpumask = alloc_node_to_cpumask();
-	if (!node_to_cpumask)
-		goto out;
-
-	/* Fill out vectors at the beginning that don't need affinity */
-	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
-		cpumask_copy(masks + curvec, irq_default_affinity);
-
-	/* Stabilize the cpumasks */
-	get_online_cpus();
-	build_node_to_cpumask(node_to_cpumask);
-	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_possible_mask,
-				     &nodemsk);
+	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_mask, &nodemsk);
 
 	/*
 	 * If the number of nodes in the mask is greater than or equal the
@@ -150,7 +119,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 			if (++curvec == last_affv)
 				break;
 		}
-		goto done;
+		goto out;
 	}
 
 	for_each_node_mask(n, nodemsk) {
@@ -160,7 +129,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 		vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
 
 		/* Get the cpus on this node which are in the mask */
-		cpumask_and(nmsk, cpu_possible_mask, node_to_cpumask[n]);
+		cpumask_and(nmsk, cpu_mask, node_to_cpumask[n]);
 
 		/* Calculate the number of cpus per vector */
 		ncpus = cpumask_weight(nmsk);
@@ -186,7 +155,51 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 		--nodes;
 	}
 
-done:
+out:
+	return curvec - affd->pre_vectors;
+}
+
+/**
+ * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
+ * @nvecs:	The total number of vectors
+ * @affd:	Description of the affinity requirements
+ *
+ * Returns the masks pointer or NULL if allocation failed.
+ */
+struct cpumask *
+irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
+{
+	int curvec;
+	struct cpumask *masks;
+	cpumask_var_t nmsk, *node_to_cpumask;
+
+	/*
+	 * If there aren't any vectors left after applying the pre/post
+	 * vectors don't bother with assigning affinity.
+	 */
+	if (nvecs == affd->pre_vectors + affd->post_vectors)
+		return NULL;
+
+	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
+		return NULL;
+
+	masks = kcalloc(nvecs, sizeof(*masks), GFP_KERNEL);
+	if (!masks)
+		goto out;
+
+	node_to_cpumask = alloc_node_to_cpumask();
+	if (!node_to_cpumask)
+		goto out;
+
+	/* Fill out vectors at the beginning that don't need affinity */
+	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
+		cpumask_copy(masks + curvec, irq_default_affinity);
+
+	/* Stabilize the cpumasks */
+	get_online_cpus();
+	build_node_to_cpumask(node_to_cpumask);
+	curvec += irq_build_affinity_masks(nvecs, affd, node_to_cpumask,
+					   cpu_possible_mask, nmsk, masks);
 	put_online_cpus();
 
 	/* Fill out vectors at the end that don't need affinity */
-- 
2.9.5

* [PATCH 3/5] genirq/affinity: support to do irq vectors spread starting from any vector
  2018-02-06 12:17 [PATCH 0/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
  2018-02-06 12:17 ` [PATCH 1/5] genirq/affinity: rename *node_to_possible_cpumask as *node_to_cpumask Ming Lei
  2018-02-06 12:17 ` [PATCH 2/5] genirq/affinity: move actual irq vector spread into one helper Ming Lei
@ 2018-02-06 12:17 ` Ming Lei
  2018-03-02 23:07   ` Christoph Hellwig
  2018-02-06 12:17 ` [PATCH 4/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
  2018-02-06 12:17 ` [PATCH 5/5] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors Ming Lei
  4 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2018-02-06 12:17 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel
  Cc: linux-block, linux-nvme, Laurence Oberman, Ming Lei, Christoph Hellwig

Two parameters (start_vec, affv) are now passed to
irq_build_affinity_masks(), so that this helper can build the affinity of
each irq vector starting from the irq vector 'start_vec', and handle at
most 'affv' vectors.

This is required to do the 2-stage irq vector spread among all possible
CPUs.
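
The wrap-around this implies can be sketched in userspace as below. This is
a simplified illustration only: it ignores NUMA nodes and cpumasks and just
records which vector slots would be visited, in order. The function name
visit_order() and the 'order' output array are hypothetical and exist only
for this example.

```c
#include <assert.h>

/*
 * Hypothetical helper: starting at 'start_vec', visit at most 'affv'
 * vector slots, wrapping back to 'pre_vectors' once the end of the
 * affinity-managed range is reached. Returns the number of slots visited.
 */
static int visit_order(int pre_vectors, int start_vec, int affv, int *order)
{
	int last_affv = pre_vectors + affv;
	int curvec = start_vec;
	int done = 0;

	/* start_vec must lie inside the affinity-managed range */
	assert(affv > 0 && start_vec >= pre_vectors && start_vec < last_affv);

	while (done < affv) {
		order[done++] = curvec;
		if (++curvec == last_affv)
			curvec = pre_vectors;
	}
	return done;
}
```

With pre_vectors = 1, start_vec = 3 and affv = 4, the managed range is
vectors 1..4, so the slots are visited in the order 3, 4, 1, 2.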

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 6af3f6727f63..9801aecf8763 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -94,17 +94,17 @@ static int get_nodes_in_cpumask(const cpumask_var_t *node_to_cpumask,
 	return nodes;
 }
 
-int irq_build_affinity_masks(int nvecs, const struct irq_affinity *affd,
+int irq_build_affinity_masks(const struct irq_affinity *affd,
+			     const int start_vec, const int affv,
 			     const cpumask_var_t *node_to_cpumask,
 			     const struct cpumask *cpu_mask,
 			     struct cpumask *nmsk,
 			     struct cpumask *masks)
 {
-	int affv = nvecs - affd->pre_vectors - affd->post_vectors;
 	int last_affv = affv + affd->pre_vectors;
-	int curvec = affd->pre_vectors;
+	int curvec = start_vec;
 	nodemask_t nodemsk = NODE_MASK_NONE;
-	int n, nodes, cpus_per_vec, extra_vecs;
+	int n, nodes, cpus_per_vec, extra_vecs, done = 0;
 
 	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_mask, &nodemsk);
 
@@ -116,8 +116,10 @@ int irq_build_affinity_masks(int nvecs, const struct irq_affinity *affd,
 		for_each_node_mask(n, nodemsk) {
 			cpumask_copy(masks + curvec,
 				     node_to_cpumask[n]);
-			if (++curvec == last_affv)
+			if (++done == affv)
 				break;
+			if (++curvec == last_affv)
+				curvec = affd->pre_vectors;
 		}
 		goto out;
 	}
@@ -150,13 +152,16 @@ int irq_build_affinity_masks(int nvecs, const struct irq_affinity *affd,
 			irq_spread_init_one(masks + curvec, nmsk, cpus_per_vec);
 		}
 
-		if (curvec >= last_affv)
+		done += v;
+		if (done >= affv)
 			break;
+		if (curvec >= last_affv)
+			curvec = affd->pre_vectors;
 		--nodes;
 	}
 
 out:
-	return curvec - affd->pre_vectors;
+	return done;
 }
 
 /**
@@ -169,6 +174,7 @@ int irq_build_affinity_masks(int nvecs, const struct irq_affinity *affd,
 struct cpumask *
 irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 {
+	int affv = nvecs - affd->pre_vectors - affd->post_vectors;
 	int curvec;
 	struct cpumask *masks;
 	cpumask_var_t nmsk, *node_to_cpumask;
@@ -198,7 +204,8 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	/* Stabilize the cpumasks */
 	get_online_cpus();
 	build_node_to_cpumask(node_to_cpumask);
-	curvec += irq_build_affinity_masks(nvecs, affd, node_to_cpumask,
+	curvec += irq_build_affinity_masks(affd, curvec, affv,
+					   node_to_cpumask,
 					   cpu_possible_mask, nmsk, masks);
 	put_online_cpus();
 
-- 
2.9.5

* [PATCH 4/5] genirq/affinity: irq vector spread among online CPUs as far as possible
  2018-02-06 12:17 [PATCH 0/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
                   ` (2 preceding siblings ...)
  2018-02-06 12:17 ` [PATCH 3/5] genirq/affinity: support to do irq vectors spread starting from any vector Ming Lei
@ 2018-02-06 12:17 ` Ming Lei
  2018-03-02 23:08   ` Christoph Hellwig
  2018-02-06 12:17 ` [PATCH 5/5] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors Ming Lei
  4 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2018-02-06 12:17 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel
  Cc: linux-block, linux-nvme, Laurence Oberman, Ming Lei, Christoph Hellwig

84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
may cause some irq vectors to be assigned only to offline CPUs, and this
kind of assignment may leave far fewer irq vectors mapped to online CPUs,
which may hurt performance.

For example, in an 8-core system with CPUs 0~3 online and 4~7 offline/not
present, see 'lscpu':

	[ming@box]$lscpu
	Architecture:          x86_64
	CPU op-mode(s):        32-bit, 64-bit
	Byte Order:            Little Endian
	CPU(s):                4
	On-line CPU(s) list:   0-3
	Thread(s) per core:    1
	Core(s) per socket:    2
	Socket(s):             2
	NUMA node(s):          2
	...
	NUMA node0 CPU(s):     0-3
	NUMA node1 CPU(s):
	...

For example, one device has 4 queues:

1) before 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
	irq 39, cpu list 0
	irq 40, cpu list 1
	irq 41, cpu list 2
	irq 42, cpu list 3

2) after 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
	irq 39, cpu list 0-2
	irq 40, cpu list 3-4,6
	irq 41, cpu list 5
	irq 42, cpu list 7

3) after applying this patch against V4.15+:
	irq 39, cpu list 0,4
	irq 40, cpu list 1,6
	irq 41, cpu list 2,5
	irq 42, cpu list 3,7

This patch tries to spread irq vectors among online CPUs as far as
possible by doing the spread in 2 stages.

Assignment 3) above isn't the optimal result from a NUMA point of view,
but it yields more irq vectors with an online CPU mapped. Given that in
practice one CPU should be enough to handle one irq vector, it is better
to do it this way.
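
The 2-stage idea can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the kernel implementation:
CPUs are bits in an unsigned int, there are no pre/post vectors, NUMA nodes
are ignored, and CPUs are handed out round robin instead of in per-node
groups, so it does not reproduce the exact cpu lists in 3). The names
spread_stage() and two_stage_spread() are made up for this example.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS_DEMO 8	/* assumption: at most 8 CPUs, bit n == CPU n */

/*
 * Hand out the CPUs in 'cpu_mask' round robin, starting at 'start_vec'.
 * Returns the number of vectors that received a CPU, capped at 'nvecs'.
 */
static int spread_stage(unsigned int cpu_mask, int start_vec, int nvecs,
			unsigned int *masks)
{
	int curvec = start_vec, done = 0, cpu;

	assert(nvecs > 0 && start_vec >= 0 && start_vec < nvecs);
	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++) {
		if (!(cpu_mask & (1u << cpu)))
			continue;
		masks[curvec] |= 1u << cpu;
		if (done < nvecs)
			done++;
		if (++curvec == nvecs)
			curvec = 0;
	}
	return done;
}

/* Stage 1 spreads online CPUs, stage 2 spreads the remaining possible CPUs. */
static void two_stage_spread(unsigned int possible, unsigned int online,
			     int nvecs, unsigned int *masks)
{
	int vecs_online, start;

	memset(masks, 0, (size_t)nvecs * sizeof(*masks));
	vecs_online = spread_stage(online, 0, nvecs, masks);

	/* continue from the next unused vector, or wrap if all were used */
	start = vecs_online >= nvecs ? 0 : vecs_online;
	spread_stage(possible & ~online, start, nvecs, masks);
}
```

In this model, with 8 possible CPUs, CPUs 0-3 online and 4 vectors, every
vector ends up with exactly one online CPU and one offline CPU, which is the
property the patch is after.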

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Reported-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 9801aecf8763..6755ed77d017 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -106,6 +106,9 @@ int irq_build_affinity_masks(const struct irq_affinity *affd,
 	nodemask_t nodemsk = NODE_MASK_NONE;
 	int n, nodes, cpus_per_vec, extra_vecs, done = 0;
 
+	if (!cpumask_weight(cpu_mask))
+		return 0;
+
 	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_mask, &nodemsk);
 
 	/*
@@ -175,9 +178,9 @@ struct cpumask *
 irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 {
 	int affv = nvecs - affd->pre_vectors - affd->post_vectors;
-	int curvec;
+	int curvec, vecs_offline, vecs_online;
 	struct cpumask *masks;
-	cpumask_var_t nmsk, *node_to_cpumask;
+	cpumask_var_t nmsk, cpu_mask, *node_to_cpumask;
 
 	/*
 	 * If there aren't any vectors left after applying the pre/post
@@ -193,9 +196,12 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	if (!masks)
 		goto out;
 
+	if (!alloc_cpumask_var(&cpu_mask, GFP_KERNEL))
+		goto out;
+
 	node_to_cpumask = alloc_node_to_cpumask();
 	if (!node_to_cpumask)
-		goto out;
+		goto out_free_cpu_mask;
 
 	/* Fill out vectors at the beginning that don't need affinity */
 	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
@@ -204,15 +210,32 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	/* Stabilize the cpumasks */
 	get_online_cpus();
 	build_node_to_cpumask(node_to_cpumask);
-	curvec += irq_build_affinity_masks(affd, curvec, affv,
-					   node_to_cpumask,
-					   cpu_possible_mask, nmsk, masks);
+	/* spread on online CPUs starting from the vector of affd->pre_vectors */
+	vecs_online = irq_build_affinity_masks(affd, curvec, affv,
+					       node_to_cpumask,
+					       cpu_online_mask, nmsk, masks);
+
+	/* spread on offline CPUs starting from the next vector to be handled */
+	if (vecs_online >= affv)
+		curvec = affd->pre_vectors;
+	else
+		curvec = affd->pre_vectors + vecs_online;
+	cpumask_andnot(cpu_mask, cpu_possible_mask, cpu_online_mask);
+	vecs_offline = irq_build_affinity_masks(affd, curvec, affv,
+						node_to_cpumask,
+					        cpu_mask, nmsk, masks);
 	put_online_cpus();
 
 	/* Fill out vectors at the end that don't need affinity */
+	if (vecs_online + vecs_offline >= affv)
+		curvec = affv + affd->pre_vectors;
+	else
+		curvec = affd->pre_vectors + vecs_online + vecs_offline;
 	for (; curvec < nvecs; curvec++)
 		cpumask_copy(masks + curvec, irq_default_affinity);
 	free_node_to_cpumask(node_to_cpumask);
+out_free_cpu_mask:
+	free_cpumask_var(cpu_mask);
 out:
 	free_cpumask_var(nmsk);
 	return masks;
-- 
2.9.5

* [PATCH 5/5] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
  2018-02-06 12:17 [PATCH 0/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
                   ` (3 preceding siblings ...)
  2018-02-06 12:17 ` [PATCH 4/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
@ 2018-02-06 12:17 ` Ming Lei
  2018-03-01  0:52   ` Christoph Hellwig
  4 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2018-02-06 12:17 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel
  Cc: linux-block, linux-nvme, Laurence Oberman, Ming Lei, Keith Busch,
	Sagi Grimberg, Christoph Hellwig

84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
has switched to spreading irq vectors among all possible CPUs, so
pass num_possible_cpus() as the max number of vectors to be assigned.

For example, in an 8-core system with CPUs 0~3 online and 4~7 offline/not
present, see 'lscpu':

        [ming@box]$lscpu
        Architecture:          x86_64
        CPU op-mode(s):        32-bit, 64-bit
        Byte Order:            Little Endian
        CPU(s):                4
        On-line CPU(s) list:   0-3
        Thread(s) per core:    1
        Core(s) per socket:    2
        Socket(s):             2
        NUMA node(s):          2
        ...
        NUMA node0 CPU(s):     0-3
        NUMA node1 CPU(s):
        ...

1) before this patch, the allocated vectors and their affinity are:
	irq 47, cpu list 0,4
	irq 48, cpu list 1,6
	irq 49, cpu list 2,5
	irq 50, cpu list 3,7

2) after this patch, the allocated vectors and their affinity are:
	irq 43, cpu list 0
	irq 44, cpu list 1
	irq 45, cpu list 2
	irq 46, cpu list 3
	irq 47, cpu list 4
	irq 48, cpu list 6
	irq 49, cpu list 5
	irq 50, cpu list 7

Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6fe7af00a1f4..f778426c93d5 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1903,7 +1903,7 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	int result, nr_io_queues;
 	unsigned long size;
 
-	nr_io_queues = num_present_cpus();
+	nr_io_queues = num_possible_cpus();
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
 	if (result < 0)
 		return result;
-- 
2.9.5

* Re: [PATCH 5/5] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
  2018-02-06 12:17 ` [PATCH 5/5] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors Ming Lei
@ 2018-03-01  0:52   ` Christoph Hellwig
  2018-03-01 17:17     ` Keith Busch
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2018-03-01  0:52 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel,
	linux-block, linux-nvme, Laurence Oberman, Keith Busch,
	Sagi Grimberg, Christoph Hellwig

Looks fine,

and we should pick this up for 4.16 independent of the rest, which
I might need a little more review time for.

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 5/5] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
  2018-03-01  0:52   ` Christoph Hellwig
@ 2018-03-01 17:17     ` Keith Busch
  0 siblings, 0 replies; 12+ messages in thread
From: Keith Busch @ 2018-03-01 17:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Jens Axboe, Christoph Hellwig, Thomas Gleixner,
	linux-kernel, linux-block, linux-nvme, Laurence Oberman,
	Sagi Grimberg

On Thu, Mar 01, 2018 at 01:52:20AM +0100, Christoph Hellwig wrote:
> Looks fine,
> 
> and we should pick this up for 4.16 independent of the rest, which
> I might need a little more review time for.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks, queued up for 4.16.

* Re: [PATCH 1/5] genirq/affinity: rename *node_to_possible_cpumask as *node_to_cpumask
  2018-02-06 12:17 ` [PATCH 1/5] genirq/affinity: rename *node_to_possible_cpumask as *node_to_cpumask Ming Lei
@ 2018-03-02 23:06   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2018-03-02 23:06 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel,
	linux-block, linux-nvme, Laurence Oberman, Christoph Hellwig

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 2/5] genirq/affinity: move actual irq vector spread into one helper
  2018-02-06 12:17 ` [PATCH 2/5] genirq/affinity: move actual irq vector spread into one helper Ming Lei
@ 2018-03-02 23:07   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2018-03-02 23:07 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel,
	linux-block, linux-nvme, Laurence Oberman, Christoph Hellwig

On Tue, Feb 06, 2018 at 08:17:39PM +0800, Ming Lei wrote:
> No functional change, just prepare for converting to 2-stage
> irq vector spread.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  kernel/irq/affinity.c | 99 +++++++++++++++++++++++++++++----------------------
>  1 file changed, 56 insertions(+), 43 deletions(-)
> 
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index 4b1c4763212d..6af3f6727f63 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -79,7 +79,7 @@ static void build_node_to_cpumask(cpumask_var_t *masks)
>  		cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
>  }
>  
> -static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
> +static int get_nodes_in_cpumask(const cpumask_var_t *node_to_cpumask,
>  				const struct cpumask *mask, nodemask_t *nodemsk)

Maybe you can split all your constifications into a separate prep patch?

> +int irq_build_affinity_masks(int nvecs, const struct irq_affinity *affd,
> +			     const cpumask_var_t *node_to_cpumask,
> +			     const struct cpumask *cpu_mask,
> +			     struct cpumask *nmsk,
> +			     struct cpumask *masks)

static?

Otherwise looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 3/5] genirq/affinity: support to do irq vectors spread starting from any vector
  2018-02-06 12:17 ` [PATCH 3/5] genirq/affinity: support to do irq vectors spread starting from any vector Ming Lei
@ 2018-03-02 23:07   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2018-03-02 23:07 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel,
	linux-block, linux-nvme, Laurence Oberman, Christoph Hellwig

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 4/5] genirq/affinity: irq vector spread among online CPUs as far as possible
  2018-02-06 12:17 ` [PATCH 4/5] genirq/affinity: irq vector spread among online CPUs as far as possible Ming Lei
@ 2018-03-02 23:08   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2018-03-02 23:08 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, Thomas Gleixner, linux-kernel,
	linux-block, linux-nvme, Laurence Oberman, Christoph Hellwig

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>
