linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] irq: fix support for allocating sets of IRQs
@ 2018-11-02 14:59 Ming Lei
  2018-11-02 14:59 ` [PATCH 2/4] irq: move 2-stage irq spread into one helper Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Ming Lei @ 2018-11-02 14:59 UTC
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Thomas Gleixner, linux-kernel,
	Hannes Reinecke, Keith Busch, Sagi Grimberg

Hi Jens,

As I mentioned, there are at least two issues in the patch
'irq: add support for allocating (and affinitizing) sets of IRQs':

1) it is wrong to pass 'mask + usedvec' to irq_build_affinity_masks()

2) we should spread all possible CPUs in a 2-stage way on each set of IRQs

The fix isn't trivial, so I introduce two extra patches as preparation;
the implementation can then be cleaner.

The patchset is against the mq-maps branch of the block tree; feel free
to integrate it into the whole patchset of multiple queue maps.
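
For reference, the shape the fix takes in patch 4 below: each set gets
its own full 2-stage spread, and instead of offsetting the masks
pointer, the current vector is passed both as the start vector and as
the first vector of the set (a condensed sketch of the loop, not the
literal hunk):

	for (i = 0, usedvecs = 0; i < nr_sets; i++) {
		int this_vecs = affd->sets ? affd->sets[i] : affvecs;

		/* full 2-stage spread of this set over its own vectors */
		ret = irq_build_affinity_masks(affd, curvec, this_vecs,
					       curvec, node_to_cpumask,
					       masks);
		if (ret)
			break;
		curvec += this_vecs;
		usedvecs += this_vecs;
	}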

Thanks,
Ming


Jens Axboe (1):
  irq: add support for allocating (and affinitizing) sets of IRQs

Ming Lei (3):
  Revert "irq: add support for allocating (and affinitizing) sets of
    IRQs"
  irq: move 2-stage irq spread into one helper
  irq: pass first vector to __irq_build_affinity_masks

 kernel/irq/affinity.c | 119 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 75 insertions(+), 44 deletions(-)

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>

-- 
2.9.5


* [PATCH 2/4] irq: move 2-stage irq spread into one helper
  2018-11-02 14:59 [PATCH 0/4] irq: fix support for allocating sets of IRQs Ming Lei
@ 2018-11-02 14:59 ` Ming Lei
  2018-11-05 11:22   ` [tip:irq/core] genirq/affinity: Move two stage affinity spreading into a helper function tip-bot for Ming Lei
  2018-11-02 14:59 ` [PATCH 3/4] irq: pass first vector to __irq_build_affinity_masks Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2018-11-02 14:59 UTC
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Thomas Gleixner, linux-kernel,
	Hannes Reinecke, Keith Busch, Sagi Grimberg

No functional change; this prepares for the following patch, which
adds support for allocating (and affinitizing) sets of IRQs.
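
To illustrate what the 2-stage spread achieves, consider a hypothetical
machine (the CPU and vector counts below are invented for this example,
not taken from the patch):

	/*
	 * 8 possible CPUs, of which CPUs 0-3 are present, spread over
	 * 4 vectors:
	 *
	 *   stage 1 (cpu_present_mask):     vec0 <- cpu0, vec1 <- cpu1,
	 *                                   vec2 <- cpu2, vec3 <- cpu3
	 *   stage 2 (possible & ~present):  vec0 <- cpu4, vec1 <- cpu5,
	 *                                   vec2 <- cpu6, vec3 <- cpu7
	 *
	 * Every possible CPU ends up in some vector's mask, so a CPU
	 * hotplugged later already has a vector assigned.
	 */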

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 92 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 56 insertions(+), 36 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index f4f29b9d90ee..a16b601604aa 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -94,7 +94,7 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
 	return nodes;
 }
 
-static int irq_build_affinity_masks(const struct irq_affinity *affd,
+static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 				    int startvec, int numvecs,
 				    cpumask_var_t *node_to_cpumask,
 				    const struct cpumask *cpu_mask,
@@ -166,6 +166,58 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 	return done;
 }
 
+/*
+ * build affinity in two stages:
+ *	1) spread present CPUs on these vectors
+ *	2) spread other possible CPUs on these vectors
+ */
+static int irq_build_affinity_masks(const struct irq_affinity *affd,
+				    int startvec, int numvecs,
+				    cpumask_var_t *node_to_cpumask,
+				    struct cpumask *masks)
+{
+	int curvec = startvec, usedvecs = -1;
+	cpumask_var_t nmsk, npresmsk;
+
+	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
+			return usedvecs;
+
+	if (!zalloc_cpumask_var(&npresmsk, GFP_KERNEL))
+			goto fail;
+
+	/* Stabilize the cpumasks */
+	get_online_cpus();
+	build_node_to_cpumask(node_to_cpumask);
+
+	/* Spread on present CPUs starting from affd->pre_vectors */
+	usedvecs = __irq_build_affinity_masks(affd, curvec, numvecs,
+					    node_to_cpumask, cpu_present_mask,
+					    nmsk, masks);
+
+	/*
+	 * Spread on non present CPUs starting from the next vector to be
+	 * handled. If the spreading of present CPUs already exhausted the
+	 * vector space, assign the non present CPUs to the already spread
+	 * out vectors.
+	 */
+	if (usedvecs >= numvecs)
+		curvec = affd->pre_vectors;
+	else
+		curvec = affd->pre_vectors + usedvecs;
+	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
+	usedvecs += __irq_build_affinity_masks(affd, curvec, numvecs,
+					     node_to_cpumask, npresmsk,
+					     nmsk, masks);
+	put_online_cpus();
+
+	free_cpumask_var(npresmsk);
+
+ fail:
+	free_cpumask_var(nmsk);
+
+	return usedvecs;
+}
+
 /**
  * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
  * @nvecs:	The total number of vectors
@@ -178,7 +230,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 {
 	int affvecs = nvecs - affd->pre_vectors - affd->post_vectors;
 	int curvec, usedvecs;
-	cpumask_var_t nmsk, npresmsk, *node_to_cpumask;
+	cpumask_var_t *node_to_cpumask;
 	struct cpumask *masks = NULL;
 
 	/*
@@ -188,15 +240,9 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	if (nvecs == affd->pre_vectors + affd->post_vectors)
 		return NULL;
 
-	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
-		return NULL;
-
-	if (!zalloc_cpumask_var(&npresmsk, GFP_KERNEL))
-		goto outcpumsk;
-
 	node_to_cpumask = alloc_node_to_cpumask();
 	if (!node_to_cpumask)
-		goto outnpresmsk;
+		return NULL;
 
 	masks = kcalloc(nvecs, sizeof(*masks), GFP_KERNEL);
 	if (!masks)
@@ -206,30 +252,8 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
 		cpumask_copy(masks + curvec, irq_default_affinity);
 
-	/* Stabilize the cpumasks */
-	get_online_cpus();
-	build_node_to_cpumask(node_to_cpumask);
-
-	/* Spread on present CPUs starting from affd->pre_vectors */
 	usedvecs = irq_build_affinity_masks(affd, curvec, affvecs,
-					    node_to_cpumask, cpu_present_mask,
-					    nmsk, masks);
-
-	/*
-	 * Spread on non present CPUs starting from the next vector to be
-	 * handled. If the spreading of present CPUs already exhausted the
-	 * vector space, assign the non present CPUs to the already spread
-	 * out vectors.
-	 */
-	if (usedvecs >= affvecs)
-		curvec = affd->pre_vectors;
-	else
-		curvec = affd->pre_vectors + usedvecs;
-	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
-	usedvecs += irq_build_affinity_masks(affd, curvec, affvecs,
-					     node_to_cpumask, npresmsk,
-					     nmsk, masks);
-	put_online_cpus();
+					    node_to_cpumask, masks);
 
 	/* Fill out vectors at the end that don't need affinity */
 	if (usedvecs >= affvecs)
@@ -241,10 +265,6 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 
 outnodemsk:
 	free_node_to_cpumask(node_to_cpumask);
-outnpresmsk:
-	free_cpumask_var(npresmsk);
-outcpumsk:
-	free_cpumask_var(nmsk);
 	return masks;
 }
 
-- 
2.9.5


* [PATCH 3/4] irq: pass first vector to __irq_build_affinity_masks
  2018-11-02 14:59 [PATCH 0/4] irq: fix support for allocating sets of IRQs Ming Lei
  2018-11-02 14:59 ` [PATCH 2/4] irq: move 2-stage irq spread into one helper Ming Lei
@ 2018-11-02 14:59 ` Ming Lei
  2018-11-05 11:23   ` [tip:irq/core] genirq/affinity: Pass first vector to __irq_build_affinity_masks() tip-bot for Ming Lei
  2018-11-02 14:59 ` [PATCH 4/4] irq: add support for allocating (and affinitizing) sets of IRQs Ming Lei
  2018-11-03 21:21 ` [PATCH 0/4] irq: fix support for allocating sets of IRQs Jens Axboe
  3 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2018-11-02 14:59 UTC
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Thomas Gleixner, linux-kernel,
	Hannes Reinecke, Keith Busch, Sagi Grimberg

No functional change; this prepares for the following patch, which
adds support for allocating (and affinitizing) sets of IRQs. There,
each set of IRQs needs a full 2-stage spread, and the first vector
should point to the first vector of that set.
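
As an illustration, take a hypothetical layout (the numbers are
invented for this example): with affd->pre_vectors = 1 and two sets of
sizes {4, 4}, set 0 occupies vectors 1-4 and set 1 occupies vectors
5-8. While spreading set 1, the wrap-around must return to the set's
own first vector:

	/* inside __irq_build_affinity_masks(), after this patch */
	int last_affv = firstvec + numvecs;	/* set 1: 5 + 4 = 9 */

	if (++curvec == last_affv)
		curvec = firstvec;	/* wraps to 5; wrapping to
					 * affd->pre_vectors (1) would
					 * leak into set 0's vectors */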

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index a16b601604aa..9c74f21ab10e 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -95,14 +95,14 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
 }
 
 static int __irq_build_affinity_masks(const struct irq_affinity *affd,
-				    int startvec, int numvecs,
+				    int startvec, int numvecs, int firstvec,
 				    cpumask_var_t *node_to_cpumask,
 				    const struct cpumask *cpu_mask,
 				    struct cpumask *nmsk,
 				    struct cpumask *masks)
 {
 	int n, nodes, cpus_per_vec, extra_vecs, done = 0;
-	int last_affv = affd->pre_vectors + numvecs;
+	int last_affv = firstvec + numvecs;
 	int curvec = startvec;
 	nodemask_t nodemsk = NODE_MASK_NONE;
 
@@ -121,7 +121,7 @@ static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 			if (++done == numvecs)
 				break;
 			if (++curvec == last_affv)
-				curvec = affd->pre_vectors;
+				curvec = firstvec;
 		}
 		goto out;
 	}
@@ -130,7 +130,7 @@ static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 		int ncpus, v, vecs_to_assign, vecs_per_node;
 
 		/* Spread the vectors per node */
-		vecs_per_node = (numvecs - (curvec - affd->pre_vectors)) / nodes;
+		vecs_per_node = (numvecs - (curvec - firstvec)) / nodes;
 
 		/* Get the cpus on this node which are in the mask */
 		cpumask_and(nmsk, cpu_mask, node_to_cpumask[n]);
@@ -158,7 +158,7 @@ static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 		if (done >= numvecs)
 			break;
 		if (curvec >= last_affv)
-			curvec = affd->pre_vectors;
+			curvec = firstvec;
 		--nodes;
 	}
 
@@ -191,8 +191,8 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 
 	/* Spread on present CPUs starting from affd->pre_vectors */
 	usedvecs = __irq_build_affinity_masks(affd, curvec, numvecs,
-					    node_to_cpumask, cpu_present_mask,
-					    nmsk, masks);
+					affd->pre_vectors, node_to_cpumask,
+					cpu_present_mask, nmsk, masks);
 
 	/*
 	 * Spread on non present CPUs starting from the next vector to be
@@ -206,8 +206,8 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 		curvec = affd->pre_vectors + usedvecs;
 	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
 	usedvecs += __irq_build_affinity_masks(affd, curvec, numvecs,
-					     node_to_cpumask, npresmsk,
-					     nmsk, masks);
+					affd->pre_vectors, node_to_cpumask, npresmsk,
+					nmsk, masks);
 	put_online_cpus();
 
 	free_cpumask_var(npresmsk);
-- 
2.9.5


* [PATCH 4/4] irq: add support for allocating (and affinitizing) sets of IRQs
  2018-11-02 14:59 [PATCH 0/4] irq: fix support for allocating sets of IRQs Ming Lei
  2018-11-02 14:59 ` [PATCH 2/4] irq: move 2-stage irq spread into one helper Ming Lei
  2018-11-02 14:59 ` [PATCH 3/4] irq: pass first vector to __irq_build_affinity_masks Ming Lei
@ 2018-11-02 14:59 ` Ming Lei
  2018-11-05 11:23   ` [tip:irq/core] genirq/affinity: Add support for allocating interrupt sets tip-bot for Jens Axboe
  2018-11-03 21:21 ` [PATCH 0/4] irq: fix support for allocating sets of IRQs Jens Axboe
  3 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2018-11-02 14:59 UTC
  To: Jens Axboe; +Cc: linux-block, Thomas Gleixner, linux-kernel, Ming Lei

From: Jens Axboe <axboe@kernel.dk>

A driver may need to allocate multiple sets of MSI/MSI-X interrupts,
and have them appropriately affinitized. Add support for defining a
number of sets of varying sizes in the irq_affinity structure, and get
each set affinitized correctly across the machine.
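
To make the intended usage concrete, here is a minimal driver-side
sketch (the device, the set sizes and the probe context are
hypothetical; only the irq_affinity fields and the min == max
requirement come from this patch):

	#include <linux/interrupt.h>
	#include <linux/pci.h>

	static int example_setup_irqs(struct pci_dev *pdev)
	{
		/* e.g. one set for read queues, one for write queues */
		static int sets[2] = { 8, 8 };
		struct irq_affinity affd = {
			.pre_vectors	= 1,	/* e.g. admin interrupt */
			.nr_sets	= ARRAY_SIZE(sets),
			.sets		= sets,
		};
		int nvecs = affd.pre_vectors + sets[0] + sets[1];

		/* with sets, a vector range is rejected: min == max */
		return pci_alloc_irq_vectors_affinity(pdev, nvecs, nvecs,
					PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
					&affd);
	}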

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/pci/msi.c         | 14 ++++++++++
 include/linux/interrupt.h |  4 +++
 kernel/irq/affinity.c     | 71 ++++++++++++++++++++++++++++++++++-------------
 3 files changed, 70 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index af24ed50a245..265ed3e4c920 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1036,6 +1036,13 @@ static int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
 	if (maxvec < minvec)
 		return -ERANGE;
 
+	/*
+	 * If the caller is passing in sets, we can't support a range of
+	 * vectors. The caller needs to handle that.
+	 */
+	if (affd && affd->nr_sets && minvec != maxvec)
+		return -EINVAL;
+
 	if (WARN_ON_ONCE(dev->msi_enabled))
 		return -EINVAL;
 
@@ -1087,6 +1094,13 @@ static int __pci_enable_msix_range(struct pci_dev *dev,
 	if (maxvec < minvec)
 		return -ERANGE;
 
+	/*
+	 * If the caller is passing in sets, we can't support a range of
+	 * supported vectors. The caller needs to handle that.
+	 */
+	if (affd && affd->nr_sets && minvec != maxvec)
+		return -EINVAL;
+
 	if (WARN_ON_ONCE(dev->msix_enabled))
 		return -EINVAL;
 
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 1d6711c28271..ca397ff40836 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -247,10 +247,14 @@ struct irq_affinity_notify {
  *			the MSI(-X) vector space
  * @post_vectors:	Don't apply affinity to @post_vectors at end of
  *			the MSI(-X) vector space
+ * @nr_sets:		Length of passed in *sets array
+ * @sets:		Array holding the size of each affinitized set
  */
 struct irq_affinity {
 	int	pre_vectors;
 	int	post_vectors;
+	int	nr_sets;
+	int	*sets;
 };
 
 #if defined(CONFIG_SMP)
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 9c74f21ab10e..d49d3bff702c 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -172,26 +172,28 @@ static int __irq_build_affinity_masks(const struct irq_affinity *affd,
  *	2) spread other possible CPUs on these vectors
  */
 static int irq_build_affinity_masks(const struct irq_affinity *affd,
-				    int startvec, int numvecs,
+				    int startvec, int numvecs, int firstvec,
 				    cpumask_var_t *node_to_cpumask,
 				    struct cpumask *masks)
 {
-	int curvec = startvec, usedvecs = -1;
+	int curvec = startvec, nr_present, nr_others;
+	int ret = -ENOMEM;
 	cpumask_var_t nmsk, npresmsk;
 
 	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
-			return usedvecs;
+			return ret;
 
 	if (!zalloc_cpumask_var(&npresmsk, GFP_KERNEL))
 			goto fail;
 
+	ret = 0;
 	/* Stabilize the cpumasks */
 	get_online_cpus();
 	build_node_to_cpumask(node_to_cpumask);
 
 	/* Spread on present CPUs starting from affd->pre_vectors */
-	usedvecs = __irq_build_affinity_masks(affd, curvec, numvecs,
-					affd->pre_vectors, node_to_cpumask,
+	nr_present = __irq_build_affinity_masks(affd, curvec, numvecs,
+					firstvec, node_to_cpumask,
 					cpu_present_mask, nmsk, masks);
 
 	/*
@@ -200,22 +202,24 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 	 * vector space, assign the non present CPUs to the already spread
 	 * out vectors.
 	 */
-	if (usedvecs >= numvecs)
-		curvec = affd->pre_vectors;
+	if (nr_present >= numvecs)
+		curvec = firstvec;
 	else
-		curvec = affd->pre_vectors + usedvecs;
+		curvec = firstvec + nr_present;
 	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
-	usedvecs += __irq_build_affinity_masks(affd, curvec, numvecs,
-					affd->pre_vectors, node_to_cpumask, npresmsk,
+	nr_others = __irq_build_affinity_masks(affd, curvec, numvecs,
+					firstvec, node_to_cpumask, npresmsk,
 					nmsk, masks);
 	put_online_cpus();
 
+	if (nr_present < numvecs)
+			WARN_ON(nr_present + nr_others < numvecs);
+
 	free_cpumask_var(npresmsk);
 
  fail:
 	free_cpumask_var(nmsk);
-
-	return usedvecs;
+	return ret;
 }
 
 /**
@@ -232,6 +236,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	int curvec, usedvecs;
 	cpumask_var_t *node_to_cpumask;
 	struct cpumask *masks = NULL;
+	int i, nr_sets;
 
 	/*
 	 * If there aren't any vectors left after applying the pre/post
@@ -252,8 +257,28 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
 		cpumask_copy(masks + curvec, irq_default_affinity);
 
-	usedvecs = irq_build_affinity_masks(affd, curvec, affvecs,
-					    node_to_cpumask, masks);
+	/*
+	 * Spread on present CPUs starting from affd->pre_vectors. If we
+	 * have multiple sets, build each set's affinity mask separately.
+	 */
+	nr_sets = affd->nr_sets;
+	if (!nr_sets)
+		nr_sets = 1;
+
+	for (i = 0, usedvecs = 0; i < nr_sets; i++) {
+		int this_vecs = affd->sets ? affd->sets[i] : affvecs;
+		int ret;
+
+		ret = irq_build_affinity_masks(affd, curvec, this_vecs,
+						curvec, node_to_cpumask, masks);
+		if (ret) {
+				kfree(masks);
+				masks = NULL;
+				goto outnodemsk;
+		}
+		curvec += this_vecs;
+		usedvecs += this_vecs;
+	}
 
 	/* Fill out vectors at the end that don't need affinity */
 	if (usedvecs >= affvecs)
@@ -278,13 +303,21 @@ int irq_calc_affinity_vectors(int minvec, int maxvec, const struct irq_affinity
 {
 	int resv = affd->pre_vectors + affd->post_vectors;
 	int vecs = maxvec - resv;
-	int ret;
+	int set_vecs;
 
 	if (resv > minvec)
 		return 0;
 
-	get_online_cpus();
-	ret = min_t(int, cpumask_weight(cpu_possible_mask), vecs) + resv;
-	put_online_cpus();
-	return ret;
+	if (affd->nr_sets) {
+		int i;
+
+		for (i = 0, set_vecs = 0;  i < affd->nr_sets; i++)
+			set_vecs += affd->sets[i];
+	} else {
+		get_online_cpus();
+		set_vecs = cpumask_weight(cpu_possible_mask);
+		put_online_cpus();
+	}
+
+	return resv + min(set_vecs, vecs);
 }
-- 
2.9.5


* Re: [PATCH 0/4] irq: fix support for allocating sets of IRQs
  2018-11-02 14:59 [PATCH 0/4] irq: fix support for allocating sets of IRQs Ming Lei
                   ` (2 preceding siblings ...)
  2018-11-02 14:59 ` [PATCH 4/4] irq: add support for allocating (and affinitizing) sets of IRQs Ming Lei
@ 2018-11-03 21:21 ` Jens Axboe
  2018-11-04 12:02   ` Thomas Gleixner
  3 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2018-11-03 21:21 UTC
  To: Ming Lei
  Cc: linux-block, Thomas Gleixner, linux-kernel, Hannes Reinecke,
	Keith Busch, Sagi Grimberg

On 11/2/18 8:59 AM, Ming Lei wrote:
> Hi Jens,
> 
> As I mentioned, there are at least two issues in the patch
> 'irq: add support for allocating (and affinitizing) sets of IRQs':
> 
> 1) it is wrong to pass 'mask + usedvec' to irq_build_affinity_masks()
> 
> 2) we should spread all possible CPUs in a 2-stage way on each set of IRQs
> 
> The fix isn't trivial, so I introduce two extra patches as preparation;
> the implementation can then be cleaner.
> 
> The patchset is against the mq-maps branch of the block tree; feel free
> to integrate it into the whole patchset of multiple queue maps.

Thanks Ming, I ran this through my testing, and I ended up with the
same maps and affinities for all the cases I cared about. I'm going
to drop my initial version and add the three patches.

-- 
Jens Axboe


* Re: [PATCH 0/4] irq: fix support for allocating sets of IRQs
  2018-11-03 21:21 ` [PATCH 0/4] irq: fix support for allocating sets of IRQs Jens Axboe
@ 2018-11-04 12:02   ` Thomas Gleixner
  2018-11-04 17:24     ` Jens Axboe
  2018-11-05  2:18     ` Ming Lei
  0 siblings, 2 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-11-04 12:02 UTC
  To: Jens Axboe
  Cc: Ming Lei, linux-block, LKML, Hannes Reinecke, Keith Busch,
	Sagi Grimberg, Long Li

Jens,

On Sat, 3 Nov 2018, Jens Axboe wrote:

> On 11/2/18 8:59 AM, Ming Lei wrote:
> > Hi Jens,
> > 
> > As I mentioned, there are at least two issues in the patch
> > 'irq: add support for allocating (and affinitizing) sets of IRQs':
> > 
> > 1) it is wrong to pass 'mask + usedvec' to irq_build_affinity_masks()
> > 
> > 2) we should spread all possible CPUs in a 2-stage way on each set of IRQs
> > 
> > The fix isn't trivial, so I introduce two extra patches as preparation;
> > the implementation can then be cleaner.
> > 
> > The patchset is against the mq-maps branch of the block tree; feel free
> > to integrate it into the whole patchset of multiple queue maps.
> 
> Thanks Ming, I ran this through my testing, and I ended up with the
> same maps and affinities for all the cases I cared about. I'm going
> to drop my initial version and add the three patches.

So I assume that I can pick up Ming's series instead.

There is another patch pending affecting the irq affinity spreading. Can
you folks please have a look at it?

https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com

Thanks,

	tglx



* Re: [PATCH 0/4] irq: fix support for allocating sets of IRQs
  2018-11-04 12:02   ` Thomas Gleixner
@ 2018-11-04 17:24     ` Jens Axboe
  2018-11-04 18:39       ` Thomas Gleixner
  2018-11-05  2:18     ` Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2018-11-04 17:24 UTC
  To: Thomas Gleixner
  Cc: Ming Lei, linux-block, LKML, Hannes Reinecke, Keith Busch,
	Sagi Grimberg, Long Li

On 11/4/18 5:02 AM, Thomas Gleixner wrote:
> Jens,
> 
> On Sat, 3 Nov 2018, Jens Axboe wrote:
> 
>> On 11/2/18 8:59 AM, Ming Lei wrote:
>>> Hi Jens,
>>>
>>> As I mentioned, there are at least two issues in the patch
>>> 'irq: add support for allocating (and affinitizing) sets of IRQs':
>>>
>>> 1) it is wrong to pass 'mask + usedvec' to irq_build_affinity_masks()
>>>
>>> 2) we should spread all possible CPUs in a 2-stage way on each set of IRQs
>>>
>>> The fix isn't trivial, so I introduce two extra patches as preparation;
>>> the implementation can then be cleaner.
>>>
>>> The patchset is against the mq-maps branch of the block tree; feel free
>>> to integrate it into the whole patchset of multiple queue maps.
>>
>> Thanks Ming, I ran this through my testing, and I ended up with the
>> same maps and affinities for all the cases I cared about. I'm going
>> to drop my initial version and add the three patches.
> 
> So I assume that I can pick up Ming's series instead.

Yes, let's do that.

> There is another patch pending affecting the irq affinity spreading. Can
> you folks please have a look at it?
> 
> https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com

I'll give that a look and test. Thanks for the heads-up.

-- 
Jens Axboe


* Re: [PATCH 0/4] irq: fix support for allocating sets of IRQs
  2018-11-04 17:24     ` Jens Axboe
@ 2018-11-04 18:39       ` Thomas Gleixner
  2018-11-05 11:24         ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2018-11-04 18:39 UTC
  To: Jens Axboe
  Cc: Ming Lei, linux-block, LKML, Hannes Reinecke, Keith Busch,
	Sagi Grimberg, Long Li

On Sun, 4 Nov 2018, Jens Axboe wrote:

Cc'ing Long with a hopefully working E-Mail address. The previous one
bounced because I stupidly copied the wrong one...

> On 11/4/18 5:02 AM, Thomas Gleixner wrote:
> > Jens,
> > 
> > On Sat, 3 Nov 2018, Jens Axboe wrote:
> > 
> >> On 11/2/18 8:59 AM, Ming Lei wrote:
> >>> Hi Jens,
> >>>
> >>> As I mentioned, there are at least two issues in the patch
> >>> 'irq: add support for allocating (and affinitizing) sets of IRQs':
> >>>
> >>> 1) it is wrong to pass 'mask + usedvec' to irq_build_affinity_masks()
> >>>
> >>> 2) we should spread all possible CPUs in a 2-stage way on each set of IRQs
> >>>
> >>> The fix isn't trivial, so I introduce two extra patches as preparation;
> >>> the implementation can then be cleaner.
> >>>
> >>> The patchset is against the mq-maps branch of the block tree; feel free
> >>> to integrate it into the whole patchset of multiple queue maps.
> >>
> >> Thanks Ming, I ran this through my testing, and I ended up with the
> >> same maps and affinities for all the cases I cared about. I'm going
> >> to drop my initial version and add the three patches.
> > 
> > So I assume that I can pick up Ming's series instead.
> 
> Yes, let's do that.
> 
> > There is another patch pending affecting the irq affinity spreading. Can
> > you folks please have a look at it?
> > 
> > https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
> 
> I'll give that a look and test. Thanks for the heads-up.
> 
> -- 
> Jens Axboe
> 
> 

* Re: [PATCH 0/4] irq: fix support for allocating sets of IRQs
  2018-11-04 12:02   ` Thomas Gleixner
  2018-11-04 17:24     ` Jens Axboe
@ 2018-11-05  2:18     ` Ming Lei
  1 sibling, 0 replies; 14+ messages in thread
From: Ming Lei @ 2018-11-05  2:18 UTC
  To: Thomas Gleixner
  Cc: Jens Axboe, linux-block, LKML, Hannes Reinecke, Keith Busch,
	Sagi Grimberg, Long Li

On Sun, Nov 04, 2018 at 01:02:18PM +0100, Thomas Gleixner wrote:
> Jens,
> 
> On Sat, 3 Nov 2018, Jens Axboe wrote:
> 
> > On 11/2/18 8:59 AM, Ming Lei wrote:
> > > Hi Jens,
> > > 
> > > As I mentioned, there are at least two issues in the patch
> > > 'irq: add support for allocating (and affinitizing) sets of IRQs':
> > > 
> > > 1) it is wrong to pass 'mask + usedvec' to irq_build_affinity_masks()
> > > 
> > > 2) we should spread all possible CPUs in a 2-stage way on each set of IRQs
> > > 
> > > The fix isn't trivial, so I introduce two extra patches as preparation;
> > > the implementation can then be cleaner.
> > > 
> > > The patchset is against the mq-maps branch of the block tree; feel free
> > > to integrate it into the whole patchset of multiple queue maps.
> > 
> > Thanks Ming, I ran this through my testing, and I ended up with the
> > same maps and affinities for all the cases I cared about. I'm going
> > to drop my initial version and add the three patches.
> 
> So I assume that I can pick up Ming's series instead.
> 
> There is another patch pending affecting the irq affinity spreading. Can
> you folks please have a look at it?
> 
> https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com

This patch looks fine.

It ensures that all CPUs are covered in the irq affinity masks when
the required vector number is <= nr_numa_nodes.
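
For context, that case is handled by the early branch of
__irq_build_affinity_masks() which hands one node's CPU mask to each
vector (shape as in the code quoted later in this thread):

	if (numvecs <= nodes) {
		for_each_node_mask(n, nodemsk) {
			cpumask_or(masks + curvec, masks + curvec,
				   node_to_cpumask[n]);
			if (++curvec == last_affv)
				curvec = firstvec;
		}
		done = numvecs;
		goto out;
	}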

Reviewed-by: Ming Lei <ming.lei@redhat.com>


Thanks,
Ming

* [tip:irq/core] genirq/affinity: Move two stage affinity spreading into a helper function
  2018-11-02 14:59 ` [PATCH 2/4] irq: move 2-stage irq spread into one helper Ming Lei
@ 2018-11-05 11:22   ` tip-bot for Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Ming Lei @ 2018-11-05 11:22 UTC
  To: linux-tip-commits
  Cc: sagi, hare, hpa, mingo, ming.lei, linux-kernel, keith.busch, tglx, axboe

Commit-ID:  5c903e108d0b005cf59904ca3520934fca4b9439
Gitweb:     https://git.kernel.org/tip/5c903e108d0b005cf59904ca3520934fca4b9439
Author:     Ming Lei <ming.lei@redhat.com>
AuthorDate: Fri, 2 Nov 2018 22:59:49 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 5 Nov 2018 12:16:26 +0100

genirq/affinity: Move two stage affinity spreading into a helper function

No functional change. Prepares for supporting allocating and affinitizing
interrupt sets.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Link: https://lkml.kernel.org/r/20181102145951.31979-3-ming.lei@redhat.com

---
 kernel/irq/affinity.c | 92 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 56 insertions(+), 36 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index e12cdf637c71..2f9812b6035e 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -94,7 +94,7 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
 	return nodes;
 }
 
-static int irq_build_affinity_masks(const struct irq_affinity *affd,
+static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 				    int startvec, int numvecs,
 				    cpumask_var_t *node_to_cpumask,
 				    const struct cpumask *cpu_mask,
@@ -165,6 +165,58 @@ out:
 	return done;
 }
 
+/*
+ * build affinity in two stages:
+ *	1) spread present CPUs on these vectors
+ *	2) spread other possible CPUs on these vectors
+ */
+static int irq_build_affinity_masks(const struct irq_affinity *affd,
+				    int startvec, int numvecs,
+				    cpumask_var_t *node_to_cpumask,
+				    struct cpumask *masks)
+{
+	int curvec = startvec, usedvecs = -1;
+	cpumask_var_t nmsk, npresmsk;
+
+	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
+			return usedvecs;
+
+	if (!zalloc_cpumask_var(&npresmsk, GFP_KERNEL))
+			goto fail;
+
+	/* Stabilize the cpumasks */
+	get_online_cpus();
+	build_node_to_cpumask(node_to_cpumask);
+
+	/* Spread on present CPUs starting from affd->pre_vectors */
+	usedvecs = __irq_build_affinity_masks(affd, curvec, numvecs,
+					    node_to_cpumask, cpu_present_mask,
+					    nmsk, masks);
+
+	/*
+	 * Spread on non present CPUs starting from the next vector to be
+	 * handled. If the spreading of present CPUs already exhausted the
+	 * vector space, assign the non present CPUs to the already spread
+	 * out vectors.
+	 */
+	if (usedvecs >= numvecs)
+		curvec = affd->pre_vectors;
+	else
+		curvec = affd->pre_vectors + usedvecs;
+	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
+	usedvecs += __irq_build_affinity_masks(affd, curvec, numvecs,
+					     node_to_cpumask, npresmsk,
+					     nmsk, masks);
+	put_online_cpus();
+
+	free_cpumask_var(npresmsk);
+
+ fail:
+	free_cpumask_var(nmsk);
+
+	return usedvecs;
+}
+
 /**
  * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
  * @nvecs:	The total number of vectors
@@ -177,7 +229,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 {
 	int affvecs = nvecs - affd->pre_vectors - affd->post_vectors;
 	int curvec, usedvecs;
-	cpumask_var_t nmsk, npresmsk, *node_to_cpumask;
+	cpumask_var_t *node_to_cpumask;
 	struct cpumask *masks = NULL;
 
 	/*
@@ -187,15 +239,9 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	if (nvecs == affd->pre_vectors + affd->post_vectors)
 		return NULL;
 
-	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
-		return NULL;
-
-	if (!zalloc_cpumask_var(&npresmsk, GFP_KERNEL))
-		goto outcpumsk;
-
 	node_to_cpumask = alloc_node_to_cpumask();
 	if (!node_to_cpumask)
-		goto outnpresmsk;
+		return NULL;
 
 	masks = kcalloc(nvecs, sizeof(*masks), GFP_KERNEL);
 	if (!masks)
@@ -205,30 +251,8 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
 		cpumask_copy(masks + curvec, irq_default_affinity);
 
-	/* Stabilize the cpumasks */
-	get_online_cpus();
-	build_node_to_cpumask(node_to_cpumask);
-
-	/* Spread on present CPUs starting from affd->pre_vectors */
 	usedvecs = irq_build_affinity_masks(affd, curvec, affvecs,
-					    node_to_cpumask, cpu_present_mask,
-					    nmsk, masks);
-
-	/*
-	 * Spread on non present CPUs starting from the next vector to be
-	 * handled. If the spreading of present CPUs already exhausted the
-	 * vector space, assign the non present CPUs to the already spread
-	 * out vectors.
-	 */
-	if (usedvecs >= affvecs)
-		curvec = affd->pre_vectors;
-	else
-		curvec = affd->pre_vectors + usedvecs;
-	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
-	usedvecs += irq_build_affinity_masks(affd, curvec, affvecs,
-					     node_to_cpumask, npresmsk,
-					     nmsk, masks);
-	put_online_cpus();
+					    node_to_cpumask, masks);
 
 	/* Fill out vectors at the end that don't need affinity */
 	if (usedvecs >= affvecs)
@@ -240,10 +264,6 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 
 outnodemsk:
 	free_node_to_cpumask(node_to_cpumask);
-outnpresmsk:
-	free_cpumask_var(npresmsk);
-outcpumsk:
-	free_cpumask_var(nmsk);
 	return masks;
 }
 

* [tip:irq/core] genirq/affinity: Pass first vector to __irq_build_affinity_masks()
  2018-11-02 14:59 ` [PATCH 3/4] irq: pass first vector to __irq_build_affinity_masks Ming Lei
@ 2018-11-05 11:23   ` tip-bot for Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Ming Lei @ 2018-11-05 11:23 UTC
  To: linux-tip-commits
  Cc: axboe, hare, linux-kernel, ming.lei, sagi, hpa, keith.busch, mingo, tglx

Commit-ID:  060746d9e394084b7401e7532f2de528ecbfb521
Gitweb:     https://git.kernel.org/tip/060746d9e394084b7401e7532f2de528ecbfb521
Author:     Ming Lei <ming.lei@redhat.com>
AuthorDate: Fri, 2 Nov 2018 22:59:50 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 5 Nov 2018 12:16:26 +0100

genirq/affinity: Pass first vector to __irq_build_affinity_masks()

No functional change.

Prepares for support of allocating and affinitizing sets of interrupts, in
which each set of interrupts needs a full two stage spreading. The first
vector argument is necessary for this so the affinitizing starts from the
first vector of each set.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Link: https://lkml.kernel.org/r/20181102145951.31979-4-ming.lei@redhat.com

---
 kernel/irq/affinity.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 2f9812b6035e..e028b773e38a 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -95,14 +95,14 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_cpumask,
 }
 
 static int __irq_build_affinity_masks(const struct irq_affinity *affd,
-				    int startvec, int numvecs,
+				    int startvec, int numvecs, int firstvec,
 				    cpumask_var_t *node_to_cpumask,
 				    const struct cpumask *cpu_mask,
 				    struct cpumask *nmsk,
 				    struct cpumask *masks)
 {
 	int n, nodes, cpus_per_vec, extra_vecs, done = 0;
-	int last_affv = affd->pre_vectors + numvecs;
+	int last_affv = firstvec + numvecs;
 	int curvec = startvec;
 	nodemask_t nodemsk = NODE_MASK_NONE;
 
@@ -119,7 +119,7 @@ static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 		for_each_node_mask(n, nodemsk) {
 			cpumask_or(masks + curvec, masks + curvec, node_to_cpumask[n]);
 			if (++curvec == last_affv)
-				curvec = affd->pre_vectors;
+				curvec = firstvec;
 		}
 		done = numvecs;
 		goto out;
@@ -129,7 +129,7 @@ static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 		int ncpus, v, vecs_to_assign, vecs_per_node;
 
 		/* Spread the vectors per node */
-		vecs_per_node = (numvecs - (curvec - affd->pre_vectors)) / nodes;
+		vecs_per_node = (numvecs - (curvec - firstvec)) / nodes;
 
 		/* Get the cpus on this node which are in the mask */
 		cpumask_and(nmsk, cpu_mask, node_to_cpumask[n]);
@@ -157,7 +157,7 @@ static int __irq_build_affinity_masks(const struct irq_affinity *affd,
 		if (done >= numvecs)
 			break;
 		if (curvec >= last_affv)
-			curvec = affd->pre_vectors;
+			curvec = firstvec;
 		--nodes;
 	}
 
@@ -190,8 +190,9 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 
 	/* Spread on present CPUs starting from affd->pre_vectors */
 	usedvecs = __irq_build_affinity_masks(affd, curvec, numvecs,
-					    node_to_cpumask, cpu_present_mask,
-					    nmsk, masks);
+					      affd->pre_vectors,
+					      node_to_cpumask,
+					      cpu_present_mask, nmsk, masks);
 
 	/*
 	 * Spread on non present CPUs starting from the next vector to be
@@ -205,8 +206,9 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 		curvec = affd->pre_vectors + usedvecs;
 	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
 	usedvecs += __irq_build_affinity_masks(affd, curvec, numvecs,
-					     node_to_cpumask, npresmsk,
-					     nmsk, masks);
+					       affd->pre_vectors,
+					       node_to_cpumask, npresmsk,
+					       nmsk, masks);
 	put_online_cpus();
 
 	free_cpumask_var(npresmsk);

* [tip:irq/core] genirq/affinity: Add support for allocating interrupt sets
  2018-11-02 14:59 ` [PATCH 4/4] irq: add support for allocating (and affinitizing) sets of IRQs Ming Lei
@ 2018-11-05 11:23   ` tip-bot for Jens Axboe
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Jens Axboe @ 2018-11-05 11:23 UTC
  To: linux-tip-commits
  Cc: hare, ming.lei, linux-kernel, axboe, mingo, tglx, keith.busch, sagi, hpa

Commit-ID:  6da4b3ab9a6e9b1b5f90322ab3fa3a7dd18edb19
Gitweb:     https://git.kernel.org/tip/6da4b3ab9a6e9b1b5f90322ab3fa3a7dd18edb19
Author:     Jens Axboe <axboe@kernel.dk>
AuthorDate: Fri, 2 Nov 2018 22:59:51 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 5 Nov 2018 12:16:27 +0100

genirq/affinity: Add support for allocating interrupt sets

A driver may need to allocate multiple sets of MSI/MSI-X interrupts,
and have them appropriately affinitized.

Add support for defining a number of sets of varying sizes in the
irq_affinity structure, and get each set affinitized correctly across
the machine.
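
As a worked example of the irq_calc_affinity_vectors() change in the
hunk below (all numbers hypothetical):

	/*
	 * pre_vectors = 1, post_vectors = 0, maxvec = 17, sets = {8, 8}:
	 *   resv     = 1 + 0   = 1
	 *   vecs     = 17 - 1  = 16
	 *   set_vecs = 8 + 8   = 16
	 *   return resv + min(set_vecs, vecs) = 1 + 16 = 17
	 *
	 * Without sets, set_vecs would be the number of possible CPUs.
	 */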

[ tglx: Minor changelog tweaks ]

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-block@vger.kernel.org
Link: https://lkml.kernel.org/r/20181102145951.31979-5-ming.lei@redhat.com

---
 drivers/pci/msi.c         | 14 +++++++++
 include/linux/interrupt.h |  4 +++
 kernel/irq/affinity.c     | 77 +++++++++++++++++++++++++++++++++--------------
 3 files changed, 72 insertions(+), 23 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index af24ed50a245..265ed3e4c920 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1036,6 +1036,13 @@ static int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
 	if (maxvec < minvec)
 		return -ERANGE;
 
+	/*
+	 * If the caller is passing in sets, we can't support a range of
+	 * vectors. The caller needs to handle that.
+	 */
+	if (affd && affd->nr_sets && minvec != maxvec)
+		return -EINVAL;
+
 	if (WARN_ON_ONCE(dev->msi_enabled))
 		return -EINVAL;
 
@@ -1087,6 +1094,13 @@ static int __pci_enable_msix_range(struct pci_dev *dev,
 	if (maxvec < minvec)
 		return -ERANGE;
 
+	/*
+	 * If the caller is passing in sets, we can't support a range of
+	 * supported vectors. The caller needs to handle that.
+	 */
+	if (affd && affd->nr_sets && minvec != maxvec)
+		return -EINVAL;
+
 	if (WARN_ON_ONCE(dev->msix_enabled))
 		return -EINVAL;
 
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 1d6711c28271..ca397ff40836 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -247,10 +247,14 @@ struct irq_affinity_notify {
  *			the MSI(-X) vector space
  * @post_vectors:	Don't apply affinity to @post_vectors at end of
  *			the MSI(-X) vector space
+ * @nr_sets:		Length of passed in *sets array
+ * @sets:		Array holding the size of each affinitized set
  */
 struct irq_affinity {
 	int	pre_vectors;
 	int	post_vectors;
+	int	nr_sets;
+	int	*sets;
 };
 
 #if defined(CONFIG_SMP)
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index e028b773e38a..08c904eb7279 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -171,28 +171,29 @@ out:
  *	2) spread other possible CPUs on these vectors
  */
 static int irq_build_affinity_masks(const struct irq_affinity *affd,
-				    int startvec, int numvecs,
+				    int startvec, int numvecs, int firstvec,
 				    cpumask_var_t *node_to_cpumask,
 				    struct cpumask *masks)
 {
-	int curvec = startvec, usedvecs = -1;
+	int curvec = startvec, nr_present, nr_others;
+	int ret = -ENOMEM;
 	cpumask_var_t nmsk, npresmsk;
 
 	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
-			return usedvecs;
+			return ret;
 
 	if (!zalloc_cpumask_var(&npresmsk, GFP_KERNEL))
 			goto fail;
 
+	ret = 0;
 	/* Stabilize the cpumasks */
 	get_online_cpus();
 	build_node_to_cpumask(node_to_cpumask);
 
 	/* Spread on present CPUs starting from affd->pre_vectors */
-	usedvecs = __irq_build_affinity_masks(affd, curvec, numvecs,
-					      affd->pre_vectors,
-					      node_to_cpumask,
-					      cpu_present_mask, nmsk, masks);
+	nr_present = __irq_build_affinity_masks(affd, curvec, numvecs,
+						firstvec, node_to_cpumask,
+						cpu_present_mask, nmsk, masks);
 
 	/*
 	 * Spread on non present CPUs starting from the next vector to be
@@ -200,23 +201,24 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 	 * vector space, assign the non present CPUs to the already spread
 	 * out vectors.
 	 */
-	if (usedvecs >= numvecs)
-		curvec = affd->pre_vectors;
+	if (nr_present >= numvecs)
+		curvec = firstvec;
 	else
-		curvec = affd->pre_vectors + usedvecs;
+		curvec = firstvec + nr_present;
 	cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
-	usedvecs += __irq_build_affinity_masks(affd, curvec, numvecs,
-					       affd->pre_vectors,
-					       node_to_cpumask, npresmsk,
-					       nmsk, masks);
+	nr_others = __irq_build_affinity_masks(affd, curvec, numvecs,
+					       firstvec, node_to_cpumask,
+					       npresmsk, nmsk, masks);
 	put_online_cpus();
 
+	if (nr_present < numvecs)
+			WARN_ON(nr_present + nr_others < numvecs);
+
 	free_cpumask_var(npresmsk);
 
  fail:
 	free_cpumask_var(nmsk);
-
-	return usedvecs;
+	return ret;
 }
 
 /**
@@ -233,6 +235,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	int curvec, usedvecs;
 	cpumask_var_t *node_to_cpumask;
 	struct cpumask *masks = NULL;
+	int i, nr_sets;
 
 	/*
 	 * If there aren't any vectors left after applying the pre/post
@@ -253,8 +256,28 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
 		cpumask_copy(masks + curvec, irq_default_affinity);
 
-	usedvecs = irq_build_affinity_masks(affd, curvec, affvecs,
-					    node_to_cpumask, masks);
+	/*
+	 * Spread on present CPUs starting from affd->pre_vectors. If we
+	 * have multiple sets, build each set's affinity mask separately.
+	 */
+	nr_sets = affd->nr_sets;
+	if (!nr_sets)
+		nr_sets = 1;
+
+	for (i = 0, usedvecs = 0; i < nr_sets; i++) {
+		int this_vecs = affd->sets ? affd->sets[i] : affvecs;
+		int ret;
+
+		ret = irq_build_affinity_masks(affd, curvec, this_vecs,
+						curvec, node_to_cpumask, masks);
+		if (ret) {
+				kfree(masks);
+				masks = NULL;
+				goto outnodemsk;
+		}
+		curvec += this_vecs;
+		usedvecs += this_vecs;
+	}
 
 	/* Fill out vectors at the end that don't need affinity */
 	if (usedvecs >= affvecs)
@@ -279,13 +302,21 @@ int irq_calc_affinity_vectors(int minvec, int maxvec, const struct irq_affinity
 {
 	int resv = affd->pre_vectors + affd->post_vectors;
 	int vecs = maxvec - resv;
-	int ret;
+	int set_vecs;
 
 	if (resv > minvec)
 		return 0;
 
-	get_online_cpus();
-	ret = min_t(int, cpumask_weight(cpu_possible_mask), vecs) + resv;
-	put_online_cpus();
-	return ret;
+	if (affd->nr_sets) {
+		int i;
+
+		for (i = 0, set_vecs = 0;  i < affd->nr_sets; i++)
+			set_vecs += affd->sets[i];
+	} else {
+		get_online_cpus();
+		set_vecs = cpumask_weight(cpu_possible_mask);
+		put_online_cpus();
+	}
+
+	return resv + min(set_vecs, vecs);
 }

* Re: [PATCH 0/4] irq: fix support for allocating sets of IRQs
  2018-11-04 18:39       ` Thomas Gleixner
@ 2018-11-05 11:24         ` Thomas Gleixner
  2018-11-06  3:02           ` Jens Axboe
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2018-11-05 11:24 UTC
  To: Jens Axboe
  Cc: Ming Lei, linux-block, LKML, Hannes Reinecke, Keith Busch,
	Sagi Grimberg, Long Li

Jens,

On Sun, 4 Nov 2018, Thomas Gleixner wrote:
> On Sun, 4 Nov 2018, Jens Axboe wrote:
> > On 11/4/18 5:02 AM, Thomas Gleixner wrote:
> > > So I assume that I can pick up Ming's series instead.
> > 
> > Yes, let's do that.
> > 
> > > There is another patch pending affecting the irq affinity spreading. Can
> > > you folks please have a look at it?
> > > 
> > > https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
> > 
> > I'll give that a look and test. Thanks for the heads-up.

I picked up the series from Ming and the patch from Long and merged
them into tip/irq/core.

The result is available for you to pull from:

    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/for-block

That branch is immutable now and I'll keep it around in case there are
incremental updates necessary before the merge window.

Thanks,

	tglx

* Re: [PATCH 0/4] irq: fix support for allocating sets of IRQs
  2018-11-05 11:24         ` Thomas Gleixner
@ 2018-11-06  3:02           ` Jens Axboe
  0 siblings, 0 replies; 14+ messages in thread
From: Jens Axboe @ 2018-11-06  3:02 UTC
  To: Thomas Gleixner
  Cc: Ming Lei, linux-block, LKML, Hannes Reinecke, Keith Busch,
	Sagi Grimberg, Long Li

On 11/5/18 4:24 AM, Thomas Gleixner wrote:
> Jens,
> 
> On Sun, 4 Nov 2018, Thomas Gleixner wrote:
>> On Sun, 4 Nov 2018, Jens Axboe wrote:
>>> On 11/4/18 5:02 AM, Thomas Gleixner wrote:
>>>> So I assume that I can pick up Ming's series instead.
>>>
>>> Yes, let's do that.
>>>
>>>> There is another patch pending affecting the irq affinity spreading. Can
>>>> you folks please have a look at it?
>>>>
>>>> https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
>>>
>>> I'll give that a look and test. Thanks for the heads-up.
> 
> I picked up the series from Ming and the patch from Long and merged
> them into tip/irq/core.
> 
> The result is available for you to pull from:
> 
>     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/for-block
> 
> That branch is immutable now and I'll keep it around in case there are
> incremental updates necessary before the merge window.

Great, thanks! I'll pull that in as a base.

-- 
Jens Axboe


end of thread

Thread overview: 14+ messages
2018-11-02 14:59 [PATCH 0/4] irq: fix support for allocating sets of IRQs Ming Lei
2018-11-02 14:59 ` [PATCH 2/4] irq: move 2-stage irq spread into one helper Ming Lei
2018-11-05 11:22   ` [tip:irq/core] genirq/affinity: Move two stage affinity spreading into a helper function tip-bot for Ming Lei
2018-11-02 14:59 ` [PATCH 3/4] irq: pass first vector to __irq_build_affinity_masks Ming Lei
2018-11-05 11:23   ` [tip:irq/core] genirq/affinity: Pass first vector to __irq_build_affinity_masks() tip-bot for Ming Lei
2018-11-02 14:59 ` [PATCH 4/4] irq: add support for allocating (and affinitizing) sets of IRQs Ming Lei
2018-11-05 11:23   ` [tip:irq/core] genirq/affinity: Add support for allocating interrupt sets tip-bot for Jens Axboe
2018-11-03 21:21 ` [PATCH 0/4] irq: fix support for allocating sets of IRQs Jens Axboe
2018-11-04 12:02   ` Thomas Gleixner
2018-11-04 17:24     ` Jens Axboe
2018-11-04 18:39       ` Thomas Gleixner
2018-11-05 11:24         ` Thomas Gleixner
2018-11-06  3:02           ` Jens Axboe
2018-11-05  2:18     ` Ming Lei
