linux-kernel.vger.kernel.org archive mirror
* [PATCH] genirq/affinity: fix node generation from cpumask
@ 2016-12-14 18:01 Guilherme G. Piccoli
  2016-12-14 23:24 ` Gavin Shan
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Guilherme G. Piccoli @ 2016-12-14 18:01 UTC (permalink / raw)
  To: tglx, linux-kernel
  Cc: gabriel, hch, linuxppc-dev, linux-pci, gpiccoli, stable

Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
infrastructure") introduced a better IRQ spreading mechanism, taking
into account the available NUMA nodes in the machine.

The problem is that the algorithm retrieving the nodemask iterates
node IDs "linearly", from 0 up to the number of online nodes - some
architectures present a non-linear node distribution in the nodemask,
like PowerPC. In that case, the algorithm leads to a wrong node count
and therefore to a bad/incomplete IRQ affinity distribution.

For example, this problem was found on a machine with 128 CPUs and two
nodes, namely nodes 0 and 8 (instead of 0 and 1, as in a linear
distribution). This led to a wrong affinity distribution, which in turn
led to a bad mq allocation for the nvme driver.

Finally, we take the opportunity to fix a comment regarding the affinity
distribution when we have _more_ nodes than vectors.

Fixes: 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading infrastructure")
Reported-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.9+
Cc: Christoph Hellwig <hch@lst.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-pci@vger.kernel.org
---
 kernel/irq/affinity.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 9be9bda..464eaf0 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -37,15 +37,15 @@ static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 
 static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
 {
-	int n, nodes;
+	int n, nodes = 0;
 
 	/* Calculate the number of nodes in the supplied affinity mask */
-	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
+	for_each_online_node(n)
 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
 			node_set(n, *nodemsk);
 			nodes++;
 		}
-	}
+
 	return nodes;
 }
 
@@ -82,7 +82,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	nodes = get_nodes_in_cpumask(cpu_online_mask, &nodemsk);
 
 	/*
-	 * If the number of nodes in the mask is less than or equal the
+	 * If the number of nodes in the mask is greater than or equal the
 	 * number of vectors we just spread the vectors across the nodes.
 	 */
 	if (affv <= nodes) {
-- 
2.1.0
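
To see concretely why the linear walk miscounts, here is a minimal
userspace sketch (illustration only - the bitmap, its size, and the
num_online_nodes() helper below are made up to mimic the reported
topology of two online nodes with the sparse IDs 0 and 8):

#include <stdio.h>

/*
 * Illustration only: a fake "online node" bitmap mimicking the
 * reported topology - two online nodes with the sparse IDs 0 and 8.
 */
static const int online[16] = { [0] = 1, [8] = 1 };

static int num_online_nodes(void)
{
	int i, n = 0;

	for (i = 0; i < 16; i++)
		n += online[i];
	return n;			/* 2 online nodes */
}

int main(void)
{
	int n, nodes;

	/*
	 * Buggy walk: visits node IDs 0 and 1 only (because
	 * num_online_nodes() == 2), so the online node 8 is never
	 * seen and the offline node 1 is tested instead.
	 */
	for (n = 0, nodes = 0; n < num_online_nodes(); n++)
		if (online[n])
			nodes++;
	printf("linear walk: %d node(s)\n", nodes);	/* prints 1 */

	/*
	 * Fixed walk: iterate the bitmap itself, the way
	 * for_each_online_node() walks the online-node mask,
	 * so sparse IDs are found.
	 */
	for (n = 0, nodes = 0; n < 16; n++)
		if (online[n])
			nodes++;
	printf("bitmap walk: %d node(s)\n", nodes);	/* prints 2 */

	return 0;
}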

* Re: [PATCH] genirq/affinity: fix node generation from cpumask
  2016-12-14 18:01 [PATCH] genirq/affinity: fix node generation from cpumask Guilherme G. Piccoli
@ 2016-12-14 23:24 ` Gavin Shan
  2016-12-15  9:36   ` Thomas Gleixner
  2016-12-15  1:05 ` Gabriel Krisman Bertazi
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: Gavin Shan @ 2016-12-14 23:24 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: tglx, linux-kernel, linux-pci, hch, linuxppc-dev, stable, gabriel

On Wed, Dec 14, 2016 at 04:01:12PM -0200, Guilherme G. Piccoli wrote:
>Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
>infrastructure") introduced a better IRQ spreading mechanism, taking
>into account the available NUMA nodes in the machine.
>
>The problem is that the algorithm retrieving the nodemask iterates
>node IDs "linearly", from 0 up to the number of online nodes - some
>architectures present a non-linear node distribution in the nodemask,
>like PowerPC. In that case, the algorithm leads to a wrong node count
>and therefore to a bad/incomplete IRQ affinity distribution.
>
>For example, this problem was found on a machine with 128 CPUs and two
>nodes, namely nodes 0 and 8 (instead of 0 and 1, as in a linear
>distribution). This led to a wrong affinity distribution, which in turn
>led to a bad mq allocation for the nvme driver.
>
>Finally, we take the opportunity to fix a comment regarding the affinity
>distribution when we have _more_ nodes than vectors.
>
>Fixes: 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading infrastructure")
>Reported-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
>Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
>Cc: stable@vger.kernel.org # v4.9+
>Cc: Christoph Hellwig <hch@lst.de>
>Cc: linuxppc-dev@lists.ozlabs.org
>Cc: linux-pci@vger.kernel.org
>---

Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

There is one picky comment below, but you don't have to fix it :)

> kernel/irq/affinity.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
>index 9be9bda..464eaf0 100644
>--- a/kernel/irq/affinity.c
>+++ b/kernel/irq/affinity.c
>@@ -37,15 +37,15 @@ static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
>
> static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
> {
>-	int n, nodes;
>+	int n, nodes = 0;
>
> 	/* Calculate the number of nodes in the supplied affinity mask */
>-	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
>+	for_each_online_node(n)
> 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
> 			node_set(n, *nodemsk);
> 			nodes++;
> 		}
>-	}
>+

It'd be better to keep the brackets so that we needn't add them when
adding more code to the block later.

> 	return nodes;
> }
>
>@@ -82,7 +82,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
> 	nodes = get_nodes_in_cpumask(cpu_online_mask, &nodemsk);
>
> 	/*
>-	 * If the number of nodes in the mask is less than or equal the
>+	 * If the number of nodes in the mask is greater than or equal the
> 	 * number of vectors we just spread the vectors across the nodes.
> 	 */
> 	if (affv <= nodes) {

Thanks,
Gavin

* Re: [PATCH] genirq/affinity: fix node generation from cpumask
  2016-12-14 18:01 [PATCH] genirq/affinity: fix node generation from cpumask Guilherme G. Piccoli
  2016-12-14 23:24 ` Gavin Shan
@ 2016-12-15  1:05 ` Gabriel Krisman Bertazi
  2016-12-15  8:54 ` Christoph Hellwig
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Gabriel Krisman Bertazi @ 2016-12-15  1:05 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: tglx, linux-kernel, gabriel, hch, linuxppc-dev, linux-pci

"Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:

> Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
> infrastructure") introduced a better IRQ spreading mechanism, taking
> into account the available NUMA nodes in the machine.
>
> The problem is that the algorithm retrieving the nodemask iterates
> node IDs "linearly", from 0 up to the number of online nodes - some
> architectures present a non-linear node distribution in the nodemask,
> like PowerPC. In that case, the algorithm leads to a wrong node count
> and therefore to a bad/incomplete IRQ affinity distribution.
>
> For example, this problem was found on a machine with 128 CPUs and two
> nodes, namely nodes 0 and 8 (instead of 0 and 1, as in a linear
> distribution). This led to a wrong affinity distribution, which in turn
> led to a bad mq allocation for the nvme driver.
>
> Finally, we take the opportunity to fix a comment regarding the affinity
> distribution when we have _more_ nodes than vectors.

Thanks for taking care of this so quickly, Guilherme.

Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>

-- 
Gabriel Krisman Bertazi

* Re: [PATCH] genirq/affinity: fix node generation from cpumask
  2016-12-14 18:01 [PATCH] genirq/affinity: fix node generation from cpumask Guilherme G. Piccoli
  2016-12-14 23:24 ` Gavin Shan
  2016-12-15  1:05 ` Gabriel Krisman Bertazi
@ 2016-12-15  8:54 ` Christoph Hellwig
  2016-12-15 11:37 ` [tip:irq/urgent] genirq/affinity: Fix " tip-bot for Guilherme G. Piccoli
  2016-12-15 12:34 ` [PATCH] genirq/affinity: fix " Balbir Singh
  4 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2016-12-15  8:54 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: tglx, linux-kernel, gabriel, hch, linuxppc-dev, linux-pci, stable

Looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

(but I agree with the bracing nitpick from Gavin)

* Re: [PATCH] genirq/affinity: fix node generation from cpumask
  2016-12-14 23:24 ` Gavin Shan
@ 2016-12-15  9:36   ` Thomas Gleixner
  2016-12-15 12:38     ` Guilherme G. Piccoli
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-12-15  9:36 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Guilherme G. Piccoli, LKML, linux-pci, Christoph Hellwig,
	linuxppc-dev, gabriel

On Thu, 15 Dec 2016, Gavin Shan wrote:
> > static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
> > {
> >-	int n, nodes;
> >+	int n, nodes = 0;
> >
> > 	/* Calculate the number of nodes in the supplied affinity mask */
> >-	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
> >+	for_each_online_node(n)
> > 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
> > 			node_set(n, *nodemsk);
> > 			nodes++;
> > 		}
> >-	}
> >+
> 
> It'd be better to keep the brackets so that we needn't add them when
> adding more code to the block later.

Removing the brackets is outright wrong. See:
  https://marc.info/?l=linux-kernel&m=147351236615103
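
A schematic sketch of the rule in question, drawn on the hunk above
(an illustration of the usual CodingStyle reading of the linked
discussion, not a verbatim hunk): a loop body that spans multiple
lines keeps its braces, even if it is syntactically a single statement.

	/* Discouraged: the loop body is a single statement, but it
	 * spans multiple lines, so the loop should stay braced. */
	for_each_online_node(n)
		if (cpumask_intersects(mask, cpumask_of_node(n))) {
			node_set(n, *nodemsk);
			nodes++;
		}

	/* Preferred - and what the applied commit keeps: */
	for_each_online_node(n) {
		if (cpumask_intersects(mask, cpumask_of_node(n))) {
			node_set(n, *nodemsk);
			nodes++;
		}
	}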

I'll fix that up when applying the patch.

Thanks,

	tglx

* [tip:irq/urgent] genirq/affinity: Fix node generation from cpumask
  2016-12-14 18:01 [PATCH] genirq/affinity: fix node generation from cpumask Guilherme G. Piccoli
                   ` (2 preceding siblings ...)
  2016-12-15  8:54 ` Christoph Hellwig
@ 2016-12-15 11:37 ` tip-bot for Guilherme G. Piccoli
  2016-12-15 12:34 ` [PATCH] genirq/affinity: fix " Balbir Singh
  4 siblings, 0 replies; 8+ messages in thread
From: tip-bot for Guilherme G. Piccoli @ 2016-12-15 11:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: gpiccoli, hch, gabriel, hpa, linux-kernel, gwshan, tglx, mingo

Commit-ID:  c0af52437254fda8b0cdbaae5a9b6d9327f1fcd5
Gitweb:     http://git.kernel.org/tip/c0af52437254fda8b0cdbaae5a9b6d9327f1fcd5
Author:     Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
AuthorDate: Wed, 14 Dec 2016 16:01:12 -0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 15 Dec 2016 12:32:35 +0100

genirq/affinity: Fix node generation from cpumask

Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
infrastructure") introduced a better IRQ spreading mechanism, taking
into account the available NUMA nodes in the machine.

The problem is that the algorithm retrieving the nodemask iterates
node IDs "linearly", from 0 up to the number of online nodes - some
architectures present a non-linear node distribution in the nodemask,
like PowerPC. In that case, the algorithm leads to a wrong node count
and therefore to a bad/incomplete IRQ affinity distribution.

For example, this problem was found on a machine with 128 CPUs and two
nodes, namely nodes 0 and 8 (instead of 0 and 1, as in a linear
distribution). This led to a wrong affinity distribution, which in turn
led to a bad mq allocation for the nvme driver.

Finally, we take the opportunity to fix a comment regarding the affinity
distribution when we have _more_ nodes than vectors.

Fixes: 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading infrastructure")
Reported-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: linux-pci@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: hch@lst.de
Link: http://lkml.kernel.org/r/1481738472-2671-1-git-send-email-gpiccoli@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 kernel/irq/affinity.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 9be9bda..4544b11 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -37,10 +37,10 @@ static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 
 static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
 {
-	int n, nodes;
+	int n, nodes = 0;
 
 	/* Calculate the number of nodes in the supplied affinity mask */
-	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
+	for_each_online_node(n) {
 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
 			node_set(n, *nodemsk);
 			nodes++;
@@ -82,7 +82,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	nodes = get_nodes_in_cpumask(cpu_online_mask, &nodemsk);
 
 	/*
-	 * If the number of nodes in the mask is less than or equal the
+	 * If the number of nodes in the mask is greater than or equal the
 	 * number of vectors we just spread the vectors across the nodes.
 	 */
 	if (affv <= nodes) {
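
For convenience, the function as it should read with this commit
applied (reconstructed from the hunks above, not copied from the tree):

static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
{
	int n, nodes = 0;

	/* Calculate the number of nodes in the supplied affinity mask */
	for_each_online_node(n) {
		if (cpumask_intersects(mask, cpumask_of_node(n))) {
			node_set(n, *nodemsk);
			nodes++;
		}
	}
	return nodes;
}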

* Re: [PATCH] genirq/affinity: fix node generation from cpumask
  2016-12-14 18:01 [PATCH] genirq/affinity: fix node generation from cpumask Guilherme G. Piccoli
                   ` (3 preceding siblings ...)
  2016-12-15 11:37 ` [tip:irq/urgent] genirq/affinity: Fix " tip-bot for Guilherme G. Piccoli
@ 2016-12-15 12:34 ` Balbir Singh
  4 siblings, 0 replies; 8+ messages in thread
From: Balbir Singh @ 2016-12-15 12:34 UTC (permalink / raw)
  To: Guilherme G. Piccoli, tglx, linux-kernel
  Cc: linux-pci, hch, linuxppc-dev, gabriel



On 15/12/16 05:01, Guilherme G. Piccoli wrote:
> Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
> infrastructure") introduced a better IRQ spreading mechanism, taking
> into account the available NUMA nodes in the machine.
>
> The problem is that the algorithm retrieving the nodemask iterates
> node IDs "linearly", from 0 up to the number of online nodes - some
> architectures present a non-linear node distribution in the nodemask,
> like PowerPC. In that case, the algorithm leads to a wrong node count
> and therefore to a bad/incomplete IRQ affinity distribution.
>
> For example, this problem was found on a machine with 128 CPUs and two
> nodes, namely nodes 0 and 8 (instead of 0 and 1, as in a linear
> distribution). This led to a wrong affinity distribution, which in turn
> led to a bad mq allocation for the nvme driver.
>
> Finally, we take the opportunity to fix a comment regarding the affinity
> distribution when we have _more_ nodes than vectors.

Very good catch! 

Acked-by: Balbir Singh <bsingharora@gmail.com>

* Re: [PATCH] genirq/affinity: fix node generation from cpumask
  2016-12-15  9:36   ` Thomas Gleixner
@ 2016-12-15 12:38     ` Guilherme G. Piccoli
  0 siblings, 0 replies; 8+ messages in thread
From: Guilherme G. Piccoli @ 2016-12-15 12:38 UTC (permalink / raw)
  To: Thomas Gleixner, Gavin Shan
  Cc: linux-pci, LKML, linuxppc-dev, Christoph Hellwig, gabriel, bsingharora

On 12/15/2016 07:36 AM, Thomas Gleixner wrote:
> On Thu, 15 Dec 2016, Gavin Shan wrote:
>>> static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
>>> {
>>> -	int n, nodes;
>>> +	int n, nodes = 0;
>>>
>>> 	/* Calculate the number of nodes in the supplied affinity mask */
>>> -	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
>>> +	for_each_online_node(n)
>>> 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
>>> 			node_set(n, *nodemsk);
>>> 			nodes++;
>>> 		}
>>> -	}
>>> +
>>
>> It'd be better to keep the brackets so that we needn't add them when
>> adding more code to the block later.
> 
> Removing the brackets is outright wrong. See:
>   https://marc.info/?l=linux-kernel&m=147351236615103
> 
> I'll fix that up when applying the patch.
> 
> Thanks,
> 
> 	tglx
> 

Thank you all very much for the reviews and comments - lesson learned
about the brackets in multi-line if/for statements!

Thanks for fixing it, Thomas.
Cheers,


Guilherme
