* [PATCH] x86/apic/vector: Fix ordering in vector assignment
@ 2020-12-10 20:18 Thomas Gleixner
  2020-12-10 22:04 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Thomas Gleixner @ 2020-12-10 20:18 UTC
  To: LKML; +Cc: x86, Shung-Hsi Yu, Prarit Bhargava, Ming Lei, Peter Xu

Prarit reported that depending on the affinity setting the

 ' irq $N: Affinity broken due to vector space exhaustion.'

message is showing up in dmesg, but the vector space on the CPUs in the
affinity mask is definitely not exhausted.

Shung-Hsi provided traces and analysis which pinpoints the problem:

The ordering of trying to assign an interrupt vector in
assign_irq_vector_any_locked() is simply wrong if the interrupt data has a
valid node assigned. It does:

 1) Try the intersection of affinity mask and node mask
 2) Try the node mask
 3) Try the full affinity mask
 4) Try the full online mask

Obviously #2 and #3 are in the wrong order as the requested affinity
mask has to take precedence.

In the observed cases #1 failed because the affinity mask did not contain
CPUs from node 0. That made it allocate a vector from node 0, thereby
breaking affinity and emitting the misleading message.

Reverse the order of #2 and #3 so that the full affinity mask, without the
node intersection, is tried before affinity is actually broken.

If no node is assigned, only the full affinity mask is tried and, if that
fails, the full online mask.

Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor")
Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/apic/vector.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -273,20 +273,24 @@ static int assign_irq_vector_any_locked(
 	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
 	int node = irq_data_get_node(irqd);
 
-	if (node == NUMA_NO_NODE)
-		goto all;
-	/* Try the intersection of @affmsk and node mask */
-	cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
-	if (!assign_vector_locked(irqd, vector_searchmask))
-		return 0;
-	/* Try the node mask */
-	if (!assign_vector_locked(irqd, cpumask_of_node(node)))
-		return 0;
-all:
+	if (node != NUMA_NO_NODE) {
+		/* Try the intersection of @affmsk and node mask */
+		cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
+		if (!assign_vector_locked(irqd, vector_searchmask))
+			return 0;
+	}
+
 	/* Try the full affinity mask */
 	cpumask_and(vector_searchmask, affmsk, cpu_online_mask);
 	if (!assign_vector_locked(irqd, vector_searchmask))
 		return 0;
+
+	if (node != NUMA_NO_NODE) {
+		/* Try the node mask */
+		if (!assign_vector_locked(irqd, cpumask_of_node(node)))
+			return 0;
+	}
+
 	/* Try the full online mask */
 	return assign_vector_locked(irqd, cpu_online_mask);
 }


* [tip: x86/urgent] x86/apic/vector: Fix ordering in vector assignment
  2020-12-10 20:18 [PATCH] x86/apic/vector: Fix ordering in vector assignment Thomas Gleixner
@ 2020-12-10 22:04 ` tip-bot2 for Thomas Gleixner
  2020-12-11  7:44 ` [PATCH] " Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-12-10 22:04 UTC
  To: linux-tip-commits
  Cc: Prarit Bhargava, Shung-Hsi Yu, Thomas Gleixner, stable, x86,
	linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     190113b4c6531c8e09b31d5235f9b5175cbb0f72
Gitweb:        https://git.kernel.org/tip/190113b4c6531c8e09b31d5235f9b5175cbb0f72
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 10 Dec 2020 21:18:22 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 10 Dec 2020 23:00:54 +01:00

x86/apic/vector: Fix ordering in vector assignment

Prarit reported that depending on the affinity setting the

 ' irq $N: Affinity broken due to vector space exhaustion.'

message is showing up in dmesg, but the vector space on the CPUs in the
affinity mask is definitely not exhausted.

Shung-Hsi provided traces and analysis which pinpoints the problem:

The ordering of trying to assign an interrupt vector in
assign_irq_vector_any_locked() is simply wrong if the interrupt data has a
valid node assigned. It does:

 1) Try the intersection of affinity mask and node mask
 2) Try the node mask
 3) Try the full affinity mask
 4) Try the full online mask

Obviously #2 and #3 are in the wrong order as the requested affinity
mask has to take precedence.

In the observed cases #1 failed because the affinity mask did not contain
CPUs from node 0. That made it allocate a vector from node 0, thereby
breaking affinity and emitting the misleading message.

Reverse the order of #2 and #3 so that the full affinity mask, without the
node intersection, is tried before affinity is actually broken.

If no node is assigned, only the full affinity mask is tried and, if that
fails, the full online mask.

Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor")
Reported-by: Prarit Bhargava <prarit@redhat.com>
Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87ft4djtyp.fsf@nanos.tec.linutronix.de

---
 arch/x86/kernel/apic/vector.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 1eac536..758bbf2 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -273,20 +273,24 @@ static int assign_irq_vector_any_locked(struct irq_data *irqd)
 	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
 	int node = irq_data_get_node(irqd);
 
-	if (node == NUMA_NO_NODE)
-		goto all;
-	/* Try the intersection of @affmsk and node mask */
-	cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
-	if (!assign_vector_locked(irqd, vector_searchmask))
-		return 0;
-	/* Try the node mask */
-	if (!assign_vector_locked(irqd, cpumask_of_node(node)))
-		return 0;
-all:
+	if (node != NUMA_NO_NODE) {
+		/* Try the intersection of @affmsk and node mask */
+		cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
+		if (!assign_vector_locked(irqd, vector_searchmask))
+			return 0;
+	}
+
 	/* Try the full affinity mask */
 	cpumask_and(vector_searchmask, affmsk, cpu_online_mask);
 	if (!assign_vector_locked(irqd, vector_searchmask))
 		return 0;
+
+	if (node != NUMA_NO_NODE) {
+		/* Try the node mask */
+		if (!assign_vector_locked(irqd, cpumask_of_node(node)))
+			return 0;
+	}
+
 	/* Try the full online mask */
 	return assign_vector_locked(irqd, cpu_online_mask);
 }


* Re: [PATCH] x86/apic/vector: Fix ordering in vector assignment
  2020-12-10 20:18 [PATCH] x86/apic/vector: Fix ordering in vector assignment Thomas Gleixner
  2020-12-10 22:04 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
@ 2020-12-11  7:44 ` Ming Lei
  2020-12-11 12:24 ` Prarit Bhargava
  2020-12-11 13:50 ` Peter Xu
  3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2020-12-11  7:44 UTC
  To: Thomas Gleixner; +Cc: LKML, x86, Shung-Hsi Yu, Prarit Bhargava, Peter Xu

On Thu, Dec 10, 2020 at 09:18:22PM +0100, Thomas Gleixner wrote:
> Prarit reported that depending on the affinity setting the
> 
>  ' irq $N: Affinity broken due to vector space exhaustion.'
> 
> message is showing up in dmesg, but the vector space on the CPUs in the
> affinity mask is definitely not exhausted.
> 
> Shung-Hsi provided traces and analysis which pinpoints the problem:
> 
> The ordering of trying to assign an interrupt vector in
> assign_irq_vector_any_locked() is simply wrong if the interrupt data has a
> valid node assigned. It does:
> 
>  1) Try the intersection of affinity mask and node mask
>  2) Try the node mask
>  3) Try the full affinity mask
>  4) Try the full online mask
> 
> Obviously #2 and #3 are in the wrong order as the requested affinity
> mask has to take precedence.
> 
> In the observed cases #1 failed because the affinity mask did not contain
> CPUs from node 0. That made it allocate a vector from node 0, thereby
> breaking affinity and emitting the misleading message.
> 
> Reverse the order of #2 and #3 so that the full affinity mask, without the
> node intersection, is tried before affinity is actually broken.
> 
> If no node is assigned, only the full affinity mask is tried and, if that
> fails, the full online mask.
> 
> Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor")
> Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Reported-by: Prarit Bhargava <prarit@redhat.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/kernel/apic/vector.c |   24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -273,20 +273,24 @@ static int assign_irq_vector_any_locked(
>  	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
>  	int node = irq_data_get_node(irqd);
>  
> -	if (node == NUMA_NO_NODE)
> -		goto all;
> -	/* Try the intersection of @affmsk and node mask */
> -	cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> -	if (!assign_vector_locked(irqd, vector_searchmask))
> -		return 0;
> -	/* Try the node mask */
> -	if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> -		return 0;
> -all:
> +	if (node != NUMA_NO_NODE) {
> +		/* Try the intersection of @affmsk and node mask */
> +		cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> +		if (!assign_vector_locked(irqd, vector_searchmask))
> +			return 0;
> +	}
> +
>  	/* Try the full affinity mask */
>  	cpumask_and(vector_searchmask, affmsk, cpu_online_mask);
>  	if (!assign_vector_locked(irqd, vector_searchmask))
>  		return 0;
> +
> +	if (node != NUMA_NO_NODE) {
> +		/* Try the node mask */
> +		if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> +			return 0;
> +	}
> +
>  	/* Try the full online mask */
>  	return assign_vector_locked(irqd, cpu_online_mask);
>  }
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>


Thanks,
Ming



* Re: [PATCH] x86/apic/vector: Fix ordering in vector assignment
  2020-12-10 20:18 [PATCH] x86/apic/vector: Fix ordering in vector assignment Thomas Gleixner
  2020-12-10 22:04 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2020-12-11  7:44 ` [PATCH] " Ming Lei
@ 2020-12-11 12:24 ` Prarit Bhargava
  2020-12-11 13:50 ` Peter Xu
  3 siblings, 0 replies; 5+ messages in thread
From: Prarit Bhargava @ 2020-12-11 12:24 UTC
  To: Thomas Gleixner, LKML; +Cc: x86, Shung-Hsi Yu, Ming Lei, Peter Xu



On 12/10/20 3:18 PM, Thomas Gleixner wrote:
> Prarit reported that depending on the affinity setting the
> 
>  ' irq $N: Affinity broken due to vector space exhaustion.'
> 
> message is showing up in dmesg, but the vector space on the CPUs in the
> affinity mask is definitely not exhausted.
> 
> Shung-Hsi provided traces and analysis which pinpoints the problem:
> 
> The ordering of trying to assign an interrupt vector in
> assign_irq_vector_any_locked() is simply wrong if the interrupt data has a
> valid node assigned. It does:
> 
>  1) Try the intersection of affinity mask and node mask
>  2) Try the node mask
>  3) Try the full affinity mask
>  4) Try the full online mask
> 
> Obviously #2 and #3 are in the wrong order as the requested affinity
> mask has to take precedence.
> 
> In the observed cases #1 failed because the affinity mask did not contain
> CPUs from node 0. That made it allocate a vector from node 0, thereby
> breaking affinity and emitting the misleading message.
> 
> Reverse the order of #2 and #3 so that the full affinity mask, without the
> node intersection, is tried before affinity is actually broken.
> 
> If no node is assigned, only the full affinity mask is tried and, if that
> fails, the full online mask.
> 
> Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor")
> Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Reported-by: Prarit Bhargava <prarit@redhat.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/kernel/apic/vector.c |   24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -273,20 +273,24 @@ static int assign_irq_vector_any_locked(
>  	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
>  	int node = irq_data_get_node(irqd);
>  
> -	if (node == NUMA_NO_NODE)
> -		goto all;
> -	/* Try the intersection of @affmsk and node mask */
> -	cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> -	if (!assign_vector_locked(irqd, vector_searchmask))
> -		return 0;
> -	/* Try the node mask */
> -	if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> -		return 0;
> -all:
> +	if (node != NUMA_NO_NODE) {
> +		/* Try the intersection of @affmsk and node mask */
> +		cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> +		if (!assign_vector_locked(irqd, vector_searchmask))
> +			return 0;
> +	}
> +
>  	/* Try the full affinity mask */
>  	cpumask_and(vector_searchmask, affmsk, cpu_online_mask);
>  	if (!assign_vector_locked(irqd, vector_searchmask))
>  		return 0;
> +
> +	if (node != NUMA_NO_NODE) {
> +		/* Try the node mask */
> +		if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> +			return 0;
> +	}
> +
>  	/* Try the full online mask */
>  	return assign_vector_locked(irqd, cpu_online_mask);
>  }
> 

Tested-and-Reviewed-by: Prarit Bhargava <prarit@redhat.com>

P.



* Re: [PATCH] x86/apic/vector: Fix ordering in vector assignment
  2020-12-10 20:18 [PATCH] x86/apic/vector: Fix ordering in vector assignment Thomas Gleixner
                   ` (2 preceding siblings ...)
  2020-12-11 12:24 ` Prarit Bhargava
@ 2020-12-11 13:50 ` Peter Xu
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Xu @ 2020-12-11 13:50 UTC
  To: Thomas Gleixner; +Cc: LKML, x86, Shung-Hsi Yu, Prarit Bhargava, Ming Lei

On Thu, Dec 10, 2020 at 09:18:22PM +0100, Thomas Gleixner wrote:
> Prarit reported that depending on the affinity setting the
> 
>  ' irq $N: Affinity broken due to vector space exhaustion.'
> 
> message is showing up in dmesg, but the vector space on the CPUs in the
> affinity mask is definitely not exhausted.
> 
> Shung-Hsi provided traces and analysis which pinpoints the problem:
> 
> The ordering of trying to assign an interrupt vector in
> assign_irq_vector_any_locked() is simply wrong if the interrupt data has a
> valid node assigned. It does:
> 
>  1) Try the intersection of affinity mask and node mask
>  2) Try the node mask
>  3) Try the full affinity mask
>  4) Try the full online mask
> 
> Obviously #2 and #3 are in the wrong order as the requested affinity
> mask has to take precedence.
> 
> In the observed cases #1 failed because the affinity mask did not contain
> CPUs from node 0. That made it allocate a vector from node 0, thereby
> breaking affinity and emitting the misleading message.
> 
> Reverse the order of #2 and #3 so that the full affinity mask, without the
> node intersection, is tried before affinity is actually broken.
> 
> If no node is assigned, only the full affinity mask is tried and, if that
> fails, the full online mask.
> 
> Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor")
> Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Reported-by: Prarit Bhargava <prarit@redhat.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu

