linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86: Add irq spillover warning
@ 2019-07-16 13:59 Neil Horman
  2019-07-16 15:57 ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Neil Horman @ 2019-07-16 13:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Neil Horman, djuran, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86

On Intel hardware, cpus are limited in the number of irqs they can
have affined to them (currently 240), based on section 10.5.2 of:
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf

If a cpu has more than this number of interrupts affined to it, they
will spill over to other cpus, which potentially may be outside of their
affinity mask.  Given that this might cause unexpected behavior on
performance sensitive systems, warn the user should this condition occur
so that corrective action can be taken

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: djuran@redhat.com
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: x86@kernel.org
---
 arch/x86/kernel/irq.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 9b68b5b00ac9..ac7ed32de3d5 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -244,6 +244,14 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
 
 	desc = __this_cpu_read(vector_irq[vector]);
 
+	/*
+	 * Intel processors are limited in the number of irqs they can address. If we affine
+	 * too many irqs to a given cpu, they can silently spill to another cpu outside of
+	 * their affinity mask. Warn the user when this occurs
+	 */
+	if (unlikely(!cpumask_test_cpu(smp_processor_id(), desc->irq_common_data.affinity)))
+		pr_emerg_ratelimited("%s: %d.%d handled outside of affinity mask\n", __func__, smp_processor_id(), vector);
+
 	if (!handle_irq(desc, regs)) {
 		ack_APIC_irq();
 
-- 
2.21.0
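
To make the failure mode in the changelog concrete, here is a toy userspace
model of the allocation fallback being described. It is purely illustrative:
the cpu count, pool size and function names are invented, and the real
allocator lives in arch/x86/kernel/apic/vector.c.

/*
 * Toy model, not kernel code: each cpu has a small fixed pool of vectors,
 * and once every cpu in an irq's affinity mask is full the allocator falls
 * back to any cpu with a free vector, i.e. the irq ends up being handled
 * outside its affinity mask.
 */
#include <stdio.h>

#define NR_CPUS		4
#define VECTORS_PER_CPU	3	/* tiny on purpose; real hardware has ~200 usable */

static int used[NR_CPUS];

/* Return the cpu the irq landed on, or -1 if every pool is exhausted. */
static int assign_vector(unsigned int affinity_mask)
{
	int cpu;

	/* First try the cpus the user asked for. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if ((affinity_mask & (1u << cpu)) && used[cpu] < VECTORS_PER_CPU) {
			used[cpu]++;
			return cpu;
		}
	}

	/* Affinity mask exhausted: fall back to any cpu with room. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (used[cpu] < VECTORS_PER_CPU) {
			used[cpu]++;
			return cpu;
		}
	}

	return -1;
}

int main(void)
{
	unsigned int mask = 0x1;	/* pin every irq to cpu 0 */
	int irq, cpu;

	for (irq = 0; irq < 5; irq++) {
		cpu = assign_vector(mask);
		if (cpu >= 0 && !(mask & (1u << cpu)))
			printf("irq %d: handled on cpu %d, outside affinity mask 0x%x\n",
			       irq, cpu, mask);
	}
	return 0;
}

On a live system the same condition can be spotted by comparing
/proc/irq/<n>/smp_affinity with /proc/irq/<n>/effective_affinity.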



* Re: [PATCH] x86: Add irq spillover warning
  2019-07-16 13:59 [PATCH] x86: Add irq spillover warning Neil Horman
@ 2019-07-16 15:57 ` Thomas Gleixner
  2019-07-16 16:07   ` Neil Horman
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2019-07-16 15:57 UTC (permalink / raw)
  To: Neil Horman
  Cc: linux-kernel, djuran, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86

Neil,

On Tue, 16 Jul 2019, Neil Horman wrote:

> On Intel hardware, cpus are limited in the number of irqs they can
> have affined to them (currently 240), based on section 10.5.2 of:
> https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf

That reference is really not useful for explaining the problem, and neither
is the number of vectors. Please explain the conceptual issue.
 
> If a cpu has more than this number of interrupts affined to it, they
> will spill over to other cpus, which potentially may be outside of their
> affinity mask.

Spill over?

The kernel decides to pick a vector on a CPU outside of the affinity when
it runs out of vectors on the CPUs in the affinity mask.

Please explain issues in a technically correct way.

> Given that this might cause unexpected behavior on
> performance sensitive systems, warn the user should this condition occur
> so that corrective action can be taken

> @@ -244,6 +244,14 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)

Why on earth warn in the interrupt delivery hotpath? Just because it's the
place which really needs extra instructions and extra cache lines on
performance sensitive systems, right?

The fact that the kernel ran out of vectors for the CPUs in the affinity
mask is already known when the vector is allocated in activate_reserved().

So there is an obvious place to put such a warning and it's certainly not
do_IRQ().
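
A minimal sketch of what a check at that point could look like, assuming the
generic affinity accessors from include/linux/irq.h and a call site in the
vector activation path in arch/x86/kernel/apic/vector.c; the helper name and
exact placement are illustrative, not a finished implementation:

/*
 * Sketch only: if the vector the allocator had to pick is not on a cpu in
 * the user supplied affinity mask, the effective affinity is no longer a
 * subset of the requested one.
 */
static void warn_on_broken_affinity(struct irq_data *irqd)
{
	if (cpumask_subset(irq_data_get_effective_affinity_mask(irqd),
			   irq_data_get_affinity_mask(irqd)))
		return;

	pr_warn_ratelimited("irq %u: affinity broken due to vector space exhaustion\n",
			    irqd->irq);
}

Since vector activation only happens when an interrupt is set up, restarted or
has its affinity changed, a check there stays out of the delivery hotpath.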

Thanks,

	tglx


* Re: [PATCH] x86: Add irq spillover warning
  2019-07-16 15:57 ` Thomas Gleixner
@ 2019-07-16 16:07   ` Neil Horman
  2019-07-16 19:05     ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Neil Horman @ 2019-07-16 16:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, djuran, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86

On Tue, Jul 16, 2019 at 05:57:31PM +0200, Thomas Gleixner wrote:
> Neil,
> 
> On Tue, 16 Jul 2019, Neil Horman wrote:
> 
> > On Intel hardware, cpus are limited in the number of irqs they can
> > have affined to them (currently 240), based on section 10.5.2 of:
> > https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf
> 
> That reference is really not useful for explaining the problem, and neither
> is the number of vectors. Please explain the conceptual issue.
>  
You seem to have already done that below.  Not really sure what more you are
asking for here.

> > If a cpu has more than this number of interrupts affined to it, they
> > will spill over to other cpus, which potentially may be outside of their
> > affinity mask.
> 
> Spill over?
> 
> The kernel decides to pick a vector on a CPU outside of the affinity when
> it runs out of vectors on the CPUs in the affinity mask.
> 
Yes.

> Please explain issues in a technically correct way.
> 
I don't know what you mean by this.  I explained it above, and you clearly
understood it.

> > Given that this might cause unexpected behavior on
> > performance sensitive systems, warn the user should this condition occur
> > so that corrective action can be taken
> 
> > @@ -244,6 +244,14 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
> 
> Why on earth warn in the interrupt delivery hotpath? Just because it's the
> place which really needs extra instructions and extra cache lines on
> performance sensitive systems, right?
> 
Because there's already a check of the same variety in do_IRQ, but if the
information is available outside the hotpath, I was unaware, and am happy to
update this patch to reflect that.

> The fact that the kernel ran out of vectors for the CPUs in the affinity
> mask is already known when the vector is allocated in activate_reserved().
> 
> So there is an obvious place to put such a warning and it's certainly not
> do_IRQ().
> 
Sure

Thanks
Neil

> Thanks,
> 
> 	tglx
> 


* Re: [PATCH] x86: Add irq spillover warning
  2019-07-16 16:07   ` Neil Horman
@ 2019-07-16 19:05     ` Thomas Gleixner
  2019-07-16 20:39       ` Neil Horman
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2019-07-16 19:05 UTC (permalink / raw)
  To: Neil Horman
  Cc: linux-kernel, djuran, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86

Neil,

On Tue, 16 Jul 2019, Neil Horman wrote:
> On Tue, Jul 16, 2019 at 05:57:31PM +0200, Thomas Gleixner wrote:
> > On Tue, 16 Jul 2019, Neil Horman wrote:
> > > If a cpu has more than this number of interrupts affined to it, they
> > > will spill over to other cpus, which potentially may be outside of their
> > > affinity mask.
> > 
> > Spill over?
> > 
> > The kernel decides to pick a vector on a CPU outside of the affinity when
> > it runs out of vectors on the CPUs in the affinity mask.
> > 
> Yes.
> 
> > Please explain issues in a technically correct way.
> > 
> I don't know what you mean by this.  I explained it above, and you clearly
> understood it.

It took me a while to grok it, simply because I first thought it was some
hardware issue. And of course after the confusion settled I knew what it was,
but only because I know that code like the back of my hand.

> > > Given that this might cause unexpected behavior on
> > > performance sensitive systems, warn the user should this condition occur
> > > so that corrective action can be taken
> > 
> > > @@ -244,6 +244,14 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
> > 
> > Why on earth warn in the interrupt delivery hotpath? Just because it's the
> > place which really needs extra instructions and extra cache lines on
> > performance sensitive systems, right?
> > 
> Because there's already a check of the same variety in do_IRQ, but if the
> information is available outside the hotpath, I was unaware, and am happy to
> update this patch to reflect that.

Which check are you referring to?

Thanks,

	tglx


* Re: [PATCH] x86: Add irq spillover warning
  2019-07-16 19:05     ` Thomas Gleixner
@ 2019-07-16 20:39       ` Neil Horman
  2019-07-16 21:06         ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Neil Horman @ 2019-07-16 20:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, djuran, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86

On Tue, Jul 16, 2019 at 09:05:30PM +0200, Thomas Gleixner wrote:
> Neil,
> 
> On Tue, 16 Jul 2019, Neil Horman wrote:
> > On Tue, Jul 16, 2019 at 05:57:31PM +0200, Thomas Gleixner wrote:
> > > On Tue, 16 Jul 2019, Neil Horman wrote:
> > > > If a cpu has more than this number of interrupts affined to it, they
> > > > will spill over to other cpus, which potentially may be outside of their
> > > > affinity mask.
> > > 
> > > Spill over?
> > > 
> > > The kernel decides to pick a vector on a CPU outside of the affinity when
> > > it runs out of vectors on the CPUs in the affinity mask.
> > > 
> > Yes.
> > 
> > > Please explain issues in a technically correct way.
> > > 
> > I don't know what you mean by this.  I explained it above, and you clearly
> > understood it.
> 
> It took me a while to grok it, simply because I first thought it was some
> hardware issue. And of course after the confusion settled I knew what it was,
> but only because I know that code like the back of my hand.
> 
> > > > Given that this might cause unexpected behavior on
> > > > performance sensitive systems, warn the user should this condition occur
> > > > so that corrective action can be taken
> > > 
> > > > @@ -244,6 +244,14 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
> > > 
> > > Why on earth warn in the interrupt delivery hotpath? Just because it's the
> > > place which really needs extra instructions and extra cache lines on
> > > performance sensitive systems, right?
> > > 
> > Because there's already a check of the same variety in do_IRQ, but if the
> > information is available outside the hotpath, I was unaware, and am happy to
> > update this patch to reflect that.
> 
> Which check are you referring to?
> 
This one:
	if (desc != VECTOR_RETRIGGERED) {
		pr_emerg_ratelimited("%s: %d.%d No irq handler for vector\n",
				     __func__, smp_processor_id(),
				     vector);

I figured that since it was already checking one condition, another wouldn't
hurt too much. But no worries, I'm redoing this in activate_reserved() now.

Best
Neil

> Thanks,
> 
> 	tglx
> 


* Re: [PATCH] x86: Add irq spillover warning
  2019-07-16 20:39       ` Neil Horman
@ 2019-07-16 21:06         ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2019-07-16 21:06 UTC (permalink / raw)
  To: Neil Horman
  Cc: linux-kernel, djuran, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86

Neil,

On Tue, 16 Jul 2019, Neil Horman wrote:
> On Tue, Jul 16, 2019 at 09:05:30PM +0200, Thomas Gleixner wrote:
> > > Because there's already a check of the same variety in do_IRQ, but if the
> > > information is available outside the hotpath, I was unaware, and am happy to
> > > update this patch to reflect that.
> > 
> > Which check are you referring to?
> > 
> This one:
> if (desc != VECTOR_RETRIGGERED) {
>                         pr_emerg_ratelimited("%s: %d.%d No irq handler for vector\n",
>                                              __func__, smp_processor_id(),
>                                              vector);
> I figured that since it was already checking one condition, another wouldn't
> hurt too much. But no worries, I'm redoing this in activate_reserved() now.

That's in the slow path, when handle_irq() has failed, which is the unlikely case.
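
For context, a condensed view of that part of do_IRQ() (simplified, not a
verbatim quote of the tree):

	desc = __this_cpu_read(vector_irq[vector]);

	if (!handle_irq(desc, regs)) {
		/* Slow path: no handler ran for this vector. */
		ack_APIC_irq();

		if (desc != VECTOR_RETRIGGERED)
			pr_emerg_ratelimited("%s: %d.%d No irq handler for vector\n",
					     __func__, smp_processor_id(), vector);
		else
			__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
	}

The quoted pr_emerg_ratelimited() only runs when handle_irq() has returned
false, whereas the proposed affinity test would have run on every interrupt.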

Thanks,

	tglx

