All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] parisc: Filter out spurious interrupts in PA-RISC irq handler
@ 2015-09-03 20:45 Helge Deller
  0 siblings, 0 replies; 4+ messages in thread
From: Helge Deller @ 2015-09-03 20:45 UTC (permalink / raw)
  To: linux-parisc, James Bottomley

When detecting a serial port on newer PA-RISC machines (with iosapic) we have a
long way to go to find the right IRQ line, registering it, then registering the
serial port and the irq handler for the serial port. During this phase spurious
interrupts for the serial port may happen which then crashes the kernel because
the action handler might not have been set up yet.

So, basically it's a race condition between the serial port hardware and the
CPU which sets up the necessary fields in the irq sructs. The main reason for
this race is, that we unmask the serial port irqs too early without having set
up everything properly before (which isn't easily possible because we need the
IRQ number to register the serial ports).

This patch is a work-around for this problem. It adds checks to the CPU irq
handler to verify if the IRQ action field has been initialized already. If not,
we just skip this interrupt (which isn't critical for a serial port at bootup).
The real fix would probably involve rewriting all PA-RISC specific IRQ code
(for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of
the irq chips and proper irq enabling along this line.

This bug has been in the PA-RISC port since the beginning, but the crashes
happened very rarely with currently used hardware.  But on the latest machine
which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900,
1GHz) and which has the largest possible L1 cache size (64MB each), the kernel
crashed at every boot because of this race. So, without this patch the machine
would currently be unuseable.

For the record, here is the flow logic:
1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
5. serial_init_chip() then registers the 8250 port.
Problems:
- In step 4 the CPU irq shouldn't have been registered yet, but after step 5
- If serial irq happens between 4 and 5 have finished, the kernel will crash

Cc: <stable@vger.kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c
index f3191db..c0eab24 100644
--- a/arch/parisc/kernel/irq.c
+++ b/arch/parisc/kernel/irq.c
@@ -507,8 +507,8 @@ void do_cpu_irq_mask(struct pt_regs *regs)
 	struct pt_regs *old_regs;
 	unsigned long eirr_val;
 	int irq, cpu = smp_processor_id();
-#ifdef CONFIG_SMP
 	struct irq_desc *desc;
+#ifdef CONFIG_SMP
 	cpumask_t dest;
 #endif
 
@@ -521,8 +521,12 @@ void do_cpu_irq_mask(struct pt_regs *regs)
 		goto set_out;
 	irq = eirr_to_irq(eirr_val);
 
-#ifdef CONFIG_SMP
+	/* Filter out spurious interrupts, mostly from serial port at bootup */
 	desc = irq_to_desc(irq);
+	if (unlikely(!desc->action))
+		goto set_out;
+
+#ifdef CONFIG_SMP
 	cpumask_copy(&dest, desc->irq_data.affinity);
 	if (irqd_is_per_cpu(&desc->irq_data) &&
 	    !cpumask_test_cpu(smp_processor_id(), &dest)) {

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] parisc: Filter out spurious interrupts in PA-RISC irq handler
  2015-09-09 18:41 Helge Deller
  2015-09-30  9:44 ` Luis Henriques
@ 2015-10-08 22:19 ` Ben Hutchings
  1 sibling, 0 replies; 4+ messages in thread
From: Ben Hutchings @ 2015-10-08 22:19 UTC (permalink / raw)
  To: Helge Deller, stable

[-- Attachment #1: Type: text/plain, Size: 465 bytes --]

On Wed, 2015-09-09 at 20:41 +0200, Helge Deller wrote:
> This is a backport of upstream commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44,
> which sadly can't easily be cherry-picked because of other conflicting patches
> which have been applied by Linus in the meantime...
> 
> Can you please apply it to all stable kernels?
[...]

Queued up for 3.2, thanks.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] parisc: Filter out spurious interrupts in PA-RISC irq handler
  2015-09-09 18:41 Helge Deller
@ 2015-09-30  9:44 ` Luis Henriques
  2015-10-08 22:19 ` Ben Hutchings
  1 sibling, 0 replies; 4+ messages in thread
From: Luis Henriques @ 2015-09-30  9:44 UTC (permalink / raw)
  To: Helge Deller; +Cc: stable

On Wed, Sep 09, 2015 at 08:41:57PM +0200, Helge Deller wrote:
> This is a backport of upstream commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44,
> which sadly can't easily be cherry-picked because of other conflicting patches
> which have been applied by Linus in the meantime...
> 
> Can you please apply it to all stable kernels?
>

Thank you, I'm queuing this backport for the 3.16 kernel.

Cheers,
--
Lu�s


> Thanks,
> Helge
> 
> 
> From: Helge Deller <deller@gmx.de>
> Subject: parisc: Filter out spurious interrupts in PA-RISC irq handler
> 
> When detecting a serial port on newer PA-RISC machines (with iosapic) we have a
> long way to go to find the right IRQ line, registering it, then registering the
> serial port and the irq handler for the serial port. During this phase spurious
> interrupts for the serial port may happen which then crashes the kernel because
> the action handler might not have been set up yet.
> 
> So, basically it's a race condition between the serial port hardware and the
> CPU which sets up the necessary fields in the irq sructs. The main reason for
> this race is, that we unmask the serial port irqs too early without having set
> up everything properly before (which isn't easily possible because we need the
> IRQ number to register the serial ports).
> 
> This patch is a work-around for this problem. It adds checks to the CPU irq
> handler to verify if the IRQ action field has been initialized already. If not,
> we just skip this interrupt (which isn't critical for a serial port at bootup).
> The real fix would probably involve rewriting all PA-RISC specific IRQ code
> (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of
> the irq chips and proper irq enabling along this line.
> 
> This bug has been in the PA-RISC port since the beginning, but the crashes
> happened very rarely with currently used hardware.  But on the latest machine
> which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900,
> 1GHz) and which has the largest possible L1 cache size (64MB each), the kernel
> crashed at every boot because of this race. So, without this patch the machine
> would currently be unuseable.
> 
> For the record, here is the flow logic:
> 1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
> 2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
> 3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
> 4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
> 5. serial_init_chip() then registers the 8250 port.
> Problems:
> - In step 4 the CPU irq shouldn't have been registered yet, but after step 5
> - If serial irq happens between 4 and 5 have finished, the kernel will crash
> 
> Cc: linux-parisc@vger.kernel.org
> Signed-off-by: Helge Deller <deller@gmx.de>
> 
> ---
> 
>  arch/parisc/kernel/irq.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c
> index f3191db..c0eab24 100644
> --- a/arch/parisc/kernel/irq.c
> +++ b/arch/parisc/kernel/irq.c
> @@ -507,8 +507,8 @@ void do_cpu_irq_mask(struct pt_regs *regs)
>  	struct pt_regs *old_regs;
>  	unsigned long eirr_val;
>  	int irq, cpu = smp_processor_id();
> -#ifdef CONFIG_SMP
>  	struct irq_desc *desc;
> +#ifdef CONFIG_SMP
>  	cpumask_t dest;
>  #endif
>  
> @@ -521,8 +521,12 @@ void do_cpu_irq_mask(struct pt_regs *regs)
>  		goto set_out;
>  	irq = eirr_to_irq(eirr_val);
>  
> -#ifdef CONFIG_SMP
> +	/* Filter out spurious interrupts, mostly from serial port at bootup */
>  	desc = irq_to_desc(irq);
> +	if (unlikely(!desc->action))
> +		goto set_out;
> +
> +#ifdef CONFIG_SMP
>  	cpumask_copy(&dest, desc->irq_data.affinity);
>  	if (irqd_is_per_cpu(&desc->irq_data) &&
>  	    !cpumask_test_cpu(smp_processor_id(), &dest)) {
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] parisc: Filter out spurious interrupts in PA-RISC irq handler
@ 2015-09-09 18:41 Helge Deller
  2015-09-30  9:44 ` Luis Henriques
  2015-10-08 22:19 ` Ben Hutchings
  0 siblings, 2 replies; 4+ messages in thread
From: Helge Deller @ 2015-09-09 18:41 UTC (permalink / raw)
  To: stable

This is a backport of upstream commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44,
which sadly can't easily be cherry-picked because of other conflicting patches
which have been applied by Linus in the meantime...

Can you please apply it to all stable kernels?

Thanks,
Helge


From: Helge Deller <deller@gmx.de>
Subject: parisc: Filter out spurious interrupts in PA-RISC irq handler

When detecting a serial port on newer PA-RISC machines (with iosapic) we have a
long way to go to find the right IRQ line, registering it, then registering the
serial port and the irq handler for the serial port. During this phase spurious
interrupts for the serial port may happen which then crashes the kernel because
the action handler might not have been set up yet.

So, basically it's a race condition between the serial port hardware and the
CPU which sets up the necessary fields in the irq sructs. The main reason for
this race is, that we unmask the serial port irqs too early without having set
up everything properly before (which isn't easily possible because we need the
IRQ number to register the serial ports).

This patch is a work-around for this problem. It adds checks to the CPU irq
handler to verify if the IRQ action field has been initialized already. If not,
we just skip this interrupt (which isn't critical for a serial port at bootup).
The real fix would probably involve rewriting all PA-RISC specific IRQ code
(for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of
the irq chips and proper irq enabling along this line.

This bug has been in the PA-RISC port since the beginning, but the crashes
happened very rarely with currently used hardware.  But on the latest machine
which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900,
1GHz) and which has the largest possible L1 cache size (64MB each), the kernel
crashed at every boot because of this race. So, without this patch the machine
would currently be unuseable.

For the record, here is the flow logic:
1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
5. serial_init_chip() then registers the 8250 port.
Problems:
- In step 4 the CPU irq shouldn't have been registered yet, but after step 5
- If serial irq happens between 4 and 5 have finished, the kernel will crash

Cc: linux-parisc@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>

---

 arch/parisc/kernel/irq.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)


diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c
index f3191db..c0eab24 100644
--- a/arch/parisc/kernel/irq.c
+++ b/arch/parisc/kernel/irq.c
@@ -507,8 +507,8 @@ void do_cpu_irq_mask(struct pt_regs *regs)
 	struct pt_regs *old_regs;
 	unsigned long eirr_val;
 	int irq, cpu = smp_processor_id();
-#ifdef CONFIG_SMP
 	struct irq_desc *desc;
+#ifdef CONFIG_SMP
 	cpumask_t dest;
 #endif
 
@@ -521,8 +521,12 @@ void do_cpu_irq_mask(struct pt_regs *regs)
 		goto set_out;
 	irq = eirr_to_irq(eirr_val);
 
-#ifdef CONFIG_SMP
+	/* Filter out spurious interrupts, mostly from serial port at bootup */
 	desc = irq_to_desc(irq);
+	if (unlikely(!desc->action))
+		goto set_out;
+
+#ifdef CONFIG_SMP
 	cpumask_copy(&dest, desc->irq_data.affinity);
 	if (irqd_is_per_cpu(&desc->irq_data) &&
 	    !cpumask_test_cpu(smp_processor_id(), &dest)) {

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-10-08 22:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-03 20:45 [PATCH] parisc: Filter out spurious interrupts in PA-RISC irq handler Helge Deller
2015-09-09 18:41 Helge Deller
2015-09-30  9:44 ` Luis Henriques
2015-10-08 22:19 ` Ben Hutchings

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.