All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  1:53 Wang Sheng-Hui
  2012-05-03  2:15   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  1:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Milton Miller, Grant Likely,
	Stephen Rothwell, Anton Blanchard, linuxppc-dev, linux-kernel

local_paca->irq_happened may be changed asychronously. 

In my test env (IBM Power 9117-MMA), I installed the RHEL6.2 with the shipped
oprofile. Then I run into kernel v3.4-rc4, setup/start oprofile and start the
LTP test suite.

In a short while, the system would crash. Seems that oprofile may change
the irq_happened.
======================================================================
      KERNEL: /boot/vmlinux-3.4.0-rc4-00104-gaf3a3ab
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 10
        DATE: Fri Apr 27 18:54:34 2012
      UPTIME: 00:02:34
LOAD AVERAGE: 0.60, 0.27, 0.10
       TASKS: 369
    NODENAME: feastlp3.upt.austin.ibm.com
     RELEASE: 3.4.0-rc4-00104-gaf3a3ab
     VERSION: #4 SMP Fri Apr 27 03:13:43 CDT 2012
     MACHINE: ppc64  (4704 Mhz)
      MEMORY: 9.8 GB
       PANIC: "kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!"
         PID: 0
     COMMAND: "swapper/4"
        TASK: c0000002694e3cc0  (1 of 10)  [THREAD_INFO: c0000002694f8000]
         CPU: 4
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 0      TASK: c0000002694e3cc0  CPU: 4   COMMAND: "swapper/4"
 #0 [c00000026ffcb6e0] .crash_kexec at c0000000000f22e8
 #1 [c00000026ffcb8e0] .oops_end at c00000000060aed8
 #2 [c00000026ffcb980] ._exception at c000000000020900
 #3 [c00000026ffcbb40] program_check_common at c0000000000053b4
 Breakpoint trap  [700] exception frame:
 R0:  0000000000000001    R1:  c00000026ffcbe30    R2:  c000000000edd170   
 R3:  0000000000000500    R4:  0000000000000000    R5:  00000000000007fd   
 R6:  000000000124a180    R7:  003450cf9bd1233b    R8:  0000000000940000   
 R9:  c000000003400c00    R10: 0000000000000001    R11: 0000000000000000   
 R12: 0000000000000002    R13: c000000003400c00    R14: c0000002694fbf90   
 R15: 0000000002000040    R16: 0000000000000004    R17: 0000000000000000   
 R18: 0000000000000000    R19: 0000000000000000    R20: c000000000f42100   
 R21: 0000000000000000    R22: c000000000955b80    R23: c000000000955b80   
 R24: 000000000000000a    R25: 0000000000000004    R26: c0000002694f8100   
 R27: c00000026ffc8000    R28: 0000000000000000    R29: c000000000f42100   
 R30: c000000000e60810    R31: 0000000000000040   
 NIP: c00000000000ea9c    MSR: 8000000000029032    OR3: c00000000000ea3c
 CTR: c000000000063e40    LR:  c000000000010578    XER: 0000000000000000
 CCR: 0000000028000048    MQ:  0000000000000000    DAR: c000000001295d00
 DSISR: 0000000000000000     Syscall Result: 0000000000000000

 #4 [c00000026ffcbe30] .__check_irq_replay at c00000000000ea9c
 [Link Register ]  [c00000026ffcbe30] .arch_local_irq_restore at c000000000010578
 #5 [c00000026ffcbea0] .__do_softirq at c000000000085724
 #6 [c00000026ffcbf90] .call_do_softirq at c000000000022928
 #7 [c0000002694fb8d0] .do_softirq at c0000000000106c8
 #8 [c0000002694fb970] .irq_exit at c000000000085414
 #9 [c0000002694fb9f0] .do_IRQ at c0000000000100a4
#10 [c0000002694fbab0] hardware_interrupt_common at c0000000000038c0
 Hardware Interrupt  [501] exception frame:
 R0:  0000000000000001    R1:  c0000002694fbda0    R2:  c000000000edd170   
 R3:  0000000000000000    R4:  0000000000000000    R5:  0000000000000000   
 R6:  00000000000000e0    R7:  003450cf9bd1233b    R8:  0000000000940000   
 R9:  ffffffffffffffff    R10: 0000000000243694    R11: 0000000000000001   
 R12: 0000000000000002    R13: c000000003400c00   
 NIP: c0000000000105b4    MSR: 8000000000009032    OR3: 0000000000000c00
 CTR: c0000000004de3a0    LR:  c0000000000105b4    XER: 0000000000000000
 CCR: 0000000044000044    MQ:  0000000000000001    DAR: c0000000012990b0
 DSISR: c0000002694fbce0     Syscall Result: 0000000000000000

#11 [c0000002694fbda0] .arch_local_irq_restore at c0000000000105b4  (unreliable)
#12 [c0000002694fbe10] .cpu_idle at c000000000017d20
#13 [c0000002694fbed0] .start_secondary at c00000000061a934
#14 [c0000002694fbf90] .start_secondary_prolog at c00000000000936c


Use local var instead of local_paca->irq_happened directly in this function here.

Please check this patch. Any comments are welcome.


Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
---
 arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
 1 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 5ec1b23..3d48b23 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
  */
 notrace unsigned int __check_irq_replay(void)
 {
+	unsigned int ret_val;
 	/*
 	 * We use local_paca rather than get_paca() to avoid all
 	 * the debug_smp_processor_id() business in this low level
 	 * function
 	 */
-	unsigned char happened = local_paca->irq_happened;
+	unsigned char happened, irq_happened;
+	happened = irq_happened = local_paca->irq_happened;
 
 	/* Clear bit 0 which we wouldn't clear otherwise */
-	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
+	irq_happened &= ~PACA_IRQ_HARD_DIS;
 
 	/*
 	 * Force the delivery of pending soft-disabled interrupts on PS3.
@@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
 	 * decrementer itself rather than the paca irq_happened field
 	 * in case we also had a rollover while hard disabled
 	 */
-	local_paca->irq_happened &= ~PACA_IRQ_DEC;
-	if (decrementer_check_overflow())
-		return 0x900;
+	irq_happened &= ~PACA_IRQ_DEC;
+	if (decrementer_check_overflow()) {
+		ret_val = 0x900;
+		goto replay;
+	}
 
 	/* Finally check if an external interrupt happened */
-	local_paca->irq_happened &= ~PACA_IRQ_EE;
-	if (happened & PACA_IRQ_EE)
-		return 0x500;
+	irq_happened &= ~PACA_IRQ_EE;
+	if (happened & PACA_IRQ_EE) {
+		ret_val = 0x500;
+		goto replay;
+	}
 
 #ifdef CONFIG_PPC_BOOK3E
 	/* Finally check if an EPR external interrupt happened
 	 * this bit is typically set if we need to handle another
 	 * "edge" interrupt from within the MPIC "EPR" handler
 	 */
-	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
-	if (happened & PACA_IRQ_EE_EDGE)
-		return 0x500;
+	irq_happened &= ~PACA_IRQ_EE_EDGE;
+	if (happened & PACA_IRQ_EE_EDGE) {
+		ret_val = 0x500;
+		goto replay;
+	}
 
-	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
-	if (happened & PACA_IRQ_DBELL)
-		return 0x280;
+	irq_happened &= ~PACA_IRQ_DBELL;
+	if (happened & PACA_IRQ_DBELL) {
+		ret_val = 0x280;
+		goto replay;
+	}
 #endif /* CONFIG_PPC_BOOK3E */
 
 	/* There should be nothing left ! */
-	BUG_ON(local_paca->irq_happened != 0);
+	BUG_ON(irq_happened != 0);
+	ret_val = 0;
 
-	return 0;
+replay:
+	local_paca->irq_happened = irq_happened;
+
+	return ret_val;
 }
 
 notrace void arch_local_irq_restore(unsigned long en)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  1:53 [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay Wang Sheng-Hui
@ 2012-05-03  2:15   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  2:15 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On Thu, 2012-05-03 at 09:53 +0800, Wang Sheng-Hui wrote:
> local_paca->irq_happened may be changed asychronously. 
> 
> In my test env (IBM Power 9117-MMA), I installed the RHEL6.2 with the shipped
> oprofile. Then I run into kernel v3.4-rc4, setup/start oprofile and start the
> LTP test suite.
> 
> In a short while, the system would crash. Seems that oprofile may change
> the irq_happened.

 .../...

> Use local var instead of local_paca->irq_happened directly in this function here.
> 
> Please check this patch. Any comments are welcome.

It should not as __check_irq_replay() should always be called
with interrupts hard disabled... Do you see any code path
where that is not the case ?

Cheers,
Ben.

> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
> ---
>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>  1 files changed, 30 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5ec1b23..3d48b23 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>   */
>  notrace unsigned int __check_irq_replay(void)
>  {
> +	unsigned int ret_val;
>  	/*
>  	 * We use local_paca rather than get_paca() to avoid all
>  	 * the debug_smp_processor_id() business in this low level
>  	 * function
>  	 */
> -	unsigned char happened = local_paca->irq_happened;
> +	unsigned char happened, irq_happened;
> +	happened = irq_happened = local_paca->irq_happened;
>  
>  	/* Clear bit 0 which we wouldn't clear otherwise */
> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>  
>  	/*
>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>  	 * decrementer itself rather than the paca irq_happened field
>  	 * in case we also had a rollover while hard disabled
>  	 */
> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> -	if (decrementer_check_overflow())
> -		return 0x900;
> +	irq_happened &= ~PACA_IRQ_DEC;
> +	if (decrementer_check_overflow()) {
> +		ret_val = 0x900;
> +		goto replay;
> +	}
>  
>  	/* Finally check if an external interrupt happened */
> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
> -	if (happened & PACA_IRQ_EE)
> -		return 0x500;
> +	irq_happened &= ~PACA_IRQ_EE;
> +	if (happened & PACA_IRQ_EE) {
> +		ret_val = 0x500;
> +		goto replay;
> +	}
>  
>  #ifdef CONFIG_PPC_BOOK3E
>  	/* Finally check if an EPR external interrupt happened
>  	 * this bit is typically set if we need to handle another
>  	 * "edge" interrupt from within the MPIC "EPR" handler
>  	 */
> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
> -	if (happened & PACA_IRQ_EE_EDGE)
> -		return 0x500;
> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
> +	if (happened & PACA_IRQ_EE_EDGE) {
> +		ret_val = 0x500;
> +		goto replay;
> +	}
>  
> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
> -	if (happened & PACA_IRQ_DBELL)
> -		return 0x280;
> +	irq_happened &= ~PACA_IRQ_DBELL;
> +	if (happened & PACA_IRQ_DBELL) {
> +		ret_val = 0x280;
> +		goto replay;
> +	}
>  #endif /* CONFIG_PPC_BOOK3E */
>  
>  	/* There should be nothing left ! */
> -	BUG_ON(local_paca->irq_happened != 0);
> +	BUG_ON(irq_happened != 0);
> +	ret_val = 0;
>  
> -	return 0;
> +replay:
> +	local_paca->irq_happened = irq_happened;
> +
> +	return ret_val;
>  }
>  
>  notrace void arch_local_irq_restore(unsigned long en)



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  2:15   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  2:15 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On Thu, 2012-05-03 at 09:53 +0800, Wang Sheng-Hui wrote:
> local_paca->irq_happened may be changed asychronously. 
> 
> In my test env (IBM Power 9117-MMA), I installed the RHEL6.2 with the shipped
> oprofile. Then I run into kernel v3.4-rc4, setup/start oprofile and start the
> LTP test suite.
> 
> In a short while, the system would crash. Seems that oprofile may change
> the irq_happened.

 .../...

> Use local var instead of local_paca->irq_happened directly in this function here.
> 
> Please check this patch. Any comments are welcome.

It should not as __check_irq_replay() should always be called
with interrupts hard disabled... Do you see any code path
where that is not the case ?

Cheers,
Ben.

> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
> ---
>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>  1 files changed, 30 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5ec1b23..3d48b23 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>   */
>  notrace unsigned int __check_irq_replay(void)
>  {
> +	unsigned int ret_val;
>  	/*
>  	 * We use local_paca rather than get_paca() to avoid all
>  	 * the debug_smp_processor_id() business in this low level
>  	 * function
>  	 */
> -	unsigned char happened = local_paca->irq_happened;
> +	unsigned char happened, irq_happened;
> +	happened = irq_happened = local_paca->irq_happened;
>  
>  	/* Clear bit 0 which we wouldn't clear otherwise */
> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>  
>  	/*
>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>  	 * decrementer itself rather than the paca irq_happened field
>  	 * in case we also had a rollover while hard disabled
>  	 */
> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> -	if (decrementer_check_overflow())
> -		return 0x900;
> +	irq_happened &= ~PACA_IRQ_DEC;
> +	if (decrementer_check_overflow()) {
> +		ret_val = 0x900;
> +		goto replay;
> +	}
>  
>  	/* Finally check if an external interrupt happened */
> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
> -	if (happened & PACA_IRQ_EE)
> -		return 0x500;
> +	irq_happened &= ~PACA_IRQ_EE;
> +	if (happened & PACA_IRQ_EE) {
> +		ret_val = 0x500;
> +		goto replay;
> +	}
>  
>  #ifdef CONFIG_PPC_BOOK3E
>  	/* Finally check if an EPR external interrupt happened
>  	 * this bit is typically set if we need to handle another
>  	 * "edge" interrupt from within the MPIC "EPR" handler
>  	 */
> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
> -	if (happened & PACA_IRQ_EE_EDGE)
> -		return 0x500;
> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
> +	if (happened & PACA_IRQ_EE_EDGE) {
> +		ret_val = 0x500;
> +		goto replay;
> +	}
>  
> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
> -	if (happened & PACA_IRQ_DBELL)
> -		return 0x280;
> +	irq_happened &= ~PACA_IRQ_DBELL;
> +	if (happened & PACA_IRQ_DBELL) {
> +		ret_val = 0x280;
> +		goto replay;
> +	}
>  #endif /* CONFIG_PPC_BOOK3E */
>  
>  	/* There should be nothing left ! */
> -	BUG_ON(local_paca->irq_happened != 0);
> +	BUG_ON(irq_happened != 0);
> +	ret_val = 0;
>  
> -	return 0;
> +replay:
> +	local_paca->irq_happened = irq_happened;
> +
> +	return ret_val;
>  }
>  
>  notrace void arch_local_irq_restore(unsigned long en)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  2:15   ` Benjamin Herrenschmidt
@ 2012-05-03  2:27     ` Wang Sheng-Hui
  -1 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  2:27 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On 2012年05月03日 10:15, Benjamin Herrenschmidt wrote:
> On Thu, 2012-05-03 at 09:53 +0800, Wang Sheng-Hui wrote:
>> local_paca->irq_happened may be changed asychronously. 
>>
>> In my test env (IBM Power 9117-MMA), I installed the RHEL6.2 with the shipped
>> oprofile. Then I run into kernel v3.4-rc4, setup/start oprofile and start the
>> LTP test suite.
>>
>> In a short while, the system would crash. Seems that oprofile may change
>> the irq_happened.
> 
>  .../...
> 
>> Use local var instead of local_paca->irq_happened directly in this function here.
>>
>> Please check this patch. Any comments are welcome.
> 
> It should not as __check_irq_replay() should always be called
> with interrupts hard disabled... Do you see any code path
> where that is not the case ?

This is the only case.
I have run LTP test suite on my system without oprofile over 24 hours
with 3.4-rc4 kernel. 
Then I started oprofile, and the system crashed quickly.

I wonder if oprofile does some special changes with the running.
But I'm not familiar with the internal of oprofile.

I tried to change BUG_ON to WARN_ON, and got lots of warnning messages
in dmesg. So I changed it to local var here.


> 
> Cheers,
> Ben.
> 
>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>> ---
>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>> index 5ec1b23..3d48b23 100644
>> --- a/arch/powerpc/kernel/irq.c
>> +++ b/arch/powerpc/kernel/irq.c
>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>   */
>>  notrace unsigned int __check_irq_replay(void)
>>  {
>> +	unsigned int ret_val;
>>  	/*
>>  	 * We use local_paca rather than get_paca() to avoid all
>>  	 * the debug_smp_processor_id() business in this low level
>>  	 * function
>>  	 */
>> -	unsigned char happened = local_paca->irq_happened;
>> +	unsigned char happened, irq_happened;
>> +	happened = irq_happened = local_paca->irq_happened;
>>  
>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>  
>>  	/*
>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>  	 * decrementer itself rather than the paca irq_happened field
>>  	 * in case we also had a rollover while hard disabled
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>> -	if (decrementer_check_overflow())
>> -		return 0x900;
>> +	irq_happened &= ~PACA_IRQ_DEC;
>> +	if (decrementer_check_overflow()) {
>> +		ret_val = 0x900;
>> +		goto replay;
>> +	}
>>  
>>  	/* Finally check if an external interrupt happened */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>> -	if (happened & PACA_IRQ_EE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE;
>> +	if (happened & PACA_IRQ_EE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>>  #ifdef CONFIG_PPC_BOOK3E
>>  	/* Finally check if an EPR external interrupt happened
>>  	 * this bit is typically set if we need to handle another
>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>> -	if (happened & PACA_IRQ_EE_EDGE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>> +	if (happened & PACA_IRQ_EE_EDGE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>> -	if (happened & PACA_IRQ_DBELL)
>> -		return 0x280;
>> +	irq_happened &= ~PACA_IRQ_DBELL;
>> +	if (happened & PACA_IRQ_DBELL) {
>> +		ret_val = 0x280;
>> +		goto replay;
>> +	}
>>  #endif /* CONFIG_PPC_BOOK3E */
>>  
>>  	/* There should be nothing left ! */
>> -	BUG_ON(local_paca->irq_happened != 0);
>> +	BUG_ON(irq_happened != 0);
>> +	ret_val = 0;
>>  
>> -	return 0;
>> +replay:
>> +	local_paca->irq_happened = irq_happened;
>> +
>> +	return ret_val;
>>  }
>>  
>>  notrace void arch_local_irq_restore(unsigned long en)
> 
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  2:27     ` Wang Sheng-Hui
  0 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  2:27 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On 2012年05月03日 10:15, Benjamin Herrenschmidt wrote:
> On Thu, 2012-05-03 at 09:53 +0800, Wang Sheng-Hui wrote:
>> local_paca->irq_happened may be changed asychronously. 
>>
>> In my test env (IBM Power 9117-MMA), I installed the RHEL6.2 with the shipped
>> oprofile. Then I run into kernel v3.4-rc4, setup/start oprofile and start the
>> LTP test suite.
>>
>> In a short while, the system would crash. Seems that oprofile may change
>> the irq_happened.
> 
>  .../...
> 
>> Use local var instead of local_paca->irq_happened directly in this function here.
>>
>> Please check this patch. Any comments are welcome.
> 
> It should not as __check_irq_replay() should always be called
> with interrupts hard disabled... Do you see any code path
> where that is not the case ?

This is the only case.
I have run LTP test suite on my system without oprofile over 24 hours
with 3.4-rc4 kernel. 
Then I started oprofile, and the system crashed quickly.

I wonder if oprofile does some special changes with the running.
But I'm not familiar with the internal of oprofile.

I tried to change BUG_ON to WARN_ON, and got lots of warnning messages
in dmesg. So I changed it to local var here.


> 
> Cheers,
> Ben.
> 
>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>> ---
>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>> index 5ec1b23..3d48b23 100644
>> --- a/arch/powerpc/kernel/irq.c
>> +++ b/arch/powerpc/kernel/irq.c
>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>   */
>>  notrace unsigned int __check_irq_replay(void)
>>  {
>> +	unsigned int ret_val;
>>  	/*
>>  	 * We use local_paca rather than get_paca() to avoid all
>>  	 * the debug_smp_processor_id() business in this low level
>>  	 * function
>>  	 */
>> -	unsigned char happened = local_paca->irq_happened;
>> +	unsigned char happened, irq_happened;
>> +	happened = irq_happened = local_paca->irq_happened;
>>  
>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>  
>>  	/*
>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>  	 * decrementer itself rather than the paca irq_happened field
>>  	 * in case we also had a rollover while hard disabled
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>> -	if (decrementer_check_overflow())
>> -		return 0x900;
>> +	irq_happened &= ~PACA_IRQ_DEC;
>> +	if (decrementer_check_overflow()) {
>> +		ret_val = 0x900;
>> +		goto replay;
>> +	}
>>  
>>  	/* Finally check if an external interrupt happened */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>> -	if (happened & PACA_IRQ_EE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE;
>> +	if (happened & PACA_IRQ_EE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>>  #ifdef CONFIG_PPC_BOOK3E
>>  	/* Finally check if an EPR external interrupt happened
>>  	 * this bit is typically set if we need to handle another
>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>> -	if (happened & PACA_IRQ_EE_EDGE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>> +	if (happened & PACA_IRQ_EE_EDGE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>> -	if (happened & PACA_IRQ_DBELL)
>> -		return 0x280;
>> +	irq_happened &= ~PACA_IRQ_DBELL;
>> +	if (happened & PACA_IRQ_DBELL) {
>> +		ret_val = 0x280;
>> +		goto replay;
>> +	}
>>  #endif /* CONFIG_PPC_BOOK3E */
>>  
>>  	/* There should be nothing left ! */
>> -	BUG_ON(local_paca->irq_happened != 0);
>> +	BUG_ON(irq_happened != 0);
>> +	ret_val = 0;
>>  
>> -	return 0;
>> +replay:
>> +	local_paca->irq_happened = irq_happened;
>> +
>> +	return ret_val;
>>  }
>>  
>>  notrace void arch_local_irq_restore(unsigned long en)
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  2:15   ` Benjamin Herrenschmidt
@ 2012-05-03  2:32     ` Wang Sheng-Hui
  -1 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  2:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On 2012年05月03日 10:15, Benjamin Herrenschmidt wrote:
> On Thu, 2012-05-03 at 09:53 +0800, Wang Sheng-Hui wrote:
>> local_paca->irq_happened may be changed asychronously. 
>>
>> In my test env (IBM Power 9117-MMA), I installed the RHEL6.2 with the shipped
>> oprofile. Then I run into kernel v3.4-rc4, setup/start oprofile and start the
>> LTP test suite.
>>
>> In a short while, the system would crash. Seems that oprofile may change
>> the irq_happened.
> 
>  .../...
> 
>> Use local var instead of local_paca->irq_happened directly in this function here.
>>
>> Please check this patch. Any comments are welcome.
> 
> It should not as __check_irq_replay() should always be called
> with interrupts hard disabled... Do you see any code path
> where that is not the case ?

Since __check_irq_replay() should always be called with interrupts hard disabled,
I think it's harmless to use local var here.

> 
> Cheers,
> Ben.
> 
>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>> ---
>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>> index 5ec1b23..3d48b23 100644
>> --- a/arch/powerpc/kernel/irq.c
>> +++ b/arch/powerpc/kernel/irq.c
>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>   */
>>  notrace unsigned int __check_irq_replay(void)
>>  {
>> +	unsigned int ret_val;
>>  	/*
>>  	 * We use local_paca rather than get_paca() to avoid all
>>  	 * the debug_smp_processor_id() business in this low level
>>  	 * function
>>  	 */
>> -	unsigned char happened = local_paca->irq_happened;
>> +	unsigned char happened, irq_happened;
>> +	happened = irq_happened = local_paca->irq_happened;
>>  
>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>  
>>  	/*
>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>  	 * decrementer itself rather than the paca irq_happened field
>>  	 * in case we also had a rollover while hard disabled
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>> -	if (decrementer_check_overflow())
>> -		return 0x900;
>> +	irq_happened &= ~PACA_IRQ_DEC;
>> +	if (decrementer_check_overflow()) {
>> +		ret_val = 0x900;
>> +		goto replay;
>> +	}
>>  
>>  	/* Finally check if an external interrupt happened */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>> -	if (happened & PACA_IRQ_EE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE;
>> +	if (happened & PACA_IRQ_EE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>>  #ifdef CONFIG_PPC_BOOK3E
>>  	/* Finally check if an EPR external interrupt happened
>>  	 * this bit is typically set if we need to handle another
>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>> -	if (happened & PACA_IRQ_EE_EDGE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>> +	if (happened & PACA_IRQ_EE_EDGE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>> -	if (happened & PACA_IRQ_DBELL)
>> -		return 0x280;
>> +	irq_happened &= ~PACA_IRQ_DBELL;
>> +	if (happened & PACA_IRQ_DBELL) {
>> +		ret_val = 0x280;
>> +		goto replay;
>> +	}
>>  #endif /* CONFIG_PPC_BOOK3E */
>>  
>>  	/* There should be nothing left ! */
>> -	BUG_ON(local_paca->irq_happened != 0);
>> +	BUG_ON(irq_happened != 0);
>> +	ret_val = 0;
>>  
>> -	return 0;
>> +replay:
>> +	local_paca->irq_happened = irq_happened;
>> +
>> +	return ret_val;
>>  }
>>  
>>  notrace void arch_local_irq_restore(unsigned long en)
> 
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  2:32     ` Wang Sheng-Hui
  0 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  2:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On 2012年05月03日 10:15, Benjamin Herrenschmidt wrote:
> On Thu, 2012-05-03 at 09:53 +0800, Wang Sheng-Hui wrote:
>> local_paca->irq_happened may be changed asychronously. 
>>
>> In my test env (IBM Power 9117-MMA), I installed the RHEL6.2 with the shipped
>> oprofile. Then I run into kernel v3.4-rc4, setup/start oprofile and start the
>> LTP test suite.
>>
>> In a short while, the system would crash. Seems that oprofile may change
>> the irq_happened.
> 
>  .../...
> 
>> Use local var instead of local_paca->irq_happened directly in this function here.
>>
>> Please check this patch. Any comments are welcome.
> 
> It should not as __check_irq_replay() should always be called
> with interrupts hard disabled... Do you see any code path
> where that is not the case ?

Since __check_irq_replay() should always be called with interrupts hard disabled,
I think it's harmless to use local var here.

> 
> Cheers,
> Ben.
> 
>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>> ---
>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>> index 5ec1b23..3d48b23 100644
>> --- a/arch/powerpc/kernel/irq.c
>> +++ b/arch/powerpc/kernel/irq.c
>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>   */
>>  notrace unsigned int __check_irq_replay(void)
>>  {
>> +	unsigned int ret_val;
>>  	/*
>>  	 * We use local_paca rather than get_paca() to avoid all
>>  	 * the debug_smp_processor_id() business in this low level
>>  	 * function
>>  	 */
>> -	unsigned char happened = local_paca->irq_happened;
>> +	unsigned char happened, irq_happened;
>> +	happened = irq_happened = local_paca->irq_happened;
>>  
>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>  
>>  	/*
>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>  	 * decrementer itself rather than the paca irq_happened field
>>  	 * in case we also had a rollover while hard disabled
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>> -	if (decrementer_check_overflow())
>> -		return 0x900;
>> +	irq_happened &= ~PACA_IRQ_DEC;
>> +	if (decrementer_check_overflow()) {
>> +		ret_val = 0x900;
>> +		goto replay;
>> +	}
>>  
>>  	/* Finally check if an external interrupt happened */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>> -	if (happened & PACA_IRQ_EE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE;
>> +	if (happened & PACA_IRQ_EE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>>  #ifdef CONFIG_PPC_BOOK3E
>>  	/* Finally check if an EPR external interrupt happened
>>  	 * this bit is typically set if we need to handle another
>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>  	 */
>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>> -	if (happened & PACA_IRQ_EE_EDGE)
>> -		return 0x500;
>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>> +	if (happened & PACA_IRQ_EE_EDGE) {
>> +		ret_val = 0x500;
>> +		goto replay;
>> +	}
>>  
>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>> -	if (happened & PACA_IRQ_DBELL)
>> -		return 0x280;
>> +	irq_happened &= ~PACA_IRQ_DBELL;
>> +	if (happened & PACA_IRQ_DBELL) {
>> +		ret_val = 0x280;
>> +		goto replay;
>> +	}
>>  #endif /* CONFIG_PPC_BOOK3E */
>>  
>>  	/* There should be nothing left ! */
>> -	BUG_ON(local_paca->irq_happened != 0);
>> +	BUG_ON(irq_happened != 0);
>> +	ret_val = 0;
>>  
>> -	return 0;
>> +replay:
>> +	local_paca->irq_happened = irq_happened;
>> +
>> +	return ret_val;
>>  }
>>  
>>  notrace void arch_local_irq_restore(unsigned long en)
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  2:15   ` Benjamin Herrenschmidt
@ 2012-05-03  4:22     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  4:22 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel


> It should not as __check_irq_replay() should always be called
> with interrupts hard disabled... Do you see any code path
> where that is not the case ?

More specifically, your backtrace seems to indicate that
__check_irq_repay() was called from arch_local_irq_restore() which
should have done this before calling __check_irq_replay():

	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
		__hard_irq_disable();

Now, the only possibility that I can see for an interrupt to come in
and trip the problem you observed would be if for some reason we
had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
not hard disabled.

Can you try if removing the test (and thus unconditionally calling
__hard_irq_disable()) fixes the problem for you ?

If that is the case, then we need to audit the code to figure out how we
can end up with that bit in irq_happened set and interrupts hard
enabled.

Something like may_hard_irq_enable() shouldn't cause it since it should
only be called while hard disabled but adding a check in there might be
worth it (something like WARN_ON(mfmsr() & MSR_EE)).

Cheers,
Ben.

> Cheers,
> Ben.
> 
> > Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
> > ---
> >  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
> >  1 files changed, 30 insertions(+), 16 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> > index 5ec1b23..3d48b23 100644
> > --- a/arch/powerpc/kernel/irq.c
> > +++ b/arch/powerpc/kernel/irq.c
> > @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
> >   */
> >  notrace unsigned int __check_irq_replay(void)
> >  {
> > +	unsigned int ret_val;
> >  	/*
> >  	 * We use local_paca rather than get_paca() to avoid all
> >  	 * the debug_smp_processor_id() business in this low level
> >  	 * function
> >  	 */
> > -	unsigned char happened = local_paca->irq_happened;
> > +	unsigned char happened, irq_happened;
> > +	happened = irq_happened = local_paca->irq_happened;
> >  
> >  	/* Clear bit 0 which we wouldn't clear otherwise */
> > -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
> > +	irq_happened &= ~PACA_IRQ_HARD_DIS;
> >  
> >  	/*
> >  	 * Force the delivery of pending soft-disabled interrupts on PS3.
> > @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
> >  	 * decrementer itself rather than the paca irq_happened field
> >  	 * in case we also had a rollover while hard disabled
> >  	 */
> > -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> > -	if (decrementer_check_overflow())
> > -		return 0x900;
> > +	irq_happened &= ~PACA_IRQ_DEC;
> > +	if (decrementer_check_overflow()) {
> > +		ret_val = 0x900;
> > +		goto replay;
> > +	}
> >  
> >  	/* Finally check if an external interrupt happened */
> > -	local_paca->irq_happened &= ~PACA_IRQ_EE;
> > -	if (happened & PACA_IRQ_EE)
> > -		return 0x500;
> > +	irq_happened &= ~PACA_IRQ_EE;
> > +	if (happened & PACA_IRQ_EE) {
> > +		ret_val = 0x500;
> > +		goto replay;
> > +	}
> >  
> >  #ifdef CONFIG_PPC_BOOK3E
> >  	/* Finally check if an EPR external interrupt happened
> >  	 * this bit is typically set if we need to handle another
> >  	 * "edge" interrupt from within the MPIC "EPR" handler
> >  	 */
> > -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
> > -	if (happened & PACA_IRQ_EE_EDGE)
> > -		return 0x500;
> > +	irq_happened &= ~PACA_IRQ_EE_EDGE;
> > +	if (happened & PACA_IRQ_EE_EDGE) {
> > +		ret_val = 0x500;
> > +		goto replay;
> > +	}
> >  
> > -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
> > -	if (happened & PACA_IRQ_DBELL)
> > -		return 0x280;
> > +	irq_happened &= ~PACA_IRQ_DBELL;
> > +	if (happened & PACA_IRQ_DBELL) {
> > +		ret_val = 0x280;
> > +		goto replay;
> > +	}
> >  #endif /* CONFIG_PPC_BOOK3E */
> >  
> >  	/* There should be nothing left ! */
> > -	BUG_ON(local_paca->irq_happened != 0);
> > +	BUG_ON(irq_happened != 0);
> > +	ret_val = 0;
> >  
> > -	return 0;
> > +replay:
> > +	local_paca->irq_happened = irq_happened;
> > +
> > +	return ret_val;
> >  }
> >  
> >  notrace void arch_local_irq_restore(unsigned long en)
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  4:22     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  4:22 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev


> It should not as __check_irq_replay() should always be called
> with interrupts hard disabled... Do you see any code path
> where that is not the case ?

More specifically, your backtrace seems to indicate that
__check_irq_repay() was called from arch_local_irq_restore() which
should have done this before calling __check_irq_replay():

	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
		__hard_irq_disable();

Now, the only possibility that I can see for an interrupt to come in
and trip the problem you observed would be if for some reason we
had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
not hard disabled.

Can you try if removing the test (and thus unconditionally calling
__hard_irq_disable()) fixes the problem for you ?

If that is the case, then we need to audit the code to figure out how we
can end up with that bit in irq_happened set and interrupts hard
enabled.

Something like may_hard_irq_enable() shouldn't cause it since it should
only be called while hard disabled but adding a check in there might be
worth it (something like WARN_ON(mfmsr() & MSR_EE)).

Cheers,
Ben.

> Cheers,
> Ben.
> 
> > Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
> > ---
> >  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
> >  1 files changed, 30 insertions(+), 16 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> > index 5ec1b23..3d48b23 100644
> > --- a/arch/powerpc/kernel/irq.c
> > +++ b/arch/powerpc/kernel/irq.c
> > @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
> >   */
> >  notrace unsigned int __check_irq_replay(void)
> >  {
> > +	unsigned int ret_val;
> >  	/*
> >  	 * We use local_paca rather than get_paca() to avoid all
> >  	 * the debug_smp_processor_id() business in this low level
> >  	 * function
> >  	 */
> > -	unsigned char happened = local_paca->irq_happened;
> > +	unsigned char happened, irq_happened;
> > +	happened = irq_happened = local_paca->irq_happened;
> >  
> >  	/* Clear bit 0 which we wouldn't clear otherwise */
> > -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
> > +	irq_happened &= ~PACA_IRQ_HARD_DIS;
> >  
> >  	/*
> >  	 * Force the delivery of pending soft-disabled interrupts on PS3.
> > @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
> >  	 * decrementer itself rather than the paca irq_happened field
> >  	 * in case we also had a rollover while hard disabled
> >  	 */
> > -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> > -	if (decrementer_check_overflow())
> > -		return 0x900;
> > +	irq_happened &= ~PACA_IRQ_DEC;
> > +	if (decrementer_check_overflow()) {
> > +		ret_val = 0x900;
> > +		goto replay;
> > +	}
> >  
> >  	/* Finally check if an external interrupt happened */
> > -	local_paca->irq_happened &= ~PACA_IRQ_EE;
> > -	if (happened & PACA_IRQ_EE)
> > -		return 0x500;
> > +	irq_happened &= ~PACA_IRQ_EE;
> > +	if (happened & PACA_IRQ_EE) {
> > +		ret_val = 0x500;
> > +		goto replay;
> > +	}
> >  
> >  #ifdef CONFIG_PPC_BOOK3E
> >  	/* Finally check if an EPR external interrupt happened
> >  	 * this bit is typically set if we need to handle another
> >  	 * "edge" interrupt from within the MPIC "EPR" handler
> >  	 */
> > -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
> > -	if (happened & PACA_IRQ_EE_EDGE)
> > -		return 0x500;
> > +	irq_happened &= ~PACA_IRQ_EE_EDGE;
> > +	if (happened & PACA_IRQ_EE_EDGE) {
> > +		ret_val = 0x500;
> > +		goto replay;
> > +	}
> >  
> > -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
> > -	if (happened & PACA_IRQ_DBELL)
> > -		return 0x280;
> > +	irq_happened &= ~PACA_IRQ_DBELL;
> > +	if (happened & PACA_IRQ_DBELL) {
> > +		ret_val = 0x280;
> > +		goto replay;
> > +	}
> >  #endif /* CONFIG_PPC_BOOK3E */
> >  
> >  	/* There should be nothing left ! */
> > -	BUG_ON(local_paca->irq_happened != 0);
> > +	BUG_ON(irq_happened != 0);
> > +	ret_val = 0;
> >  
> > -	return 0;
> > +replay:
> > +	local_paca->irq_happened = irq_happened;
> > +
> > +	return ret_val;
> >  }
> >  
> >  notrace void arch_local_irq_restore(unsigned long en)
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  2:32     ` Wang Sheng-Hui
@ 2012-05-03  4:26       ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  4:26 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On Thu, 2012-05-03 at 10:32 +0800, Wang Sheng-Hui wrote:
> > It should not as __check_irq_replay() should always be called
> > with interrupts hard disabled... Do you see any code path
> > where that is not the case ?
> 
> Since __check_irq_replay() should always be called with interrupts
> hard disabled, I think it's harmless to use local var here.

No, that would be papering over the real problem. All oprofile does is
trigger perfmon interrupts (which act as some kind of NMI when
soft-disabled but should be masked by MSR:EE when hard disabled).

So there's a deeper issue here that we need to understand before we can
propose a fix. IE. It should not have happened.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  4:26       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  4:26 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On Thu, 2012-05-03 at 10:32 +0800, Wang Sheng-Hui wrote:
> > It should not as __check_irq_replay() should always be called
> > with interrupts hard disabled... Do you see any code path
> > where that is not the case ?
> 
> Since __check_irq_replay() should always be called with interrupts
> hard disabled, I think it's harmless to use local var here.

No, that would be papering over the real problem. All oprofile does is
trigger perfmon interrupts (which act as some kind of NMI when
soft-disabled but should be masked by MSR:EE when hard disabled).

So there's a deeper issue here that we need to understand before we can
propose a fix. IE. It should not have happened.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  4:22     ` Benjamin Herrenschmidt
@ 2012-05-03  5:51       ` Wang Sheng-Hui
  -1 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  5:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On 2012年05月03日 12:22, Benjamin Herrenschmidt wrote:
> 
>> It should not as __check_irq_replay() should always be called
>> with interrupts hard disabled... Do you see any code path
>> where that is not the case ?
> 
> More specifically, your backtrace seems to indicate that
> __check_irq_repay() was called from arch_local_irq_restore() which
> should have done this before calling __check_irq_replay():
> 
> 	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> 		__hard_irq_disable();
> 
> Now, the only possibility that I can see for an interrupt to come in
> and trip the problem you observed would be if for some reason we
> had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
> not hard disabled.

I have a chance to notice that the value is 0x05, not just 0x01.
So I think this is not the case.

> 
> Can you try if removing the test (and thus unconditionally calling
> __hard_irq_disable()) fixes the problem for you ?
> 
> If that is the case, then we need to audit the code to figure out how we
> can end up with that bit in irq_happened set and interrupts hard
> enabled.
> 
> Something like may_hard_irq_enable() shouldn't cause it since it should
> only be called while hard disabled but adding a check in there might be
> worth it (something like WARN_ON(mfmsr() & MSR_EE)).
> 
> Cheers,
> Ben.
> 
>> Cheers,
>> Ben.
>>
>>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>>> ---
>>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>>> index 5ec1b23..3d48b23 100644
>>> --- a/arch/powerpc/kernel/irq.c
>>> +++ b/arch/powerpc/kernel/irq.c
>>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>>   */
>>>  notrace unsigned int __check_irq_replay(void)
>>>  {
>>> +	unsigned int ret_val;
>>>  	/*
>>>  	 * We use local_paca rather than get_paca() to avoid all
>>>  	 * the debug_smp_processor_id() business in this low level
>>>  	 * function
>>>  	 */
>>> -	unsigned char happened = local_paca->irq_happened;
>>> +	unsigned char happened, irq_happened;
>>> +	happened = irq_happened = local_paca->irq_happened;
>>>  
>>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>>  
>>>  	/*
>>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>>  	 * decrementer itself rather than the paca irq_happened field
>>>  	 * in case we also had a rollover while hard disabled
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>>> -	if (decrementer_check_overflow())
>>> -		return 0x900;
>>> +	irq_happened &= ~PACA_IRQ_DEC;
>>> +	if (decrementer_check_overflow()) {
>>> +		ret_val = 0x900;
>>> +		goto replay;
>>> +	}
>>>  
>>>  	/* Finally check if an external interrupt happened */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>>> -	if (happened & PACA_IRQ_EE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE;
>>> +	if (happened & PACA_IRQ_EE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>>  #ifdef CONFIG_PPC_BOOK3E
>>>  	/* Finally check if an EPR external interrupt happened
>>>  	 * this bit is typically set if we need to handle another
>>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> -	if (happened & PACA_IRQ_EE_EDGE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> +	if (happened & PACA_IRQ_EE_EDGE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>>> -	if (happened & PACA_IRQ_DBELL)
>>> -		return 0x280;
>>> +	irq_happened &= ~PACA_IRQ_DBELL;
>>> +	if (happened & PACA_IRQ_DBELL) {
>>> +		ret_val = 0x280;
>>> +		goto replay;
>>> +	}
>>>  #endif /* CONFIG_PPC_BOOK3E */
>>>  
>>>  	/* There should be nothing left ! */
>>> -	BUG_ON(local_paca->irq_happened != 0);
>>> +	BUG_ON(irq_happened != 0);
>>> +	ret_val = 0;
>>>  
>>> -	return 0;
>>> +replay:
>>> +	local_paca->irq_happened = irq_happened;
>>> +
>>> +	return ret_val;
>>>  }
>>>  
>>>  notrace void arch_local_irq_restore(unsigned long en)
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  5:51       ` Wang Sheng-Hui
  0 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  5:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On 2012年05月03日 12:22, Benjamin Herrenschmidt wrote:
> 
>> It should not as __check_irq_replay() should always be called
>> with interrupts hard disabled... Do you see any code path
>> where that is not the case ?
> 
> More specifically, your backtrace seems to indicate that
> __check_irq_repay() was called from arch_local_irq_restore() which
> should have done this before calling __check_irq_replay():
> 
> 	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> 		__hard_irq_disable();
> 
> Now, the only possibility that I can see for an interrupt to come in
> and trip the problem you observed would be if for some reason we
> had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
> not hard disabled.

I have a chance to notice that the value is 0x05, not just 0x01.
So I think this is not the case.

> 
> Can you try if removing the test (and thus unconditionally calling
> __hard_irq_disable()) fixes the problem for you ?
> 
> If that is the case, then we need to audit the code to figure out how we
> can end up with that bit in irq_happened set and interrupts hard
> enabled.
> 
> Something like may_hard_irq_enable() shouldn't cause it since it should
> only be called while hard disabled but adding a check in there might be
> worth it (something like WARN_ON(mfmsr() & MSR_EE)).
> 
> Cheers,
> Ben.
> 
>> Cheers,
>> Ben.
>>
>>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>>> ---
>>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>>> index 5ec1b23..3d48b23 100644
>>> --- a/arch/powerpc/kernel/irq.c
>>> +++ b/arch/powerpc/kernel/irq.c
>>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>>   */
>>>  notrace unsigned int __check_irq_replay(void)
>>>  {
>>> +	unsigned int ret_val;
>>>  	/*
>>>  	 * We use local_paca rather than get_paca() to avoid all
>>>  	 * the debug_smp_processor_id() business in this low level
>>>  	 * function
>>>  	 */
>>> -	unsigned char happened = local_paca->irq_happened;
>>> +	unsigned char happened, irq_happened;
>>> +	happened = irq_happened = local_paca->irq_happened;
>>>  
>>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>>  
>>>  	/*
>>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>>  	 * decrementer itself rather than the paca irq_happened field
>>>  	 * in case we also had a rollover while hard disabled
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>>> -	if (decrementer_check_overflow())
>>> -		return 0x900;
>>> +	irq_happened &= ~PACA_IRQ_DEC;
>>> +	if (decrementer_check_overflow()) {
>>> +		ret_val = 0x900;
>>> +		goto replay;
>>> +	}
>>>  
>>>  	/* Finally check if an external interrupt happened */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>>> -	if (happened & PACA_IRQ_EE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE;
>>> +	if (happened & PACA_IRQ_EE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>>  #ifdef CONFIG_PPC_BOOK3E
>>>  	/* Finally check if an EPR external interrupt happened
>>>  	 * this bit is typically set if we need to handle another
>>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> -	if (happened & PACA_IRQ_EE_EDGE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> +	if (happened & PACA_IRQ_EE_EDGE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>>> -	if (happened & PACA_IRQ_DBELL)
>>> -		return 0x280;
>>> +	irq_happened &= ~PACA_IRQ_DBELL;
>>> +	if (happened & PACA_IRQ_DBELL) {
>>> +		ret_val = 0x280;
>>> +		goto replay;
>>> +	}
>>>  #endif /* CONFIG_PPC_BOOK3E */
>>>  
>>>  	/* There should be nothing left ! */
>>> -	BUG_ON(local_paca->irq_happened != 0);
>>> +	BUG_ON(irq_happened != 0);
>>> +	ret_val = 0;
>>>  
>>> -	return 0;
>>> +replay:
>>> +	local_paca->irq_happened = irq_happened;
>>> +
>>> +	return ret_val;
>>>  }
>>>  
>>>  notrace void arch_local_irq_restore(unsigned long en)
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  4:22     ` Benjamin Herrenschmidt
@ 2012-05-03  6:33       ` Wang Sheng-Hui
  -1 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  6:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On 2012年05月03日 12:22, Benjamin Herrenschmidt wrote:
> 
>> It should not as __check_irq_replay() should always be called
>> with interrupts hard disabled... Do you see any code path
>> where that is not the case ?
> 
> More specifically, your backtrace seems to indicate that
> __check_irq_repay() was called from arch_local_irq_restore() which
> should have done this before calling __check_irq_replay():
> 
> 	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> 		__hard_irq_disable();
> 
> Now, the only possibility that I can see for an interrupt to come in
> and trip the problem you observed would be if for some reason we
> had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
> not hard disabled.
> 
> Can you try if removing the test (and thus unconditionally calling
> __hard_irq_disable()) fixes the problem for you ?

The system crashed before I started the LTP test run.
========================================================
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
cpu 0x3: Vector: 700 (Program Check) at [c00000026ffd3bb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c00000000001058c: .arch_local_irq_restore+0x4c/0x90
    sp: c00000026ffd3e30
   msr: 8000000000029032
  current = 0xc0000002694e0110
  paca    = 0xc000000003580900	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/3
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
enter ? for help
[link register   ] c00000000001058c .arch_local_irq_restore+0x4c/0x90
[c00000026ffd3e30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffd3ea0] c0000000000857d4 .__do_softirq+0xa4/0x2a0
[c00000026ffd3f90] c000000000022958 .call_do_softirq+0x14/0x24
[c0000002694778e0] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000269477980] c0000000000854c4 .irq_exit+0xc4/0xf0
[c000000269477a00] c00000000001e970 .timer_interrupt+0x120/0x290
[c000000269477ab0] c000000000003a40 decrementer_common+0x140/0x180
--- Exception: 901 (Decrementer) at c0000000000105c4 .arch_local_irq_restore+0x84/0x90
[c000000269477da0] c000000000058400 .pSeries_idle+0x10/0x40 (unreliable)
[c000000269477e10] c000000000017d50 .cpu_idle+0x190/0x290
[c000000269477ed0] c00000000061ab04 .start_secondary+0x348/0x354
[c000000269477f90] c00000000000936c .start_secondary_prolog+0x10/0x14
3:mon> e
cpu 0x3: Vector: 700 (Program Check) at [c00000026ffd3bb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c00000000001058c: .arch_local_irq_restore+0x4c/0x90
    sp: c00000026ffd3e30
   msr: 8000000000029032
  current = 0xc0000002694e0110
  paca    = 0xc000000003580900	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/3
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
3:mon> r
R00 = 0000000000000001   R16 = 0000000000000000
R01 = c00000026ffd3e30   R17 = 0000000000000000
R02 = c000000000edd220   R18 = 0000000000000000
R03 = 0000000000000500   R19 = 0000000000000000
R04 = 0000000000000000   R20 = c000000000f42100
R05 = 00000000000004ca   R21 = 0000000000000000
R06 = 0000000002dcc370   R22 = c000000000955b80
R07 = 003524b183dca42e   R23 = c000000000955b80
R08 = 0000000000920000   R24 = 000000000000000a
R09 = c000000003580900   R25 = 0000000000000003
R10 = 0000000000000008   R26 = c000000269474100
R11 = 0000000000000000   R27 = c00000026ffd0000
R12 = c000000000653ba8   R28 = 0000000000000000
R13 = c000000003580900   R29 = c000000000f42100
R14 = c000000269477f90   R30 = c000000000e60890
R15 = 000000000ef03f20   R31 = 0000000000000082
pc  = c00000000000ea9c .__check_irq_replay+0x7c/0x90
cfar= c00000000000ea38 .__check_irq_replay+0x18/0x90
lr  = c00000000001058c .arch_local_irq_restore+0x4c/0x90
msr = 8000000000029032   cr  = 28000028
ctr = c00000000001df50   xer = 0000000000000000   trap =  700
3:mon> t
[link register   ] c00000000001058c .arch_local_irq_restore+0x4c/0x90
[c00000026ffd3e30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffd3ea0] c0000000000857d4 .__do_softirq+0xa4/0x2a0
[c00000026ffd3f90] c000000000022958 .call_do_softirq+0x14/0x24
[c0000002694778e0] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000269477980] c0000000000854c4 .irq_exit+0xc4/0xf0
[c000000269477a00] c00000000001e970 .timer_interrupt+0x120/0x290
[c000000269477ab0] c000000000003a40 decrementer_common+0x140/0x180
--- Exception: 901 (Decrementer) at c0000000000105c4 .arch_local_irq_restore+0x84/0x90
[c000000269477da0] c000000000058400 .pSeries_idle+0x10/0x40 (unreliable)
[c000000269477e10] c000000000017d50 .cpu_idle+0x190/0x290
[c000000269477ed0] c00000000061ab04 .start_secondary+0x348/0x354
[c000000269477f90] c00000000000936c .start_secondary_prolog+0x10/0x14
3:mon> di  c00000000000ea9c
c00000000000ea9c  0b000000	tdnei	r0,0
c00000000000eaa0  38600000	li	r3,0
c00000000000eaa4  4bffffc4	b	c00000000000ea68	# .__check_irq_replay+0x48/0x90
c00000000000eaa8  60000000	nop
 ...
c00000000000eab0  7c0802a6	mflr	r0
c00000000000eab4  fbc1fff0	std	r30,-16(r1)
c00000000000eab8  fba1ffe8	std	r29,-24(r1)
c00000000000eabc  fbe1fff8	std	r31,-8(r1)
c00000000000eac0  ebc28128	ld	r30,-32472(r2)
c00000000000eac4  3d230002	addis	r9,r3,2
c00000000000eac8  f8010010	std	r0,16(r1)
c00000000000eacc  f821ff71	stdu	r1,-144(r1)
c00000000000ead0  38a5ffd8	addi	r5,r5,-40
c00000000000ead4  ebe92060	ld	r31,8288(r9)
c00000000000ead8  80050048	lwz	r0,72(r5)


> 
> If that is the case, then we need to audit the code to figure out how we
> can end up with that bit in irq_happened set and interrupts hard
> enabled.
> 
> Something like may_hard_irq_enable() shouldn't cause it since it should
> only be called while hard disabled but adding a check in there might be
> worth it (something like WARN_ON(mfmsr() & MSR_EE)).
> 
> Cheers,
> Ben.
> 
>> Cheers,
>> Ben.
>>
>>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>>> ---
>>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>>> index 5ec1b23..3d48b23 100644
>>> --- a/arch/powerpc/kernel/irq.c
>>> +++ b/arch/powerpc/kernel/irq.c
>>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>>   */
>>>  notrace unsigned int __check_irq_replay(void)
>>>  {
>>> +	unsigned int ret_val;
>>>  	/*
>>>  	 * We use local_paca rather than get_paca() to avoid all
>>>  	 * the debug_smp_processor_id() business in this low level
>>>  	 * function
>>>  	 */
>>> -	unsigned char happened = local_paca->irq_happened;
>>> +	unsigned char happened, irq_happened;
>>> +	happened = irq_happened = local_paca->irq_happened;
>>>  
>>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>>  
>>>  	/*
>>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>>  	 * decrementer itself rather than the paca irq_happened field
>>>  	 * in case we also had a rollover while hard disabled
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>>> -	if (decrementer_check_overflow())
>>> -		return 0x900;
>>> +	irq_happened &= ~PACA_IRQ_DEC;
>>> +	if (decrementer_check_overflow()) {
>>> +		ret_val = 0x900;
>>> +		goto replay;
>>> +	}
>>>  
>>>  	/* Finally check if an external interrupt happened */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>>> -	if (happened & PACA_IRQ_EE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE;
>>> +	if (happened & PACA_IRQ_EE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>>  #ifdef CONFIG_PPC_BOOK3E
>>>  	/* Finally check if an EPR external interrupt happened
>>>  	 * this bit is typically set if we need to handle another
>>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> -	if (happened & PACA_IRQ_EE_EDGE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> +	if (happened & PACA_IRQ_EE_EDGE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>>> -	if (happened & PACA_IRQ_DBELL)
>>> -		return 0x280;
>>> +	irq_happened &= ~PACA_IRQ_DBELL;
>>> +	if (happened & PACA_IRQ_DBELL) {
>>> +		ret_val = 0x280;
>>> +		goto replay;
>>> +	}
>>>  #endif /* CONFIG_PPC_BOOK3E */
>>>  
>>>  	/* There should be nothing left ! */
>>> -	BUG_ON(local_paca->irq_happened != 0);
>>> +	BUG_ON(irq_happened != 0);
>>> +	ret_val = 0;
>>>  
>>> -	return 0;
>>> +replay:
>>> +	local_paca->irq_happened = irq_happened;
>>> +
>>> +	return ret_val;
>>>  }
>>>  
>>>  notrace void arch_local_irq_restore(unsigned long en)
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  6:33       ` Wang Sheng-Hui
  0 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  6:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On 2012年05月03日 12:22, Benjamin Herrenschmidt wrote:
> 
>> It should not as __check_irq_replay() should always be called
>> with interrupts hard disabled... Do you see any code path
>> where that is not the case ?
> 
> More specifically, your backtrace seems to indicate that
> __check_irq_repay() was called from arch_local_irq_restore() which
> should have done this before calling __check_irq_replay():
> 
> 	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> 		__hard_irq_disable();
> 
> Now, the only possibility that I can see for an interrupt to come in
> and trip the problem you observed would be if for some reason we
> had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
> not hard disabled.
> 
> Can you try if removing the test (and thus unconditionally calling
> __hard_irq_disable()) fixes the problem for you ?

The system crashed before I started the LTP test run.
========================================================
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
cpu 0x3: Vector: 700 (Program Check) at [c00000026ffd3bb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c00000000001058c: .arch_local_irq_restore+0x4c/0x90
    sp: c00000026ffd3e30
   msr: 8000000000029032
  current = 0xc0000002694e0110
  paca    = 0xc000000003580900	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/3
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
enter ? for help
[link register   ] c00000000001058c .arch_local_irq_restore+0x4c/0x90
[c00000026ffd3e30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffd3ea0] c0000000000857d4 .__do_softirq+0xa4/0x2a0
[c00000026ffd3f90] c000000000022958 .call_do_softirq+0x14/0x24
[c0000002694778e0] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000269477980] c0000000000854c4 .irq_exit+0xc4/0xf0
[c000000269477a00] c00000000001e970 .timer_interrupt+0x120/0x290
[c000000269477ab0] c000000000003a40 decrementer_common+0x140/0x180
--- Exception: 901 (Decrementer) at c0000000000105c4 .arch_local_irq_restore+0x84/0x90
[c000000269477da0] c000000000058400 .pSeries_idle+0x10/0x40 (unreliable)
[c000000269477e10] c000000000017d50 .cpu_idle+0x190/0x290
[c000000269477ed0] c00000000061ab04 .start_secondary+0x348/0x354
[c000000269477f90] c00000000000936c .start_secondary_prolog+0x10/0x14
3:mon> e
cpu 0x3: Vector: 700 (Program Check) at [c00000026ffd3bb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c00000000001058c: .arch_local_irq_restore+0x4c/0x90
    sp: c00000026ffd3e30
   msr: 8000000000029032
  current = 0xc0000002694e0110
  paca    = 0xc000000003580900	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/3
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
3:mon> r
R00 = 0000000000000001   R16 = 0000000000000000
R01 = c00000026ffd3e30   R17 = 0000000000000000
R02 = c000000000edd220   R18 = 0000000000000000
R03 = 0000000000000500   R19 = 0000000000000000
R04 = 0000000000000000   R20 = c000000000f42100
R05 = 00000000000004ca   R21 = 0000000000000000
R06 = 0000000002dcc370   R22 = c000000000955b80
R07 = 003524b183dca42e   R23 = c000000000955b80
R08 = 0000000000920000   R24 = 000000000000000a
R09 = c000000003580900   R25 = 0000000000000003
R10 = 0000000000000008   R26 = c000000269474100
R11 = 0000000000000000   R27 = c00000026ffd0000
R12 = c000000000653ba8   R28 = 0000000000000000
R13 = c000000003580900   R29 = c000000000f42100
R14 = c000000269477f90   R30 = c000000000e60890
R15 = 000000000ef03f20   R31 = 0000000000000082
pc  = c00000000000ea9c .__check_irq_replay+0x7c/0x90
cfar= c00000000000ea38 .__check_irq_replay+0x18/0x90
lr  = c00000000001058c .arch_local_irq_restore+0x4c/0x90
msr = 8000000000029032   cr  = 28000028
ctr = c00000000001df50   xer = 0000000000000000   trap =  700
3:mon> t
[link register   ] c00000000001058c .arch_local_irq_restore+0x4c/0x90
[c00000026ffd3e30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffd3ea0] c0000000000857d4 .__do_softirq+0xa4/0x2a0
[c00000026ffd3f90] c000000000022958 .call_do_softirq+0x14/0x24
[c0000002694778e0] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000269477980] c0000000000854c4 .irq_exit+0xc4/0xf0
[c000000269477a00] c00000000001e970 .timer_interrupt+0x120/0x290
[c000000269477ab0] c000000000003a40 decrementer_common+0x140/0x180
--- Exception: 901 (Decrementer) at c0000000000105c4 .arch_local_irq_restore+0x84/0x90
[c000000269477da0] c000000000058400 .pSeries_idle+0x10/0x40 (unreliable)
[c000000269477e10] c000000000017d50 .cpu_idle+0x190/0x290
[c000000269477ed0] c00000000061ab04 .start_secondary+0x348/0x354
[c000000269477f90] c00000000000936c .start_secondary_prolog+0x10/0x14
3:mon> di  c00000000000ea9c
c00000000000ea9c  0b000000	tdnei	r0,0
c00000000000eaa0  38600000	li	r3,0
c00000000000eaa4  4bffffc4	b	c00000000000ea68	# .__check_irq_replay+0x48/0x90
c00000000000eaa8  60000000	nop
 ...
c00000000000eab0  7c0802a6	mflr	r0
c00000000000eab4  fbc1fff0	std	r30,-16(r1)
c00000000000eab8  fba1ffe8	std	r29,-24(r1)
c00000000000eabc  fbe1fff8	std	r31,-8(r1)
c00000000000eac0  ebc28128	ld	r30,-32472(r2)
c00000000000eac4  3d230002	addis	r9,r3,2
c00000000000eac8  f8010010	std	r0,16(r1)
c00000000000eacc  f821ff71	stdu	r1,-144(r1)
c00000000000ead0  38a5ffd8	addi	r5,r5,-40
c00000000000ead4  ebe92060	ld	r31,8288(r9)
c00000000000ead8  80050048	lwz	r0,72(r5)


> 
> If that is the case, then we need to audit the code to figure out how we
> can end up with that bit in irq_happened set and interrupts hard
> enabled.
> 
> Something like may_hard_irq_enable() shouldn't cause it since it should
> only be called while hard disabled but adding a check in there might be
> worth it (something like WARN_ON(mfmsr() & MSR_EE)).
> 
> Cheers,
> Ben.
> 
>> Cheers,
>> Ben.
>>
>>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
>>> ---
>>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
>>>  1 files changed, 30 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>>> index 5ec1b23..3d48b23 100644
>>> --- a/arch/powerpc/kernel/irq.c
>>> +++ b/arch/powerpc/kernel/irq.c
>>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
>>>   */
>>>  notrace unsigned int __check_irq_replay(void)
>>>  {
>>> +	unsigned int ret_val;
>>>  	/*
>>>  	 * We use local_paca rather than get_paca() to avoid all
>>>  	 * the debug_smp_processor_id() business in this low level
>>>  	 * function
>>>  	 */
>>> -	unsigned char happened = local_paca->irq_happened;
>>> +	unsigned char happened, irq_happened;
>>> +	happened = irq_happened = local_paca->irq_happened;
>>>  
>>>  	/* Clear bit 0 which we wouldn't clear otherwise */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
>>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
>>>  
>>>  	/*
>>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
>>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
>>>  	 * decrementer itself rather than the paca irq_happened field
>>>  	 * in case we also had a rollover while hard disabled
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
>>> -	if (decrementer_check_overflow())
>>> -		return 0x900;
>>> +	irq_happened &= ~PACA_IRQ_DEC;
>>> +	if (decrementer_check_overflow()) {
>>> +		ret_val = 0x900;
>>> +		goto replay;
>>> +	}
>>>  
>>>  	/* Finally check if an external interrupt happened */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
>>> -	if (happened & PACA_IRQ_EE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE;
>>> +	if (happened & PACA_IRQ_EE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>>  #ifdef CONFIG_PPC_BOOK3E
>>>  	/* Finally check if an EPR external interrupt happened
>>>  	 * this bit is typically set if we need to handle another
>>>  	 * "edge" interrupt from within the MPIC "EPR" handler
>>>  	 */
>>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> -	if (happened & PACA_IRQ_EE_EDGE)
>>> -		return 0x500;
>>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
>>> +	if (happened & PACA_IRQ_EE_EDGE) {
>>> +		ret_val = 0x500;
>>> +		goto replay;
>>> +	}
>>>  
>>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
>>> -	if (happened & PACA_IRQ_DBELL)
>>> -		return 0x280;
>>> +	irq_happened &= ~PACA_IRQ_DBELL;
>>> +	if (happened & PACA_IRQ_DBELL) {
>>> +		ret_val = 0x280;
>>> +		goto replay;
>>> +	}
>>>  #endif /* CONFIG_PPC_BOOK3E */
>>>  
>>>  	/* There should be nothing left ! */
>>> -	BUG_ON(local_paca->irq_happened != 0);
>>> +	BUG_ON(irq_happened != 0);
>>> +	ret_val = 0;
>>>  
>>> -	return 0;
>>> +replay:
>>> +	local_paca->irq_happened = irq_happened;
>>> +
>>> +	return ret_val;
>>>  }
>>>  
>>>  notrace void arch_local_irq_restore(unsigned long en)
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  5:51       ` Wang Sheng-Hui
@ 2012-05-03  6:52         ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  6:52 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On Thu, 2012-05-03 at 13:51 +0800, Wang Sheng-Hui wrote:
> On 2012年05月03日 12:22, Benjamin Herrenschmidt wrote:
> > 
> >> It should not as __check_irq_replay() should always be called
> >> with interrupts hard disabled... Do you see any code path
> >> where that is not the case ?
> > 
> > More specifically, your backtrace seems to indicate that
> > __check_irq_repay() was called from arch_local_irq_restore() which
> > should have done this before calling __check_irq_replay():
> > 
> > 	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> > 		__hard_irq_disable();
> > 
> > Now, the only possibility that I can see for an interrupt to come in
> > and trip the problem you observed would be if for some reason we
> > had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
> > not hard disabled.
> 
> I have a chance to notice that the value is 0x05, not just 0x01.
> So I think this is not the case.

Well, it depends, the value could have been 0x01 before it hit there... 

However 0x05 means that EE is set too which means it should never have
hard-enabled to begin with. This is all very odd, we'll need to dig.

If the value had been anything other than 0x01 it would have hard
disabled interrupts meaning that paca->irq_happened cannot change
anymore until they are re-enabled at the bottom of the function.

So please try making this disable unconditional see if that makes any
difference...

Cheers,
Ben.
> > Can you try if removing the test (and thus unconditionally calling
> > __hard_irq_disable()) fixes the problem for you ?
> > 
> > If that is the case, then we need to audit the code to figure out how we
> > can end up with that bit in irq_happened set and interrupts hard
> > enabled.
> > 
> > Something like may_hard_irq_enable() shouldn't cause it since it should
> > only be called while hard disabled but adding a check in there might be
> > worth it (something like WARN_ON(mfmsr() & MSR_EE)).
> > 
> > Cheers,
> > Ben.
> > 
> >> Cheers,
> >> Ben.
> >>
> >>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
> >>> ---
> >>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
> >>>  1 files changed, 30 insertions(+), 16 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> >>> index 5ec1b23..3d48b23 100644
> >>> --- a/arch/powerpc/kernel/irq.c
> >>> +++ b/arch/powerpc/kernel/irq.c
> >>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
> >>>   */
> >>>  notrace unsigned int __check_irq_replay(void)
> >>>  {
> >>> +	unsigned int ret_val;
> >>>  	/*
> >>>  	 * We use local_paca rather than get_paca() to avoid all
> >>>  	 * the debug_smp_processor_id() business in this low level
> >>>  	 * function
> >>>  	 */
> >>> -	unsigned char happened = local_paca->irq_happened;
> >>> +	unsigned char happened, irq_happened;
> >>> +	happened = irq_happened = local_paca->irq_happened;
> >>>  
> >>>  	/* Clear bit 0 which we wouldn't clear otherwise */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
> >>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
> >>>  
> >>>  	/*
> >>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
> >>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
> >>>  	 * decrementer itself rather than the paca irq_happened field
> >>>  	 * in case we also had a rollover while hard disabled
> >>>  	 */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> >>> -	if (decrementer_check_overflow())
> >>> -		return 0x900;
> >>> +	irq_happened &= ~PACA_IRQ_DEC;
> >>> +	if (decrementer_check_overflow()) {
> >>> +		ret_val = 0x900;
> >>> +		goto replay;
> >>> +	}
> >>>  
> >>>  	/* Finally check if an external interrupt happened */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
> >>> -	if (happened & PACA_IRQ_EE)
> >>> -		return 0x500;
> >>> +	irq_happened &= ~PACA_IRQ_EE;
> >>> +	if (happened & PACA_IRQ_EE) {
> >>> +		ret_val = 0x500;
> >>> +		goto replay;
> >>> +	}
> >>>  
> >>>  #ifdef CONFIG_PPC_BOOK3E
> >>>  	/* Finally check if an EPR external interrupt happened
> >>>  	 * this bit is typically set if we need to handle another
> >>>  	 * "edge" interrupt from within the MPIC "EPR" handler
> >>>  	 */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
> >>> -	if (happened & PACA_IRQ_EE_EDGE)
> >>> -		return 0x500;
> >>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
> >>> +	if (happened & PACA_IRQ_EE_EDGE) {
> >>> +		ret_val = 0x500;
> >>> +		goto replay;
> >>> +	}
> >>>  
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
> >>> -	if (happened & PACA_IRQ_DBELL)
> >>> -		return 0x280;
> >>> +	irq_happened &= ~PACA_IRQ_DBELL;
> >>> +	if (happened & PACA_IRQ_DBELL) {
> >>> +		ret_val = 0x280;
> >>> +		goto replay;
> >>> +	}
> >>>  #endif /* CONFIG_PPC_BOOK3E */
> >>>  
> >>>  	/* There should be nothing left ! */
> >>> -	BUG_ON(local_paca->irq_happened != 0);
> >>> +	BUG_ON(irq_happened != 0);
> >>> +	ret_val = 0;
> >>>  
> >>> -	return 0;
> >>> +replay:
> >>> +	local_paca->irq_happened = irq_happened;
> >>> +
> >>> +	return ret_val;
> >>>  }
> >>>  
> >>>  notrace void arch_local_irq_restore(unsigned long en)
> >>
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> Please read the FAQ at  http://www.tux.org/lkml/
> > 
> > 



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  6:52         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  6:52 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On Thu, 2012-05-03 at 13:51 +0800, Wang Sheng-Hui wrote:
> On 2012年05月03日 12:22, Benjamin Herrenschmidt wrote:
> > 
> >> It should not as __check_irq_replay() should always be called
> >> with interrupts hard disabled... Do you see any code path
> >> where that is not the case ?
> > 
> > More specifically, your backtrace seems to indicate that
> > __check_irq_repay() was called from arch_local_irq_restore() which
> > should have done this before calling __check_irq_replay():
> > 
> > 	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> > 		__hard_irq_disable();
> > 
> > Now, the only possibility that I can see for an interrupt to come in
> > and trip the problem you observed would be if for some reason we
> > had irq_happened set to PACA_IRQ_HARD_DIS while interrupts were
> > not hard disabled.
> 
> I have a chance to notice that the value is 0x05, not just 0x01.
> So I think this is not the case.

Well, it depends, the value could have been 0x01 before it hit there... 

However 0x05 means that EE is set too which means it should never have
hard-enabled to begin with. This is all very odd, we'll need to dig.

If the value had been anything other than 0x01 it would have hard
disabled interrupts meaning that paca->irq_happened cannot change
anymore until they are re-enabled at the bottom of the function.

So please try making this disable unconditional see if that makes any
difference...

Cheers,
Ben.
> > Can you try if removing the test (and thus unconditionally calling
> > __hard_irq_disable()) fixes the problem for you ?
> > 
> > If that is the case, then we need to audit the code to figure out how we
> > can end up with that bit in irq_happened set and interrupts hard
> > enabled.
> > 
> > Something like may_hard_irq_enable() shouldn't cause it since it should
> > only be called while hard disabled but adding a check in there might be
> > worth it (something like WARN_ON(mfmsr() & MSR_EE)).
> > 
> > Cheers,
> > Ben.
> > 
> >> Cheers,
> >> Ben.
> >>
> >>> Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
> >>> ---
> >>>  arch/powerpc/kernel/irq.c |   46 +++++++++++++++++++++++++++++---------------
> >>>  1 files changed, 30 insertions(+), 16 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> >>> index 5ec1b23..3d48b23 100644
> >>> --- a/arch/powerpc/kernel/irq.c
> >>> +++ b/arch/powerpc/kernel/irq.c
> >>> @@ -137,15 +137,17 @@ static inline notrace int decrementer_check_overflow(void)
> >>>   */
> >>>  notrace unsigned int __check_irq_replay(void)
> >>>  {
> >>> +	unsigned int ret_val;
> >>>  	/*
> >>>  	 * We use local_paca rather than get_paca() to avoid all
> >>>  	 * the debug_smp_processor_id() business in this low level
> >>>  	 * function
> >>>  	 */
> >>> -	unsigned char happened = local_paca->irq_happened;
> >>> +	unsigned char happened, irq_happened;
> >>> +	happened = irq_happened = local_paca->irq_happened;
> >>>  
> >>>  	/* Clear bit 0 which we wouldn't clear otherwise */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
> >>> +	irq_happened &= ~PACA_IRQ_HARD_DIS;
> >>>  
> >>>  	/*
> >>>  	 * Force the delivery of pending soft-disabled interrupts on PS3.
> >>> @@ -161,33 +163,45 @@ notrace unsigned int __check_irq_replay(void)
> >>>  	 * decrementer itself rather than the paca irq_happened field
> >>>  	 * in case we also had a rollover while hard disabled
> >>>  	 */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> >>> -	if (decrementer_check_overflow())
> >>> -		return 0x900;
> >>> +	irq_happened &= ~PACA_IRQ_DEC;
> >>> +	if (decrementer_check_overflow()) {
> >>> +		ret_val = 0x900;
> >>> +		goto replay;
> >>> +	}
> >>>  
> >>>  	/* Finally check if an external interrupt happened */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_EE;
> >>> -	if (happened & PACA_IRQ_EE)
> >>> -		return 0x500;
> >>> +	irq_happened &= ~PACA_IRQ_EE;
> >>> +	if (happened & PACA_IRQ_EE) {
> >>> +		ret_val = 0x500;
> >>> +		goto replay;
> >>> +	}
> >>>  
> >>>  #ifdef CONFIG_PPC_BOOK3E
> >>>  	/* Finally check if an EPR external interrupt happened
> >>>  	 * this bit is typically set if we need to handle another
> >>>  	 * "edge" interrupt from within the MPIC "EPR" handler
> >>>  	 */
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
> >>> -	if (happened & PACA_IRQ_EE_EDGE)
> >>> -		return 0x500;
> >>> +	irq_happened &= ~PACA_IRQ_EE_EDGE;
> >>> +	if (happened & PACA_IRQ_EE_EDGE) {
> >>> +		ret_val = 0x500;
> >>> +		goto replay;
> >>> +	}
> >>>  
> >>> -	local_paca->irq_happened &= ~PACA_IRQ_DBELL;
> >>> -	if (happened & PACA_IRQ_DBELL)
> >>> -		return 0x280;
> >>> +	irq_happened &= ~PACA_IRQ_DBELL;
> >>> +	if (happened & PACA_IRQ_DBELL) {
> >>> +		ret_val = 0x280;
> >>> +		goto replay;
> >>> +	}
> >>>  #endif /* CONFIG_PPC_BOOK3E */
> >>>  
> >>>  	/* There should be nothing left ! */
> >>> -	BUG_ON(local_paca->irq_happened != 0);
> >>> +	BUG_ON(irq_happened != 0);
> >>> +	ret_val = 0;
> >>>  
> >>> -	return 0;
> >>> +replay:
> >>> +	local_paca->irq_happened = irq_happened;
> >>> +
> >>> +	return ret_val;
> >>>  }
> >>>  
> >>>  notrace void arch_local_irq_restore(unsigned long en)
> >>
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> Please read the FAQ at  http://www.tux.org/lkml/
> > 
> > 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  6:33       ` Wang Sheng-Hui
@ 2012-05-03  6:59         ` Wang Sheng-Hui
  -1 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  6:59 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On 2012年05月03日 14:33, Wang Sheng-Hui wrote:
> if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
>> 		__hard_irq_disable();

I have commented out the 2 lines.

FYI.

thanks,

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  6:59         ` Wang Sheng-Hui
  0 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03  6:59 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On 2012年05月03日 14:33, Wang Sheng-Hui wrote:
> if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
>> 		__hard_irq_disable();

I have commented out the 2 lines.

FYI.

thanks,

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  6:59         ` Wang Sheng-Hui
@ 2012-05-03  8:09           ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  8:09 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On Thu, 2012-05-03 at 14:59 +0800, Wang Sheng-Hui wrote:
> On 2012年05月03日 14:33, Wang Sheng-Hui wrote:
> > if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> >> 		__hard_irq_disable();
> 
> I have commented out the 2 lines.

No, Only comment the test, you must absolutely leave the
__hard_irq_disable() call ! That's the whole point of the test, make
sure we unconditionally disable to see if that fixes the problem, in
which case that will tell us that we somewhere accidentally leave
irq_happened set to 0x01 while irqs are hard enabled.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03  8:09           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-03  8:09 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On Thu, 2012-05-03 at 14:59 +0800, Wang Sheng-Hui wrote:
> On 2012年05月03日 14:33, Wang Sheng-Hui wrote:
> > if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
> >> 		__hard_irq_disable();
> 
> I have commented out the 2 lines.

No, Only comment the test, you must absolutely leave the
__hard_irq_disable() call ! That's the whole point of the test, make
sure we unconditionally disable to see if that fixes the problem, in
which case that will tell us that we somewhere accidentally leave
irq_happened set to 0x01 while irqs are hard enabled.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03  8:09           ` Benjamin Herrenschmidt
@ 2012-05-03 23:35             ` Wang Sheng-Hui
  -1 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03 23:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On 2012年05月03日 16:09, Benjamin Herrenschmidt wrote:
> On Thu, 2012-05-03 at 14:59 +0800, Wang Sheng-Hui wrote:
>> On 2012年05月03日 14:33, Wang Sheng-Hui wrote:
>>> if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
>>>> 		__hard_irq_disable();
>>
>> I have commented out the 2 lines.
> 
> No, Only comment the test, you must absolutely leave the
> __hard_irq_disable() call ! That's the whole point of the test, make
> sure we unconditionally disable to see if that fixes the problem, in
> which case that will tell us that we somewhere accidentally leave
> irq_happened set to 0x01 while irqs are hard enabled.

It can work.
My system has been running for about 15 hours without crash.

> 
> Cheers,
> Ben.
> 
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-03 23:35             ` Wang Sheng-Hui
  0 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-03 23:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On 2012年05月03日 16:09, Benjamin Herrenschmidt wrote:
> On Thu, 2012-05-03 at 14:59 +0800, Wang Sheng-Hui wrote:
>> On 2012年05月03日 14:33, Wang Sheng-Hui wrote:
>>> if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
>>>> 		__hard_irq_disable();
>>
>> I have commented out the 2 lines.
> 
> No, Only comment the test, you must absolutely leave the
> __hard_irq_disable() call ! That's the whole point of the test, make
> sure we unconditionally disable to see if that fixes the problem, in
> which case that will tell us that we somewhere accidentally leave
> irq_happened set to 0x01 while irqs are hard enabled.

It can work.
My system has been running for about 15 hours without crash.

> 
> Cheers,
> Ben.
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-03 23:35             ` Wang Sheng-Hui
@ 2012-05-04  0:10               ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-04  0:10 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On Fri, 2012-05-04 at 07:35 +0800, Wang Sheng-Hui wrote:
> > No, Only comment the test, you must absolutely leave the
> > __hard_irq_disable() call ! That's the whole point of the test, make
> > sure we unconditionally disable to see if that fixes the problem, in
> > which case that will tell us that we somewhere accidentally leave
> > irq_happened set to 0x01 while irqs are hard enabled.
> 
> It can work.
> My system has been running for about 15 hours without crash.

Ok, so now we need to understand under what circumstances we end up
in a situation where paca->irq_happened is 0x01 and IRQs are hard
enabled. I have a few ideas of things to look at but I'm also off
for the week-end.

I'll have a look next week.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-04  0:10               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-04  0:10 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On Fri, 2012-05-04 at 07:35 +0800, Wang Sheng-Hui wrote:
> > No, Only comment the test, you must absolutely leave the
> > __hard_irq_disable() call ! That's the whole point of the test, make
> > sure we unconditionally disable to see if that fixes the problem, in
> > which case that will tell us that we somewhere accidentally leave
> > irq_happened set to 0x01 while irqs are hard enabled.
> 
> It can work.
> My system has been running for about 15 hours without crash.

Ok, so now we need to understand under what circumstances we end up
in a situation where paca->irq_happened is 0x01 and IRQs are hard
enabled. I have a few ideas of things to look at but I'm also off
for the week-end.

I'll have a look next week.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-04  0:10               ` Benjamin Herrenschmidt
@ 2012-05-08  3:46                 ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-08  3:46 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

Hi Wang !

Does this patch fixes it for you ?

>From 249f8649bf95a4c3e6637284754a165c1d83c394 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Tue, 8 May 2012 13:31:59 +1000
Subject: [PATCH 2/3] powerpc/irq: Fix bug with new lazy IRQ handling code

We had a case where we could turn on hard interrupts while
leaving the PACA_IRQ_HARD_DIS bit set in the PACA. This can
in turn cause a BUG_ON() to hit in __check_irq_replay() due
to interrupt state getting out of sync.

The assembly code was also way too convoluted. Instead, we
now leave it to the C code to do the right thing which ends
up being smaller and more readable.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/entry_64.S |   18 ------------------
 arch/powerpc/kernel/irq.c      |    8 +++++++-
 2 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index fd46046..29f1357 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -763,16 +763,6 @@ do_work:
 	SOFT_DISABLE_INTS(r3,r4)
 1:	bl	.preempt_schedule_irq
 
-	/* Hard-disable interrupts again (and update PACA) */
-#ifdef CONFIG_PPC_BOOK3E
-	wrteei	0
-#else
-	ld	r10,PACAKMSR(r13) /* Get kernel MSR without EE */
-	mtmsrd	r10,1
-#endif /* CONFIG_PPC_BOOK3E */
-	li	r0,PACA_IRQ_HARD_DIS
-	stb	r0,PACAIRQHAPPENED(r13)
-
 	/* Re-test flags and eventually loop */
 	clrrdi	r9,r1,THREAD_SHIFT
 	ld	r4,TI_FLAGS(r9)
@@ -783,14 +773,6 @@ do_work:
 user_work:
 #endif /* CONFIG_PREEMPT */
 
-	/* Enable interrupts */
-#ifdef CONFIG_PPC_BOOK3E
-	wrteei	1
-#else
-	ori	r10,r10,MSR_EE
-	mtmsrd	r10,1
-#endif /* CONFIG_PPC_BOOK3E */
-
 	andi.	r0,r4,_TIF_NEED_RESCHED
 	beq	1f
 	bl	.restore_interrupts
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 5ec1b23..3717fb5 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -260,11 +260,17 @@ EXPORT_SYMBOL(arch_local_irq_restore);
  * if they are currently disabled. This is typically called before
  * schedule() or do_signal() when returning to userspace. We do it
  * in C to avoid the burden of dealing with lockdep etc...
+ *
+ * NOTE: This is called with interrupts hard disabled but not marked
+ * as such in paca->irq_happened, so we need to resync this.
  */
 void restore_interrupts(void)
 {
-	if (irqs_disabled())
+	if (irqs_disabled()) {
+		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 		local_irq_enable();
+	} else
+		__hard_irq_enable();
 }
 
 #endif /* CONFIG_PPC64 */
-- 
1.7.9.5




^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-08  3:46                 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-08  3:46 UTC (permalink / raw)
  To: Wang Sheng-Hui
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

Hi Wang !

Does this patch fixes it for you ?

>From 249f8649bf95a4c3e6637284754a165c1d83c394 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Tue, 8 May 2012 13:31:59 +1000
Subject: [PATCH 2/3] powerpc/irq: Fix bug with new lazy IRQ handling code

We had a case where we could turn on hard interrupts while
leaving the PACA_IRQ_HARD_DIS bit set in the PACA. This can
in turn cause a BUG_ON() to hit in __check_irq_replay() due
to interrupt state getting out of sync.

The assembly code was also way too convoluted. Instead, we
now leave it to the C code to do the right thing which ends
up being smaller and more readable.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/entry_64.S |   18 ------------------
 arch/powerpc/kernel/irq.c      |    8 +++++++-
 2 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index fd46046..29f1357 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -763,16 +763,6 @@ do_work:
 	SOFT_DISABLE_INTS(r3,r4)
 1:	bl	.preempt_schedule_irq
 
-	/* Hard-disable interrupts again (and update PACA) */
-#ifdef CONFIG_PPC_BOOK3E
-	wrteei	0
-#else
-	ld	r10,PACAKMSR(r13) /* Get kernel MSR without EE */
-	mtmsrd	r10,1
-#endif /* CONFIG_PPC_BOOK3E */
-	li	r0,PACA_IRQ_HARD_DIS
-	stb	r0,PACAIRQHAPPENED(r13)
-
 	/* Re-test flags and eventually loop */
 	clrrdi	r9,r1,THREAD_SHIFT
 	ld	r4,TI_FLAGS(r9)
@@ -783,14 +773,6 @@ do_work:
 user_work:
 #endif /* CONFIG_PREEMPT */
 
-	/* Enable interrupts */
-#ifdef CONFIG_PPC_BOOK3E
-	wrteei	1
-#else
-	ori	r10,r10,MSR_EE
-	mtmsrd	r10,1
-#endif /* CONFIG_PPC_BOOK3E */
-
 	andi.	r0,r4,_TIF_NEED_RESCHED
 	beq	1f
 	bl	.restore_interrupts
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 5ec1b23..3717fb5 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -260,11 +260,17 @@ EXPORT_SYMBOL(arch_local_irq_restore);
  * if they are currently disabled. This is typically called before
  * schedule() or do_signal() when returning to userspace. We do it
  * in C to avoid the burden of dealing with lockdep etc...
+ *
+ * NOTE: This is called with interrupts hard disabled but not marked
+ * as such in paca->irq_happened, so we need to resync this.
  */
 void restore_interrupts(void)
 {
-	if (irqs_disabled())
+	if (irqs_disabled()) {
+		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 		local_irq_enable();
+	} else
+		__hard_irq_enable();
 }
 
 #endif /* CONFIG_PPC64 */
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
  2012-05-08  3:46                 ` Benjamin Herrenschmidt
@ 2012-05-10  5:37                   ` Wang Sheng-Hui
  -1 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-10  5:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Milton Miller, Grant Likely, Stephen Rothwell, Anton Blanchard,
	linuxppc-dev, linux-kernel

On 2012年05月08日 11:46, Benjamin Herrenschmidt wrote:
> Hi Wang !
> 
> Does this patch fixes it for you ?
> 

Sorry, this patch doesn't work. And my system crashed again with the patch.
======================================================
# kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
cpu 0x0: Vector: 700 (Program Check) at [c00000026ffebbb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c000000000010578: .arch_local_irq_restore+0x38/0x90
    sp: c00000026ffebe30
   msr: 8000000000029032
  current = 0xc000000000e27be0
  paca    = 0xc000000003580000	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/0
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
enter ? for help
[link register   ] c000000000010578 .arch_local_irq_restore+0x38/0x90
[c00000026ffebe30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffebea0] c000000000085854 .__do_softirq+0xa4/0x2a0
[c00000026ffebf90] c0000000000229b8 .call_do_softirq+0x14/0x24
[c000000000edf870] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000000edf910] c000000000085544 .irq_exit+0xc4/0xf0
[c000000000edf990] c0000000000100a4 .do_IRQ+0xe4/0x310
[c000000000edfa50] c0000000000038c0 hardware_interrupt_common+0x140/0x180
--- Exception: 501 (Hardware Interrupt) at c0000000000105b4 .arch_local_irq_restore+0x74/0x90
[c000000000edfd40] c000000000058480 .pSeries_idle+0x10/0x40 (unreliable)
[c000000000edfdb0] c000000000017d70 .cpu_idle+0x190/0x290
[c000000000edfe70] c00000000000b308 .rest_init+0x88/0xa0
[c000000000edfef0] c0000000008c0d1c .start_kernel+0x554/0x574
[c000000000edff90] c000000000009658 .start_here_common+0x20/0x48
0:mon> e
cpu 0x0: Vector: 700 (Program Check) at [c00000026ffebbb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c000000000010578: .arch_local_irq_restore+0x38/0x90
    sp: c00000026ffebe30
   msr: 8000000000029032
  current = 0xc000000000e27be0
  paca    = 0xc000000003580000	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/0
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
0:mon> r
R00 = 0000000000000001   R16 = 0000000003680000
R01 = c00000026ffebe30   R17 = 000000000021ed0f
R02 = c000000000edd228   R18 = 000000000021efbb
R03 = 0000000000000500   R19 = 000000000021ee84
R04 = 0000000000000000   R20 = c000000000f42100
R05 = 00000000000007ea   R21 = 0000000000000000
R06 = 00000000273f6d30   R22 = c000000000955b80
R07 = 00363d0e68097e11   R23 = c000000000955b80
R08 = 00000000008c0000   R24 = 000000000000000a
R09 = c000000003580000   R25 = 0000000000000000
R10 = 0000000000000001   R26 = c000000000edc100
R11 = 0000000000000000   R27 = c00000026ffe8000
R12 = 0000000000000002   R28 = 0000000000000000
R13 = c000000003580000   R29 = c000000000f42100
R14 = 0000000002e1fa78   R30 = c000000000e60890
R15 = 0000000001173000   R31 = 0000000000000040
pc  = c00000000000ea9c .__check_irq_replay+0x7c/0x90
cfar= c00000000000ea3c .__check_irq_replay+0x1c/0x90
lr  = c000000000010578 .arch_local_irq_restore+0x38/0x90
msr = 8000000000029032   cr  = 28000048
ctr = c000000000063f70   xer = 0000000000000001   trap =  700
0:mon> t
[link register   ] c000000000010578 .arch_local_irq_restore+0x38/0x90
[c00000026ffebe30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffebea0] c000000000085854 .__do_softirq+0xa4/0x2a0
[c00000026ffebf90] c0000000000229b8 .call_do_softirq+0x14/0x24
[c000000000edf870] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000000edf910] c000000000085544 .irq_exit+0xc4/0xf0
[c000000000edf990] c0000000000100a4 .do_IRQ+0xe4/0x310
[c000000000edfa50] c0000000000038c0 hardware_interrupt_common+0x140/0x180
--- Exception: 501 (Hardware Interrupt) at c0000000000105b4 .arch_local_irq_restore+0x74/0x90
[c000000000edfd40] c000000000058480 .pSeries_idle+0x10/0x40 (unreliable)
[c000000000edfdb0] c000000000017d70 .cpu_idle+0x190/0x290
[c000000000edfe70] c00000000000b308 .rest_init+0x88/0xa0
[c000000000edfef0] c0000000008c0d1c .start_kernel+0x554/0x574
[c000000000edff90] c000000000009658 .start_here_common+0x20/0x48
0:mon> di 



> From 249f8649bf95a4c3e6637284754a165c1d83c394 Mon Sep 17 00:00:00 2001
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Tue, 8 May 2012 13:31:59 +1000
> Subject: [PATCH 2/3] powerpc/irq: Fix bug with new lazy IRQ handling code
> 
> We had a case where we could turn on hard interrupts while
> leaving the PACA_IRQ_HARD_DIS bit set in the PACA. This can
> in turn cause a BUG_ON() to hit in __check_irq_replay() due
> to interrupt state getting out of sync.
> 
> The assembly code was also way too convoluted. Instead, we
> now leave it to the C code to do the right thing which ends
> up being smaller and more readable.
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  arch/powerpc/kernel/entry_64.S |   18 ------------------
>  arch/powerpc/kernel/irq.c      |    8 +++++++-
>  2 files changed, 7 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index fd46046..29f1357 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -763,16 +763,6 @@ do_work:
>  	SOFT_DISABLE_INTS(r3,r4)
>  1:	bl	.preempt_schedule_irq
>  
> -	/* Hard-disable interrupts again (and update PACA) */
> -#ifdef CONFIG_PPC_BOOK3E
> -	wrteei	0
> -#else
> -	ld	r10,PACAKMSR(r13) /* Get kernel MSR without EE */
> -	mtmsrd	r10,1
> -#endif /* CONFIG_PPC_BOOK3E */
> -	li	r0,PACA_IRQ_HARD_DIS
> -	stb	r0,PACAIRQHAPPENED(r13)
> -
>  	/* Re-test flags and eventually loop */
>  	clrrdi	r9,r1,THREAD_SHIFT
>  	ld	r4,TI_FLAGS(r9)
> @@ -783,14 +773,6 @@ do_work:
>  user_work:
>  #endif /* CONFIG_PREEMPT */
>  
> -	/* Enable interrupts */
> -#ifdef CONFIG_PPC_BOOK3E
> -	wrteei	1
> -#else
> -	ori	r10,r10,MSR_EE
> -	mtmsrd	r10,1
> -#endif /* CONFIG_PPC_BOOK3E */
> -
>  	andi.	r0,r4,_TIF_NEED_RESCHED
>  	beq	1f
>  	bl	.restore_interrupts
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5ec1b23..3717fb5 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -260,11 +260,17 @@ EXPORT_SYMBOL(arch_local_irq_restore);
>   * if they are currently disabled. This is typically called before
>   * schedule() or do_signal() when returning to userspace. We do it
>   * in C to avoid the burden of dealing with lockdep etc...
> + *
> + * NOTE: This is called with interrupts hard disabled but not marked
> + * as such in paca->irq_happened, so we need to resync this.
>   */
>  void restore_interrupts(void)
>  {
> -	if (irqs_disabled())
> +	if (irqs_disabled()) {
> +		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
>  		local_irq_enable();
> +	} else
> +		__hard_irq_enable();
>  }
>  
>  #endif /* CONFIG_PPC64 */


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay
@ 2012-05-10  5:37                   ` Wang Sheng-Hui
  0 siblings, 0 replies; 29+ messages in thread
From: Wang Sheng-Hui @ 2012-05-10  5:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-kernel, Milton Miller, Anton Blanchard,
	linuxppc-dev

On 2012年05月08日 11:46, Benjamin Herrenschmidt wrote:
> Hi Wang !
> 
> Does this patch fixes it for you ?
> 

Sorry, this patch doesn't work. And my system crashed again with the patch.
======================================================
# kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
cpu 0x0: Vector: 700 (Program Check) at [c00000026ffebbb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c000000000010578: .arch_local_irq_restore+0x38/0x90
    sp: c00000026ffebe30
   msr: 8000000000029032
  current = 0xc000000000e27be0
  paca    = 0xc000000003580000	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/0
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
enter ? for help
[link register   ] c000000000010578 .arch_local_irq_restore+0x38/0x90
[c00000026ffebe30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffebea0] c000000000085854 .__do_softirq+0xa4/0x2a0
[c00000026ffebf90] c0000000000229b8 .call_do_softirq+0x14/0x24
[c000000000edf870] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000000edf910] c000000000085544 .irq_exit+0xc4/0xf0
[c000000000edf990] c0000000000100a4 .do_IRQ+0xe4/0x310
[c000000000edfa50] c0000000000038c0 hardware_interrupt_common+0x140/0x180
--- Exception: 501 (Hardware Interrupt) at c0000000000105b4 .arch_local_irq_restore+0x74/0x90
[c000000000edfd40] c000000000058480 .pSeries_idle+0x10/0x40 (unreliable)
[c000000000edfdb0] c000000000017d70 .cpu_idle+0x190/0x290
[c000000000edfe70] c00000000000b308 .rest_init+0x88/0xa0
[c000000000edfef0] c0000000008c0d1c .start_kernel+0x554/0x574
[c000000000edff90] c000000000009658 .start_here_common+0x20/0x48
0:mon> e
cpu 0x0: Vector: 700 (Program Check) at [c00000026ffebbb0]
    pc: c00000000000ea9c: .__check_irq_replay+0x7c/0x90
    lr: c000000000010578: .arch_local_irq_restore+0x38/0x90
    sp: c00000026ffebe30
   msr: 8000000000029032
  current = 0xc000000000e27be0
  paca    = 0xc000000003580000	 softe: 0	 irq_happened: 0x01
    pid   = 0, comm = swapper/0
kernel BUG at /usr/src/kernels/linux/arch/powerpc/kernel/irq.c:188!
0:mon> r
R00 = 0000000000000001   R16 = 0000000003680000
R01 = c00000026ffebe30   R17 = 000000000021ed0f
R02 = c000000000edd228   R18 = 000000000021efbb
R03 = 0000000000000500   R19 = 000000000021ee84
R04 = 0000000000000000   R20 = c000000000f42100
R05 = 00000000000007ea   R21 = 0000000000000000
R06 = 00000000273f6d30   R22 = c000000000955b80
R07 = 00363d0e68097e11   R23 = c000000000955b80
R08 = 00000000008c0000   R24 = 000000000000000a
R09 = c000000003580000   R25 = 0000000000000000
R10 = 0000000000000001   R26 = c000000000edc100
R11 = 0000000000000000   R27 = c00000026ffe8000
R12 = 0000000000000002   R28 = 0000000000000000
R13 = c000000003580000   R29 = c000000000f42100
R14 = 0000000002e1fa78   R30 = c000000000e60890
R15 = 0000000001173000   R31 = 0000000000000040
pc  = c00000000000ea9c .__check_irq_replay+0x7c/0x90
cfar= c00000000000ea3c .__check_irq_replay+0x1c/0x90
lr  = c000000000010578 .arch_local_irq_restore+0x38/0x90
msr = 8000000000029032   cr  = 28000048
ctr = c000000000063f70   xer = 0000000000000001   trap =  700
0:mon> t
[link register   ] c000000000010578 .arch_local_irq_restore+0x38/0x90
[c00000026ffebe30] c000000000f42100 softirq_vec+0x0/0x80 (unreliable)
[c00000026ffebea0] c000000000085854 .__do_softirq+0xa4/0x2a0
[c00000026ffebf90] c0000000000229b8 .call_do_softirq+0x14/0x24
[c000000000edf870] c0000000000106c8 .do_softirq+0xf8/0x130
[c000000000edf910] c000000000085544 .irq_exit+0xc4/0xf0
[c000000000edf990] c0000000000100a4 .do_IRQ+0xe4/0x310
[c000000000edfa50] c0000000000038c0 hardware_interrupt_common+0x140/0x180
--- Exception: 501 (Hardware Interrupt) at c0000000000105b4 .arch_local_irq_restore+0x74/0x90
[c000000000edfd40] c000000000058480 .pSeries_idle+0x10/0x40 (unreliable)
[c000000000edfdb0] c000000000017d70 .cpu_idle+0x190/0x290
[c000000000edfe70] c00000000000b308 .rest_init+0x88/0xa0
[c000000000edfef0] c0000000008c0d1c .start_kernel+0x554/0x574
[c000000000edff90] c000000000009658 .start_here_common+0x20/0x48
0:mon> di 



> From 249f8649bf95a4c3e6637284754a165c1d83c394 Mon Sep 17 00:00:00 2001
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Tue, 8 May 2012 13:31:59 +1000
> Subject: [PATCH 2/3] powerpc/irq: Fix bug with new lazy IRQ handling code
> 
> We had a case where we could turn on hard interrupts while
> leaving the PACA_IRQ_HARD_DIS bit set in the PACA. This can
> in turn cause a BUG_ON() to hit in __check_irq_replay() due
> to interrupt state getting out of sync.
> 
> The assembly code was also way too convoluted. Instead, we
> now leave it to the C code to do the right thing which ends
> up being smaller and more readable.
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  arch/powerpc/kernel/entry_64.S |   18 ------------------
>  arch/powerpc/kernel/irq.c      |    8 +++++++-
>  2 files changed, 7 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index fd46046..29f1357 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -763,16 +763,6 @@ do_work:
>  	SOFT_DISABLE_INTS(r3,r4)
>  1:	bl	.preempt_schedule_irq
>  
> -	/* Hard-disable interrupts again (and update PACA) */
> -#ifdef CONFIG_PPC_BOOK3E
> -	wrteei	0
> -#else
> -	ld	r10,PACAKMSR(r13) /* Get kernel MSR without EE */
> -	mtmsrd	r10,1
> -#endif /* CONFIG_PPC_BOOK3E */
> -	li	r0,PACA_IRQ_HARD_DIS
> -	stb	r0,PACAIRQHAPPENED(r13)
> -
>  	/* Re-test flags and eventually loop */
>  	clrrdi	r9,r1,THREAD_SHIFT
>  	ld	r4,TI_FLAGS(r9)
> @@ -783,14 +773,6 @@ do_work:
>  user_work:
>  #endif /* CONFIG_PREEMPT */
>  
> -	/* Enable interrupts */
> -#ifdef CONFIG_PPC_BOOK3E
> -	wrteei	1
> -#else
> -	ori	r10,r10,MSR_EE
> -	mtmsrd	r10,1
> -#endif /* CONFIG_PPC_BOOK3E */
> -
>  	andi.	r0,r4,_TIF_NEED_RESCHED
>  	beq	1f
>  	bl	.restore_interrupts
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5ec1b23..3717fb5 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -260,11 +260,17 @@ EXPORT_SYMBOL(arch_local_irq_restore);
>   * if they are currently disabled. This is typically called before
>   * schedule() or do_signal() when returning to userspace. We do it
>   * in C to avoid the burden of dealing with lockdep etc...
> + *
> + * NOTE: This is called with interrupts hard disabled but not marked
> + * as such in paca->irq_happened, so we need to resync this.
>   */
>  void restore_interrupts(void)
>  {
> -	if (irqs_disabled())
> +	if (irqs_disabled()) {
> +		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
>  		local_irq_enable();
> +	} else
> +		__hard_irq_enable();
>  }
>  
>  #endif /* CONFIG_PPC64 */

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2012-05-10  5:37 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-03  1:53 [PATCH] powerpc: use local var instead of local_paca->irq_happened directly in __check_irq_replay Wang Sheng-Hui
2012-05-03  2:15 ` Benjamin Herrenschmidt
2012-05-03  2:15   ` Benjamin Herrenschmidt
2012-05-03  2:27   ` Wang Sheng-Hui
2012-05-03  2:27     ` Wang Sheng-Hui
2012-05-03  2:32   ` Wang Sheng-Hui
2012-05-03  2:32     ` Wang Sheng-Hui
2012-05-03  4:26     ` Benjamin Herrenschmidt
2012-05-03  4:26       ` Benjamin Herrenschmidt
2012-05-03  4:22   ` Benjamin Herrenschmidt
2012-05-03  4:22     ` Benjamin Herrenschmidt
2012-05-03  5:51     ` Wang Sheng-Hui
2012-05-03  5:51       ` Wang Sheng-Hui
2012-05-03  6:52       ` Benjamin Herrenschmidt
2012-05-03  6:52         ` Benjamin Herrenschmidt
2012-05-03  6:33     ` Wang Sheng-Hui
2012-05-03  6:33       ` Wang Sheng-Hui
2012-05-03  6:59       ` Wang Sheng-Hui
2012-05-03  6:59         ` Wang Sheng-Hui
2012-05-03  8:09         ` Benjamin Herrenschmidt
2012-05-03  8:09           ` Benjamin Herrenschmidt
2012-05-03 23:35           ` Wang Sheng-Hui
2012-05-03 23:35             ` Wang Sheng-Hui
2012-05-04  0:10             ` Benjamin Herrenschmidt
2012-05-04  0:10               ` Benjamin Herrenschmidt
2012-05-08  3:46               ` Benjamin Herrenschmidt
2012-05-08  3:46                 ` Benjamin Herrenschmidt
2012-05-10  5:37                 ` Wang Sheng-Hui
2012-05-10  5:37                   ` Wang Sheng-Hui

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.