* panic kexec broken on ARM64?
@ 2018-06-05  8:01 ` Petr Tesarik
  0 siblings, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2018-06-05  8:01 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
when a panic kernel is loaded. I attached a hardware debugger and found
out that all CPU cores were stopped except one which was stuck in the
idle thread. It seems that irq_set_irqchip_state() may sleep, which is
definitely not safe after a kernel panic.

If I'm right, then this is broken in general, but I have only ever seen
it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
be more subtle. FWIW the code for 32-bit ARM seems to work just fine
without this code in machine_kexec_mask_interrupts():

                /*
                 * First try to remove the active state. If this
                 * fails, try to EOI the interrupt.
                 */
                ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);

I wonder what breaks if this call to irq_set_irqchip_state() is removed.

For reference, here is a stack trace of the process which originally
triggered the panic:

#0  __switch_to (prev=0xffff000008e62a00 <init_task>, next=0xffff80002b796080) at ../arch/arm64/kernel/process.c:355
#1  0xffff0000088f584c in context_switch (rf=<optimized out>, next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2896
#2  __schedule (preempt=false) at ../kernel/sched/core.c:3457
#3  0xffff0000088f5eac in schedule () at ../kernel/sched/core.c:3516
#4  0xffff0000088f9448 in schedule_timeout (timeout=<optimized out>) at ../kernel/time/timer.c:1743
#5  0xffff0000088f6afc in do_wait_for_common (state=<optimized out>, timeout=500, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:77
#6  __wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:96
#7  wait_for_common (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500, state=<optimized out>) at ../kernel/sched/completion.c:104
#8  0xffff0000088f6c1c in wait_for_completion_timeout (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500) at ../kernel/sched/completion.c:144
#9  0xffff000000a19f1c in usb_start_wait_urb (urb=0xffff80002c1cd700, timeout=5000, actual_length=0xffff000008e538dc <init_thread_union+14556>)
    at ../drivers/usb/core/message.c:61
#10 0xffff000000a1a05c in usb_internal_control_msg (timeout=<optimized out>, len=<optimized out>, data=<optimized out>, cmd=<optimized out>, pipe=<optimized out>, 
    usb_dev=<optimized out>) at ../drivers/usb/core/message.c:100
#11 usb_control_msg (dev=0xffff80002c348000, pipe=2147484800, request=161 '\241', requesttype=192 '\300', value=0, index=152, data=0xffff80002b6fa080, size=4, 
    timeout=5000) at ../drivers/usb/core/message.c:151
#12 0xffff000001001cd0 in lan78xx_read_reg (index=152, data=0xffff000008e5396c <init_thread_union+14700>, dev=<optimized out>, dev=<optimized out>)
    at ../drivers/net/usb/lan78xx.c:425
#13 0xffff00000100365c in lan78xx_irq_bus_sync_unlock (irqd=<optimized out>) at ../drivers/net/usb/lan78xx.c:1909
#14 0xffff00000813e590 in chip_bus_sync_unlock (desc=<optimized out>) at ../kernel/irq/internals.h:129
#15 __irq_put_desc_unlock (desc=0xffff80002c361c00, flags=128, bus=true) at ../kernel/irq/irqdesc.c:804
#16 0xffff00000813f604 in irq_put_desc_busunlock (flags=<optimized out>, desc=<optimized out>) at ../kernel/irq/internals.h:155
#17 irq_set_irqchip_state (irq=<optimized out>, which=<optimized out>, val=false) at ../kernel/irq/manage.c:2136
#18 0xffff00000809b7d4 in machine_kexec_mask_interrupts () at ../arch/arm64/kernel/machine_kexec.c:233
#19 machine_crash_shutdown (regs=<optimized out>) at ../arch/arm64/kernel/machine_kexec.c:259
#20 0xffff000008180fd4 in __crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:943
#21 0xffff0000081810e4 in crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:965
#22 0xffff00000808ab58 in die (str=<optimized out>, regs=0xffff000008e53d70 <init_thread_union+15728>, err=-2046820348) at ../arch/arm64/kernel/traps.c:266
#23 0xffff0000080a1c14 in __do_kernel_fault (mm=0x0, addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:226
#24 0xffff0000088fc8dc in do_page_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:476
#25 0xffff0000088fccdc in do_translation_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:502
#26 0xffff000008081478 in do_mem_abort (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:657
#27 0xffff000008082dd0 in el1_sync () at ../arch/arm64/kernel/entry.S:548

Petr T

* panic kexec broken on ARM64?
  2018-06-05  8:01 ` Petr Tesarik
@ 2018-06-05 17:46   ` James Morse
  -1 siblings, 0 replies; 42+ messages in thread
From: James Morse @ 2018-06-05 17:46 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Petr,

(CC: +Akashi, Marc)

On 05/06/18 09:01, Petr Tesarik wrote:
> I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
> when a panic kernel is loaded.

kdump is a best-effort thing; it looks like this is a case where the
crashed kernel can't tear itself down.

Do you have the rest of the stack trace? Was it handling an irq when it decided
to panic?:
https://lkml.org/lkml/2018/3/13/1134


> I attached a hardware debugger and found
> out that all CPU cores were stopped except one which was stuck in the
> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> definitely not safe after a kernel panic.

I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
raw_spin_lock() and calls gic_irq_get_irqchip_state(), which is just poking
around in MMIO registers. This should all be safe unless you re-entered the
same code.


> If I'm right, then this is broken in general, but I have only ever seen
> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> be more subtle.

Is there a hardware difference around the interrupt controller on these?


> FWIW the code for 32-bit ARM seems to work just fine
> without this code in machine_kexec_mask_interrupts():
> 
>                 /*
>                  * First try to remove the active state. If this
>                  * fails, try to EOI the interrupt.
>                  */
>                 ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);
> 
> I wonder what breaks if this call to irq_set_irqchip_state() is removed.

My understanding is this is to reset all interrupts so the new kernel doesn't
spend its first waking minutes declaring all these pending interrupts as
spurious as the device drivers haven't (re-)claimed them yet.

I don't know if/how this is done on 32-bit.


> For reference, here is a stack trace of the process which originally
> triggered the panic:
> 
> #0  __switch_to (prev=0xffff000008e62a00 <init_task>, next=0xffff80002b796080) at ../arch/arm64/kernel/process.c:355
> #1  0xffff0000088f584c in context_switch (rf=<optimized out>, next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2896
> #2  __schedule (preempt=false) at ../kernel/sched/core.c:3457
> #3  0xffff0000088f5eac in schedule () at ../kernel/sched/core.c:3516
> #4  0xffff0000088f9448 in schedule_timeout (timeout=<optimized out>) at ../kernel/time/timer.c:1743
> #5  0xffff0000088f6afc in do_wait_for_common (state=<optimized out>, timeout=500, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:77
> #6  __wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:96
> #7  wait_for_common (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500, state=<optimized out>) at ../kernel/sched/completion.c:104
> #8  0xffff0000088f6c1c in wait_for_completion_timeout (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500) at ../kernel/sched/completion.c:144
> #9  0xffff000000a19f1c in usb_start_wait_urb (urb=0xffff80002c1cd700, timeout=5000, actual_length=0xffff000008e538dc <init_thread_union+14556>)
>     at ../drivers/usb/core/message.c:61

(USB?!)

> #10 0xffff000000a1a05c in usb_internal_control_msg (timeout=<optimized out>, len=<optimized out>, data=<optimized out>, cmd=<optimized out>, pipe=<optimized out>, 
>     usb_dev=<optimized out>) at ../drivers/usb/core/message.c:100
> #11 usb_control_msg (dev=0xffff80002c348000, pipe=2147484800, request=161 '\241', requesttype=192 '\300', value=0, index=152, data=0xffff80002b6fa080, size=4, 
>     timeout=5000) at ../drivers/usb/core/message.c:151
> #12 0xffff000001001cd0 in lan78xx_read_reg (index=152, data=0xffff000008e5396c <init_thread_union+14700>, dev=<optimized out>, dev=<optimized out>)
>     at ../drivers/net/usb/lan78xx.c:425
> #13 0xffff00000100365c in lan78xx_irq_bus_sync_unlock (irqd=<optimized out>) at ../drivers/net/usb/lan78xx.c:1909

I'm not sure what these 'struct irq_chip' outside drivers/irqchip are,
presumably irq-controllers can be nested, and devices believe they are interrupt
controllers too.

This looks like yours is actually a network chip on the other end of a USB bus.
Any configuration attempt involves taking mutexes, allocating memory and sitting
on a wait queue until the response comes (all relying on a different kind of
interrupt).

So for this network-irqcontroller-chip it's not safe to call
irq_set_irqchip_state() from irq context. (You also survived taking a mutex and
allocating a few buffers before hitting the wait queue.)

I'm not sure how this should be fixed, but as suggested on that irqchip thread
above, having a separate irqchip-specific 'reset' API could do something more
drastic than trying to modify the configuration, which requires these
locks and memory allocations.
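
To make the distinction concrete, here is a minimal user-space sketch of
the idea. It is only a model: the names (model_irq_chip, model_crash_mask
and so on) are hypothetical and this is not kernel code or a proposed
patch. It assumes that a chip behind a slow bus can be recognised by a
non-NULL bus-sync-unlock callback, which is the step that ended up in
usb_control_msg() in the trace, and it shows a crash path that skips such
chips instead of calling into them.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an interrupt chip; NOT the kernel's struct irq_chip. */
struct model_irq_chip {
	const char *name;
	/* Non-NULL when the chip sits behind a slow bus and this step may sleep. */
	void (*bus_sync_unlock)(void);
	/* Clears the active state by register access only; never sleeps. */
	int (*clear_active)(unsigned int irq);
};

/* Counts how many interrupts had their active state cleared. */
static unsigned int model_cleared;

static int model_mmio_clear_active(unsigned int irq)
{
	(void)irq;
	model_cleared++;
	return 0;
}

/* Stands in for a blocking bus transfer (for example a USB control message). */
static void model_usb_sync_unlock(void)
{
}

static const struct model_irq_chip model_gic = {
	"gic-like", NULL, model_mmio_clear_active
};
static const struct model_irq_chip model_usb_nic = {
	"usb-nic-like", model_usb_sync_unlock, model_mmio_clear_active
};

/* Assumption: a bus-sync callback means that touching the chip may sleep. */
static bool model_chip_may_sleep(const struct model_irq_chip *chip)
{
	return chip->bus_sync_unlock != NULL;
}

/*
 * Crash-path walk over all chips: clear the active state only where that
 * cannot sleep. Returns the number of chips that were skipped.
 */
static unsigned int model_crash_mask(const struct model_irq_chip *const *chips,
				     size_t n)
{
	unsigned int skipped = 0;

	for (size_t i = 0; i < n; i++) {
		if (model_chip_may_sleep(chips[i])) {
			skipped++;
			continue;
		}
		chips[i]->clear_active((unsigned int)i);
	}
	return skipped;
}
```

Whether skipping such chips is acceptable, or whether they need the
dedicated 'reset' hook mentioned above, is exactly the open question; the
sketch only shows where the decision would sit.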


> #14 0xffff00000813e590 in chip_bus_sync_unlock (desc=<optimized out>) at ../kernel/irq/internals.h:129
> #15 __irq_put_desc_unlock (desc=0xffff80002c361c00, flags=128, bus=true) at ../kernel/irq/irqdesc.c:804
> #16 0xffff00000813f604 in irq_put_desc_busunlock (flags=<optimized out>, desc=<optimized out>) at ../kernel/irq/internals.h:155
> #17 irq_set_irqchip_state (irq=<optimized out>, which=<optimized out>, val=false) at ../kernel/irq/manage.c:2136
> #18 0xffff00000809b7d4 in machine_kexec_mask_interrupts () at ../arch/arm64/kernel/machine_kexec.c:233
> #19 machine_crash_shutdown (regs=<optimized out>) at ../arch/arm64/kernel/machine_kexec.c:259
> #20 0xffff000008180fd4 in __crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:943
> #21 0xffff0000081810e4 in crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:965
> #22 0xffff00000808ab58 in die (str=<optimized out>, regs=0xffff000008e53d70 <init_thread_union+15728>, err=-2046820348) at ../arch/arm64/kernel/traps.c:266
> #23 0xffff0000080a1c14 in __do_kernel_fault (mm=0x0, addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:226
> #24 0xffff0000088fc8dc in do_page_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:476
> #25 0xffff0000088fccdc in do_translation_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:502
> #26 0xffff000008081478 in do_mem_abort (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:657
> #27 0xffff000008082dd0 in el1_sync () at ../arch/arm64/kernel/entry.S:548

What was going on just before this NULL dereference? This looks like CPU0's idle
thread stack, which rules out another irq.


Thanks,

James

* panic kexec broken on ARM64?
  2018-06-05  8:01 ` Petr Tesarik
@ 2018-06-06  5:36   ` Bhupesh Sharma
  -1 siblings, 0 replies; 42+ messages in thread
From: Bhupesh Sharma @ 2018-06-06  5:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hello Petr,

On Tue, Jun 5, 2018 at 1:31 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> Hi all,
>
> I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
> when a panic kernel is loaded. I attached a hardware debugger and found
> out that all CPU cores were stopped except one which was stuck in the
> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> definitely not safe after a kernel panic.

Normally we limit the number of CPUs in the crash kernel to 1 (via
maxcpus=1) in distributions like Fedora. Can you please share the
command-line/bootargs which you use to invoke the crash kernel?

Also, do you get any console output from the crash kernel (you can try
passing earlycon to the crash kernel to see if it crashes early
enough)? If yes, can you please share it?

> If I'm right, then this is broken in general, but I have only ever seen
> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> be more subtle. FWIW the code for 32-bit ARM seems to work just fine
> without this code in machine_kexec_mask_interrupts():
>
>                 /*
>                  * First try to remove the active state. If this
>                  * fails, try to EOI the interrupt.
>                  */
>                 ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);
>
> I wonder what breaks if this call to irq_set_irqchip_state() is removed.
>
> For reference, here is a stack trace of the process which originally
> triggered the panic:
>
> #0  __switch_to (prev=0xffff000008e62a00 <init_task>, next=0xffff80002b796080) at ../arch/arm64/kernel/process.c:355
> #1  0xffff0000088f584c in context_switch (rf=<optimized out>, next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2896
> #2  __schedule (preempt=false) at ../kernel/sched/core.c:3457
> #3  0xffff0000088f5eac in schedule () at ../kernel/sched/core.c:3516
> #4  0xffff0000088f9448 in schedule_timeout (timeout=<optimized out>) at ../kernel/time/timer.c:1743
> #5  0xffff0000088f6afc in do_wait_for_common (state=<optimized out>, timeout=500, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:77
> #6  __wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:96
> #7  wait_for_common (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500, state=<optimized out>) at ../kernel/sched/completion.c:104
> #8  0xffff0000088f6c1c in wait_for_completion_timeout (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500) at ../kernel/sched/completion.c:144
> #9  0xffff000000a19f1c in usb_start_wait_urb (urb=0xffff80002c1cd700, timeout=5000, actual_length=0xffff000008e538dc <init_thread_union+14556>)
>     at ../drivers/usb/core/message.c:61
> #10 0xffff000000a1a05c in usb_internal_control_msg (timeout=<optimized out>, len=<optimized out>, data=<optimized out>, cmd=<optimized out>, pipe=<optimized out>,
>     usb_dev=<optimized out>) at ../drivers/usb/core/message.c:100
> #11 usb_control_msg (dev=0xffff80002c348000, pipe=2147484800, request=161 '\241', requesttype=192 '\300', value=0, index=152, data=0xffff80002b6fa080, size=4,
>     timeout=5000) at ../drivers/usb/core/message.c:151
> #12 0xffff000001001cd0 in lan78xx_read_reg (index=152, data=0xffff000008e5396c <init_thread_union+14700>, dev=<optimized out>, dev=<optimized out>)
>     at ../drivers/net/usb/lan78xx.c:425

Hmmm, this seems a bit misplaced, but are you using a usb-ethernet
adapter which causes a URB submission to timeout?

> #13 0xffff00000100365c in lan78xx_irq_bus_sync_unlock (irqd=<optimized out>) at ../drivers/net/usb/lan78xx.c:1909
> #14 0xffff00000813e590 in chip_bus_sync_unlock (desc=<optimized out>) at ../kernel/irq/internals.h:129
> #15 __irq_put_desc_unlock (desc=0xffff80002c361c00, flags=128, bus=true) at ../kernel/irq/irqdesc.c:804
> #16 0xffff00000813f604 in irq_put_desc_busunlock (flags=<optimized out>, desc=<optimized out>) at ../kernel/irq/internals.h:155
> #17 irq_set_irqchip_state (irq=<optimized out>, which=<optimized out>, val=false) at ../kernel/irq/manage.c:2136
> #18 0xffff00000809b7d4 in machine_kexec_mask_interrupts () at ../arch/arm64/kernel/machine_kexec.c:233
> #19 machine_crash_shutdown (regs=<optimized out>) at ../arch/arm64/kernel/machine_kexec.c:259
> #20 0xffff000008180fd4 in __crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:943
> #21 0xffff0000081810e4 in crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:965
> #22 0xffff00000808ab58 in die (str=<optimized out>, regs=0xffff000008e53d70 <init_thread_union+15728>, err=-2046820348) at ../arch/arm64/kernel/traps.c:266
> #23 0xffff0000080a1c14 in __do_kernel_fault (mm=0x0, addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:226
> #24 0xffff0000088fc8dc in do_page_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:476
> #25 0xffff0000088fccdc in do_translation_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:502
> #26 0xffff000008081478 in do_mem_abort (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:657
> #27 0xffff000008082dd0 in el1_sync () at ../arch/arm64/kernel/entry.S:548

The ESR value from the logs (2248146948, i.e. 0x86000004) indicates the
following about the cause of the panic (see the ARMv8 Architecture
Reference Manual for more details):

EC -> 100001, Instruction Abort taken without a change in Exception
level (Used for MMU faults generated by instruction accesses and
Synchronous external aborts, including synchronous parity or ECC
errors. Not used for debug related exceptions.)
FnV -> 0, FAR is valid
IFSC -> 000100, Translation fault, level 0

So in brief, an Instruction Abort was taken at exception level EL1(?),
caused by a translation fault at level 0, and the FAR register holds
the faulting virtual address.
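
The decoding above can be reproduced with a few shifts and masks. The
following is a small sketch in plain C; the helper names (esr_ec, esr_il,
esr_fnv, esr_ifsc) are made up for illustration and are not the kernel's
own macros. Only the bit positions are taken from the ESR_ELx layout for
an Instruction Abort.

```c
#include <stdint.h>

/*
 * Hypothetical helper names; only the bit positions come from the
 * ESR_ELx layout for an Instruction Abort.
 */

/* EC, exception class: bits [31:26]. */
static unsigned int esr_ec(uint32_t esr)
{
	return (esr >> 26) & 0x3fu;
}

/* IL, instruction length: bit [25]. */
static unsigned int esr_il(uint32_t esr)
{
	return (esr >> 25) & 0x1u;
}

/* FnV, FAR not valid: bit [10] of the ISS. */
static unsigned int esr_fnv(uint32_t esr)
{
	return (esr >> 10) & 0x1u;
}

/* IFSC, instruction fault status code: bits [5:0] of the ISS. */
static unsigned int esr_ifsc(uint32_t esr)
{
	return esr & 0x3fu;
}
```

For the value in the trace, 2248146948 (0x86000004), these give
EC = 0x21 (binary 100001), FnV = 0 and IFSC = 0x04 (binary 000100),
matching the fields listed above. The err=-2046820348 argument of die()
in frame #22 is the same 32-bit pattern printed as a signed integer.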

Since you have the hardware debugger, can you check the values of the
FAR (Fault Address Register) and ELR registers at this point? They
should indicate the faulting address and the instruction from which the
exception was taken, which would help with debugging.

If you can share the earlycon messages from the crash kernel and the
values of the above registers, I can help you further with debugging
the issue you are seeing.

Thanks,
Bhupesh

* Re: panic kexec broken on ARM64?
@ 2018-06-06  5:36   ` Bhupesh Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Bhupesh Sharma @ 2018-06-06  5:36 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Matthias Brugger, kexec mailing list, linux-arm-kernel

Hello Petr,

On Tue, Jun 5, 2018 at 1:31 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> Hi all,
>
> I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
> when a panic kernel is loaded. I attached a hardware debugger and found
> out that all CPU cores were stopped except one which was stuck in the
> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> definitely not safe after a kernel panic.

Normally we limit the number of CPUs in the crash kernel to 1 (via
maxcpus=1) in distributions like Fedora. Can you please share the
command-line/bootargs which you use to invoke the crash kernel?

Also do you get any console output from the crash kernel (you can try
passing earlycon to the crash kernel to see if it crashes early
enough)? If yes, can you please share the same.

> If I'm right, then this is broken in general, but I have only ever seen
> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> be more subtle. FWIW the code for 32-bit ARM seems to work just fine
> without this code in machine_kexec_mask_interrupts():
>
>                 /*
>                  * First try to remove the active state. If this
>                  * fails, try to EOI the interrupt.
>                  */
>                 ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);
>
> I wonder what breaks if this call to irq_set_irqchip_state() is removed.
>
> For reference, here is a stack trace of the process which originally
> triggered the panic:
>
> #0  __switch_to (prev=0xffff000008e62a00 <init_task>, next=0xffff80002b796080) at ../arch/arm64/kernel/process.c:355
> #1  0xffff0000088f584c in context_switch (rf=<optimized out>, next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2896
> #2  __schedule (preempt=false) at ../kernel/sched/core.c:3457
> #3  0xffff0000088f5eac in schedule () at ../kernel/sched/core.c:3516
> #4  0xffff0000088f9448 in schedule_timeout (timeout=<optimized out>) at ../kernel/time/timer.c:1743
> #5  0xffff0000088f6afc in do_wait_for_common (state=<optimized out>, timeout=500, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:77
> #6  __wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>) at ../kernel/sched/completion.c:96
> #7  wait_for_common (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500, state=<optimized out>) at ../kernel/sched/completion.c:104
> #8  0xffff0000088f6c1c in wait_for_completion_timeout (x=0xffff000008e53848 <init_thread_union+14408>, timeout=500) at ../kernel/sched/completion.c:144
> #9  0xffff000000a19f1c in usb_start_wait_urb (urb=0xffff80002c1cd700, timeout=5000, actual_length=0xffff000008e538dc <init_thread_union+14556>)
>     at ../drivers/usb/core/message.c:61
> #10 0xffff000000a1a05c in usb_internal_control_msg (timeout=<optimized out>, len=<optimized out>, data=<optimized out>, cmd=<optimized out>, pipe=<optimized out>,
>     usb_dev=<optimized out>) at ../drivers/usb/core/message.c:100
> #11 usb_control_msg (dev=0xffff80002c348000, pipe=2147484800, request=161 '\241', requesttype=192 '\300', value=0, index=152, data=0xffff80002b6fa080, size=4,
>     timeout=5000) at ../drivers/usb/core/message.c:151
> #12 0xffff000001001cd0 in lan78xx_read_reg (index=152, data=0xffff000008e5396c <init_thread_union+14700>, dev=<optimized out>, dev=<optimized out>)
>     at ../drivers/net/usb/lan78xx.c:425

Hmmm, this seems a bit misplaced, but are you using a usb-ethernet
adapter which causes a URB submission to timeout?

> #13 0xffff00000100365c in lan78xx_irq_bus_sync_unlock (irqd=<optimized out>) at ../drivers/net/usb/lan78xx.c:1909
> #14 0xffff00000813e590 in chip_bus_sync_unlock (desc=<optimized out>) at ../kernel/irq/internals.h:129
> #15 __irq_put_desc_unlock (desc=0xffff80002c361c00, flags=128, bus=true) at ../kernel/irq/irqdesc.c:804
> #16 0xffff00000813f604 in irq_put_desc_busunlock (flags=<optimized out>, desc=<optimized out>) at ../kernel/irq/internals.h:155
> #17 irq_set_irqchip_state (irq=<optimized out>, which=<optimized out>, val=false) at ../kernel/irq/manage.c:2136
> #18 0xffff00000809b7d4 in machine_kexec_mask_interrupts () at ../arch/arm64/kernel/machine_kexec.c:233
> #19 machine_crash_shutdown (regs=<optimized out>) at ../arch/arm64/kernel/machine_kexec.c:259
> #20 0xffff000008180fd4 in __crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:943
> #21 0xffff0000081810e4 in crash_kexec (regs=0xffff000008e53d70 <init_thread_union+15728>) at ../kernel/kexec_core.c:965
> #22 0xffff00000808ab58 in die (str=<optimized out>, regs=0xffff000008e53d70 <init_thread_union+15728>, err=-2046820348) at ../arch/arm64/kernel/traps.c:266
> #23 0xffff0000080a1c14 in __do_kernel_fault (mm=0x0, addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:226
> #24 0xffff0000088fc8dc in do_page_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:476
> #25 0xffff0000088fccdc in do_translation_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:502
> #26 0xffff000008081478 in do_mem_abort (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:657
> #27 0xffff000008082dd0 in el1_sync () at ../arch/arm64/kernel/entry.S:548

The ESR value from the logs (2248146948) indicates the following about
the panic causes (see ARMv8 Architecture Reference Manual for more
details):

EC -> 100001, Instruction Abort taken without a change in Exception
level (Used for MMU faults generated by instruction accesses and
Synchronous external aborts, including synchronous parity or ECC
errors. Not used for debug related exceptions.)
FnV -> 0, FAR is valid
IFSC -> 000100, Translation fault, level 0

So in brief an Instruction Abort was taken at exception level EL1(?)
which was caused by a translation fault at level 0 and the FAR
register holds the faulting Virtual Address.

So, since you have the hardware debugger, can you check the values of
the FAR (Fault Address Register) and ELR registers at this point and see
whether they indicate the faulting address from which the exception was
taken? That should help with debugging.

If you can share the earlycon messages from the crashkernel and the
values of the above registers, I can help you further with debugging the
issue you are seeing.

Thanks,
Bhupesh

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-06-05 17:46   ` James Morse
@ 2018-06-06  7:02     ` Stefan Wahren
  -1 siblings, 0 replies; 42+ messages in thread
From: Stefan Wahren @ 2018-06-06  7:02 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Petr,

Am 05.06.2018 um 19:46 schrieb James Morse:
> Hi Petr,
>
> (CC: +Akashi, Marc)
>
> On 05/06/18 09:01, Petr Tesarik wrote:
>> I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
>> when a panic kernel is loaded.
> kdump is a best-effort thing, it looks like this is a case where the
> crashed-kernel can't tear itself down.
>
> Do you have the rest of the stack trace? Was it handling an irq when it decided
> to panic?:
> https://lkml.org/lkml/2018/3/13/1134

The Raspberry Pi 3 B+ support is very fresh (linux-next). Since I didn't
see a version, I need to double-check.

You are actually using linux-next and not the downstream kernel?

>
>> I attached a hardware debugger and found
>> out that all CPU cores were stopped except one which was stuck in the
>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>> definitely not safe after a kernel panic.
> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> around in mmio registers, this should all be safe unless you re-entered the same
> code.
>
>
>> If I'm right, then this is broken in general, but I have only ever seen
>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>> be more subtle.
> Is there a hardware difference around the interrupt controller on these?
@James:
No, but the RPi 3 B has a different USB network chip on board (smsc95xx, 
Fast ethernet) instead of lan78xx (Gigabit ethernet).

Stefan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-06-06  5:36   ` Bhupesh Sharma
@ 2018-06-06  7:58     ` Petr Tesarik
  -1 siblings, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2018-06-06  7:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 6 Jun 2018 11:06:28 +0530
Bhupesh Sharma <bhsharma@redhat.com> wrote:

> Hello Petr,
> 
> On Tue, Jun 5, 2018 at 1:31 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> > Hi all,
> >
> > I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
> > when a panic kernel is loaded. I attached a hardware debugger and found
> > out that all CPU cores were stopped except one which was stuck in the
> > idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> > definitely not safe after a kernel panic.  
>[...]
> Also do you get any console output from the crash kernel (you can try
> passing earlycon to the crash kernel to see if it crashes early
> enough)? If yes, can you please share the same.

Maybe I wasn't clear enough. The crashed kernel does not even get to
the kexec's purgatory code, so there cannot be any output from the
crash kernel.

>[...]
> > #12 0xffff000001001cd0 in lan78xx_read_reg (index=152, data=0xffff000008e5396c <init_thread_union+14700>, dev=<optimized out>, dev=<optimized out>)
> >     at ../drivers/net/usb/lan78xx.c:425  
> 
> Hmmm, this seems a bit misplaced, but are you using a usb-ethernet
> adapter which causes a URB submission to timeout?

Yes, RPi 3 B+ contains an on-board Microchip LAN7515, which is indeed a
USB device.

>[...]
> > #26 0xffff000008081478 in do_mem_abort (addr=0, esr=2248146948, regs=0xffff000008e53d70 <init_thread_union+15728>) at ../arch/arm64/mm/fault.c:657
> > #27 0xffff000008082dd0 in el1_sync () at ../arch/arm64/kernel/entry.S:548  
> 
> The ESR value from the logs (2248146948) indicates the following about
> the panic causes (see ARMv8 Architecture Reference Manual for more
> details):

Thank you for all the details, but that's not relevant here. I am
crashing my system intentionally to debug the kexec/kdump
infrastructure...

Petr T

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-06-06  7:02     ` Stefan Wahren
@ 2018-06-06  8:00       ` Petr Tesarik
  -1 siblings, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2018-06-06  8:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 6 Jun 2018 09:02:04 +0200
Stefan Wahren <stefan.wahren@i2se.com> wrote:

> Hi Petr,
> 
> Am 05.06.2018 um 19:46 schrieb James Morse:
> > Hi Petr,
> >
> > (CC: +Akashi, Marc)
> >
> > On 05/06/18 09:01, Petr Tesarik wrote:  
> >> I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
> >> when a panic kernel is loaded.  
> > kdump is a best-effort thing, it looks like this is a case where the
> > crashed-kernel can't tear itself down.
> >
> > Do you have the rest of the stack trace? Was it handling an irq when it decided
> > to panic?:
> > https://lkml.org/lkml/2018/3/13/1134  
> 
> the Raspberry Pi 3 B+ support is very fresh (linux-next). Since i didn't 
> see a version, i need to doublecheck.
> 
> You are actually using linux-next and not the downstream kernel?

Very good point. I'll try again with linux-next.

Stay tuned,
Petr T

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-06-06  7:02     ` Stefan Wahren
@ 2018-06-06 11:37       ` James Morse
  -1 siblings, 0 replies; 42+ messages in thread
From: James Morse @ 2018-06-06 11:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Stefan,

On 06/06/18 08:02, Stefan Wahren wrote:
> Am 05.06.2018 um 19:46 schrieb James Morse:
>> On 05/06/18 09:01, Petr Tesarik wrote:
>>> I attached a hardware debugger and found
>>> out that all CPU cores were stopped except one which was stuck in the
>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>> definitely not safe after a kernel panic.

>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
>> around in mmio registers, this should all be safe unless you re-entered the same
>> code.

>>> If I'm right, then this is broken in general, but I have only ever seen
>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>> be more subtle.

>> Is there a hardware difference around the interrupt controller on these?

> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> ethernet) instead of lan78xx (Gigabit ethernet).

Bingo: it's the lan78xx driver that is sleeping from the irqchip callbacks. The
smsc95xx driver doesn't have a struct irq_chip, which is why the RPi-3-B doesn't
do this.
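
To make the call path concrete, here is a small user-space model in C. It
is only a sketch: struct model_irq_chip and every function name in it are
hypothetical stand-ins, not the kernel's struct irq_chip or the lan78xx
driver code. It assumes, based on the backtrace earlier in the thread, that
the state update is bracketed by the chip's bus lock/unlock callbacks and
that the lan78xx unlock callback performs a blocking USB transfer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct irq_chip. */
struct model_irq_chip {
	const char *name;
	void (*bus_lock)(void);         /* models irq_bus_lock */
	void (*bus_sync_unlock)(void);  /* models irq_bus_sync_unlock */
	int  (*set_state)(bool val);    /* models irq_set_irqchip_state */
};

/* Set when a callback would have blocked (i.e. called into the scheduler). */
static bool slept;

static void usb_bus_lock(void) { /* would take a mutex in a real driver */ }

/* Models a register access over USB that waits for the transfer to finish. */
static void usb_bus_sync_unlock(void) { slept = true; }

/* Models a controller that only pokes memory-mapped registers. */
static int mmio_set_state(bool val) { (void)val; return 0; }

/* A chip with no bus callbacks, like the GIC case described above. */
static const struct model_irq_chip gic_like = {
	"gic-like", NULL, NULL, mmio_set_state
};

/*
 * A chip behind a slow bus. Whether the real driver implements a
 * set-state callback is not shown in the thread, so it is left unset.
 */
static const struct model_irq_chip lan78xx_like = {
	"lan78xx-like", usb_bus_lock, usb_bus_sync_unlock, NULL
};

/* Lock the bus, update the state, sync the bus; report whether we slept. */
static bool model_set_state_sleeps(const struct model_irq_chip *chip)
{
	slept = false;
	if (chip->bus_lock)
		chip->bus_lock();
	if (chip->set_state)
		chip->set_state(false);
	if (chip->bus_sync_unlock)
		chip->bus_sync_unlock();
	return slept;
}
```

In this model the gic-like chip never sleeps, while the lan78xx-like chip
does so from its unlock callback even though it has no state to update.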

It may be valid for kdump to only tear down the 'root irqdomain' (if that even
means anything). I assume these secondary irqchips would have a
summary-interrupt that goes to another irqchip. But I can't see a way to tell
them apart...

I think we need to wait until after the merge window for Marc's wisdom on this!


Thanks,

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-06-06  8:00       ` Petr Tesarik
@ 2018-06-06 11:41         ` Petr Tesarik
  -1 siblings, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2018-06-06 11:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 6 Jun 2018 10:00:24 +0200
Petr Tesarik <ptesarik@suse.cz> wrote:

> On Wed, 6 Jun 2018 09:02:04 +0200
> Stefan Wahren <stefan.wahren@i2se.com> wrote:
> 
> > Hi Petr,
> > 
> > Am 05.06.2018 um 19:46 schrieb James Morse:  
> > > Hi Petr,
> > >
> > > (CC: +Akashi, Marc)
> > >
> > > On 05/06/18 09:01, Petr Tesarik wrote:    
> > >> I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
> > >> when a panic kernel is loaded.    
> > > kdump is a best-effort thing, it looks like this is a case where the
> > > crashed-kernel can't tear itself down.
> > >
> > > Do you have the rest of the stack trace? Was it handling an irq when it decided
> > > to panic?:
> > > https://lkml.org/lkml/2018/3/13/1134    
> > 
> > the Raspberry Pi 3 B+ support is very fresh (linux-next). Since i didn't 
> > see a version, i need to doublecheck.
> > 
> > You are actually using linux-next and not the downstream kernel?  
> 
> Very good point. I'll try again with linux-next.

It took me some time to set up everything correctly again...

Unfortunately, it makes no difference. I set a hardware breakpoint on
machine_crash_shutdown, followed by a breakpoint at __switch_to, and it
did trigger:

(gdb) lx-version 
Linux version 4.17.0-next-20180605-18-default (root@thunderx10) (gcc version 4.8.5 (SUSE Linux)) #1 SMP Wed Jun 6 10:26:46 CEST 2018
(gdb) bt
#0  __switch_to (prev=0xffff80002b428240, next=0xffff000008c32700 <init_task>) at arch/arm64/kernel/process.c:419
#1  0xffff0000088003d4 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:2860
#2  __schedule (preempt=false) at kernel/sched/core.c:3502
#3  0xffff00000880092c in schedule () at kernel/sched/core.c:3546
#4  0xffff000008803e24 in schedule_timeout (timeout=<optimized out>) at kernel/time/timer.c:1801
#5  0xffff00000880144c in do_wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>)
    at kernel/sched/completion.c:83
#6  __wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>) at kernel/sched/completion.c:104
#7  wait_for_common (x=0xffff80002d0ef548, timeout=500, state=<optimized out>) at kernel/sched/completion.c:115
#8  0xffff000008801554 in wait_for_completion_timeout (x=0xffff80002d0ef548, timeout=<optimized out>) at kernel/sched/completion.c:155
#9  0xffff0000008f5ef8 in usb_start_wait_urb (urb=0xffff80002c593400, timeout=5000, actual_length=0xffff80002d0ef5dc) at drivers/usb/core/message.c:62
#10 0xffff0000008f602c in usb_internal_control_msg (timeout=<optimized out>, len=<optimized out>, data=<optimized out>, cmd=<optimized out>, pipe=<optimized out>, 
    usb_dev=<optimized out>) at drivers/usb/core/message.c:101
#11 usb_control_msg (dev=0xffff80002c684000, pipe=2147484800, request=161 '\241', requesttype=192 '\300', value=0, index=152, data=0xffff80002d421c80, size=4, 
    timeout=5000) at drivers/usb/core/message.c:152
#12 0xffff000000f29e10 in lan78xx_read_reg (index=152, data=0xffff80002d0ef66c, dev=<optimized out>, dev=<optimized out>) at drivers/net/usb/lan78xx.c:449
#13 0xffff000000f2c018 in lan78xx_irq_bus_sync_unlock (irqd=<optimized out>) at drivers/net/usb/lan78xx.c:1954
#14 0xffff0000081168e4 in chip_bus_sync_unlock (desc=<optimized out>) at kernel/irq/internals.h:147
#15 __irq_put_desc_unlock (desc=0xffff80002e7a9400, flags=<optimized out>, bus=true) at kernel/irq/irqdesc.c:837
#16 0xffff0000081176c0 in irq_put_desc_busunlock (flags=<optimized out>, desc=<optimized out>) at kernel/irq/internals.h:173
#17 irq_set_irqchip_state (irq=<optimized out>, which=IRQCHIP_STATE_ACTIVE, val=false) at kernel/irq/manage.c:2205
#18 0xffff00000809e0b0 in machine_kexec_mask_interrupts () at arch/arm64/kernel/machine_kexec.c:233
#19 machine_crash_shutdown (regs=<optimized out>) at arch/arm64/kernel/machine_kexec.c:259
#20 0xffff00000815b358 in __crash_kexec (regs=0xffff80002d0efb50) at kernel/kexec_core.c:943
#21 0xffff00000815b45c in crash_kexec (regs=0xffff80002d0efb50) at kernel/kexec_core.c:965
#22 0xffff00000808dc84 in die (str=<optimized out>, regs=0xffff80002d0efb50, err=<optimized out>) at arch/arm64/kernel/traps.c:210
#23 0xffff0000080a2114 in die_kernel_fault (msg=0xffff000008a09c88 "NULL pointer dereference", addr=0, esr=2516582468, regs=<optimized out>)
    at arch/arm64/mm/fault.c:269
#24 0xffff0000080a1d68 in __do_kernel_fault (addr=0, esr=2516582468, regs=0xffff80002d0efb50) at arch/arm64/mm/fault.c:297
#25 0xffff000008806e38 in do_page_fault (addr=0, esr=2516582468, regs=0xffff80002d0efb50) at arch/arm64/mm/fault.c:599
#26 0xffff0000088070dc in do_translation_fault (addr=0, esr=<optimized out>, regs=<optimized out>) at arch/arm64/mm/fault.c:608
#27 0xffff0000080812cc in do_mem_abort (addr=0, esr=2516582468, regs=0xffff80002d0efb50) at arch/arm64/mm/fault.c:744
#28 0xffff000008082ed0 in el1_sync () at arch/arm64/kernel/entry.S:583
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

The system hung in the idle thread after continuing here.

Petr T

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: panic kexec broken on ARM64?
@ 2018-06-06 11:41         ` Petr Tesarik
  0 siblings, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2018-06-06 11:41 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Matthias Brugger, Marc Zyngier, kexec mailing list,
	takahiro.akashi, James Morse, linux-arm-kernel

On Wed, 6 Jun 2018 10:00:24 +0200
Petr Tesarik <ptesarik@suse.cz> wrote:

> On Wed, 6 Jun 2018 09:02:04 +0200
> Stefan Wahren <stefan.wahren@i2se.com> wrote:
> 
> > Hi Petr,
> > 
> > Am 05.06.2018 um 19:46 schrieb James Morse:  
> > > Hi Petr,
> > >
> > > (CC: +Akashi, Marc)
> > >
> > > On 05/06/18 09:01, Petr Tesarik wrote:    
> > >> I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
> > >> when a panic kernel is loaded.    
> > > kdump is a best-effort thing, it looks like this is a case where the
> > > crashed-kernel can't tear itself down.
> > >
> > > Do you have the rest of the stack trace? Was it handling an irq when it decided
> > > to panic?:
> > > https://lkml.org/lkml/2018/3/13/1134    
> > 
> > the Raspberry Pi 3 B+ support is very fresh (linux-next). Since i didn't 
> > see a version, i need to doublecheck.
> > 
> > You are actually using linux-next and not the downstream kernel?  
> 
> Very good point. I'll try again with linux-next.

It took me some time to set up everything correctly again...

Unfortunately, it makes no difference. I set a hardware breakpoint on
machine_crash_shutdown, followed by a breakpoint at __switch_to, and it
did trigger:

(gdb) lx-version 
Linux version 4.17.0-next-20180605-18-default (root@thunderx10) (gcc version 4.8.5 (SUSE Linux)) #1 SMP Wed Jun 6 10:26:46 CEST 2018
(gdb) bt
#0  __switch_to (prev=0xffff80002b428240, next=0xffff000008c32700 <init_task>) at arch/arm64/kernel/process.c:419
#1  0xffff0000088003d4 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at kernel/sched/core.c:2860
#2  __schedule (preempt=false) at kernel/sched/core.c:3502
#3  0xffff00000880092c in schedule () at kernel/sched/core.c:3546
#4  0xffff000008803e24 in schedule_timeout (timeout=<optimized out>) at kernel/time/timer.c:1801
#5  0xffff00000880144c in do_wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>)
    at kernel/sched/completion.c:83
#6  __wait_for_common (state=<optimized out>, timeout=<optimized out>, action=<optimized out>, x=<optimized out>) at kernel/sched/completion.c:104
#7  wait_for_common (x=0xffff80002d0ef548, timeout=500, state=<optimized out>) at kernel/sched/completion.c:115
#8  0xffff000008801554 in wait_for_completion_timeout (x=0xffff80002d0ef548, timeout=<optimized out>) at kernel/sched/completion.c:155
#9  0xffff0000008f5ef8 in usb_start_wait_urb (urb=0xffff80002c593400, timeout=5000, actual_length=0xffff80002d0ef5dc) at drivers/usb/core/message.c:62
#10 0xffff0000008f602c in usb_internal_control_msg (timeout=<optimized out>, len=<optimized out>, data=<optimized out>, cmd=<optimized out>, pipe=<optimized out>, 
    usb_dev=<optimized out>) at drivers/usb/core/message.c:101
#11 usb_control_msg (dev=0xffff80002c684000, pipe=2147484800, request=161 '\241', requesttype=192 '\300', value=0, index=152, data=0xffff80002d421c80, size=4, 
    timeout=5000) at drivers/usb/core/message.c:152
#12 0xffff000000f29e10 in lan78xx_read_reg (index=152, data=0xffff80002d0ef66c, dev=<optimized out>, dev=<optimized out>) at drivers/net/usb/lan78xx.c:449
#13 0xffff000000f2c018 in lan78xx_irq_bus_sync_unlock (irqd=<optimized out>) at drivers/net/usb/lan78xx.c:1954
#14 0xffff0000081168e4 in chip_bus_sync_unlock (desc=<optimized out>) at kernel/irq/internals.h:147
#15 __irq_put_desc_unlock (desc=0xffff80002e7a9400, flags=<optimized out>, bus=true) at kernel/irq/irqdesc.c:837
#16 0xffff0000081176c0 in irq_put_desc_busunlock (flags=<optimized out>, desc=<optimized out>) at kernel/irq/internals.h:173
#17 irq_set_irqchip_state (irq=<optimized out>, which=IRQCHIP_STATE_ACTIVE, val=false) at kernel/irq/manage.c:2205
#18 0xffff00000809e0b0 in machine_kexec_mask_interrupts () at arch/arm64/kernel/machine_kexec.c:233
#19 machine_crash_shutdown (regs=<optimized out>) at arch/arm64/kernel/machine_kexec.c:259
#20 0xffff00000815b358 in __crash_kexec (regs=0xffff80002d0efb50) at kernel/kexec_core.c:943
#21 0xffff00000815b45c in crash_kexec (regs=0xffff80002d0efb50) at kernel/kexec_core.c:965
#22 0xffff00000808dc84 in die (str=<optimized out>, regs=0xffff80002d0efb50, err=<optimized out>) at arch/arm64/kernel/traps.c:210
#23 0xffff0000080a2114 in die_kernel_fault (msg=0xffff000008a09c88 "NULL pointer dereference", addr=0, esr=2516582468, regs=<optimized out>)
    at arch/arm64/mm/fault.c:269
#24 0xffff0000080a1d68 in __do_kernel_fault (addr=0, esr=2516582468, regs=0xffff80002d0efb50) at arch/arm64/mm/fault.c:297
#25 0xffff000008806e38 in do_page_fault (addr=0, esr=2516582468, regs=0xffff80002d0efb50) at arch/arm64/mm/fault.c:599
#26 0xffff0000088070dc in do_translation_fault (addr=0, esr=<optimized out>, regs=<optimized out>) at arch/arm64/mm/fault.c:608
#27 0xffff0000080812cc in do_mem_abort (addr=0, esr=2516582468, regs=0xffff80002d0efb50) at arch/arm64/mm/fault.c:744
#28 0xffff000008082ed0 in el1_sync () at arch/arm64/kernel/entry.S:583
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

The system hanged in the idle thread after continuing here.

Petr T

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

* panic kexec broken on ARM64?
  2018-06-06 11:37       ` James Morse
@ 2018-06-10 12:24         ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-06-10 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 06 Jun 2018 12:37:02 +0100,
James Morse wrote:
> 
> Hi Stefan,
> 
> On 06/06/18 08:02, Stefan Wahren wrote:
> > Am 05.06.2018 um 19:46 schrieb James Morse:
> >> On 05/06/18 09:01, Petr Tesarik wrote:
> >>> I attached a hardware debugger and found
> >>> out that all CPU cores were stopped except one which was stuck in the
> >>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> >>> definitely not safe after a kernel panic.
> 
> >> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> >> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> >> around in mmio registers, this should all be safe unless you re-entered the same
> >> code.
> 
> >>> If I'm right, then this is broken in general, but I have only ever seen
> >>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> >>> be more subtle.
> 
> >> Is there a hardware difference around the interrupt controller on these?
> 
> > No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> > ethernet) instead of lan78xx (Gigabit ethernet).
> 
> Bingo: its the lan78xx driver that is sleeping from the irqchip
> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> is why the RPi-3-B doesn't do this.
> 
> It may be valid for kdump to only teardown the 'root irqdomain' (if
> that even means anything). I assume these secondary irqchip's would
> have a summary-interrupt that goes to another irqchip. But I can't
> see a way to tell them apart..,

There is none. A cascaded irqchip is just like a root irqchip, just
that its output line is connected to another irqchip. But we have no
easy way to identify the parent. Also, this particular driver looks
quite creative (it reinvents the wheel for chained interrupts -- see
intr_complete and lan78xx_status), meaning that even if we could have
a magic way of identifying a chained irqchip, we'd miss that one. Broken.

> I think we need to wait until after the merge window for Marc's
> wisdom on this!

Overall, I can't think of an easy fix. We have a few options, but none
of them involve a centralised change:

1) We provide a reset infrastructure for irqchips, with an opt-in
   mechanism. This involves changing the way we teardown irqs at
   crash-time, and we'd then need some notion of reset ordering (think
   of the layered ITS and GICv3, for example).

2) We provide a way to identify interrupts that are ultimately backed
   by a root controller, which implies walking down the hierarchy for
   each one of them. Fairly expensive, but minimal in way of changes
   in the crash code. Requires a per-irqchip flag, but ordering comes
   in for free.

3) We do the same as (2), but at the irqdomain level. Not sure that's
   any better, and it may be even more complicated and bring back some
   ordering issues.

I'm currently angling for (2), with (1) as a final hammer option once
we have nuked all the individual interrupts (useful for the GICv3
redistributor case).
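
As a rough illustration of option (1), and not code from any posted
patch, the opt-in and ordering ideas could be modelled in user-space C
along the following lines. Every name here (model_irqchip_reset,
model_crash_reset_irqchips, the order field) is hypothetical, and the
convention that lower order values are reset first is an assumption
made only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical opt-in reset hook; this is not an existing kernel interface. */
struct model_irqchip_reset {
	const char *name;
	int order;		/* lower values are reset first (assumed convention) */
	void (*reset)(void);	/* NULL: this irqchip did not opt in */
};

/* Record of which resets ran, so the ordering can be inspected. */
enum { MODEL_LOG_MAX = 8 };
static const char *model_reset_log[MODEL_LOG_MAX];
static size_t model_reset_count;

static void model_log_reset(const char *name)
{
	assert(model_reset_count < MODEL_LOG_MAX);
	model_reset_log[model_reset_count++] = name;
}

/* Example callbacks for two layered controllers (illustrative only). */
static void model_reset_its(void)   { model_log_reset("ITS"); }
static void model_reset_gicv3(void) { model_log_reset("GICv3"); }

static int model_cmp_order(const void *a, const void *b)
{
	const struct model_irqchip_reset *x = a, *y = b;

	return (x->order > y->order) - (x->order < y->order);
}

/* Reset every opted-in irqchip in ascending order; return how many ran. */
static size_t model_crash_reset_irqchips(struct model_irqchip_reset *chips,
					 size_t n)
{
	size_t i, ran = 0;

	qsort(chips, n, sizeof(*chips), model_cmp_order);
	for (i = 0; i < n; i++) {
		if (!chips[i].reset)
			continue;	/* not opted in: left untouched */
		chips[i].reset();
		ran++;
	}
	return ran;
}
```

An irqchip whose callbacks may sleep would simply leave reset as NULL
and be skipped at crash time, while layered controllers would pick
order values that reflect whatever sequence the hardware needs.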

Thoughts?

	M.

-- 
Jazz is not dead, it just smell funny.

* Re: panic kexec broken on ARM64?
@ 2018-06-10 12:24         ` Marc Zyngier
  0 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-06-10 12:24 UTC (permalink / raw)
  To: James Morse
  Cc: Stefan Wahren, Matthias Brugger, kexec mailing list,
	Petr Tesarik, takahiro.akashi, linux-arm-kernel

On Wed, 06 Jun 2018 12:37:02 +0100,
James Morse wrote:
> 
> Hi Stefan,
> 
> On 06/06/18 08:02, Stefan Wahren wrote:
> > Am 05.06.2018 um 19:46 schrieb James Morse:
> >> On 05/06/18 09:01, Petr Tesarik wrote:
> >>> I attached a hardware debugger and found
> >>> out that all CPU cores were stopped except one which was stuck in the
> >>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> >>> definitely not safe after a kernel panic.
> 
> >> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> >> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> >> around in mmio registers, this should all be safe unless you re-entered the same
> >> code.
> 
> >>> If I'm right, then this is broken in general, but I have only ever seen
> >>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> >>> be more subtle.
> 
> >> Is there a hardware difference around the interrupt controller on these?
> 
> > No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> > ethernet) instead of lan78xx (Gigabit ethernet).
> 
> Bingo: its the lan78xx driver that is sleeping from the irqchip
> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> is why the RPi-3-B doesn't do this.
> 
> It may be valid for kdump to only teardown the 'root irqdomain' (if
> that even means anything). I assume these secondary irqchip's would
> have a summary-interrupt that goes to another irqchip. But I can't
> see a way to tell them apart..,

There is none. A cascaded irqchip is just like a root irqchip, just
that its output line is connected to another irqchip. But we have no
easy way to identify the parent. Also, this particular driver looks
quite creative (it reinvents the wheel for chained interrupts -- see
intr_complete and lan78xx_status), meaning that even if we could have
a magic way of identifying a chained irqchip, we'd miss that one. Broken.

> I think we need to wait until after the merge window for Marc's
> wisdom on this!

Overall, I can't think of an easy fix. We have a few options, but none
of them involve a centralised change:

1) We provide a reset infrastructure for irqchips, with an opt-in
   mechanism. This involves changing the way we teardown irqs at
   crash-time, and we'd then need some notion of reset ordering (think
   of the layered ITS and GICv3, for example).

2) We provide a way to identify interrupts that are ultimately backed
   by a root controller, which implies walking down the hierarchy for
   each one of them. Fairly expensive, but minimal in way of changes
   in the crash code. Requires a per-irqchip flag, but ordering comes
   in for free.

3) We do the same as (2), but at the irqdomain level. Not sure that's
   any better, and it may be even more complicated and bring back some
   ordering issues.

I'm currently angling for (2), with (1) as a final hammer option once
we have nuked all the individual interrupts (useful for the GICv3
redistributor case).

Thoughts?

	M.

-- 
Jazz is not dead, it just smell funny.

* panic kexec broken on ARM64?
  2018-06-10 12:24         ` Marc Zyngier
@ 2018-07-03  7:01           ` takahiro.akashi
  -1 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi at linaro.org @ 2018-07-03  7:01 UTC (permalink / raw)
  To: linux-arm-kernel

Marc, James,

I'd like to re-ignite the discussion.

On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> On Wed, 06 Jun 2018 12:37:02 +0100,
> James Morse wrote:
> > 
> > Hi Stefan,
> > 
> > On 06/06/18 08:02, Stefan Wahren wrote:
> > > Am 05.06.2018 um 19:46 schrieb James Morse:
> > >> On 05/06/18 09:01, Petr Tesarik wrote:
> > >>> I attached a hardware debugger and found
> > >>> out that all CPU cores were stopped except one which was stuck in the
> > >>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> > >>> definitely not safe after a kernel panic.
> > 
> > >> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> > >> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> > >> around in mmio registers, this should all be safe unless you re-entered the same
> > >> code.
> > 
> > >>> If I'm right, then this is broken in general, but I have only ever seen
> > >>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> > >>> be more subtle.
> > 
> > >> Is there a hardware difference around the interrupt controller on these?
> > 
> > > No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> > > ethernet) instead of lan78xx (Gigabit ethernet).
> > 
> > Bingo: its the lan78xx driver that is sleeping from the irqchip
> > callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> > is why the RPi-3-B doesn't do this.
> > 
> > It may be valid for kdump to only teardown the 'root irqdomain' (if
> > that even means anything). I assume these secondary irqchip's would
> > have a summary-interrupt that goes to another irqchip. But I can't
> > see a way to tell them apart..,
> 
> There is none. A cascaded irqchip is just like a root irqchip, just
> that its output line is connected to another irqchip. But we have no
> easy way to identify the parent. Also, this particular driver looks
> quite creative (it reinvents the wheel for chained interrupts -- see
> intr_complete and lan78xx_status), meaning that even if we could have
> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
> 
> > I think we need to wait until after the merge window for Marc's
> > wisdom on this!
> 
> Overall, I can't think of an easy fix. We have a few options, but none
> of them involve a centralised change:
> 
> 1) We provide a reset infrastructure for irqchips, with an opt-in
>    mechanism. This involves changing the way we teardown irqs at
>    crash-time, and we'd then need some notion of reset ordering (think
>    of the layered ITS and GICv3, for example).

Does this mean that all the irqchips have to be implemented with reset?
> 
> 2) We provide a way to identify interrupts that are ultimately backed
>    by a root controller, which implies walking down the hierarchy for

To be clear, from bottom to top (or root), right?

>    each one of them. Fairly expensive, but minimal in way of changes
>    in the crash code. Requires a per-irqchip flag, but ordering comes
>    in for free.
> 
> 3) We do the same as (2), but at the irqdomain level. Not sure that's
>    any better, and it may be even more complicated and bring back some
>    ordering issues.

Do you think that the same thing may happen in case of pci/msi?
I'm not sure, but MSI has some kind of irq domain hierarchy.

Thanks,
-Takahiro AKASHI

> I'm currently angling for (2), with (1) as a final hammer option once
> we have nuked all the individual interrupts (useful for the GICv3
> redistributor case).
> 
> Thoughts?
> 
> 	M.
> 
> -- 
> Jazz is not dead, it just smell funny.

* Re: panic kexec broken on ARM64?
@ 2018-07-03  7:01           ` takahiro.akashi
  0 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi @ 2018-07-03  7:01 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Stefan Wahren, Matthias Brugger, kexec mailing list,
	Petr Tesarik, James Morse, linux-arm-kernel

Marc, James,

I'd like to re-ignite the discussion.

On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> On Wed, 06 Jun 2018 12:37:02 +0100,
> James Morse wrote:
> > 
> > Hi Stefan,
> > 
> > On 06/06/18 08:02, Stefan Wahren wrote:
> > > Am 05.06.2018 um 19:46 schrieb James Morse:
> > >> On 05/06/18 09:01, Petr Tesarik wrote:
> > >>> I attached a hardware debugger and found
> > >>> out that all CPU cores were stopped except one which was stuck in the
> > >>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> > >>> definitely not safe after a kernel panic.
> > 
> > >> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> > >> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> > >> around in mmio registers, this should all be safe unless you re-entered the same
> > >> code.
> > 
> > >>> If I'm right, then this is broken in general, but I have only ever seen
> > >>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> > >>> be more subtle.
> > 
> > >> Is there a hardware difference around the interrupt controller on these?
> > 
> > > No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> > > ethernet) instead of lan78xx (Gigabit ethernet).
> > 
> > Bingo: its the lan78xx driver that is sleeping from the irqchip
> > callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> > is why the RPi-3-B doesn't do this.
> > 
> > It may be valid for kdump to only teardown the 'root irqdomain' (if
> > that even means anything). I assume these secondary irqchip's would
> > have a summary-interrupt that goes to another irqchip. But I can't
> > see a way to tell them apart..,
> 
> There is none. A cascaded irqchip is just like a root irqchip, just
> that its output line is connected to another irqchip. But we have no
> easy way to identify the parent. Also, this particular driver looks
> quite creative (it reinvents the wheel for chained interrupts -- see
> intr_complete and lan78xx_status), meaning that even if we could have
> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
> 
> > I think we need to wait until after the merge window for Marc's
> > wisdom on this!
> 
> Overall, I can't think of an easy fix. We have a few options, but none
> of them involve a centralised change:
> 
> 1) We provide a reset infrastructure for irqchips, with an opt-in
>    mechanism. This involves changing the way we teardown irqs at
>    crash-time, and we'd then need some notion of reset ordering (think
>    of the layered ITS and GICv3, for example).

Does this mean that all the irqchips have to be implemented with reset?
> 
> 2) We provide a way to identify interrupts that are ultimately backed
>    by a root controller, which implies walking down the hierarchy for

To be clear, from bottom to top (or root), right?

>    each one of them. Fairly expensive, but minimal in way of changes
>    in the crash code. Requires a per-irqchip flag, but ordering comes
>    in for free.
> 
> 3) We do the same as (2), but at the irqdomain level. Not sure that's
>    any better, and it may be even more complicated and bring back some
>    ordering issues.

Do you think that the same thing may happen in case of pci/msi?
I'm not sure, but MSI has some kind of irq domain hierarchy.

Thanks,
-Takahiro AKASHI

> I'm currently angling for (2), with (1) as a final hammer option once
> we have nuked all the individual interrupts (useful for the GICv3
> redistributor case).
> 
> Thoughts?
> 
> 	M.
> 
> -- 
> Jazz is not dead, it just smell funny.

* panic kexec broken on ARM64?
  2018-07-03  7:01           ` takahiro.akashi
@ 2018-07-03  8:58             ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-07-03  8:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
> Marc, James,
> 
> I'd like to re-ignite the discussion.
> 
> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>> On Wed, 06 Jun 2018 12:37:02 +0100,
>> James Morse wrote:
>>>
>>> Hi Stefan,
>>>
>>> On 06/06/18 08:02, Stefan Wahren wrote:
>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
>>>>>> I attached a hardware debugger and found
>>>>>> out that all CPU cores were stopped except one which was stuck in the
>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>>>>> definitely not safe after a kernel panic.
>>>
>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
>>>>> around in mmio registers, this should all be safe unless you re-entered the same
>>>>> code.
>>>
>>>>>> If I'm right, then this is broken in general, but I have only ever seen
>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>>>>> be more subtle.
>>>
>>>>> Is there a hardware difference around the interrupt controller on these?
>>>
>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
>>>> ethernet) instead of lan78xx (Gigabit ethernet).
>>>
>>> Bingo: its the lan78xx driver that is sleeping from the irqchip
>>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
>>> is why the RPi-3-B doesn't do this.
>>>
>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
>>> that even means anything). I assume these secondary irqchip's would
>>> have a summary-interrupt that goes to another irqchip. But I can't
>>> see a way to tell them apart..,
>>
>> There is none. A cascaded irqchip is just like a root irqchip, just
>> that its output line is connected to another irqchip. But we have no
>> easy way to identify the parent. Also, this particular driver looks
>> quite creative (it reinvents the wheel for chained interrupts -- see
>> intr_complete and lan78xx_status), meaning that even if we could have
>> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
>>
>>> I think we need to wait until after the merge window for Marc's
>>> wisdom on this!
>>
>> Overall, I can't think of an easy fix. We have a few options, but none
>> of them involve a centralised change:
>>
>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>    mechanism. This involves changing the way we teardown irqs at
>>    crash-time, and we'd then need some notion of reset ordering (think
>>    of the layered ITS and GICv3, for example).
> 
> Does this mean that all the irqchips have to be implemented with reset?

No. Only those that want to be reset at kexec time.

>>
>> 2) We provide a way to identify interrupts that are ultimately backed
>>    by a root controller, which implies walking down the hierarchy for
> 
> To be clear, from bottom to top (or root), right?

I'm not sure I understand your question. The idea is to walk the
irq_data chain, until we hit a root irqchip. If we do hit one, we
deactivate/eoi/disable this interrupt. If we don't, we do nothing.

This would avoid the above brokenness, and still ensure that no
interrupt reaches the CPU.
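
To make the walk concrete, here is a minimal user-space sketch of the
idea; it is not the code on the branch. The structures are hypothetical
stand-ins loosely modelled on struct irq_chip and struct irq_data (with
its parent_data pointer), and the is_root flag is the assumed
per-irqchip marker from option (2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct irq_chip and struct irq_data. */
struct model_irq_chip {
	const char *name;
	bool is_root;	/* assumed per-irqchip "root controller" flag */
};

struct model_irq_data {
	const struct model_irq_chip *chip;
	const struct model_irq_data *parent_data;	/* NULL at the end of the chain */
};

/* Walk the chain; return the first level handled by a root irqchip, or NULL. */
static const struct model_irq_data *
model_find_root(const struct model_irq_data *d)
{
	for (; d; d = d->parent_data)
		if (d->chip && d->chip->is_root)
			return d;
	return NULL;
}

/*
 * Crash-time decision for one interrupt: true means "quiesce it at the
 * root irqchip" (deactivate/EOI/disable), false means "do nothing".
 */
static bool model_crash_should_quiesce(const struct model_irq_data *d)
{
	return model_find_root(d) != NULL;
}

/* Self-check with two made-up interrupts. */
static void model_demo(void)
{
	const struct model_irq_chip root = { "root-intc", true };
	const struct model_irq_chip stacked = { "stacked", false };
	const struct model_irq_chip usb = { "lan78xx", false };

	const struct model_irq_data root_level = { &root, NULL };
	const struct model_irq_data wired = { &stacked, &root_level };
	const struct model_irq_data usb_irq = { &usb, NULL };

	assert(model_crash_should_quiesce(&wired));	/* root found via parent */
	assert(!model_crash_should_quiesce(&usb_irq));	/* no root in the chain */
}
```

In this model the lan78xx-style interrupt has no parent_data link to a
root irqchip, so its callbacks are never invoked on the crash path.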

> 
>>    each one of them. Fairly expensive, but minimal in way of changes
>>    in the crash code. Requires a per-irqchip flag, but ordering comes
>>    in for free.
>>
>> 3) We do the same as (2), but at the irqdomain level. Not sure that's
>>    any better, and it may be even more complicated and bring back some
>>    ordering issues.
> 
> Do you think that the same thing may happen in case of pci/msi?
> I'm not sure, but MSI has some kind of irq domain hierarchy.

Anything can happen, as people implement their interrupt infrastructure
in weird and wonderful ways. So we need to be prepared for the worst.

I've pushed 3 patches on a branch[1]. It is mostly untested, but it
should allow the above RPi3 disaster to cope with kexec.

	M.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip

-- 
Jazz is not dead, it just smell funny.

* Re: panic kexec broken on ARM64?
@ 2018-07-03  8:58             ` Marc Zyngier
  0 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-07-03  8:58 UTC (permalink / raw)
  To: takahiro.akashi, James Morse, Stefan Wahren, Petr Tesarik,
	Matthias Brugger, kexec mailing list, linux-arm-kernel

On 03/07/18 08:01, takahiro.akashi@linaro.org wrote:
> Marc, James,
> 
> I'd like to re-ignite the discussion.
> 
> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>> On Wed, 06 Jun 2018 12:37:02 +0100,
>> James Morse wrote:
>>>
>>> Hi Stefan,
>>>
>>> On 06/06/18 08:02, Stefan Wahren wrote:
>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
>>>>>> I attached a hardware debugger and found
>>>>>> out that all CPU cores were stopped except one which was stuck in the
>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>>>>> definitely not safe after a kernel panic.
>>>
>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
>>>>> around in mmio registers, this should all be safe unless you re-entered the same
>>>>> code.
>>>
>>>>>> If I'm right, then this is broken in general, but I have only ever seen
>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>>>>> be more subtle.
>>>
>>>>> Is there a hardware difference around the interrupt controller on these?
>>>
>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
>>>> ethernet) instead of lan78xx (Gigabit ethernet).
>>>
>>> Bingo: its the lan78xx driver that is sleeping from the irqchip
>>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
>>> is why the RPi-3-B doesn't do this.
>>>
>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
>>> that even means anything). I assume these secondary irqchip's would
>>> have a summary-interrupt that goes to another irqchip. But I can't
>>> see a way to tell them apart..,
>>
>> There is none. A cascaded irqchip is just like a root irqchip, just
>> that its output line is connected to another irqchip. But we have no
>> easy way to identify the parent. Also, this particular driver looks
>> quite creative (it reinvents the wheel for chained interrupts -- see
>> intr_complete and lan78xx_status), meaning that even if we could have
>> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
>>
>>> I think we need to wait until after the merge window for Marc's
>>> wisdom on this!
>>
>> Overall, I can't think of an easy fix. We have a few options, but none
>> of them involve a centralised change:
>>
>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>    mechanism. This involves changing the way we teardown irqs at
>>    crash-time, and we'd then need some notion of reset ordering (think
>>    of the layered ITS and GICv3, for example).
> 
> Does this mean that all the irqchips have to be implemented with reset?

No. Only those that want to be reset at kexec time.

>>
>> 2) We provide a way to identify interrupts that are ultimately backed
>>    by a root controller, which implies walking down the hierarchy for
> 
> To be clear, from bottom to top (or root), right?

I'm not sure I understand your question. The idea is to walk the
irq_data chain, until we hit a root irqchip. If we do hit one, we
deactivate/eoi/disable this interrupt. If we don't, we do nothing.

This would avoid the above brokenness, and still ensure that no
interrupt reaches the CPU.

> 
>>    each one of them. Fairly expensive, but minimal in way of changes
>>    in the crash code. Requires a per-irqchip flag, but ordering comes
>>    in for free.
>>
>> 3) We do the same as (2), but at the irqdomain level. Not sure that's
>>    any better, and it may be even more complicated and bring back some
>>    ordering issues.
> 
> Do you think that the same thing may happen in case of pci/msi?
> I'm not sure, but MSI has some kind of irq domain hierarchy.

Anything can happen, as people implement their interrupt infrastructure
in weird and wonderful ways. So we need to be prepared for the worst.

I've pushed 3 patches on a branch[1]. It is mostly untested, but it
should allow the above RPi3 disaster to cope with kexec.

	M.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip

-- 
Jazz is not dead, it just smell funny.

* panic kexec broken on ARM64?
  2018-07-03  8:58             ` Marc Zyngier
@ 2018-07-04  8:41               ` takahiro.akashi
  -1 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi at linaro.org @ 2018-07-04  8:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
> On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
> > Marc, James,
> > 
> > I'd like to re-ignite the discussion.
> > 
> > On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> >> On Wed, 06 Jun 2018 12:37:02 +0100,
> >> James Morse wrote:
> >>>
> >>> Hi Stefan,
> >>>
> >>> On 06/06/18 08:02, Stefan Wahren wrote:
> >>>> Am 05.06.2018 um 19:46 schrieb James Morse:
> >>>>> On 05/06/18 09:01, Petr Tesarik wrote:
> >>>>>> I attached a hardware debugger and found
> >>>>>> out that all CPU cores were stopped except one which was stuck in the
> >>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> >>>>>> definitely not safe after a kernel panic.
> >>>
> >>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> >>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> >>>>> around in mmio registers, this should all be safe unless you re-entered the same
> >>>>> code.
> >>>
> >>>>>> If I'm right, then this is broken in general, but I have only ever seen
> >>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> >>>>>> be more subtle.
> >>>
> >>>>> Is there a hardware difference around the interrupt controller on these?
> >>>
> >>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> >>>> ethernet) instead of lan78xx (Gigabit ethernet).
> >>>
> >>> Bingo: its the lan78xx driver that is sleeping from the irqchip
> >>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> >>> is why the RPi-3-B doesn't do this.
> >>>
> >>> It may be valid for kdump to only teardown the 'root irqdomain' (if
> >>> that even means anything). I assume these secondary irqchip's would
> >>> have a summary-interrupt that goes to another irqchip. But I can't
> >>> see a way to tell them apart..,
> >>
> >> There is none. A cascaded irqchip is just like a root irqchip, just
> >> that its output line is connected to another irqchip. But we have no
> >> easy way to identify the parent. Also, this particular driver looks
> >> quite creative (it reinvents the wheel for chained interrupts -- see
> >> intr_complete and lan78xx_status), meaning that even if we could have
> >> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
> >>
> >>> I think we need to wait until after the merge window for Marc's
> >>> wisdom on this!
> >>
> >> Overall, I can't think of an easy fix. We have a few options, but none
> >> of them involve a centralised change:
> >>
> >> 1) We provide a reset infrastructure for irqchips, with an opt-in
> >>    mechanism. This involves changing the way we teardown irqs at
> >>    crash-time, and we'd then need some notion of reset ordering (think
> >>    of the layered ITS and GICv3, for example).
> > 
> > Does this mean that all the irqchips have to be implemented with reset?
> 
> No. Only those that want to be reset at kexec time.

I don't get the point yet. Who should have a reset interface?
What are the criteria?

> >>
> >> 2) We provide a way to identify interrupts that are ultimately backed
> >>    by a root controller, which implies walking down the hierarchy for
> > 
> > To be clear, from bottom to top (or root), right?
> 
> I'm not sure I understand your question. The idea is to walk the
> irq_data chain, until we hit a root irqchip. If we do hit one, we
> deactivate/eoi/disable this interrupt. If we don't, we do nothing.

I thought that we would traverse the (chained irq) hierarchy from
bottom to top and call deactivate or others in that order.
Am I wrong here?

> This would avoid the above brokenness, and still ensure that no
> interrupt reaches the CPU.
> 
> > 
> >>    each one of them. Fairly expensive, but minimal in way of changes
> >>    in the crash code. Requires a per-irqchip flag, but ordering comes
> >>    in for free.
> >>
> >> 3) We do the same as (2), but at the irqdomain level. Not sure that's
> >>    any better, and it may be even more complicated and bring back some
> >>    ordering issues.
> > 
> > Do you think that the same thing may happen in case of pci/msi?
> > I'm not sure, but MSI has some kind of irq domain hierarchy.
> 
> Anything can happen, as people implement their interrupt infrastructure
> in weird and wonderful ways. So we need to be prepared for the worst.
> 
> I've pushed 3 patches on a branch[1]. It is mostly untested, but it
> should allow the above RPi3 disaster to cope with kexec.

I don't have any hardware that shows this kind of issue, so I can't test.

-Takahiro AKASHI

> 	M.
> 
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip
> 
> -- 
> Jazz is not dead, it just smell funny.

* Re: panic kexec broken on ARM64?
@ 2018-07-04  8:41               ` takahiro.akashi
  0 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi @ 2018-07-04  8:41 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Stefan Wahren, Matthias Brugger, kexec mailing list,
	Petr Tesarik, James Morse, linux-arm-kernel

On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
> On 03/07/18 08:01, takahiro.akashi@linaro.org wrote:
> > Marc, James,
> > 
> > I'd like to re-ignite the discussion.
> > 
> > On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> >> On Wed, 06 Jun 2018 12:37:02 +0100,
> >> James Morse wrote:
> >>>
> >>> Hi Stefan,
> >>>
> >>> On 06/06/18 08:02, Stefan Wahren wrote:
> >>>> Am 05.06.2018 um 19:46 schrieb James Morse:
> >>>>> On 05/06/18 09:01, Petr Tesarik wrote:
> >>>>>> I attached a hardware debugger and found
> >>>>>> out that all CPU cores were stopped except one which was stuck in the
> >>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> >>>>>> definitely not safe after a kernel panic.
> >>>
> >>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> >>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> >>>>> around in mmio registers, this should all be safe unless you re-entered the same
> >>>>> code.
> >>>
> >>>>>> If I'm right, then this is broken in general, but I have only ever seen
> >>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> >>>>>> be more subtle.
> >>>
> >>>>> Is there a hardware difference around the interrupt controller on these?
> >>>
> >>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> >>>> ethernet) instead of lan78xx (Gigabit ethernet).
> >>>
> >>> Bingo: its the lan78xx driver that is sleeping from the irqchip
> >>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> >>> is why the RPi-3-B doesn't do this.
> >>>
> >>> It may be valid for kdump to only teardown the 'root irqdomain' (if
> >>> that even means anything). I assume these secondary irqchip's would
> >>> have a summary-interrupt that goes to another irqchip. But I can't
> >>> see a way to tell them apart..,
> >>
> >> There is none. A cascaded irqchip is just like a root irqchip, just
> >> that its output line is connected to another irqchip. But we have no
> >> easy way to identify the parent. Also, this particular driver looks
> >> quite creative (it reinvents the wheel for chained interrupts -- see
> >> intr_complete and lan78xx_status), meaning that even if we could have
> >> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
> >>
> >>> I think we need to wait until after the merge window for Marc's
> >>> wisdom on this!
> >>
> >> Overall, I can't think of an easy fix. We have a few options, but none
> >> of them involve a centralised change:
> >>
> >> 1) We provide a reset infrastructure for irqchips, with an opt-in
> >>    mechanism. This involves changing the way we teardown irqs at
> >>    crash-time, and we'd then need some notion of reset ordering (think
> >>    of the layered ITS and GICv3, for example).
> > 
> > Does this mean that all the irqchips have to be implemented with reset?
> 
> No. Only those that want to be reset at kexec time.

I don't get the point yet. Who should have a reset interface?
What are the criteria?

> >>
> >> 2) We provide a way to identify interrupts that are ultimately backed
> >>    by a root controller, which implies walking down the hierarchy for
> > 
> > To be clear, from bottom to top (or root), right?
> 
> I'm not sure I understand your question. The idea is to walk the
> irq_data chain, until we hit a root irqchip. If we do hit one, we
> deactivate/eoi/disable this interrupt. If we don't, we do nothing.

I thought that we would traverse the (chained irq) hierarchy from
bottom to top and call deactivate or others in that order.
Am I wrong here?

> This would avoid the above brokenness, and still ensures that no
> interrupt reaches the CPU.
> 
> > 
> >>    each one of them. Fairly expensive, but minimal in way of changes
> >>    in the crash code. Requires a per-irqchip flag, but ordering comes
> >>    in for free.
> >>
> >> 3) We do the same as (2), but at the irqdomain level. Not sure that's
> >>    any better, and it may be even more complicated and bring back some
> >>    ordering issues.
> > 
> > Do you think that the same thing may happen in case of pci/msi?
> > I have no confidence but MSI has some kind of irq domain hierarchy.
> 
> Anything can happen, as people implement their interrupt infrastructure
> in weird and wonderful ways. So we need to be prepared for the worst.
> 
> I've pushed 3 patches on a branch[1]. It is mostly untested, but it
> should allow the above RPi3 disaster to cope with kexec.

I don't have any hardware that shows this kind of issue, so I can't test.

-Takahiro AKASHI

> 	M.
> 
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip
> 
> -- 
> Jazz is not dead, it just smell funny.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-04  8:41               ` takahiro.akashi
@ 2018-07-04  9:02                 ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-07-04  9:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/07/18 09:41, takahiro.akashi at linaro.org wrote:
> On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
>> On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
>>> Marc, James,
>>>
>>> I'd like to re-ignite the discussion.
>>>
>>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>>>> On Wed, 06 Jun 2018 12:37:02 +0100,
>>>> James Morse wrote:
>>>>>
>>>>> Hi Stefan,
>>>>>
>>>>> On 06/06/18 08:02, Stefan Wahren wrote:
>>>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
>>>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
>>>>>>>> I attached a hardware debugger and found
>>>>>>>> out that all CPU cores were stopped except one which was stuck in the
>>>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>>>>>>> definitely not safe after a kernel panic.
>>>>>
>>>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>>>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
>>>>>>> around in mmio registers, this should all be safe unless you re-entered the same
>>>>>>> code.
>>>>>
>>>>>>>> If I'm right, then this is broken in general, but I have only ever seen
>>>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>>>>>>> be more subtle.
>>>>>
>>>>>>> Is there a hardware difference around the interrupt controller on these?
>>>>>
>>>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
>>>>>> ethernet) instead of lan78xx (Gigabit ethernet).
>>>>>
>>>>> Bingo: it's the lan78xx driver that is sleeping from the irqchip
>>>>> callbacks; the smsc95xx driver doesn't have a struct irq_chip, which
>>>>> is why the RPi-3-B doesn't do this.
>>>>>
>>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
>>>>> that even means anything). I assume these secondary irqchips would
>>>>> have a summary-interrupt that goes to another irqchip. But I can't
>>>>> see a way to tell them apart...
>>>>
>>>> There is none. A cascaded irqchip is just like a root irqchip, just
>>>> that its output line is connected to another irqchip. But we have no
>>>> easy way to identify the parent. Also, this particular driver looks
>>>> quite creative (it reinvents the wheel for chained interrupts -- see
>>>> intr_complete and lan78xx_status), meaning that even if we could have
>>>> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
>>>>
>>>>> I think we need to wait until after the merge window for Marc's
>>>>> wisdom on this!
>>>>
>>>> Overall, I can't think of an easy fix. We have a few options, but none
>>>> of them involve a centralised change:
>>>>
>>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>>>    mechanism. This involves changing the way we teardown irqs at
>>>>    crash-time, and we'd then need some notion of reset ordering (think
>>>>    of the layered ITS and GICv3, for example).
>>>
>>> Does this mean that all the irqchips have to be implemented with reset?
>>
>> No. Only those that want to be reset at kexec time.
> 
> I don't get the point yet. Who should have a reset interface?
> What are the criteria?

The criterion is "this irqchip requires a reset to be safely used in the
secondary kernel". This is a judgement call by the person writing the
driver.
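
One way to read the opt-in idea, as a user-space sketch rather than kernel
code: each irqchip may provide a reset hook, and the crash path calls only
the hooks that exist. The chip_model type, the crash_reset hook and
crash_reset_all() are hypothetical names used for illustration; they are
assumptions, not an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of an irqchip; not the kernel's struct irq_chip. */
struct chip_model {
	const char *name;
	/* Optional: left NULL by drivers that need no reset at kexec time. */
	void (*crash_reset)(struct chip_model *chip);
	int was_reset;
};

/* Example hook: it must not sleep, so here it only records that it ran. */
static void mmio_style_reset(struct chip_model *chip)
{
	chip->was_reset = 1;
}

/*
 * Call the hook of every chip that opted in. The caller passes the chips
 * in the order they have to be reset (this is where ordering of layered
 * controllers would have to be decided).
 * Returns the number of chips that were reset.
 */
static int crash_reset_all(struct chip_model **chips, size_t n)
{
	int count = 0;

	assert(chips != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (chips[i]->crash_reset) {
			chips[i]->crash_reset(chips[i]);
			count++;
		}
	}
	return count;
}
```

A chip whose reset would need to sleep simply leaves the hook unset and is
skipped on the crash path.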

> 
>>>>
>>>> 2) We provide a way to identify interrupts that are ultimately backed
>>>>    by a root controller, which implies walking down the hierarchy for
>>>
>>> To be clear, from bottom to top (or root), right?
>>
>> I'm not sure I understand your question. The idea is to walk the
>> irq_data chain, until we hit a root irqchip. If we do hit one, we
>> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
> 
> I thought that we would traverse the (chained irq) hierarchy from
> bottom to top and call deactivate or others in that order.
> Am I wrong here?

You *cannot* traverse a hierarchy through a chained irq. At that stage,
irq_data->parent_data is NULL. The only thing you can do is to iterate
over all the interrupts and deactivate those that are directly on the
root interrupt controller. This will have the effect of stopping the
interrupts that are behind a chained controller.
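
A minimal user-space sketch of that loop, assuming a per-irqchip flag that
marks the root controller. CHIP_IS_ROOT, chip_model, irq_data_model and both
helpers are hypothetical stand-ins; only the parent_data link mirrors the
field named above.

```c
#include <assert.h>
#include <stddef.h>

#define CHIP_IS_ROOT 0x1u	/* hypothetical per-irqchip flag */

struct chip_model {
	const char *name;
	unsigned int flags;
};

struct irq_data_model {
	struct chip_model *chip;
	struct irq_data_model *parent_data;	/* NULL where the hierarchy ends */
	int active;
};

/* Follow parent_data until a root chip is found; NULL if the chain ends first. */
static struct irq_data_model *find_root_data(struct irq_data_model *d)
{
	for (; d; d = d->parent_data)
		if (d->chip->flags & CHIP_IS_ROOT)
			return d;
	return NULL;
}

/*
 * Visit every interrupt and deactivate only those backed by the root
 * controller. Interrupts behind a chained controller have no parent_data,
 * so no callback of theirs is ever invoked.
 * Returns the number of interrupts deactivated.
 */
static int quiesce_root_irqs(struct irq_data_model **irqs, size_t n)
{
	int count = 0;

	assert(irqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct irq_data_model *root = find_root_data(irqs[i]);

		if (!root)
			continue;	/* not backed by the root chip: do nothing */
		root->active = 0;
		count++;
	}
	return count;
}
```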

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-04  8:41               ` takahiro.akashi
@ 2018-07-04 12:47                 ` James Morse
  -1 siblings, 0 replies; 42+ messages in thread
From: James Morse @ 2018-07-04 12:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Akashi,

On 04/07/18 09:41, takahiro.akashi at linaro.org wrote:
> On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
>> On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
>>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>>>> On Wed, 06 Jun 2018 12:37:02 +0100 James Morse wrote:
>>>>> Bingo: it's the lan78xx driver that is sleeping from the irqchip
>>>>> callbacks; the smsc95xx driver doesn't have a struct irq_chip, which
>>>>> is why the RPi-3-B doesn't do this.
>>>>>
>>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
>>>>> that even means anything). I assume these secondary irqchips would
>>>>> have a summary-interrupt that goes to another irqchip. But I can't
>>>>> see a way to tell them apart...

>>>> Overall, I can't think of an easy fix. We have a few options, but none
>>>> of them involve a centralised change:
>>>>
>>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>>>    mechanism. This involves changing the way we teardown irqs at
>>>>    crash-time, and we'd then need some notion of reset ordering (think
>>>>    of the layered ITS and GICv3, for example).
>>>
>>> Does this mean that all the irqchips have to be implemented with reset?
>>
>> No. Only those that want to be reset at kexec time.
> 
> I don't get the point yet.

(This stuff is new to me, so the terminology below is probably wrong.)

It seems there is actually a tree of irqchips, which feed interrupts through the
tree via some summary-interrupt.
The problem is one of these later-irqchips is on the other end of the USB bus,
and requires a fair amount of sleeping in order to reset it.

The trick is everything comes through the root irqchip. So we can reset that to
disable all the other interrupts in this tree.

This only works until the new kernel re-enables the summary-interrupt, so it
first needs to reset the irqchip that the summary-interrupt leads to.

(I assume shared summary-interrupts are somehow forbidden).
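
A toy user-space model of that reasoning, with one secondary irqchip behind a
single summary line on the root irqchip. Every name in it (root_model,
secondary_model, reaches_cpu() and so on) is hypothetical and only serves to
illustrate the ordering; it is not how any driver implements it.

```c
#include <assert.h>
#include <stdbool.h>

/* Root irqchip: the only path from the secondary chip to the CPU. */
struct root_model {
	bool summary_line_enabled;
};

/* Secondary irqchip, e.g. one sitting at the far end of a slow bus. */
struct secondary_model {
	struct root_model *root;
	bool source_pending;	/* stale interrupt left over from the crash */
};

/* An interrupt is seen by the CPU only if the root lets it through. */
static bool reaches_cpu(const struct secondary_model *s)
{
	assert(s && s->root);
	return s->source_pending && s->root->summary_line_enabled;
}

/* Crash path: touch only the root, so nothing has to sleep. */
static void crash_quiesce(struct root_model *r)
{
	r->summary_line_enabled = false;
}

/* New kernel: reset the secondary chip before re-enabling its summary line. */
static void new_kernel_bringup(struct secondary_model *s, bool reset_first)
{
	if (reset_first)
		s->source_pending = false;
	s->root->summary_line_enabled = true;
}
```

With reset_first set to false the stale interrupt is delivered as soon as the
summary line is re-enabled, which is the case the new kernel has to avoid.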


> Who should have a reset interface?
> What are the criteria?

(reset-interface -> be reset at kdump time)

Just the root irqchip that all interrupts have to come through.


Thanks,

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-03  8:58             ` Marc Zyngier
@ 2018-07-04 14:08               ` Matthias Brugger
  -1 siblings, 0 replies; 42+ messages in thread
From: Matthias Brugger @ 2018-07-04 14:08 UTC (permalink / raw)
  To: linux-arm-kernel



On 03/07/18 10:58, Marc Zyngier wrote:
> On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
>> Marc, James,
>>
>> I'd like to re-ignite the discussion.
>>
>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>>> On Wed, 06 Jun 2018 12:37:02 +0100,
>>> James Morse wrote:
>>>>
>>>> Hi Stefan,
>>>>
>>>> On 06/06/18 08:02, Stefan Wahren wrote:
>>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
>>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
>>>>>>> I attached a hardware debugger and found
>>>>>>> out that all CPU cores were stopped except one which was stuck in the
>>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>>>>>> definitely not safe after a kernel panic.
>>>>
>>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
>>>>>> around in mmio registers, this should all be safe unless you re-entered the same
>>>>>> code.
>>>>
>>>>>>> If I'm right, then this is broken in general, but I have only ever seen
>>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>>>>>> be more subtle.
>>>>
>>>>>> Is there a hardware difference around the interrupt controller on these?
>>>>
>>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
>>>>> ethernet) instead of lan78xx (Gigabit ethernet).
>>>>
>>>> Bingo: it's the lan78xx driver that is sleeping from the irqchip
>>>> callbacks; the smsc95xx driver doesn't have a struct irq_chip, which
>>>> is why the RPi-3-B doesn't do this.
>>>>
>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
>>>> that even means anything). I assume these secondary irqchips would
>>>> have a summary-interrupt that goes to another irqchip. But I can't
>>>> see a way to tell them apart...
>>>
>>> There is none. A cascaded irqchip is just like a root irqchip, just
>>> that its output line is connected to another irqchip. But we have no
>>> easy way to identify the parent. Also, this particular driver looks
>>> quite creative (it reinvents the wheel for chained interrupts -- see
>>> intr_complete and lan78xx_status), meaning that even if we could have
>>> a magic way of identifying a chained irqchip, we'd miss that one. Broken.
>>>
>>>> I think we need to wait until after the merge window for Marc's
>>>> wisdom on this!
>>>
>>> Overall, I can't think of an easy fix. We have a few options, but none
>>> of them involve a centralised change:
>>>
>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>>    mechanism. This involves changing the way we teardown irqs at
>>>    crash-time, and we'd then need some notion of reset ordering (think
>>>    of the layered ITS and GICv3, for example).
>>
>> Does this mean that all the irqchips have to be implemented with reset?
> 
> No. Only those that want to be reset at kexec time.
> 
>>>
>>> 2) We provide a way to identify interrupts that are ultimately backed
>>>    by a root controller, which implies walking down the hierarchy for
>>
>> To be clear, from bottom to top (or root), right?
> 
> I'm not sure I understand your question. The idea is to walk the
> irq_data chain, until we hit a root irqchip. If we do hit one, we
> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
> 
> This would avoid the above brokenness, and still ensures that no
> interrupt reaches the CPU.
> 
>>
>>>    each one of them. Fairly expensive, but minimal in way of changes
>>>    in the crash code. Requires a per-irqchip flag, but ordering comes
>>>    in for free.
>>>
>>> 3) We do the same as (2), but at the irqdomain level. Not sure that's
>>>    any better, and it may be even more complicated and bring back some
>>>    ordering issues.
>>
>> Do you think that the same thing may happen in case of pci/msi?
>> I have no confidence but MSI has some kind of irq domain hierarchy.
> 
> Anything can happen, as people implement their interrupt infrastructure
> in weird and wonderful ways. So we need to be prepared for the worst.
> 
> I've pushed 3 patches on a branch[1]. It is mostly untested, but it
> should allow the above RPi3 disaster to cope with kexec.
> 
> 	M.
> 
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip
> 

I threw the kernel on my RPi3+ model but I wasn't able to start the crash
kernel. Unfortunately I don't have a JTAG adapter to check if it hangs for the
same reason.

Regards,
Matthias

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-04 14:08               ` Matthias Brugger
@ 2018-07-04 14:20                 ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-07-04 14:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 04 Jul 2018 15:08:38 +0100,
Matthias Brugger <mbrugger@suse.com> wrote:
> On 03/07/18 10:58, Marc Zyngier wrote:
> > I've pushed 3 patches on a branch[1]. It is mostly untested, but it
> > should allow the above RPi3 disaster to cope with kexec.
> > 
> > 	M.
> > 
> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip
> > 
> 
> I threw the kernel on my RPi3+ model but I wasn't able to start the crash
> kernel. Unfortunately I don't have a JTAG adapter to check if it hangs for the
> same reason.

Have you annotated the RPi3 root irq_chip structure so that the arm64
kexec code knows that this is the root interrupt controller? I've only
done so on the GIC drivers.

      M.

-- 
Jazz is not dead, it just smells funny.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-04  9:02                 ` Marc Zyngier
@ 2018-07-05 10:13                   ` takahiro.akashi
  -1 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi at linaro.org @ 2018-07-05 10:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2018 at 10:02:24AM +0100, Marc Zyngier wrote:
> On 04/07/18 09:41, takahiro.akashi at linaro.org wrote:
> > On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
> >> On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
> >>> Marc, James,
> >>>
> >>> I'd like to re-ignite the discussion.
> >>>
> >>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> >>>> On Wed, 06 Jun 2018 12:37:02 +0100,
> >>>> James Morse wrote:
> >>>>>
> >>>>> Hi Stefan,
> >>>>>
> >>>>> On 06/06/18 08:02, Stefan Wahren wrote:
> >>>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
> >>>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
> >>>>>>>> I attached a hardware debugger and found
> >>>>>>>> out that all CPU cores were stopped except one which was stuck in the
> >>>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> >>>>>>>> definitely not safe after a kernel panic.
> >>>>>
> >>>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> >>>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> >>>>>>> around in mmio registers, this should all be safe unless you re-entered the same
> >>>>>>> code.
> >>>>>
> >>>>>>>> If I'm right, then this is broken in general, but I have only ever seen
> >>>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> >>>>>>>> be more subtle.
> >>>>>
> >>>>>>> Is there a hardware difference around the interrupt controller on these?
> >>>>>
> >>>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> >>>>>> ethernet) instead of lan78xx (Gigabit ethernet).
> >>>>>
> >>>>> Bingo: it's the lan78xx driver that is sleeping from the irqchip
> >>>>> callbacks; the smsc95xx driver doesn't have a struct irq_chip, which
> >>>>> is why the RPi-3-B doesn't do this.
> >>>>>
> >>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
> >>>>> that even means anything). I assume these secondary irqchips would
> >>>>> have a summary-interrupt that goes to another irqchip. But I can't
> >>>>> see a way to tell them apart...
> >>>>
> >>>> There is none. A cascaded irqchip is just like a root irqchip, just
> >>>> that its output line is connected to another irqchip. But we have no
> >>>> easy way to identify the parent. Also, this particular driver looks
> >>>> quite creative (it reinvents the wheel for chained interrupts -- see
> >>>> intr_complete and lan78xx_status), meaning that even if we could have
> >>>> a magic way of identify a chained irqchip, we'd miss that one. Broken.
> >>>>
> >>>>> I think we need to wait until after the merge window for Marc's
> >>>>> wisdom on this!
> >>>>
> >>>> Overall, I can't think of an easy fix. We have a few options, but none
> >>>> of them involve a centralised change:
> >>>>
> >>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
> >>>>    mechanism. This involves changing the way we teardown irqs at
> >>>>    crash-time, and we'd then need some notion of reset ordering (think
> >>>>    of the layered ITS and GICv3, for example).
> >>>
> >>> Does this mean that all the irqchips have to be implemented with reset?
> >>
> >> No. Only those that want to be reset at kexec time.
> > 
> > I don't get the point yet. Who should have reset interface?
> > What is the criteria?
> 
> The criteria is "this irqchip requires a reset to be safely used in the
> secondary kernel". This is a judgement call from the person writing the
> driver.

This doesn't tell me anything more than "do it if you need it."
So let me ask you in other words.
Does gic driver need to provide a reset function?
Whether yes or no, why do you think so?

> 
> > 
> >>>>
> >>>> 2) We provide a way to identify interrupts that are ultimately backed
> >>>>    by a root controller, which implies walking down the hierarchy for
> >>>
> >>> To be clear, from bottom to top (or root), right?
> >>
> >> I'm not sure I understand your question. The idea is to walk the
> >> irq_data chain, until we hit a root irqchip. If we do hit one, we
> >> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
> > 
> > I thought that we would traverse the (chained irq) hierarchy from
> > bottom to top and call deactivate or others in that order.
> > Am I wrong here?
> 
> You *cannot* traverse a hierarchy through a chained irq. At that stage,
> irq_data->parent_data is NULL.

What do you mean by "at that stage?"
In your patch, "genirq: Add predicate for irq served by root irqchip,"
irq_irqchip_is_root() dereferences its parent_data.

-Takahiro AKASHI

> The only thing you can do is to iterate
> over all the interrupts and deactivate those that are directly on the
> root interrupt controller. This will have the effect of stopping the
> interrupts that are behind a chained controller.
> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: panic kexec broken on ARM64?
@ 2018-07-05 10:13                   ` takahiro.akashi
  0 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi @ 2018-07-05 10:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Stefan Wahren, Matthias Brugger, kexec mailing list,
	Petr Tesarik, James Morse, linux-arm-kernel

On Wed, Jul 04, 2018 at 10:02:24AM +0100, Marc Zyngier wrote:
> On 04/07/18 09:41, takahiro.akashi@linaro.org wrote:
> > On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
> >> On 03/07/18 08:01, takahiro.akashi@linaro.org wrote:
> >>> Marc, James,
> >>>
> >>> I'd like to re-ignite the discussion.
> >>>
> >>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> >>>> On Wed, 06 Jun 2018 12:37:02 +0100,
> >>>> James Morse wrote:
> >>>>>
> >>>>> Hi Stefan,
> >>>>>
> >>>>> On 06/06/18 08:02, Stefan Wahren wrote:
> >>>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
> >>>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
> >>>>>>>> I attached a hardware debugger and found
> >>>>>>>> out that all CPU cores were stopped except one which was stuck in the
> >>>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> >>>>>>>> definitely not safe after a kernel panic.
> >>>>>
> >>>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> >>>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
> >>>>>>> around in mmio registers, this should all be safe unless you re-entered the same
> >>>>>>> code.
> >>>>>
> >>>>>>>> If I'm right, then this is broken in general, but I have only ever seen
> >>>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> >>>>>>>> be more subtle.
> >>>>>
> >>>>>>> Is there a hardware difference around the interrupt controller on these?
> >>>>>
> >>>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
> >>>>>> ethernet) instead of lan78xx (Gigabit ethernet).
> >>>>>
> >>>>> Bingo: its the lan78xx driver that is sleeping from the irqchip
> >>>>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> >>>>> is why the RPi-3-B doesn't do this.
> >>>>>
> >>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
> >>>>> that even means anything). I assume these secondary irqchip's would
> >>>>> have a summary-interrupt that goes to another irqchip. But I can't
> >>>>> see a way to tell them apart..,
> >>>>
> >>>> There is none. A cascaded irqchip is just like a root irqchip, just
> >>>> that its output line is connected to another irqchip. But we have no
> >>>> easy way to identify the parent. Also, this particular driver looks
> >>>> quite creative (it reinvents the wheel for chained interrupts -- see
> >>>> intr_complete and lan78xx_status), meaning that even if we could have
> >>>> a magic way of identify a chained irqchip, we'd miss that one. Broken.
> >>>>
> >>>>> I think we need to wait until after the merge window for Marc's
> >>>>> wisdom on this!
> >>>>
> >>>> Overall, I can't think of an easy fix. We have a few options, but none
> >>>> of them involve a centralised change:
> >>>>
> >>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
> >>>>    mechanism. This involves changing the way we teardown irqs at
> >>>>    crash-time, and we'd then need some notion of reset ordering (think
> >>>>    of the layered ITS and GICv3, for example).
> >>>
> >>> Does this mean that all the irqchips have to be implemented with reset?
> >>
> >> No. Only those that want to be reset at kexec time.
> > 
> > I don't get the point yet. Who should have reset interface?
> > What is the criteria?
> 
> The criteria is "this irqchip requires a reset to be safely used in the
> secondary kernel". This is a judgement call from the person writing the
> driver.

This doesn't tell me anything more than "do it if you need it."
So let me ask you in other words.
Does gic driver need to provide a reset function?
Whether yes or no, why do you think so?

> 
> > 
> >>>>
> >>>> 2) We provide a way to identify interrupts that are ultimately backed
> >>>>    by a root controller, which implies walking down the hierarchy for
> >>>
> >>> To be clear, from bottom to top (or root), right?
> >>
> >> I'm not sure I understand your question. The idea is to walk the
> >> irq_data chain, until we hit a root irqchip. If we do hit one, we
> >> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
> > 
> > I thought that we would traverse the (chained irq) hierarchy from
> > bottom to top and call deactivate or others in that order.
> > Am I wrong here?
> 
> You *cannot* traverse a hierarchy through a chained irq. At that stage,
> irq_data->parent_data is NULL.

What do you mean by "at that stage?"
In your patch, "genirq: Add predicate for irq served by root irqchip,"
irq_irqchip_is_root() dereferences its parent_data.

-Takahiro AKASHI

> The only thing you can do is to iterate
> over all the interrupts and deactivate those that are directly on the
> root interrupt controller. This will have the effect of stopping the
> interrupts that are behind a chained controller.
> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-04 12:47                 ` James Morse
@ 2018-07-05 10:18                   ` takahiro.akashi
  -1 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi at linaro.org @ 2018-07-05 10:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 04, 2018 at 01:47:11PM +0100, James Morse wrote:
> Hi Akashi,
> 
> On 04/07/18 09:41, takahiro.akashi at linaro.org wrote:
> > On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
> >> On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
> >>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> >>>> On Wed, 06 Jun 2018 12:37:02 +0100 James Morse wrote:
> >>>>> Bingo: its the lan78xx driver that is sleeping from the irqchip
> >>>>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> >>>>> is why the RPi-3-B doesn't do this.
> >>>>>
> >>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
> >>>>> that even means anything). I assume these secondary irqchip's would
> >>>>> have a summary-interrupt that goes to another irqchip. But I can't
> >>>>> see a way to tell them apart..,
> 
> >>>> Overall, I can't think of an easy fix. We have a few options, but none
> >>>> of them involve a centralised change:
> >>>>
> >>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
> >>>>    mechanism. This involves changing the way we teardown irqs at
> >>>>    crash-time, and we'd then need some notion of reset ordering (think
> >>>>    of the layered ITS and GICv3, for example).
> >>>
> >>> Does this mean that all the irqchips have to be implemented with reset?
> >>
> >> No. Only those that want to be reset at kexec time.
> > 
> > I don't get the point yet.
> 
> (this stuff is new to me, below terminology is probably wrong:)
> 
> It seems there is actually a tree of irqchips, which feed interrupts through the
> tree via some summary-interrupt.
> The problem is one of these later-irqchips is on the other end of the USB bus,
> and requires a fair amount of sleeping in order to reset it.
> 
> The trick is everything comes through the root irqchip. So we can reset that to
> disable all the other interrupts in this tree.
> 
> This only works until the new kernel re-enables the summary-interrupt, it needs
> to reset the irqchip that the summary-interrupt leads to first.
> 
> (I assume shared summary-interrupts are somehow forbidden).

So you are choking at the "root" irqchip?
Is this safe when a new (kdump) kernel starts up and re-initialises
irq hierarchy?

-Takahiro AKASHI

> 
> > Who should have reset interface?
> > What is the criteria?
> 
> (reset-interface -> be reset at kdump time)
> 
> Just the root irqchip that all interrupts have to come through.
> 
> 
> Thanks,
> 
> James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: panic kexec broken on ARM64?
@ 2018-07-05 10:18                   ` takahiro.akashi
  0 siblings, 0 replies; 42+ messages in thread
From: takahiro.akashi @ 2018-07-05 10:18 UTC (permalink / raw)
  To: James Morse
  Cc: Stefan Wahren, Matthias Brugger, Marc Zyngier,
	kexec mailing list, Petr Tesarik, linux-arm-kernel

On Wed, Jul 04, 2018 at 01:47:11PM +0100, James Morse wrote:
> Hi Akashi,
> 
> On 04/07/18 09:41, takahiro.akashi@linaro.org wrote:
> > On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
> >> On 03/07/18 08:01, takahiro.akashi@linaro.org wrote:
> >>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> >>>> On Wed, 06 Jun 2018 12:37:02 +0100 James Morse wrote:
> >>>>> Bingo: its the lan78xx driver that is sleeping from the irqchip
> >>>>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> >>>>> is why the RPi-3-B doesn't do this.
> >>>>>
> >>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
> >>>>> that even means anything). I assume these secondary irqchip's would
> >>>>> have a summary-interrupt that goes to another irqchip. But I can't
> >>>>> see a way to tell them apart..,
> 
> >>>> Overall, I can't think of an easy fix. We have a few options, but none
> >>>> of them involve a centralised change:
> >>>>
> >>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
> >>>>    mechanism. This involves changing the way we teardown irqs at
> >>>>    crash-time, and we'd then need some notion of reset ordering (think
> >>>>    of the layered ITS and GICv3, for example).
> >>>
> >>> Does this mean that all the irqchips have to be implemented with reset?
> >>
> >> No. Only those that want to be reset at kexec time.
> > 
> > I don't get the point yet.
> 
> (this stuff is new to me, below terminology is probably wrong:)
> 
> It seems there is actually a tree of irqchips, which feed interrupts through the
> tree via some summary-interrupt.
> The problem is one of these later-irqchips is on the other end of the USB bus,
> and requires a fair amount of sleeping in order to reset it.
> 
> The trick is everything comes through the root irqchip. So we can reset that to
> disable all the other interrupts in this tree.
> 
> This only works until the new kernel re-enables the summary-interrupt, it needs
> to reset the irqchip that the summary-interrupt leads to first.
> 
> (I assume shared summary-interrupts are somehow forbidden).

So you are choking at the "root" irqchip?
Is this safe when a new (kdump) kernel starts up and re-initialises
irq hierarchy?
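
For illustration only, the scheme James describes above can be modelled
in plain userspace C. This is a sketch under assumptions, not kernel
code: struct model_chip, summary_masked and model_can_deliver() are
made-up names, and the model treats "masking at the root" as masking
the summary line that a secondary chip feeds upstream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a tree of irqchips; not a kernel structure. */
struct model_chip {
	struct model_chip *upstream;	/* NULL for the root irqchip */
	bool summary_masked;		/* state of this chip's summary line at upstream */
};

/*
 * An interrupt raised at @chip reaches the CPU only if every summary
 * line on the path to the root is unmasked.
 */
static bool model_can_deliver(const struct model_chip *chip)
{
	for (; chip && chip->upstream; chip = chip->upstream)
		if (chip->summary_masked)
			return false;
	return chip != NULL;
}

static void model_summary_selfcheck(void)
{
	struct model_chip root = { NULL, false };
	struct model_chip usb  = { &root, false };	/* chip behind a slow bus */
	struct model_chip gpio = { &usb, false };	/* chip behind that one */

	assert(model_can_deliver(&gpio));

	/* Crash path: touch only the line at the root, never the slow chip. */
	usb.summary_masked = true;
	assert(!model_can_deliver(&usb));
	assert(!model_can_deliver(&gpio));
	assert(model_can_deliver(&root));

	/* New kernel re-enables the summary line: everything behind it is live again. */
	usb.summary_masked = false;
	assert(model_can_deliver(&gpio));
}
```

The last step is the case the question above is about: once the
summary line is unmasked again, whatever state the secondary chip was
left in becomes visible, so the new kernel would have to reset that
chip before re-enabling the line.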

-Takahiro AKASHI

> 
> > Who should have reset interface?
> > What is the criteria?
> 
> (reset-interface -> be reset at kdump time)
> 
> Just the root irqchip that all interrupts have to come through.
> 
> 
> Thanks,
> 
> James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-05 10:13                   ` takahiro.akashi
@ 2018-07-05 10:19                     ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-07-05 10:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/07/18 11:13, takahiro.akashi at linaro.org wrote:
> On Wed, Jul 04, 2018 at 10:02:24AM +0100, Marc Zyngier wrote:
>> On 04/07/18 09:41, takahiro.akashi at linaro.org wrote:
>>> On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
>>>> On 03/07/18 08:01, takahiro.akashi at linaro.org wrote:
>>>>> Marc, James,
>>>>>
>>>>> I'd like to re-ignite the discussion.
>>>>>
>>>>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>>>>>> On Wed, 06 Jun 2018 12:37:02 +0100,
>>>>>> James Morse wrote:
>>>>>>>
>>>>>>> Hi Stefan,
>>>>>>>
>>>>>>> On 06/06/18 08:02, Stefan Wahren wrote:
>>>>>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
>>>>>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
>>>>>>>>>> I attached a hardware debugger and found
>>>>>>>>>> out that all CPU cores were stopped except one which was stuck in the
>>>>>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>>>>>>>>> definitely not safe after a kernel panic.
>>>>>>>
>>>>>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>>>>>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
>>>>>>>>> around in mmio registers, this should all be safe unless you re-entered the same
>>>>>>>>> code.
>>>>>>>
>>>>>>>>>> If I'm right, then this is broken in general, but I have only ever seen
>>>>>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>>>>>>>>> be more subtle.
>>>>>>>
>>>>>>>>> Is there a hardware difference around the interrupt controller on these?
>>>>>>>
>>>>>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
>>>>>>>> ethernet) instead of lan78xx (Gigabit ethernet).
>>>>>>>
>>>>>>> Bingo: its the lan78xx driver that is sleeping from the irqchip
>>>>>>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
>>>>>>> is why the RPi-3-B doesn't do this.
>>>>>>>
>>>>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
>>>>>>> that even means anything). I assume these secondary irqchip's would
>>>>>>> have a summary-interrupt that goes to another irqchip. But I can't
>>>>>>> see a way to tell them apart..,
>>>>>>
>>>>>> There is none. A cascaded irqchip is just like a root irqchip, just
>>>>>> that its output line is connected to another irqchip. But we have no
>>>>>> easy way to identify the parent. Also, this particular driver looks
>>>>>> quite creative (it reinvents the wheel for chained interrupts -- see
>>>>>> intr_complete and lan78xx_status), meaning that even if we could have
>>>>>> a magic way of identify a chained irqchip, we'd miss that one. Broken.
>>>>>>
>>>>>>> I think we need to wait until after the merge window for Marc's
>>>>>>> wisdom on this!
>>>>>>
>>>>>> Overall, I can't think of an easy fix. We have a few options, but none
>>>>>> of them involve a centralised change:
>>>>>>
>>>>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>>>>>    mechanism. This involves changing the way we teardown irqs at
>>>>>>    crash-time, and we'd then need some notion of reset ordering (think
>>>>>>    of the layered ITS and GICv3, for example).
>>>>>
>>>>> Does this mean that all the irqchips have to be implemented with reset?
>>>>
>>>> No. Only those that want to be reset at kexec time.
>>>
>>> I don't get the point yet. Who should have reset interface?
>>> What is the criteria?
>>
>> The criteria is "this irqchip requires a reset to be safely used in the
>> secondary kernel". This is a judgement call from the person writing the
>> driver.
> 
> This doesn't tell me anything more than "do it if you need it."
> So let me ask you in other words.
> Does gic driver need to provide a reset function?
> Whether yes or no, why do you think so?

Because I know the architecture and I can assess that it needs it. Case
in point: The RDs have memory tables. kexec without disabling LPIs, and
you end-up with memory corruption.

Sorry, but there is no magic bullet. You have to understand what you're
doing.

> 
>>
>>>
>>>>>>
>>>>>> 2) We provide a way to identify interrupts that are ultimately backed
>>>>>>    by a root controller, which implies walking down the hierarchy for
>>>>>
>>>>> To be clear, from bottom to top (or root), right?
>>>>
>>>> I'm not sure I understand your question. The idea is to walk the
>>>> irq_data chain, until we hit a root irqchip. If we do hit one, we
>>>> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
>>>
>>> I thought that we would traverse the (chained irq) hierarchy from
>>> bottom to top and call deactivate or others in that order.
>>> Am I wrong here?
>>
>> You *cannot* traverse a hierarchy through a chained irq. At that stage,
>> irq_data->parent_data is NULL.
> 
> What do you mean by "at that stage?"

At the point you reach a secondary interrupt controller that exposes a
chained IRQ.

> In your patch, "genirq: Add predicate for irq served by root irqchip,"
> irq_irqchip_is_root() dereferences its parent_data.

And? Hierarchies and chained controllers are two orthogonal concepts.
You can have a hierarchy on top of a chained controller, and that
doesn't make it a root controller.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: panic kexec broken on ARM64?
@ 2018-07-05 10:19                     ` Marc Zyngier
  0 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-07-05 10:19 UTC (permalink / raw)
  To: takahiro.akashi, James Morse, Stefan Wahren, Petr Tesarik,
	Matthias Brugger, kexec mailing list, linux-arm-kernel

On 05/07/18 11:13, takahiro.akashi@linaro.org wrote:
> On Wed, Jul 04, 2018 at 10:02:24AM +0100, Marc Zyngier wrote:
>> On 04/07/18 09:41, takahiro.akashi@linaro.org wrote:
>>> On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
>>>> On 03/07/18 08:01, takahiro.akashi@linaro.org wrote:
>>>>> Marc, James,
>>>>>
>>>>> I'd like to re-ignite the discussion.
>>>>>
>>>>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>>>>>> On Wed, 06 Jun 2018 12:37:02 +0100,
>>>>>> James Morse wrote:
>>>>>>>
>>>>>>> Hi Stefan,
>>>>>>>
>>>>>>> On 06/06/18 08:02, Stefan Wahren wrote:
>>>>>>>> Am 05.06.2018 um 19:46 schrieb James Morse:
>>>>>>>>> On 05/06/18 09:01, Petr Tesarik wrote:
>>>>>>>>>> I attached a hardware debugger and found
>>>>>>>>>> out that all CPU cores were stopped except one which was stuck in the
>>>>>>>>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>>>>>>>>> definitely not safe after a kernel panic.
>>>>>>>
>>>>>>>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>>>>>>>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just poking
>>>>>>>>> around in mmio registers, this should all be safe unless you re-entered the same
>>>>>>>>> code.
>>>>>>>
>>>>>>>>>> If I'm right, then this is broken in general, but I have only ever seen
>>>>>>>>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>>>>>>>>> be more subtle.
>>>>>>>
>>>>>>>>> Is there a hardware difference around the interrupt controller on these?
>>>>>>>
>>>>>>>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, Fast
>>>>>>>> ethernet) instead of lan78xx (Gigabit ethernet).
>>>>>>>
>>>>>>> Bingo: its the lan78xx driver that is sleeping from the irqchip
>>>>>>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
>>>>>>> is why the RPi-3-B doesn't do this.
>>>>>>>
>>>>>>> It may be valid for kdump to only teardown the 'root irqdomain' (if
>>>>>>> that even means anything). I assume these secondary irqchip's would
>>>>>>> have a summary-interrupt that goes to another irqchip. But I can't
>>>>>>> see a way to tell them apart..,
>>>>>>
>>>>>> There is none. A cascaded irqchip is just like a root irqchip, just
>>>>>> that its output line is connected to another irqchip. But we have no
>>>>>> easy way to identify the parent. Also, this particular driver looks
>>>>>> quite creative (it reinvents the wheel for chained interrupts -- see
>>>>>> intr_complete and lan78xx_status), meaning that even if we could have
>>>>>> a magic way of identify a chained irqchip, we'd miss that one. Broken.
>>>>>>
>>>>>>> I think we need to wait until after the merge window for Marc's
>>>>>>> wisdom on this!
>>>>>>
>>>>>> Overall, I can't think of an easy fix. We have a few options, but none
>>>>>> of them involve a centralised change:
>>>>>>
>>>>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>>>>>    mechanism. This involves changing the way we teardown irqs at
>>>>>>    crash-time, and we'd then need some notion of reset ordering (think
>>>>>>    of the layered ITS and GICv3, for example).
>>>>>
>>>>> Does this mean that all the irqchips have to be implemented with reset?
>>>>
>>>> No. Only those that want to be reset at kexec time.
>>>
>>> I don't get the point yet. Who should have reset interface?
>>> What is the criteria?
>>
>> The criteria is "this irqchip requires a reset to be safely used in the
>> secondary kernel". This is a judgement call from the person writing the
>> driver.
> 
> This doesn't tell me anything more than "do it if you need it."
> So let me ask you in other words.
> Does gic driver need to provide a reset function?
> Whether yes or no, why do you think so?

Because I know the architecture and I can assess that it needs it. Case
in point: The RDs have memory tables. kexec without disabling LPIs, and
you end-up with memory corruption.

Sorry, but there is no magic bullet. You have to understand what you're
doing.

> 
>>
>>>
>>>>>>
>>>>>> 2) We provide a way to identify interrupts that are ultimately backed
>>>>>>    by a root controller, which implies walking down the hierarchy for
>>>>>
>>>>> To be clear, from bottom to top (or root), right?
>>>>
>>>> I'm not sure I understand your question. The idea is to walk the
>>>> irq_data chain, until we hit a root irqchip. If we do hit one, we
>>>> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
>>>
>>> I thought that we would traverse the (chained irq) hierarchy from
>>> bottom to top and call deactivate or others in that order.
>>> Am I wrong here?
>>
>> You *cannot* traverse a hierarchy through a chained irq. At that stage,
>> irq_data->parent_data is NULL.
> 
> What do you mean by "at that stage?"

At the point you reach a secondary interrupt controller that exposes a
chained IRQ.

> In your patch, "genirq: Add predicate for irq served by root irqchip,"
> irq_irqchip_is_root() dereferences its parent_data.

And? Hierarchies and chained controllers are two orthogonal concepts.
You can have a hierarchy on top of a chained controller, and that
doesn't make it a root controller.
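
For illustration only, a rough userspace sketch of the distinction
drawn above. It is not the code from the "genirq: Add predicate for irq
served by root irqchip" patch: the model_* types and the is_root flag
are assumptions made up for this example, and only the parent_data walk
mirrors what is described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct irq_chip / irq_data. */
struct model_irq_chip {
	const char *name;
	bool is_root;	/* assumed flag: chip signals the CPU directly */
};

struct model_irq_data {
	const struct model_irq_chip *chip;
	struct model_irq_data *parent_data;	/* NULL where the hierarchy ends */
};

/*
 * Walk the parent_data chain to its last level and report whether a
 * root irqchip sits there. A chain that ends at a chained (secondary)
 * controller also ends with parent_data == NULL, so the walk alone
 * cannot cross into the controller it is cascaded from.
 */
static bool model_irq_served_by_root(const struct model_irq_data *d)
{
	if (!d)
		return false;
	while (d->parent_data)
		d = d->parent_data;
	return d->chip && d->chip->is_root;
}

static void model_selfcheck(void)
{
	const struct model_irq_chip gic = { "gic", true };
	const struct model_irq_chip msi = { "msi-layer", false };
	const struct model_irq_chip usb = { "usb-chained", false };

	struct model_irq_data gic_line = { &gic, NULL };
	struct model_irq_data msi_line = { &msi, &gic_line };	/* hierarchy on a root */
	struct model_irq_data usb_line = { &usb, NULL };	/* chained controller */
	struct model_irq_data stacked  = { &msi, &usb_line };	/* hierarchy on a chained one */

	assert(model_irq_served_by_root(&gic_line));
	assert(model_irq_served_by_root(&msi_line));
	assert(!model_irq_served_by_root(&usb_line));
	assert(!model_irq_served_by_root(&stacked));
}
```

In this model only the first two interrupts would be
deactivated/EOIed/disabled at crash time; the other two are left alone,
and go quiet as a side effect of stopping the line they are cascaded
from.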

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-07-05 10:19                     ` Marc Zyngier
@ 2018-08-02 15:49                       ` David Woodhouse
  -1 siblings, 0 replies; 42+ messages in thread
From: David Woodhouse @ 2018-08-02 15:49 UTC (permalink / raw)
  To: linux-arm-kernel



On Thu, 2018-07-05 at 11:19 +0100, Marc Zyngier wrote:
> >> The criteria is "this irqchip requires a reset to be safely used in the
> >> secondary kernel". This is a judgement call from the person writing the
> >> driver.
> > 
> > This doesn't tell me anything more than "do it if you need it."
> > So let me ask you in other words.
> > Does gic driver need to provide a reset function?
> > Whether yes or no, why do you think so?
> 
> Because I know the architecture and I can assess that it needs it. Case
> in point: The RDs have memory tables. kexec without disabling LPIs, and
> you end-up with memory corruption.
> 
> Sorry, but there is no magic bullet. You have to understand what you're
> doing.

Remember, kexec and kdump are subtly different things.

In the case of an orderly kexec, sure you can go walking chains of
interrupt controllers (and other devices) and nicely quiescing them.

In the kdump case it's different. You really want as few instructions
as possible between realising you're going to panic, and entering the
kdump kernel. You NMI¹ all the other cores to dump their state, and
just GTFO.

In the kdump case you also aren't *reusing* the memory, which means
that existing memory tables which are being accessed by hardware
shouldn't be an issue. You can let the second kernel reset it all from
a controlled and not-already-panicking environment.

-- 
dwmw2


¹ Oops no NMI. Doh.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: panic kexec broken on ARM64?
@ 2018-08-02 15:49                       ` David Woodhouse
  0 siblings, 0 replies; 42+ messages in thread
From: David Woodhouse @ 2018-08-02 15:49 UTC (permalink / raw)
  To: Marc Zyngier, takahiro.akashi, James Morse, Stefan Wahren,
	Petr Tesarik, Matthias Brugger, kexec mailing list,
	linux-arm-kernel


On Thu, 2018-07-05 at 11:19 +0100, Marc Zyngier wrote:
> >> The criteria is "this irqchip requires a reset to be safely used in the
> >> secondary kernel". This is a judgement call from the person writing the
> >> driver.
> > 
> > This doesn't tell me anything more than "do it if you need it."
> > So let me ask you in other words.
> > Does gic driver need to provide a reset function?
> > Whether yes or no, why do you think so?
> 
> Because I know the architecture and I can assess that it needs it. Case
> in point: The RDs have memory tables. kexec without disabling LPIs, and
> you end-up with memory corruption.
> 
> Sorry, but there is no magic bullet. You have to understand what you're
> doing.

Remember, kexec and kdump are subtly different things.

In the case of an orderly kexec, sure you can go walking chains of
interrupt controllers (and other devices) and nicely quiescing them.

In the kdump case it's different. You really want as few instructions
as possible between realising you're going to panic, and entering the
kdump kernel. You NMI¹ all the other cores to dump their state, and
just GTFO.

In the kdump case you also aren't *reusing* the memory, which means
that existing memory tables which are being accessed by hardware
shouldn't be an issue. You can let the second kernel reset it all from
a controlled and not-already-panicking environment.
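
For illustration only, the difference between the two paths as
described above, reduced to a userspace model. None of these names are
kernel APIs; model_dev, model_jump and model_prepare_jump() are made up
for this sketch, and it deliberately ignores hardware that cannot be
left running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; not a kernel structure. */
struct model_dev {
	const char *name;
	bool quiesced;
};

enum model_jump { MODEL_KEXEC, MODEL_KDUMP };

/*
 * Returns how many devices were touched before jumping to the next
 * kernel. An orderly kexec quiesces all of them; the kdump path
 * touches none and leaves the reset to the new kernel, which runs
 * from memory the old kernel's hardware tables do not live in.
 */
static size_t model_prepare_jump(enum model_jump kind,
				 struct model_dev *devs, size_t n)
{
	size_t touched = 0;
	size_t i;

	if (kind == MODEL_KDUMP)
		return 0;

	for (i = 0; i < n; i++) {
		devs[i].quiesced = true;
		touched++;
	}
	return touched;
}

static void model_jump_selfcheck(void)
{
	struct model_dev devs[] = {
		{ "irqchip", false },
		{ "nic", false },
	};

	/* kdump: nothing is touched on the way out. */
	assert(model_prepare_jump(MODEL_KDUMP, devs, 2) == 0);
	assert(!devs[0].quiesced && !devs[1].quiesced);

	/* Orderly kexec: every device is quiesced first. */
	assert(model_prepare_jump(MODEL_KEXEC, devs, 2) == 2);
	assert(devs[0].quiesced && devs[1].quiesced);
}
```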

-- 
dwmw2


¹ Oops no NMI. Doh.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* panic kexec broken on ARM64?
  2018-08-02 15:49                       ` David Woodhouse
@ 2018-08-03  6:06                         ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2018-08-03  6:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 02 Aug 2018 16:49:54 +0100,
David Woodhouse <dwmw2@infradead.org> wrote:
> 
> On Thu, 2018-07-05 at 11:19 +0100, Marc Zyngier wrote:
> > >> The criteria is "this irqchip requires a reset to be safely used in the
> > >> secondary kernel". This is a judgement call from the person writing the
> > >> driver.
> > > 
> > > This doesn't tell me anything more than "do it if you need it."
> > > So let me ask you in other words.
> > > Does gic driver need to provide a reset function?
> > > Whether yes or no, why do you think so?
> > 
> > Because I know the architecture and I can assess that it needs it. Case
> > in point: The RDs have memory tables. kexec without disabling LPIs, and
> > you end-up with memory corruption.
> > 
> > Sorry, but there is no magic bullet. You have to understand what you're
> > doing.
> 
> Remember, kexec and kdump are subtly different things.
> 
> In the case of an orderly kexec, sure you can go walking chains of
> interrupt controllers (and other devices) and nicely quiescing them.
> 
> In the kdump case it's different. You really want as few instructions
> as possible between realising you're going to panic, and entering the
> kdump kernel. You NMI¹ all the other cores to dump their state, and
> just GTFO.
> 
> In the kdump case you also aren't *reusing* the memory, which means
> that existing memory tables which are being accessed by hardware
> shouldn't be an issue. You can let the second kernel reset it all from
> a controlled and not-already-panicking environment.

Yes, in the kdump case, you can safely continue without experiencing
memory corruption. No, you cannot reset them, which is the whole
point. What you can do is use the LPI configuration as it is, and cope
with it (at the expense of changing memory that was under control of
the primary kernel -- let's hope you're not trying to debug
interrupt-related issues...).

See [1] which implements that and a bit more to actually support kexec
in such a configuration.

> ¹ Oops no NMI. Doh.

Care to help reviewing this[2]?

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/gicv3-kdump

[2] https://lwn.net/Articles/755906/

-- 
Jazz is not dead, it just smell funny.




Thread overview: 42+ messages
2018-06-05  8:01 panic kexec broken on ARM64? Petr Tesarik
2018-06-05  8:01 ` Petr Tesarik
2018-06-05 17:46 ` James Morse
2018-06-05 17:46   ` James Morse
2018-06-06  7:02   ` Stefan Wahren
2018-06-06  7:02     ` Stefan Wahren
2018-06-06  8:00     ` Petr Tesarik
2018-06-06  8:00       ` Petr Tesarik
2018-06-06 11:41       ` Petr Tesarik
2018-06-06 11:41         ` Petr Tesarik
2018-06-06 11:37     ` James Morse
2018-06-06 11:37       ` James Morse
2018-06-10 12:24       ` Marc Zyngier
2018-06-10 12:24         ` Marc Zyngier
2018-07-03  7:01         ` takahiro.akashi at linaro.org
2018-07-03  7:01           ` takahiro.akashi
2018-07-03  8:58           ` Marc Zyngier
2018-07-03  8:58             ` Marc Zyngier
2018-07-04  8:41             ` takahiro.akashi at linaro.org
2018-07-04  8:41               ` takahiro.akashi
2018-07-04  9:02               ` Marc Zyngier
2018-07-04  9:02                 ` Marc Zyngier
2018-07-05 10:13                 ` takahiro.akashi at linaro.org
2018-07-05 10:13                   ` takahiro.akashi
2018-07-05 10:19                   ` Marc Zyngier
2018-07-05 10:19                     ` Marc Zyngier
2018-08-02 15:49                     ` David Woodhouse
2018-08-02 15:49                       ` David Woodhouse
2018-08-03  6:06                       ` Marc Zyngier
2018-08-03  6:06                         ` Marc Zyngier
2018-07-04 12:47               ` James Morse
2018-07-04 12:47                 ` James Morse
2018-07-05 10:18                 ` takahiro.akashi at linaro.org
2018-07-05 10:18                   ` takahiro.akashi
2018-07-04 14:08             ` Matthias Brugger
2018-07-04 14:08               ` Matthias Brugger
2018-07-04 14:20               ` Marc Zyngier
2018-07-04 14:20                 ` Marc Zyngier
2018-06-06  5:36 ` Bhupesh Sharma
2018-06-06  5:36   ` Bhupesh Sharma
2018-06-06  7:58   ` Petr Tesarik
2018-06-06  7:58     ` Petr Tesarik
