On 26.01.22 16:31, Corey Minyard wrote:
> On Wed, Jan 26, 2022 at 03:51:36PM +0100, Juergen Gross wrote:
>> On 26.01.22 14:56, Corey Minyard wrote:
>>> On Wed, Jan 26, 2022 at 07:08:22AM +0100, Juergen Gross wrote:
>
> snip..
>
>>>
>>> csd: cnt(63d8e1f): 0003->0037 queue
>>> csd: cnt(63d8e20): 0003->0037 ipi
>>> csd: cnt(63d8e21): 0003->0037 ping
>>>
>>> In __smp_call_single_queue_debug CPU 3 sends another message to
>>> CPU 55 and sends an IPI. But there should be a pinged entry
>>> after this.
>>>
>>> csd: cnt(63d8e22): 0003->0037 queue
>>> csd: cnt(63d8e23): 0003->0037 noipi
>>
>> This is interesting. Those are 5 consecutive entries without any
>> missing in between (see the counter values). Could it be that after
>> the ping there was an interrupt and the code was re-entered for
>> sending another IPI? This would clearly result in a hang as seen.
>
> Since preempt is enabled, wouldn't it eventually come back to the first
> thread and send the IPI? Unless CPU 3 is stuck in an interrupt or
> interrupt storm.

With preempt disabled (you probably meant that) only an IPI from
interrupt context would be possible. And it would be stuck, of course,
as it would need to wait for the CSD lock.


Juergen