Hi Stefan,

On Sun, Jan 16, 2022 at 06:26:58PM +0100, Stefan Wahren wrote:
> recently i saw a report [1] about bad chained IRQ with Linux 5.15.13
> Aarch64 with Arch Linux. I'm able to reproduce this issue on my
> Raspberry Pi 4 B (8 GB RAM, Firmware: 2022-01-06T15:39:30) by turning
> the connected HDMI monitor off and on again.

By turning the monitor on and off, you mean that you used the power
button on it? Not something like disabling the output in sysfs, right?

> Kernel output is the following:
>
> [15053.285438] irq 10, desc: 00000000acc41fca, depth: 0, count: 0, unhandled: 0
> [15053.295440] ->handle_irq():  00000000b28cf1d1, brcmstb_l2_intc_irq_handle+0x0/0x1e0
> [15053.306049] ->irq_data.chip(): 000000005f172760, gic_data+0x0/0x768
> [15053.315233] ->action(): 00000000236e815e
> [15053.322022] ->action->handler(): 0000000013023289, bad_chained_irq+0x0/0x50
> [15053.331909]      IRQ_LEVEL set
> [15053.337822]    IRQ_NOPROBE set
> [15053.343715]  IRQ_NOREQUEST set
> [15053.349585]   IRQ_NOTHREAD set

IRQ10 is the interrupt that a monitor has been connected on HDMI1, which
makes sense if you were using HDMI1. Usually, when a display is turned
on, it will issue a pulse on the HPD line so we would have a
disconnection interrupt followed by a connection interrupt.

This is weird though, since we have an interrupt handler on that
interrupt (hpd-connected in the DT binding):
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vc4/vc4_hdmi.c#L1578

> Content of /proc/interrupts after the issue occured:
>
>            CPU0       CPU1       CPU2       CPU3
>   9:          0          0          0          0     GICv2  25 Level     vgic
>  10:          1          0          0          0     GICv2 128 Level     (null)
>  12:     130322      26028      27670     135225     GICv2  30 Level     arch_timer
>  13:          0          0          0          0     GICv2  27 Level     kvm guest vtimer
>  19:          0          0          0          0     GICv2 107 Level     fe004000.txp
>  20:       7450          0          0          0     GICv2  65 Level     fe00b880.mailbox
>  25:       6525          0          0          0     GICv2 153 Level     uart-pl011
>  26:          0          0          0          0     GICv2 149 Level     fe205000.i2c, fe804000.i2c
>  27:          9          0          0          0     GICv2 125 Level     ttyS1
>  28:      36999          0          0          0     GICv2 158 Level     mmc0, mmc1
>  29:          1          0          0          0     GICv2 129 Level     vc4 hvs
>  30:          0          0          0          0     GICv2 105 Level     fe980000.usb, fe980000.usb
>  31:          0          0          0          0     GICv2 112 Level     DMA IRQ
>  33:          0          0          0          0     GICv2 114 Level     DMA IRQ
>  40:          0          0          0          0     GICv2 141 Level     vc4 crtc
>  41:          0          0          0          0     GICv2 142 Level     vc4 crtc, vc4 crtc
>  42:         10          0          0          0     GICv2 133 Level     vc4 crtc
>  43:          1          0          0          0  interrupt-controller@7ef00100   0 Edge      vc4 hdmi cec tx
>  44:          0          0          0          0  interrupt-controller@7ef00100   1 Edge      vc4 hdmi cec rx
>  47:          0          0          0          0  interrupt-controller@7ef00100   4 Edge      vc4 hdmi hpd connected
>  48:          1          0          0          0  interrupt-controller@7ef00100   5 Edge      vc4 hdmi hpd disconnected
>  49:          0          0          0          0  interrupt-controller@7ef00100   8 Edge      vc4 hdmi cec tx
>  50:          0          0          0          0  interrupt-controller@7ef00100   7 Edge      vc4 hdmi cec rx
>  53:          0          0          0          0  interrupt-controller@7ef00100  10 Edge      vc4 hdmi hpd connected

And it's there as well.

>  54:          0          0          0          0  interrupt-controller@7ef00100  11 Edge      vc4 hdmi hpd disconnected
>  55:          7          0          0          0     GICv2  66 Level     VCHIQ doorbell
>  56:          0          0          0          0     GICv2  48 Level     arm-pmu
>  57:          0          0          0          0     GICv2  49 Level     arm-pmu
>  58:          0          0          0          0     GICv2  50 Level     arm-pmu
>  59:          0          0          0          0     GICv2  51 Level     arm-pmu
>  62:      47599          0          0          0     GICv2 189 Level     eth0
>  63:       4681          0          0          0     GICv2 190 Level     eth0
>  64:          0          0          0          0     GICv2 175 Level     PCIe PME, aerdrv
>  65:        326          0          0          0  BRCM STB PCIe MSI 524288 Edge      xhci_hcd
> IPI0:      2442       5185       7195      18290       Rescheduling interrupts
> IPI1:       481        383        518        533       Function call interrupts
> IPI2:         0          0          0          0       CPU stop interrupts
> IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
> IPI4:         0          0          0          0       Timer broadcast interrupts
> IPI5:         1          0          0          0       IRQ work interrupts
> IPI6:         0          0          0          0       CPU wake-up interrupts
> Err:          1
>
> Comparing the vendor & mainline DTS, i noticed differences at hdmi0/1.
> The vendor DTS has an additional register to access the same space as
> aon_intr (interrupt parent), which looks ugly [2].

This is an artifact from the past. We used to use that register directly
in our driver before we went to upstream the CEC support, but we don't
anymore. The DT patch must have been carried around since then, but
nothing should be using it.

> Additionally i noted that bcm2711.dtsi uses the compatible
> "brcm,bcm2711-l2-intc" with a level high interrupt, but according to
> irq-brcmstb-l2.c [3] the compatible is not defined and would fallback to
> "brcm,l2-intc" with brcmstb_l2_edge_intc_of_init. This looks fishy.
>
> I didn't try to reproduce this with Raspberry Pi OS & mainline kernel,
> but i hope these are enough information so far.

I don't remember anyone reporting this before, and I have tested the
disconnection / connection interrupts myself a number of times without
ever seeing this. The level vs edge stuff might be a good explanation.

Maxime