On Thu, 21 Feb 2019, Julien Grall wrote:
> Hi Roger,
>
> On Thu, 21 Feb 2019, 08:08 Roger Pau Monné, wrote:
>       FWIW, you can also mask the interrupt while waiting for the thread to
>       execute the interrupt handler. Ie:
>
> Thank you for providing steps, however where would the masking be done?
> By the irqchip or a custom solution?
>
>       1. Interrupt injected
>       2. Execute guest event channel callback
>       3. Scan for pending interrupts
>       4. Mask interrupt
>       5. Clear pending field
>       6. Queue threaded handler
>       7. Go to 3 until all interrupts are drained
>       [...]
>       8. Execute interrupt handler in thread
>       9. Unmask interrupt
>
>       That should prevent you from stacking interrupts?

Sorry for coming late to the thread, and thanks Julien for pointing it
out to me. I am afraid I was the one to break the flow back in 2011 with
the following commit:

  7e186bdd0098 xen: do not clear and mask evtchns in __xen_evtchn_do_upcall

Oops :-)

Xen event channels have their own workflow: the one Roger wrote above.
They used to be handled using handle_fasteoi_irq until 7e186bdd0098,
then I switched (almost) all of them to handle_edge_irq.

Looking closely at irq handling again, it doesn't look like we can do
what we need with handle_edge_irq today: we can't mask the event channel
before clearing it. But we can do that if we go back to using
handle_fasteoi_irq. In fact, I managed to verify that LinuxRT works fine
as dom0 with the attached dynamic.patch, which switches xen_dynamic_chip
IRQs back to handle_fasteoi_irq.

From the rest of this thread, it looks like the issue might appear with
PIRQs as well. Thus, I wrote a second patch, pirqs.patch, to switch PIRQs
back to handle_fasteoi_irq as well. However, Xen on ARM does not use
PIRQs, so I couldn't test it at all. I would appreciate it if
Boris/Juergen tested it. Let me know what you want me to do with the
second patch.
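
To make the mask-before-clear ordering in Roger's steps concrete, here is
a toy user-space model in C. It is only a sketch and not kernel code:
struct evtchn, inject(), upcall() and run_thread() are hypothetical names
made up for this illustration, and the model assumes that unmasking a
channel which still has a pending event leads to a fresh upcall.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_EVTCHN 8 /* hypothetical table size for this toy model */

/* Hypothetical per-channel state; not the real Xen/Linux layout. */
struct evtchn {
    bool pending; /* an event has arrived and has not been consumed */
    bool masked;  /* while set, new events do not trigger an upcall */
    bool queued;  /* threaded handler is waiting to run */
    int handled;  /* how many times the threaded handler ran */
};

static struct evtchn chans[NR_EVTCHN];

/* Step 1: an event is injected. Returns true if an upcall would fire. */
static bool inject(int port)
{
    chans[port].pending = true;
    return !chans[port].masked;
}

/*
 * Steps 2-7: the upcall scans all channels and, for each deliverable one,
 * masks it BEFORE clearing pending, then queues the threaded handler.
 * Returns the number of handlers queued.
 */
static int upcall(void)
{
    int queued = 0;

    for (int port = 0; port < NR_EVTCHN; port++) {
        struct evtchn *c = &chans[port];

        if (!c->pending || c->masked)
            continue;
        c->masked = true;   /* step 4 */
        c->pending = false; /* step 5 */
        c->queued = true;   /* step 6 */
        queued++;
    }
    return queued;
}

/*
 * Steps 8-9: the thread runs the handler, then unmasks. Returns true if
 * an event arrived in the meantime, i.e. another upcall is needed.
 */
static bool run_thread(int port)
{
    struct evtchn *c = &chans[port];

    /* A queued handler must always find its channel still masked. */
    assert(!c->queued || c->masked);
    if (c->queued) {
        c->queued = false;
        c->handled++;
    }
    c->masked = false; /* step 9 */
    return c->pending;
}
```

In this model, an event injected while the handler is still queued only
sets the pending bit: upcall() skips the masked channel, so the handler is
never queued twice, and the event is picked up once run_thread() unmasks.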