On Thu, 21 Feb 2019, Julien Grall wrote:
> Hi Roger,
> 
> On Thu, 21 Feb 2019, 08:08 Roger Pau Monné, <roger.pau@citrix.com> wrote:
>       FWIW, you can also mask the interrupt while waiting for the thread to
>       execute the interrupt handler. Ie:
> 
> 
> Thank you for providing steps, however where would the masking be done? By the irqchip or a custom solution?
> 
> 
>       1. Interrupt injected
>       2. Execute guest event channel callback
>       3. Scan for pending interrupts
>       4. Mask interrupt
>       5. Clear pending field
>       6. Queue threaded handler
>       7. Go to 3 until all interrupts are drained
>       [...]
>       8. Execute interrupt handler in thread
>       9. Unmask interrupt
> 
>       That should prevent you from stacking interrupts?

Sorry for coming late to the thread, and thanks Julien for pointing it
out to me. I am afraid I was the one to break the flow back in 2011 with
the following commit:

  7e186bdd0098 xen: do not clear and mask evtchns in __xen_evtchn_do_upcall

Oops :-)


Xen event channels have their own workflow; the one Roger wrote above.
They used to be handled using handle_fasteoi_irq until 7e186bdd0098,
then I switched (almost) all of them to handle_edge_irq.

Looking closely at irq handling again, it doesn't look like we can do
what we need with handle_edge_irq today: we can't mask the event channel
before clearing it. But we can do that if we go back to using
handle_fasteoi_irq.

In fact, I managed to verify that LinuxRT works fine as dom0 with the
attached dynamic.patch that switches back xen_dynamic_chip IRQs to
handle_fasteoi_irq.


From the rest of this thread, it looks like the issue might appear with
PIRQs as well. Thus, I wrote a second patch pirqs.patch to switch back
to handle_fasteoi_irq PIRQs as well. However, Xen on ARM does not use
PIRQs so I couldn't test it at all. I would appreciate if Boris/Juegen
tested it. Let me know what you want me to do with the second patch.