Hi Thomas,

I think I got mixed up with logical apic id and logical cpu :-(

Patch and trace attached.

On Thu, May 07, 2020 at 02:53:41PM +0200, Thomas Gleixner wrote:
> Ashok,
>
> "Raj, Ashok" writes:
>
> > We did a bit more tracing and it looks like the IRR check is actually
> > not happening on the right cpu. See below.
>
> What?
>
> >> So we have 3 points where an interrupt can fire:
> >>
> >> A) Before #2
> >>
> >> B) After #2 and before #3
> >>
> >> C) After #3
> >>
> >> #A is hitting the old vector which is still valid on the old CPU and
> >> will be handled once interrupts are enabled with the correct irq
> >> descriptor - Normal operation (same as with maskable MSI)
> >>
> >> #B This must be checked in the IRR because the there is no valid vector
> >> on the old CPU.
> >
> > The check for IRR seems like on a random cpu3 vs checking for the new vector 33
> > on old cpu 6?
>
> The whole sequence runs on CPU 3. If old CPU was 6 then this should
> never run on CPU 3.
>
> > This is the place when we force the retrigger without the IRR check things seem to fix itself.
>
> It's not fixing it. It's papering over the root cause.
>
> > Did we miss something?
>
> Yes, you missed to analyze why this runs on CPU3 when old CPU is 6. But
> the last interrupt actually was on CPU3.
>
> > -0 [003] d.h. 200.278052: xhci_irq: xhci irq
>
> Can you please provide the full trace and the patch you used to generate
> it?
>

Cheers,
Ashok