On 27.11.20 14:40, Jan Beulich wrote:
> On 27.11.2020 14:31, Manuel Bouyer wrote:
>> On Fri, Nov 27, 2020 at 02:18:54PM +0100, Jan Beulich wrote:
>>> On 27.11.2020 14:13, Manuel Bouyer wrote:
>>>> On Fri, Nov 27, 2020 at 12:29:35PM +0100, Jan Beulich wrote:
>>>>> On 27.11.2020 11:59, Roger Pau Monné wrote:
>>>>>> --- a/xen/arch/x86/hvm/irq.c
>>>>>> +++ b/xen/arch/x86/hvm/irq.c
>>>>>> @@ -187,6 +187,10 @@ void hvm_gsi_assert(struct domain *d, unsigned int gsi)
>>>>>>        * to know if the GSI is pending or not.
>>>>>>        */
>>>>>>       spin_lock(&d->arch.hvm.irq_lock);
>>>>>> +    if ( gsi == TRACK_IRQ )
>>>>>> +        debugtrace_printk("hvm_gsi_assert irq %u trig %u assert count %u\n",
>>>>>> +                          gsi, trig, hvm_irq->gsi_assert_count[gsi]);
>>>>>
>>>>> This produces
>>>>>
>>>>> 81961 hvm_gsi_assert irq 34 trig 1 assert count 1
>>>>>
>>>>> Since the logging occurs ahead of the call to assert_gsi(), it
>>>>> means we don't signal anything to Dom0, because according to our
>>>>> records there's still an IRQ in flight. Unfortunately we only
>>>>> see the tail of the trace, so it's not possible to tell how / when
>>>>> we got into this state.
>>>>>
>>>>> Manuel - is this the only patch you have in place? Or did you keep
>>>>> any prior ones? Iirc there once was one where Roger also suppressed
>>>>> some de-assert call.
>>>>
>>>> Yes, I have some of the previous patches (otherwise Xen panics).
>>>> Attached is the diffs I currently have
>>>
>>> I think you want to delete the hunk dropping the call to
>>> hvm_gsi_deassert() from pt_irq_time_out(). Iirc it was that
>>> addition which changed the behavior to just a single IRQ ever
>>> making it into Dom0. And it ought to be only the change to
>>> msix_write() which is needed to avoid the panic.
>>
>> yes, I did keep the hvm_gsi_deassert() patch because I expected it
>> to make things easier, as it allows to interract with Xen without changing
>> interrupt states.
>
> Right, but then we'd need to see the beginning of the trace,
> rather than it starting at (in this case) about 95,000. Yet ...
>
>> I removed it, here's a new trace
>>
>> http://www-soc.lip6.fr/~bouyer/xen-log12.txt
>
> ... hmm, odd - no change at all:
>
> 95572 hvm_gsi_assert irq 34 trig 1 assert count 1
>
> I was sort of expecting that this might be where we fail to
> set the assert count back to zero. Will need further
> thinking, if nothing else than how to turn down the verbosity
> without hiding crucial information. Or maybe Roger has got
> some idea ...

Set the debugtrace buffer size to something huge? Or panic when the
buffer is full?

Note that the debugtrace is printed in case of a panic, so panicking
on a full buffer would dump the beginning of the trace rather than
only its tail.


Juergen
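
To illustrate the two options, here is a minimal, self-contained sketch of
a trace buffer that either wraps (keeping only the most recent records) or
stops at the first record that does not fit (keeping the beginning). This
is not Xen's debugtrace code: struct trace_buf, trace_init() and
trace_printf() are hypothetical names made up for the illustration, and
the point where the sketch sets "full" is only where a hypervisor could
choose to panic and dump the buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a trace buffer; NOT Xen's debugtrace. */
struct trace_buf {
    char *data;           /* caller-provided storage */
    size_t size;          /* capacity in bytes */
    size_t used;          /* bytes written since the last wrap */
    bool stop_when_full;  /* true: keep the head; false: wrap, keep the tail */
    bool full;            /* set once a record did not fit in stop mode */
    unsigned long wraps;  /* number of times the buffer restarted */
};

void trace_init(struct trace_buf *t, char *storage, size_t size,
                bool stop_when_full)
{
    assert(t != NULL && storage != NULL && size > 0);
    t->data = storage;
    t->size = size;
    t->used = 0;
    t->stop_when_full = stop_when_full;
    t->full = false;
    t->wraps = 0;
}

/* Append one formatted record. Returns false if the record was dropped. */
bool trace_printf(struct trace_buf *t, const char *fmt, ...)
{
    char line[128];
    va_list ap;
    int n;

    if ( t->full )
        return false;

    va_start(ap, fmt);
    n = vsnprintf(line, sizeof(line), fmt, ap);
    va_end(ap);

    if ( n < 0 )
        return false;
    if ( (size_t)n >= sizeof(line) )
        n = (int)sizeof(line) - 1;        /* record was truncated */
    if ( (size_t)n > t->size )
        return false;                     /* can never fit */

    if ( t->used + (size_t)n > t->size )
    {
        if ( t->stop_when_full )
        {
            /* A hypervisor could panic here and dump the buffer. */
            t->full = true;
            return false;
        }
        /* Wrap mode: restart at offset 0, overwriting the oldest data. */
        t->used = 0;
        t->wraps++;
    }

    memcpy(t->data + t->used, line, (size_t)n);
    t->used += (size_t)n;
    return true;
}
```

With stop_when_full set, the first records (where the assert count last
went from 0 to 1) survive; in wrap mode only the latest ones do, which
matches what we see in the logs so far.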