Re: [xen-unstable test] 106504: regressions

From: Chao Gao @ 2017-03-07  4:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Kevin Tian, osstest-admin, Xuquan

On Tue, Mar 07, 2017 at 02:16:50AM -0700, Jan Beulich wrote:
>>>> On 07.03.17 at 06:52, <osstest-admin@xenproject.org> wrote:
>> flight 106504 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/106504/ 
>> 
>> Regressions :-(
>> 
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  [...]
>>  test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 16 guest-stop fail REGR. vs. 106482
>
>Here we go:
>
>(XEN) d15v0: intack: 02:48 pt: 38
>(XEN) vIRR: 00000000 00000000 00000000 00000000 00000000 00000000 00010000 00000000
>(XEN)  PIR: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>(XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:360
>(XEN) ----[ Xen-4.9-unstable  x86_64  debug=y   Not tainted ]----
>(XEN) CPU:    0
>(XEN) RIP:    e008:[<ffff82d0802039e8>] vmx_intr_assist+0x5fa/0x61a
>(XEN) RFLAGS: 0000000000010292   CONTEXT: hypervisor (d15v0)
>(XEN) rax: ffff82d0804754a8   rbx: ffff83007f375680   rcx: 0000000000000000
>(XEN) rdx: ffff83007cd3ffff   rsi: 000000000000000a   rdi: ffff82d0803316d8
>(XEN) rbp: ffff83007cd3ff08   rsp: ffff83007cd3fea8   r8:  ffff830277db8000
>(XEN) r9:  0000000000000001   r10: 0000000000000000   r11: 0000000000000001
>(XEN) r12: 00000000ffffffff   r13: ffff82d0802b5b02   r14: ffff82d0802b5b02
>(XEN) r15: ffff83027d82e000   cr0: 0000000080050033   cr4: 00000000001526e0
>(XEN) cr3: 0000000259135000   cr2: 000000000164f034
>(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>(XEN) Xen code around <ffff82d0802039e8> (vmx_intr_assist+0x5fa/0x61a):
>(XEN)  fb ff ff e9 49 fc ff ff <0f> 0b 89 ce 48 89 df e8 2a 21 00 00 e9 49 fe ff
>(XEN) Xen stack trace from rsp=ffff83007cd3fea8:
>(XEN)    ffff82d08044ab00 00000038ffffffff ffff83007cd3ffff ffff83027d82e000
>(XEN)    ffff83007cd3fef8 ffff82d080133a3d ffff83007f375000 ffff83007f375000
>(XEN)    ffff83007f7fc000 ffff83026df78000 0000000000000000 ffff83027d82e000
>(XEN)    ffff83007cd3fdb0 ffff82d080213191 0000000000000004 00000000000000c2
>(XEN)    0000000000000020 0000000000000002 ffff880029994000 ffffffff81ade0a0
>(XEN)    0000000000000246 0000000000000000 ffff88002d000008 0000000000000004
>(XEN)    000000000000006c 0000000000000000 00000000000003f8 00000000000003f8
>(XEN)    ffffffff81ade0a0 0000beef0000beef ffffffff81389ac4 000000bf0000beef
>(XEN)    0000000000000002 ffff88002f403e08 000000000000beef 000000000000beef
>(XEN)    000000000000beef 000000000000beef 000000000000beef 0000000000000000
>(XEN)    ffff83007f375000 0000000000000000 00000000001526e0
>(XEN) Xen call trace:
>(XEN)    [<ffff82d0802039e8>] vmx_intr_assist+0x5fa/0x61a
>(XEN)    [<ffff82d080213191>] vmx_asm_vmexit_handler+0x41/0x120
>(XEN) 
>(XEN) 
>(XEN) ****************************************
>(XEN) Panic on CPU 0:
>(XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:360
>(XEN) ****************************************
>
>I didn't make an attempt at interpreting this yet, but I wonder if it
>is more than coincidence that - just like the first time the ASSERT()
>triggered - this is again a guest-stop of a qemuu-debianhvm.
>

Cc: xuquan.

Exciting! I have been monitoring osstest for about a month with
a Python script, but it only crawls the flights once a day.

From the output, pt_vector is 0x38 and intack.vector is 0x30. These
two values are the same as they were the first time. Only one bit,
0x30, is set in vIRR, and PIR is all zeroes. So maybe our suspicion
that PIR is not synced to vIRR is wrong. It is strange that the 0x38
bit is not present in vIRR. Is it possible that we clear the 0x38 bit
just after we return from pt_update_irq()? Or, as I suspected before,
pt_update_irq() sets 0x30 but wrongly returns 0x38.
Do you think it is worth trying to disable the guest's LAPIC timer and
force it to use IRQ0, while changing the RTE very frequently?
If so, I am glad to do it.
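
For anyone who wants to re-check this reading of the dump, here is a
minimal Python sketch that decodes the vIRR and PIR lines into vector
numbers. It assumes the eight 32-bit words are printed highest word
first, so the seventh word covers vectors 0x20-0x3f. That ordering is
an assumption, not something the log states. The helper name
set_vectors() is made up for illustration and is not part of Xen.

```python
def set_vectors(words_hex):
    """Return the sorted vector numbers whose bits are set in a 256-bit dump.

    Assumption: the eight 32-bit words are printed highest word first,
    so the last word on the line covers vectors 0x00-0x1f.
    """
    words = [int(w, 16) for w in words_hex.split()]
    if len(words) != 8:
        raise ValueError("expected eight 32-bit words")
    vectors = []
    for pos, word in enumerate(words):
        # The word at position 0 holds the highest vectors (0xe0-0xff).
        base = (len(words) - 1 - pos) * 32
        for bit in range(32):
            if word & (1 << bit):
                vectors.append(base + bit)
    return sorted(vectors)


# Values copied from the quoted crash log.
virr = "00000000 00000000 00000000 00000000 00000000 00000000 00010000 00000000"
pir = "00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000"
pt_vector = 0x38

virr_vectors = set_vectors(virr)
pir_vectors = set_vectors(pir)

print([hex(v) for v in virr_vectors])  # ['0x30']
print(pir_vectors)                     # []
print(pt_vector in virr_vectors)       # False
```

Under that ordering assumption, the only pending vector is 0x30, PIR
is empty, and 0x38 is not pending anywhere, which matches the reading
above.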

Thanks,
Chao

>Jan
>
