* guest being crashed from viridian_start_apic_assist()
@ 2016-06-03 9:04 Jan Beulich
2016-06-03 9:39 ` Paul Durrant
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2016-06-03 9:04 UTC (permalink / raw)
To: Paul Durrant; +Cc: xen-devel
Paul,
to test a guest ACPI change I went through all Windows versions that
I have installed anywhere, to find 32-bit Win7 dying (the other
variants I tried worked fine), albeit retrying a couple of times I then
didn't see this a second time. Is this something you've observed
elsewhere? The APIC assist logic isn't that complicated, and I can't
really spot any race in there...
Thanks, Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 9:04 guest being crashed from viridian_start_apic_assist() Jan Beulich
@ 2016-06-03 9:39 ` Paul Durrant
2016-06-03 9:59 ` Jan Beulich
0 siblings, 1 reply; 9+ messages in thread
From: Paul Durrant @ 2016-06-03 9:39 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 03 June 2016 10:04
> To: Paul Durrant
> Cc: xen-devel
> Subject: guest being crashed from viridian_start_apic_assist()
>
> Paul,
>
> to test a guest ACPI change I went through all Windows versions that
> I have installed anywhere, to find 32-bit Win7 dying (the other
> variants I tried worked fine), albeit retrying a couple of times I then
> didn't see this a second time. Is this something you've observed
> elsewhere? The APIC assist logic isn't that complicated, and I can't
> really spot any race in there...
>
Hi Jan,
How did it die?
I've not seen any problems in dev testing and I usually use win7 32-bit. APIC assist has not yet been enabled by default in XenServer builds though so it has not had automated test exposure yet... I should really get round to turning it on now that 7.0 has shipped.
Paul
> Thanks, Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 9:39 ` Paul Durrant
@ 2016-06-03 9:59 ` Jan Beulich
2016-06-03 12:11 ` Paul Durrant
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2016-06-03 9:59 UTC (permalink / raw)
To: Paul Durrant; +Cc: xen-devel
>>> On 03.06.16 at 11:39, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 03 June 2016 10:04
>>
>> to test a guest ACPI change I went through all Windows versions that
>> I have installed anywhere, to find 32-bit Win7 dying (the other
>> variants I tried worked fine), albeit retrying a couple of times I then
>> didn't see this a second time. Is this something you've observed
>> elsewhere? The APIC assist logic isn't that complicated, and I can't
>> really spot any race in there...
>
> How did it die?
From the single domain_crash() in that function. No other information
was available (apart from the register dump, which doesn't look like
it would be very useful for analysis here).
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 9:59 ` Jan Beulich
@ 2016-06-03 12:11 ` Paul Durrant
2016-06-03 13:05 ` Jan Beulich
0 siblings, 1 reply; 9+ messages in thread
From: Paul Durrant @ 2016-06-03 12:11 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 03 June 2016 10:59
> To: Paul Durrant
> Cc: xen-devel
> Subject: RE: guest being crashed from viridian_start_apic_assist()
>
> >>> On 03.06.16 at 11:39, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 03 June 2016 10:04
> >>
> >> to test a guest ACPI change I went through all Windows versions that
> >> I have installed anywhere, to find 32-bit Win7 dying (the other
> >> variants I tried worked fine), albeit retrying a couple of times I then
> >> didn't see this a second time. Is this something you've observed
> >> elsewhere? The APIC assist logic isn't that complicated, and I can't
> >> really spot any race in there...
> >
> > How did it die?
>
> From the single domain_crash() in that function. No other information
> was available (apart from the register dump, which doesn't look like
> it would be very useful for analysis here).
>
There is an inconsistency between the test to abort a previous assist, which is based on:
irr = vlapic_find_highest_irr(vlapic);
isr = vlapic_find_highest_isr(vlapic);
if ( (isr & 0xf0) >= (irr & 0xf0) )
and the test to start a new one:
isr = vlapic_find_lowest_vector(&vlapic->regs->data[APIC_ISR]);
if ( isr >= 0 && isr < vector )
I suspect that may be the problem you hit.
Paul
> Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 12:11 ` Paul Durrant
@ 2016-06-03 13:05 ` Jan Beulich
2016-06-03 13:31 ` Paul Durrant
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2016-06-03 13:05 UTC (permalink / raw)
To: Paul Durrant; +Cc: xen-devel
>>> On 03.06.16 at 14:11, <Paul.Durrant@citrix.com> wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 03 June 2016 10:59
>> To: Paul Durrant
>> Cc: xen-devel
>> Subject: RE: guest being crashed from viridian_start_apic_assist()
>>
>> >>> On 03.06.16 at 11:39, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 03 June 2016 10:04
>> >>
>> >> to test a guest ACPI change I went through all Windows versions that
>> >> I have installed anywhere, to find 32-bit Win7 dying (the other
>> >> variants I tried worked fine), albeit retrying a couple of times I then
>> >> didn't see this a second time. Is this something you've observed
>> >> elsewhere? The APIC assist logic isn't that complicated, and I can't
>> >> really spot any race in there...
>> >
>> > How did it die?
>>
>> From the single domain_crash() in that function. No other information
>> was available (apart from the register dump, which doesn't look like
>> it would be very useful for analysis here).
>>
>
> There is an inconsistency between the test to abort a previous assist, which
> is based on:
>
> irr = vlapic_find_highest_irr(vlapic);
>
> isr = vlapic_find_highest_isr(vlapic);
> if ( (isr & 0xf0) >= (irr & 0xf0) )
>
> and the test to start a new one:
>
> isr = vlapic_find_lowest_vector(&vlapic->regs->data[APIC_ISR]);
> if ( isr >= 0 && isr < vector )
>
> I suspect that may be the problem you hit.
I don't think so. Of the second if(), the right side could actually be
dropped, as the logic in vlapic_has_pending_irq() guarantees
(vector & 0xf0) > (highest_isr & 0xf0), i.e. vector > highest_isr
and hence necessarily vector > lowest_isr. That is, if
vlapic_has_pending_irq() finds _some_ ISR bit set, it must be one
below vector. Which means viridian_start_apic_assist() gets called
solely when there's _no_ other ISR bit set.
What I find more problematic looking at those functions (but
unrelated to the problem here afaict) is the
vlapic_virtual_intr_delivery_enabled() related logic and its
interaction with the Viridian APIC assist (leaving aside nested
mode for the moment): vlapic_has_pending_irq() won't do
anything with the APIC assist in that case, but if force_ack is
true in vlapic_ack_pending_irq() there will be an interaction.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 13:05 ` Jan Beulich
@ 2016-06-03 13:31 ` Paul Durrant
2016-06-03 14:37 ` Jan Beulich
0 siblings, 1 reply; 9+ messages in thread
From: Paul Durrant @ 2016-06-03 13:31 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 03 June 2016 14:06
> To: Paul Durrant
> Cc: xen-devel
> Subject: RE: guest being crashed from viridian_start_apic_assist()
>
> >>> On 03.06.16 at 14:11, <Paul.Durrant@citrix.com> wrote:
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 03 June 2016 10:59
> >> To: Paul Durrant
> >> Cc: xen-devel
> >> Subject: RE: guest being crashed from viridian_start_apic_assist()
> >>
> >> >>> On 03.06.16 at 11:39, <Paul.Durrant@citrix.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 03 June 2016 10:04
> >> >>
> >> >> to test a guest ACPI change I went through all Windows versions that
> >> >> I have installed anywhere, to find 32-bit Win7 dying (the other
> >> >> variants I tried worked fine), albeit retrying a couple of times I then
> >> >> didn't see this a second time. Is this something you've observed
> >> >> elsewhere? The APIC assist logic isn't that complicated, and I can't
> >> >> really spot any race in there...
> >> >
> >> > How did it die?
> >>
> >> From the single domain_crash() in that function. No other information
> >> was available (apart from the register dump, which doesn't look like
> >> it would be very useful for analysis here).
> >>
> >
> > There is an inconsistency between the test to abort a previous assist, which
> > is based on:
> >
> > irr = vlapic_find_highest_irr(vlapic);
> >
> > isr = vlapic_find_highest_isr(vlapic);
> > if ( (isr & 0xf0) >= (irr & 0xf0) )
> >
> > and the test to start a new one:
> >
> > isr = vlapic_find_lowest_vector(&vlapic->regs->data[APIC_ISR]);
> > if ( isr >= 0 && isr < vector )
> >
> > I suspect that may be the problem you hit.
>
> I don't think so. Of the second if(), the right side could actually be
> dropped, as the logic in vlapic_has_pending_irq() guarantees
> (vector & 0xf0) > (highest_isr & 0xf0), i.e. vector > highest_isr
> and hence necessarily vector > lowest_isr. That is, if
> vlapic_has_pending_irq() finds _some_ ISR bit set, it must be one
> below vector. Which means viridian_start_apic_assist() gets called
> solely when there's _no_ other ISR bit set.
>
Yes, that test does appear to be unnecessary.
> What I find more problematic looking at those functions (but
> unrelated to the problem here afaict) is the
> vlapic_virtual_intr_delivery_enabled() related logic and its
> interaction with the Viridian APIC assist (leaving aside nested
> mode for the moment): vlapic_has_pending_irq() won't do
> anything with the APIC assist in that case, but if force_ack is
> true in vlapic_ack_pending_irq() there will be an interaction.
>
...and the interaction would indeed be the crash you saw, I think, because you could start an APIC assist when vlapic_virtual_intr_delivery_enabled() is true, but never complete it because vlapic_has_pending_irq() bails before doing so. Thus, attempting the next APIC assist would lead to a domain_crash().
Paul
> Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 13:31 ` Paul Durrant
@ 2016-06-03 14:37 ` Jan Beulich
2016-06-03 14:44 ` Paul Durrant
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2016-06-03 14:37 UTC (permalink / raw)
To: Paul Durrant; +Cc: xen-devel
>>> On 03.06.16 at 15:31, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 03 June 2016 14:06
>>
>> What I find more problematic looking at those functions (but
>> unrelated to the problem here afaict) is the
>> vlapic_virtual_intr_delivery_enabled() related logic and its
>> interaction with the Viridian APIC assist (leaving aside nested
>> mode for the moment): vlapic_has_pending_irq() won't do
>> anything with the APIC assist in that case, but if force_ack is
>> true in vlapic_ack_pending_irq() there will be an interaction.
>
> ...and the interaction would indeed be the crash you saw, I think, because
> you could start an APIC assist when vlapic_virtual_intr_delivery_enabled() is
> true, but never complete it because vlapic_has_pending_irq() bails before
> doing so. Thus, attempting the next APIC assist would lead to a
> domain_crash().
With one caveat - this was on a Nehalem, which doesn't have that
capability. Hence me saying "unrelated"...
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 14:37 ` Jan Beulich
@ 2016-06-03 14:44 ` Paul Durrant
2016-06-03 15:01 ` Jan Beulich
0 siblings, 1 reply; 9+ messages in thread
From: Paul Durrant @ 2016-06-03 14:44 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 03 June 2016 15:38
> To: Paul Durrant
> Cc: xen-devel
> Subject: RE: guest being crashed from viridian_start_apic_assist()
>
> >>> On 03.06.16 at 15:31, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 03 June 2016 14:06
> >>
> >> What I find more problematic looking at those functions (but
> >> unrelated to the problem here afaict) is the
> >> vlapic_virtual_intr_delivery_enabled() related logic and its
> >> interaction with the Viridian APIC assist (leaving aside nested
> >> mode for the moment): vlapic_has_pending_irq() won't do
> >> anything with the APIC assist in that case, but if force_ack is
> >> true in vlapic_ack_pending_irq() there will be an interaction.
> >
> > ...and the interaction would indeed be the crash you saw, I think, because
> > you could start an APIC assist when vlapic_virtual_intr_delivery_enabled()
> is
> > true, but never complete it because vlapic_has_pending_irq() bails before
> > doing so. Thus, attempting the next APIC assist would lead to a
> > domain_crash().
>
> With one caveat - this was on a Nehalem, which doesn't have that
> capability. Hence me saying "unrelated"...
>
Ah, I see.
Paul
> Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: guest being crashed from viridian_start_apic_assist()
2016-06-03 14:44 ` Paul Durrant
@ 2016-06-03 15:01 ` Jan Beulich
0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2016-06-03 15:01 UTC (permalink / raw)
To: Paul Durrant; +Cc: xen-devel
>>> On 03.06.16 at 16:44, <Paul.Durrant@citrix.com> wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 03 June 2016 15:38
>> To: Paul Durrant
>> Cc: xen-devel
>> Subject: RE: guest being crashed from viridian_start_apic_assist()
>>
>> >>> On 03.06.16 at 15:31, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 03 June 2016 14:06
>> >>
>> >> What I find more problematic looking at those functions (but
>> >> unrelated to the problem here afaict) is the
>> >> vlapic_virtual_intr_delivery_enabled() related logic and its
>> >> interaction with the Viridian APIC assist (leaving aside nested
>> >> mode for the moment): vlapic_has_pending_irq() won't do
>> >> anything with the APIC assist in that case, but if force_ack is
>> >> true in vlapic_ack_pending_irq() there will be an interaction.
>> >
>> > ...and the interaction would indeed be the crash you saw, I think, because
>> > you could start an APIC assist when vlapic_virtual_intr_delivery_enabled()
>> is
>> > true, but never complete it because vlapic_has_pending_irq() bails before
>> > doing so. Thus, attempting the next APIC assist would lead to a
>> > domain_crash().
>>
>> With one caveat - this was on a Nehalem, which doesn't have that
>> capability. Hence me saying "unrelated"...
>
> Ah, I see.
And btw., that force_ack flag also gets set to true only by vVMX
code (i.e. I wouldn't be surprised at all if something was broken
there).
The bad thing then is - it looks like we still have no real explanation
for that one time guest crash.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-06-03 15:01 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-03 9:04 guest being crashed from viridian_start_apic_assist() Jan Beulich
2016-06-03 9:39 ` Paul Durrant
2016-06-03 9:59 ` Jan Beulich
2016-06-03 12:11 ` Paul Durrant
2016-06-03 13:05 ` Jan Beulich
2016-06-03 13:31 ` Paul Durrant
2016-06-03 14:37 ` Jan Beulich
2016-06-03 14:44 ` Paul Durrant
2016-06-03 15:01 ` Jan Beulich
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).