From: "Jan Beulich" <JBeulich@suse.com>
To: Atom2 <ariel.atom2@web2web.at>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>, xen-devel@lists.xen.org
Subject: Re: HVM domains crash after upgrade from XEN 4.5.1 to 4.5.2
Date: Fri, 13 Nov 2015 00:25:03 -0700
Message-ID: <56459E5F02000078000B4944@prv-mh.provo.novell.com>
In-Reply-To: <56451A2B.9090706@web2web.at>

>>> On 13.11.15 at 00:00, <ariel.atom2@web2web.at> wrote:
> On 12.11.15 at 17:43, Andrew Cooper wrote:
>> On 12/11/15 14:29, Atom2 wrote:
>>> Hi Andrew,
>>> thanks for your reply. Answers are inline further down.
>>>
>>> On 12.11.15 at 14:01, Andrew Cooper wrote:
>>>> On 12/11/15 12:52, Jan Beulich wrote:
>>>>>>>> On 12.11.15 at 02:08, <ariel.atom2@web2web.at> wrote:
>>>>>> After the upgrade HVM domUs appear to no longer work - regardless
>>>>>> of the
>>>>>> dom0 kernel (tested with both 3.18.9 and 4.1.7 as the dom0 kernel); PV
>>>>>> domUs, however, work just fine as before on both dom0 kernels.
>>>>>>
>>>>>> xl dmesg shows the following information after the first crashed HVM
>>>>>> domU which is started as part of the machine booting up:
>>>>>> [...]
>>>>>> (XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest
>>>>>> state (0).
>>>>>> (XEN) ************* VMCS Area **************
>>>>>> (XEN) *** Guest State ***
>>>>>> (XEN) CR0: actual=0x0000000000000039, shadow=0x0000000000000011,
>>>>>> gh_mask=ffffffffffffffff
>>>>>> (XEN) CR4: actual=0x0000000000002050, shadow=0x0000000000000000,
>>>>>> gh_mask=ffffffffffffffff
>>>>>> (XEN) CR3: actual=0x0000000000800000, target_count=0
>>>>>> (XEN)      target0=0000000000000000, target1=0000000000000000
>>>>>> (XEN)      target2=0000000000000000, target3=0000000000000000
>>>>>> (XEN) RSP = 0x0000000000006fdc (0x0000000000006fdc)  RIP =
>>>>>> 0x0000000100000000 (0x0000000100000000)
>>>>> Other than RIP looking odd for a guest still in non-paged protected
>>>>> mode I can't seem to spot anything wrong with guest state.
>>>> Odd? That will be the source of the failure.
>>>>
>>>> Outside of long mode, the upper 32 bits of %rip should all be zero, and
>>>> it should not be possible to set any of them.
>>>>
>>>> I suspect that the guest has exited for emulation, and there has been a
>>>> bad update to %rip.  The alternative (which I hope is not the case) is
>>>> that there is a hardware erratum which allows the guest to accidentally
>>>> get itself into this condition.
>>>>
>>>> Are you able to rerun with a debug build of the hypervisor?
>>> [snip]
>>> Another question is whether prior to enabling the debug USE flag it
>>> might make sense to re-compile with gcc-4.8.5 (please see my previous
>>> list reply) to rule out any compiler related issues. Jan, Andrew -
>>> what are your thoughts?
>> First of all, check whether the compiler makes a difference on 4.5.2.
> Hi Andrew,
> I changed the compiler and there was no change for the better:
> Unfortunately the HVM domU is still crashing with a similar error
> message as soon as it is started.
>> If both compiles result in a guest crashing in that manner, test a debug
>> Xen to see if any assertions/errors are encountered just before the
>> guest crashes.
>>
> As the compiler did not make any difference, I enabled the debug USE
> flag, re-compiled (using gcc-4.9.3), and rebooted using a serial console
> to capture output. Unfortunately I did not get very far and things
> became even stranger: This time the system did not even finish the boot
> process, but rather hard-stopped pretty early with a message reading
> "Panic on CPU 3: DOUBLE FAULT -- system shutdown". The captured logfile
> is attached as "serial log.txt".
> 
> As this happened immediately after the CPU microcode update, I thought 
> there might be a connection and disabled the microcode update. After the 
> next reboot it seemed as if the boot process got a bit further as 
> evidenced by a few more lines in the log file (those between lines 136 
> and 197 in the second log file named "serial log no ucode.txt"), but in 
> the end it finished off with an identical error message (only the CPU #
> was different this time, but that number seems to change between boots
> anyway).
> 
> I hope that makes some sense to you.

Not really, other than making me suspect even more strongly that this is
bad hardware or something fundamentally wrong with your build. Did you
retry with a freshly built 4.5.1? Could you alternatively try with a
known good build of 4.5.2 (e.g. from osstest)?

Jan
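
For readers following the quoted VMCS dump: Andrew's point that the upper 32 bits of %rip must be zero outside long mode can be illustrated with a small check against the dumped CR0 and RIP values. This is only a rough sketch, not Xen code and not the full set of VM-entry guest-state checks; the function names are hypothetical, and it relies on the assumption that a guest with CR0.PG clear cannot be in long mode (long mode requires paging).

```python
# Illustrative sketch only: names are hypothetical, not taken from Xen.
CR0_PE = 1 << 0    # protection enable
CR0_PG = 1 << 31   # paging enable


def paging_enabled(cr0: int) -> bool:
    """Return True if CR0.PG is set."""
    return bool(cr0 & CR0_PG)


def rip_upper_bits_clear(rip: int) -> bool:
    """Return True if bits 63:32 of RIP are all zero."""
    return (rip >> 32) == 0


def rip_plausible(cr0: int, rip: int) -> bool:
    """Check RIP against the mode implied by CR0.

    With CR0.PG clear the guest cannot be in long mode, so RIP must
    fit in 32 bits. With CR0.PG set, this sketch cannot decide without
    further state (which is not modelled here), so it returns True.
    """
    if not paging_enabled(cr0):
        return rip_upper_bits_clear(rip)
    return True


# Values copied from the quoted "xl dmesg" output.
cr0 = 0x0000000000000039
rip = 0x0000000100000000

print("protected mode:", bool(cr0 & CR0_PE))
print("paging enabled:", paging_enabled(cr0))
print("RIP plausible: ", rip_plausible(cr0, rip))
```

With the dumped values, CR0 has PE set and PG clear, while RIP has bit 32 set, so the check reports the RIP as implausible for that mode.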
