From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Jan Beulich <JBeulich@suse.com>, Kevin Moraga <kmoragas@riseup.net>
Cc: xen-devel@lists.xen.org
Subject: Re: crash on boot with 4.6.1 on fedora 24
Date: Tue, 10 May 2016 09:39:49 -0400
Message-ID: <5731E4A5.30906@oracle.com>
In-Reply-To: <5731A87502000078000E9D8C@prv-mh.provo.novell.com>

On 05/10/2016 03:23 AM, Jan Beulich wrote:
>>>> On 09.05.16 at 20:40, <boris.ostrovsky@oracle.com> wrote:
>> On 05/09/2016 01:22 PM, Kevin Moraga wrote:
>>> On 05/09/2016 11:15 AM, Boris Ostrovsky wrote:
>>>> On 05/09/2016 12:40 PM, Kevin Moraga wrote:
>>>>> On 05/09/2016 09:53 AM, Jan Beulich wrote:
>>>>>>>>> On 09.05.16 at 16:52, <kmoragas@riseup.net> wrote:
>>>>>>> On 05/09/2016 04:08 AM, Jan Beulich wrote:
>>>>>>>>>>> On 09.05.16 at 00:51, <kmoragas@riseup.net> wrote:
>>>>>>>>> I'm trying to compile kernel 4.4.8 (using fedora 23) to run with Xen 4.6.0
>>>>>>>>> and Intel Skylake processor (Intel Core i7-6600U)
>>>>>>>>>
>>>>>>>>> This kernel is crashing almost in the same way as explained in this
>>>>>>>>> thread... But my problem is mainly with Skylake. Because the same
>>>>>>>>> configuration works within another machine but with another processor
>>>>>>>>> (Intel Core i5-3340M). Attached are the boot logs.
>>>>>>>> The address the fault occurs on (ffff8000006bdee0) is bogus, so
>>>>>>>> from the register and stack dump alone I don't think we can derive
>>>>>>>> much. What we'd need is access to the kernel binary used (or
>>>>>>>> really the vmlinux accompanying the vmlinuz that was used), in
>>>>>>>> order to see where exactly the kernel died, and hence where this
>>>>>>>> bogus address originates from. As I understand it this is a kernel
>>>>>>>> you built yourself - can you make said binary from exactly that
>>>>>>>> build available somewhere? 
>>>>>>> Yes I have it. But I get the same crash on various 4.4.X and also with
>>>>>>> 4.5.3.
>>>>>>>
>>>>>>> https://drive.google.com/open?id=0B6Ol0ob95UxXQV9HM1BWMmhCZ0E
>>>>>> Well, this doesn't contain the file I'm after (vmlinux), and taking
>>>>>> apart vmlinuz would be quite cumbersome.
>>>>>>
>>>>>> Jan
>>>>>>
>>>>> Oh sorry, here is the link to vmlinux
>>>>>
>>>>>
>>>>> https://drive.google.com/file/d/0B6Ol0ob95UxXN0dDMWM1a29vMEk/view?usp=sharing
>>>> This is still vmlinuz but the failure is at
>>>>
>>>> ffffffff81007ef3:       48 3b 1d 4e 2e ec 00    cmp    0xec2e4e(%rip),%rbx        # 0xffffffff81ecad48
>>>> ffffffff81007efa:       73 51                   jae    0xffffffff81007f4d
>>>> ffffffff81007efc:       31 c0                   xor    %eax,%eax
>>>> ffffffff81007efe:       48 8b 15 03 d2 c0 00    mov    0xc0d203(%rip),%rdx        # 0xffffffff81c15108
>>>> ffffffff81007f05:       90                      nop
>>>> ffffffff81007f06:       90                      nop
>>>> ffffffff81007f07:       90                      nop
>>>> ffffffff81007f08:       4c 8b 2c da             mov    (%rdx,%rbx,8),%r13    <======
>>>> ffffffff81007f0c:       90                      nop
>>>> ffffffff81007f0d:       90                      nop
>>>> ffffffff81007f0e:       90                      nop
>>>> ffffffff81007f0f:       85 c0                   test   %eax,%eax
>>>> ffffffff81007f11:       78 3a                   js     0xffffffff81007f4d
>>>> ffffffff81007f13:       48 8b 05 ee 11 d2 00    mov    0xd211ee(%rip),%rax        # 0xffffffff81d29108
>>>> ffffffff81007f1a:       49 39 c5                cmp    %rax,%r13
>>>> ffffffff81007f1d:       73 6f                   jae    0xffffffff81007f8e
>>>> ffffffff81007f1f:       48 8b 05 ea 11 d2 00    mov    0xd211ea(%rip),%rax        # 0xffffffff81d29110
>>>> ffffffff81007f26:       4a 8b 04 e8             mov    (%rax,%r13,8),%rax
>>>>
>>>> Any chance you could provide an un-stripped binary or System.map?
>>> Here is the link for System.map
>>>
>>>
>>> https://drive.google.com/file/d/0B6Ol0ob95UxXYVE4SzdMcENsWWs/view?usp=sharing
>>
>> So my semi-educated guess at your stack is
>> __early_ioremap
>>   -> __early_set_fixmap
>>     -> set_pte
>>       -> xen_set_pte_init
>>         -> mask_rw_pte
>>           -> pte_pfn
>>             -> pte_val
>>               -> xen_pte_val
>>                 -> pte_mfn_to_pfn
>>                   -> mfn_to_pfn_no_overrides
>>                     -> ret = xen_safe_read_ulong(&machine_to_phys_mapping[mfn], &pfn)
>>
>>
>> With ffffffff81007f08 being the faulting instruction, the last one looks
>> plausible:
>>
>>
>> ffffffff81007efe:       48 8b 15 03 d2 c0 00    mov    0xc0d203(%rip),%rdx        # 0xffffffff81c15108
>> ffffffff81007f05:       90                      nop
>> ffffffff81007f06:       90                      nop
>> ffffffff81007f07:       90                      nop
>> ffffffff81007f08:       4c 8b 2c da             mov    (%rdx,%rbx,8),%r13
>>
>> since
>>
>> ostr@workbase> grep ffffffff81c15108 /tmp/System.map-4.4.8-9.pvops.qubes.x86_64
>> ffffffff81c15108 D machine_to_phys_mapping
>> ostr@workbase>
>>
>> But %rdx is not ffffffff81c15108, it is ffff800000000000:
>>
>> (XEN) rax: 0000000000000000   rbx: 00000000000d7bdc   rcx: ffff880002059000
>> (XEN) rdx: ffff800000000000   rsi: 80000000d7bdc063   rdi: 80000000d7bdc063
> But that's a MOV above, i.e. %rdx = [0xffffffff81c15108], which
> sensibly is MACH2PHYS_VIRT_START. 

<facepalm> of course!
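
To make the arithmetic explicit, here is a small sketch (Python, purely illustrative, not kernel code) that recomputes the two addresses from the numbers quoted above. The helper names are made up for this example; the constants come from the disassembly and the register dump, and the 8-byte entry size is simply the scale factor in (%rdx,%rbx,8).

```python
# Values copied from the quoted disassembly and register dump.
MACH2PHYS_VIRT_START = 0xFFFF800000000000  # contents of %rdx at the fault
MFN = 0xD7BDC                              # contents of %rbx at the fault
ENTRY_SIZE = 8                             # scale factor in (%rdx,%rbx,8)


def rip_relative_target(insn_addr: int, insn_len: int, disp: int) -> int:
    """Address referenced by a RIP-relative operand.

    %rip points at the *next* instruction, so the target is the
    instruction address plus its length plus the displacement.
    """
    return insn_addr + insn_len + disp


def m2p_entry_address(base: int, mfn: int, entry_size: int = ENTRY_SIZE) -> int:
    """Address read by 'mov (%rdx,%rbx,8),%r13' for a given MFN."""
    return base + mfn * entry_size


# mov 0xc0d203(%rip),%rdx at ffffffff81007efe is 7 bytes long.
symbol_addr = rip_relative_target(0xFFFFFFFF81007EFE, 7, 0xC0D203)
fault_addr = m2p_entry_address(MACH2PHYS_VIRT_START, MFN)

print(hex(symbol_addr))  # 0xffffffff81c15108, i.e. &machine_to_phys_mapping
print(hex(fault_addr))   # 0xffff8000006bdee0, the address in %cr2
```

So ffffffff81c15108 is where the pointer is stored, and ffff800000000000 is the pointer itself; indexing it with the MFN from %rbx lands exactly on the faulting address.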

> And the MFN in %rbx
> would then match with the value in %cr2. Question is - where
> does MFN 0xd7bdc come from (it's in a reserved range, and hence
> can only be MMIO, which shouldn't be subject to M2P translation),
> and why is this a problem only on Skylake (or maybe that's not
> CPU related at all, but just dependent on the memory layout
> produced by the firmware).
>
> Obviously, accesses to the sparse[!] M2P prior to a proper #PF
> handler being established can't end well. With no RAM present in the
> range 0xc0000000-0xffffffff, the 4th 2MB M2P page doesn't get
> populated, i.e. this page walk
>
> (XEN) Pagetable walk from ffff8000006bdee0:
> (XEN)  L4[0x100] = 000000081daf9067 ffffffffffffffff
> (XEN)  L3[0x000] = 000000081daf7067 ffffffffffffffff
> (XEN)  L2[0x003] = 0000000000000000 ffffffffffffffff 
>
> is to be expected.
>
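
The walk above can be cross-checked with a short sketch (Python again, illustrative only). It assumes standard x86-64 4-level paging with 9 index bits per level, 4 KiB base pages, 2 MiB mappings for the M2P, and 8-byte M2P entries; the function names are hypothetical.

```python
def walk_indices(vaddr: int) -> dict:
    """Split a virtual address into its L4/L3/L2 table indices (9 bits each)."""
    return {
        "L4": (vaddr >> 39) & 0x1FF,
        "L3": (vaddr >> 30) & 0x1FF,
        "L2": (vaddr >> 21) & 0x1FF,
    }


def m2p_page_coverage(page_index: int,
                      entry_size: int = 8,
                      page_size: int = 4096,
                      superpage_size: int = 2 << 20) -> tuple:
    """Machine address range whose M2P entries live in the given 2 MiB M2P page.

    page_index counts 2 MiB pages from the start of the M2P table.
    """
    mfns_per_page = superpage_size // entry_size
    first_mfn = page_index * mfns_per_page
    last_mfn = first_mfn + mfns_per_page - 1
    return first_mfn * page_size, (last_mfn + 1) * page_size - 1


idx = walk_indices(0xFFFF8000006BDEE0)
start, end = m2p_page_coverage(idx["L2"])

print({k: hex(v) for k, v in idx.items()})  # {'L4': '0x100', 'L3': '0x0', 'L2': '0x3'}
print(hex(start), hex(end))                 # 0xc0000000 0xffffffff
```

Each 2 MiB M2P page holds 0x40000 entries and therefore describes 1 GiB of machine memory, so L2 index 3 (the 4th page) is the one covering 0xc0000000-0xffffffff, consistent with the empty L2[0x003] entry.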
> Anyway, Kevin, it would really make things a lot easier if you
> provided the vmlinux matching the vmlinuz, which you should
> have (assuming my understanding is correct that this is a kernel
> you built yourself). After all what we may need to figure out is
> the caller of __early_ioremap() in the call stack Boris deduced.

I didn't finish unwrapping the stack yesterday. Here it is:

setup_arch -> dmi_scan_machine -> dmi_walk_early -> early_ioremap

-boris



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
