* debian stretch dom0 + xen 4.9 fails to boot
@ 2017-06-06 14:32 Paul Durrant
  2017-06-06 15:11 ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-06 14:32 UTC (permalink / raw)
  To: xen-devel (xen-devel@lists.xenproject.org)

I've been having fun setting up a new test rig...

I have a skull canyon NUC and I put debian stretch (rc4) on it (so that's a 4.9 kernel) and then tried building and installing the latest Xen staging-4.9 code. The system failed to boot... basically it got stuck before even managing to get sufficiently into Xen to spit out anything on the console. Xen 4.8 OTOH booted just fine so I started bisecting and after 14 iterations I got down to the following commit as being the problem:

commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
Author: Juergen Gross <jgross@suse.com>
Date:   Fri Mar 24 14:18:54 2017 +0100

    x86: split boot trampoline into permanent and temporary part

    The hypervisor needs a trampoline in low memory for early boot and
    later for bringing up cpus and during wakeup from suspend. Today this
    trampoline is kept completely even if most of it isn't needed later.

    Split the trampoline into a permanent part and a temporary part needed
    at early boot only. Introduce a new entry at the boundary.

    Reduce the stack for wakeup code in order for the permanent
    trampoline to fit in a single page. 4k of stack seems excessive, about
    3k should be more than enough.

    Add an ASSERT() to the linker script to ensure the wakeup stack is
    always at least 3k.

    Signed-off-by: Juergen Gross <jgross@suse.com>
    Reviewed-by: Jan Beulich <jbeulich@suse.com>

To verify this I checked out master, reverted that commit, and tried again. The NUC still booted fine.
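
For readers following along, the bisection described above is a binary search over the commits between the known-good and known-bad builds. A minimal Python sketch, assuming a linear history (real history is a DAG, which `git bisect` handles itself); `first_bad_commit` and `is_bad` are hypothetical names, not part of any tool:

```python
def first_bad_commit(commits, is_bad):
    """Binary-search a linear history for the first bad commit.

    Assumes every commit before the culprit is good, every commit from the
    culprit onwards is bad, and the last commit in the list is known bad.
    Returns (culprit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one build-and-boot cycle per iteration
        if is_bad(commits[mid]):
            hi = mid        # culprit is at mid or earlier
        else:
            lo = mid + 1    # culprit is after mid
    return commits[lo], tests


# Hypothetical range of 2**14 commits with the culprit at index 12345.
history = list(range(2 ** 14))
culprit, tests = first_bad_commit(history, lambda c: c >= 12345)
```

Under that linear-history assumption, 14 build-and-boot tests are enough to cover a range of up to 2**14 = 16384 commits.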

  Paul

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 14:32 debian stretch dom0 + xen 4.9 fails to boot Paul Durrant
@ 2017-06-06 15:11 ` Jan Beulich
  2017-06-06 15:51   ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-06 15:11 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel (xen-devel@lists.xenproject.org)

>>> On 06.06.17 at 16:32, <Paul.Durrant@citrix.com> wrote:
> I've been having fun setting up a new test rig...
> 
> I have a skull canyon NUC and I put debian stretch (rc4) on it (so that's a 
> 4.9 kernel) and then tried building and installing the latest Xen staging-4.9 
> code. The system failed to boot... basically it got stuck before even 
> managing to get sufficiently into Xen to spit out anything on the console. 
> Xen 4.8 OTOH booted just fine so I started bisecting and after 14 iterations 
> I got down to the following commit as being the problem:
> 
> commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
> Author: Juergen Gross <jgross@suse.com>
> Date:   Fri Mar 24 14:18:54 2017 +0100
> 
>     x86: split boot trampoline into permanent and temporary part
> 
>     The hypervisor needs a trampoline in low memory for early boot and
>     later for bringing up cpus and during wakeup from suspend. Today this
>     trampoline is kept completely even if most of it isn't needed later.
> 
>     Split the trampoline into a permanent part and a temporary part needed
>     at early boot only. Introduce a new entry at the boundary.
> 
>     Reduce the stack for wakeup code in order for the permanent
>     trampoline to fit in a single page. 4k of stack seems excessive, about
>     3k should be more than enough.
> 
>     Add an ASSERT() to the linker script to ensure the wakeup stack is
>     always at least 3k.
> 
>     Signed-off-by: Juergen Gross <jgross@suse.com>
>     Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> To verify this I checked out master, reverted that commit, and tried again. 
> The NUC still booted fine.

Well, interesting, but I don't think it is very realistic to expect any
fix with just the information you supply. There must be something
rather special about that system, and likely it would help if we
knew what that is. E.g. an unusual E820 map. Worse would be if
they used memory outside of properly marked E820 regions in a
way colliding with what we do.

Otherwise I'm afraid we need to hope for you to debug the issue.
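
To illustrate the kind of check meant here: given an E820 map, one can test whether a low-memory range (for example wherever the trampoline ends up) lies entirely inside regions marked as usable RAM. A minimal Python sketch; the map below is a made-up example, not the NUC's real map, and `range_is_usable` is a hypothetical helper:

```python
E820_RAM = 1  # E820 type 1 is usable RAM; other types are reserved/ACPI/etc.


def range_is_usable(e820, start, size):
    """Return True if [start, start + size) is fully covered by RAM entries.

    e820 is a list of (base, length, type) tuples.
    """
    end = start + size
    cursor = start
    for base, length, kind in sorted(e820):
        if kind != E820_RAM:
            continue
        if base <= cursor < base + length:
            cursor = base + length  # covered up to the end of this region
            if cursor >= end:
                return True
    return False


# Hypothetical map: low RAM, a reserved hole below 640k, then RAM above 1M.
EXAMPLE_E820 = [
    (0x00000000, 0x0009D000, 1),
    (0x0009D000, 0x00003000, 2),
    (0x00100000, 0x7FF00000, 1),
]
```

With this example map a page at 0x90000 is fine, while a range starting at 0x9c000 and spanning two pages runs into the reserved hole. The check only says what the firmware reports; it cannot detect firmware touching memory it has marked as usable.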

Jan



* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 15:11 ` Jan Beulich
@ 2017-06-06 15:51   ` Paul Durrant
  2017-06-06 16:28     ` Paul Durrant
  2017-06-06 17:40     ` Julien Grall
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-06 15:51 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel (xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 06 June 2017 16:11
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> Well, interesting, but I don't think it is very realistic to expect any
> fix with just the information you supply. There must be something
> rather special about that system, and likely it would help if we
> knew what that is. E.g. an unusual E820 map. Worse would be if
> they used memory outside of properly marked E820 regions in a
> way colliding with what we do.
> 
> Otherwise I'm afraid we need to hope for you to debug the issue.
> 

Yes, I was posting this more as a heads-up for the moment, so that 4.9 does not go out with this regression.

I will try to figure out what is going on... My initial thoughts on looking at what the patch does are that it may be something to do with the fact I am using a VGA console rather than a serial one. I need to try 4.9 on another system (gigabyte brix) to see if the problem manifests there too. I'll also have to play with the BIOS settings on the skull canyon.

  Paul

> Jan



* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 15:51   ` Paul Durrant
@ 2017-06-06 16:28     ` Paul Durrant
  2017-06-06 17:00       ` Boris Ostrovsky
  2017-06-06 17:40     ` Julien Grall
  1 sibling, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-06 16:28 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: xen-devel (xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Paul Durrant
> Sent: 06 June 2017 16:52
> To: 'Jan Beulich' <JBeulich@suse.com>
> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> Yes, I was posting this more as a heads-up for the moment, so that 4.9 does not
> go out with this regression.
> 
> I will try to figure out what is going on... My initial thoughts on looking at what
> the patch does are that it may be something to do with the fact I am using a
> vga console rather than a serial one. I need to try another 4.9 on another
> system (gigabyte brix) to see if the problem manifests there too. I'll also have
> to play with the BIOS settings on the skull canyon.
> 

The problem definitely doesn't manifest on the brix, so the next theory is that it is something to do with the BIOS of the skull canyon.

  Paul


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 16:28     ` Paul Durrant
@ 2017-06-06 17:00       ` Boris Ostrovsky
  2017-06-07  8:07         ` Jan Beulich
  2017-06-07  8:07         ` Paul Durrant
  0 siblings, 2 replies; 57+ messages in thread
From: Boris Ostrovsky @ 2017-06-06 17:00 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: xen-devel (xen-devel@lists.xenproject.org)

On 06/06/2017 12:28 PM, Paul Durrant wrote:
> The problem definitely doesn't manifest on the brix, so the next theory is that it is something to do with the BIOS of the skull canyon.
>


FWIW, one of the machines in our test farm choked on this very patch. I
don't remember details now but essentially it turned out that syslinux
(we are pxe-booting) could not handle changes in ELF sections layout
(the way syslinux calculated how to load the binary into memory resulted
in overlap of some sort).

I hacked it (mboot.c32 specifically) to work around this but never came
up with a proper solution.

-boris


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 15:51   ` Paul Durrant
  2017-06-06 16:28     ` Paul Durrant
@ 2017-06-06 17:40     ` Julien Grall
  2017-06-07  8:05       ` Paul Durrant
  1 sibling, 1 reply; 57+ messages in thread
From: Julien Grall @ 2017-06-06 17:40 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: xen-devel (xen-devel@lists.xenproject.org)

Hi Paul,

On 06/06/17 16:51, Paul Durrant wrote:
> Yes, I was posting this more as a heads-up for the moment, so that 4.9 does not go out with this regression.

I would have appreciated being CCed on this e-mail as this concerns the 4.9
release... Please get into the habit of CCing the release manager for anything
targeting a release.

Cheers,

-- 
Julien Grall


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 17:40     ` Julien Grall
@ 2017-06-07  8:05       ` Paul Durrant
  0 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-07  8:05 UTC (permalink / raw)
  To: 'Julien Grall', 'Jan Beulich'
  Cc: xen-devel (xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Julien Grall [mailto:julien.grall@arm.com]
> Sent: 06 June 2017 18:41
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> <JBeulich@suse.com>
> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> Hi Paul,
> 
> I would have appreciated being CCed on this e-mail as this concerns the 4.9
> release... Please get into the habit of CCing the release manager for anything
> targeting a release.
> 

Yes, sorry I should have cc-ed... I was in a bit of a rush and forgot.

  Cheers,

    Paul

> Cheers,
> 
> --
> Julien Grall

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 17:00       ` Boris Ostrovsky
@ 2017-06-07  8:07         ` Jan Beulich
  2017-06-07  8:09           ` Paul Durrant
  2017-06-07 14:05           ` Boris Ostrovsky
  2017-06-07  8:07         ` Paul Durrant
  1 sibling, 2 replies; 57+ messages in thread
From: Jan Beulich @ 2017-06-07  8:07 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: xen-devel (xen-devel@lists.xenproject.org), Paul Durrant

>>> On 06.06.17 at 19:00, <boris.ostrovsky@oracle.com> wrote:
> FWIW, one of the machines in our test farm choked on this very patch. I
> don't remember details now but essentially it turned out that syslinux
> (we are pxe-booting) could not handle changes in ELF sections layout
> (the way syslinux calculated how to load the binary into memory resulted
> in overlap of some sort).

There has always been an overlap between the main and the notes
segment; there being only two segments I don't see any other
potential for an overlap. In fact I can't see anything other than size
differences between a 4.8.1 and a 4.9 binary, plus of course the
base address change resulting from Daniel's EFI/GrUB2 patches. So
I'm rather puzzled as to what effect Jürgen's patch could have had
on the behavior of any loader whatsoever.

The only possibly misleading section I notice is .reloc, but that's
present in xen-syms only, not in xen.gz. And again it's a result of
Daniel's series, not Jürgen's patch.
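
The overlap between the main and the notes segment mentioned above can be checked mechanically from the program headers. A minimal Python sketch; the segment names, addresses and sizes are made-up illustrative values, not taken from a real xen binary, and `overlapping_segments` is a hypothetical helper:

```python
from itertools import combinations


def overlapping_segments(segments):
    """Return pairs of segment names whose physical ranges intersect.

    segments is a list of (name, paddr, memsz) tuples, e.g. as read off
    the program header listing of the binary.
    """
    pairs = []
    for (n1, a1, s1), (n2, a2, s2) in combinations(segments, 2):
        # Half-open ranges [a, a + s) intersect iff each starts before
        # the other one ends.
        if a1 < a2 + s2 and a2 < a1 + s1:
            pairs.append((n1, n2))
    return pairs


# Hypothetical layout: one loadable segment with a notes segment inside it.
EXAMPLE_SEGMENTS = [
    ("LOAD", 0x00200000, 0x00400000),
    ("NOTE", 0x004A0000, 0x00000024),
]
```

For this example layout the function reports the single pair ("LOAD", "NOTE"), i.e. the long-standing overlap and nothing else. Whether a given loader mishandles that overlap is a separate question that this sketch does not answer.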

Jan


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-06 17:00       ` Boris Ostrovsky
  2017-06-07  8:07         ` Jan Beulich
@ 2017-06-07  8:07         ` Paul Durrant
  2017-06-07  8:27           ` Jan Beulich
       [not found]           ` <5937D4FF02000078001602F6@suse.com>
  1 sibling, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-07  8:07 UTC (permalink / raw)
  To: 'Boris Ostrovsky', 'Jan Beulich'
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com)

> -----Original Message-----
> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> Sent: 06 June 2017 18:00
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> <JBeulich@suse.com>
> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> FWIW, one of the machines in our test farm choked on this very patch. I
> don't remember details now but essentially it turned out that syslinux
> (we are pxe-booting) could not handle changes in ELF sections layout
> (the way syslinux calculated how to load the binary into memory resulted
> in overlap of some sort).
> 
> I hacked it (mboot.c32 specifically) to work around this but never came
> up with a proper solution.
> 

In my case it was grub2... and thinking about it I am running an older version on the brix so I guess it may still manifest there if I update. Either way it sounds like it may be better to revert the patch until the issue is better understood.

  Paul

> -boris



* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07  8:07         ` Jan Beulich
@ 2017-06-07  8:09           ` Paul Durrant
  2017-06-07  8:19             ` Paul Durrant
  2017-06-07 14:05           ` Boris Ostrovsky
  1 sibling, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07  8:09 UTC (permalink / raw)
  To: 'Jan Beulich', Boris Ostrovsky
  Cc: xen-devel (xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 June 2017 09:07
> To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel (xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 06.06.17 at 19:00, <boris.ostrovsky@oracle.com> wrote:
> > FWIW, one of machines in our test farm choked on this very patch. I
> > don't remember details now but essentially it turned out that syslinux
> > (we are pxe-booting) could not handle changes in ELF sections layout
> > (the way syslinux calculated how to load the binary into memory resulted
> > in overlap of some sort).
> 
> There has always been an overlap between the main and the notes
> segment; there being only two segments I don't see any other
> potential for an overlap. In fact I can't see anything other than size
> differences between a 4.8.1 and a 4.9 binary, plus of course the
> base address change resulting from Daniel's EFI/GrUB2 patches. So
> I'm rather puzzled as to what effect Jürgen's patch could have had
> on the behavior of any loader whatsoever.
> 
> The only possibly misleading section I notice is .reloc, but that's
> present in xen-syms only, not in xen.gz. And again it's a result of
> Daniel's series, not Jürgen's patch.
> 

I guess I could apply the patch in isolation against 4.8 and see if it causes a problem. I'll give that a quick try.
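
For reference, the kind of overlap being discussed can be checked mechanically once the load ranges are known. Below is a minimal sketch in Python, assuming each segment is described only by a start address and an in-memory size. The Segment type, the find_overlaps helper and the example addresses are all made up for illustration; none of them are taken from syslinux, grub2 or a real xen binary.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    name: str
    start: int  # load address of the segment
    size: int   # number of bytes the segment occupies in memory


def find_overlaps(segments: List[Segment]) -> List[Tuple[str, str]]:
    """Return name pairs of segments whose [start, start + size) ranges intersect."""
    ordered = sorted(segments, key=lambda s: s.start)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.start + a.size:
                break  # sorted by start, so no later segment can overlap a
            overlaps.append((a.name, b.name))
    return overlaps


# Hypothetical addresses, chosen only so that the two ranges intersect.
example = [
    Segment("main", 0x200000, 0x180000),
    Segment("notes", 0x37FF00, 0x200),
]
print(find_overlaps(example))
```

With real binaries the ranges would have to come from the program headers of the image the loader actually sees; the sketch says nothing about how any particular loader computes them.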

  Paul

> Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07  8:09           ` Paul Durrant
@ 2017-06-07  8:19             ` Paul Durrant
  0 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-07  8:19 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich', Boris Ostrovsky
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com)

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Paul Durrant
> Sent: 07 June 2017 09:10
> To: 'Jan Beulich' <JBeulich@suse.com>; Boris Ostrovsky
> <boris.ostrovsky@oracle.com>
> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 07 June 2017 09:07
> > To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel (xen-
> > devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>
> > Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >
> > >>> On 06.06.17 at 19:00, <boris.ostrovsky@oracle.com> wrote:
> > > FWIW, one of machines in our test farm choked on this very patch. I
> > > don't remember details now but essentially it turned out that syslinux
> > > (we are pxe-booting) could not handle changes in ELF sections layout
> > > (the way syslinux calculated how to load the binary into memory resulted
> > > in overlap of some sort).
> >
> > There has always been an overlap between the main and the notes
> > segment; there being only two segments I don't see any other
> > potential for an overlap. In fact I can't see anything other than size
> > differences between a 4.8.1 and a 4.9 binary, plus of course the
> > base address change resulting from Daniel's EFI/GrUB2 patches. So
> > I'm rather puzzled as to what effect Jürgen's patch could have had
> > on the behavior of any loader whatsoever.
> >
> > The only possibly misleading section I notice is .reloc, but that's
> > present in xen-syms only, not in xen.gz. And again it's a result of
> > Daniel's series, not Jürgen's patch.
> >
> 
> I guess I could apply the patch in isolation against 4.8 and see if it causes a
> problem. I'll give that a quick try.
> 

Applying the patch to 4.8.1 *does* cause the problem, so it's definitely something in the patch rather than an interaction with other patches in 4.9.

  Paul

>   Paul
> 
> > Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07  8:07         ` Paul Durrant
@ 2017-06-07  8:27           ` Jan Beulich
       [not found]           ` <5937D4FF02000078001602F6@suse.com>
  1 sibling, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2017-06-07  8:27 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

>>> On 07.06.17 at 10:07, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
>> Sent: 06 June 2017 18:00
>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>> <JBeulich@suse.com>
>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>> devel@lists.xenproject.org>
>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>> 
>> On 06/06/2017 12:28 PM, Paul Durrant wrote:
>> >> -----Original Message-----
>> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>> >> Paul Durrant
>> >> Sent: 06 June 2017 16:52
>> >> To: 'Jan Beulich' <JBeulich@suse.com>
>> >> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>> >> devel@lists.xenproject.org>
>> >> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>> >>
>> >>> -----Original Message-----
>> >>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >>> Sent: 06 June 2017 16:11
>> >>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>> >>> devel@lists.xenproject.org>
>> >>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>> >>>
>> >>>>>> On 06.06.17 at 16:32, <Paul.Durrant@citrix.com> wrote:
>> >>>> I've been having fun setting up a new test rig...
>> >>>>
>> >>>> I have a skull canyon NUC and I put debian stretch (rc4) on it (so that's a
>> >>>> 4.9 kernel) and then tried building and installing the latest Xen staging-
>> 4.9
>> >>>> code. The system failed to boot... basically it got stuck before even
>> >>>> managing to get sufficiently into Xen to spit out anything on the
>> console.
>> >>>> Xen 4.8 OTOH booted just fine so I started bisecting and after 14
>> >> iterations
>> >>>> I got down to the following commit is being the problem:
>> >>>>
>> >>>> commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
>> >>>> Author: Juergen Gross <jgross@suse.com>
>> >>>> Date:   Fri Mar 24 14:18:54 2017 +0100
>> >>>>
>> >>>>     x86: split boot trampoline into permanent and temporary part
>> >>>>
>> >>>>     The hypervisor needs a trampoline in low memory for early boot and
>> >>>>     later for bringing up cpus and during wakeup from suspend. Today
>> this
>> >>>>     trampoline is kept completely even if most of it isn't needed later.
>> >>>>
>> >>>>     Split the trampoline into a permanent part and a temporary part
>> >> needed
>> >>>>     at early boot only. Introduce a new entry at the boundary.
>> >>>>
>> >>>>     Reduce the stack for wakeup code in order for the permanent
>> >>>>     trampoline to fit in a single page. 4k of stack seems excessive, about
>> >>>>     3k should be more than enough.
>> >>>>
>> >>>>     Add an ASSERT() to the linker script to ensure the wakeup stack is
>> >>>>     always at least 3k.
>> >>>>
>> >>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>> >>>>     Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> >>>>
>> >>>> To verify this I checked out master, reverted that commit, and tried
>> again.
>> >>>> The NUC still booted fine.
>> >>> Well, interesting, but I don't think it is very realistic to expect any
>> >>> fix with just the information you supply. There must be something
>> >>> rather special about that system, and likely it would help if we
>> >>> knew what that is. E.g. an unusual E820 map. Worse would be if
>> >>> they used memory outside of properly marked E820 regions in a
>> >>> way colliding with what we do.
>> >>>
>> >>> Otherwise I'm afraid we need to hope for you to debug the issue.
>> >>>
>> >> Yes, I was posting this more a heads-up for the moment, so that 4.9 does
>> not
>> >> go out with this regression.
>> >>
>> >> I will try to figure out what is going on... My initial thoughts on looking 
> at
>> what
>> >> the patch does are that it may be something to do with the fact I am using
>> a
>> >> vga console rather than a serial one. I need to try another 4.9 on another
>> >> system (gigabyte brix) to see if the problem manifests there too. I'll also
>> have
>> >> to play with the BIOS settings on the skull canyon.
>> >>
>> > The problem definitely doesn't manifest on the brix, so the next theory is
>> that it is something to do with the BIOS of the skull canyon.
>> >
>> 
>> 
>> FWIW, one of machines in our test farm choked on this very patch. I
>> don't remember details now but essentially it turned out that syslinux
>> (we are pxe-booting) could not handle changes in ELF sections layout
>> (the way syslinux calculated how to load the binary into memory resulted
>> in overlap of some sort).
>> 
>> I hacked it (mboot.c32 specifically) to work around this but never came
>> up with a proper solution.
>> 
> 
> In my case it was grub2... and thinking about it I am running an older 
> version on the brix so I guess it may still manifest there if I update. 
> Either way it sounds like it may be better to revert the patch until the 
> issue is better understood.

I'm not sure if we could simply revert this one patch - it's the first of a
3-patch series. At the first glance I can't really see any dependency
of the later two patches on it, but then again I seem to recall that the
split was a prereq. Adding Jürgen.

Jan


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
       [not found]           ` <5937D4FF02000078001602F6@suse.com>
@ 2017-06-07  9:03             ` Juergen Gross
  2017-06-07  9:05               ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Juergen Gross @ 2017-06-07  9:03 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

On 07/06/17 10:27, Jan Beulich wrote:
>>>> On 07.06.17 at 10:07, <Paul.Durrant@citrix.com> wrote:
>>>  -----Original Message-----
>>> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
>>> Sent: 06 June 2017 18:00
>>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>>> <JBeulich@suse.com>
>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>>> devel@lists.xenproject.org>
>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>>
>>> On 06/06/2017 12:28 PM, Paul Durrant wrote:
>>>>> -----Original Message-----
>>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>>>>> Paul Durrant
>>>>> Sent: 06 June 2017 16:52
>>>>> To: 'Jan Beulich' <JBeulich@suse.com>
>>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>>>>> devel@lists.xenproject.org>
>>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>> Sent: 06 June 2017 16:11
>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>>>>>> devel@lists.xenproject.org>
>>>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>>>>>
>>>>>>>>> On 06.06.17 at 16:32, <Paul.Durrant@citrix.com> wrote:
>>>>>>> I've been having fun setting up a new test rig...
>>>>>>>
>>>>>>> I have a skull canyon NUC and I put debian stretch (rc4) on it (so that's a
>>>>>>> 4.9 kernel) and then tried building and installing the latest Xen staging-
>>> 4.9
>>>>>>> code. The system failed to boot... basically it got stuck before even
>>>>>>> managing to get sufficiently into Xen to spit out anything on the
>>> console.
>>>>>>> Xen 4.8 OTOH booted just fine so I started bisecting and after 14
>>>>> iterations
>>>>>>> I got down to the following commit is being the problem:
>>>>>>>
>>>>>>> commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
>>>>>>> Author: Juergen Gross <jgross@suse.com>
>>>>>>> Date:   Fri Mar 24 14:18:54 2017 +0100
>>>>>>>
>>>>>>>     x86: split boot trampoline into permanent and temporary part
>>>>>>>
>>>>>>>     The hypervisor needs a trampoline in low memory for early boot and
>>>>>>>     later for bringing up cpus and during wakeup from suspend. Today
>>> this
>>>>>>>     trampoline is kept completely even if most of it isn't needed later.
>>>>>>>
>>>>>>>     Split the trampoline into a permanent part and a temporary part
>>>>> needed
>>>>>>>     at early boot only. Introduce a new entry at the boundary.
>>>>>>>
>>>>>>>     Reduce the stack for wakeup code in order for the permanent
>>>>>>>     trampoline to fit in a single page. 4k of stack seems excessive, about
>>>>>>>     3k should be more than enough.
>>>>>>>
>>>>>>>     Add an ASSERT() to the linker script to ensure the wakeup stack is
>>>>>>>     always at least 3k.
>>>>>>>
>>>>>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>>>     Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>>>>>
>>>>>>> To verify this I checked out master, reverted that commit, and tried
>>> again.
>>>>>>> The NUC still booted fine.
>>>>>> Well, interesting, but I don't think it is very realistic to expect any
>>>>>> fix with just the information you supply. There must be something
>>>>>> rather special about that system, and likely it would help if we
>>>>>> knew what that is. E.g. an unusual E820 map. Worse would be if
>>>>>> they used memory outside of properly marked E820 regions in a
>>>>>> way colliding with what we do.
>>>>>>
>>>>>> Otherwise I'm afraid we need to hope for you to debug the issue.
>>>>>>
>>>>> Yes, I was posting this more a heads-up for the moment, so that 4.9 does
>>> not
>>>>> go out with this regression.
>>>>>
>>>>> I will try to figure out what is going on... My initial thoughts on looking 
>> at
>>> what
>>>>> the patch does are that it may be something to do with the fact I am using
>>> a
>>>>> vga console rather than a serial one. I need to try another 4.9 on another
>>>>> system (gigabyte brix) to see if the problem manifests there too. I'll also
>>> have
>>>>> to play with the BIOS settings on the skull canyon.
>>>>>
>>>> The problem definitely doesn't manifest on the brix, so the next theory is
>>> that it is something to do with the BIOS of the skull canyon.
>>>>
>>>
>>>
>>> FWIW, one of machines in our test farm choked on this very patch. I
>>> don't remember details now but essentially it turned out that syslinux
>>> (we are pxe-booting) could not handle changes in ELF sections layout
>>> (the way syslinux calculated how to load the binary into memory resulted
>>> in overlap of some sort).
>>>
>>> I hacked it (mboot.c32 specifically) to work around this but never came
>>> up with a proper solution.
>>>
>>
>> In my case it was grub2... and thinking about it I am running an older 
>> version on the brix so I guess it may still manifest there if I update. 
>> Either way it sounds like it may be better to revert the patch until the 
>> issue is better understood.
> 
> I'm not sure if we could simply revert this one patch - it's the first of a
> 3-patch series. At the first glance I can't really see any dependency
> of the later two patches on it, but then again I seem to recall that the
> split was a prereq. Adding Jürgen.

I think it could be reverted. It was a prerequisite for another patch I
prepared but didn't send as it was quite late in the 4.9 cycle and it
depended on the other patches of Daniel.

TBH: I really can't see what is wrong with that patch. The only change
which should be able to break something seems to be the reduction of the
wakeup stack size to 3kB, but this shouldn't affect booting the system
at all...


Juergen


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07  9:03             ` Juergen Gross
@ 2017-06-07  9:05               ` Paul Durrant
  2017-06-07  9:09                 ` Andrew Cooper
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07  9:05 UTC (permalink / raw)
  To: 'Juergen Gross', Jan Beulich
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

> -----Original Message-----
> From: Juergen Gross [mailto:jgross@suse.com]
> Sent: 07 June 2017 10:03
> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; xen-devel
> (xen-devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> On 07/06/17 10:27, Jan Beulich wrote:
> >>>> On 07.06.17 at 10:07, <Paul.Durrant@citrix.com> wrote:
> >>>  -----Original Message-----
> >>> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> >>> Sent: 06 June 2017 18:00
> >>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> >>> <JBeulich@suse.com>
> >>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> >>> devel@lists.xenproject.org>
> >>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>>
> >>> On 06/06/2017 12:28 PM, Paul Durrant wrote:
> >>>>> -----Original Message-----
> >>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
> Of
> >>>>> Paul Durrant
> >>>>> Sent: 06 June 2017 16:52
> >>>>> To: 'Jan Beulich' <JBeulich@suse.com>
> >>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> >>>>> devel@lists.xenproject.org>
> >>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
> >>>>>> Sent: 06 June 2017 16:11
> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> >>>>>> devel@lists.xenproject.org>
> >>>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>>>>>
> >>>>>>>>> On 06.06.17 at 16:32, <Paul.Durrant@citrix.com> wrote:
> >>>>>>> I've been having fun setting up a new test rig...
> >>>>>>>
> >>>>>>> I have a skull canyon NUC and I put debian stretch (rc4) on it (so
> that's a
> >>>>>>> 4.9 kernel) and then tried building and installing the latest Xen
> staging-
> >>> 4.9
> >>>>>>> code. The system failed to boot... basically it got stuck before even
> >>>>>>> managing to get sufficiently into Xen to spit out anything on the
> >>> console.
> >>>>>>> Xen 4.8 OTOH booted just fine so I started bisecting and after 14
> >>>>> iterations
> >>>>>>> I got down to the following commit is being the problem:
> >>>>>>>
> >>>>>>> commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
> >>>>>>> Author: Juergen Gross <jgross@suse.com>
> >>>>>>> Date:   Fri Mar 24 14:18:54 2017 +0100
> >>>>>>>
> >>>>>>>     x86: split boot trampoline into permanent and temporary part
> >>>>>>>
> >>>>>>>     The hypervisor needs a trampoline in low memory for early boot
> and
> >>>>>>>     later for bringing up cpus and during wakeup from suspend.
> Today
> >>> this
> >>>>>>>     trampoline is kept completely even if most of it isn't needed
> later.
> >>>>>>>
> >>>>>>>     Split the trampoline into a permanent part and a temporary part
> >>>>> needed
> >>>>>>>     at early boot only. Introduce a new entry at the boundary.
> >>>>>>>
> >>>>>>>     Reduce the stack for wakeup code in order for the permanent
> >>>>>>>     trampoline to fit in a single page. 4k of stack seems excessive,
> about
> >>>>>>>     3k should be more than enough.
> >>>>>>>
> >>>>>>>     Add an ASSERT() to the linker script to ensure the wakeup stack is
> >>>>>>>     always at least 3k.
> >>>>>>>
> >>>>>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
> >>>>>>>     Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >>>>>>>
> >>>>>>> To verify this I checked out master, reverted that commit, and tried
> >>> again.
> >>>>>>> The NUC still booted fine.
> >>>>>> Well, interesting, but I don't think it is very realistic to expect any
> >>>>>> fix with just the information you supply. There must be something
> >>>>>> rather special about that system, and likely it would help if we
> >>>>>> knew what that is. E.g. an unusual E820 map. Worse would be if
> >>>>>> they used memory outside of properly marked E820 regions in a
> >>>>>> way colliding with what we do.
> >>>>>>
> >>>>>> Otherwise I'm afraid we need to hope for you to debug the issue.
> >>>>>>
> >>>>> Yes, I was posting this more a heads-up for the moment, so that 4.9
> does
> >>> not
> >>>>> go out with this regression.
> >>>>>
> >>>>> I will try to figure out what is going on... My initial thoughts on looking
> >> at
> >>> what
> >>>>> the patch does are that it may be something to do with the fact I am
> using
> >>> a
> >>>>> vga console rather than a serial one. I need to try another 4.9 on
> another
> >>>>> system (gigabyte brix) to see if the problem manifests there too. I'll
> also
> >>> have
> >>>>> to play with the BIOS settings on the skull canyon.
> >>>>>
> >>>> The problem definitely doesn't manifest on the brix, so the next theory
> is
> >>> that it is something to do with the BIOS of the skull canyon.
> >>>>
> >>>
> >>>
> >>> FWIW, one of machines in our test farm choked on this very patch. I
> >>> don't remember details now but essentially it turned out that syslinux
> >>> (we are pxe-booting) could not handle changes in ELF sections layout
> >>> (the way syslinux calculated how to load the binary into memory
> resulted
> >>> in overlap of some sort).
> >>>
> >>> I hacked it (mboot.c32 specifically) to work around this but never came
> >>> up with a proper solution.
> >>>
> >>
> >> In my case it was grub2... and thinking about it I am running an older
> >> version on the brix so I guess it may still manifest there if I update.
> >> Either way it sounds like it may be better to revert the patch until the
> >> issue is better understood.
> >
> > I'm not sure if we could simply revert this one patch - it's the first of a
> > 3-patch series. At the first glance I can't really see any dependency
> > of the later two patches on it, but then again I seem to recall that the
> > split was a prereq. Adding Jürgen.
> 
> I think it could be reverted. It was a prerequisite for another patch I
> prepared but didn't send as it was quite late in the 4.9 cycle and it
> depended on the other patches of Daniel.
> 
> TBH: I really can't see what is wrong with that patch. The only change
> which should be able to break something seems to be the reduction of the
> wakeup stack size to 3kB, but this shouldn't affect booting the system
> at all...
> 

Yeah, my next test is going to be increasing the size of the wakeup stack again, but there is really nothing obviously wrong with the patch.

  Paul

> 
> Juergen

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07  9:05               ` Paul Durrant
@ 2017-06-07  9:09                 ` Andrew Cooper
  2017-06-07 10:36                   ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Andrew Cooper @ 2017-06-07  9:09 UTC (permalink / raw)
  To: Paul Durrant, 'Juergen Gross', Jan Beulich
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

On 07/06/2017 10:05, Paul Durrant wrote:
>> -----Original Message-----
>> From: Juergen Gross [mailto:jgross@suse.com]
>> Sent: 07 June 2017 10:03
>> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
>> <Paul.Durrant@citrix.com>
>> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; xen-devel
>> (xen-devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
>> Ostrovsky' <boris.ostrovsky@oracle.com>
>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>
>> On 07/06/17 10:27, Jan Beulich wrote:
>>>>>> On 07.06.17 at 10:07, <Paul.Durrant@citrix.com> wrote:
>>>>>  -----Original Message-----
>>>>> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
>>>>> Sent: 06 June 2017 18:00
>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>>>>> <JBeulich@suse.com>
>>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>>>>> devel@lists.xenproject.org>
>>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>>>>
>>>>> On 06/06/2017 12:28 PM, Paul Durrant wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
>> Of
>>>>>>> Paul Durrant
>>>>>>> Sent: 06 June 2017 16:52
>>>>>>> To: 'Jan Beulich' <JBeulich@suse.com>
>>>>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>>>>>>> devel@lists.xenproject.org>
>>>>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>> Sent: 06 June 2017 16:11
>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>>>>>>>> devel@lists.xenproject.org>
>>>>>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>>>>>>>
>>>>>>>>>>> On 06.06.17 at 16:32, <Paul.Durrant@citrix.com> wrote:
>>>>>>>>> I've been having fun setting up a new test rig...
>>>>>>>>>
>>>>>>>>> I have a skull canyon NUC and I put debian stretch (rc4) on it (so
>> that's a
>>>>>>>>> 4.9 kernel) and then tried building and installing the latest Xen
>> staging-
>>>>> 4.9
>>>>>>>>> code. The system failed to boot... basically it got stuck before even
>>>>>>>>> managing to get sufficiently into Xen to spit out anything on the
>>>>> console.
>>>>>>>>> Xen 4.8 OTOH booted just fine so I started bisecting and after 14
>>>>>>> iterations
>>>>>>>>> I got down to the following commit is being the problem:
>>>>>>>>>
>>>>>>>>> commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
>>>>>>>>> Author: Juergen Gross <jgross@suse.com>
>>>>>>>>> Date:   Fri Mar 24 14:18:54 2017 +0100
>>>>>>>>>
>>>>>>>>>     x86: split boot trampoline into permanent and temporary part
>>>>>>>>>
>>>>>>>>>     The hypervisor needs a trampoline in low memory for early boot
>> and
>>>>>>>>>     later for bringing up cpus and during wakeup from suspend.
>> Today
>>>>> this
>>>>>>>>>     trampoline is kept completely even if most of it isn't needed
>> later.
>>>>>>>>>     Split the trampoline into a permanent part and a temporary part
>>>>>>> needed
>>>>>>>>>     at early boot only. Introduce a new entry at the boundary.
>>>>>>>>>
>>>>>>>>>     Reduce the stack for wakeup code in order for the permanent
>>>>>>>>>     trampoline to fit in a single page. 4k of stack seems excessive,
>> about
>>>>>>>>>     3k should be more than enough.
>>>>>>>>>
>>>>>>>>>     Add an ASSERT() to the linker script to ensure the wakeup stack is
>>>>>>>>>     always at least 3k.
>>>>>>>>>
>>>>>>>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>>>>>     Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>>>>>>>
>>>>>>>>> To verify this I checked out master, reverted that commit, and tried
>>>>> again.
>>>>>>>>> The NUC still booted fine.
>>>>>>>> Well, interesting, but I don't think it is very realistic to expect any
>>>>>>>> fix with just the information you supply. There must be something
>>>>>>>> rather special about that system, and likely it would help if we
>>>>>>>> knew what that is. E.g. an unusual E820 map. Worse would be if
>>>>>>>> they used memory outside of properly marked E820 regions in a
>>>>>>>> way colliding with what we do.
>>>>>>>>
>>>>>>>> Otherwise I'm afraid we need to hope for you to debug the issue.
>>>>>>>>
>>>>>>> Yes, I was posting this more a heads-up for the moment, so that 4.9
>> does
>>>>> not
>>>>>>> go out with this regression.
>>>>>>>
>>>>>>> I will try to figure out what is going on... My initial thoughts on looking
>>>> at
>>>>> what
>>>>>>> the patch does are that it may be something to do with the fact I am
>> using
>>>>> a
>>>>>>> vga console rather than a serial one. I need to try another 4.9 on
>> another
>>>>>>> system (gigabyte brix) to see if the problem manifests there too. I'll
>> also
>>>>> have
>>>>>>> to play with the BIOS settings on the skull canyon.
>>>>>>>
>>>>>> The problem definitely doesn't manifest on the brix, so the next theory
>> is
>>>>> that it is something to do with the BIOS of the skull canyon.
>>>>>
>>>>> FWIW, one of machines in our test farm choked on this very patch. I
>>>>> don't remember details now but essentially it turned out that syslinux
>>>>> (we are pxe-booting) could not handle changes in ELF sections layout
>>>>> (the way syslinux calculated how to load the binary into memory
>> resulted
>>>>> in overlap of some sort).
>>>>>
>>>>> I hacked it (mboot.c32 specifically) to work around this but never came
>>>>> up with a proper solution.
>>>>>
>>>> In my case it was grub2... and thinking about it I am running an older
>>>> version on the brix so I guess it may still manifest there if I update.
>>>> Either way it sounds like it may be better to revert the patch until the
>>>> issue is better understood.
>>> I'm not sure if we could simply revert this one patch - it's the first of a
>>> 3-patch series. At the first glance I can't really see any dependency
>>> of the later two patches on it, but then again I seem to recall that the
>>> split was a prereq. Adding Jürgen.
>> I think it could be reverted. It was a prerequisite for another patch I
>> prepared but didn't send as it was quite late in the 4.9 cycle and it
>> depended on the other patches of Daniel.
>>
>> TBH: I really can't see what is wrong with that patch. The only change
>> which should be able to break something seems to be the reduction of the
>> wakeup stack size to 3kB, but this shouldn't affect booting the system
>> at all...
>>
> Yeah, my next test is going to be increasing the size of the wakeup stack again, but there is really nothing obviously wrong with the patch.

My gut feeling is that there is some path through boot (tickled by these
two machines) which is clobbering the wrong piece of memory, which was
previously safe and is now not, because of the rearrangements here.

Debugging these machines is very tricky, because they have no serial or
IPMI whatsoever.

~Andrew


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07  9:09                 ` Andrew Cooper
@ 2017-06-07 10:36                   ` Paul Durrant
  2017-06-07 11:06                     ` Paul Durrant
  2017-06-07 11:50                     ` Jan Beulich
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 10:36 UTC (permalink / raw)
  To: Andrew Cooper, 'Juergen Gross', Jan Beulich
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

> -----Original Message-----
[snip]
> >>
> >> TBH: I really can't see what is wrong with that patch. The only change
> >> which should be able to break something seems to be the reduction of
> the
> >> wakeup stack size to 3kB, but this shouldn't affect booting the system
> >> at all...
> >>
> > Yeah, my next test is going to be increasing the size of the wakeup stack
> again, but there is really nothing obviously wrong with the patch.
> 
> My gut feeling is that there is some path through boot (tickled by these
> two machines) which is clobbering the wrong piece of memory, which was
> previously safe and is now not, because of the rearrangements here.
> 
> Debugging these machines is very tricky, because they have no serial or
> IPMI whatsoever.
> 

It does appear to be a layout issue. If I modify the code to just set wakeup_stack to wakeup_stack_start + PAGE_SIZE, so that it has the full 4k, then I still get the problem. However, if I then move the block that includes wakeup.S to the end of trampoline.S, so that the wakeup code and stack are once again located at the end, then the problem goes away.

  Paul

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 10:36                   ` Paul Durrant
@ 2017-06-07 11:06                     ` Paul Durrant
  2017-06-07 11:57                       ` Juergen Gross
  2017-06-07 11:50                     ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 11:06 UTC (permalink / raw)
  To: Paul Durrant, Andrew Cooper, 'Juergen Gross', Jan Beulich
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Paul Durrant
> Sent: 07 June 2017 11:37
> To: Andrew Cooper <Andrew.Cooper3@citrix.com>; 'Juergen Gross'
> <jgross@suse.com>; Jan Beulich <JBeulich@suse.com>
> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>; Julien Grall (julien.grall@arm.com)
> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> > -----Original Message-----
> [snip]
> > >>
> > >> TBH: I really can't see what is wrong with that patch. The only change
> > >> which should be able to break something seems to be the reduction of
> > the
> > >> wakeup stack size to 3kB, but this shouldn't affect booting the system
> > >> at all...
> > >>
> > > Yeah, my next test is going to be increasing the size of the wakeup stack
> > again, but there is really nothing obviously wrong with the patch.
> >
> > My gut feeling is that there is some path through boot (tickled by these
> > two machines) which is clobbering the wrong piece of memory, which was
> > previously safe and is now not, because of the rearrangements here.
> >
> > Debugging these machines is very tricky, because they have no serial or
> > IPMI whatsoever.
> >
> 
> It does appear to be a layout issue. If I modify the code to just set
> wakeup_stack to wakeup_stack_start + PAGE_SIZE, so it has the full 4k then I
> still get the problem. However if I then move that code block that includes
> wakeup.S and move it to the end of trampoline.S so that wakeup code and
> stack are once again located at the end then the problem goes away.
> 

It appears that it is just the code that needs to go at the end. The following patch is sufficient to avoid the problem. This may be preferable to a full reversion...

  Paul

diff --git a/xen/arch/x86/boot/trampoline.S b/xen/arch/x86/boot/trampoline.S
index 4d640f3fcd..7709a782f9 100644
--- a/xen/arch/x86/boot/trampoline.S
+++ b/xen/arch/x86/boot/trampoline.S
@@ -156,7 +156,7 @@ start64:
         movabs  $__high_start,%rax
         jmpq    *%rax

-#include "wakeup.S"
+ENTRY(wakeup_stack_start)

 /* The first page of trampoline is permanent, the rest boot-time only. */
 /* Reuse the boot trampoline on the 1st trampoline page as stack for wakeup. */
@@ -280,3 +280,4 @@ rm_idt: .word   256*4-1, 0, 0
 #include "mem.S"
 #include "edd.S"
 #include "video.S"
+#include "wakeup.S"
diff --git a/xen/arch/x86/boot/wakeup.S b/xen/arch/x86/boot/wakeup.S
index f9632eef95..d4824b55d5 100644
--- a/xen/arch/x86/boot/wakeup.S
+++ b/xen/arch/x86/boot/wakeup.S
@@ -173,5 +173,3 @@ bogus_saved_magic:
         movw    $0x0e00 + 'S', 0xb8014
         jmp     bogus_saved_magic

-/* Stack for wakeup: rest of first trampoline page. */
-ENTRY(wakeup_stack_start)

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 10:36                   ` Paul Durrant
  2017-06-07 11:06                     ` Paul Durrant
@ 2017-06-07 11:50                     ` Jan Beulich
  2017-06-07 11:55                       ` Paul Durrant
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-07 11:50 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel (xen-devel@lists.xenproject.org)

>>> On 07.06.17 at 12:36, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
> [snip]
>> >>
>> >> TBH: I really can't see what is wrong with that patch. The only change
>> >> which should be able to break something seems to be the reduction of
>> the
>> >> wakeup stack size to 3kB, but this shouldn't affect booting the system
>> >> at all...
>> >>
>> > Yeah, my next test is going to be increasing the size of the wakeup stack
>> again, but there is really nothing obviously wrong with the patch.
>> 
>> My gut feeling is that there is some path through boot (tickled by these
>> two machines) which is clobbering the wrong piece of memory, which was
>> previously safe and is now not, because of the rearrangements here.
>> 
>> Debugging these machines is very tricky, because they have no serial or
>> IPMI whatsoever.
>> 
> 
> It does appear to be a layout issue. If I modify the code to just set 
> wakeup_stack to wakeup_stack_start + PAGE_SIZE, so it has the full 4k then I 
> still get the problem. However if I then move that code block that includes 
> wakeup.S and move it to the end of trampoline.S so that wakeup code and stack 
> are once again located at the end then the problem goes away.

Could you do the following two things:
1) Subtract, say, 4k from trampoline_phys right before setting it
(immediately ahead of trampoline_setup)? Ideally you'd also log
the resulting value (in case it works).
2) Provide the E820 map of that box.
I'm suspecting the BIOS might use an EBDA without recording it in
the low BIOS data area. If it's reported in E820 that would then
likely be the final kick for us to obey the E820 map when
determining where to put the trampoline.

Jan


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 11:50                     ` Jan Beulich
@ 2017-06-07 11:55                       ` Paul Durrant
  2017-06-07 12:00                         ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 11:55 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel (xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 June 2017 12:50
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel (xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 07.06.17 at 12:36, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> > [snip]
> >> >>
> >> >> TBH: I really can't see what is wrong with that patch. The only change
> >> >> which should be able to break something seems to be the reduction of
> >> the
> >> >> wakeup stack size to 3kB, but this shouldn't affect booting the system
> >> >> at all...
> >> >>
> >> > Yeah, my next test is going to be increasing the size of the wakeup stack
> >> again, but there is really nothing obviously wrong with the patch.
> >>
> >> My gut feeling is that there is some path through boot (tickled by these
> >> two machines) which is clobbering the wrong piece of memory, which was
> >> previously safe and is now not, because of the rearrangements here.
> >>
> >> Debugging these machines is very tricky, because they have no serial or
> >> IPMI whatsoever.
> >>
> >
> > It does appear to be a layout issue. If I modify the code to just set
> > wakeup_stack to wakeup_stack_start + PAGE_SIZE, so it has the full 4k
> then I
> > still get the problem. However if I then move that code block that includes
> > wakeup.S and move it to the end of trampoline.S so that wakeup code and
> stack
> > are once again located at the end then the problem goes away.
> 
> Could you do the following two things:
> 1) Subtract, say, 4k from trampoline_phys right before setting it
> (immediately ahead of trampoline_setup)? Ideally you'd also log
> the resulting value (in case it works).

Ok, I'll have a go at that.

> 2) Provide the E820 map of that box.
> I'm suspecting the BIOS might use an EBDA without recording it in
> the low BIOS data area. If it's reported in E820 that would then
> likely be the final kick for us to obey the E820 map when
> determining where to put the trampoline.
> 

The stretch kernel booted bare-metal reports:

[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x00000000000963ff] usable
[    0.000000] BIOS-e820: [mem 0x0000000000096400-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000034d53fff] usable
[    0.000000] BIOS-e820: [mem 0x0000000034d54000-0x0000000034d54fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000034d55000-0x0000000034d9efff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000034d9f000-0x000000003bee1fff] usable
[    0.000000] BIOS-e820: [mem 0x000000003bee2000-0x000000003c22cfff] reserved
[    0.000000] BIOS-e820: [mem 0x000000003c22d000-0x000000003c268fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000003c269000-0x000000003cb61fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000003cb62000-0x000000003d2fdfff] reserved
[    0.000000] BIOS-e820: [mem 0x000000003d2fe000-0x000000003d2fefff] usable
[    0.000000] BIOS-e820: [mem 0x000000003d300000-0x000000003d3fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x00000008beffffff] usable

  Paul

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 11:06                     ` Paul Durrant
@ 2017-06-07 11:57                       ` Juergen Gross
  2017-06-07 12:02                         ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Juergen Gross @ 2017-06-07 11:57 UTC (permalink / raw)
  To: Paul Durrant, Andrew Cooper, Jan Beulich
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

On 07/06/17 13:06, Paul Durrant wrote:
>> -----Original Message-----
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>> Paul Durrant
>> Sent: 07 June 2017 11:37
>> To: Andrew Cooper <Andrew.Cooper3@citrix.com>; 'Juergen Gross'
>> <jgross@suse.com>; Jan Beulich <JBeulich@suse.com>
>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>> devel@lists.xenproject.org>; Julien Grall (julien.grall@arm.com)
>> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>
>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>
>>> -----Original Message-----
>> [snip]
>>>>>
>>>>> TBH: I really can't see what is wrong with that patch. The only change
>>>>> which should be able to break something seems to be the reduction of
>>> the
>>>>> wakeup stack size to 3kB, but this shouldn't affect booting the system
>>>>> at all...
>>>>>
>>>> Yeah, my next test is going to be increasing the size of the wakeup stack
>>> again, but there is really nothing obviously wrong with the patch.
>>>
>>> My gut feeling is that there is some path through boot (tickled by these
>>> two machines) which is clobbering the wrong piece of memory, which was
>>> previously safe and is now not, because of the rearrangements here.
>>>
>>> Debugging these machines is very tricky, because they have no serial or
>>> IPMI whatsoever.
>>>
>>
>> It does appear to be a layout issue. If I modify the code to just set
>> wakeup_stack to wakeup_stack_start + PAGE_SIZE, so it has the full 4k then I
>> still get the problem. However if I then move that code block that includes
>> wakeup.S and move it to the end of trampoline.S so that wakeup code and
>> stack are once again located at the end then the problem goes away.
>>
> 
> It appears that it is just the code that needs to go at the end. The following patch is sufficient to avoid the problem. This may be preferable to a full reversion...

I believe this is wrong. You risk the wakeup_stack extending into wakeup
code and the main reason of the patch is gone, as now the permanent
trampoline no longer is on a single page.


Juergen

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 11:55                       ` Paul Durrant
@ 2017-06-07 12:00                         ` Jan Beulich
  2017-06-07 12:46                           ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-07 12:00 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel (xen-devel@lists.xenproject.org)

>>> On 07.06.17 at 13:55, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 June 2017 12:50
>> 2) Provide the E820 map of that box.
>> I'm suspecting the BIOS might use an EBDA without recording it in
>> the low BIOS data area. If it's reported in E820 that would then
>> likely be the final kick for us to obey the E820 map when
>> determining where to put the trampoline.
>> 
> 
> The stretch kernel booted bare-metal reports:
> 
> [    0.000000] e820: BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x00000000000963ff] usable
> [    0.000000] BIOS-e820: [mem 0x0000000000096400-0x000000000009ffff] reserved

There we go. Subtracting 4k may then even be too little (depending
on what EBDA and low memory values the system reports). Of course
it would be a BIOS bug if they reported some memory they use for
themselves through only E820, as that interface is not required to
be present, and really, really old software wouldn't even know
about it and would hence also be in trouble.

Jan
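
To illustrate what obeying the E820 map when placing the trampoline could look like, here is a minimal sketch. It is not Xen's actual code: `struct e820_entry`, `pick_trampoline_base()` and the type constants are names made up for this example. It picks the highest page-aligned base below 1MiB at which a block of the requested length fits entirely inside one usable range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_RAM      1u        /* "usable" in the Linux boot log */
#define E820_RESERVED 2u
#define PAGE_SIZE     0x1000u
#define LOW_LIMIT     0x100000u /* only look below 1MiB */

/* Hypothetical in-memory form of one E820 entry. */
struct e820_entry {
    uint64_t addr;  /* first byte of the range */
    uint64_t size;  /* length of the range in bytes */
    uint32_t type;  /* E820_RAM, E820_RESERVED, ... */
};

/*
 * Return the highest page-aligned base below 1MiB such that
 * [base, base + len) lies inside a single usable range.
 * Returns 0 if nothing fits (base 0 is never a useful answer here).
 */
static uint64_t pick_trampoline_base(const struct e820_entry *map,
                                     size_t nr, uint64_t len)
{
    uint64_t best = 0;

    assert(len != 0 && len % PAGE_SIZE == 0);

    for (size_t i = 0; i < nr; i++) {
        uint64_t start, end, base;

        if (map[i].type != E820_RAM)
            continue;

        start = map[i].addr;
        end = map[i].addr + map[i].size;
        if (end > LOW_LIMIT)
            end = LOW_LIMIT;            /* clip the range to low memory */
        if (end <= start || end - start < len)
            continue;

        /* Highest aligned base that still leaves room for len bytes. */
        base = (end - len) & ~(uint64_t)(PAGE_SIZE - 1);
        if (base >= start && base > best)
            best = base;
    }

    return best;
}
```

Fed the first entries of the map quoted above (usable up to 0x963ff, reserved from 0x96400), a 64k request (an assumed size, not the real trampoline size) lands at 0x86000, well clear of the reserved range.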


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 11:57                       ` Juergen Gross
@ 2017-06-07 12:02                         ` Paul Durrant
  2017-06-07 12:13                           ` Juergen Gross
  2017-06-07 12:19                           ` Jan Beulich
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 12:02 UTC (permalink / raw)
  To: 'Juergen Gross', Andrew Cooper, Jan Beulich
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

> -----Original Message-----
> From: Juergen Gross [mailto:jgross@suse.com]
> Sent: 07 June 2017 12:57
> To: Paul Durrant <Paul.Durrant@citrix.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Jan Beulich <JBeulich@suse.com>
> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>; Julien Grall (julien.grall@arm.com)
> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> On 07/06/17 13:06, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> >> Paul Durrant
> >> Sent: 07 June 2017 11:37
> >> To: Andrew Cooper <Andrew.Cooper3@citrix.com>; 'Juergen Gross'
> >> <jgross@suse.com>; Jan Beulich <JBeulich@suse.com>
> >> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> >> devel@lists.xenproject.org>; Julien Grall (julien.grall@arm.com)
> >> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>
> >> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>
> >>> -----Original Message-----
> >> [snip]
> >>>>>
> >>>>> TBH: I really can't see what is wrong with that patch. The only change
> >>>>> which should be able to break something seems to be the reduction
> of
> >>> the
> >>>>> wakeup stack size to 3kB, but this shouldn't affect booting the system
> >>>>> at all...
> >>>>>
> >>>> Yeah, my next test is going to be increasing the size of the wakeup
> stack
> >>> again, but there is really nothing obviously wrong with the patch.
> >>>
> >>> My gut feeling is that there is some path through boot (tickled by these
> >>> two machines) which is clobbering the wrong piece of memory, which
> was
> >>> previously safe and is now not, because of the rearrangements here.
> >>>
> >>> Debugging these machines is very tricky, because they have no serial or
> >>> IPMI whatsoever.
> >>>
> >>
> >> It does appear to be a layout issue. If I modify the code to just set
> >> wakeup_stack to wakeup_stack_start + PAGE_SIZE, so it has the full 4k
> then I
> >> still get the problem. However if I then move that code block that includes
> >> wakeup.S and move it to the end of trampoline.S so that wakeup code and
> >> stack are once again located at the end then the problem goes away.
> >>
> >
> > It appears that it is just the code that needs to go at the end. The following
> patch is sufficient to avoid the problem. This may be preferable to a full
> reversion...
> 
> I believe this is wrong. You risk the wakeup_stack extending into wakeup
> code and the main reason of the patch is gone, as now the permanent
> trampoline no longer is on a single page.
> 

I must be misunderstanding something then. The stack grows down from wakeup_stack towards wakeup_stack_start doesn't it? So why would there be an issue with the stack overwriting wakeup code?

  Paul

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 12:02                         ` Paul Durrant
@ 2017-06-07 12:13                           ` Juergen Gross
  2017-06-07 12:19                           ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Juergen Gross @ 2017-06-07 12:13 UTC (permalink / raw)
  To: Paul Durrant, Andrew Cooper, Jan Beulich
  Cc: xen-devel (xen-devel@lists.xenproject.org),
	Julien Grall (julien.grall@arm.com), 'Boris Ostrovsky'

On 07/06/17 14:02, Paul Durrant wrote:
>> -----Original Message-----
>> From: Juergen Gross [mailto:jgross@suse.com]
>> Sent: 07 June 2017 12:57
>> To: Paul Durrant <Paul.Durrant@citrix.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; Jan Beulich <JBeulich@suse.com>
>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>> devel@lists.xenproject.org>; Julien Grall (julien.grall@arm.com)
>> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>
>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>
>> On 07/06/17 13:06, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>>>> Paul Durrant
>>>> Sent: 07 June 2017 11:37
>>>> To: Andrew Cooper <Andrew.Cooper3@citrix.com>; 'Juergen Gross'
>>>> <jgross@suse.com>; Jan Beulich <JBeulich@suse.com>
>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
>>>> devel@lists.xenproject.org>; Julien Grall (julien.grall@arm.com)
>>>> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>
>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>>>>
>>>>> -----Original Message-----
>>>> [snip]
>>>>>>>
>>>>>>> TBH: I really can't see what is wrong with that patch. The only change
>>>>>>> which should be able to break something seems to be the reduction
>> of
>>>>> the
>>>>>>> wakeup stack size to 3kB, but this shouldn't affect booting the system
>>>>>>> at all...
>>>>>>>
>>>>>> Yeah, my next test is going to be increasing the size of the wakeup
>> stack
>>>>> again, but there is really nothing obviously wrong with the patch.
>>>>>
>>>>> My gut feeling is that there is some path through boot (tickled by these
>>>>> two machines) which is clobbering the wrong piece of memory, which
>> was
>>>>> previously safe and is now not, because of the rearrangements here.
>>>>>
>>>>> Debugging these machines is very tricky, because they have no serial or
>>>>> IPMI whatsoever.
>>>>>
>>>>
>>>> It does appear to be a layout issue. If I modify the code to just set
>>>> wakeup_stack to wakeup_stack_start + PAGE_SIZE, so it has the full 4k
>> then I
>>>> still get the problem. However if I then move that code block that includes
>>>> wakeup.S and move it to the end of trampoline.S so that wakeup code and
>>>> stack are once again located at the end then the problem goes away.
>>>>
>>>
>>> It appears that it is just the code that needs to go at the end. The following
>> patch is sufficient to avoid the problem. This may be preferable to a full
>> reversion...
>>
>> I believe this is wrong. You risk the wakeup_stack extending into wakeup
>> code and the main reason of the patch is gone, as now the permanent
>> trampoline no longer is on a single page.
>>
> 
> I must be misunderstanding something then. The stack grows down from wakeup_stack towards wakeup_stack_start doesn't it? So why would there be an issue with the stack overwriting wakeup code?

wakeup_stack is just defined to be trampoline_start + PAGE_SIZE,
without any real space reserved for the stack. So it may well be that
wakeup_stack points somewhere into wakeup.S.

There must be no permanent trampoline code after wakeup_stack_start.

Juergen
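
The constraint described above can be written down as a small check. This is only a sketch under assumptions: the offsets are hypothetical values relative to trampoline_start, and `wakeup_stack_bytes()` and `stack_overlaps_code()` are invented for this example. The real check is the ASSERT() in the linker script mentioned in the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE        4096u
#define MIN_WAKEUP_STACK 3072u  /* the "about 3k" from the commit message */

static_assert(MIN_WAKEUP_STACK <= PAGE_SIZE, "stack must fit in one page");

/*
 * The stack top is trampoline_start + PAGE_SIZE, so the space available
 * to the stack is whatever lies between wakeup_stack_start and the end
 * of the first page. Offsets are relative to trampoline_start.
 */
static uint32_t wakeup_stack_bytes(uint32_t stack_start_off)
{
    return stack_start_off < PAGE_SIZE ? PAGE_SIZE - stack_start_off : 0;
}

/*
 * Non-zero if permanent code at [code_off, code_off + code_len) shares
 * bytes with the stack area [stack_start_off, PAGE_SIZE).
 */
static int stack_overlaps_code(uint32_t stack_start_off,
                               uint32_t code_off, uint32_t code_len)
{
    return code_off < PAGE_SIZE && code_off + code_len > stack_start_off;
}
```

With a hypothetical wakeup_stack_start at offset 0x300, the stack gets 0xd00 bytes, which satisfies the 3k minimum. Permanent code placed at offset 0xe00 would sit inside the stack area, while code ending exactly at 0x300 would not.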

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 12:02                         ` Paul Durrant
  2017-06-07 12:13                           ` Juergen Gross
@ 2017-06-07 12:19                           ` Jan Beulich
  2017-06-07 12:26                             ` Paul Durrant
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-07 12:19 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel (xen-devel@lists.xenproject.org)

>>> On 07.06.17 at 14:02, <Paul.Durrant@citrix.com> wrote:
>> From: Juergen Gross [mailto:jgross@suse.com]
>> Sent: 07 June 2017 12:57
>> 
>> On 07/06/17 13:06, Paul Durrant wrote:
>> > It appears that it is just the code that needs to go at the end. The following
>> patch is sufficient to avoid the problem. This may be preferable to a full
>> reversion...
>> 
>> I believe this is wrong. You risk the wakeup_stack extending into wakeup
>> code and the main reason of the patch is gone, as now the permanent
>> trampoline no longer is on a single page.
>> 
> 
> I must be misunderstanding something then. The stack grows down from 
> wakeup_stack towards wakeup_stack_start doesn't it? So why would there be an 
> issue with the stack overwriting wakeup code?

I think this is a pointless discussion: Once we know memory is being
corrupted, it doesn't help shuffling things around. By putting the
wakeup code at the end, it'll be that which gets corrupted, and
hence S3 wakeup would not work. We can really only try to figure
out what parts of memory we need to avoid touching _at all_.

Jan


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 12:19                           ` Jan Beulich
@ 2017-06-07 12:26                             ` Paul Durrant
  2017-06-07 12:34                               ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 12:26 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel (xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 June 2017 13:19
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel (xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 07.06.17 at 14:02, <Paul.Durrant@citrix.com> wrote:
> >> From: Juergen Gross [mailto:jgross@suse.com]
> >> Sent: 07 June 2017 12:57
> >>
> >> On 07/06/17 13:06, Paul Durrant wrote:
> >> > It appears that it is just the code that needs to go at the end. The
> following
> >> patch is sufficient to avoid the problem. This may be preferable to a full
> >> reversion...
> >>
> >> I believe this is wrong. You risk the wakeup_stack extending into wakeup
> >> code and the main reason of the patch is gone, as now the permanent
> >> trampoline no longer is on a single page.
> >>
> >
> > I must be misunderstanding something then. The stack grows down from
> > wakeup_stack towards wakeup_stack_start doesn't it? So why would
> there be an
> > issue with the stack overwriting wakeup code?
> 
> I think this is a pointless discussion: Once we know memory is being
> corrupted, it doesn't help shuffling things around. By putting the
> wakeup code at the end, it'll be that which gets corrupted, and
> hence S3 wakeup would not work. We can really only try to figure
> out what parts of memory we need to avoid touching _at all_.
> 

Yes, fair enough. I'm currently trying to figure out what the code in head.S just ahead of trampoline_setup is trying to do.

  Paul

> Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 12:26                             ` Paul Durrant
@ 2017-06-07 12:34                               ` Jan Beulich
  0 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2017-06-07 12:34 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 07.06.17 at 14:26, <Paul.Durrant@citrix.com> wrote:
> I'm currently trying to figure out what the code in head.S 
> just ahead of trampoline_setup is trying to do.

That's where we determine where to put the trampoline, by looking
at base memory size, EBDA location, and the MB provided low
memory size. We try to ignore insane values.
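
A minimal sketch of that selection, for illustration only. This is not the code in head.S: the function name, the 256 KiB plausibility floor and the 640 KiB ceiling are all assumptions made for the example.

```python
# Illustrative sketch only; names and thresholds are assumptions, not Xen's code.
PARAS_PER_KIB = 64   # 1 KiB = 64 paragraphs of 16 bytes
FLOOR = 0x4000       # assumed: anything below 256 KiB is treated as insane
CEILING = 0xA000     # 640 KiB, the conventional top of low memory


def low_memory_top(bda_base_kib, ebda_segment, mb_mem_lower_kib=None):
    """Return the assumed top of usable low memory, in 16-byte paragraphs."""
    # Base memory size and the multiboot value are in KiB; the EBDA location
    # is already a real-mode segment (i.e. a paragraph number).
    candidates = [bda_base_kib * PARAS_PER_KIB, ebda_segment]
    if mb_mem_lower_kib is not None:
        candidates.append(mb_mem_lower_kib * PARAS_PER_KIB)
    # Drop implausible values, then take the lowest survivor.
    plausible = [c for c in candidates if FLOOR <= c <= CEILING]
    return min(plausible, default=CEILING)
```

The point of taking the minimum is that any one of the three sources may be the only one that knows about a firmware-owned region at the top of low memory.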

Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 12:00                         ` Jan Beulich
@ 2017-06-07 12:46                           ` Paul Durrant
  2017-06-07 12:55                             ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 12:46 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 June 2017 13:00
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>;
> 'BorisOstrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 07.06.17 at 13:55, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 07 June 2017 12:50
> >> 2) Provide the E820 map of that box.
> >> I'm suspecting the BIOS might use an EBDA without recording it in
> >> the low BIOS data area. If it's reported in E820 that would then
> >> likely be the final kick for us to obey to the E820 map when
> >> determining where to put the trampoline.
> >>
> >
> > The stretch kernel booted bare-metal reports:
> >
> > [    0.000000] e820: BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x00000000000963ff]
> usable
> > [    0.000000] BIOS-e820: [mem 0x0000000000096400-0x000000000009ffff]
> reserved
> 
> There we go. Subtracting 4k may then even be too little (depending
> what EBDA and low memory values the system reports). Of course
> it would be a BIOS bug if they reported some memory they use for
> themselves through only E820, as that interface is not required to
> be present, and really, really old software wouldn't even know
> about it and would hence also be in trouble.
> 

Neither 4k nor 8k seemed to be enough. Even subtracting another 64k doesn't work. I guess I'm going to have to try to write some code to log values to the VGA buffer to see what is going on.

  Paul


> Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 12:46                           ` Paul Durrant
@ 2017-06-07 12:55                             ` Jan Beulich
  2017-06-07 15:06                               ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-07 12:55 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 07.06.17 at 14:46, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 June 2017 13:00
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
>> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
>> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>;
>> 'BorisOstrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
>> <jgross@suse.com>
>> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>> 
>> >>> On 07.06.17 at 13:55, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 07 June 2017 12:50
>> >> 2) Provide the E820 map of that box.
>> >> I'm suspecting the BIOS might use an EBDA without recording it in
>> >> the low BIOS data area. If it's reported in E820 that would then
>> >> likely be the final kick for us to obey to the E820 map when
>> >> determining where to put the trampoline.
>> >>
>> >
>> > The stretch kernel booted bare-metal reports:
>> >
>> > [    0.000000] e820: BIOS-provided physical RAM map:
>> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x00000000000963ff]
>> usable
>> > [    0.000000] BIOS-e820: [mem 0x0000000000096400-0x000000000009ffff]
>> reserved
>> 
>> There we go. Subtracting 4k may then even be too little (depending
>> what EBDA and low memory values the system reports). Of course
>> it would be a BIOS bug if they reported some memory they use for
>> themselves through only E820, as that interface is not required to
>> be present, and really, really old software wouldn't even know
>> about it and would hence also be in trouble.
>> 
> 
> Neither 4k nor 8k seemed to be enough. Even subtracting another
> 64k doesn't work.

That's rather unexpected.

> I guess I'm going to have to try to write some code to log values to 
> the VGA buffer to see what is going on.

Good luck!

Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07  8:07         ` Jan Beulich
  2017-06-07  8:09           ` Paul Durrant
@ 2017-06-07 14:05           ` Boris Ostrovsky
  1 sibling, 0 replies; 57+ messages in thread
From: Boris Ostrovsky @ 2017-06-07 14:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel (xen-devel@lists.xenproject.org), Paul Durrant

On 06/07/2017 04:07 AM, Jan Beulich wrote:
>>>> On 06.06.17 at 19:00, <boris.ostrovsky@oracle.com> wrote:
>> FWIW, one of machines in our test farm choked on this very patch. I
>> don't remember details now but essentially it turned out that syslinux
>> (we are pxe-booting) could not handle changes in ELF sections layout
>> (the way syslinux calculated how to load the binary into memory resulted
>> in overlap of some sort).
> There has always been an overlap between the main and the notes
> segment; there being only two segments I don't see any other
> potential for an overlap. 

I realize this thread has progressed since yesterday but just to
clarify: I wasn't referring to overlap between Xen sections. It was
syslinux failing to fit sections of xen.gz, bzImage (and initramfs,
although I think mapping had already failed by the time syslinux got to
it) into e820 on that particular machine.

Because we are using rather old syslinux I figured it was something
specific to that version, which is why I didn't flag this as an issue.
Which, in hindsight, was wrong.

-boris

> In fact I can't see anything other than size
> differences between a 4.8.1 and a 4.9 binary, plus of course the
> base address change resulting from Daniel's EFI/GrUB2 patches. So
> I'm rather puzzled as to what effect Jürgen's patch could have had
> on the behavior of any loader whatsoever.
>
> The only possibly misleading section I notice is .reloc, but that's
> present in xen-syms only, not in xen.gz. And again it's a result of
> Daniel's series, not Jürgen's patch.
>
> Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 12:55                             ` Jan Beulich
@ 2017-06-07 15:06                               ` Paul Durrant
  2017-06-07 15:33                                 ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 15:06 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 June 2017 13:56
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>;
> 'BorisOstrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 07.06.17 at 14:46, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 07 June 2017 13:00
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> >> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> >> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>;
> >> 'BorisOstrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> >> <jgross@suse.com>
> >> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>
> >> >>> On 07.06.17 at 13:55, <Paul.Durrant@citrix.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 07 June 2017 12:50
> >> >> 2) Provide the E820 map of that box.
> >> >> I'm suspecting the BIOS might use an EBDA without recording it in
> >> >> the low BIOS data area. If it's reported in E820 that would then
> >> >> likely be the final kick for us to obey to the E820 map when
> >> >> determining where to put the trampoline.
> >> >>
> >> >
> >> > The stretch kernel booted bare-metal reports:
> >> >
> >> > [    0.000000] e820: BIOS-provided physical RAM map:
> >> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-
> 0x00000000000963ff]
> >> usable
> >> > [    0.000000] BIOS-e820: [mem 0x0000000000096400-
> 0x000000000009ffff]
> >> reserved
> >>
> >> There we go. Subtracting 4k may then even be too little (depending
> >> what EBDA and low memory values the system reports). Of course
> >> it would be a BIOS bug if they reported some memory they use for
> >> themselves through only E820, as that interface is not required to
> >> be present, and really, really old software wouldn't even know
> >> about it and would hence also be in trouble.
> >>
> >
> > Neither 4k nor 8k seemed to be enough. Even subtracting another
> > 64k doesn't work.
> 
> That's rather unexpected.
> 
> > I guess I'm going to have to try to write some code to log values to
> > the VGA buffer to see what is going on.
> 
> Good luck!
> 

That really was too hard... Instead I reverted the patch and stashed EBDA and the initial location of the trampoline and dumped them in __start_xen(). The EBDA tallies with the E820:

(XEN) boot_ebda = 9640
.
.
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000096400 (usable)
(XEN)  0000000000096400 - 00000000000a0000 (reserved)

And the initial location of the trampoline appears to be ok...

(XEN) orig_trampoline_phys = 86000

So, still no clue as to why moving the wakeup code around is messing things up.
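
The numbers above can be cross-checked with a few lines of arithmetic. This is a sketch under stated assumptions: that the trampoline is placed by reserving 64 KiB below the EBDA and rounding down to a 4 KiB boundary. Neither constant is stated in this thread, and `place_below` is a hypothetical helper, but together they happen to reproduce the logged value.

```python
EBDA_SEGMENT = 0x9640          # "boot_ebda = 9640" from the log above
E820_RESERVED_START = 0x96400  # start of the reserved E820 range above
TRAMPOLINE_PHYS = 0x86000      # "orig_trampoline_phys = 86000"


def segment_to_phys(segment):
    """Convert a real-mode segment value to a physical address."""
    return segment << 4


def place_below(top_segment, reserve_paragraphs=0x1000):
    """Assumed placement: reserve 64 KiB below the top, round down to 4 KiB."""
    # 0x100 paragraphs of 16 bytes make up one 4 KiB page.
    start = (top_segment - reserve_paragraphs) & ~0xFF
    return segment_to_phys(start)


print(hex(segment_to_phys(EBDA_SEGMENT)))  # 0x96400
print(hex(place_below(EBDA_SEGMENT)))      # 0x86000
```

So the EBDA segment matches the E820 boundary exactly, and the trampoline sits a full 64 KiB below the reserved range.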

  Paul


> Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 15:06                               ` Paul Durrant
@ 2017-06-07 15:33                                 ` Jan Beulich
  2017-06-07 15:40                                   ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-07 15:33 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 07.06.17 at 17:06, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 June 2017 13:56
>> >>> On 07.06.17 at 14:46, <Paul.Durrant@citrix.com> wrote:
>> > I guess I'm going to have to try to write some code to log values to
>> > the VGA buffer to see what is going on.
>> 
>> Good luck!
>> 
> 
> That really was too hard... Instead I reverted the patch and stashed EBDA 
> and the initial location of the trampoline and dumped them in __start_xen(). 
> The EBDA tallies with the E820:
> 
> (XEN) boot_ebda = 9640
> .
> .
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 0000000000096400 (usable)
> (XEN)  0000000000096400 - 00000000000a0000 (reserved)
> 
> And the initial location of the trampoline appears to be ok...
> 
> (XEN) orig_trampoline_phys = 86000
> 
> So, still no clue as to why moving the wakeup code around is messing things 
> up.

This looks to be turning into a nightmare. Since you said it doesn't
make it to the point where Xen would do any normal output, have
you been able to narrow down how far it gets? I think this is one
of the few remaining avenues to gain some more understanding.

One other thing to try might be, with that patch reverted, to fill
all memory upwards from wakeup_start (or really its low
memory copy) with a pattern, and later inspect whether anything
changed (you could of course also simply compare low memory
copy and original). Or maybe you have tried this already...
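
The pattern-fill idea can be sketched as follows. This is a user-space illustration on a `bytearray`, not boot code; the pattern bytes and helper names are arbitrary choices for the example.

```python
PATTERN = b"\xde\xad\xbe\xef"  # arbitrary; avoid bytes a stray write is likely to produce


def fill_pattern(buf, pattern=PATTERN):
    """Overwrite buf in place with a repeating pattern."""
    for i in range(len(buf)):
        buf[i] = pattern[i % len(pattern)]


def changed_ranges(buf, pattern=PATTERN):
    """Return half-open (start, end) ranges that no longer match the pattern."""
    ranges = []
    start = None
    for i, byte in enumerate(buf):
        intact = byte == pattern[i % len(pattern)]
        if not intact and start is None:
            start = i                      # a damaged run begins here
        elif intact and start is not None:
            ranges.append((start, i))      # the damaged run ended just before i
            start = None
    if start is not None:
        ranges.append((start, len(buf)))
    return ranges


region = bytearray(0x2000)
fill_pattern(region)
region[0x1000:0x1200] = bytes(512)         # simulate a stray 512-byte write of zeros
print(changed_ranges(region))              # [(4096, 4608)]
```

A stray write that happens to store the same byte the pattern already holds at that position goes undetected, so a second run with a different pattern is a cheap way to confirm the extent.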

Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 15:33                                 ` Jan Beulich
@ 2017-06-07 15:40                                   ` Paul Durrant
  2017-06-07 15:52                                     ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-07 15:40 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 June 2017 16:33
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>;
> 'BorisOstrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 07.06.17 at 17:06, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 07 June 2017 13:56
> >> >>> On 07.06.17 at 14:46, <Paul.Durrant@citrix.com> wrote:
> >> > I guess I'm going to have to try to write some code to log values to
> >> > the VGA buffer to see what is going on.
> >>
> >> Good luck!
> >>
> >
> > That really was too hard... Instead I reverted the patch and stashed EBDA
> > and the initial location of the trampoline and dumped them in
> __start_xen().
> > The EBDA tallies with the E820:
> >
> > (XEN) boot_ebda = 9640
> > .
> > .
> > (XEN) Xen-e820 RAM map:
> > (XEN)  0000000000000000 - 0000000000096400 (usable)
> > (XEN)  0000000000096400 - 00000000000a0000 (reserved)
> >
> > And the initial location of the trampoline appears to be ok...
> >
> > (XEN) orig_trampoline_phys = 86000
> >
> > So, still no clue as to why moving the wakeup code around is messing things
> > up.
> 
> This looks to be turning into a nightmare.

Definitely!

> Since you said it doesn't
> make it to the point where Xen would do any normal output, have
> you been able to narrow down how far it gets? I think this is one
> of the few remaining avenues to gain some more understanding.
> 

Yes, that's what I'm now attempting. Andrew has some serial logging patches that I'm going to try to convert to VGA logging.

> One other thing to try might be, with that patch reverted, to fill
> all memory upwards from wakeup_start (or really its low
> memory copy) with a pattern, and later inspect whether anything
> changed (you could of course also simply compare low memory
> copy and original). Or maybe you have tried this already...
> 

No, not done that... sounds like a good idea.

Thanks,

  Paul

> Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 15:40                                   ` Paul Durrant
@ 2017-06-07 15:52                                     ` Jan Beulich
  2017-06-08 12:42                                       ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-07 15:52 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 07.06.17 at 17:40, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 June 2017 16:33
>>
>> Since you said it doesn't
>> make it to the point where Xen would do any normal output, have
>> you been able to narrow down how far it gets? I think this is one
>> of the few remaining avenues to gain some more understanding.
>> 
> 
> Yes, that's what I'm now attempting. Andrew has some serial logging patches 
> that I'm going to try to convert to VGA logging.

If the box hangs rather than rebooting that's an option. The
"canonical" debugging method I used to use in such cases
is to leverage the fact that most chipsets leave at least the
unused ports in the 0x81...0x8f range untouched across boot,
so I've stored progress indicators and/or auxiliary data there,
reading them out upon next boot. But I have no library for
doing so (and this also was mostly for projects I worked on
prior to turning to Xen and Linux) ...

Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-07 15:52                                     ` Jan Beulich
@ 2017-06-08 12:42                                       ` Paul Durrant
  2017-06-08 12:46                                         ` Juergen Gross
  2017-06-08 13:18                                         ` Jan Beulich
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-08 12:42 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

For those following this...

By poking characters at the screen and bisecting where they stopped, I have narrowed the problem to the code in edd.S. I can successfully boot by setting opt_edd=off on the Xen cmd line and I can also boot with the following patch applied:

diff --git a/xen/arch/x86/boot/edd.S b/xen/arch/x86/boot/edd.S
index 73371f98b5..5409f1d9a1 100644
--- a/xen/arch/x86/boot/edd.S
+++ b/xen/arch/x86/boot/edd.S
@@ -148,5 +148,6 @@ GLOBAL(boot_mbr_signature_nr)
         .byte   0
 GLOBAL(boot_mbr_signature)
         .fill   EDD_MBR_SIG_MAX*8,1,0
+       .align  4096
 GLOBAL(boot_edd_info)
-        .fill   512,1,0                         # big enough for a disc sector
+        .fill   4096,1,0                         # big enough for a disc sector

(based on a hunch that the BIOS defaults to a 4K sector for my NVMe drive)

I need to investigate some more but I do wonder whether the EDD info should be read first to determine the appropriate size of memory buffer to use when issuing the read of the MBR. Hardcoding a 4k reservation seems like the wrong thing to do, even if it is sufficient for this BIOS.
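
A sketch of what sizing the buffer from the EDD data could look like, assuming the INT 13 Fn 48 result buffer carries a little-endian bytes-per-sector word at offset 0x18, as in the EDD drive parameter layout. The helper name and the fallback policy are hypothetical, and whether this firmware reports the size it actually uses for its reads is not known.

```python
import struct

BYTES_PER_SECTOR_OFFSET = 0x18  # offset of the bytes-per-sector word (EDD layout)


def mbr_buffer_size(edd_params, default=512):
    """Return the buffer size needed for one sector, per the EDD parameters."""
    if len(edd_params) < BYTES_PER_SECTOR_OFFSET + 2:
        return default                       # result buffer too short to hold the field
    (bytes_per_sector,) = struct.unpack_from(
        "<H", edd_params, BYTES_PER_SECTOR_OFFSET)
    # Fall back when the value is smaller than 512 or not a power of two.
    if bytes_per_sector < 512 or bytes_per_sector & (bytes_per_sector - 1):
        return default
    return bytes_per_sector
```

This only helps if the parameters are fetched before the MBR read is issued, which is the reordering suggested above.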

  Paul


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-08 12:42                                       ` Paul Durrant
@ 2017-06-08 12:46                                         ` Juergen Gross
  2017-06-08 13:18                                         ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Juergen Gross @ 2017-06-08 12:46 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

On 08/06/17 14:42, Paul Durrant wrote:
> For those following this...
> 
> By poking characters at the screen and bisecting where they stopped, I have narrowed the problem to the code in edd.S. I can successfully boot by setting opt_edd=off on the Xen cmd line and I can also boot with the following patch applied:
> 
> diff --git a/xen/arch/x86/boot/edd.S b/xen/arch/x86/boot/edd.S
> index 73371f98b5..5409f1d9a1 100644
> --- a/xen/arch/x86/boot/edd.S
> +++ b/xen/arch/x86/boot/edd.S
> @@ -148,5 +148,6 @@ GLOBAL(boot_mbr_signature_nr)
>          .byte   0
>  GLOBAL(boot_mbr_signature)
>          .fill   EDD_MBR_SIG_MAX*8,1,0
> +       .align  4096
>  GLOBAL(boot_edd_info)
> -        .fill   512,1,0                         # big enough for a disc sector
> +        .fill   4096,1,0                         # big enough for a disc sector
> 
> (based on a hunch that the BIOS defaults to a 4K sector for my NVMe drive)
> 
> I need to investigate some more but I do wonder whether the EDD info should be read first to determine the appropriate size of memory buffer to use when issuing the read of the MBR. Hardcoding a 4k reservation seems like the wrong thing to do, even if it is sufficient for this BIOS.

Thanks for the rehabilitation of my patch. :-)


Juergen


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-08 12:42                                       ` Paul Durrant
  2017-06-08 12:46                                         ` Juergen Gross
@ 2017-06-08 13:18                                         ` Jan Beulich
  2017-06-08 13:24                                           ` Paul Durrant
  2017-06-09 12:19                                           ` Paul Durrant
  1 sibling, 2 replies; 57+ messages in thread
From: Jan Beulich @ 2017-06-08 13:18 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 08.06.17 at 14:42, <Paul.Durrant@citrix.com> wrote:
> For those following this...
> 
> By poking characters at the screen and bisecting where they stopped, I have 
> narrowed the problem to the code in edd.S. I can successfully boot by setting 
> opt_edd=off on the Xen cmd line and I can also boot with the following patch 
> applied:
> 
> diff --git a/xen/arch/x86/boot/edd.S b/xen/arch/x86/boot/edd.S
> index 73371f98b5..5409f1d9a1 100644
> --- a/xen/arch/x86/boot/edd.S
> +++ b/xen/arch/x86/boot/edd.S
> @@ -148,5 +148,6 @@ GLOBAL(boot_mbr_signature_nr)
>          .byte   0
>  GLOBAL(boot_mbr_signature)
>          .fill   EDD_MBR_SIG_MAX*8,1,0
> +       .align  4096
>  GLOBAL(boot_edd_info)
> -        .fill   512,1,0                         # big enough for a disc 
> sector
> +        .fill   4096,1,0                         # big enough for a disc 
> sector
> 
> (based on a hunch that the BIOS defaults to a 4K sector for my NVMe drive)
> 
> I need to investigate some more but I do wonder whether the EDD info should 
> be read first to determine the appropriate size of memory buffer to use when 
> issuing the read of the MBR. Hardcoding a 4k reservation seems like the wrong 
> thing to do, even if it is sufficient for this BIOS.

boot_edd_info is being used for two things - reading the MBR of
each disk and storing data retrieved from INT 13 Fn 41 and 48.
The latter occupies 492 bytes (6 times 8+74). Which would make
me guess the system has a 4k disk, and the BIOS doesn't abstract
away this characteristic when handling INT 13 Fn 02 (which is
supposed to only act in multiples of 512-byte sectors, as opposed
to Fn 42).

The alternative of Fn 48 overflowing its buffer would seem less
likely, especially with the buffer holding a size on input.
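
The sizes involved, spelled out as a small calculation. The 4096-byte sector size is the hypothesis under discussion, not a measured value, and `overrun` is a helper invented for the example.

```python
EDD_ENTRIES = 6          # "6 times"
EDD_ENTRY_BYTES = 8 + 74 # per-entry size quoted above
BUFFER_BYTES = 512       # current size of boot_edd_info


def overrun(sector_bytes, buffer_bytes=BUFFER_BYTES):
    """Bytes written past the end of the buffer by a one-sector read."""
    return max(0, sector_bytes - buffer_bytes)


edd_bytes = EDD_ENTRIES * EDD_ENTRY_BYTES
print(edd_bytes)       # 492, which fits in the 512-byte buffer
print(overrun(512))    # 0
print(overrun(4096))   # 3584 bytes past the end if the BIOS transfers a 4k sector
```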

Do you, btw, really need both the size and alignment increases?

Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-08 13:18                                         ` Jan Beulich
@ 2017-06-08 13:24                                           ` Paul Durrant
  2017-06-09 12:19                                           ` Paul Durrant
  1 sibling, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-08 13:24 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 June 2017 14:19
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>;
> 'BorisOstrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 08.06.17 at 14:42, <Paul.Durrant@citrix.com> wrote:
> > For those following this...
> >
> > By poking characters at the screen and bisecting where they stopped, I
> have
> > narrowed the problem to the code in edd.S. I can successfully boot by
> setting
> > opt_edd=off on the Xen cmd line and I can also boot with the following
> patch
> > applied:
> >
> > diff --git a/xen/arch/x86/boot/edd.S b/xen/arch/x86/boot/edd.S
> > index 73371f98b5..5409f1d9a1 100644
> > --- a/xen/arch/x86/boot/edd.S
> > +++ b/xen/arch/x86/boot/edd.S
> > @@ -148,5 +148,6 @@ GLOBAL(boot_mbr_signature_nr)
> >          .byte   0
> >  GLOBAL(boot_mbr_signature)
> >          .fill   EDD_MBR_SIG_MAX*8,1,0
> > +       .align  4096
> >  GLOBAL(boot_edd_info)
> > -        .fill   512,1,0                         # big enough for a disc
> > sector
> > +        .fill   4096,1,0                         # big enough for a disc
> > sector
> >
> > (based on a hunch that the BIOS defaults to a 4K sector for my NVMe drive)
> >
> > I need to investigate some more but I do wonder whether the EDD info
> should
> > be read first to determine the appropriate size of memory buffer to use
> when
> > issuing the read of the MBR. Hardcoding a 4k reservation seems like the
> wrong
> > thing to do, even if it is sufficient for this BIOS.
> 
> boot_edd_info is being used for two things - reading the MBR of
> each disk and storing data retrieved from INT 13 Fn 41 and 48.
> The latter occupies 492 bytes (6 times 8+74). Which would make
> me guess the system has a 4k disk, and the BIOS doesn't abstract
> away this characteristic when handling INT 13 Fn 02 (which is
> supposed to only act in multiples of 512-byte sectors, as opposed
> to Fn 42).
> 
> The alternative of Fn 48 overflowing its buffer would seem less
> likely, especially with the buffer holding a size on input.

Yes, I tested with edd=skipmbr on the command line (and no patch applied) and the system booted, so it's definitely the MBR read that is at fault.

> 
> Do you, btw, really need both the size and alignment increases?
> 

At first I tried just increasing the .fill to 4096 but that did not seem to work. I have not found anything that says int13 0x2 buffers need to be aligned... but with the BIOS being buggy in this respect, I guess it could easily require that.
I'm just testing some more code to try to see exactly how much memory the MBR read scribbles on.

  Paul

> Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-08 13:18                                         ` Jan Beulich
  2017-06-08 13:24                                           ` Paul Durrant
@ 2017-06-09 12:19                                           ` Paul Durrant
  2017-06-09 13:05                                             ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-09 12:19 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 June 2017 14:19
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>;
> 'BorisOstrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 08.06.17 at 14:42, <Paul.Durrant@citrix.com> wrote:
> > For those following this...
> >
> > By poking characters at the screen and bisecting where they stopped, I
> have
> > narrowed the problem to the code in edd.S. I can successfully boot by
> setting
> > opt_edd=off on the Xen cmd line and I can also boot with the following
> patch
> > applied:
> >
> > diff --git a/xen/arch/x86/boot/edd.S b/xen/arch/x86/boot/edd.S
> > index 73371f98b5..5409f1d9a1 100644
> > --- a/xen/arch/x86/boot/edd.S
> > +++ b/xen/arch/x86/boot/edd.S
> > @@ -148,5 +148,6 @@ GLOBAL(boot_mbr_signature_nr)
> >          .byte   0
> >  GLOBAL(boot_mbr_signature)
> >          .fill   EDD_MBR_SIG_MAX*8,1,0
> > +       .align  4096
> >  GLOBAL(boot_edd_info)
> > -        .fill   512,1,0                         # big enough for a disc
> > sector
> > +        .fill   4096,1,0                         # big enough for a disc
> > sector
> >
> > (based on a hunch that the BIOS defaults to a 4K sector for my NVMe drive)
> >
> > I need to investigate some more but I do wonder whether the EDD info
> should
> > be read first to determine the appropriate size of memory buffer to use
> when
> > issuing the read of the MBR. Hardcoding a 4k reservation seems like the
> wrong
> > thing to do, even if it is sufficient for this BIOS.
> 
> boot_edd_info is being used for two things - reading the MBR of
> each disk and storing data retrieved from INT 13 Fn 41 and 48.
> The latter occupies 492 bytes (6 times 8+74). Which would make
> me guess the system has a 4k disk, and the BIOS doesn't abstract
> away this characteristic when handling INT 13 Fn 02 (which is
> supposed to only act in multiples of 512-byte sectors, as opposed
> to Fn 42).
> 
> The alternative of Fn 48 overflowing its buffer would seem less
> likely, especially with the buffer holding a size on input.
> 
> Do you, btw, really need both the size and alignment increases?
> 

More investigation has characterized the problem a little more but I still don't understand precisely what is happening. The trampoline code sets %ds to 0x8600 and the image is loaded at offset 0 in that segment, i.e. it is located at 0x86000. The boot_edd_info 512 byte range ends up spanning the 0x87000 boundary and when this area is used for reading the MBR I see the lock-up. If I insert some bytes sufficient to push boot_edd_info up to or beyond 0x87000 then the system boots and I have verified that bytes located immediately before or after boot_edd_info are not scribbled on by the int13 call. However, if I arrange for boot_edd_info to be located even just one byte below 0x87000 then the system again fails to boot.
I am now attempting to grab some memory below the trampoline, pattern fill it and then try to figure out if there is any collateral damage from the int13 that is further afield from the actual value in es:bx, but all this has got me wondering why Xen bothers to read the MBR, or the EDD info for that matter? EDD or MBR signatures are returned by the XENPF_firmware_info hypercall, and Linux does seem to have code called early on in xen_start_kernel() that does make such hypercalls, but it also appears to be able to boot happily if I put edd=off on my Xen command line, so is this code really necessary?
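
The address arithmetic above can be checked with a small C sketch. This is only an illustration, not Xen code: the helper names linear_addr and crosses_boundary are hypothetical, and it assumes plain real-mode addressing (segment shifted left by four, plus offset) and a 512-byte buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Real-mode linear address: (segment << 4) + offset. */
static uint32_t linear_addr(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/*
 * True if the len-byte range starting at `start` spans a multiple of
 * `boundary` (assumed to be a power of two, with len > 0).
 */
static bool crosses_boundary(uint32_t start, uint32_t len, uint32_t boundary)
{
    uint32_t last = start + len - 1;

    return (start & ~(boundary - 1)) != (last & ~(boundary - 1));
}
```

With a segment of 0x8600 and offset 0 this gives 0x86000. A 512-byte buffer starting one byte below 0x87000 crosses the boundary, while one starting at 0x87000 does not.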

  Paul

> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-09 12:19                                           ` Paul Durrant
@ 2017-06-09 13:05                                             ` Jan Beulich
  2017-06-09 13:52                                               ` Boris Ostrovsky
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-09 13:05 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'BorisOstrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 09.06.17 at 14:19, <Paul.Durrant@citrix.com> wrote:
> ..., but all this has 
> got me wondering why Xen bothers to read the MBR, or the EDD info for that 
> matter? EDD or MBR signatures are returned by the XENPF_firmware_info 
> hypercall, and Linux does seem to have code called early on in 
> xen_start_kernel() that does make such hypercalls, but it also appears to be 
> able to boot happily if I put edd=off on my Xen command line, so is this code 
> really necessary?

Well, that's a question to the Linux folks. I would guess there's
management code around wanting that info, but I'm not sure. Us
doing this is simply because of Linux wanting it and having no
other way to get at least some of this information (it could surely
read the MBRs, but it wouldn't be able to associate them with
BIOS drive numbers used for the other EDD information obtained).

Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-09 13:05                                             ` Jan Beulich
@ 2017-06-09 13:52                                               ` Boris Ostrovsky
  2017-06-09 15:14                                                 ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Boris Ostrovsky @ 2017-06-09 13:52 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	xen-devel(xen-devel@lists.xenproject.org)

On 06/09/2017 09:05 AM, Jan Beulich wrote:
>>>> On 09.06.17 at 14:19, <Paul.Durrant@citrix.com> wrote:
>> ..., but all this has 
>> got me wondering why Xen bothers to read the MBR, or the EDD info for that 
>> matter? EDD or MBR signatures are returned by the XENPF_firmware_info 
>> hypercall, and Linux does seem to have code called early on in 
>> xen_start_kernel() that does make such hypercalls, but it also appears to be 
>> able to boot happily if I put edd=off on my Xen command line, so is this code 
>> really necessary?
> Well, that's a question to the Linux folks. I would guess there's
> management code around wanting that info, but I'm not sure. Us
> doing this is simply because of Linux wanting it and having no
> other way to get at least some of this information (it could surely
> read the MBRs, but it wouldn't be able to associate them with
> BIOS drive numbers used for the other EDD information obtained).

Not sure what it is for. Perhaps there are some tools that poke into sysfs?


commit 96f28bc66adb1414cfc9405ff80cfffdc44edd84
Author: David Vrabel <david.vrabel@citrix.com>
Date:   Wed Apr 3 17:31:50 2013 +0100

    x86/xen: populate boot_params with EDD data
   
    During early setup of a dom0 kernel, populate boot_params with the
    Enhanced Disk Drive (EDD) and MBR signature data.  This makes
    information on the BIOS boot device available in /sys/firmware/edd/.
   
    Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    Acked-by: Jan Beulich <jbeulich@suse.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-09 13:52                                               ` Boris Ostrovsky
@ 2017-06-09 15:14                                                 ` Paul Durrant
  2017-06-09 15:41                                                   ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-09 15:14 UTC (permalink / raw)
  To: 'Boris Ostrovsky', Jan Beulich
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> Sent: 09 June 2017 14:52
> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; Juergen
> Gross <jgross@suse.com>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> On 06/09/2017 09:05 AM, Jan Beulich wrote:
> >>>> On 09.06.17 at 14:19, <Paul.Durrant@citrix.com> wrote:
> >> ..., but all this has
> >> got me wondering why Xen bothers to read the MBR, or the EDD info for
> that
> >> matter? EDD or MBR signatures are returned by the
> XENPF_firmware_info
> >> hypercall, and Linux does seem to have code called early on in
> >> xen_start_kernel() that does make such hypercalls, but it also appears to
> be
> >> able to boot happily if I put edd=off on my Xen command line, so is this
> code
> >> really necessary?
> > Well, that's a question to the Linux folks. I would guess there's
> > management code around wanting that info, but I'm not sure. Us
> > doing this is simply because of Linux wanting it and having no
> > other way to get at least some of this information (it could surely
> > read the MBRs, but it wouldn't be able to associate them with
> > BIOS drive numbers used for the other EDD information obtained).
> 
> Not sure what it is for. Perhaps there are some tools that poke into sysfs?
> 
> 
> commit 96f28bc66adb1414cfc9405ff80cfffdc44edd84
> Author: David Vrabel <david.vrabel@citrix.com>
> Date:   Wed Apr 3 17:31:50 2013 +0100
> 
>     x86/xen: populate boot_params with EDD data
> 
>     During early setup of a dom0 kernel, populate boot_params with the
>     Enhanced Disk Drive (EDD) and MBR signature data.  This makes
>     information on the BIOS boot device available in /sys/firmware/edd/.
> 
>     Signed-off-by: David Vrabel <david.vrabel@citrix.com>
>     Acked-by: Jan Beulich <jbeulich@suse.com>
>     Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Interesting. The Xen side of things seems to have been there forever:

commit 79e96982cade240531d7d84fa5b966b2b64c04af
Author: kfraser@localhost.localdomain <kfraser@localhost.localdomain>
Date:   Tue Jun 12 14:03:09 2007 +0100

    x86: Gather BIOS EDD info during boot.
    Still needs plumbing to dom0.
    Signed-off-by: Keir Fraser <keir@xensource.com>

I've characterised the issue some more and it appears to be an overflow inside the int13 handler if es:bx is less than 512 bytes below a 4k boundary. I modified the code to use a hardcoded segment, which I set at 0x6000, and all values of bx up to 0xe00 resulted in a good MBR signature. Values above 0xe00 but below 0xe20 resulted in the buffer not being identified as a valid MBR (I guess because the 0xAA55 fell off) and values of bx above 0xe20 resulted in either a hang (sometimes with a black screen) or a reboot.
This led me to believe that backing out all my debug code and adding a '.align 512' just before the definition of boot_edd_info should result in a successful boot. Alas this appears not to be the case... I seem to need at least 2k alignment. I wonder whether it may be more robust to go for 4k alignment though.
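
One way to line these observations up with the 4k boundary is to compute how many bytes of a 512-byte read starting at es:bx would land past it. The sketch below is only an illustration, assuming the segment base is 4k aligned (as 0x6000 is); bytes_past_4k is a hypothetical helper, not anything from Xen or the BIOS.

```c
#include <stdint.h>

#define SECTOR_SIZE  512u
#define PAGE_SIZE_4K 4096u

/*
 * Number of bytes of a SECTOR_SIZE read at offset bx (within a 4k-aligned
 * segment) that fall beyond the next 4k boundary. Zero means the buffer
 * fits entirely below the boundary.
 */
static uint32_t bytes_past_4k(uint16_t bx)
{
    uint32_t start = bx;
    uint32_t end = start + SECTOR_SIZE;   /* one past the last byte */
    uint32_t next = (start / PAGE_SIZE_4K + 1) * PAGE_SIZE_4K;

    return end > next ? end - next : 0;
}
```

bx = 0xe00 gives 0 (the last good value), bx = 0xe01 already puts the final byte of the 0xAA55 signature past the boundary, and bx = 0xe20 spills 32 bytes. Any 512-aligned bx gives 0, so this simple model on its own does not account for '.align 512' being insufficient.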

  Paul



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-09 15:14                                                 ` Paul Durrant
@ 2017-06-09 15:41                                                   ` Jan Beulich
  2017-06-09 15:47                                                     ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-09 15:41 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 09.06.17 at 17:14, <Paul.Durrant@citrix.com> wrote:
> I've characterised the issue some more and it appears to be an overflow 
> inside the int13 handler if es:bx is less than 512 bytes below a 4k boundary. 
> I modified the code to use a hardcoded segment, which I set at 0x6000, and 
> all values of bx up to 0xe00 resulted in a good MBR signature. Values above 
> 0xe00 but below 0xe20 resulted in the buffer not being identified as a valid 
> MBR (I guess because the 0xAA55 fell off) and values of bx above 0xe20 
> resulted in either a hang (sometimes with a black screen) or a reboot.
> This led me to believe that backing out all my debug code and adding a 
> '.align 512' just before the definition of boot_edd_info should result in a 
> successful boot. Alas this appears not to be the case... I seem to need at 
> least 2k alignment. I wonder whether it may be more robust to go for 4k 
> alignment though.

At least until we've seen (and merged) Jürgen's further trampoline
adjustments, we need to be careful with growing its overall size.
Memory below 1Mb is known to be scarce specifically on some EFI
systems, and we're currently still allocating space for all of the
trampoline instead of just its permanent part. Even on non-EFI
systems I'd prefer the trampoline to remain as small as possible.

With what you say about the requirements this buggy BIOS has
I wonder whether we couldn't help ourselves by doing I/O to
other than boot_edd_info. Especially if we did the EDD stuff last
(rather than before video), a good portion of the boot time only
trampoline space will no longer be needed.

Otoh I wonder whether a system this buggy shouldn't be declared
unusable (until a suitable BIOS update becomes available). Did
you check what constraints Linux places on the buffer used for
I/O? IOW can you judge whether bare metal Linux just happens
to work (just like older Xen did), or has been fixed to cope with
such a situation?

Jan


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-09 15:41                                                   ` Jan Beulich
@ 2017-06-09 15:47                                                     ` Paul Durrant
  2017-06-09 15:58                                                       ` Jan Beulich
  2017-06-12  8:14                                                       ` Paul Durrant
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-09 15:47 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 June 2017 16:41
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 09.06.17 at 17:14, <Paul.Durrant@citrix.com> wrote:
> > I've characterised the issue some more and it appears to be an overflow
> > inside the int13 handler if es:bx is less than 512 bytes below a 4k boundary.
> > I modified the code to use a hardcoded segment, which I set at 0x6000, and
> > all values of bx up to 0xe00 resulted in a good MBR signature. Values above
> > 0xe00 but below 0xe20 resulted in the buffer not being identified as a valid
> > MBR (I guess because the 0xAA55 fell off) and values of bx above 0xe20
> > resulted in either a hang (sometimes with a black screen) or a reboot.
> > This led me to believe that backing out all my debug code and adding a
> > '.align 512' just before the definition of boot_edd_info should result in a
> > successful boot. Alas this appears not to be the case... I seem to need at
> > least 2k alignment. I wonder whether it may be more robust to go for 4k
> > alignment though.
> 
> At least until we've seen (and merged) Jürgen's further trampoline
> adjustments, we need to be careful with growing its overall size.
> Memory below 1Mb is known to be scarce specifically on some EFI
> systems, and we're currently still allocating space for all of the
> trampoline instead of just its permanent part. Even on non-EFI
> systems I'd prefer the trampoline to remain as small as possible.
> 
> With what you say about the requirements this buggy BIOS has
> I wonder whether we couldn't help ourselves by doing I/O to
> other than boot_edd_info. Especially if we did the EDD stuff last
> (rather than before video), a good portion of the boot time only
> trampoline space will no longer be needed.

I think that would be sensible, but I was looking for the simplest fix/workaround possible for 4.9 and setting the alignment seems to do it.

> 
> Otoh I wonder whether a system this buggy shouldn't be declared
> unusable (until a suitable BIOS update becomes available). Did
> you check what constraints Linux places on the buffer used for
> I/O? IOW can you judge whether bare metal Linux just happens
> to work (just like older Xen did), or has been fixed to cope with
> such a situation?
> 

I'll go have a look at the Linux EDD code. I'm also trying a BIOS update (which is proving to be trickier than I thought as it seems to have killed networking in some weird way).

  Paul

> Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-09 15:47                                                     ` Paul Durrant
@ 2017-06-09 15:58                                                       ` Jan Beulich
  2017-06-12  8:14                                                       ` Paul Durrant
  1 sibling, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2017-06-09 15:58 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, AndrewCooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 09.06.17 at 17:47, <Paul.Durrant@citrix.com> wrote:
> I'll go have a look at the Linux EDD code. I'm also trying a BIOS update 
> (which is proving to be trickier than I thought as it seems to have killed 
> networking in some weird way).

Speaks for the quality of what that vendor delivers...

Jan



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-09 15:47                                                     ` Paul Durrant
  2017-06-09 15:58                                                       ` Jan Beulich
@ 2017-06-12  8:14                                                       ` Paul Durrant
  2017-06-12 10:40                                                         ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-12  8:14 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Paul Durrant
> Sent: 09 June 2017 16:47
> To: 'Jan Beulich' <JBeulich@suse.com>
> Cc: Juergen Gross <jgross@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Julien Grall (julien.grall@arm.com)
> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>;
> xen-devel(xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 09 June 2017 16:41
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> > Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> > devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> > Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> > <jgross@suse.com>
> > Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >
> > >>> On 09.06.17 at 17:14, <Paul.Durrant@citrix.com> wrote:
> > > I've characterised the issue some more and it appears to be an overflow
> > > inside the int13 handler if es:bx is less than 512 bytes below a 4k
> boundary.
> > > I modified the code to use a hardcoded segment, which I set at 0x6000,
> and
> > > all values of bx up to 0xe00 resulted in a good MBR signature. Values
> above
> > > 0xe00 but below 0xe20 resulted in the buffer not being identified as a
> valid
> > > MBR (I guess because the 0xAA55 fell off) and values of bx above 0xe20
> > > resulted in either a hang (sometimes with a black screen) or a reboot.
> > > This led me to believe that backing out all my debug code and adding a
> > > '.align 512' just before the definition of boot_edd_info should result in a
> > > successful boot. Alas this appears not to be the case... I seem to need at
> > > least 2k alignment. I wonder whether it may be more robust to go for 4k
> > > alignment though.
> >
> > At least until we've seen (and merged) Jürgen's further trampoline
> > adjustments, we need to be careful with growing its overall size.
> > Memory below 1Mb is known to be scarce specifically on some EFI
> > systems, and we're currently still allocating space for all of the
> > trampoline instead of just its permanent part. Even on non-EFI
> > systems I'd prefer the trampoline to remain as small as possible.
> >
> > With what you say about the requirements this buggy BIOS has
> > I wonder whether we couldn't help ourselves by doing I/O to
> > other than boot_edd_info. Especially if we did the EDD stuff last
> > (rather than before video), a good portion of the boot time only
> > trampoline space will no longer be needed.
> 
> I think that would be sensible, but I was looking for the simplest
> fix/workaround possible for 4.9 and setting the alignment seems to do it.
> 
> >
> > Otoh I wonder whether a system this buggy shouldn't be declared
> > unusable (until a suitable BIOS update becomes available). Did
> > you check what constraints Linux places on the buffer used for
> > I/O? IOW can you judge whether bare metal Linux just happens
> > to work (just like older Xen did), or has been fixed to cope with
> > such a situation?
> >
> 
> I'll go have a look at the Linux EDD code. I'm also trying a BIOS update (which
> is proving to be trickier than I thought as it seems to have killed networking in
> some weird way).

Looking at the code in arch/x86/boot/edd.c in Linux, it sector aligns the buffer into which it reads the MBR and the sector size is pulled from the EDD which means, I believe, that the MBR read on the skull canyon would be 4k aligned.
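
To illustrate what sector-aligning a buffer means here, a minimal C sketch follows. It is not the code from arch/x86/boot/edd.c; align_up and the example addresses are hypothetical, and it assumes the sector size is a power of two, with 512 used as a fallback when the reported size is zero.

```c
#include <stdint.h>

/*
 * Round addr up to the next multiple of sector_size. sector_size is
 * assumed to be a power of two; a reported size of 0 falls back to 512.
 */
static uint32_t align_up(uint32_t addr, uint32_t sector_size)
{
    if (sector_size == 0)
        sector_size = 512;

    return (addr + sector_size - 1) & ~(sector_size - 1);
}
```

With a 4096-byte sector size, a buffer pointer of 0x86e10 would be rounded up to 0x87000, i.e. onto a 4k boundary.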

What do you think is best to do for Xen 4.9? Hardcoding a 4k alignment is clearly easy and would work around this BIOS issue but, as you say, it does grow the image. Reverting Juergen's patch also works round the issue, but that is more by luck. Re-working the code is preferable, but I guess it's too late to introduce such code-churn in 4.9.

  Paul

> 
>   Paul
> 
> > Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12  8:14                                                       ` Paul Durrant
@ 2017-06-12 10:40                                                         ` Jan Beulich
  2017-06-12 10:44                                                           ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-12 10:40 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]

>>> On 12.06.17 at 10:14, <Paul.Durrant@citrix.com> wrote:
> Looking at the code in arch/x86/boot/edd.c in Linux, it sector aligns the 
> buffer into which it reads the MBR and the sector size is pulled from the EDD 
> which means, I believe, that the MBR read on the skull canyon would be 4k 
> aligned.
> 
> What do you think is best to do for Xen 4.9? Hardcoding a 4k alignment is 
> clearly easy and would work around this BIOS issue but, as you say, it does 
> grow the image. Reverting Juergen's patch also works round the issue, but 
> that is more by luck. Re-working the code is preferable, but I guess it's too 
> late to introduce such code-churn in 4.9.

Reverting Jürgen's code is out of the question with all the information
you've gathered by now. I think re-working the EDD code slightly
is the best option. Would you mind giving the attached patch a
try? This still slightly grows the trampoline due to a few more
instructions being needed, but should still be far better than
embedding a whole 4k buffer (and then later finding a BIOS/disk
combination which wants even more). Note that I've left a tiny
bit of debugging code in there.
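
The buffer placement the attached patch aims for can be sketched in C as below. This mirrors the idea only (buffer segment = trampoline segment minus the sector size in 16-byte paragraphs); mbr_buf_segment is a hypothetical name, the trampoline segment used in the note after the code is just an example value, and the fallback to 512 bytes for a zero paragraph count follows the default chosen in the patch.

```c
#include <stdint.h>

/*
 * Segment of a sector-sized buffer that ends exactly where the trampoline
 * (at segment tramp_seg, offset 0) begins. One paragraph is 16 bytes.
 */
static uint16_t mbr_buf_segment(uint16_t tramp_seg, uint16_t sector_size)
{
    uint16_t paragraphs = sector_size >> 4;

    if (paragraphs == 0)            /* unknown or implausible size */
        paragraphs = 512 >> 4;      /* default to a 512-byte sector */

    return (uint16_t)(tramp_seg - paragraphs);
}
```

For a trampoline at segment 0x8600, a 4096-byte sector gives segment 0x8500 (linear 0x85000), so the buffer is sector-size aligned whenever the trampoline itself is.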

Jan

[-- Attachment #2: x86-MBR-below-trampoline.patch --]
[-- Type: text/plain, Size: 7173 bytes --]


TODO: remove //temp-s

We place the trampoline no lower than at 256k, so we have ample space
to read the MBRs of BIOS disks into an aligned buffer right below the
trampoline (not doing so has been found to be a problem on a buggy BIOS
coming with a Skull Canyon NUC). To facilitate that move MBR reading
past EDD info retrieval.

Also add a wrap check to the EDD info retrieval loop, to match that in
the MBR reading one.

Reported-by: Paul Durrant <Paul.Durrant@citrix.com>
---
Using 512-byte sector size as default right now - perhaps worth
considering to use 4k instead. I'm also not sure whether we shouldn't
sanity check the sector size some more.

--- unstable.orig/xen/arch/x86/boot/edd.S	2017-02-09 14:28:18.000000000 +0100
+++ unstable/xen/arch/x86/boot/edd.S	2017-06-12 12:33:36.353082705 +0200
@@ -26,46 +26,6 @@
 get_edd:
         cmpb    $2, bootsym(opt_edd)            # edd=off ?
         je      edd_done
-        cmpb    $1, bootsym(opt_edd)            # edd=skipmbr ?
-        je      edd_start
-
-# Read the first sector of each BIOS disk device and store the 4-byte signature
-edd_mbr_sig_start:
-        movb    $0x80, %dl                      # from device 80
-        movw    $bootsym(boot_mbr_signature),%bx # store buffer ptr in bx
-edd_mbr_sig_read:
-        pushw   %bx
-        movb    $0x02, %ah                      # 0x02 Read Sectors
-        movb    $1, %al                         # read 1 sector
-        movb    $0, %dh                         # at head 0
-        movw    $1, %cx                         # cylinder 0, sector 0
-        pushw   %es
-        pushw   %ds
-        popw    %es
-        movw    $bootsym(boot_edd_info), %bx    # disk's data goes into info
-        pushw   %dx             # work around buggy BIOSes
-        stc                     # work around buggy BIOSes
-        int     $0x13
-        sti                     # work around buggy BIOSes
-        popw    %dx
-        popw    %es
-        popw    %bx
-        jc      edd_mbr_sig_done                # on failure, we're done.
-        cmpb    $0, %ah                         # some BIOSes do not set CF
-        jne     edd_mbr_sig_done                # on failure, we're done.
-        cmpw    $0xaa55, bootsym(boot_edd_info)+0x1fe
-        jne     .Ledd_mbr_sig_next
-        movl    bootsym(boot_edd_info)+EDD_MBR_SIG_OFFSET,%eax
-        movb    %dl, (%bx)                      # store BIOS drive number
-        movl    %eax, 4(%bx)                    # store signature from MBR
-        incb    bootsym(boot_mbr_signature_nr)  # note that we stored something
-        addw    $8, %bx                         # increment sig buffer ptr
-.Ledd_mbr_sig_next:
-        incb    %dl                             # increment to next device
-        jz      edd_mbr_sig_done
-        cmpb    $EDD_MBR_SIG_MAX,bootsym(boot_mbr_signature_nr)
-        jb      edd_mbr_sig_read
-edd_mbr_sig_done:
 
 # Do the BIOS Enhanced Disk Drive calls
 # This consists of two calls:
@@ -136,10 +96,72 @@ edd_legacy_done:
 
 edd_next:
         incb    %dl                             # increment to next device
+        jz      edd_done
         cmpb    $EDD_INFO_MAX,bootsym(boot_edd_info_nr)
         jb      edd_check_ext
 
 edd_done:
+        cmpb    $1, bootsym(opt_edd)            # edd=skipmbr ?
+        je      .Ledd_mbr_sig_skip
+
+# Read the first sector of each BIOS disk device and store the 4-byte signature
+.Ledd_mbr_sig_start:
+        pushw   %es
+        movb    $0x80, %dl                      # from device 80
+        movw    $bootsym(boot_mbr_signature), %bx # store buffer ptr in bx
+.Ledd_mbr_sig_read:
+        pushw   %bx
+        movw    $bootsym(boot_edd_info), %bx
+        movzbw  bootsym(boot_edd_info_nr), %cx
+        jcxz    .Ledd_mbr_sig_default
+.Ledd_mbr_sig_find_info:
+        cmpb    %dl, (%bx)
+        ja      .Ledd_mbr_sig_default
+        je      .Ledd_mbr_sig_get_size
+        add     $EDDEXTSIZE+EDDPARMSIZE, %bx
+        loop    .Ledd_mbr_sig_find_info
+.Ledd_mbr_sig_default:
+        movw    $(512 >> 4), %bx
+        jmp     .Ledd_mbr_sig_set_buf
+.Ledd_mbr_sig_get_size:
+        movw    EDDEXTSIZE+0x18(%bx), %bx       # sector size
+        shr     $4, %bx                         # convert to paragraphs
+        jz      .Ledd_mbr_sig_default
+.Ledd_mbr_sig_set_buf:
+        movw    %ds, %ax
+        subw    %bx, %ax                        # disk's data goes right ahead
+        movw    %ax, %es                        # of trampoline
+        xorw    %bx, %bx
+        movw    %bx, %es:0x1fe(%bx)             # clear BIOS magic just in case
+        pushw   %dx                             # work around buggy BIOSes
+        stc                                     # work around buggy BIOSes
+        movw    $0x0201, %ax                    # read 1 sector
+        movb    $0, %dh                         # at head 0
+        movw    $1, %cx                         # cylinder 0, sector 0
+        int     $0x13
+        sti                                     # work around buggy BIOSes
+        popw    %dx
+        movw    %es:0x1fe(%bx), %si
+        movl    %es:EDD_MBR_SIG_OFFSET(%bx), %ecx
+        popw    %bx
+        jc      .Ledd_mbr_sig_done              # on failure, we're done.
+        testb   %ah, %ah                        # some BIOSes do not set CF
+        jnz     .Ledd_mbr_sig_done              # on failure, we're done.
+        cmpw    $0xaa55, %si
+        jne     .Ledd_mbr_sig_next
+        movb    %dl, (%bx)                      # store BIOS drive number
+ movw %es,2(%bx)//temp
+        movl    %ecx, 4(%bx)                    # store signature from MBR
+        incb    bootsym(boot_mbr_signature_nr)  # note that we stored something
+        addw    $8, %bx                         # increment sig buffer ptr
+.Ledd_mbr_sig_next:
+        incb    %dl                             # increment to next device
+        jz      .Ledd_mbr_sig_done
+        cmpb    $EDD_MBR_SIG_MAX, bootsym(boot_mbr_signature_nr)
+        jb      .Ledd_mbr_sig_read
+.Ledd_mbr_sig_done:
+        popw    %es
+.Ledd_mbr_sig_skip:
         ret
 
 GLOBAL(boot_edd_info_nr)
@@ -149,4 +171,4 @@ GLOBAL(boot_mbr_signature_nr)
 GLOBAL(boot_mbr_signature)
         .fill   EDD_MBR_SIG_MAX*8,1,0
 GLOBAL(boot_edd_info)
-        .fill   512,1,0                         # big enough for a disc sector
+        .fill   EDD_INFO_MAX * (EDDEXTSIZE + EDDPARMSIZE), 1, 0
--- unstable.orig/xen/arch/x86/platform_hypercall.c	2015-07-20 14:49:38.000000000 +0200
+++ unstable/xen/arch/x86/platform_hypercall.c	2017-06-12 12:22:01.658928095 +0200
@@ -376,6 +376,7 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PA
                 break;
 
             sig = bootsym(boot_mbr_signature) + op->u.firmware_info.index;
+printk("MBR[%02x] @ %02x%02x (%04lx)\n", sig->device, sig->pad[2], sig->pad[1], trampoline_phys);//temp
 
             op->u.firmware_info.u.disk_mbr_signature.device = sig->device;
             op->u.firmware_info.u.disk_mbr_signature.mbr_signature =


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 10:40                                                         ` Jan Beulich
@ 2017-06-12 10:44                                                           ` Paul Durrant
  2017-06-12 10:53                                                             ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 10:44 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 June 2017 11:41
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 12.06.17 at 10:14, <Paul.Durrant@citrix.com> wrote:
> > Looking at the code in arch/x86/boot/edd.c in Linux, it sector-aligns the
> > buffer into which it reads the MBR, and the sector size is pulled from the
> > EDD, which means, I believe, that the MBR read on the skull canyon would be
> > 4k aligned.
> >
> > What do you think is best to do for Xen 4.9? Hardcoding a 4k alignment is
> > clearly easy and would work around this BIOS issue but, as you say, it does
> > grow the image. Reverting Juergen's patch also works round the issue, but
> > that is more by luck. Re-working the code is preferable, but I guess it's too
> > late to introduce such code-churn in 4.9.
> 
> Reverting Jürgen's code is out of the question with all the information
> you've gathered by now. I think re-working the EDD code slightly
> is the best option. Would you mind giving the attached patch a
> try? This still slightly grows the trampoline due to a few more
> instructions being needed, but should still be far better than
> embedding a whole 4k buffer (and then later finding a BIOS/disk
> combination which wants even more). Note that I've left a tiny
> bit of debugging code in there.
> 

Sure, I'll give that a go now.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 10:44                                                           ` Paul Durrant
@ 2017-06-12 10:53                                                             ` Paul Durrant
  2017-06-12 11:12                                                               ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 10:53 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
[snip]
> > >
> > > What do you think is best to do for Xen 4.9? Hardcoding a 4k
> > > alignment is clearly easy and would work around this BIOS issue
> > > but, as you say, it does grow the image. Reverting Juergen's
> > > patch also works round the issue, but that is more by luck.
> > > Re-working the code is preferable, but I guess it's too late to
> > > introduce such code-churn in 4.9.
> >
> > Reverting Jürgen's code is out of the question with all the information
> > you've gathered by now. I think re-working the EDD code slightly
> > is the best option. Would you mind giving the attached patch a
> > try? This still slightly grows the trampoline due to a few more
> > instructions being needed, but should still be far better than
> > embedding a whole 4k buffer (and then later finding a BIOS/disk
> > combination which wants even more). Note that I've left a tiny
> > bit of debugging code in there.
> >
> 
> Sure, I'll give that a go now.
> 

That worked fine:

(XEN) MBR[80] @ 85e0 (86000)

so you can add my Tested-by to that.

Thanks,

  Paul

>   Paul
> 
> > Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 10:53                                                             ` Paul Durrant
@ 2017-06-12 11:12                                                               ` Jan Beulich
  2017-06-12 12:05                                                                 ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-12 11:12 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 12.06.17 at 12:53, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
> [snip]
>> > >
>> > > What do you think is best to do for Xen 4.9? Hardcoding a 4k
>> > > alignment is clearly easy and would work around this BIOS issue
>> > > but, as you say, it does grow the image. Reverting Juergen's
>> > > patch also works round the issue, but that is more by luck.
>> > > Re-working the code is preferable, but I guess it's too late to
>> > > introduce such code-churn in 4.9.
>> >
>> > Reverting Jürgen's code is out of the question with all the information
>> > you've gathered by now. I think re-working the EDD code slightly
>> > is the best option. Would you mind giving the attached patch a
>> > try? This still slightly grows the trampoline due to a few more
>> > instructions being needed, but should still be far better than
>> > embedding a whole 4k buffer (and then later finding a BIOS/disk
>> > combination which wants even more). Note that I've left a tiny
>> > bit of debugging code in there.
>> >
>> 
>> Sure, I'll give that a go now.
>> 
> 
> That worked fine:
> 
> (XEN) MBR[80] @ 85e0 (86000)

But that's contrary to your earlier findings: Didn't you say simply
avoiding a 4k-boundary wasn't enough? And it certainly tells us
that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
sectors) - I was really expecting a larger gap between the two
logged values.
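
For reference, the two logged values can be related with a little
arithmetic. The sketch below assumes the first value (85e0) is a real-mode
segment, so that the buffer's physical address is the segment shifted left
by four bits. The helper names seg_to_phys and gap_below are illustrative
only and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the logged "85e0" is a real-mode segment value.
 * A real-mode segment maps to a physical address as segment * 16. */
uint32_t seg_to_phys(uint16_t seg)
{
    return (uint32_t)seg << 4;
}

/* Number of bytes between a buffer and the trampoline base above it. */
uint32_t gap_below(uint32_t buf_phys, uint32_t tramp_phys)
{
    assert(buf_phys <= tramp_phys);   /* buffer must sit below the base */
    return tramp_phys - buf_phys;
}
```

With the logged values this gives a buffer at 0x85e00, which is 0x200 (512)
bytes below the trampoline at 0x86000: room for one 512-byte sector rather
than a 4k one.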

> so you can add my Tested-by to that.

I.e. I'm not sure about this, as I'm still uncertain whether some
corruption didn't again occur. Of course APs coming up properly
would already be a relatively good sign (as now the permanent
part of the trampoline would be the predestined area for
corruption to occur in).

Jan


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 11:12                                                               ` Jan Beulich
@ 2017-06-12 12:05                                                                 ` Paul Durrant
  2017-06-12 12:25                                                                   ` Paul Durrant
  2017-06-12 13:54                                                                   ` Jan Beulich
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 12:05 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 June 2017 12:12
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 12.06.17 at 12:53, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> > [snip]
> >> > >
> >> > > What do you think is best to do for Xen 4.9? Hardcoding a 4k
> >> > > alignment is clearly easy and would work around this BIOS issue
> >> > > but, as you say, it does grow the image. Reverting Juergen's
> >> > > patch also works round the issue, but that is more by luck.
> >> > > Re-working the code is preferable, but I guess it's too late to
> >> > > introduce such code-churn in 4.9.
> >> >
> >> > Reverting Jürgen's code is out of the question with all the information
> >> > you've gathered by now. I think re-working the EDD code slightly
> >> > is the best option. Would you mind giving the attached patch a
> >> > try? This still slightly grows the trampoline due to a few more
> >> > instructions being needed, but should still be far better than
> >> > embedding a whole 4k buffer (and then later finding a BIOS/disk
> >> > combination which wants even more). Note that I've left a tiny
> >> > bit of debugging code in there.
> >> >
> >>
> >> Sure, I'll give that a go now.
> >>
> >
> > That worked fine:
> >
> > (XEN) MBR[80] @ 85e0 (86000)
> 
> But that's contrary to your earlier findings: Didn't you say simply
> avoiding a 4k-boundary wasn't enough? And it certainly tells us
> that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
> sectors) - I was really expecting a larger gap between the two
> logged values.
> 

I'll go dump out the edd and double check what it is saying.

My findings indicated that a read spanning a 4k boundary seemed to cause the problem, so using 0x85e00 would be safe. The anomaly was that simply aligning the edd_info buffer on a 512 byte boundary and continuing to use that for reading did not work.
 
> > so you can add my Tested-by to that.
> 
> I.e. I'm not sure about this, as I'm still uncertain whether some
> corruption didn't again occur. Of course APs coming up properly
> would already be a relatively good sign (as now the permanent
> part of the trampoline would be the predestined area for
> corruption to occur in).
> 

None of my findings ever indicated memory corruption (although there, of course, may have been some that I happened to miss), but rather misbehaviour of the int13 handler itself - either locking up, having odd effects (e.g. black screen), or both.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 12:05                                                                 ` Paul Durrant
@ 2017-06-12 12:25                                                                   ` Paul Durrant
  2017-06-12 13:54                                                                   ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 12:25 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Juergen Gross',
	Andrew Cooper, 'Julien Grall (julien.grall@arm.com)',
	'Boris Ostrovsky',
	'xen-devel(xen-devel@lists.xenproject.org)'

> -----Original Message-----
> From: Paul Durrant
> Sent: 12 June 2017 13:06
> To: 'Jan Beulich' <JBeulich@suse.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 12 June 2017 12:12
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> > Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> > devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> > Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> > <jgross@suse.com>
> > Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >
> > >>> On 12.06.17 at 12:53, <Paul.Durrant@citrix.com> wrote:
> > >>  -----Original Message-----
> > > [snip]
> > >> > >
> > >> > > What do you think is best to do for Xen 4.9? Hardcoding a 4k
> > >> > > alignment is clearly easy and would work around this BIOS issue
> > >> > > but, as you say, it does grow the image. Reverting Juergen's
> > >> > > patch also works round the issue, but that is more by luck.
> > >> > > Re-working the code is preferable, but I guess it's too late to
> > >> > > introduce such code-churn in 4.9.
> > >> >
> > >> > Reverting Jürgen's code is out of the question with all the information
> > >> > you've gathered by now. I think re-working the EDD code slightly
> > >> > is the best option. Would you mind giving the attached patch a
> > >> > try? This still slightly grows the trampoline due to a few more
> > >> > instructions being needed, but should still be far better than
> > >> > embedding a whole 4k buffer (and then later finding a BIOS/disk
> > >> > combination which wants even more). Note that I've left a tiny
> > >> > bit of debugging code in there.
> > >> >
> > >>
> > >> Sure, I'll give that a go now.
> > >>
> > >
> > > That worked fine:
> > >
> > > (XEN) MBR[80] @ 85e0 (86000)
> >
> > But that's contrary to your earlier findings: Didn't you say simply
> > avoiding a 4k-boundary wasn't enough? And it certainly tells us
> > that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
> > sectors) - I was really expecting a larger gap between the two
> > logged values.
> >
> 
> I'll go dump out the edd and double check what it is saying.
> 

I dumped a bit of the info:

(XEN) device 0x80 version 0x30
(XEN) number_of_sectors = 0x1dcf32b0
(XEN) sectors_per_track = 0x3f
(XEN) bytes_per_sector = 0x200

So it is indeed advertising a 512 byte sector. It is an SSD though so it'll be something much bigger underneath.
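
As a rough cross-check of the dump, the advertised geometry implies the drive capacity. This is a minimal sketch, assuming capacity is simply number_of_sectors times bytes_per_sector; the function name edd_capacity_bytes is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Capacity implied by the EDD parameters, in bytes.
 * Uses 64-bit arithmetic so large sector counts do not overflow. */
uint64_t edd_capacity_bytes(uint64_t number_of_sectors,
                            uint32_t bytes_per_sector)
{
    assert(bytes_per_sector != 0);
    return number_of_sectors * bytes_per_sector;
}
```

For the values above (0x1dcf32b0 sectors of 0x200 bytes) this works out to 256060514304 bytes, i.e. a nominal 256 GB drive presented with 512-byte logical sectors.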

  Paul

> My findings indicated that a read spanning a 4k boundary seemed to cause
> the problem, so using 0x85e00 would be safe. The anomaly was that simply
> aligning the edd_info buffer on a 512 byte boundary and continuing to use
> that for reading did not work.
> 
> > > so you can add my Tested-by to that.
> >
> > I.e. I'm not sure about this, as I'm still uncertain whether some
> > corruption didn't again occur. Of course APs coming up properly
> > would already be a relatively good sign (as now the permanent
> > part of the trampoline would be the predestined area for
> > corruption to occur in).
> >
> 
> None of my findings ever indicated memory corruption (although there, of
> course, may have been some that I happened to miss), but rather
> misbehaviour of the int13 handler itself - either locking up, having odd
> effects (e.g. black screen), or both.
> 
>   Paul
> 
> > Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 12:05                                                                 ` Paul Durrant
  2017-06-12 12:25                                                                   ` Paul Durrant
@ 2017-06-12 13:54                                                                   ` Jan Beulich
  2017-06-12 14:28                                                                     ` Paul Durrant
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-12 13:54 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 12.06.17 at 14:05, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 12 June 2017 12:12
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
>> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
>> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
>> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
>> <jgross@suse.com>
>> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>> 
>> >>> On 12.06.17 at 12:53, <Paul.Durrant@citrix.com> wrote:
>> >>  -----Original Message-----
>> > [snip]
>> >> > >
>> >> > > What do you think is best to do for Xen 4.9? Hardcoding a 4k
>> >> > > alignment is clearly easy and would work around this BIOS issue
>> >> > > but, as you say, it does grow the image. Reverting Juergen's
>> >> > > patch also works round the issue, but that is more by luck.
>> >> > > Re-working the code is preferable, but I guess it's too late to
>> >> > > introduce such code-churn in 4.9.
>> >> >
>> >> > Reverting Jürgen's code is out of the question with all the information
>> >> > you've gathered by now. I think re-working the EDD code slightly
>> >> > is the best option. Would you mind giving the attached patch a
>> >> > try? This still slightly grows the trampoline due to a few more
>> >> > instructions being needed, but should still be far better than
>> >> > embedding a whole 4k buffer (and then later finding a BIOS/disk
>> >> > combination which wants even more). Note that I've left a tiny
>> >> > bit of debugging code in there.
>> >> >
>> >>
>> >> Sure, I'll give that a go now.
>> >>
>> >
>> > That worked fine:
>> >
>> > (XEN) MBR[80] @ 85e0 (86000)
>> 
>> But that's contrary to your earlier findings: Didn't you say simply
>> avoiding a 4k-boundary wasn't enough? And it certainly tells us
>> that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
>> sectors) - I was really expecting a larger gap between the two
>> logged values.
>> 
> 
> I'll go dump out the edd and double check what it is saying.
> 
> My findings indicated that a read spanning a 4k boundary seemed to cause
> the problem, so using 0x85e00 would be safe. The anomaly was that simply
> aligning the edd_info buffer on a 512 byte boundary and continuing to use
> that for reading did not work.

But a 512-byte aligned 512-byte buffer can't possibly cross a page
boundary.

>> > so you can add my Tested-by to that.
>> 
>> I.e. I'm not sure about this, as I'm still uncertain whether some
>> corruption didn't again occur. Of course APs coming up properly
>> would already be a relatively good sign (as now the permanent
>> part of the trampoline would be the predestined area for
>> corruption to occur in).
>> 
> 
> None of my findings ever indicated memory corruption (although there, of 
> course, may have been some that I happened to miss), but rather misbehaviour 
> of the int13 handler itself - either locking up, having odd effects (e.g. 
> black screen), or both.

Ah, I hadn't understood it this way so far; I had instead assumed
that the handler did return, but corrupted our trampoline area in
one way or another.

Jan


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 13:54                                                                   ` Jan Beulich
@ 2017-06-12 14:28                                                                     ` Paul Durrant
  2017-06-12 14:43                                                                       ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 14:28 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 June 2017 14:55
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 12.06.17 at 14:05, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 12 June 2017 12:12
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> >> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> >> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> >> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> >> <jgross@suse.com>
> >> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>
> >> >>> On 12.06.17 at 12:53, <Paul.Durrant@citrix.com> wrote:
> >> >>  -----Original Message-----
> >> > [snip]
> >> >> > >
> >> >> > > What do you think is best to do for Xen 4.9? Hardcoding a 4k
> >> >> > > alignment is clearly easy and would work around this BIOS issue
> >> >> > > but, as you say, it does grow the image. Reverting Juergen's
> >> >> > > patch also works round the issue, but that is more by luck.
> >> >> > > Re-working the code is preferable, but I guess it's too late to
> >> >> > > introduce such code-churn in 4.9.
> >> >> >
> >> >> > Reverting Jürgen's code is out of the question with all the information
> >> >> > you've gathered by now. I think re-working the EDD code slightly
> >> >> > is the best option. Would you mind giving the attached patch a
> >> >> > try? This still slightly grows the trampoline due to a few more
> >> >> > instructions being needed, but should still be far better than
> >> >> > embedding a whole 4k buffer (and then later finding a BIOS/disk
> >> >> > combination which wants even more). Note that I've left a tiny
> >> >> > bit of debugging code in there.
> >> >> >
> >> >>
> >> >> Sure, I'll give that a go now.
> >> >>
> >> >
> >> > That worked fine:
> >> >
> >> > (XEN) MBR[80] @ 85e0 (86000)
> >>
> >> But that's contrary to your earlier findings: Didn't you say simply
> >> avoiding a 4k-boundary wasn't enough? And it certainly tells us
> >> that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
> >> sectors) - I was really expecting a larger gap between the two
> >> logged values.
> >>
> >
> > I'll go dump out the edd and double check what it is saying.
> >
> > My findings indicated that a read spanning a 4k boundary seemed to cause
> > the problem, so using 0x85e00 would be safe. The anomaly was that simply
> > aligning the edd_info buffer on a 512 byte boundary and continuing to use
> > that for reading did not work.
> 
> But a 512-byte aligned 512-byte buffer can't possibly cross a page
> boundary.

Indeed, which is why I was perplexed. I found that 0x60e00 was ok. Your patch chose 0x85e00, which was ok too, but for some reason a '.align 512' in front of boot_edd_info yielded an address which was not ok. I just checked what address that yielded though (by booting with edd=off to avoid the hang) and it was 0x86f40... which clearly means that '.align 512' is not doing what I thought it would do.
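
The three addresses can be checked mechanically. This is a minimal sketch, assuming a 512-byte read and a 4k (4096-byte) boundary; is_aligned and crosses_boundary are hypothetical helpers written for this illustration, not Xen code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align (align must be a power of two). */
bool is_aligned(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* True if the byte range [start, start + len) spans more than one
 * boundary-sized block, i.e. its first and last bytes lie in
 * different blocks. */
bool crosses_boundary(uint32_t start, uint32_t len, uint32_t boundary)
{
    assert(len > 0);
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    uint32_t last = start + len - 1;
    return (start / boundary) != (last / boundary);
}
```

0x60e00 and 0x85e00 are both 512-byte aligned and a 512-byte read from either stays within one 4k page, whereas 0x86f40 is only 16-byte aligned and a 512-byte read from it ends at 0x8713f, crossing the 4k boundary at 0x87000.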

  Paul

> 
> >> > so you can add my Tested-by to that.
> >>
> >> I.e. I'm not sure about this, as I'm still uncertain whether some
> >> corruption didn't again occur. Of course APs coming up properly
> >> would already be a relatively good sign (as now the permanent
> >> part of the trampoline would be the predestined area for
> >> corruption to occur in).
> >>
> >
> > None of my findings ever indicated memory corruption (although there, of
> > course, may have been some that I happened to miss), but rather
> > misbehaviour of the int13 handler itself - either locking up, having odd
> > effects (e.g. black screen), or both.
> 
> Ah, I hadn't understood it this way so far; I had instead assumed
> that the handler did return, but corrupted our trampoline area in
> one way or another.
> 
> Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 14:28                                                                     ` Paul Durrant
@ 2017-06-12 14:43                                                                       ` Paul Durrant
  2017-06-12 15:03                                                                         ` Paul Durrant
  2017-06-12 15:07                                                                         ` Jan Beulich
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 14:43 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Paul Durrant
> Sent: 12 June 2017 15:29
> To: 'Jan Beulich' <JBeulich@suse.com>
> Cc: Juergen Gross <jgross@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Julien Grall (julien.grall@arm.com)
> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>;
> xen-devel(xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 12 June 2017 14:55
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> > Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> > devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> > Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> > <jgross@suse.com>
> > Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >
> > >>> On 12.06.17 at 14:05, <Paul.Durrant@citrix.com> wrote:
> > >>  -----Original Message-----
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: 12 June 2017 12:12
> > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > >> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> > >> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> > >> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> > >> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> > >> <jgross@suse.com>
> > >> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> > >>
> > >> >>> On 12.06.17 at 12:53, <Paul.Durrant@citrix.com> wrote:
> > >> >>  -----Original Message-----
> > >> > [snip]
> > >> >> > >
> > >> >> > > What do you think is best to do for Xen 4.9? Hardcoding a 4k
> > >> >> > > alignment is clearly easy and would work around this BIOS issue
> > >> >> > > but, as you say, it does grow the image. Reverting Juergen's
> > >> >> > > patch also works round the issue, but that is more by luck.
> > >> >> > > Re-working the code is preferable, but I guess it's too late to
> > >> >> > > introduce such code-churn in 4.9.
> > >> >> >
> > >> >> > Reverting Jürgen's code is out of the question with all the information
> > >> >> > you've gathered by now. I think re-working the EDD code slightly
> > >> >> > is the best option. Would you mind giving the attached patch a
> > >> >> > try? This still slightly grows the trampoline due to a few more
> > >> >> > instructions being needed, but should still be far better than
> > >> >> > embedding a whole 4k buffer (and then later finding a BIOS/disk
> > >> >> > combination which wants even more). Note that I've left a tiny
> > >> >> > bit of debugging code in there.
> > >> >> >
> > >> >>
> > >> >> Sure, I'll give that a go now.
> > >> >>
> > >> >
> > >> > That worked fine:
> > >> >
> > >> > (XEN) MBR[80] @ 85e0 (86000)
> > >>
> > >> But that's contrary to your earlier findings: Didn't you say simply
> > >> avoiding a 4k-boundary wasn't enough? And it certainly tells us
> > >> that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
> > >> sectors) - I was really expecting a larger gap between the two
> > >> logged values.
> > >>
> > >
> > > I'll go dump out the edd and double check what it is saying.
> > >
> > > My findings indicated that a read spanning a 4k boundary seemed to cause
> > > the problem, so using 0x85e00 would be safe. The anomaly was that simply
> > > aligning the edd_info buffer on a 512 byte boundary and continuing to use
> > > that for reading did not work.
> >
> > But a 512-byte aligned 512-byte buffer can't possibly cross a page
> > boundary.
> 
> Indeed, which is why I was perplexed. I found that 0x60e00 was ok. Your
> patch chose 0x85e00, which was ok too, but for some reason a '.align 512' in
> front of boot_edd_info yielded an address which was not ok. I just checked
> what address that yielded though (by booting with edd=off to avoid the
> hang) and it was 0x86f40... which clearly means that '.align 512' is not doing
> what I thought it would do.

No, the problem turns out to be the GLOBAL() macro which, in assembly files, contains an implicit .align 16!

  Paul

> 
>   Paul
> 
> >
> > >> > so you can add my Tested-by to that.
> > >>
> > >> I.e. I'm not sure about this, as I'm still uncertain whether some
> > >> corruption didn't again occur. Of course APs coming up properly
> > >> would already be a relatively good sign (as now the permanent
> > >> part of the trampoline would be the predestined area for
> > >> corruption to occur in).
> > >>
> > >
> > > None of my findings ever indicated memory corruption (although there, of
> > > course, may have been some that I happened to miss), but rather
> > > misbehaviour of the int13 handler itself - either locking up, having odd
> > > effects (e.g. black screen), or both.
> >
> > Ah, I hadn't understood it this way so far; I had instead assumed
> > that the handler did return, but corrupted our trampoline area in
> > one way or another.
> >
> > Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 14:43                                                                       ` Paul Durrant
@ 2017-06-12 15:03                                                                         ` Paul Durrant
  2017-06-12 15:07                                                                         ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 15:03 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Paul Durrant
> Sent: 12 June 2017 15:43
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> <JBeulich@suse.com>
> Cc: Juergen Gross <jgross@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Julien Grall (julien.grall@arm.com)
> <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>;
> xen-devel(xen-devel@lists.xenproject.org) <xen-
> devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> > Paul Durrant
> > Sent: 12 June 2017 15:29
> > To: 'Jan Beulich' <JBeulich@suse.com>
> > Cc: Juergen Gross <jgross@suse.com>; Andrew Cooper
> > <Andrew.Cooper3@citrix.com>; Julien Grall (julien.grall@arm.com)
> > <julien.grall@arm.com>; 'Boris Ostrovsky' <boris.ostrovsky@oracle.com>;
> > xen-devel(xen-devel@lists.xenproject.org) <xen-
> > devel@lists.xenproject.org>
> > Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >
> > > -----Original Message-----
> > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > Sent: 12 June 2017 14:55
> > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> > > Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> > > devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> > > Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> > > <jgross@suse.com>
> > > Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> > >
> > > >>> On 12.06.17 at 14:05, <Paul.Durrant@citrix.com> wrote:
> > > >>  -----Original Message-----
> > > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > > >> Sent: 12 June 2017 12:12
> > > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > > >> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>;
> Andrew
> > > >> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> > > >> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> > > >> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> > > >> <jgross@suse.com>
> > > >> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> > > >>
> > > >> >>> On 12.06.17 at 12:53, <Paul.Durrant@citrix.com> wrote:
> > > >> >>  -----Original Message-----
> > > >> > [snip]
> > > >> >> > >
> > > >> >> > > What do you think it best to do for Xen 4.9? Hardcoding a 4k
> > > alignment
> > > >> is
> > > >> >> > > clearly easy and would work around this BIOS issue but, as you
> > say,
> > > it
> > > >> >> does
> > > >> >> > > grow the image. Reverting Juergen's patch also works round
> the
> > > issue,
> > > >> >> but
> > > >> >> > > that is more by luck. Re-working the code is preferable, but I
> > guess
> > > it's
> > > >> >> too
> > > >> >> > > late to introduce such code-churn in 4.9.
> > > >> >> >
> > > >> >> > Reverting Jürgen's code is out of question with all the information
> > > >> >> > you've gathered by now. I think re-working the EDD code slightly
> > > >> >> > is the best option. Would you mind giving the attached patch a
> > > >> >> > try? This still slightly grows the trampoline due to a few more
> > > >> >> > instructions being needed, but should still be far better than
> > > >> >> > embedding a whole 4k buffer (and then later finding a BIOS/disk
> > > >> >> > combination which wants even more). Note that I've left a tiny
> > > >> >> > bit of debugging code in there.
> > > >> >> >
> > > >> >>
> > > >> >> Sure, I'll give that a go now.
> > > >> >>
> > > >> >
> > > >> > That worked fine:
> > > >> >
> > > >> > (XEN) MBR[80] @ 85e0 (86000)
> > > >>
> > > >> But that's contrary to your earlier findings: Didn't you say simply
> > > >> avoiding a 4k-boundary wasn't enough? And it certainly tells us
> > > >> that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
> > > >> sectors) - I was really expecting a larger gap between the two
> > > >> logged values.
> > > >>
> > > >
> > > > I'll go dump out the edd and double check what it is saying.
> > > >
> > > > My findings indicated that the problem seemed to be doing a read that
> > > > spanned a 4k boundary caused a problem, so using 0x85e00 would be
> > safe.
> > > The
> > > > anomaly was that simply aligning the edd_info buffer and a 512 byte
> > > boundary
> > > > and continuing to use that for reading did not work.
> > >
> > > But a 512-byte aligned 512-byte buffer can't possibly cross a page
> > > boundary.
> >
> > Indeed, which is why I was perplexed. I found that 0x60e00 was ok. Your
> > patch chose 0x85e00, which was ok too, but for some reason a '.align 512' in
> > front of boot_edd_info yielded an address which was not ok. I just checked
> > what address that yielded though (by booting with edd=off to avoid the
> > hang) and it was 0x86f40... which clearly means that '.align 512' is not doing
> > what I thought it would do.
> 
> No, the problem turns out to be the GLOBAL() macro which, in assembly
> files, contains an implicit .align 16!
> 

No, I misread: it is ENTRY() that contains the implicit align.

It's clearly even more subtle. Running objdump tells me the symbol is indeed 512-byte aligned, but when it ends up in memory it's clearly not. So I guess it must be down to how the trampoline is loaded. Thus, not using a buffer within the trampoline image is most definitely the best idea.
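
A minimal sketch of the effect described above, assuming the trampoline is copied unchanged to a base address picked at boot. The base and offset values here are hypothetical, chosen only so that the result matches the 0x86f40 address reported earlier; they are not taken from a real build.

```python
def is_aligned(addr: int, alignment: int) -> bool:
    """Return True if addr is a multiple of alignment (a power of two)."""
    return addr & (alignment - 1) == 0


def relocated(base: int, offset: int) -> int:
    """Address of a symbol once the image has been copied to base."""
    return base + offset


# Hypothetical values, for illustration only.
OFFSET_IN_IMAGE = 0xE00  # 512-byte aligned inside the image (what objdump shows)
BASE_16 = 0x86140        # a load address that is only 16-byte aligned
BASE_4K = 0x86000        # a load address that is page aligned

for base in (BASE_16, BASE_4K):
    addr = relocated(base, OFFSET_IN_IMAGE)
    print(f"base {base:#x} -> buffer {addr:#x}, "
          f"512-byte aligned: {is_aligned(addr, 512)}")
```

With the 16-byte aligned base the buffer lands at 0x86f40 and loses its 512-byte alignment; with the page-aligned base it keeps it.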

  Paul

>   Paul
> 
> >
> >   Paul
> >
> > >
> > > >> > so you can add my Tested-by to that.
> > > >>
> > > >> I.e. I'm not sure about this, as I'm still uncertain whether some
> > > >> corruption didn't again occur. Of course APs coming up properly
> > > >> would already be a relatively good sign (as now the permanent
> > > >> part of the trampoline would be the predestined area for
> > > >> corruption to occur in).
> > > >>
> > > >
> > > > None of my findings ever indicated memory corruption (although
> there,
> > of
> > > > course, may have been some that I happened to miss), but rather
> > > misbehaviour
> > > > of the int13 handler itself - either locking up, having odd effects (e.g.
> > > > black screen), or both.
> > >
> > > Ah, I didn't understand it this way so far, and instead had implied
> > > that the handler did return, but corrupt our trampoline area in
> > > one way or another.
> > >
> > > Jan


* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 14:43                                                                       ` Paul Durrant
  2017-06-12 15:03                                                                         ` Paul Durrant
@ 2017-06-12 15:07                                                                         ` Jan Beulich
  2017-06-12 15:21                                                                           ` Paul Durrant
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2017-06-12 15:07 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

>>> On 12.06.17 at 16:43, <Paul.Durrant@citrix.com> wrote:
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>> Paul Durrant
>> Sent: 12 June 2017 15:29
>> > From: Jan Beulich [mailto:JBeulich@suse.com]
>> > Sent: 12 June 2017 14:55
>> > >> > That worked fine:
>> > >> >
>> > >> > (XEN) MBR[80] @ 85e0 (86000)
>> > >>
>> > >> But that's contrary to your earlier findings: Didn't you say simply
>> > >> avoiding a 4k-boundary wasn't enough? And it certainly tells us
>> > >> that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
>> > >> sectors) - I was really expecting a larger gap between the two
>> > >> logged values.
>> > >>
>> > >
>> > > I'll go dump out the edd and double check what it is saying.
>> > >
>> > > My findings indicated that the problem seemed to be doing a read that
>> > > spanned a 4k boundary caused a problem, so using 0x85e00 would be
>> safe.
>> > The
>> > > anomaly was that simply aligning the edd_info buffer and a 512 byte
>> > boundary
>> > > and continuing to use that for reading did not work.
>> >
>> > But a 512-byte aligned 512-byte buffer can't possibly cross a page
>> > boundary.
>> 
>> Indeed, which is why I was perplexed. I found that 0x60e00 was ok. Your
>> patch chose 0x85e00, which was ok too, but for some reason a '.align 512' in
>> front of boot_edd_info yielded an address which was not ok. I just checked
>> what address that yielded though (by booting with edd=off to avoid the
>> hang) and it was 0x86f40... which clearly means that '.align 512' is not doing
>> what I thought it would do.
> 
> No, the problem turns out to be the GLOBAL() macro which, in assembly files, 
> contains an implicit .align 16!

No, I don't think so - two successive .align directives don't have any
bad effect; the larger alignment simply wins. Instead I think you're
suffering from the copying of the trampoline to low memory: what is
aligned to a 512-byte boundary in the image won't necessarily still be
aligned in low memory, unless trampoline_start is also aligned at least
as much.

But with this likely having been the problem in your experiments I'm
not feeling sufficiently reassured to submit the patch "officially".
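
The remark about successive .align directives can be checked with a small model. This is only a sketch: it treats the assembler's location counter as a plain integer and assumes each .align rounds it up to a power-of-two byte boundary. The helper names are made up for the example.

```python
def align_up(pos: int, alignment: int) -> int:
    """Round pos up to the next multiple of alignment (a power of two)."""
    return (pos + alignment - 1) & ~(alignment - 1)


def apply_aligns(pos: int, alignments: list[int]) -> int:
    """Apply a sequence of .align-like directives to a location counter."""
    for alignment in alignments:
        pos = align_up(pos, alignment)
    return pos


# For every starting position in one page, an explicit 512-byte alignment
# combined with an implicit 16-byte one ends up 512-byte aligned, in either order.
same = all(
    apply_aligns(pos, [512, 16])
    == apply_aligns(pos, [16, 512])
    == align_up(pos, 512)
    for pos in range(4096)
)
print("larger alignment always wins:", same)
```

So an implicit 16-byte alignment in a macro cannot undo an explicit 512-byte one; the alignment can only be lost later, when the image is placed in memory.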

Jan




* Re: debian stretch dom0 + xen 4.9 fails to boot
  2017-06-12 15:07                                                                         ` Jan Beulich
@ 2017-06-12 15:21                                                                           ` Paul Durrant
  0 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2017-06-12 15:21 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Juergen Gross, Andrew Cooper, Julien Grall (julien.grall@arm.com),
	'Boris Ostrovsky',
	xen-devel(xen-devel@lists.xenproject.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 June 2017 16:08
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; xen-devel(xen-
> devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>; Juergen Gross
> <jgross@suse.com>
> Subject: RE: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> 
> >>> On 12.06.17 at 16:43, <Paul.Durrant@citrix.com> wrote:
> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> >> Paul Durrant
> >> Sent: 12 June 2017 15:29
> >> > From: Jan Beulich [mailto:JBeulich@suse.com]
> >> > Sent: 12 June 2017 14:55
> >> > >> > That worked fine:
> >> > >> >
> >> > >> > (XEN) MBR[80] @ 85e0 (86000)
> >> > >>
> >> > >> But that's contrary to your earlier findings: Didn't you say simply
> >> > >> avoiding a 4k-boundary wasn't enough? And it certainly tells us
> >> > >> that this isn't a 4k drive (or at least the BIOS doesn't surface 4k
> >> > >> sectors) - I was really expecting a larger gap between the two
> >> > >> logged values.
> >> > >>
> >> > >
> >> > > I'll go dump out the edd and double check what it is saying.
> >> > >
> >> > > My findings indicated that the problem seemed to be doing a read
> that
> >> > > spanned a 4k boundary caused a problem, so using 0x85e00 would be
> >> safe.
> >> > The
> >> > > anomaly was that simply aligning the edd_info buffer and a 512 byte
> >> > boundary
> >> > > and continuing to use that for reading did not work.
> >> >
> >> > But a 512-byte aligned 512-byte buffer can't possibly cross a page
> >> > boundary.
> >>
> >> Indeed, which is why I was perplexed. I found that 0x60e00 was ok. Your
> >> patch chose 0x85e00, which was ok too, but for some reason a '.align 512'
> in
> >> front of boot_edd_info yielded an address which was not ok. I just
> checked
> >> what address that yielded though (by booting with edd=off to avoid the
> >> hang) and it was 0x86f40... which clearly means that '.align 512' is not
> doing
> >> what I thought it would do.
> >
> > No, the problem turns out to be the GLOBAL() macro which, in assembly
> files,
> > contains an implicit .align 16!
> 
> No, I don't think so - two successive .align don't have any bad effect,
> the higher value will be it. Instead I think you're suffering from the
> copying of the trampoline space to low memory: What is aligned to a
> 512-byte boundary in the image won't necessarily be in low memory,
> unless trampoline_start is also aligned at least as much.
> 
> But with this likely having been the problem in your experiments I'm
> not feeling sufficiently reassured to submit the patch "officially".
> 

I see you submitted the patch.

I'm happy now because the anomaly in what I was seeing is explained. I was convinced that, at some stage, I had found that the image was 64k aligned in memory. I was clearly wrong.
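
For reference, the three buffer addresses quoted in this thread can be checked against the "read must not span a 4k boundary" observation. This sketch assumes a 512-byte sector read, and assumes the "85e0" in the debug output is a real-mode segment, which matches the 0x85e00 address mentioned above.

```python
SECTOR_SIZE = 512   # assumed size of the int13 read
PAGE_SIZE = 4096


def seg_to_linear(segment: int) -> int:
    """Convert a real-mode segment value to a linear address (offset 0)."""
    return segment << 4


def crosses_boundary(addr: int, length: int = SECTOR_SIZE,
                     boundary: int = PAGE_SIZE) -> bool:
    """True if [addr, addr + length) spans a multiple of boundary."""
    return addr // boundary != (addr + length - 1) // boundary


for addr in (0x60E00, seg_to_linear(0x85E0), 0x86F40):
    print(f"{addr:#x}: crosses 4k boundary = {crosses_boundary(addr)}")
```

Only 0x86f40 spans a page boundary (0x87000), which is consistent with it being the one address that hung.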

  Paul

> Jan
> 
> Jan




end of thread, other threads:[~2017-06-12 15:21 UTC | newest]

Thread overview: 57+ messages
2017-06-06 14:32 debian stretch dom0 + xen 4.9 fails to boot Paul Durrant
2017-06-06 15:11 ` Jan Beulich
2017-06-06 15:51   ` Paul Durrant
2017-06-06 16:28     ` Paul Durrant
2017-06-06 17:00       ` Boris Ostrovsky
2017-06-07  8:07         ` Jan Beulich
2017-06-07  8:09           ` Paul Durrant
2017-06-07  8:19             ` Paul Durrant
2017-06-07 14:05           ` Boris Ostrovsky
2017-06-07  8:07         ` Paul Durrant
2017-06-07  8:27           ` Jan Beulich
     [not found]           ` <5937D4FF02000078001602F6@suse.com>
2017-06-07  9:03             ` Juergen Gross
2017-06-07  9:05               ` Paul Durrant
2017-06-07  9:09                 ` Andrew Cooper
2017-06-07 10:36                   ` Paul Durrant
2017-06-07 11:06                     ` Paul Durrant
2017-06-07 11:57                       ` Juergen Gross
2017-06-07 12:02                         ` Paul Durrant
2017-06-07 12:13                           ` Juergen Gross
2017-06-07 12:19                           ` Jan Beulich
2017-06-07 12:26                             ` Paul Durrant
2017-06-07 12:34                               ` Jan Beulich
2017-06-07 11:50                     ` Jan Beulich
2017-06-07 11:55                       ` Paul Durrant
2017-06-07 12:00                         ` Jan Beulich
2017-06-07 12:46                           ` Paul Durrant
2017-06-07 12:55                             ` Jan Beulich
2017-06-07 15:06                               ` Paul Durrant
2017-06-07 15:33                                 ` Jan Beulich
2017-06-07 15:40                                   ` Paul Durrant
2017-06-07 15:52                                     ` Jan Beulich
2017-06-08 12:42                                       ` Paul Durrant
2017-06-08 12:46                                         ` Juergen Gross
2017-06-08 13:18                                         ` Jan Beulich
2017-06-08 13:24                                           ` Paul Durrant
2017-06-09 12:19                                           ` Paul Durrant
2017-06-09 13:05                                             ` Jan Beulich
2017-06-09 13:52                                               ` Boris Ostrovsky
2017-06-09 15:14                                                 ` Paul Durrant
2017-06-09 15:41                                                   ` Jan Beulich
2017-06-09 15:47                                                     ` Paul Durrant
2017-06-09 15:58                                                       ` Jan Beulich
2017-06-12  8:14                                                       ` Paul Durrant
2017-06-12 10:40                                                         ` Jan Beulich
2017-06-12 10:44                                                           ` Paul Durrant
2017-06-12 10:53                                                             ` Paul Durrant
2017-06-12 11:12                                                               ` Jan Beulich
2017-06-12 12:05                                                                 ` Paul Durrant
2017-06-12 12:25                                                                   ` Paul Durrant
2017-06-12 13:54                                                                   ` Jan Beulich
2017-06-12 14:28                                                                     ` Paul Durrant
2017-06-12 14:43                                                                       ` Paul Durrant
2017-06-12 15:03                                                                         ` Paul Durrant
2017-06-12 15:07                                                                         ` Jan Beulich
2017-06-12 15:21                                                                           ` Paul Durrant
2017-06-06 17:40     ` Julien Grall
2017-06-07  8:05       ` Paul Durrant
