xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement
@ 2020-03-03 11:52 Roger Pau Monne
  2020-03-03 15:40 ` Jan Beulich
  2020-03-03 18:47 ` Andrew Cooper
  0 siblings, 2 replies; 7+ messages in thread
From: Roger Pau Monne @ 2020-03-03 11:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

Don't assume there's going to be enough space at the tail of the
loaded kernel and instead try to find a suitable memory area where the
initrd and metadata can be loaded.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Calculate end of e820 entry earlier.
 - Only check if the end of the region is < 1MB.
 - Check for range overlaps with the kernel region.
 - Check the region is of type RAM.
 - Fix off-by-one checks in range overlaps.
 - Add a comment about why initrd and metadata is placed together.
 - Add parentheses around size calculations.
---
 xen/arch/x86/hvm/dom0_build.c | 58 ++++++++++++++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index eded87eaf5..33520ec1bc 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -490,6 +490,45 @@ static int __init pvh_populate_p2m(struct domain *d)
 #undef MB1_PAGES
 }
 
+static paddr_t find_memory(const struct domain *d, const struct elf_binary *elf,
+                           size_t size)
+{
+    paddr_t kernel_start = (paddr_t)elf->dest_base;
+    paddr_t kernel_end = (paddr_t)(elf->dest_base + elf->dest_size);
+    unsigned int i;
+
+    for ( i = 0; i < d->arch.nr_e820; i++ )
+    {
+        paddr_t start, end = d->arch.e820[i].addr + d->arch.e820[i].size;
+
+        /* Don't use memory below 1MB, as it could overwrite the BDA/EBDA. */
+        if ( end <= MB(1) || d->arch.e820[i].type != E820_RAM )
+            continue;
+
+        start = MAX(ROUNDUP(d->arch.e820[i].addr, PAGE_SIZE), MB(1));
+
+        if ( end <= kernel_start || start >= kernel_end )
+            ; /* No overlap, nothing to do. */
+        /* Deal with the kernel already being loaded in the region. */
+        else if ( kernel_start <= start && kernel_end > start )
+            /* Truncate the start of the region. */
+            start = ROUNDUP(kernel_end, PAGE_SIZE);
+        else if ( kernel_start <= end && kernel_end > end )
+            /* Truncate the end of the region. */
+            end = kernel_start;
+        /* Pick the biggest of the split regions. */
+        else if ( kernel_start - start > end - kernel_end )
+            end = kernel_start;
+        else
+            start = ROUNDUP(kernel_end, PAGE_SIZE);
+
+        if ( end - start >= size )
+            return start;
+    }
+
+    return INVALID_PADDR;
+}
+
 static int __init pvh_load_kernel(struct domain *d, const module_t *image,
                                   unsigned long image_headroom,
                                   module_t *initrd, void *image_base,
@@ -546,7 +585,24 @@ static int __init pvh_load_kernel(struct domain *d, const module_t *image,
         return rc;
     }
 
-    last_addr = ROUNDUP(parms.virt_kend - parms.virt_base, PAGE_SIZE);
+    /*
+     * Find a RAM region big enough (and that doesn't overlap with the loaded
+     * kernel) in order to load the initrd and the metadata. Note it could be
+     * split into smaller allocations, done it as a single region in order to
+     * simplify it.
+     */
+    last_addr = find_memory(d, &elf, sizeof(start_info) +
+                            (initrd ? ROUNDUP(initrd->mod_end, PAGE_SIZE) +
+                                      sizeof(mod)
+                                    : 0) +
+                            (cmdline ? ROUNDUP(strlen(cmdline) + 1,
+                                               elf_64bit(&elf) ? 8 : 4)
+                                     : 0));
+    if ( last_addr == INVALID_PADDR )
+    {
+        printk("Unable to find a memory region to load initrd and metadata\n");
+        return -ENOMEM;
+    }
 
     if ( initrd != NULL )
     {
-- 
2.25.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement
  2020-03-03 11:52 [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement Roger Pau Monne
@ 2020-03-03 15:40 ` Jan Beulich
  2020-03-04  9:53   ` Roger Pau Monné
  2020-03-03 18:47 ` Andrew Cooper
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2020-03-03 15:40 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Andrew Cooper

On 03.03.2020 12:52, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/dom0_build.c
> +++ b/xen/arch/x86/hvm/dom0_build.c
> @@ -490,6 +490,45 @@ static int __init pvh_populate_p2m(struct domain *d)
>  #undef MB1_PAGES
>  }
>  
> +static paddr_t find_memory(const struct domain *d, const struct elf_binary *elf,
> +                           size_t size)
> +{
> +    paddr_t kernel_start = (paddr_t)elf->dest_base;
> +    paddr_t kernel_end = (paddr_t)(elf->dest_base + elf->dest_size);
> +    unsigned int i;
> +
> +    for ( i = 0; i < d->arch.nr_e820; i++ )
> +    {
> +        paddr_t start, end = d->arch.e820[i].addr + d->arch.e820[i].size;
> +
> +        /* Don't use memory below 1MB, as it could overwrite the BDA/EBDA. */
> +        if ( end <= MB(1) || d->arch.e820[i].type != E820_RAM )
> +            continue;
> +
> +        start = MAX(ROUNDUP(d->arch.e820[i].addr, PAGE_SIZE), MB(1));
> +
> +        if ( end <= kernel_start || start >= kernel_end )
> +            ; /* No overlap, nothing to do. */
> +        /* Deal with the kernel already being loaded in the region. */
> +        else if ( kernel_start <= start && kernel_end > start )

Since, according to your reply on v1, [kernel_start,kernel_end) is
a subset of [start,end), I understand that the <= could equally
well be == - do you agree? From this then ...

> +            /* Truncate the start of the region. */
> +            start = ROUNDUP(kernel_end, PAGE_SIZE);
> +        else if ( kernel_start <= end && kernel_end > end )

... it follows that you now have two off-by-1s here, as you changed
the right side of the && instead of the left one (the right side
could, as per above, use == again). Using == in both places would,
in lieu of a comment, imo make more visible to the reader that
there is this sub-range relationship between both ranges.

> +            /* Truncate the end of the region. */
> +            end = kernel_start;
> +        /* Pick the biggest of the split regions. */

Then again - wouldn't this part suffice? if start == kernel_start
or end == kernel_end, one side of the "split" region would simply
be empty.

> +        else if ( kernel_start - start > end - kernel_end )
> +            end = kernel_start;
> +        else
> +            start = ROUNDUP(kernel_end, PAGE_SIZE);
> +
> +        if ( end - start >= size )
> +            return start;
> +    }
> +
> +    return INVALID_PADDR;
> +}
> +
>  static int __init pvh_load_kernel(struct domain *d, const module_t *image,
>                                    unsigned long image_headroom,
>                                    module_t *initrd, void *image_base,
> @@ -546,7 +585,24 @@ static int __init pvh_load_kernel(struct domain *d, const module_t *image,
>          return rc;
>      }
>  
> -    last_addr = ROUNDUP(parms.virt_kend - parms.virt_base, PAGE_SIZE);
> +    /*
> +     * Find a RAM region big enough (and that doesn't overlap with the loaded
> +     * kernel) in order to load the initrd and the metadata. Note it could be
> +     * split into smaller allocations, done it as a single region in order to
> +     * simplify it.

I guess either "done" without "it" or "doing it"?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement
  2020-03-03 11:52 [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement Roger Pau Monne
  2020-03-03 15:40 ` Jan Beulich
@ 2020-03-03 18:47 ` Andrew Cooper
  2020-03-04 10:49   ` Roger Pau Monné
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2020-03-03 18:47 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Wei Liu, Jan Beulich

On 03/03/2020 11:52, Roger Pau Monne wrote:
> Don't assume there's going to be enough space at the tail of the
> loaded kernel and instead try to find a suitable memory area where the
> initrd and metadata can be loaded.
>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

I can confirm that this fixes the "failed to boot PVH" on my Rome system.

Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>

We've still got the excessive-time-to-construct issues to look at, but
this definitely brings things to a better position.

> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
> index eded87eaf5..33520ec1bc 100644
> --- a/xen/arch/x86/hvm/dom0_build.c
> +++ b/xen/arch/x86/hvm/dom0_build.c
> @@ -490,6 +490,45 @@ static int __init pvh_populate_p2m(struct domain *d)
>  #undef MB1_PAGES
>  }
>  
> +static paddr_t find_memory(const struct domain *d, const struct elf_binary *elf,
> +                           size_t size)
> +{
> +    paddr_t kernel_start = (paddr_t)elf->dest_base;
> +    paddr_t kernel_end = (paddr_t)(elf->dest_base + elf->dest_size);
> +    unsigned int i;
> +
> +    for ( i = 0; i < d->arch.nr_e820; i++ )
> +    {
> +        paddr_t start, end = d->arch.e820[i].addr + d->arch.e820[i].size;
> +
> +        /* Don't use memory below 1MB, as it could overwrite the BDA/EBDA. */

The BDA is in mfn 0 so is special for other reasons*.  The EBDA and IBFT
are the problem datastructures.

~Andrew

[*] Thinking about it, how should a PVH hardware domain reconcile its
paravirtualised boot with finding itself on a BIOS or EFI system...

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement
  2020-03-03 15:40 ` Jan Beulich
@ 2020-03-04  9:53   ` Roger Pau Monné
  2020-03-04 10:00     ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Roger Pau Monné @ 2020-03-04  9:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Tue, Mar 03, 2020 at 04:40:36PM +0100, Jan Beulich wrote:
> On 03.03.2020 12:52, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/dom0_build.c
> > +++ b/xen/arch/x86/hvm/dom0_build.c
> > @@ -490,6 +490,45 @@ static int __init pvh_populate_p2m(struct domain *d)
> >  #undef MB1_PAGES
> >  }
> >  
> > +static paddr_t find_memory(const struct domain *d, const struct elf_binary *elf,
> > +                           size_t size)
> > +{
> > +    paddr_t kernel_start = (paddr_t)elf->dest_base;
> > +    paddr_t kernel_end = (paddr_t)(elf->dest_base + elf->dest_size);
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < d->arch.nr_e820; i++ )
> > +    {
> > +        paddr_t start, end = d->arch.e820[i].addr + d->arch.e820[i].size;
> > +
> > +        /* Don't use memory below 1MB, as it could overwrite the BDA/EBDA. */
> > +        if ( end <= MB(1) || d->arch.e820[i].type != E820_RAM )
> > +            continue;
> > +
> > +        start = MAX(ROUNDUP(d->arch.e820[i].addr, PAGE_SIZE), MB(1));
> > +
> > +        if ( end <= kernel_start || start >= kernel_end )
> > +            ; /* No overlap, nothing to do. */
> > +        /* Deal with the kernel already being loaded in the region. */
> > +        else if ( kernel_start <= start && kernel_end > start )
> 
> Since, according to your reply on v1, [kernel_start,kernel_end) is
> a subset of [start,end), I understand that the <= could equally
> well be == - do you agree? From this then ...
> 
> > +            /* Truncate the start of the region. */
> > +            start = ROUNDUP(kernel_end, PAGE_SIZE);
> > +        else if ( kernel_start <= end && kernel_end > end )
> 
> ... it follows that you now have two off-by-1s here, as you changed
> the right side of the && instead of the left one (the right side
> could, as per above, use == again). Using == in both places would,
> in lieu of a comment, imo make more visible to the reader that
> there is this sub-range relationship between both ranges.

Right, I agree to both the above and have adjusted the conditions.

> > +            /* Truncate the end of the region. */
> > +            end = kernel_start;
> > +        /* Pick the biggest of the split regions. */
> 
> Then again - wouldn't this part suffice? if start == kernel_start
> or end == kernel_end, one side of the "split" region would simply
> be empty.

That's why it's using an else if construct, so that we only get
here if the kernel is loaded in the middle of the region, and there
are two regions left as part of the split.

> 
> > +        else if ( kernel_start - start > end - kernel_end )
> > +            end = kernel_start;
> > +        else
> > +            start = ROUNDUP(kernel_end, PAGE_SIZE);
> > +
> > +        if ( end - start >= size )
> > +            return start;
> > +    }
> > +
> > +    return INVALID_PADDR;
> > +}
> > +
> >  static int __init pvh_load_kernel(struct domain *d, const module_t *image,
> >                                    unsigned long image_headroom,
> >                                    module_t *initrd, void *image_base,
> > @@ -546,7 +585,24 @@ static int __init pvh_load_kernel(struct domain *d, const module_t *image,
> >          return rc;
> >      }
> >  
> > -    last_addr = ROUNDUP(parms.virt_kend - parms.virt_base, PAGE_SIZE);
> > +    /*
> > +     * Find a RAM region big enough (and that doesn't overlap with the loaded
> > +     * kernel) in order to load the initrd and the metadata. Note it could be
> > +     * split into smaller allocations, done it as a single region in order to
> > +     * simplify it.
> 
> I guess either "done" without "it" or "doing it"?

Fixed, thanks.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement
  2020-03-04  9:53   ` Roger Pau Monné
@ 2020-03-04 10:00     ` Jan Beulich
  2020-03-04 10:09       ` Roger Pau Monné
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2020-03-04 10:00 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Andrew Cooper

On 04.03.2020 10:53, Roger Pau Monné wrote:
> On Tue, Mar 03, 2020 at 04:40:36PM +0100, Jan Beulich wrote:
>> On 03.03.2020 12:52, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/dom0_build.c
>>> +++ b/xen/arch/x86/hvm/dom0_build.c
>>> @@ -490,6 +490,45 @@ static int __init pvh_populate_p2m(struct domain *d)
>>>  #undef MB1_PAGES
>>>  }
>>>  
>>> +static paddr_t find_memory(const struct domain *d, const struct elf_binary *elf,
>>> +                           size_t size)
>>> +{
>>> +    paddr_t kernel_start = (paddr_t)elf->dest_base;
>>> +    paddr_t kernel_end = (paddr_t)(elf->dest_base + elf->dest_size);
>>> +    unsigned int i;
>>> +
>>> +    for ( i = 0; i < d->arch.nr_e820; i++ )
>>> +    {
>>> +        paddr_t start, end = d->arch.e820[i].addr + d->arch.e820[i].size;
>>> +
>>> +        /* Don't use memory below 1MB, as it could overwrite the BDA/EBDA. */
>>> +        if ( end <= MB(1) || d->arch.e820[i].type != E820_RAM )
>>> +            continue;
>>> +
>>> +        start = MAX(ROUNDUP(d->arch.e820[i].addr, PAGE_SIZE), MB(1));
>>> +
>>> +        if ( end <= kernel_start || start >= kernel_end )
>>> +            ; /* No overlap, nothing to do. */
>>> +        /* Deal with the kernel already being loaded in the region. */
>>> +        else if ( kernel_start <= start && kernel_end > start )
>>
>> Since, according to your reply on v1, [kernel_start,kernel_end) is
>> a subset of [start,end), I understand that the <= could equally
>> well be == - do you agree? From this then ...
>>
>>> +            /* Truncate the start of the region. */
>>> +            start = ROUNDUP(kernel_end, PAGE_SIZE);
>>> +        else if ( kernel_start <= end && kernel_end > end )
>>
>> ... it follows that you now have two off-by-1s here, as you changed
>> the right side of the && instead of the left one (the right side
>> could, as per above, use == again). Using == in both places would,
>> in lieu of a comment, imo make more visible to the reader that
>> there is this sub-range relationship between both ranges.
> 
> Right, I agree to both the above and have adjusted the conditions.
> 
>>> +            /* Truncate the end of the region. */
>>> +            end = kernel_start;
>>> +        /* Pick the biggest of the split regions. */
>>
>> Then again - wouldn't this part suffice? if start == kernel_start
>> or end == kernel_end, one side of the "split" region would simply
>> be empty.
> 
> That's why it's using an else if construct, so that we only get
> here if the kernel is loaded in the middle of the region, and there
> are two regions left as part of the split.

But my question is - do we really need the earlier parts of
this if/else-if chain? Won't this latter part deal find with
zero-sized ranges at head or tail of the region?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement
  2020-03-04 10:00     ` Jan Beulich
@ 2020-03-04 10:09       ` Roger Pau Monné
  0 siblings, 0 replies; 7+ messages in thread
From: Roger Pau Monné @ 2020-03-04 10:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Wed, Mar 04, 2020 at 11:00:18AM +0100, Jan Beulich wrote:
> On 04.03.2020 10:53, Roger Pau Monné wrote:
> > On Tue, Mar 03, 2020 at 04:40:36PM +0100, Jan Beulich wrote:
> >> On 03.03.2020 12:52, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/hvm/dom0_build.c
> >>> +++ b/xen/arch/x86/hvm/dom0_build.c
> >>> @@ -490,6 +490,45 @@ static int __init pvh_populate_p2m(struct domain *d)
> >>>  #undef MB1_PAGES
> >>>  }
> >>>  
> >>> +static paddr_t find_memory(const struct domain *d, const struct elf_binary *elf,
> >>> +                           size_t size)
> >>> +{
> >>> +    paddr_t kernel_start = (paddr_t)elf->dest_base;
> >>> +    paddr_t kernel_end = (paddr_t)(elf->dest_base + elf->dest_size);
> >>> +    unsigned int i;
> >>> +
> >>> +    for ( i = 0; i < d->arch.nr_e820; i++ )
> >>> +    {
> >>> +        paddr_t start, end = d->arch.e820[i].addr + d->arch.e820[i].size;
> >>> +
> >>> +        /* Don't use memory below 1MB, as it could overwrite the BDA/EBDA. */
> >>> +        if ( end <= MB(1) || d->arch.e820[i].type != E820_RAM )
> >>> +            continue;
> >>> +
> >>> +        start = MAX(ROUNDUP(d->arch.e820[i].addr, PAGE_SIZE), MB(1));
> >>> +
> >>> +        if ( end <= kernel_start || start >= kernel_end )
> >>> +            ; /* No overlap, nothing to do. */
> >>> +        /* Deal with the kernel already being loaded in the region. */
> >>> +        else if ( kernel_start <= start && kernel_end > start )
> >>
> >> Since, according to your reply on v1, [kernel_start,kernel_end) is
> >> a subset of [start,end), I understand that the <= could equally
> >> well be == - do you agree? From this then ...
> >>
> >>> +            /* Truncate the start of the region. */
> >>> +            start = ROUNDUP(kernel_end, PAGE_SIZE);
> >>> +        else if ( kernel_start <= end && kernel_end > end )
> >>
> >> ... it follows that you now have two off-by-1s here, as you changed
> >> the right side of the && instead of the left one (the right side
> >> could, as per above, use == again). Using == in both places would,
> >> in lieu of a comment, imo make more visible to the reader that
> >> there is this sub-range relationship between both ranges.
> > 
> > Right, I agree to both the above and have adjusted the conditions.
> > 
> >>> +            /* Truncate the end of the region. */
> >>> +            end = kernel_start;
> >>> +        /* Pick the biggest of the split regions. */
> >>
> >> Then again - wouldn't this part suffice? if start == kernel_start
> >> or end == kernel_end, one side of the "split" region would simply
> >> be empty.
> > 
> > That's why it's using an else if construct, so that we only get
> > here if the kernel is loaded in the middle of the region, and there
> > are two regions left as part of the split.
> 
> But my question is - do we really need the earlier parts of
> this if/else-if chain? Won't this latter part deal find with
> zero-sized ranges at head or tail of the region?

Oh, I misread your reply sorry. Yes you are right, I can achieve the
same just with this last part.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement
  2020-03-03 18:47 ` Andrew Cooper
@ 2020-03-04 10:49   ` Roger Pau Monné
  0 siblings, 0 replies; 7+ messages in thread
From: Roger Pau Monné @ 2020-03-04 10:49 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Jan Beulich

On Tue, Mar 03, 2020 at 06:47:35PM +0000, Andrew Cooper wrote:
> On 03/03/2020 11:52, Roger Pau Monne wrote:
> > Don't assume there's going to be enough space at the tail of the
> > loaded kernel and instead try to find a suitable memory area where the
> > initrd and metadata can be loaded.
> >
> > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> I can confirm that this fixes the "failed to boot PVH" on my Rome system.
> 
> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks!

> We've still got the excessive-time-to-construct issues to look at, but
> this definitely brings things to a better position.

Well, PV is always going to be faster to construct, since you only
need to allocate memory and create the initial page tables that cover
the kernel, the metadata and optionally the initrd IIRC.

On PVH we need to populate the full p2m, so I think it's safe to say
that PVH build time is always going to be worse than PV. That doesn't
mean we can't make it faster.
I have to 
Since I also have an AMD box that I can play with, how much memory are
you assigning to dom0?

> > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
> > index eded87eaf5..33520ec1bc 100644
> > --- a/xen/arch/x86/hvm/dom0_build.c
> > +++ b/xen/arch/x86/hvm/dom0_build.c
> > @@ -490,6 +490,45 @@ static int __init pvh_populate_p2m(struct domain *d)
> >  #undef MB1_PAGES
> >  }
> >  
> > +static paddr_t find_memory(const struct domain *d, const struct elf_binary *elf,
> > +                           size_t size)
> > +{
> > +    paddr_t kernel_start = (paddr_t)elf->dest_base;
> > +    paddr_t kernel_end = (paddr_t)(elf->dest_base + elf->dest_size);
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < d->arch.nr_e820; i++ )
> > +    {
> > +        paddr_t start, end = d->arch.e820[i].addr + d->arch.e820[i].size;
> > +
> > +        /* Don't use memory below 1MB, as it could overwrite the BDA/EBDA. */
> 
> The BDA is in mfn 0 so is special for other reasons*.  The EBDA and IBFT
> are the problem datastructures.

Sure. Sorry I haven't added it to the comment.

> ~Andrew
> 
> [*] Thinking about it, how should a PVH hardware domain reconcile its
> paravirtualised boot with finding itself on a BIOS or EFI system...

I guess the same applies to PV which also boots using a PV path but
has access to firmware.

I have to admit I never looked closely at how Linux does that, for
FreeBSD it's fairly easy because at least when booting from BIOS the
kernel won't try to make any calls into the BIOS, and instead expect
the data it requires to be provided by the loader as part of the
metadata blob, together with the modules &c.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-03-04 10:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-03 11:52 [Xen-devel] [PATCH v2] x86/dom0: improve PVH initrd and metadata placement Roger Pau Monne
2020-03-03 15:40 ` Jan Beulich
2020-03-04  9:53   ` Roger Pau Monné
2020-03-04 10:00     ` Jan Beulich
2020-03-04 10:09       ` Roger Pau Monné
2020-03-03 18:47 ` Andrew Cooper
2020-03-04 10:49   ` Roger Pau Monné

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).