All of lore.kernel.org
 help / color / mirror / Atom feed
* Supporting systems with large E820 maps
@ 2017-03-20 19:03 Alex Thorlton
  2017-03-21  5:14 ` Juergen Gross
  2017-03-21 10:03 ` Jan Beulich
  0 siblings, 2 replies; 6+ messages in thread
From: Alex Thorlton @ 2017-03-20 19:03 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, alex.thorlton, rja

Hey everyone,

Recently, I've been working with Boris Ostrovsky to get Xen running on
some of our larger systems, and we've run into a few problems with the
amount of space that Xen sets aside for the E820 map.

The first problem that I hit was that E820MAX is far too small, at 128
entries, for the system that we're testing with.  The EFI memory map
handed up from the boot loader tops out at 783 entries, which far
exceeds the amount of space allocated for the memory map in
arch/x86/boot/mem.S.  I was able to get past this problem by bumping
E820MAX up to 1024 in arch/x86/boot/mem.S and include/asm-x86/e820.h.

The second problem that I encountered was that Xen uses a signed char to
store the number of entries in the memory map in a few places, which is
too small to hold the number of entries after bumping E820MAX up to
1024.  I made the following changes to get past this:

8<---
---
 arch/x86/e820.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- xen.orig/arch/x86/e820.c
+++ xen/arch/x86/e820.c
@@ -134,7 +134,7 @@ static struct change_member *change_poin
 static struct e820entry *overlap_list[E820MAX] __initdata;
 static struct e820entry new_bios[E820MAX] __initdata;

-static int __init sanitize_e820_map(struct e820entry * biosmap, char * pnr_map)
+static int __init sanitize_e820_map(struct e820entry * biosmap, unsigned int * pnr_map)
 {
     struct change_member *change_tmp;
     unsigned long current_type, last_type;
@@ -509,13 +509,13 @@ static void __init reserve_dmi_region(vo
     }
 }

-static void __init machine_specific_memory_setup(struct e820entry *raw, char *raw_nr)
+static void __init machine_specific_memory_setup(struct e820entry *raw, unsigned int *raw_nr)
 {
     unsigned long mpt_limit, ro_mpt_limit;
     uint64_t top_of_ram, size;
     int i;

-    char nr = (char)*raw_nr;
+    unsigned int nr = (unsigned int)*raw_nr;
     sanitize_e820_map(raw, &nr);
     *raw_nr = nr;
     (void)copy_e820_map(raw, nr);
--->8

I didn't need to go all the way up to unsigned int here, but I did this
as a quick/dirty test to see if it got things working.

These small changes get our large machine to boot up and recognize all
32TB of available RAM.  I know that these changes are probably not what
we'll want to go with in the end, but I wanted to get them sent upstream
to get a dialogue started.

So, what do others think here?  How do we want to handle a large E820
map?  Boris mentioned to me that we might want to attempt to do a
dynamic allocation scheme, where we reserve more space for the memory
map when we detect that E820 is large.

Any comments/suggestions are greatly appreciated!

- Alex

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Supporting systems with large E820 maps
  2017-03-20 19:03 Supporting systems with large E820 maps Alex Thorlton
@ 2017-03-21  5:14 ` Juergen Gross
  2017-03-21 10:05   ` Jan Beulich
       [not found]   ` <58D109050200007800145959@suse.com>
  2017-03-21 10:03 ` Jan Beulich
  1 sibling, 2 replies; 6+ messages in thread
From: Juergen Gross @ 2017-03-21  5:14 UTC (permalink / raw)
  To: Alex Thorlton, xen-devel
  Cc: Andrew Cooper, Boris Ostrovsky, rja, Jan Beulich, Daniel Kiper

On 20/03/17 20:03, Alex Thorlton wrote:
> Hey everyone,
> 
> Recently, I've been working with Boris Ostrovsky to get Xen running on
> some of our larger systems, and we've run into a few problems with the
> amount of space that Xen sets aside for the E820 map.
> 
> The first problem that I hit was that E820MAX is far too small, at 128
> entries, for the system that we're testing with.  The EFI memory map
> handed up from the boot loader tops out at 783 entries, which far
> exceeds the amount of space allocated for the memory map in
> arch/x86/boot/mem.S.  I was able to get past this problem by bumping
> E820MAX up to 1024 in arch/x86/boot/mem.S and include/asm-x86/e820.h.
> 
> The second problem that I encountered was that Xen uses a signed char to
> store the number of entries in the memory map in a few places, which is
> too small to hold the number of entries after bumping E820MAX up to
> 1024.  I made the following changes to get past this:

The problem with setting E820MAX to a higher value in mem.S without
further measures is that you are growing the trampoline size. This is
problematic for memory allocation in the multiboot path.

I have some patches sitting here waiting for Daniel's multiboot series
to go in. My patches are not using the mem.S e820 array for the EFI
memory map, so the BIOS memory map buffer can remain smaller while the
EFI buffer can be made rather large. This avoids growing the trampoline
(in fact I've managed to reduce it to a single page).

I didn't post my series up to now in order to not block Daniel's series
again. So what do people think: should I wait some more time with my
patches, or should I send them rather soon?


Juergen

> 
> 8<---
> ---
>  arch/x86/e820.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> --- xen.orig/arch/x86/e820.c
> +++ xen/arch/x86/e820.c
> @@ -134,7 +134,7 @@ static struct change_member *change_poin
>  static struct e820entry *overlap_list[E820MAX] __initdata;
>  static struct e820entry new_bios[E820MAX] __initdata;
> 
> -static int __init sanitize_e820_map(struct e820entry * biosmap, char * pnr_map)
> +static int __init sanitize_e820_map(struct e820entry * biosmap, unsigned int * pnr_map)
>  {
>      struct change_member *change_tmp;
>      unsigned long current_type, last_type;
> @@ -509,13 +509,13 @@ static void __init reserve_dmi_region(vo
>      }
>  }
> 
> -static void __init machine_specific_memory_setup(struct e820entry *raw, char *raw_nr)
> +static void __init machine_specific_memory_setup(struct e820entry *raw, unsigned int *raw_nr)
>  {
>      unsigned long mpt_limit, ro_mpt_limit;
>      uint64_t top_of_ram, size;
>      int i;
> 
> -    char nr = (char)*raw_nr;
> +    unsigned int nr = (unsigned int)*raw_nr;
>      sanitize_e820_map(raw, &nr);
>      *raw_nr = nr;
>      (void)copy_e820_map(raw, nr);
> --->8
> 
> I didn't need to go all the way up to unsigned int here, but I did this
> as a quick/dirty test to see if it got things working.
> 
> These small changes get our large machine to boot up and recognize all
> 32TB of available RAM.  I know that these changes are probably not what
> we'll want to go with in the end, but I wanted to get them sent upstream
> to get a dialogue started.
> 
> So, what do others think here?  How do we want to handle a large E820
> map?  Boris mentioned to me that we might want to attempt to do a
> dynamic allocation scheme, where we reserve more space for the memory
> map when we detect that E820 is large.
> 
> Any comments/suggestions are greatly appreciated!
> 
> - Alex
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Supporting systems with large E820 maps
  2017-03-20 19:03 Supporting systems with large E820 maps Alex Thorlton
  2017-03-21  5:14 ` Juergen Gross
@ 2017-03-21 10:03 ` Jan Beulich
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2017-03-21 10:03 UTC (permalink / raw)
  To: alex.thorlton; +Cc: rja, Boris Ostrovsky, xen-devel

>>> On 20.03.17 at 20:03, <alex.thorlton@hpe.com> wrote:
> --- xen.orig/arch/x86/e820.c
> +++ xen/arch/x86/e820.c
> @@ -134,7 +134,7 @@ static struct change_member *change_poin
>  static struct e820entry *overlap_list[E820MAX] __initdata;
>  static struct e820entry new_bios[E820MAX] __initdata;
> 
> -static int __init sanitize_e820_map(struct e820entry * biosmap, char * pnr_map)
> +static int __init sanitize_e820_map(struct e820entry * biosmap, unsigned int * pnr_map)
>  {
>      struct change_member *change_tmp;
>      unsigned long current_type, last_type;
> @@ -509,13 +509,13 @@ static void __init reserve_dmi_region(vo
>      }
>  }
> 
> -static void __init machine_specific_memory_setup(struct e820entry *raw, char *raw_nr)
> +static void __init machine_specific_memory_setup(struct e820entry *raw, unsigned int *raw_nr)

I'm confused: staging has

static void __init machine_specific_memory_setup(
    struct e820entry *raw, unsigned int *raw_nr)
{

>  {
>      unsigned long mpt_limit, ro_mpt_limit;
>      uint64_t top_of_ram, size;
>      int i;
> 
> -    char nr = (char)*raw_nr;
> +    unsigned int nr = (unsigned int)*raw_nr;
>      sanitize_e820_map(raw, &nr);
>      *raw_nr = nr;
>      (void)copy_e820_map(raw, nr);
> --->8
> 
> I didn't need to go all the way up to unsigned int here, but I did this
> as a quick/dirty test to see if it got things working.

I think this type change could be submitted right away (properly
cleaned up for style). After all when plain char is a signed type
(which it is allowed to be) it won't cope with 128 entries.

> These small changes get our large machine to boot up and recognize all
> 32TB of available RAM.  I know that these changes are probably not what
> we'll want to go with in the end, but I wanted to get them sent upstream
> to get a dialogue started.
> 
> So, what do others think here?  How do we want to handle a large E820
> map?  Boris mentioned to me that we might want to attempt to do a
> dynamic allocation scheme, where we reserve more space for the memory
> map when we detect that E820 is large.

Since Jürgen says he already has something ready, I think there's
not much else to say.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Supporting systems with large E820 maps
  2017-03-21  5:14 ` Juergen Gross
@ 2017-03-21 10:05   ` Jan Beulich
       [not found]   ` <58D109050200007800145959@suse.com>
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2017-03-21 10:05 UTC (permalink / raw)
  To: Juergen Gross
  Cc: rja, Alex Thorlton, Andrew Cooper, Daniel Kiper, xen-devel,
	Boris Ostrovsky

>>> On 21.03.17 at 06:14, <jgross@suse.com> wrote:
> On 20/03/17 20:03, Alex Thorlton wrote:
>> Hey everyone,
>> 
>> Recently, I've been working with Boris Ostrovsky to get Xen running on
>> some of our larger systems, and we've run into a few problems with the
>> amount of space that Xen sets aside for the E820 map.
>> 
>> The first problem that I hit was that E820MAX is far too small, at 128
>> entries, for the system that we're testing with.  The EFI memory map
>> handed up from the boot loader tops out at 783 entries, which far
>> exceeds the amount of space allocated for the memory map in
>> arch/x86/boot/mem.S.  I was able to get past this problem by bumping
>> E820MAX up to 1024 in arch/x86/boot/mem.S and include/asm-x86/e820.h.
>> 
>> The second problem that I encountered was that Xen uses a signed char to
>> store the number of entries in the memory map in a few places, which is
>> too small to hold the number of entries after bumping E820MAX up to
>> 1024.  I made the following changes to get past this:
> 
> The problem with setting E820MAX to a higher value in mem.S without
> further measures is that you are growing the trampoline size. This is
> problematic for memory allocation in the multiboot path.
> 
> I have some patches sitting here waiting for Daniel's multiboot series
> to go in. My patches are not using the mem.S e820 array for the EFI
> memory map, so the BIOS memory map buffer can remain smaller while the
> EFI buffer can be made rather large. This avoids growing the trampoline
> (in fact I've managed to reduce it to a single page).
> 
> I didn't post my series up to now in order to not block Daniel's series
> again. So what do people think: should I wait some more time with my
> patches, or should I send them rather soon?

I'd say go ahead - whether the rest of Daniel's series will go in for
4.9 is undetermined at this point in time. At the very least we first
need to get details on the boot regression Andrew says they're
observing with what has gone in already.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Supporting systems with large E820 maps
       [not found]   ` <58D109050200007800145959@suse.com>
@ 2017-03-21 12:01     ` Juergen Gross
  2017-03-21 13:09       ` Daniel Kiper
  0 siblings, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2017-03-21 12:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: rja, Alex Thorlton, Andrew Cooper, Daniel Kiper, xen-devel,
	Boris Ostrovsky

On 21/03/17 11:05, Jan Beulich wrote:
>>>> On 21.03.17 at 06:14, <jgross@suse.com> wrote:
>> On 20/03/17 20:03, Alex Thorlton wrote:
>>> Hey everyone,
>>>
>>> Recently, I've been working with Boris Ostrovsky to get Xen running on
>>> some of our larger systems, and we've run into a few problems with the
>>> amount of space that Xen sets aside for the E820 map.
>>>
>>> The first problem that I hit was that E820MAX is far too small, at 128
>>> entries, for the system that we're testing with.  The EFI memory map
>>> handed up from the boot loader tops out at 783 entries, which far
>>> exceeds the amount of space allocated for the memory map in
>>> arch/x86/boot/mem.S.  I was able to get past this problem by bumping
>>> E820MAX up to 1024 in arch/x86/boot/mem.S and include/asm-x86/e820.h.
>>>
>>> The second problem that I encountered was that Xen uses a signed char to
>>> store the number of entries in the memory map in a few places, which is
>>> too small to hold the number of entries after bumping E820MAX up to
>>> 1024.  I made the following changes to get past this:
>>
>> The problem with setting E820MAX to a higher value in mem.S without
>> further measures is that you are growing the trampoline size. This is
>> problematic for memory allocation in the multiboot path.
>>
>> I have some patches sitting here waiting for Daniel's multiboot series
>> to go in. My patches are not using the mem.S e820 array for the EFI
>> memory map, so the BIOS memory map buffer can remain smaller while the
>> EFI buffer can be made rather large. This avoids growing the trampoline
>> (in fact I've managed to reduce it to a single page).
>>
>> I didn't post my series up to now in order to not block Daniel's series
>> again. So what do people think: should I wait some more time with my
>> patches, or should I send them rather soon?
> 
> I'd say go ahead - whether the rest of Daniel's series will go in for
> 4.9 is undetermined at this point in time. At the very least we first
> need to get details on the boot regression Andrew says they're
> observing with what has gone in already.

Okay. I think I'll just post the first three patches which basically
will support more EFI memory map entries without affecting the
assembler parts too much.


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Supporting systems with large E820 maps
  2017-03-21 12:01     ` Juergen Gross
@ 2017-03-21 13:09       ` Daniel Kiper
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Kiper @ 2017-03-21 13:09 UTC (permalink / raw)
  To: Juergen Gross
  Cc: rja, Alex Thorlton, Andrew Cooper, xen-devel, Jan Beulich,
	Boris Ostrovsky

On Tue, Mar 21, 2017 at 01:01:44PM +0100, Juergen Gross wrote:
> On 21/03/17 11:05, Jan Beulich wrote:
> >>>> On 21.03.17 at 06:14, <jgross@suse.com> wrote:
> >> On 20/03/17 20:03, Alex Thorlton wrote:
> >>> Hey everyone,
> >>>
> >>> Recently, I've been working with Boris Ostrovsky to get Xen running on
> >>> some of our larger systems, and we've run into a few problems with the
> >>> amount of space that Xen sets aside for the E820 map.
> >>>
> >>> The first problem that I hit was that E820MAX is far too small, at 128
> >>> entries, for the system that we're testing with.  The EFI memory map
> >>> handed up from the boot loader tops out at 783 entries, which far
> >>> exceeds the amount of space allocated for the memory map in
> >>> arch/x86/boot/mem.S.  I was able to get past this problem by bumping
> >>> E820MAX up to 1024 in arch/x86/boot/mem.S and include/asm-x86/e820.h.
> >>>
> >>> The second problem that I encountered was that Xen uses a signed char to
> >>> store the number of entries in the memory map in a few places, which is
> >>> too small to hold the number of entries after bumping E820MAX up to
> >>> 1024.  I made the following changes to get past this:
> >>
> >> The problem with setting E820MAX to a higher value in mem.S without
> >> further measures is that you are growing the trampoline size. This is
> >> problematic for memory allocation in the multiboot path.
> >>
> >> I have some patches sitting here waiting for Daniel's multiboot series
> >> to go in. My patches are not using the mem.S e820 array for the EFI
> >> memory map, so the BIOS memory map buffer can remain smaller while the
> >> EFI buffer can be made rather large. This avoids growing the trampoline
> >> (in fact I've managed to reduce it to a single page).
> >>
> >> I didn't post my series up to now in order to not block Daniel's series
> >> again. So what do people think: should I wait some more time with my
> >> patches, or should I send them rather soon?
> >
> > I'd say go ahead - whether the rest of Daniel's series will go in for
> > 4.9 is undetermined at this point in time. At the very least we first
> > need to get details on the boot regression Andrew says they're
> > observing with what has gone in already.
>
> Okay. I think I'll just post the first three patches which basically
> will support more EFI memory map entries without affecting the
> assembler parts too much.

This is not a problem for me in general. However, if you can try to not
touch much of the same code as I do then it will be nice. And of course
if you CC me I will be more than happy.

Daniel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-03-21 13:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-20 19:03 Supporting systems with large E820 maps Alex Thorlton
2017-03-21  5:14 ` Juergen Gross
2017-03-21 10:05   ` Jan Beulich
     [not found]   ` <58D109050200007800145959@suse.com>
2017-03-21 12:01     ` Juergen Gross
2017-03-21 13:09       ` Daniel Kiper
2017-03-21 10:03 ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.