All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/19] Requesting more memory from firmware
@ 2021-10-12  7:29 Daniel Axtens
  2021-10-12  7:29 ` [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management Daniel Axtens
                   ` (18 more replies)
  0 siblings, 19 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

Hi,

This extends Patrick's work on adding the ability to dynamically
request more memory from firmware.

It now supports powerpc-ieee1275 - I can allocate pretty large chunks
of memory and still successfully boot under both SLOF and Power8 PFW.

Structure of the series:

 - Patches 1 & 2: little patches to fix tests. Probably mergeable as-is.

 - Patch 3: Document some mm structures. Hopefully mergable as-is.

 - Patch 4: internal mm consistency check.

 - Patch 5: enhance the algorithm for merging a new region into an
   existing region.

 - Patch 6: pass MM_DEBUG from configure into config.h

 - Patch 7: a tool to test large memory allocations

 - Patches 8-13: Patrick's series with some minor tweaks

 - Patches 14-16: ieee1275 support

 - Patches 17-18: debug print patches

 - Patch 19: an RFC suggesting a possible improvement to Patrick's series.

Kind regards,
Daniel

Daniel Axtens (12):
  grub-shell: pseries: don't pass fw_opt to qemu
  mm: document grub internal memory management structures
  mm: assert that we preserve header vs region alignment
  mm: when adding a region, merge with region after as well as before
  configure: properly pass through MM_DEBUG
  Add memtool module with memory allocation stress-test
  ieee1275: request memory with ibm,client-architecture-support
  ieee1275: drop len -= 1 quirk in heap_init
  ieee1275: support runtime memory claiming
  [not for merge] print more debug info in mm
  [not for merge] ieee1275 debugging info
  RFC: Ignore REGION_CONSECUTIVE

Glenn Washburn (1):
  grub-shell: Boot PowerPC using PMU instead of CUDA for power
    management

Patrick Steinhardt (6):
  mm: Drop unused unloading of modules on OOM
  mm: Allow dynamically requesting additional memory regions
  efi: mm: Always request a fixed number of pages on init
  efi: mm: Extract function to add memory regions
  efi: mm: Pass up errors from `add_memory_regions ()`
  efi: mm: Implement runtime addition of pages



^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-10-12 17:11   ` Glenn Washburn
  2021-10-12  7:29 ` [PATCH 02/19] grub-shell: pseries: don't pass fw_opt to qemu Daniel Axtens
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Glenn Washburn, Daniel Axtens

From: Glenn Washburn <development@efficientek.com>

A recent refactoring of CUDA command code has exposed a bug in OpenBIOS[1]
which was causing system powerdown and system reset to fail, thus causing
the Qemu instance to hang. This in turn caused the grub-shell command to
timeout causing it to return an error code when the test actually completed
successfully.

Since it could be a while before the patch fixing this issue in OpenBIOS
filters down to the average distro, switch to PMU to allow powerdowns and
reboots to work as expected.

[1] https://gitlab.com/qemu-project/qemu/-/issues/624

Signed-off-by: Glenn Washburn <development@efficientek.com>
[dja: unbreak pseries]
Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 tests/util/grub-shell.in | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/util/grub-shell.in b/tests/util/grub-shell.in
index 93e9f5148408..33590baeb13c 100644
--- a/tests/util/grub-shell.in
+++ b/tests/util/grub-shell.in
@@ -84,6 +84,7 @@ case "${grub_modinfo_target_cpu}-${grub_modinfo_platform}" in
 	serial_null="-serial null"
 	netbootext=elf
 	trim=1
+	qemuopts="-M mac99,via=pmu $qemuopts"
 	;;
 
     sparc64-ieee1275)
@@ -231,7 +232,7 @@ for option in "$@"; do
 	qemu=qemu-system-ppc64
 	serial_port=ieee1275/hvterm
 	serial_null=
-	qemuopts="$qemuopts -M pseries -no-reboot"
+	qemuopts="$(echo $qemuopts | sed -E 's/-M [^ ]+//') -M pseries -no-reboot"
 	trim=1
 	pseries=y
 	    ;;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 02/19] grub-shell: pseries: don't pass fw_opt to qemu
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
  2021-10-12  7:29 ` [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-10-12 18:57   ` Glenn Washburn
  2021-10-12  7:29 ` [PATCH 03/19] mm: document grub internal memory management structures Daniel Axtens
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

At some point this started to break the test - qemu just rejects the
command line. Just don't pass it in, it's seabios specific.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 tests/util/grub-shell.in | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/util/grub-shell.in b/tests/util/grub-shell.in
index 33590baeb13c..92c0b6f320c3 100644
--- a/tests/util/grub-shell.in
+++ b/tests/util/grub-shell.in
@@ -376,7 +376,11 @@ if test -z "$debug"; then
   # workaround unfortunately causes qemu to issue a warning 'externally
   # provided fw_cfg item names should be prefixed with "opt/"', but there
   # doesn't seem to be a better option.
-  qemuopts="${qemuopts} -fw_cfg name=etc/sercon-port,string=0"
+  #
+  # Don't do this on pseries, it breaks recent qemus.
+  if [ $pseries == n ]; then
+    qemuopts="${qemuopts} -fw_cfg name=etc/sercon-port,string=0"
+  fi
 fi
 
 if [ x$boot != xnet ] && [ x$boot != xemu ]; then
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 03/19] mm: document grub internal memory management structures
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
  2021-10-12  7:29 ` [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management Daniel Axtens
  2021-10-12  7:29 ` [PATCH 02/19] grub-shell: pseries: don't pass fw_opt to qemu Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-10-12 19:18   ` Glenn Washburn
  2021-10-12  7:29 ` [PATCH 04/19] mm: assert that we preserve header vs region alignment Daniel Axtens
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

I spent more than a trivial quantity of time figuring out pre_size and
whether a memory region's size contains the header cell or not.

Document the meanings of all the properties. Hopefully now no-one else
has to figure it out!

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 include/grub/mm_private.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/include/grub/mm_private.h b/include/grub/mm_private.h
index c2c4cb1511c6..e80a059dd4e4 100644
--- a/include/grub/mm_private.h
+++ b/include/grub/mm_private.h
@@ -25,11 +25,18 @@
 #define GRUB_MM_FREE_MAGIC	0x2d3c2808
 #define GRUB_MM_ALLOC_MAGIC	0x6db08fa4
 
+/* A header describing a block of memory - either allocated or free */
 typedef struct grub_mm_header
 {
+  /* The 'next' free block in this region. A circular list.
+     Only meaningful if the block is free.
+   */
   struct grub_mm_header *next;
+  /* The region size in cells (not bytes). Includes the header cell. */
   grub_size_t size;
+  /* either free or alloc magic, depending on the block type. */
   grub_size_t magic;
+  /* pad to cell size: see the top of mm.c. */
 #if GRUB_CPU_SIZEOF_VOID_P == 4
   char padding[4];
 #elif GRUB_CPU_SIZEOF_VOID_P == 8
@@ -48,11 +55,25 @@ typedef struct grub_mm_header
 
 #define GRUB_MM_ALIGN	(1 << GRUB_MM_ALIGN_LOG2)
 
+/* A region from which we can make allocations. */
 typedef struct grub_mm_region
 {
+  /* The first free block in this region. */
   struct grub_mm_header *first;
+
+  /* The next region in the linked list of regions. Regions are initially
+     sorted in order of increasing size, but can grow, in which case the
+     ordering may not be preserved.
+   */
   struct grub_mm_region *next;
+
+  /* A grub_mm_region will always be aligned to cell size. The pre-size is
+     the number of bytes we were given but had to skip in order to get that
+     alignment.
+   */
   grub_size_t pre_size;
+
+  /* How many bytes are in this region? (free and allocated) */
   grub_size_t size;
 }
 *grub_mm_region_t;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 04/19] mm: assert that we preserve header vs region alignment
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (2 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 03/19] mm: document grub internal memory management structures Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-10-20 17:43   ` Daniel Kiper
  2021-10-12  7:29 ` [PATCH 05/19] mm: when adding a region, merge with region after as well as before Daniel Axtens
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

grub_mm_region_init() does:

  h = (grub_mm_header_t) (r + 1);

where h is a grub_mm_header_t and r is a grub_mm_region_t.

Cells are supposed to be GRUB_MM_ALIGN aligned, but while grub_mm_dump
ensures this vs the region header, grub_mm_region_init() does not.

It's better to be explicit than implicit here: rather than changing
grub_mm_region_init() to ALIGN_UP(), require that the struct is
explictly a multiple of the header size.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 include/grub/mm_private.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/grub/mm_private.h b/include/grub/mm_private.h
index e80a059dd4e4..533b47173e18 100644
--- a/include/grub/mm_private.h
+++ b/include/grub/mm_private.h
@@ -20,6 +20,7 @@
 #define GRUB_MM_PRIVATE_H	1
 
 #include <grub/mm.h>
+#include <grub/misc.h>
 
 /* Magic words.  */
 #define GRUB_MM_FREE_MAGIC	0x2d3c2808
@@ -82,4 +83,11 @@ typedef struct grub_mm_region
 extern grub_mm_region_t EXPORT_VAR (grub_mm_base);
 #endif
 
+static inline void grub_mm_size_sanity_check(void) {
+  /* Ensure we preserve alignment when doing h = (grub_mm_header_t) (r + 1) */
+  COMPILE_TIME_ASSERT((sizeof(struct grub_mm_region) %
+		       sizeof(struct grub_mm_header)) == 0);
+
+}
+
 #endif
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 05/19] mm: when adding a region, merge with region after as well as before
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (3 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 04/19] mm: assert that we preserve header vs region alignment Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-11-04 16:36   ` Daniel Kiper
  2021-10-12  7:29 ` [PATCH 06/19] configure: properly pass through MM_DEBUG Daniel Axtens
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

On x86_64-efi (at least) regions seem to be added from top down. The mm
code will merge a new region with an existing region that comes
immediately before the new region. This allows larger allocations to be
satisfied that would otherwise be the case.

On powerpc-ieee1275, however, regions are added from bottom up. So if
we add 3x 32MB regions, we can still only satisfy a 32MB allocation,
rather than the 96MB allocation we might otherwise be able to satisfy.

 * Define 'post_size' as being bytes lost to the end of an allocation
   due to being given weird sizes from firmware that are not multiples
   of GRUB_MM_ALIGN.

 * Allow merging of regions immediately _after_ existing regions, not
   just before. As with the other approach, we create an allocated
   block to represent the new space and the pass it to grub_free() to
   get the metadata right.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 grub-core/kern/mm.c       | 55 +++++++++++++++++++++++++--------------
 include/grub/mm_private.h | 15 +++++++++++
 2 files changed, 51 insertions(+), 19 deletions(-)

diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
index c070afc621f8..835ed8a8f6f9 100644
--- a/grub-core/kern/mm.c
+++ b/grub-core/kern/mm.c
@@ -129,25 +129,41 @@ grub_mm_init_region (void *addr, grub_size_t size)
     size = ((grub_addr_t) -0x1000) - (grub_addr_t) addr;
 
   for (p = &grub_mm_base, q = *p; q; p = &(q->next), q = *p)
-    if ((grub_uint8_t *) addr + size + q->pre_size == (grub_uint8_t *) q)
-      {
-	r = (grub_mm_region_t) ALIGN_UP ((grub_addr_t) addr, GRUB_MM_ALIGN);
-	*r = *q;
-	r->pre_size += size;
-	
-	if (r->pre_size >> GRUB_MM_ALIGN_LOG2)
-	  {
-	    h = (grub_mm_header_t) (r + 1);
-	    h->size = (r->pre_size >> GRUB_MM_ALIGN_LOG2);
-	    h->magic = GRUB_MM_ALLOC_MAGIC;
-	    r->size += h->size << GRUB_MM_ALIGN_LOG2;
-	    r->pre_size &= (GRUB_MM_ALIGN - 1);
-	    *p = r;
-	    grub_free (h + 1);
-	  }
-	*p = r;
-	return;
-      }
+    {
+      /* Does this region come _before_ an existing region? */
+      if ((grub_uint8_t *) addr + size + q->pre_size == (grub_uint8_t *) q)
+	{
+	  r = (grub_mm_region_t) ALIGN_UP ((grub_addr_t) addr, GRUB_MM_ALIGN);
+	  *r = *q;
+	  r->pre_size += size;
+
+	  if (r->pre_size >> GRUB_MM_ALIGN_LOG2)
+	    {
+	      h = (grub_mm_header_t) (r + 1);
+	      h->size = (r->pre_size >> GRUB_MM_ALIGN_LOG2);
+	      h->magic = GRUB_MM_ALLOC_MAGIC;
+	      r->size += h->size << GRUB_MM_ALIGN_LOG2;
+	      r->pre_size &= (GRUB_MM_ALIGN - 1);
+	      *p = r;
+	      grub_free (h + 1);
+	    }
+	  *p = r;
+	  return;
+	}
+
+      /* Does this region come _after_ an existing region? */
+      if ((grub_uint8_t *)q + sizeof(*q) + q->size + q->post_size ==
+	  (grub_uint8_t *) addr)
+	{
+	  h = (grub_mm_header_t) ((grub_uint8_t *)addr - q->post_size);
+	  h->size = (size + q->post_size) >> GRUB_MM_ALIGN_LOG2;
+	  h->magic = GRUB_MM_ALLOC_MAGIC;
+	  q->size += h->size << GRUB_MM_ALIGN_LOG2;
+	  q->post_size = (q->post_size + size) & (GRUB_MM_ALIGN - 1);
+	  grub_free (h + 1);
+	  return;
+	}
+    }
 
   /* Allocate a region from the head.  */
   r = (grub_mm_region_t) ALIGN_UP ((grub_addr_t) addr, GRUB_MM_ALIGN);
@@ -166,6 +182,7 @@ grub_mm_init_region (void *addr, grub_size_t size)
   r->first = h;
   r->pre_size = (grub_addr_t) r - (grub_addr_t) addr;
   r->size = (h->size << GRUB_MM_ALIGN_LOG2);
+  r->post_size = size - r->size;
 
   /* Find where to insert this region. Put a smaller one before bigger ones,
      to prevent fragmentation.  */
diff --git a/include/grub/mm_private.h b/include/grub/mm_private.h
index 533b47173e18..0effbc45a668 100644
--- a/include/grub/mm_private.h
+++ b/include/grub/mm_private.h
@@ -74,8 +74,23 @@ typedef struct grub_mm_region
    */
   grub_size_t pre_size;
 
+  /* Likewise, the post-size is the number of bytes we wasted at the end
+     of the allocation because it wasn't a multiple of GRUB_MM_ALIGN
+   */
+  grub_size_t post_size;
+
   /* How many bytes are in this region? (free and allocated) */
   grub_size_t size;
+
+  /* pad to a multiple of cell size */
+#if GRUB_CPU_SIZEOF_VOID_P == 4
+  char padding[4+4+4];
+#elif GRUB_CPU_SIZEOF_VOID_P == 8
+  char padding[8+8+8];
+#else
+# error "unknown word size"
+#endif
+
 }
 *grub_mm_region_t;
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 06/19] configure: properly pass through MM_DEBUG
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (4 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 05/19] mm: when adding a region, merge with region after as well as before Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-11-04 18:00   ` Daniel Kiper
  2021-10-12  7:29 ` [PATCH 07/19] Add memtool module with memory allocation stress-test Daniel Axtens
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

I noticed that compiling with --enable-mm-debug didn't cause the
functions in #ifdef MM_DEBUG to be compiled in. Somehow the variable
wasn't actually being substituted into anything that was built.

Change configure.ac to do AC_SUBST(), and put a substitution into
config.h.in. This makes MM_DEBUG available to any file which includes
config.h.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 config.h.in  | 3 +++
 configure.ac | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/config.h.in b/config.h.in
index 9e8f9911b183..8cf62616cad7 100644
--- a/config.h.in
+++ b/config.h.in
@@ -13,6 +13,9 @@
 #define DISK_CACHE_STATS @DISK_CACHE_STATS@
 #define BOOT_TIME_STATS @BOOT_TIME_STATS@
 
+/* Define to 1 to enable mm debugging */
+#define MM_DEBUG @MM_DEBUG@
+
 /* We don't need those.  */
 #define MINILZO_CFG_SKIP_LZO_PTR 1
 #define MINILZO_CFG_SKIP_LZO_UTIL 1
diff --git a/configure.ac b/configure.ac
index 8d1c81a7316e..889d07b3c1d0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1511,9 +1511,11 @@ AC_ARG_ENABLE([mm-debug],
 	      AS_HELP_STRING([--enable-mm-debug],
                              [include memory manager debugging]))
 if test x$enable_mm_debug = xyes; then
-    AC_DEFINE([MM_DEBUG], [1],
-            [Define to 1 if you enable memory manager debugging.])
+    MM_DEBUG=1
+else
+    MM_DEBUG=0
 fi
+AC_SUBST([MM_DEBUG])
 
 AC_ARG_ENABLE([cache-stats],
 	      AS_HELP_STRING([--enable-cache-stats],
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 07/19] Add memtool module with memory allocation stress-test
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (5 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 06/19] configure: properly pass through MM_DEBUG Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-10-19 19:47   ` Glenn Washburn
  2021-11-09 12:54   ` Daniel Kiper
  2021-10-12  7:29 ` [PATCH 08/19] mm: Drop unused unloading of modules on OOM Daniel Axtens
                   ` (11 subsequent siblings)
  18 siblings, 2 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

When working on memory, it's nice to be able to test your work.

Add a memtest module. When compiled with --enable-mm-debug, it exposes
3 commands:

 * lsmem - print all allocations and free space in all regions
 * lsfreemem - print free space in all regions

 * stress_big_allocs - stress test large allocations:
  - how much memory can we allocate in one chunk?
  - how many 1MB chunks can we allocate?
  - check that gap-filling works with a 1MB aligned 900kB alloc + a
     100kB alloc.

Signed-off-by: Daniel Axtens <dja@axtens.net>

---

I've put this as copyright IBM for now - hopefully we can conclude on
whether we're still doing FSF copyright assignments?
---
 grub-core/Makefile.core.def   |   5 ++
 grub-core/commands/memtools.c | 157 ++++++++++++++++++++++++++++++++++
 grub-core/kern/mm.c           |   4 +
 include/grub/mm.h             |   4 +-
 4 files changed, 168 insertions(+), 2 deletions(-)
 create mode 100644 grub-core/commands/memtools.c

diff --git a/grub-core/Makefile.core.def b/grub-core/Makefile.core.def
index 8022e1c0a794..0cc3a4a500ec 100644
--- a/grub-core/Makefile.core.def
+++ b/grub-core/Makefile.core.def
@@ -2527,3 +2527,8 @@ module = {
   common = commands/i386/wrmsr.c;
   enable = x86;
 };
+
+module = {
+  name = memtools;
+  common = commands/memtools.c;
+};
diff --git a/grub-core/commands/memtools.c b/grub-core/commands/memtools.c
new file mode 100644
index 000000000000..6d5778f4a1b0
--- /dev/null
+++ b/grub-core/commands/memtools.c
@@ -0,0 +1,157 @@
+/*
+ *  GRUB  --  GRand Unified Bootloader
+ *  Copyright (C) 2021  IBM Corporation
+ *
+ *  GRUB is free software: you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation, either version 3 of the License, or
+ *  (at your option) any later version.
+ *
+ *  GRUB is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with GRUB.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+#include <grub/dl.h>
+#include <grub/misc.h>
+#include <grub/command.h>
+#include <grub/i18n.h>
+#include <grub/memory.h>
+#include <grub/mm.h>
+
+GRUB_MOD_LICENSE ("GPLv3+");
+
+#ifdef MM_DEBUG
+
+static grub_err_t
+grub_cmd_lsmem (grub_command_t cmd __attribute__ ((unused)),
+		 int argc __attribute__ ((unused)),
+		 char **args __attribute__ ((unused)))
+
+{
+#ifndef GRUB_MACHINE_EMU
+  grub_mm_dump(0);
+#endif
+
+  return 0;
+}
+
+static grub_err_t
+grub_cmd_lsfreemem (grub_command_t cmd __attribute__ ((unused)),
+		    int argc __attribute__ ((unused)),
+		    char **args __attribute__ ((unused)))
+
+{
+#ifndef GRUB_MACHINE_EMU
+  grub_mm_dump_free();
+#endif
+
+  return 0;
+}
+
+
+#define BIG_ALLOC (64 * 1024 * 1024)
+#define SMALL_ALLOC 32
+
+static grub_err_t
+grub_cmd_stress_big_allocs (grub_command_t cmd __attribute__ ((unused)),
+			    int argc __attribute__ ((unused)),
+			    char **args __attribute__ ((unused)))
+{
+  int i, max_mb, blocks_alloced;
+  void *mem;
+  void **blocklist;
+
+  grub_printf ("Test 1: increasingly sized allocs to 1GB block\n");
+  for (i = 1; i < 1024; i++) {
+    grub_printf ("%d MB . ", i);
+    mem = grub_malloc (i * 1024 * 1024);
+    if (mem == NULL)
+      {
+	grub_printf ("failed\n");
+	break;
+      }
+    else
+      grub_free (mem);
+
+    if (i % 10 == 0)
+      grub_printf ("\n");
+  }
+
+  max_mb = i - 1;
+  grub_printf ("Max sized allocation we did was %d MB\n", max_mb);
+  
+  grub_printf ("Test 2: 1MB at a time, max 4GB\n");
+  blocklist = grub_calloc (4096, sizeof (void *));
+  for (i = 0; i < 4096; i++)
+    {
+      blocklist[i] = grub_malloc (1024 * 1024);
+      if (!blocklist[i])
+	{
+	  grub_printf ("Ran out of memory at iteration %d\n", i);
+	  break;
+	}
+    }
+  blocks_alloced = i;
+  for (i = 0; i < blocks_alloced; i++)
+    grub_free (blocklist[i]);
+
+  grub_printf ("\nTest 3: 1MB aligned 900kB + 100kB\n");
+  //grub_mm_debug=1;
+  for (i = 0; i < 4096; i += 2)
+    {
+      blocklist[i] = grub_memalign (1024 * 1024, 900 * 1024);
+      if (!blocklist[i])
+	{
+	  grub_printf ("Failed big allocation, iteration %d\n", i);
+	  blocks_alloced = i;
+	  break;
+	}
+
+      blocklist[i + 1] = grub_malloc (100 * 1024);
+      if (!blocklist[i + 1])
+	{
+	  grub_printf ("Failed small allocation, iteration %d\n", i);
+	  blocks_alloced = i + 1;
+	  break;
+	}
+      grub_printf (".");
+    }
+  for (i = 0; i < blocks_alloced; i++)
+    grub_free (blocklist[i]);
+
+  grub_free (blocklist);
+
+  grub_errno = GRUB_ERR_NONE;
+  return GRUB_ERR_NONE;
+}
+
+static grub_command_t cmd_lsmem, cmd_lsfreemem, cmd_sba;
+
+#endif /* MM_DEBUG */
+
+GRUB_MOD_INIT(memtools)
+{
+#ifdef MM_DEBUG
+  cmd_lsmem = grub_register_command ("lsmem", grub_cmd_lsmem,
+				     0, N_("List free and allocated memory blocks."));
+  cmd_lsfreemem = grub_register_command ("lsfreemem", grub_cmd_lsfreemem,
+					 0, N_("List free memory blocks."));
+  cmd_sba = grub_register_command ("stress_big_allocs", grub_cmd_stress_big_allocs,
+  				   0, N_("Stress test large allocations."));
+#endif
+}
+
+GRUB_MOD_FINI(memtools)
+{
+#ifdef MM_DEBUG
+  grub_unregister_command (cmd_lsmem);
+  grub_unregister_command (cmd_lsfreemem);
+  grub_unregister_command (cmd_sba);
+#endif
+}
diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
index 835ed8a8f6f9..032c8f71aed2 100644
--- a/grub-core/kern/mm.c
+++ b/grub-core/kern/mm.c
@@ -556,6 +556,8 @@ grub_mm_dump_free (void)
     {
       grub_mm_header_t p;
 
+      grub_printf ("Region %p (size %" PRIuGRUB_SIZE ")\n\n", r, r->size);
+
       /* Follow the free list.  */
       p = r->first;
       do
@@ -583,6 +585,8 @@ grub_mm_dump (unsigned lineno)
     {
       grub_mm_header_t p;
 
+      grub_printf ("Region %p (size %" PRIuGRUB_SIZE ")\n\n", r, r->size);
+
       for (p = (grub_mm_header_t) ALIGN_UP ((grub_addr_t) (r + 1),
 					    GRUB_MM_ALIGN);
 	   (grub_addr_t) p < (grub_addr_t) (r+1) + r->size;
diff --git a/include/grub/mm.h b/include/grub/mm.h
index 9c38dd3ca5d2..44fde7cb9033 100644
--- a/include/grub/mm.h
+++ b/include/grub/mm.h
@@ -46,8 +46,8 @@ void grub_mm_check_real (const char *file, int line);
 /* Set this variable to 1 when you want to trace all memory function calls.  */
 extern int EXPORT_VAR(grub_mm_debug);
 
-void grub_mm_dump_free (void);
-void grub_mm_dump (unsigned lineno);
+void EXPORT_FUNC(grub_mm_dump_free) (void);
+void EXPORT_FUNC(grub_mm_dump) (unsigned lineno);
 
 #define grub_calloc(nmemb, size)	\
   grub_debug_calloc (GRUB_FILE, __LINE__, nmemb, size)
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 08/19] mm: Drop unused unloading of modules on OOM
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (6 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 07/19] Add memtool module with memory allocation stress-test Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-11-09 13:15   ` Daniel Kiper
  2021-10-12  7:29 ` [PATCH 09/19] mm: Allow dynamically requesting additional memory regions Daniel Axtens
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper

From: Patrick Steinhardt <ps@pks.im>

In `grub_memalign ()`, there's a commented section which would allow for
unloading of unneeded modules in case where there is not enough free
memory available to satisfy a request. Given that this code is never
compiled in, let's remove it together with `grub_dl_unload_unneeded()`

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 grub-core/kern/dl.c | 20 --------------------
 grub-core/kern/mm.c |  8 --------
 include/grub/dl.h   |  1 -
 3 files changed, 29 deletions(-)

diff --git a/grub-core/kern/dl.c b/grub-core/kern/dl.c
index 48f8a79073dd..a62dbeebbfe9 100644
--- a/grub-core/kern/dl.c
+++ b/grub-core/kern/dl.c
@@ -803,23 +803,3 @@ grub_dl_unload (grub_dl_t mod)
   grub_free (mod);
   return 1;
 }
-
-/* Unload unneeded modules.  */
-void
-grub_dl_unload_unneeded (void)
-{
-  /* Because grub_dl_remove modifies the list of modules, this
-     implementation is tricky.  */
-  grub_dl_t p = grub_dl_head;
-
-  while (p)
-    {
-      if (grub_dl_unload (p))
-	{
-	  p = grub_dl_head;
-	  continue;
-	}
-
-      p = p->next;
-    }
-}
diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
index 032c8f71aed2..6038c0d0cbb2 100644
--- a/grub-core/kern/mm.c
+++ b/grub-core/kern/mm.c
@@ -377,14 +377,6 @@ grub_memalign (grub_size_t align, grub_size_t size)
       count++;
       goto again;
 
-#if 0
-    case 1:
-      /* Unload unneeded modules.  */
-      grub_dl_unload_unneeded ();
-      count++;
-      goto again;
-#endif
-
     default:
       break;
     }
diff --git a/include/grub/dl.h b/include/grub/dl.h
index b3753c9ca262..536717776367 100644
--- a/include/grub/dl.h
+++ b/include/grub/dl.h
@@ -203,7 +203,6 @@ grub_dl_t EXPORT_FUNC(grub_dl_load) (const char *name);
 grub_dl_t grub_dl_load_core (void *addr, grub_size_t size);
 grub_dl_t EXPORT_FUNC(grub_dl_load_core_noinit) (void *addr, grub_size_t size);
 int EXPORT_FUNC(grub_dl_unload) (grub_dl_t mod);
-extern void grub_dl_unload_unneeded (void);
 extern int EXPORT_FUNC(grub_dl_ref) (grub_dl_t mod);
 extern int EXPORT_FUNC(grub_dl_unref) (grub_dl_t mod);
 extern int EXPORT_FUNC(grub_dl_ref_count) (grub_dl_t mod);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 09/19] mm: Allow dynamically requesting additional memory regions
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (7 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 08/19] mm: Drop unused unloading of modules on OOM Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-11-09 13:25   ` Daniel Kiper
  2021-10-12  7:29 ` [PATCH 10/19] efi: mm: Always request a fixed number of pages on init Daniel Axtens
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

From: Patrick Steinhardt <ps@pks.im>

Currently, all platforms will set up their heap on initialization of the
platform code. While this works mostly fine, it poses some limitations
on memory management on us. Most notably, allocating big chunks of
memory in the gigabyte range would require us to pre-request this many
bytes from the firmware and add it to the heap from the beginning on
some platforms like EFI. As this isn't needed for most configurations,
it is inefficient and may even negatively impact some usecases when,
e.g., chainloading. Nonetheless, allocating big chunks of memory is
required sometimes, where one example is the upcoming support for the
Argon2 key derival function in LUKS2.

In order to avoid pre-allocating big chunks of memory, this commit
implements a runtime mechanism to add more pages to the system. When a
given allocation cannot be currently satisfied, we'll call a given
callback set up by the platform's own memory management subsystem,
asking it to add a memory area with at least `n` bytes. If this
succeeds, we retry searching for a valid memory region, which should now
succeed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
[dja: add this to the documentation at the top of mm.c]
Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 grub-core/kern/mm.c | 13 +++++++++++++
 include/grub/mm.h   | 16 ++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
index 6038c0d0cbb2..58d5b89e8860 100644
--- a/grub-core/kern/mm.c
+++ b/grub-core/kern/mm.c
@@ -28,6 +28,9 @@
   - multiple regions may be used as free space. They may not be
   contiguous.
 
+  - if existing regions are insufficient to satisfy an allocation, a new
+  region can be requested from firmware.
+
   Regions are managed by a singly linked list, and the meta information is
   stored in the beginning of each region. Space after the meta information
   is used to allocate memory.
@@ -81,6 +84,7 @@
 \f
 
 grub_mm_region_t grub_mm_base;
+grub_mm_add_region_func_t grub_mm_add_region_fn;
 
 /* Get a header from the pointer PTR, and set *P and *R to a pointer
    to the header and a pointer to its region, respectively. PTR must
@@ -377,6 +381,15 @@ grub_memalign (grub_size_t align, grub_size_t size)
       count++;
       goto again;
 
+    case 1:
+      /* Request additional pages.  */
+      count++;
+
+      if (grub_mm_add_region_fn && grub_mm_add_region_fn (size, GRUB_MM_ADD_REGION_CONSECUTIVE) == GRUB_ERR_NONE)
+	goto again;
+
+      /* fallthrough */
+
     default:
       break;
     }
diff --git a/include/grub/mm.h b/include/grub/mm.h
index 44fde7cb9033..5d916809666c 100644
--- a/include/grub/mm.h
+++ b/include/grub/mm.h
@@ -20,6 +20,7 @@
 #ifndef GRUB_MM_H
 #define GRUB_MM_H	1
 
+#include <grub/err.h>
 #include <grub/types.h>
 #include <grub/symbol.h>
 #include <config.h>
@@ -28,6 +29,21 @@
 # define NULL	((void *) 0)
 #endif
 
+#define GRUB_MM_ADD_REGION_NONE        0
+#define GRUB_MM_ADD_REGION_CONSECUTIVE (1 << 0)
+
+/*
+ * Function used to request memory regions of `grub_size_t` bytes. The second
+ * parameter is a bitfield of `GRUB_MM_ADD_REGION` flags.
+ */
+typedef grub_err_t (*grub_mm_add_region_func_t) (grub_size_t, unsigned int);
+
+/*
+ * Set this function pointer to enable adding memory-regions at runtime in case
+ * a memory allocation cannot be satisfied with existing regions.
+ */
+extern grub_mm_add_region_func_t EXPORT_VAR(grub_mm_add_region_fn);
+
 void grub_mm_init_region (void *addr, grub_size_t size);
 void *EXPORT_FUNC(grub_calloc) (grub_size_t nmemb, grub_size_t size);
 void *EXPORT_FUNC(grub_malloc) (grub_size_t size);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 10/19] efi: mm: Always request a fixed number of pages on init
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (8 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 09/19] mm: Allow dynamically requesting additional memory regions Daniel Axtens
@ 2021-10-12  7:29 ` Daniel Axtens
  2021-11-09 13:32   ` Daniel Kiper
  2021-10-12  7:30 ` [PATCH 11/19] efi: mm: Extract function to add memory regions Daniel Axtens
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:29 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper

From: Patrick Steinhardt <ps@pks.im>

When initializing the EFI memory subsytem, we will by default request a
quarter of the available memory, bounded by a minimum/maximum value.
Given that we're about to extend the EFI memory system to dynamically
request additional pages from the firmware as required, this scaling of
requested memory based on available memory will not make a lot of sense
anymore.

Remove this logic as a preparatory patch such that we'll instead defer
to the runtime memory allocator. Note that ideally, we'd want to change
this after dynamic requesting of pages has been implemented for the EFI
platform. But because we'll need to split up initialization of the
memory subsystem and the request of pages from the firmware, we'd have
to duplicate quite some logic at first only to remove it afterwards
again. This seems quite pointless, so we instead have patches slightly
out of order.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 grub-core/kern/efi/mm.c | 35 +++--------------------------------
 1 file changed, 3 insertions(+), 32 deletions(-)

diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
index 9838fb2f50db..4d276bc87a4c 100644
--- a/grub-core/kern/efi/mm.c
+++ b/grub-core/kern/efi/mm.c
@@ -38,9 +38,8 @@
    a multiplier of 4KB.  */
 #define MEMORY_MAP_SIZE	0x3000
 
-/* The minimum and maximum heap size for GRUB itself.  */
-#define MIN_HEAP_SIZE	0x100000
-#define MAX_HEAP_SIZE	(1600 * 0x100000)
+/* The default heap size for GRUB itself in bytes.  */
+#define DEFAULT_HEAP_SIZE	0x100000
 
 static void *finish_mmap_buf = 0;
 static grub_efi_uintn_t finish_mmap_size = 0;
@@ -478,23 +477,6 @@ filter_memory_map (grub_efi_memory_descriptor_t *memory_map,
   return filtered_desc;
 }
 
-/* Return the total number of pages.  */
-static grub_efi_uint64_t
-get_total_pages (grub_efi_memory_descriptor_t *memory_map,
-		 grub_efi_uintn_t desc_size,
-		 grub_efi_memory_descriptor_t *memory_map_end)
-{
-  grub_efi_memory_descriptor_t *desc;
-  grub_efi_uint64_t total = 0;
-
-  for (desc = memory_map;
-       desc < memory_map_end;
-       desc = NEXT_MEMORY_DESCRIPTOR (desc, desc_size))
-    total += desc->num_pages;
-
-  return total;
-}
-
 /* Add memory regions.  */
 static void
 add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
@@ -583,8 +565,6 @@ grub_efi_mm_init (void)
   grub_efi_memory_descriptor_t *filtered_memory_map_end;
   grub_efi_uintn_t map_size;
   grub_efi_uintn_t desc_size;
-  grub_efi_uint64_t total_pages;
-  grub_efi_uint64_t required_pages;
   int mm_status;
 
   /* Prepare a memory region to store two memory maps.  */
@@ -624,22 +604,13 @@ grub_efi_mm_init (void)
   filtered_memory_map_end = filter_memory_map (memory_map, filtered_memory_map,
 					       desc_size, memory_map_end);
 
-  /* By default, request a quarter of the available memory.  */
-  total_pages = get_total_pages (filtered_memory_map, desc_size,
-				 filtered_memory_map_end);
-  required_pages = (total_pages >> 2);
-  if (required_pages < BYTES_TO_PAGES (MIN_HEAP_SIZE))
-    required_pages = BYTES_TO_PAGES (MIN_HEAP_SIZE);
-  else if (required_pages > BYTES_TO_PAGES (MAX_HEAP_SIZE))
-    required_pages = BYTES_TO_PAGES (MAX_HEAP_SIZE);
-
   /* Sort the filtered descriptors, so that GRUB can allocate pages
      from smaller regions.  */
   sort_memory_map (filtered_memory_map, desc_size, filtered_memory_map_end);
 
   /* Allocate memory regions for GRUB's memory management.  */
   add_memory_regions (filtered_memory_map, desc_size,
-		      filtered_memory_map_end, required_pages);
+		      filtered_memory_map_end, BYTES_TO_PAGES (DEFAULT_HEAP_SIZE));
 
 #if 0
   /* For debug.  */
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 11/19] efi: mm: Extract function to add memory regions
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (9 preceding siblings ...)
  2021-10-12  7:29 ` [PATCH 10/19] efi: mm: Always request a fixed number of pages on init Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-10-19 21:39   ` Glenn Washburn
                     ` (2 more replies)
  2021-10-12  7:30 ` [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()` Daniel Axtens
                   ` (7 subsequent siblings)
  18 siblings, 3 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper

From: Patrick Steinhardt <ps@pks.im>

In preparation of support for runtime-allocating additional memory
region, this patch extracts the function to retrieve the EFI memory map
and add a subset of it to GRUB's own memory regions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 grub-core/kern/efi/mm.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
index 4d276bc87a4c..cfc6a67fc0cc 100644
--- a/grub-core/kern/efi/mm.c
+++ b/grub-core/kern/efi/mm.c
@@ -504,7 +504,7 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
 
       addr = grub_efi_allocate_pages_real (start, pages,
 					   GRUB_EFI_ALLOCATE_ADDRESS,
-					   GRUB_EFI_LOADER_CODE);      
+					   GRUB_EFI_LOADER_CODE);
       if (! addr)
 	grub_fatal ("cannot allocate conventional memory %p with %u pages",
 		    (void *) ((grub_addr_t) start),
@@ -556,8 +556,8 @@ print_memory_map (grub_efi_memory_descriptor_t *memory_map,
 }
 #endif
 
-void
-grub_efi_mm_init (void)
+static grub_err_t
+grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
 {
   grub_efi_memory_descriptor_t *memory_map;
   grub_efi_memory_descriptor_t *memory_map_end;
@@ -570,7 +570,7 @@ grub_efi_mm_init (void)
   /* Prepare a memory region to store two memory maps.  */
   memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (MEMORY_MAP_SIZE));
   if (! memory_map)
-    grub_fatal ("cannot allocate memory");
+    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");
 
   /* Obtain descriptors for available memory.  */
   map_size = MEMORY_MAP_SIZE;
@@ -588,14 +588,14 @@ grub_efi_mm_init (void)
 
       memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (map_size));
       if (! memory_map)
-	grub_fatal ("cannot allocate memory");
+	return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");
 
       mm_status = grub_efi_get_memory_map (&map_size, memory_map, 0,
 					   &desc_size, 0);
     }
 
   if (mm_status < 0)
-    grub_fatal ("cannot get memory map");
+    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot get memory map");
 
   memory_map_end = NEXT_MEMORY_DESCRIPTOR (memory_map, map_size);
 
@@ -610,7 +610,7 @@ grub_efi_mm_init (void)
 
   /* Allocate memory regions for GRUB's memory management.  */
   add_memory_regions (filtered_memory_map, desc_size,
-		      filtered_memory_map_end, BYTES_TO_PAGES (DEFAULT_HEAP_SIZE));
+		      filtered_memory_map_end, BYTES_TO_PAGES (required_bytes));
 
 #if 0
   /* For debug.  */
@@ -628,6 +628,15 @@ grub_efi_mm_init (void)
   /* Release the memory maps.  */
   grub_efi_free_pages ((grub_addr_t) memory_map,
 		       2 * BYTES_TO_PAGES (MEMORY_MAP_SIZE));
+
+  return GRUB_ERR_NONE;
+}
+
+void
+grub_efi_mm_init (void)
+{
+  if (grub_efi_mm_add_regions (DEFAULT_HEAP_SIZE) != GRUB_ERR_NONE)
+    grub_fatal ("%s", grub_errmsg);
 }
 
 #if defined (__aarch64__) || defined (__arm__) || defined (__riscv)
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()`
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (10 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 11/19] efi: mm: Extract function to add memory regions Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-10-19 21:37   ` Glenn Washburn
  2021-11-09 16:10   ` Daniel Kiper
  2021-10-12  7:30 ` [PATCH 13/19] efi: mm: Implement runtime addition of pages Daniel Axtens
                   ` (6 subsequent siblings)
  18 siblings, 2 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper

From: Patrick Steinhardt <ps@pks.im>

The function `add_memory_regions ()` is currently only called on system
initialization to allocate a fixed amount of pages. As such, it didn't
need to return any errors: in case it failed, we cannot proceed anyway.
This will change with the upcoming support for requesting more memory
from the firmware at runtime, where it doesn't make sense anymore to
fail hard.

Refactor the function to return an error to prepare for this. Note that
this does not change the behaviour when initializing the memory system
because `grub_efi_mm_init ()` knows to call `grub_fatal ()` in case
`grub_efi_mm_add_regions ()` returns an error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 grub-core/kern/efi/mm.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
index cfc6a67fc0cc..ced3ee5e7537 100644
--- a/grub-core/kern/efi/mm.c
+++ b/grub-core/kern/efi/mm.c
@@ -478,7 +478,7 @@ filter_memory_map (grub_efi_memory_descriptor_t *memory_map,
 }
 
 /* Add memory regions.  */
-static void
+static grub_err_t
 add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
 		    grub_efi_uintn_t desc_size,
 		    grub_efi_memory_descriptor_t *memory_map_end,
@@ -506,9 +506,9 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
 					   GRUB_EFI_ALLOCATE_ADDRESS,
 					   GRUB_EFI_LOADER_CODE);
       if (! addr)
-	grub_fatal ("cannot allocate conventional memory %p with %u pages",
-		    (void *) ((grub_addr_t) start),
-		    (unsigned) pages);
+	return grub_error (GRUB_ERR_OUT_OF_MEMORY,
+			    "cannot allocate conventional memory %p with %u pages",
+			    (void *) ((grub_addr_t) start), (unsigned) pages);
 
       grub_mm_init_region (addr, PAGES_TO_BYTES (pages));
 
@@ -518,7 +518,9 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
     }
 
   if (required_pages > 0)
-    grub_fatal ("too little memory");
+    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "too little memory");
+
+  return GRUB_ERR_NONE;
 }
 
 void
@@ -565,6 +567,7 @@ grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
   grub_efi_memory_descriptor_t *filtered_memory_map_end;
   grub_efi_uintn_t map_size;
   grub_efi_uintn_t desc_size;
+  grub_err_t err;
   int mm_status;
 
   /* Prepare a memory region to store two memory maps.  */
@@ -609,8 +612,11 @@ grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
   sort_memory_map (filtered_memory_map, desc_size, filtered_memory_map_end);
 
   /* Allocate memory regions for GRUB's memory management.  */
-  add_memory_regions (filtered_memory_map, desc_size,
-		      filtered_memory_map_end, BYTES_TO_PAGES (required_bytes));
+  err = add_memory_regions (filtered_memory_map, desc_size,
+			    filtered_memory_map_end,
+			    BYTES_TO_PAGES (required_bytes));
+  if (err != GRUB_ERR_NONE)
+    return err;
 
 #if 0
   /* For debug.  */
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 13/19] efi: mm: Implement runtime addition of pages
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (11 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()` Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-11-09 16:13   ` Daniel Kiper
  2021-10-12  7:30 ` [PATCH 14/19] ieee1275: request memory with ibm, client-architecture-support Daniel Axtens
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper

From: Patrick Steinhardt <ps@pks.im>

Adjust the interface of `grub_efi_mm_add_regions ()` to take a set of
`GRUB_MM_ADD_REGION_*` flags, which most notably is currently only the
`CONSECUTVE` flag. This allows us to set the function up as callback for
the memory subsystem and have it call out to us in case there's not
enough pages available in the current heap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 grub-core/kern/efi/mm.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
index ced3ee5e7537..f3d2e2b99830 100644
--- a/grub-core/kern/efi/mm.c
+++ b/grub-core/kern/efi/mm.c
@@ -482,7 +482,8 @@ static grub_err_t
 add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
 		    grub_efi_uintn_t desc_size,
 		    grub_efi_memory_descriptor_t *memory_map_end,
-		    grub_efi_uint64_t required_pages)
+		    grub_efi_uint64_t required_pages,
+		    unsigned int flags)
 {
   grub_efi_memory_descriptor_t *desc;
 
@@ -496,6 +497,10 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
 
       start = desc->physical_start;
       pages = desc->num_pages;
+
+      if (pages < required_pages && (flags & GRUB_MM_ADD_REGION_CONSECUTIVE))
+	continue;
+
       if (pages > required_pages)
 	{
 	  start += PAGES_TO_BYTES (pages - required_pages);
@@ -559,7 +564,7 @@ print_memory_map (grub_efi_memory_descriptor_t *memory_map,
 #endif
 
 static grub_err_t
-grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
+grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes, unsigned int flags)
 {
   grub_efi_memory_descriptor_t *memory_map;
   grub_efi_memory_descriptor_t *memory_map_end;
@@ -614,7 +619,8 @@ grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
   /* Allocate memory regions for GRUB's memory management.  */
   err = add_memory_regions (filtered_memory_map, desc_size,
 			    filtered_memory_map_end,
-			    BYTES_TO_PAGES (required_bytes));
+			    BYTES_TO_PAGES (required_bytes),
+			    flags);
   if (err != GRUB_ERR_NONE)
     return err;
 
@@ -641,8 +647,9 @@ grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
 void
 grub_efi_mm_init (void)
 {
-  if (grub_efi_mm_add_regions (DEFAULT_HEAP_SIZE) != GRUB_ERR_NONE)
+  if (grub_efi_mm_add_regions (DEFAULT_HEAP_SIZE, GRUB_MM_ADD_REGION_NONE) != GRUB_ERR_NONE)
     grub_fatal ("%s", grub_errmsg);
+  grub_mm_add_region_fn = grub_efi_mm_add_regions;
 }
 
 #if defined (__aarch64__) || defined (__arm__) || defined (__riscv)
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 14/19] ieee1275: request memory with ibm, client-architecture-support
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (12 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 13/19] efi: mm: Implement runtime addition of pages Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-10-12  7:30 ` [PATCH 15/19] ieee1275: drop len -= 1 quirk in heap_init Daniel Axtens
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

On PowerVM, the first time we boot a Linux partition, we may only get
256MB of real memory area, even if the partition has more memory.

This isn't really enough. Fortunately, the Power Architecture Platform
Reference (PAPR) defines a method we can call to ask for more memory.
This is part of the broad and powerful ibm,client-architecture-support
(CAS) method.

CAS can do an enormous amount of things on a PAPR platform: as well as
asking for memory, you can set the supported processor level, the interrupt
controller, hash vs radix mmu, and so on. We want to touch as little of
this as possible because we don't want to step on the toes of the future OS.

If:

 - we are running under what we think is PowerVM (compatible property of /
   begins with "IBM"), and

 - the full amount of RMA is less than 512MB (as determined by the reg
   property of /memory)

then call CAS as follows: (refer to the Linux on Power Architecture
Reference, LoPAR, which is public, at B.5.2.3):

 - Use the "any" PVR value and supply 2 option vectors.

 - Set option vector 1 (PowerPC Server Processor Architecture Level)
   to "ignore".

 - Set option vector 2 with default or Linux-like options, including a
   min-rma-size of 512MB.

This will cause a CAS reboot and the partition will restart with 512MB
of RMA. Grub will notice the 512MB and not call CAS again.

(A partition can be configured with only 256MB of memory, which would
mean this request couldn't be satisfied, but PFW refuses to load with
only 256MB of memory, so it's a bit moot. SLOF will run fine with 256MB,
but we will never call CAS under qemu/SLOF because /compatible won't
begin with "IBM".)

One of the first things Linux does while still running under OpenFirmware
is to call CAS with a much fuller set of options (including asking for
512MB of memory). This includes a much more restrictive set of PVR values
and processor support levels, and this will induce another reboot. On this
reboot grub will again notice the higher RMA, and not call CAS. We will get
to Linux, Linux will call CAS but because the values are now set for Linux
this will not induce another CAS reboot and we will finally boot.

On all subsequent boots, everything will be configured with 512MB of RMA
and all the settings Linux likes, so there will be no further CAS reboots.

(phyp is super sticky with the RMA size - it persists even on cold boots.
So if you've ever booted Linux in a partition, you'll probably never have
grub call CAS. It'll only ever fire the first time a partition loads grub,
or if you deliberately lower the amount of memory your partition has below
512MB.)

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 grub-core/kern/ieee1275/cmain.c  |   3 +
 grub-core/kern/ieee1275/init.c   | 140 +++++++++++++++++++++++++++++++
 include/grub/ieee1275/ieee1275.h |   6 ++
 3 files changed, 149 insertions(+)

diff --git a/grub-core/kern/ieee1275/cmain.c b/grub-core/kern/ieee1275/cmain.c
index 4442b6a83193..b707798ec3fb 100644
--- a/grub-core/kern/ieee1275/cmain.c
+++ b/grub-core/kern/ieee1275/cmain.c
@@ -123,6 +123,9 @@ grub_ieee1275_find_options (void)
 	      break;
 	    }
 	}
+
+      if (grub_strncmp (tmp, "IBM,", 4) == 0)
+	grub_ieee1275_set_flag (GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY);
     }
 
   if (is_smartfirmware)
diff --git a/grub-core/kern/ieee1275/init.c b/grub-core/kern/ieee1275/init.c
index 8d944dae083e..46b1aee732d3 100644
--- a/grub-core/kern/ieee1275/init.c
+++ b/grub-core/kern/ieee1275/init.c
@@ -200,11 +200,151 @@ heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
   return 0;
 }
 
+/* How much memory does OF believe it has? (regardless of whether
+   it's accessible or not) */
+static grub_err_t
+grub_ieee1275_total_mem (grub_uint64_t *total)
+{
+  grub_ieee1275_phandle_t root;
+  grub_ieee1275_phandle_t memory;
+  grub_uint32_t reg[4];
+  grub_ssize_t reg_size;
+  grub_uint32_t address_cells = 1;
+  grub_uint32_t size_cells = 1;
+  grub_uint64_t size;
+
+  /* If we fail to get to the end, report 0. */
+  *total = 0;
+
+  /* Determine the format of each entry in `reg'.  */
+  grub_ieee1275_finddevice ("/", &root);
+  grub_ieee1275_get_integer_property (root, "#address-cells", &address_cells,
+				      sizeof address_cells, 0);
+  grub_ieee1275_get_integer_property (root, "#size-cells", &size_cells,
+				      sizeof size_cells, 0);
+
+  if (size_cells > address_cells)
+    address_cells = size_cells;
+
+  /* Load `/memory/reg'.  */
+  if (grub_ieee1275_finddevice ("/memory", &memory))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE,
+		       "couldn't find /memory node");
+  if (grub_ieee1275_get_integer_property (memory, "reg", reg,
+					  sizeof reg, &reg_size))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE,
+		       "couldn't examine /memory/reg property");
+  if (reg_size < 0 || (grub_size_t) reg_size > sizeof (reg))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE,
+                       "/memory response buffer exceeded");
+
+  if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_BROKEN_ADDRESS_CELLS))
+    {
+      address_cells = 1;
+      size_cells = 1;
+    }
+
+  /* Decode only the size */
+  size = reg[address_cells];
+  if (size_cells == 2)
+    size = (size << 32) | reg[address_cells + 1];
+
+  *total = size;
+
+  return grub_errno;
+}
+
+/* See PAPR or arch/powerpc/kernel/prom_init.c */
+struct option_vector2 {
+	grub_uint8_t byte1;
+	grub_uint16_t reserved;
+	grub_uint32_t real_base;
+	grub_uint32_t real_size;
+	grub_uint32_t virt_base;
+	grub_uint32_t virt_size;
+	grub_uint32_t load_base;
+	grub_uint32_t min_rma;
+	grub_uint32_t min_load;
+	grub_uint8_t min_rma_percent;
+	grub_uint8_t max_pft_size;
+} __attribute__((packed));
+
+struct pvr_entry {
+	  grub_uint32_t mask;
+	  grub_uint32_t entry;
+};
+
+struct cas_vector {
+    struct {
+      struct pvr_entry terminal;
+    } pvr_list;
+    grub_uint8_t num_vecs;
+    grub_uint8_t vec1_size;
+    grub_uint8_t vec1;
+    grub_uint8_t vec2_size;
+    struct option_vector2 vec2;
+} __attribute__((packed));
+
+/* Call ibm,client-architecture-support to try to get more RMA.
+   We ask for 512MB which should be enough to verify a distro kernel.
+   We ignore most errors: if we don't succeed we'll proceed with whatever
+   memory we have. */
+static void
+grub_ieee1275_ibm_cas (void)
+{
+  int rc;
+  grub_ieee1275_ihandle_t root;
+  struct cas_args {
+    struct grub_ieee1275_common_hdr common;
+    grub_ieee1275_cell_t method;
+    grub_ieee1275_ihandle_t ihandle;
+    grub_ieee1275_cell_t cas_addr;
+    grub_ieee1275_cell_t result;
+  } args;
+  struct cas_vector vector = {
+    .pvr_list = { { 0x00000000, 0xffffffff } }, /* any processor */
+    .num_vecs = 2 - 1,
+    .vec1_size = 0,
+    .vec1 = 0x80, /* ignore */
+    .vec2_size = 1 + sizeof(struct option_vector2) - 2,
+    .vec2 = {
+      0, 0, -1, -1, -1, -1, -1, 512, -1, 0, 48
+    },
+  };
+
+  INIT_IEEE1275_COMMON (&args.common, "call-method", 3, 2);
+  args.method = (grub_ieee1275_cell_t)"ibm,client-architecture-support";
+  rc = grub_ieee1275_open("/", &root);
+  if (rc) {
+	  grub_error (GRUB_ERR_IO, "could not open root when trying to call CAS");
+	  return;
+  }
+  args.ihandle = root;
+  args.cas_addr = (grub_ieee1275_cell_t)&vector;
+
+  grub_printf("Calling ibm,client-architecture-support...");
+  IEEE1275_CALL_ENTRY_FN (&args);
+  grub_printf("done\n");
+
+  grub_ieee1275_close(root);
+}
+
 static void 
 grub_claim_heap (void)
 {
   unsigned long total = 0;
 
+  if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY))
+    {
+      grub_uint64_t rma_size;
+      grub_err_t err;
+
+      err = grub_ieee1275_total_mem (&rma_size);
+      /* if we have an error, don't call CAS, just hope for the best */
+      if (!err && rma_size < (512 * 1024 * 1024))
+	grub_ieee1275_ibm_cas();
+    }
+
   grub_machine_mmap_iterate (heap_init, &total);
 }
 #endif
diff --git a/include/grub/ieee1275/ieee1275.h b/include/grub/ieee1275/ieee1275.h
index 6ce8605d6daf..1752c58e8ef6 100644
--- a/include/grub/ieee1275/ieee1275.h
+++ b/include/grub/ieee1275/ieee1275.h
@@ -128,6 +128,12 @@ enum grub_ieee1275_flag
   GRUB_IEEE1275_FLAG_CURSORONOFF_ANSI_BROKEN,
 
   GRUB_IEEE1275_FLAG_RAW_DEVNAMES,
+
+  /* On PFW, the first time we boot a Linux partition, we may only get 256MB
+     of real memory area, even if the partition has more memory. Set this flag
+     if we think we're running under PFW. Then, if this flag is set, and the
+     RMA is only 256MB in size, try asking for more with CAS. */
+  GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY,
 };
 
 extern int EXPORT_FUNC(grub_ieee1275_test_flag) (enum grub_ieee1275_flag flag);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 15/19] ieee1275: drop len -= 1 quirk in heap_init
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (13 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 14/19] ieee1275: request memory with ibm, client-architecture-support Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-10-12  7:30 ` [PATCH 16/19] ieee1275: support runtime memory claiming Daniel Axtens
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

This was apparently 'required by some firmware': commit dc9468500919
("2007-02-12  Hollis Blanchard  <hollis@penguinppc.org>").

It's not clear what firmware that was, and what platform from 14 years ago
which exhibited the bug then is still both in use and buggy now.

It doesn't cause issues on qemu (mac99 or pseries) or under PFW for Power8.

I don't have access to old Mac hardware, but if anyone feels especially
strongly we can put it under some feature flag. I really want to disable
it under pseries because it will mess with region merging.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 grub-core/kern/ieee1275/init.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grub-core/kern/ieee1275/init.c b/grub-core/kern/ieee1275/init.c
index 46b1aee732d3..bf2cd200893d 100644
--- a/grub-core/kern/ieee1275/init.c
+++ b/grub-core/kern/ieee1275/init.c
@@ -166,7 +166,6 @@ heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
 	  addr = 0x180000;
 	}
     }
-  len -= 1; /* Required for some firmware.  */
 
   /* Never exceed HEAP_MAX_SIZE  */
   if (*total + len > HEAP_MAX_SIZE)
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 16/19] ieee1275: support runtime memory claiming
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (14 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 15/19] ieee1275: drop len -= 1 quirk in heap_init Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-10-12  7:30 ` [PATCH 17/19] [not for merge] print more debug info in mm Daniel Axtens
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

On powerpc-ieee1275, we are running out of memory trying to verify
anything. This is because:

 - we have to load an entire file into memory to verify it. This is
   extremely difficult to change with appended signatures.
 - We only have 32MB of heap.
 - Distro kernels are now often around 30MB.

So we want to be able to claim more memory from OpenFirmware for our heap
at runtime.

There are some complications:

 - The grub mm code isn't the only thing that will make claims on
   memory from OpenFirmware:

    * PFW/SLOF will have claimed some for their own use.

    * The ieee1275 loader will try to find other bits of memory that we
      haven't claimed to place the kernel and initrd when we go to boot.

    * Once we load Linux, it will also try to claim memory. It claims
      memory without any reference to /memory/available, it just starts
      at min(top of RMO, 768MB) and works down. So we need to avoid this
      area. See arch/powerpc/kernel/prom_init.c as of v5.11.

 - The smallest amount of memory a ppc64 KVM guest can have is 256MB.
   It doesn't work with distro kernels but can work with custom kernels.
   We should maintain support for that. (ppc32 can boot with even less,
   and we shouldn't break that either.)

 - Even if a VM has more memory, the memory OpenFirmware makes available
   as Real Memory Area can be restricted. Even with our CAS work, an LPAR
   on a PowerVM box is likely to have only 512MB available to OpenFirmware
   even if it has many gigabytes of memory allocated.

The current EFI approach is in flux. Previously, EFI systems would attempt
to allocate 1/4th of the available memory, clamped to between 1M and 1600M.
Now EFI seems to be moving towards a base allocation plus runtime allocations.

What should we do?

We don't know in advance how big the kernel and initrd are going to be,
which makes figuring out how much memory we can take a bit tricky.

To figure out how much memory we should leave unused, I looked at:

 - an Ubuntu 20.04.1 ppc64le pseries KVM guest:
    vmlinux: ~30MB
    initrd:  ~50MB

 - a RHEL8.2 ppc64le pseries KVM guest:
    vmlinux: ~30MB
    initrd:  ~30MB

So to give us a little wriggle room, I think we want to leave at least
128MB for the loader to put vmlinux and initrd in memory and leave Linux
with space to satisfy its early allocations.

Other space can now be allocated at runtime.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 docs/grub-dev.texi             |   7 +-
 grub-core/kern/ieee1275/init.c | 179 +++++++++++++++++++++++++++++----
 2 files changed, 165 insertions(+), 21 deletions(-)

diff --git a/docs/grub-dev.texi b/docs/grub-dev.texi
index fb2cc965ed80..d576d17e3be2 100644
--- a/docs/grub-dev.texi
+++ b/docs/grub-dev.texi
@@ -1047,7 +1047,10 @@ space is limited to 4GiB. GRUB allocates pages from EFI for its heap, at most
 1.6 GiB.
 
 On i386-ieee1275 and powerpc-ieee1275 GRUB uses same stack as IEEE1275.
-It allocates at most 32MiB for its heap.
+
+On i386-ieee1275 and powerpc-ieee1275, GRUB will allocate 32MiB for its heap on
+startup. It may allocate more at runtime, as long as at least 128MiB remain free
+in OpenFirmware.
 
 On sparc64-ieee1275 stack is 256KiB and heap is 2MiB.
 
@@ -1075,7 +1078,7 @@ In short:
 @item i386-qemu               @tab 60 KiB  @tab < 4 GiB
 @item *-efi                   @tab ?       @tab < 1.6 GiB
 @item i386-ieee1275           @tab ?       @tab < 32 MiB
-@item powerpc-ieee1275        @tab ?       @tab < 32 MiB
+@item powerpc-ieee1275        @tab ?       @tab available memory - 128MiB
 @item sparc64-ieee1275        @tab 256KiB  @tab 2 MiB
 @item arm-uboot               @tab 256KiB  @tab 2 MiB
 @item mips(el)-qemu_mips      @tab 2MiB    @tab 253 MiB
diff --git a/grub-core/kern/ieee1275/init.c b/grub-core/kern/ieee1275/init.c
index bf2cd200893d..0fb7bae280df 100644
--- a/grub-core/kern/ieee1275/init.c
+++ b/grub-core/kern/ieee1275/init.c
@@ -45,13 +45,21 @@
 #include <grub/machine/kernel.h>
 #endif
 
-/* The maximum heap size we're going to claim */
+/* The maximum heap size we're going to claim at boot. Not used by sparc. */
 #ifdef __i386__
 #define HEAP_MAX_SIZE		(unsigned long) (64 * 1024 * 1024)
-#else
+#else // __powerpc__
 #define HEAP_MAX_SIZE		(unsigned long) (32 * 1024 * 1024)
 #endif
 
+/* The amount of OF space we will not claim here so as to leave space for
+   the loader and linux to service early allocations.
+
+   In 2021, Daniel Axtens claims that we should leave at least 128MB to
+   ensure we can load a stock kernel and initrd on a pseries guest with
+   a 512MB real memory area under PowerVM. */
+#define RUNTIME_MIN_SPACE (unsigned long) (128 * 1024 * 1024)
+
 extern char _start[];
 extern char _end[];
 
@@ -145,16 +153,54 @@ grub_claim_heap (void)
 				 + GRUB_KERNEL_MACHINE_STACK_SIZE), 0x200000);
 }
 #else
-/* Helper for grub_claim_heap.  */
+/* Helpers for mm on powerpc. */
+
+/* How much memory does OF believe exists in total?
+
+   This isn't necessarily the true total. It can be the total memory
+   accessible in real mode for a pseries guest, for example.
+ */
+static grub_uint64_t rmo_top;
+
+/* How much have we claimed so far? */
+static grub_uint32_t allocated_memory;
+
 static int
-heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
-	   void *data)
+count_free (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
+	    void *data)
+{
+  if (type != GRUB_MEMORY_AVAILABLE)
+    return 0;
+
+  /* Do not consider memory beyond 4GB */
+  if (addr > 0xffffffffULL)
+    return 0;
+
+  if (addr + len > 0xffffffffULL)
+    len = 0xffffffffULL - addr;
+
+  *(grub_uint32_t *)data += len;
+
+  return 0;
+}
+
+static int
+regions_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
+	      unsigned int flags, void *data)
 {
-  unsigned long *total = data;
+  grub_uint32_t total = *(grub_uint32_t *)data;
+  grub_uint32_t linux_rmo_save;
 
   if (type != GRUB_MEMORY_AVAILABLE)
     return 0;
 
+  /* Do not consider memory beyond 4GB */
+  if (addr > 0xffffffffULL)
+    return 0;
+
+  if (addr + len > 0xffffffffULL)
+    len = 0xffffffffULL - addr;
+
   if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_NO_PRE1_5M_CLAIM))
     {
       if (addr + len <= 0x180000)
@@ -167,10 +213,6 @@ heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
 	}
     }
 
-  /* Never exceed HEAP_MAX_SIZE  */
-  if (*total + len > HEAP_MAX_SIZE)
-    len = HEAP_MAX_SIZE - *total;
-
   /* In theory, firmware should already prevent this from happening by not
      listing our own image in /memory/available.  The check below is intended
      as a safeguard in case that doesn't happen.  However, it doesn't protect
@@ -182,6 +224,47 @@ heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
       len = 0;
     }
 
+  /* Linux likes to claim memory at min(RMO top, 768MB) and works down
+     without reference to /memory/available. (See prom_init.c::alloc_down)
+
+     If this block contains min(RMO top, 768MB), do not claim below that for
+     at least a few MB (this is where RTAS, SML and potentially TCEs live).
+
+     We also need to leave enough space for the DT in the RMA. (See
+     prom_init.c::alloc_up)
+
+     Finally, we also want to make sure that when grub loads the kernel,
+     it isn't going to use up all the memory we're trying to reserve! So
+     enforce our entire RUNTIME_MIN_SPACE here.
+
+     Of course, only consider this if we have more than RUNTIME_MIN_SPACE
+     to begin with. If we have <= 128MB of ram, we claim only the static
+     blocks summing to HEAP_MAX_SIZE.
+   */
+  if (rmo_top > RUNTIME_MIN_SPACE)
+    {
+      linux_rmo_save = grub_min (0x30000000, rmo_top) - RUNTIME_MIN_SPACE;
+      if ((addr < linux_rmo_save) && ((addr + len) > linux_rmo_save))
+	len = linux_rmo_save - addr;
+      else if (addr == linux_rmo_save)
+	{
+	  if (len < RUNTIME_MIN_SPACE)
+	    return 0;
+	  addr = 0x30000000;
+	  len -= RUNTIME_MIN_SPACE;
+        }
+    }
+
+  if (flags & GRUB_MM_ADD_REGION_CONSECUTIVE)
+    {
+      // only continue if we can satisfy the entire allocation
+      if (len < total)
+	return 0;
+    }
+
+  if (len > total)
+    len = total;
+
   if (len)
     {
       grub_err_t err;
@@ -190,15 +273,69 @@ heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
       if (err)
 	return err;
       grub_mm_init_region ((void *) (grub_addr_t) addr, len);
+      total -= len;
+      allocated_memory += len;
     }
 
-  *total += len;
-  if (*total >= HEAP_MAX_SIZE)
+  *(grub_uint32_t *)data = total;
+
+  if (total == 0)
     return 1;
 
   return 0;
 }
 
+static int
+heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
+	   void *data)
+{
+  return regions_claim(addr, len, type, GRUB_MM_ADD_REGION_NONE, data);
+}
+
+static int
+region_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
+	   void *data)
+{
+  return regions_claim(addr, len, type, GRUB_MM_ADD_REGION_CONSECUTIVE, data);
+}
+
+
+static grub_err_t grub_ieee1275_mm_add_region (grub_size_t size, unsigned int flags)
+{
+  grub_uint32_t total = size;
+  grub_uint32_t free_memory;
+
+  if (flags & GRUB_MM_ADD_REGION_CONSECUTIVE)
+    {
+      /* Update free memory each time, which is a bit inefficient but guards us
+	 against some driver going out to firmware and fw grabbing memory to service
+	the request */
+      grub_machine_mmap_iterate(count_free, &free_memory);
+
+      /* Ensure we leave enough space to boot
+	 (be careful of underflow here) */
+      if (free_memory <= RUNTIME_MIN_SPACE)
+	return GRUB_ERR_OUT_OF_MEMORY;
+
+      if (size > free_memory - RUNTIME_MIN_SPACE)
+       return GRUB_ERR_OUT_OF_MEMORY;
+      else
+	{
+	  total = grub_max(ALIGN_UP(size, 1024 * 1024) + 1024 * 1024, 32 * 1024 * 1024);
+	  total = grub_min(free_memory - RUNTIME_MIN_SPACE, total);
+	}
+
+      grub_machine_mmap_iterate (region_claim, &total);
+    }
+  else
+    grub_machine_mmap_iterate (heap_init, &total);
+
+  if (total == 0)
+    return GRUB_ERR_NONE;
+  else
+    return GRUB_ERR_OUT_OF_MEMORY;
+}
+
 /* How much memory does OF believe it has? (regardless of whether
    it's accessible or not) */
 static grub_err_t
@@ -331,20 +468,24 @@ grub_ieee1275_ibm_cas (void)
 static void 
 grub_claim_heap (void)
 {
-  unsigned long total = 0;
+  grub_err_t err;
+
+  err = grub_ieee1275_total_mem (&rmo_top);
+
+  /* If we cannot size the available memory, we can't be sure we're leaving
+     space for the kernel, initrd and things Linux loads early in boot. So
+     only allow further allocations from firmware on success */
+  if (err == GRUB_ERR_NONE)
+    grub_mm_add_region_fn = grub_ieee1275_mm_add_region;
 
   if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY))
     {
-      grub_uint64_t rma_size;
-      grub_err_t err;
-
-      err = grub_ieee1275_total_mem (&rma_size);
       /* if we have an error, don't call CAS, just hope for the best */
-      if (!err && rma_size < (512 * 1024 * 1024))
+      if (!err && rmo_top < (512 * 1024 * 1024))
 	grub_ieee1275_ibm_cas();
     }
 
-  grub_machine_mmap_iterate (heap_init, &total);
+  grub_ieee1275_mm_add_region (HEAP_MAX_SIZE, GRUB_MM_ADD_REGION_NONE);
 }
 #endif
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 17/19] [not for merge] print more debug info in mm
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (15 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 16/19] ieee1275: support runtime memory claiming Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-11-10 13:47   ` Daniel Kiper
  2021-10-12  7:30 ` [PATCH 18/19] [not for merge] ieee1275 debugging info Daniel Axtens
  2021-10-12  7:30 ` [PATCH 19/19] RFC: Ignore REGION_CONSECUTIVE Daniel Axtens
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

This is handy for debugging - I'm including it in case anyone else hacking
on this area finds it helpful.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 grub-core/kern/mm.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
index 58d5b89e8860..811df1ab5ebb 100644
--- a/grub-core/kern/mm.c
+++ b/grub-core/kern/mm.c
@@ -135,6 +135,9 @@ grub_mm_init_region (void *addr, grub_size_t size)
   for (p = &grub_mm_base, q = *p; q; p = &(q->next), q = *p)
     {
       /* Does this region come _before_ an existing region? */
+      grub_printf ("Extending w/ before %p + %" PRIxGRUB_SIZE " + %" PRIxGRUB_SIZE " = %p ? %s\n",
+		 (grub_uint8_t *)addr, size, q->pre_size, (grub_uint8_t *)q,
+		 (grub_uint8_t *)addr + size + q->pre_size == (grub_uint8_t *) q ? "yes" : "no");
       if ((grub_uint8_t *) addr + size + q->pre_size == (grub_uint8_t *) q)
 	{
 	  r = (grub_mm_region_t) ALIGN_UP ((grub_addr_t) addr, GRUB_MM_ALIGN);
@@ -143,6 +146,7 @@ grub_mm_init_region (void *addr, grub_size_t size)
 
 	  if (r->pre_size >> GRUB_MM_ALIGN_LOG2)
 	    {
+	      grub_printf ("extending a region\n");
 	      h = (grub_mm_header_t) (r + 1);
 	      h->size = (r->pre_size >> GRUB_MM_ALIGN_LOG2);
 	      h->magic = GRUB_MM_ALLOC_MAGIC;
@@ -156,9 +160,13 @@ grub_mm_init_region (void *addr, grub_size_t size)
 	}
 
       /* Does this region come _after_ an existing region? */
+      grub_printf ("Extending w/ after %p + %" PRIxGRUB_SIZE " + %" PRIxGRUB_SIZE " + %" PRIxGRUB_SIZE " = %p ? %s\n",
+		 (grub_uint8_t *)q, sizeof(*q), q->size, q->post_size, (grub_uint8_t *)addr,
+		 (grub_uint8_t *)q + sizeof(*q) + q->size + q->post_size == (grub_uint8_t *) addr ? "yes" : "no");
       if ((grub_uint8_t *)q + sizeof(*q) + q->size + q->post_size ==
 	  (grub_uint8_t *) addr)
 	{
+	  grub_printf ("extending a region\n");
 	  h = (grub_mm_header_t) ((grub_uint8_t *)addr - q->post_size);
 	  h->size = (size + q->post_size) >> GRUB_MM_ALIGN_LOG2;
 	  h->magic = GRUB_MM_ALLOC_MAGIC;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 18/19] [not for merge] ieee1275 debugging info
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (16 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 17/19] [not for merge] print more debug info in mm Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-10-12  7:30 ` [PATCH 19/19] RFC: Ignore REGION_CONSECUTIVE Daniel Axtens
  18 siblings, 0 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

This is also handy for debugging.

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 grub-core/kern/ieee1275/init.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/grub-core/kern/ieee1275/init.c b/grub-core/kern/ieee1275/init.c
index 0fb7bae280df..f70cb134178b 100644
--- a/grub-core/kern/ieee1275/init.c
+++ b/grub-core/kern/ieee1275/init.c
@@ -243,6 +243,7 @@ regions_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
    */
   if (rmo_top > RUNTIME_MIN_SPACE)
     {
+      grub_printf ("was %llx - %llx, ", addr, addr + len);
       linux_rmo_save = grub_min (0x30000000, rmo_top) - RUNTIME_MIN_SPACE;
       if ((addr < linux_rmo_save) && ((addr + len) > linux_rmo_save))
 	len = linux_rmo_save - addr;
@@ -253,6 +254,7 @@ regions_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
 	  addr = 0x30000000;
 	  len -= RUNTIME_MIN_SPACE;
         }
+      grub_printf ("now %llx - %llx (vs %x)\n", addr, addr + len, linux_rmo_save);
     }
 
   if (flags & GRUB_MM_ADD_REGION_CONSECUTIVE)
@@ -315,7 +317,12 @@ static grub_err_t grub_ieee1275_mm_add_region (grub_size_t size, unsigned int fl
       /* Ensure we leave enough space to boot
 	 (be careful of underflow here) */
       if (free_memory <= RUNTIME_MIN_SPACE)
-	return GRUB_ERR_OUT_OF_MEMORY;
+	{
+	  grub_printf ("Giving up, already at/over limit\n");
+	  return GRUB_ERR_OUT_OF_MEMORY;
+	}
+
+      grub_printf ("ieee1275: free = 0x%x, allocated = 0x%x\n", free_memory, allocated_memory);
 
       if (size > free_memory - RUNTIME_MIN_SPACE)
        return GRUB_ERR_OUT_OF_MEMORY;
@@ -324,8 +331,10 @@ static grub_err_t grub_ieee1275_mm_add_region (grub_size_t size, unsigned int fl
 	  total = grub_max(ALIGN_UP(size, 1024 * 1024) + 1024 * 1024, 32 * 1024 * 1024);
 	  total = grub_min(free_memory - RUNTIME_MIN_SPACE, total);
 	}
+      grub_printf ("ieee1275: looking for %x bytes of memory (%x requested)\n", total, size);
 
       grub_machine_mmap_iterate (region_claim, &total);
+      grub_printf ("ieee1275: get memory from fw %s\n", total == 0 ? "succeeded" : "failed");
     }
   else
     grub_machine_mmap_iterate (heap_init, &total);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 19/19] RFC: Ignore REGION_CONSECUTIVE
  2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
                   ` (17 preceding siblings ...)
  2021-10-12  7:30 ` [PATCH 18/19] [not for merge] ieee1275 debugging info Daniel Axtens
@ 2021-10-12  7:30 ` Daniel Axtens
  2021-11-10 14:00   ` Daniel Kiper
  18 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2021-10-12  7:30 UTC (permalink / raw)
  To: grub-devel; +Cc: leif, stefanb, ps, dkiper, Daniel Axtens

Looking into the region merging stuff, I started to wonder if we actually
want to just try adding as much memory as we can (up to the limit requested),
even if we cannot satisfy the full allocation (i.e. we are not CONSECUTIVE).

On powerpc-ieee1275 this actually allows me to satisfy bigger allocations:
while the FW cannot necessarily satisfy e.g. a 128MB allocation, it can
satisfy e.g. a 32MB allocation, and that might merge with a region that's
100MB, and then we can satisfy our 128MB allocation.

The way the call out to firmware operates in mm.c is that it preceeds a
'goto again;' - it kicks off a whole new round of attempting to satisfy
allocation. mm.c doesn't fill the allocation only from the new memory from
firmware.

Perhaps an even better approach would be for the mm.c code to try going to
firmware twice - once for the precise size of the allocation (rounded up etc)
and then if that doesn't succeed go back and ask for 'anything you can give me'?

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 grub-core/kern/ieee1275/init.c | 37 +++++++++-------------------------
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/grub-core/kern/ieee1275/init.c b/grub-core/kern/ieee1275/init.c
index f70cb134178b..87c1ab997138 100644
--- a/grub-core/kern/ieee1275/init.c
+++ b/grub-core/kern/ieee1275/init.c
@@ -186,7 +186,7 @@ count_free (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
 
 static int
 regions_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
-	      unsigned int flags, void *data)
+	       void *data)
 {
   grub_uint32_t total = *(grub_uint32_t *)data;
   grub_uint32_t linux_rmo_save;
@@ -257,13 +257,6 @@ regions_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
       grub_printf ("now %llx - %llx (vs %x)\n", addr, addr + len, linux_rmo_save);
     }
 
-  if (flags & GRUB_MM_ADD_REGION_CONSECUTIVE)
-    {
-      // only continue if we can satisfy the entire allocation
-      if (len < total)
-	return 0;
-    }
-
   if (len > total)
     len = total;
 
@@ -287,21 +280,6 @@ regions_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
   return 0;
 }
 
-static int
-heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
-	   void *data)
-{
-  return regions_claim(addr, len, type, GRUB_MM_ADD_REGION_NONE, data);
-}
-
-static int
-region_claim (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
-	   void *data)
-{
-  return regions_claim(addr, len, type, GRUB_MM_ADD_REGION_CONSECUTIVE, data);
-}
-
-
 static grub_err_t grub_ieee1275_mm_add_region (grub_size_t size, unsigned int flags)
 {
   grub_uint32_t total = size;
@@ -325,7 +303,10 @@ static grub_err_t grub_ieee1275_mm_add_region (grub_size_t size, unsigned int fl
       grub_printf ("ieee1275: free = 0x%x, allocated = 0x%x\n", free_memory, allocated_memory);
 
       if (size > free_memory - RUNTIME_MIN_SPACE)
-       return GRUB_ERR_OUT_OF_MEMORY;
+        {
+	  // hope region extension saves us
+	  total = free_memory - RUNTIME_MIN_SPACE;
+	}
       else
 	{
 	  total = grub_max(ALIGN_UP(size, 1024 * 1024) + 1024 * 1024, 32 * 1024 * 1024);
@@ -333,13 +314,15 @@ static grub_err_t grub_ieee1275_mm_add_region (grub_size_t size, unsigned int fl
 	}
       grub_printf ("ieee1275: looking for %x bytes of memory (%x requested)\n", total, size);
 
-      grub_machine_mmap_iterate (region_claim, &total);
+      grub_machine_mmap_iterate (regions_claim, &total);
       grub_printf ("ieee1275: get memory from fw %s\n", total == 0 ? "succeeded" : "failed");
     }
   else
-    grub_machine_mmap_iterate (heap_init, &total);
+    grub_machine_mmap_iterate (regions_claim, &total);
 
-  if (total == 0)
+  // lie a little: if we allocated _anything_ give it another go
+  // because region merging might have saved the day
+  if (total != size)
     return GRUB_ERR_NONE;
   else
     return GRUB_ERR_OUT_OF_MEMORY;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management
  2021-10-12  7:29 ` [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management Daniel Axtens
@ 2021-10-12 17:11   ` Glenn Washburn
  2021-10-20 15:39     ` Daniel Kiper
  0 siblings, 1 reply; 49+ messages in thread
From: Glenn Washburn @ 2021-10-12 17:11 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps, dkiper

On Tue, 12 Oct 2021 18:29:50 +1100
Daniel Axtens <dja@axtens.net> wrote:

> From: Glenn Washburn <development@efficientek.com>
> 
> A recent refactoring of CUDA command code has exposed a bug in OpenBIOS[1]
> which was causing system powerdown and system reset to fail, thus causing
> the Qemu instance to hang. This in turn caused the grub-shell command to
> timeout causing it to return an error code when the test actually completed
> successfully.
> 
> Since it could be a while before the patch fixing this issue in OpenBIOS
> filters down to the average distro, switch to PMU to allow powerdowns and
> reboots to work as expected.
> 
> [1] https://gitlab.com/qemu-project/qemu/-/issues/624
> 
> Signed-off-by: Glenn Washburn <development@efficientek.com>
> [dja: unbreak pseries]
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  tests/util/grub-shell.in | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/util/grub-shell.in b/tests/util/grub-shell.in
> index 93e9f5148408..33590baeb13c 100644
> --- a/tests/util/grub-shell.in
> +++ b/tests/util/grub-shell.in
> @@ -84,6 +84,7 @@ case "${grub_modinfo_target_cpu}-${grub_modinfo_platform}" in
>  	serial_null="-serial null"
>  	netbootext=elf
>  	trim=1
> +	qemuopts="-M mac99,via=pmu $qemuopts"
>  	;;
>  
>      sparc64-ieee1275)
> @@ -231,7 +232,7 @@ for option in "$@"; do
>  	qemu=qemu-system-ppc64
>  	serial_port=ieee1275/hvterm
>  	serial_null=
> -	qemuopts="$qemuopts -M pseries -no-reboot"
> +	qemuopts="$(echo $qemuopts | sed -E 's/-M [^ ]+//') -M pseries -no-reboot"

LGTM.

Glenn

>  	trim=1
>  	pseries=y
>  	    ;;


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 02/19] grub-shell: pseries: don't pass fw_opt to qemu
  2021-10-12  7:29 ` [PATCH 02/19] grub-shell: pseries: don't pass fw_opt to qemu Daniel Axtens
@ 2021-10-12 18:57   ` Glenn Washburn
  2021-10-20 15:43     ` Daniel Kiper
  0 siblings, 1 reply; 49+ messages in thread
From: Glenn Washburn @ 2021-10-12 18:57 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

On Tue, 12 Oct 2021 18:29:51 +1100
Daniel Axtens <dja@axtens.net> wrote:

> At some point this started to break the test - qemu just rejects the
> command line. Just don't pass it in, it's seabios specific.
> 
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  tests/util/grub-shell.in | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/util/grub-shell.in b/tests/util/grub-shell.in
> index 33590baeb13c..92c0b6f320c3 100644
> --- a/tests/util/grub-shell.in
> +++ b/tests/util/grub-shell.in
> @@ -376,7 +376,11 @@ if test -z "$debug"; then
>    # workaround unfortunately causes qemu to issue a warning 'externally
>    # provided fw_cfg item names should be prefixed with "opt/"', but there
>    # doesn't seem to be a better option.
> -  qemuopts="${qemuopts} -fw_cfg name=etc/sercon-port,string=0"
> +  #
> +  # Don't do this on pseries, it breaks recent qemus.
> +  if [ $pseries == n ]; then
> +    qemuopts="${qemuopts} -fw_cfg name=etc/sercon-port,string=0"

I've confirmed that this is an issue on Qemu 5.2 from Debian 11 and
that this patch works as advertised. The better solution would only be
added on targets using SeaBIOS. This would only be i386, right? There
is an HPPA port, but I don't think we support that architecture.

> +  fi
>  fi
>  
>  if [ x$boot != xnet ] && [ x$boot != xemu ]; then

Tested-by: Glenn Washburn <development@efficientek.com>

Glenn


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 03/19] mm: document grub internal memory management structures
  2021-10-12  7:29 ` [PATCH 03/19] mm: document grub internal memory management structures Daniel Axtens
@ 2021-10-12 19:18   ` Glenn Washburn
  2021-11-24  0:57     ` Daniel Axtens
  0 siblings, 1 reply; 49+ messages in thread
From: Glenn Washburn @ 2021-10-12 19:18 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

On Tue, 12 Oct 2021 18:29:52 +1100
Daniel Axtens <dja@axtens.net> wrote:

> I spent more than a trivial quantity of time figuring out pre_size and
> whether a memory region's size contains the header cell or not.
> 
> Document the meanings of all the properties. Hopefully now no-one else
> has to figure it out!
> 
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  include/grub/mm_private.h | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/include/grub/mm_private.h b/include/grub/mm_private.h
> index c2c4cb1511c6..e80a059dd4e4 100644
> --- a/include/grub/mm_private.h
> +++ b/include/grub/mm_private.h
> @@ -25,11 +25,18 @@
>  #define GRUB_MM_FREE_MAGIC	0x2d3c2808
>  #define GRUB_MM_ALLOC_MAGIC	0x6db08fa4
>  
> +/* A header describing a block of memory - either allocated or free */
>  typedef struct grub_mm_header
>  {
> +  /* The 'next' free block in this region. A circular list.
> +     Only meaningful if the block is free.
> +   */
>    struct grub_mm_header *next;

This and the subsequent comments do not follow the multiline comment
style[1].

One nit, "region. A circular" -> "region's circular".

This comment leads me to wonder if the block is not free, what is the
value of next? Seems like it should be NULL, if its not meaningful.

https://www.gnu.org/software/grub/manual/grub-dev/grub-dev.html#Multi_002dLine-Comments

> +  /* The region size in cells (not bytes). Includes the header cell. */

What exactly does "cell" mean in this context? I'm gathering cell is
not defined as in this link[2], which is equivalent to "bit". Perhaps
"size" should be renamed to "ncells" or something more descriptive.

[2] https://en.wikipedia.org/wiki/Memory_cell_(computing)

>    grub_size_t size;
> +  /* either free or alloc magic, depending on the block type. */
>    grub_size_t magic;
> +  /* pad to cell size: see the top of mm.c. */
>  #if GRUB_CPU_SIZEOF_VOID_P == 4
>    char padding[4];
>  #elif GRUB_CPU_SIZEOF_VOID_P == 8
> @@ -48,11 +55,25 @@ typedef struct grub_mm_header
>  
>  #define GRUB_MM_ALIGN	(1 << GRUB_MM_ALIGN_LOG2)
>  
> +/* A region from which we can make allocations. */
>  typedef struct grub_mm_region
>  {
> +  /* The first free block in this region. */
>    struct grub_mm_header *first;
> +
> +  /* The next region in the linked list of regions. Regions are initially
> +     sorted in order of increasing size, but can grow, in which case the
> +     ordering may not be preserved.
> +   */
>    struct grub_mm_region *next;
> +
> +  /* A grub_mm_region will always be aligned to cell size. The pre-size is
> +     the number of bytes we were given but had to skip in order to get that
> +     alignment.
> +   */
>    grub_size_t pre_size;
> +
> +  /* How many bytes are in this region? (free and allocated) */
>    grub_size_t size;
>  }
>  *grub_mm_region_t;

Glenn



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 07/19] Add memtool module with memory allocation stress-test
  2021-10-12  7:29 ` [PATCH 07/19] Add memtool module with memory allocation stress-test Daniel Axtens
@ 2021-10-19 19:47   ` Glenn Washburn
  2021-11-09 12:54   ` Daniel Kiper
  1 sibling, 0 replies; 49+ messages in thread
From: Glenn Washburn @ 2021-10-19 19:47 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

On Tue, 12 Oct 2021 18:29:56 +1100
Daniel Axtens <dja@axtens.net> wrote:

> When working on memory, it's nice to be able to test your work.
> 
> Add a memtest module. When compiled with --enable-mm-debug, it exposes
> 3 commands:
> 
>  * lsmem - print all allocations and free space in all regions
>  * lsfreemem - print free space in all regions
> 
>  * stress_big_allocs - stress test large allocations:
>   - how much memory can we allocate in one chunk?
>   - how many 1MB chunks can we allocate?
>   - check that gap-filling works with a 1MB aligned 900kB alloc + a
>      100kB alloc.

Perhaps note here that the questions are "up to 4GB".

> 
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> 
> ---
> 
> I've put this as copyright IBM for now - hopefully we can conclude on
> whether we're still doing FSF copyright assignments?
> ---
>  grub-core/Makefile.core.def   |   5 ++
>  grub-core/commands/memtools.c | 157 ++++++++++++++++++++++++++++++++++
>  grub-core/kern/mm.c           |   4 +
>  include/grub/mm.h             |   4 +-
>  4 files changed, 168 insertions(+), 2 deletions(-)
>  create mode 100644 grub-core/commands/memtools.c
> 
> diff --git a/grub-core/Makefile.core.def b/grub-core/Makefile.core.def
> index 8022e1c0a794..0cc3a4a500ec 100644
> --- a/grub-core/Makefile.core.def
> +++ b/grub-core/Makefile.core.def
> @@ -2527,3 +2527,8 @@ module = {
>    common = commands/i386/wrmsr.c;
>    enable = x86;
>  };
> +
> +module = {
> +  name = memtools;
> +  common = commands/memtools.c;
> +};
> diff --git a/grub-core/commands/memtools.c b/grub-core/commands/memtools.c
> new file mode 100644
> index 000000000000..6d5778f4a1b0
> --- /dev/null
> +++ b/grub-core/commands/memtools.c
> @@ -0,0 +1,157 @@
> +/*
> + *  GRUB  --  GRand Unified Bootloader
> + *  Copyright (C) 2021  IBM Corporation
> + *
> + *  GRUB is free software: you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation, either version 3 of the License, or
> + *  (at your option) any later version.
> + *
> + *  GRUB is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with GRUB.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <config.h>
> +#include <grub/dl.h>
> +#include <grub/misc.h>
> +#include <grub/command.h>
> +#include <grub/i18n.h>
> +#include <grub/memory.h>
> +#include <grub/mm.h>
> +
> +GRUB_MOD_LICENSE ("GPLv3+");
> +
> +#ifdef MM_DEBUG
> +
> +static grub_err_t
> +grub_cmd_lsmem (grub_command_t cmd __attribute__ ((unused)),
> +		 int argc __attribute__ ((unused)),
> +		 char **args __attribute__ ((unused)))
> +
> +{
> +#ifndef GRUB_MACHINE_EMU
> +  grub_mm_dump(0);
> +#endif
> +
> +  return 0;
> +}
> +
> +static grub_err_t
> +grub_cmd_lsfreemem (grub_command_t cmd __attribute__ ((unused)),
> +		    int argc __attribute__ ((unused)),
> +		    char **args __attribute__ ((unused)))
> +
> +{
> +#ifndef GRUB_MACHINE_EMU
> +  grub_mm_dump_free();
> +#endif
> +
> +  return 0;
> +}
> +
> +
> +#define BIG_ALLOC (64 * 1024 * 1024)
> +#define SMALL_ALLOC 32

Where are these used? A search is coming up with nothing.

> +
> +static grub_err_t
> +grub_cmd_stress_big_allocs (grub_command_t cmd __attribute__ ((unused)),
> +			    int argc __attribute__ ((unused)),
> +			    char **args __attribute__ ((unused)))

I'm thinking this would be valuable to be included in the functional
tests. What do you think?

> +{
> +  int i, max_mb, blocks_alloced;
> +  void *mem;
> +  void **blocklist;
> +
> +  grub_printf ("Test 1: increasingly sized allocs to 1GB block\n");

Perhaps, "Test 1: Allocate and free up from 1MB to 1GB in 1MB
increments."

> +  for (i = 1; i < 1024; i++) {
> +    grub_printf ("%d MB . ", i);
> +    mem = grub_malloc (i * 1024 * 1024);
> +    if (mem == NULL)
> +      {
> +	grub_printf ("failed\n");

"Failed to allocate a %dMB continuous block of memory.\n"

> +	break;
> +      }
> +    else
> +      grub_free (mem);
> +
> +    if (i % 10 == 0)
> +      grub_printf ("\n");

So every 10Mb a newline is output. That's seems small, if you've got
generally at least 80 char with terminal. Why not make it 64 and thus
evenly divide 1024?

> +  }
> +
> +  max_mb = i - 1;
> +  grub_printf ("Max sized allocation we did was %d MB\n", max_mb);
> +  

Line with just a space, should be empty.

> +  grub_printf ("Test 2: 1MB at a time, max 4GB\n");

"Test 2: Allocate total of 4GB of memory in 1MB chunks"

> +  blocklist = grub_calloc (4096, sizeof (void *));
> +  for (i = 0; i < 4096; i++)
> +    {
> +      blocklist[i] = grub_malloc (1024 * 1024);
> +      if (!blocklist[i])
> +	{
> +	  grub_printf ("Ran out of memory at iteration %d\n", i);

"Failed to allocate 1MB memory chunk on block %d of 4096"

> +	  break;
> +	}
> +    }
> +  blocks_alloced = i;
> +  for (i = 0; i < blocks_alloced; i++)
> +    grub_free (blocklist[i]);
> +
> +  grub_printf ("\nTest 3: 1MB aligned 900kB + 100kB\n");

"Test 3: Allocate total of 4Gb of memory in 1MB aligned 900kB + 100kB
chunks"

> +  //grub_mm_debug=1;

Was this comment intended to be included? If so, why here?

> +  for (i = 0; i < 4096; i += 2)
> +    {
> +      blocklist[i] = grub_memalign (1024 * 1024, 900 * 1024);
> +      if (!blocklist[i])
> +	{
> +	  grub_printf ("Failed big allocation, iteration %d\n", i);
> +	  blocks_alloced = i;
> +	  break;
> +	}
> +
> +      blocklist[i + 1] = grub_malloc (100 * 1024);
> +      if (!blocklist[i + 1])
> +	{
> +	  grub_printf ("Failed small allocation, iteration %d\n", i);
> +	  blocks_alloced = i + 1;
> +	  break;
> +	}
> +      grub_printf (".");
> +    }
> +  for (i = 0; i < blocks_alloced; i++)
> +    grub_free (blocklist[i]);
> +
> +  grub_free (blocklist);
> +
> +  grub_errno = GRUB_ERR_NONE;
> +  return GRUB_ERR_NONE;
> +}
> +
> +static grub_command_t cmd_lsmem, cmd_lsfreemem, cmd_sba;
> +
> +#endif /* MM_DEBUG */
> +
> +GRUB_MOD_INIT(memtools)
> +{
> +#ifdef MM_DEBUG
> +  cmd_lsmem = grub_register_command ("lsmem", grub_cmd_lsmem,
> +				     0, N_("List free and allocated memory blocks."));
> +  cmd_lsfreemem = grub_register_command ("lsfreemem", grub_cmd_lsfreemem,
> +					 0, N_("List free memory blocks."));
> +  cmd_sba = grub_register_command ("stress_big_allocs", grub_cmd_stress_big_allocs,
> +  				   0, N_("Stress test large allocations."));

Leading space characters.

> +#endif
> +}
> +
> +GRUB_MOD_FINI(memtools)
> +{
> +#ifdef MM_DEBUG
> +  grub_unregister_command (cmd_lsmem);
> +  grub_unregister_command (cmd_lsfreemem);
> +  grub_unregister_command (cmd_sba);
> +#endif
> +}
> diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
> index 835ed8a8f6f9..032c8f71aed2 100644
> --- a/grub-core/kern/mm.c
> +++ b/grub-core/kern/mm.c
> @@ -556,6 +556,8 @@ grub_mm_dump_free (void)
>      {
>        grub_mm_header_t p;
>  
> +      grub_printf ("Region %p (size %" PRIuGRUB_SIZE ")\n\n", r, r->size);
> +
>        /* Follow the free list.  */
>        p = r->first;
>        do
> @@ -583,6 +585,8 @@ grub_mm_dump (unsigned lineno)
>      {
>        grub_mm_header_t p;
>  
> +      grub_printf ("Region %p (size %" PRIuGRUB_SIZE ")\n\n", r, r->size);
> +
>        for (p = (grub_mm_header_t) ALIGN_UP ((grub_addr_t) (r + 1),
>  					    GRUB_MM_ALIGN);
>  	   (grub_addr_t) p < (grub_addr_t) (r+1) + r->size;
> diff --git a/include/grub/mm.h b/include/grub/mm.h
> index 9c38dd3ca5d2..44fde7cb9033 100644
> --- a/include/grub/mm.h
> +++ b/include/grub/mm.h
> @@ -46,8 +46,8 @@ void grub_mm_check_real (const char *file, int line);
>  /* Set this variable to 1 when you want to trace all memory function calls.  */
>  extern int EXPORT_VAR(grub_mm_debug);
>  
> -void grub_mm_dump_free (void);
> -void grub_mm_dump (unsigned lineno);
> +void EXPORT_FUNC(grub_mm_dump_free) (void);
> +void EXPORT_FUNC(grub_mm_dump) (unsigned lineno);

This makes me wonder if we could easily move grub_mm_dump and
grub_mm_dump_free to memtools.c, and always have them built and lsmem
and lsmemfree as well. IOW, lsmem would work without need to compile
with --enable-mm-debug. If so, since its compiled as a module there's
not an issue of it increasing the core size.

>  
>  #define grub_calloc(nmemb, size)	\
>    grub_debug_calloc (GRUB_FILE, __LINE__, nmemb, size)

Glenn



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()`
  2021-10-12  7:30 ` [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()` Daniel Axtens
@ 2021-10-19 21:37   ` Glenn Washburn
  2021-11-09 13:56     ` Daniel Kiper
  2021-11-09 16:10   ` Daniel Kiper
  1 sibling, 1 reply; 49+ messages in thread
From: Glenn Washburn @ 2021-10-19 21:37 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

On Tue, 12 Oct 2021 18:30:01 +1100
Daniel Axtens <dja@axtens.net> wrote:

> From: Patrick Steinhardt <ps@pks.im>
> 
> The function `add_memory_regions ()` is currently only called on system
> initialization to allocate a fixed amount of pages. As such, it didn't
> need to return any errors: in case it failed, we cannot proceed anyway.
> This will change with the upcoming support for requesting more memory
> from the firmware at runtime, where it doesn't make sense anymore to
> fail hard.
> 
> Refactor the function to return an error to prepare for this. Note that
> this does not change the behaviour when initializing the memory system
> because `grub_efi_mm_init ()` knows to call `grub_fatal ()` in case
> `grub_efi_mm_add_regions ()` returns an error.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  grub-core/kern/efi/mm.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
> index cfc6a67fc0cc..ced3ee5e7537 100644
> --- a/grub-core/kern/efi/mm.c
> +++ b/grub-core/kern/efi/mm.c
> @@ -478,7 +478,7 @@ filter_memory_map (grub_efi_memory_descriptor_t *memory_map,
>  }
>  
>  /* Add memory regions.  */
> -static void
> +static grub_err_t
>  add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>  		    grub_efi_uintn_t desc_size,
>  		    grub_efi_memory_descriptor_t *memory_map_end,
> @@ -506,9 +506,9 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>  					   GRUB_EFI_ALLOCATE_ADDRESS,
>  					   GRUB_EFI_LOADER_CODE);
>        if (! addr)
> -	grub_fatal ("cannot allocate conventional memory %p with %u pages",
> -		    (void *) ((grub_addr_t) start),
> -		    (unsigned) pages);
> +	return grub_error (GRUB_ERR_OUT_OF_MEMORY,
> +			    "cannot allocate conventional memory %p with %u pages",
> +			    (void *) ((grub_addr_t) start), (unsigned) pages);

I wonder if we shouldn't print a debug message here and try the next
descriptor. Is it possible that grub_efi_allocate_pages_real fails for
the specified range, but a range further on will succeed? The fact that
it hasn't mattered so far seems to indicate that either this case is
very rarely encountered or we should not continue on a failure for some
reason.

>  
>        grub_mm_init_region (addr, PAGES_TO_BYTES (pages));
>  
> @@ -518,7 +518,9 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>      }
>  
>    if (required_pages > 0)
> -    grub_fatal ("too little memory");
> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "too little memory");

A more descriptive error might be nice here too, perhaps "insufficient 
memory, require %d more bytes"

> +
> +  return GRUB_ERR_NONE;
>  }
>  
>  void
> @@ -565,6 +567,7 @@ grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
>    grub_efi_memory_descriptor_t *filtered_memory_map_end;
>    grub_efi_uintn_t map_size;
>    grub_efi_uintn_t desc_size;
> +  grub_err_t err;
>    int mm_status;
>  
>    /* Prepare a memory region to store two memory maps.  */
> @@ -609,8 +612,11 @@ grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
>    sort_memory_map (filtered_memory_map, desc_size, filtered_memory_map_end);
>  
>    /* Allocate memory regions for GRUB's memory management.  */
> -  add_memory_regions (filtered_memory_map, desc_size,
> -		      filtered_memory_map_end, BYTES_TO_PAGES (required_bytes));
> +  err = add_memory_regions (filtered_memory_map, desc_size,
> +			    filtered_memory_map_end,
> +			    BYTES_TO_PAGES (required_bytes));
> +  if (err != GRUB_ERR_NONE)
> +    return err;
>  
>  #if 0
>    /* For debug.  */

Glenn


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 11/19] efi: mm: Extract function to add memory regions
  2021-10-12  7:30 ` [PATCH 11/19] efi: mm: Extract function to add memory regions Daniel Axtens
@ 2021-10-19 21:39   ` Glenn Washburn
  2021-10-19 21:58   ` Glenn Washburn
  2021-11-09 13:38   ` Daniel Kiper
  2 siblings, 0 replies; 49+ messages in thread
From: Glenn Washburn @ 2021-10-19 21:39 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

On Tue, 12 Oct 2021 18:30:00 +1100
Daniel Axtens <dja@axtens.net> wrote:

> From: Patrick Steinhardt <ps@pks.im>
> 
> In preparation of support for runtime-allocating additional memory
> region, this patch extracts the function to retrieve the EFI memory map
> and add a subset of it to GRUB's own memory regions.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  grub-core/kern/efi/mm.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
> index 4d276bc87a4c..cfc6a67fc0cc 100644
> --- a/grub-core/kern/efi/mm.c
> +++ b/grub-core/kern/efi/mm.c
> @@ -504,7 +504,7 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>  
>        addr = grub_efi_allocate_pages_real (start, pages,
>  					   GRUB_EFI_ALLOCATE_ADDRESS,
> -					   GRUB_EFI_LOADER_CODE);      
> +					   GRUB_EFI_LOADER_CODE);
>        if (! addr)
>  	grub_fatal ("cannot allocate conventional memory %p with %u pages",
>  		    (void *) ((grub_addr_t) start),
> @@ -556,8 +556,8 @@ print_memory_map (grub_efi_memory_descriptor_t *memory_map,
>  }
>  #endif
>  
> -void
> -grub_efi_mm_init (void)
> +static grub_err_t
> +grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
>  {
>    grub_efi_memory_descriptor_t *memory_map;
>    grub_efi_memory_descriptor_t *memory_map_end;
> @@ -570,7 +570,7 @@ grub_efi_mm_init (void)
>    /* Prepare a memory region to store two memory maps.  */
>    memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (MEMORY_MAP_SIZE));
>    if (! memory_map)
> -    grub_fatal ("cannot allocate memory");
> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");

Something more descriptive would be nice, may be even "cannot allocate
memory for memory map"

>  
>    /* Obtain descriptors for available memory.  */
>    map_size = MEMORY_MAP_SIZE;
> @@ -588,14 +588,14 @@ grub_efi_mm_init (void)
>  
>        memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (map_size));
>        if (! memory_map)
> -	grub_fatal ("cannot allocate memory");
> +	return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");

Ditto. And maybe nice to have it slightly different, perhaps "cannot
re-allocate memory map"

>  
>        mm_status = grub_efi_get_memory_map (&map_size, memory_map, 0,
>  					   &desc_size, 0);
>      }
>  
>    if (mm_status < 0)
> -    grub_fatal ("cannot get memory map");
> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot get memory map");

What are the range of values that mm_status could be? Would it be
useful to include the status code in the error message? IOW, could a
failure here be affected by a configuration that the user could change
to make this work? Also I like the message "failed to retrieve memory
map (status=%d)".

>  
>    memory_map_end = NEXT_MEMORY_DESCRIPTOR (memory_map, map_size);
>  
> @@ -610,7 +610,7 @@ grub_efi_mm_init (void)
>  
>    /* Allocate memory regions for GRUB's memory management.  */
>    add_memory_regions (filtered_memory_map, desc_size,
> -		      filtered_memory_map_end, BYTES_TO_PAGES (DEFAULT_HEAP_SIZE));
> +		      filtered_memory_map_end, BYTES_TO_PAGES (required_bytes));
>  
>  #if 0
>    /* For debug.  */

What about turning this on when MM_DEBUG is on? Seems like it could be
a useful (would been to get rid of/change the grub_fatal calls).

> @@ -628,6 +628,15 @@ grub_efi_mm_init (void)
>    /* Release the memory maps.  */
>    grub_efi_free_pages ((grub_addr_t) memory_map,
>  		       2 * BYTES_TO_PAGES (MEMORY_MAP_SIZE));
> +
> +  return GRUB_ERR_NONE;
> +}
> +
> +void
> +grub_efi_mm_init (void)
> +{
> +  if (grub_efi_mm_add_regions (DEFAULT_HEAP_SIZE) != GRUB_ERR_NONE)
> +    grub_fatal ("%s", grub_errmsg);
>  }
>  
>  #if defined (__aarch64__) || defined (__arm__) || defined (__riscv)

Glenn


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 11/19] efi: mm: Extract function to add memory regions
  2021-10-12  7:30 ` [PATCH 11/19] efi: mm: Extract function to add memory regions Daniel Axtens
  2021-10-19 21:39   ` Glenn Washburn
@ 2021-10-19 21:58   ` Glenn Washburn
  2022-03-25  3:30     ` Daniel Axtens
  2021-11-09 13:38   ` Daniel Kiper
  2 siblings, 1 reply; 49+ messages in thread
From: Glenn Washburn @ 2021-10-19 21:58 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

On Tue, 12 Oct 2021 18:30:00 +1100
Daniel Axtens <dja@axtens.net> wrote:

> From: Patrick Steinhardt <ps@pks.im>
> 
> In preparation of support for runtime-allocating additional memory
> region, this patch extracts the function to retrieve the EFI memory map
> and add a subset of it to GRUB's own memory regions.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  grub-core/kern/efi/mm.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
> index 4d276bc87a4c..cfc6a67fc0cc 100644
> --- a/grub-core/kern/efi/mm.c
> +++ b/grub-core/kern/efi/mm.c
> @@ -504,7 +504,7 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>  
>        addr = grub_efi_allocate_pages_real (start, pages,
>  					   GRUB_EFI_ALLOCATE_ADDRESS,
> -					   GRUB_EFI_LOADER_CODE);      
> +					   GRUB_EFI_LOADER_CODE);
>        if (! addr)
>  	grub_fatal ("cannot allocate conventional memory %p with %u pages",
>  		    (void *) ((grub_addr_t) start),
> @@ -556,8 +556,8 @@ print_memory_map (grub_efi_memory_descriptor_t *memory_map,
>  }
>  #endif
>  
> -void
> -grub_efi_mm_init (void)
> +static grub_err_t
> +grub_efi_mm_add_regions (grub_efi_uint64_t required_bytes)
>  {
>    grub_efi_memory_descriptor_t *memory_map;
>    grub_efi_memory_descriptor_t *memory_map_end;
> @@ -570,7 +570,7 @@ grub_efi_mm_init (void)
>    /* Prepare a memory region to store two memory maps.  */
>    memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (MEMORY_MAP_SIZE));
>    if (! memory_map)
> -    grub_fatal ("cannot allocate memory");
> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");

Something more descriptive would be nice, may be even "cannot allocate
memory for memory map"

>  
>    /* Obtain descriptors for available memory.  */
>    map_size = MEMORY_MAP_SIZE;
> @@ -588,14 +588,14 @@ grub_efi_mm_init (void)
>  
>        memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (map_size));
>        if (! memory_map)
> -	grub_fatal ("cannot allocate memory");
> +	return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");

Ditto. And maybe nice to have it slightly different, perhaps "cannot
re-allocate memory map"

>  
>        mm_status = grub_efi_get_memory_map (&map_size, memory_map, 0,
>  					   &desc_size, 0);
>      }
>  
>    if (mm_status < 0)
> -    grub_fatal ("cannot get memory map");
> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot get memory map");

What are the range of values that mm_status could be? Would it be
useful to include the status code in the error message? IOW, could a
failure here be affected by a configuration that the user could change
to make this work? Also I like the message "failed to retrieve memory
map (status=%d)".

>  
>    memory_map_end = NEXT_MEMORY_DESCRIPTOR (memory_map, map_size);
>  
> @@ -610,7 +610,7 @@ grub_efi_mm_init (void)
>  
>    /* Allocate memory regions for GRUB's memory management.  */
>    add_memory_regions (filtered_memory_map, desc_size,
> -		      filtered_memory_map_end, BYTES_TO_PAGES (DEFAULT_HEAP_SIZE));
> +		      filtered_memory_map_end, BYTES_TO_PAGES (required_bytes));
>  
>  #if 0
>    /* For debug.  */

What about turning this on when MM_DEBUG is on? Seems like it could be
a useful (would been to get rid of/change the grub_fatal calls).

> @@ -628,6 +628,15 @@ grub_efi_mm_init (void)
>    /* Release the memory maps.  */
>    grub_efi_free_pages ((grub_addr_t) memory_map,
>  		       2 * BYTES_TO_PAGES (MEMORY_MAP_SIZE));
> +
> +  return GRUB_ERR_NONE;
> +}
> +
> +void
> +grub_efi_mm_init (void)
> +{
> +  if (grub_efi_mm_add_regions (DEFAULT_HEAP_SIZE) != GRUB_ERR_NONE)
> +    grub_fatal ("%s", grub_errmsg);
>  }
>  
>  #if defined (__aarch64__) || defined (__arm__) || defined (__riscv)

Glenn


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management
  2021-10-12 17:11   ` Glenn Washburn
@ 2021-10-20 15:39     ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-10-20 15:39 UTC (permalink / raw)
  To: Glenn Washburn; +Cc: Daniel Axtens, grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 12:11:30PM -0500, Glenn Washburn wrote:
> On Tue, 12 Oct 2021 18:29:50 +1100
> Daniel Axtens <dja@axtens.net> wrote:
>
> > From: Glenn Washburn <development@efficientek.com>
> >
> > A recent refactoring of CUDA command code has exposed a bug in OpenBIOS[1]
> > which was causing system powerdown and system reset to fail, thus causing
> > the Qemu instance to hang. This in turn caused the grub-shell command to
> > timeout causing it to return an error code when the test actually completed
> > successfully.
> >
> > Since it could be a while before the patch fixing this issue in OpenBIOS
> > filters down to the average distro, switch to PMU to allow powerdowns and
> > reboots to work as expected.
> >
> > [1] https://gitlab.com/qemu-project/qemu/-/issues/624
> >
> > Signed-off-by: Glenn Washburn <development@efficientek.com>
> > [dja: unbreak pseries]
> > Signed-off-by: Daniel Axtens <dja@axtens.net>
> > ---
> >  tests/util/grub-shell.in | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/util/grub-shell.in b/tests/util/grub-shell.in
> > index 93e9f5148408..33590baeb13c 100644
> > --- a/tests/util/grub-shell.in
> > +++ b/tests/util/grub-shell.in
> > @@ -84,6 +84,7 @@ case "${grub_modinfo_target_cpu}-${grub_modinfo_platform}" in
> >  	serial_null="-serial null"
> >  	netbootext=elf
> >  	trim=1
> > +	qemuopts="-M mac99,via=pmu $qemuopts"
> >  	;;
> >
> >      sparc64-ieee1275)
> > @@ -231,7 +232,7 @@ for option in "$@"; do
> >  	qemu=qemu-system-ppc64
> >  	serial_port=ieee1275/hvterm
> >  	serial_null=
> > -	qemuopts="$qemuopts -M pseries -no-reboot"
> > +	qemuopts="$(echo $qemuopts | sed -E 's/-M [^ ]+//') -M pseries -no-reboot"
>
> LGTM.

Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 02/19] grub-shell: pseries: don't pass fw_opt to qemu
  2021-10-12 18:57   ` Glenn Washburn
@ 2021-10-20 15:43     ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-10-20 15:43 UTC (permalink / raw)
  To: Glenn Washburn
  Cc: Daniel Axtens, The development of GNU GRUB, leif, stefanb, ps

On Tue, Oct 12, 2021 at 01:57:43PM -0500, Glenn Washburn wrote:
> On Tue, 12 Oct 2021 18:29:51 +1100
> Daniel Axtens <dja@axtens.net> wrote:
>
> > At some point this started to break the test - qemu just rejects the
> > command line. Just don't pass it in, it's seabios specific.
> >
> > Signed-off-by: Daniel Axtens <dja@axtens.net>
> > ---
> >  tests/util/grub-shell.in | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/util/grub-shell.in b/tests/util/grub-shell.in
> > index 33590baeb13c..92c0b6f320c3 100644
> > --- a/tests/util/grub-shell.in
> > +++ b/tests/util/grub-shell.in
> > @@ -376,7 +376,11 @@ if test -z "$debug"; then
> >    # workaround unfortunately causes qemu to issue a warning 'externally
> >    # provided fw_cfg item names should be prefixed with "opt/"', but there
> >    # doesn't seem to be a better option.
> > -  qemuopts="${qemuopts} -fw_cfg name=etc/sercon-port,string=0"
> > +  #
> > +  # Don't do this on pseries, it breaks recent qemus.
> > +  if [ $pseries == n ]; then
> > +    qemuopts="${qemuopts} -fw_cfg name=etc/sercon-port,string=0"
>
> I've confirmed that this is an issue on Qemu 5.2 from Debian 11 and
> that this patch works as advertised. The better solution would only be
> added on targets using SeaBIOS. This would only be i386, right? There
> is an HPPA port, but I don't think we support that architecture.

Yeah, I agree, the sercon-port looks like the SeaBIOS specific thing.
Daniel A., could you rework the patch?

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/19] mm: assert that we preserve header vs region alignment
  2021-10-12  7:29 ` [PATCH 04/19] mm: assert that we preserve header vs region alignment Daniel Axtens
@ 2021-10-20 17:43   ` Daniel Kiper
  2022-03-23  5:47     ` Daniel Axtens
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Kiper @ 2021-10-20 17:43 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:29:53PM +1100, Daniel Axtens wrote:
> grub_mm_region_init() does:
>
>   h = (grub_mm_header_t) (r + 1);
>
> where h is a grub_mm_header_t and r is a grub_mm_region_t.
>
> Cells are supposed to be GRUB_MM_ALIGN aligned, but while grub_mm_dump
> ensures this vs the region header, grub_mm_region_init() does not.
>
> It's better to be explicit than implicit here: rather than changing
> grub_mm_region_init() to ALIGN_UP(), require that the struct is
> explictly a multiple of the header size.
>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  include/grub/mm_private.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/include/grub/mm_private.h b/include/grub/mm_private.h
> index e80a059dd4e4..533b47173e18 100644
> --- a/include/grub/mm_private.h
> +++ b/include/grub/mm_private.h
> @@ -20,6 +20,7 @@
>  #define GRUB_MM_PRIVATE_H	1
>
>  #include <grub/mm.h>
> +#include <grub/misc.h>
>
>  /* Magic words.  */
>  #define GRUB_MM_FREE_MAGIC	0x2d3c2808
> @@ -82,4 +83,11 @@ typedef struct grub_mm_region
>  extern grub_mm_region_t EXPORT_VAR (grub_mm_base);
>  #endif
>
> +static inline void grub_mm_size_sanity_check(void) {

static inline void
grub_mm_size_sanity_check (void)
{

> +  /* Ensure we preserve alignment when doing h = (grub_mm_header_t) (r + 1) */
> +  COMPILE_TIME_ASSERT((sizeof(struct grub_mm_region) %

Missing spaces before "(" above and below...

s/struct grub_mm_region/grub_mm_region_t/
s/struct grub_mm_header/grub_mm_header_t/

> +		       sizeof(struct grub_mm_header)) == 0);

I think this check is insufficient. The GRUB_MM_ALIGN should be checked
somehow too. E.g. if sizeof(struct grub_mm_region) == 16,
sizeof(struct grub_mm_header) == 8 and GRUB_MM_ALIGN == 32 then assert
above will pass but the GRUB will blow up later. Of course numbers are not
real but I chose them to show the problem. Hmmm... If we could choose
grub_mm_region_t and grub_mm_header_t sizes properly then probably we
could drop GRUB_MM_ALIGN...

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 05/19] mm: when adding a region, merge with region after as well as before
  2021-10-12  7:29 ` [PATCH 05/19] mm: when adding a region, merge with region after as well as before Daniel Axtens
@ 2021-11-04 16:36   ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-04 16:36 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:29:54PM +1100, Daniel Axtens wrote:
> On x86_64-efi (at least) regions seem to be added from top down. The mm
> code will merge a new region with an existing region that comes
> immediately before the new region. This allows larger allocations to be
> satisfied that would otherwise be the case.
>
> On powerpc-ieee1275, however, regions are added from bottom up. So if
> we add 3x 32MB regions, we can still only satisfy a 32MB allocation,
> rather than the 96MB allocation we might otherwise be able to satisfy.
>
>  * Define 'post_size' as being bytes lost to the end of an allocation
>    due to being given weird sizes from firmware that are not multiples
>    of GRUB_MM_ALIGN.
>
>  * Allow merging of regions immediately _after_ existing regions, not
>    just before. As with the other approach, we create an allocated
>    block to represent the new space and the pass it to grub_free() to
>    get the metadata right.
>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  grub-core/kern/mm.c       | 55 +++++++++++++++++++++++++--------------
>  include/grub/mm_private.h | 15 +++++++++++
>  2 files changed, 51 insertions(+), 19 deletions(-)
>
> diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
> index c070afc621f8..835ed8a8f6f9 100644
> --- a/grub-core/kern/mm.c
> +++ b/grub-core/kern/mm.c
> @@ -129,25 +129,41 @@ grub_mm_init_region (void *addr, grub_size_t size)
>      size = ((grub_addr_t) -0x1000) - (grub_addr_t) addr;
>
>    for (p = &grub_mm_base, q = *p; q; p = &(q->next), q = *p)
> -    if ((grub_uint8_t *) addr + size + q->pre_size == (grub_uint8_t *) q)
> -      {
> -	r = (grub_mm_region_t) ALIGN_UP ((grub_addr_t) addr, GRUB_MM_ALIGN);
> -	*r = *q;
> -	r->pre_size += size;
> -
> -	if (r->pre_size >> GRUB_MM_ALIGN_LOG2)
> -	  {
> -	    h = (grub_mm_header_t) (r + 1);
> -	    h->size = (r->pre_size >> GRUB_MM_ALIGN_LOG2);
> -	    h->magic = GRUB_MM_ALLOC_MAGIC;
> -	    r->size += h->size << GRUB_MM_ALIGN_LOG2;
> -	    r->pre_size &= (GRUB_MM_ALIGN - 1);
> -	    *p = r;
> -	    grub_free (h + 1);
> -	  }
> -	*p = r;
> -	return;
> -      }
> +    {
> +      /* Does this region come _before_ an existing region? */
> +      if ((grub_uint8_t *) addr + size + q->pre_size == (grub_uint8_t *) q)
> +	{
> +	  r = (grub_mm_region_t) ALIGN_UP ((grub_addr_t) addr, GRUB_MM_ALIGN);
> +	  *r = *q;
> +	  r->pre_size += size;
> +
> +	  if (r->pre_size >> GRUB_MM_ALIGN_LOG2)
> +	    {
> +	      h = (grub_mm_header_t) (r + 1);
> +	      h->size = (r->pre_size >> GRUB_MM_ALIGN_LOG2);
> +	      h->magic = GRUB_MM_ALLOC_MAGIC;
> +	      r->size += h->size << GRUB_MM_ALIGN_LOG2;
> +	      r->pre_size &= (GRUB_MM_ALIGN - 1);
> +	      *p = r;
> +	      grub_free (h + 1);
> +	    }
> +	  *p = r;
> +	  return;
> +	}
> +
> +      /* Does this region come _after_ an existing region? */
> +      if ((grub_uint8_t *)q + sizeof(*q) + q->size + q->post_size ==
> +	  (grub_uint8_t *) addr)
> +	{
> +	  h = (grub_mm_header_t) ((grub_uint8_t *)addr - q->post_size);
> +	  h->size = (size + q->post_size) >> GRUB_MM_ALIGN_LOG2;
> +	  h->magic = GRUB_MM_ALLOC_MAGIC;
> +	  q->size += h->size << GRUB_MM_ALIGN_LOG2;
> +	  q->post_size = (q->post_size + size) & (GRUB_MM_ALIGN - 1);
> +	  grub_free (h + 1);
> +	  return;

The code itself LGTM. However, I would be more than happy if you add
bunch of comments and maybe some ASCII drawings explaining what is
happening here. This should save a lot time of our successors... :-)

> +	}
> +    }
>
>    /* Allocate a region from the head.  */
>    r = (grub_mm_region_t) ALIGN_UP ((grub_addr_t) addr, GRUB_MM_ALIGN);
> @@ -166,6 +182,7 @@ grub_mm_init_region (void *addr, grub_size_t size)
>    r->first = h;
>    r->pre_size = (grub_addr_t) r - (grub_addr_t) addr;
>    r->size = (h->size << GRUB_MM_ALIGN_LOG2);
> +  r->post_size = size - r->size;
>
>    /* Find where to insert this region. Put a smaller one before bigger ones,
>       to prevent fragmentation.  */
> diff --git a/include/grub/mm_private.h b/include/grub/mm_private.h
> index 533b47173e18..0effbc45a668 100644
> --- a/include/grub/mm_private.h
> +++ b/include/grub/mm_private.h
> @@ -74,8 +74,23 @@ typedef struct grub_mm_region
>     */
>    grub_size_t pre_size;
>
> +  /* Likewise, the post-size is the number of bytes we wasted at the end
> +     of the allocation because it wasn't a multiple of GRUB_MM_ALIGN
> +   */

Please fix the formatting of this comment.

> +  grub_size_t post_size;
> +
>    /* How many bytes are in this region? (free and allocated) */
>    grub_size_t size;
> +
> +  /* pad to a multiple of cell size */
> +#if GRUB_CPU_SIZEOF_VOID_P == 4
> +  char padding[4+4+4];
> +#elif GRUB_CPU_SIZEOF_VOID_P == 8
> +  char padding[8+8+8];
> +#else
> +# error "unknown word size"
> +#endif

grub_uint8_t padding[3 * GRUB_CPU_SIZEOF_VOID_P];?

Or do not we have a construct in the C which allows us to add paddings
like that one in smarter way?

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/19] configure: properly pass through MM_DEBUG
  2021-10-12  7:29 ` [PATCH 06/19] configure: properly pass through MM_DEBUG Daniel Axtens
@ 2021-11-04 18:00   ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-04 18:00 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:29:55PM +1100, Daniel Axtens wrote:
> I noticed that compiling with --enable-mm-debug didn't cause the
> functions in #ifdef MM_DEBUG to be compiled in. Somehow the variable
> wasn't actually being substituted into anything that was built.
>
> Change configure.ac to do AC_SUBST(), and put a substitution into
> config.h.in. This makes MM_DEBUG available to any file which includes
> config.h.
>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  config.h.in  | 3 +++
>  configure.ac | 6 ++++--
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/config.h.in b/config.h.in
> index 9e8f9911b183..8cf62616cad7 100644
> --- a/config.h.in
> +++ b/config.h.in
> @@ -13,6 +13,9 @@
>  #define DISK_CACHE_STATS @DISK_CACHE_STATS@
>  #define BOOT_TIME_STATS @BOOT_TIME_STATS@
>
> +/* Define to 1 to enable mm debugging */
> +#define MM_DEBUG @MM_DEBUG@
> +
>  /* We don't need those.  */
>  #define MINILZO_CFG_SKIP_LZO_PTR 1
>  #define MINILZO_CFG_SKIP_LZO_UTIL 1
> diff --git a/configure.ac b/configure.ac
> index 8d1c81a7316e..889d07b3c1d0 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1511,9 +1511,11 @@ AC_ARG_ENABLE([mm-debug],
>  	      AS_HELP_STRING([--enable-mm-debug],
>                               [include memory manager debugging]))
>  if test x$enable_mm_debug = xyes; then
> -    AC_DEFINE([MM_DEBUG], [1],
> -            [Define to 1 if you enable memory manager debugging.])
> +    MM_DEBUG=1
> +else
> +    MM_DEBUG=0
>  fi
> +AC_SUBST([MM_DEBUG])

It seems to me this patch is incorrect. The MM_DEBUG constant is properly
defined in config-util.h (config-util.h.in contains "#undef MM_DEBUG")
which is included conditionally from config.h. The key word here is
"conditionally". It means MM_DEBUG is defined when utils are build but
it is not defined when the GRUB itself is build. Of course it does not
make a lot of sense but it is what it is. I think this is due to how
config-util.h and config.h are generated. The former is generated using
AC_CONFIG_HEADER() and the latter is generated using AC_CONFIG_FILES().
I tried to move "#undef MM_DEBUG" to the config.h.in but it does not
help for some reason. If you could investigate this further that would
be perfect...

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 07/19] Add memtool module with memory allocation stress-test
  2021-10-12  7:29 ` [PATCH 07/19] Add memtool module with memory allocation stress-test Daniel Axtens
  2021-10-19 19:47   ` Glenn Washburn
@ 2021-11-09 12:54   ` Daniel Kiper
  2021-11-10 22:29     ` Daniel Kiper
  1 sibling, 1 reply; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 12:54 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:29:56PM +1100, Daniel Axtens wrote:
> When working on memory, it's nice to be able to test your work.
>
> Add a memtest module. When compiled with --enable-mm-debug, it exposes
> 3 commands:
>
>  * lsmem - print all allocations and free space in all regions
>  * lsfreemem - print free space in all regions
>
>  * stress_big_allocs - stress test large allocations:
>   - how much memory can we allocate in one chunk?
>   - how many 1MB chunks can we allocate?
>   - check that gap-filling works with a 1MB aligned 900kB alloc + a
>      100kB alloc.
>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
>
> ---
>
> I've put this as copyright IBM for now - hopefully we can conclude on
> whether we're still doing FSF copyright assignments?

It seems to me we should do it. In the worst case, I think, we can put
IBM and FSF copyright into the file.

> ---
>  grub-core/Makefile.core.def   |   5 ++
>  grub-core/commands/memtools.c | 157 ++++++++++++++++++++++++++++++++++
>  grub-core/kern/mm.c           |   4 +
>  include/grub/mm.h             |   4 +-
>  4 files changed, 168 insertions(+), 2 deletions(-)
>  create mode 100644 grub-core/commands/memtools.c
>
> diff --git a/grub-core/Makefile.core.def b/grub-core/Makefile.core.def
> index 8022e1c0a794..0cc3a4a500ec 100644
> --- a/grub-core/Makefile.core.def
> +++ b/grub-core/Makefile.core.def
> @@ -2527,3 +2527,8 @@ module = {
>    common = commands/i386/wrmsr.c;
>    enable = x86;
>  };
> +
> +module = {
> +  name = memtools;
> +  common = commands/memtools.c;
> +};
> diff --git a/grub-core/commands/memtools.c b/grub-core/commands/memtools.c
> new file mode 100644
> index 000000000000..6d5778f4a1b0
> --- /dev/null
> +++ b/grub-core/commands/memtools.c
> @@ -0,0 +1,157 @@
> +/*
> + *  GRUB  --  GRand Unified Bootloader
> + *  Copyright (C) 2021  IBM Corporation
> + *
> + *  GRUB is free software: you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation, either version 3 of the License, or
> + *  (at your option) any later version.
> + *
> + *  GRUB is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with GRUB.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <config.h>
> +#include <grub/dl.h>
> +#include <grub/misc.h>
> +#include <grub/command.h>
> +#include <grub/i18n.h>
> +#include <grub/memory.h>
> +#include <grub/mm.h>
> +
> +GRUB_MOD_LICENSE ("GPLv3+");
> +
> +#ifdef MM_DEBUG

Could we avoid this ifdefery and do everything in Makefile? Then
non-debug builds will no contain even memtools stub module.

s/memtools/memdebug/?

> +static grub_err_t
> +grub_cmd_lsmem (grub_command_t cmd __attribute__ ((unused)),
> +		 int argc __attribute__ ((unused)),
> +		 char **args __attribute__ ((unused)))
> +
> +{
> +#ifndef GRUB_MACHINE_EMU
> +  grub_mm_dump(0);

Missing space between function name and "(". Please fix the same
mistakes below.

> +#endif
> +
> +  return 0;
> +}
> +
> +static grub_err_t
> +grub_cmd_lsfreemem (grub_command_t cmd __attribute__ ((unused)),
> +		    int argc __attribute__ ((unused)),
> +		    char **args __attribute__ ((unused)))
> +
> +{
> +#ifndef GRUB_MACHINE_EMU
> +  grub_mm_dump_free();

Ditto...

> +#endif
> +
> +  return 0;

return GRUB_ERR_NONE

> +}
> +
> +
> +#define BIG_ALLOC (64 * 1024 * 1024)

s/BIG_ALLOC/STRESS_BIG_ALLOC/?

> +#define SMALL_ALLOC 32

s/SMALL_ALLOC/STRESS_SMALL_ALLOC/?

And could you define these constants immediately behind GRUB_MOD_LICENSE()?

> +static grub_err_t
> +grub_cmd_stress_big_allocs (grub_command_t cmd __attribute__ ((unused)),
> +			    int argc __attribute__ ((unused)),
> +			    char **args __attribute__ ((unused)))
> +{
> +  int i, max_mb, blocks_alloced;
> +  void *mem;
> +  void **blocklist;
> +
> +  grub_printf ("Test 1: increasingly sized allocs to 1GB block\n");
> +  for (i = 1; i < 1024; i++) {
> +    grub_printf ("%d MB . ", i);
> +    mem = grub_malloc (i * 1024 * 1024);
> +    if (mem == NULL)
> +      {
> +	grub_printf ("failed\n");
> +	break;
> +      }
> +    else
> +      grub_free (mem);
> +
> +    if (i % 10 == 0)
> +      grub_printf ("\n");
> +  }
> +
> +  max_mb = i - 1;

I think i is not incremented when break is executed. So, max_mb value will be wrong.

> +  grub_printf ("Max sized allocation we did was %d MB\n", max_mb);
> +
> +  grub_printf ("Test 2: 1MB at a time, max 4GB\n");
> +  blocklist = grub_calloc (4096, sizeof (void *));

The blocklist can be NULL here.

> +  for (i = 0; i < 4096; i++)
> +    {
> +      blocklist[i] = grub_malloc (1024 * 1024);
> +      if (!blocklist[i])

if (blocklist[i] == NULL)

And please fix similar things below...

> +	{
> +	  grub_printf ("Ran out of memory at iteration %d\n", i);
> +	  break;

Should not we print a dot after every 32 or 64 iterations?

> +	}
> +    }
> +  blocks_alloced = i;

Again, i value can be wrong after break.

> +  for (i = 0; i < blocks_alloced; i++)
> +    grub_free (blocklist[i]);

I think each of these tests should have separate command assigned.

> +  grub_printf ("\nTest 3: 1MB aligned 900kB + 100kB\n");
> +  //grub_mm_debug=1;

Please drop this.

> +  for (i = 0; i < 4096; i += 2)
> +    {
> +      blocklist[i] = grub_memalign (1024 * 1024, 900 * 1024);
> +      if (!blocklist[i])
> +	{
> +	  grub_printf ("Failed big allocation, iteration %d\n", i);
> +	  blocks_alloced = i;
> +	  break;
> +	}
> +
> +      blocklist[i + 1] = grub_malloc (100 * 1024);
> +      if (!blocklist[i + 1])
> +	{
> +	  grub_printf ("Failed small allocation, iteration %d\n", i);
> +	  blocks_alloced = i + 1;
> +	  break;
> +	}
> +      grub_printf (".");

Should not we do this every 32 or 64 iterations instead?

> +    }
> +  for (i = 0; i < blocks_alloced; i++)
> +    grub_free (blocklist[i]);
> +
> +  grub_free (blocklist);
> +
> +  grub_errno = GRUB_ERR_NONE;
> +  return GRUB_ERR_NONE;
> +}

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 08/19] mm: Drop unused unloading of modules on OOM
  2021-10-12  7:29 ` [PATCH 08/19] mm: Drop unused unloading of modules on OOM Daniel Axtens
@ 2021-11-09 13:15   ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 13:15 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:29:57PM +1100, Daniel Axtens wrote:
> From: Patrick Steinhardt <ps@pks.im>
>
> In `grub_memalign ()`, there's a commented section which would allow for
> unloading of unneeded modules in case where there is not enough free
> memory available to satisfy a request. Given that this code is never
> compiled in, let's remove it together with `grub_dl_unload_unneeded()`
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>

You should add your SOB here.

Otherwise Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 09/19] mm: Allow dynamically requesting additional memory regions
  2021-10-12  7:29 ` [PATCH 09/19] mm: Allow dynamically requesting additional memory regions Daniel Axtens
@ 2021-11-09 13:25   ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 13:25 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:29:58PM +1100, Daniel Axtens wrote:
> From: Patrick Steinhardt <ps@pks.im>
>
> Currently, all platforms will set up their heap on initialization of the
> platform code. While this works mostly fine, it poses some limitations
> on memory management on us. Most notably, allocating big chunks of
> memory in the gigabyte range would require us to pre-request this many
> bytes from the firmware and add it to the heap from the beginning on
> some platforms like EFI. As this isn't needed for most configurations,
> it is inefficient and may even negatively impact some usecases when,
> e.g., chainloading. Nonetheless, allocating big chunks of memory is
> required sometimes, where one example is the upcoming support for the
> Argon2 key derival function in LUKS2.
>
> In order to avoid pre-allocating big chunks of memory, this commit
> implements a runtime mechanism to add more pages to the system. When a
> given allocation cannot be currently satisfied, we'll call a given
> callback set up by the platform's own memory management subsystem,
> asking it to add a memory area with at least `n` bytes. If this
> succeeds, we retry searching for a valid memory region, which should now
> succeed.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> [dja: add this to the documentation at the top of mm.c]
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  grub-core/kern/mm.c | 13 +++++++++++++
>  include/grub/mm.h   | 16 ++++++++++++++++
>  2 files changed, 29 insertions(+)
>
> diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
> index 6038c0d0cbb2..58d5b89e8860 100644
> --- a/grub-core/kern/mm.c
> +++ b/grub-core/kern/mm.c
> @@ -28,6 +28,9 @@
>    - multiple regions may be used as free space. They may not be
>    contiguous.
>
> +  - if existing regions are insufficient to satisfy an allocation, a new
> +  region can be requested from firmware.
> +
>    Regions are managed by a singly linked list, and the meta information is
>    stored in the beginning of each region. Space after the meta information
>    is used to allocate memory.
> @@ -81,6 +84,7 @@
>  \f
>
>  grub_mm_region_t grub_mm_base;
> +grub_mm_add_region_func_t grub_mm_add_region_fn;
>
>  /* Get a header from the pointer PTR, and set *P and *R to a pointer
>     to the header and a pointer to its region, respectively. PTR must
> @@ -377,6 +381,15 @@ grub_memalign (grub_size_t align, grub_size_t size)
>        count++;
>        goto again;
>
> +    case 1:
> +      /* Request additional pages.  */
> +      count++;
> +
> +      if (grub_mm_add_region_fn && grub_mm_add_region_fn (size, GRUB_MM_ADD_REGION_CONSECUTIVE) == GRUB_ERR_NONE)

if (grub_mm_add_region_fn != NULL && ...

If you fix this you can add Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>

> +	goto again;
> +
> +      /* fallthrough */
> +
>      default:
>        break;
>      }

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/19] efi: mm: Always request a fixed number of pages on init
  2021-10-12  7:29 ` [PATCH 10/19] efi: mm: Always request a fixed number of pages on init Daniel Axtens
@ 2021-11-09 13:32   ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 13:32 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:29:59PM +1100, Daniel Axtens wrote:
> From: Patrick Steinhardt <ps@pks.im>
>
> When initializing the EFI memory subsytem, we will by default request a
> quarter of the available memory, bounded by a minimum/maximum value.
> Given that we're about to extend the EFI memory system to dynamically
> request additional pages from the firmware as required, this scaling of
> requested memory based on available memory will not make a lot of sense
> anymore.
>
> Remove this logic as a preparatory patch such that we'll instead defer
> to the runtime memory allocator. Note that ideally, we'd want to change
> this after dynamic requesting of pages has been implemented for the EFI
> platform. But because we'll need to split up initialization of the
> memory subsystem and the request of pages from the firmware, we'd have
> to duplicate quite some logic at first only to remove it afterwards
> again. This seems quite pointless, so we instead have patches slightly
> out of order.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>

Please add your SOB here.

Otherwise Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 11/19] efi: mm: Extract function to add memory regions
  2021-10-12  7:30 ` [PATCH 11/19] efi: mm: Extract function to add memory regions Daniel Axtens
  2021-10-19 21:39   ` Glenn Washburn
  2021-10-19 21:58   ` Glenn Washburn
@ 2021-11-09 13:38   ` Daniel Kiper
  2 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 13:38 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps, development

On Tue, Oct 12, 2021 at 06:30:00PM +1100, Daniel Axtens wrote:
> From: Patrick Steinhardt <ps@pks.im>
>
> In preparation of support for runtime-allocating additional memory
> region, this patch extracts the function to retrieve the EFI memory map
> and add a subset of it to GRUB's own memory regions.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  grub-core/kern/efi/mm.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
> index 4d276bc87a4c..cfc6a67fc0cc 100644
> --- a/grub-core/kern/efi/mm.c
> +++ b/grub-core/kern/efi/mm.c
> @@ -504,7 +504,7 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>
>        addr = grub_efi_allocate_pages_real (start, pages,
>  					   GRUB_EFI_ALLOCATE_ADDRESS,
> -					   GRUB_EFI_LOADER_CODE);
> +					   GRUB_EFI_LOADER_CODE);

Please drop this change.

...and if possible please take into account Glenn's comments...

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()`
  2021-10-19 21:37   ` Glenn Washburn
@ 2021-11-09 13:56     ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 13:56 UTC (permalink / raw)
  To: Glenn Washburn
  Cc: Daniel Axtens, The development of GNU GRUB, leif, stefanb, ps

On Tue, Oct 19, 2021 at 04:37:03PM -0500, Glenn Washburn wrote:
> On Tue, 12 Oct 2021 18:30:01 +1100
> Daniel Axtens <dja@axtens.net> wrote:
>
> > From: Patrick Steinhardt <ps@pks.im>
> >
> > The function `add_memory_regions ()` is currently only called on system
> > initialization to allocate a fixed amount of pages. As such, it didn't
> > need to return any errors: in case it failed, we cannot proceed anyway.
> > This will change with the upcoming support for requesting more memory
> > from the firmware at runtime, where it doesn't make sense anymore to
> > fail hard.
> >
> > Refactor the function to return an error to prepare for this. Note that
> > this does not change the behaviour when initializing the memory system
> > because `grub_efi_mm_init ()` knows to call `grub_fatal ()` in case
> > `grub_efi_mm_add_regions ()` returns an error.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  grub-core/kern/efi/mm.c | 20 +++++++++++++-------
> >  1 file changed, 13 insertions(+), 7 deletions(-)
> >
> > diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
> > index cfc6a67fc0cc..ced3ee5e7537 100644
> > --- a/grub-core/kern/efi/mm.c
> > +++ b/grub-core/kern/efi/mm.c
> > @@ -478,7 +478,7 @@ filter_memory_map (grub_efi_memory_descriptor_t *memory_map,
> >  }
> >
> >  /* Add memory regions.  */
> > -static void
> > +static grub_err_t
> >  add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
> >  		    grub_efi_uintn_t desc_size,
> >  		    grub_efi_memory_descriptor_t *memory_map_end,
> > @@ -506,9 +506,9 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
> >  					   GRUB_EFI_ALLOCATE_ADDRESS,
> >  					   GRUB_EFI_LOADER_CODE);
> >        if (! addr)
> > -	grub_fatal ("cannot allocate conventional memory %p with %u pages",
> > -		    (void *) ((grub_addr_t) start),
> > -		    (unsigned) pages);
> > +	return grub_error (GRUB_ERR_OUT_OF_MEMORY,
> > +			    "cannot allocate conventional memory %p with %u pages",
> > +			    (void *) ((grub_addr_t) start), (unsigned) pages);
>
> I wonder if we shouldn't print a debug message here and try the next
> descriptor. Is it possible that grub_efi_allocate_pages_real fails for
> the specified range, but a range further on will succeed? The fact that
> it hasn't mattered so far seems to indicate that either this case is
> very rarely encountered or we should not continue on a failure for some
> reason.

You allocate number of pages of a given UEFI memory region or less. So,
the grub_efi_allocate_pages_real() may fail only due to bug somewhere.
Then I do not think we should try next UEFI memory region.

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()`
  2021-10-12  7:30 ` [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()` Daniel Axtens
  2021-10-19 21:37   ` Glenn Washburn
@ 2021-11-09 16:10   ` Daniel Kiper
  1 sibling, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 16:10 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:30:01PM +1100, Daniel Axtens wrote:
> From: Patrick Steinhardt <ps@pks.im>
>
> The function `add_memory_regions ()` is currently only called on system
> initialization to allocate a fixed amount of pages. As such, it didn't
> need to return any errors: in case it failed, we cannot proceed anyway.
> This will change with the upcoming support for requesting more memory
> from the firmware at runtime, where it doesn't make sense anymore to
> fail hard.
>
> Refactor the function to return an error to prepare for this. Note that
> this does not change the behaviour when initializing the memory system
> because `grub_efi_mm_init ()` knows to call `grub_fatal ()` in case
> `grub_efi_mm_add_regions ()` returns an error.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  grub-core/kern/efi/mm.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/grub-core/kern/efi/mm.c b/grub-core/kern/efi/mm.c
> index cfc6a67fc0cc..ced3ee5e7537 100644
> --- a/grub-core/kern/efi/mm.c
> +++ b/grub-core/kern/efi/mm.c
> @@ -478,7 +478,7 @@ filter_memory_map (grub_efi_memory_descriptor_t *memory_map,
>  }
>
>  /* Add memory regions.  */
> -static void
> +static grub_err_t
>  add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>  		    grub_efi_uintn_t desc_size,
>  		    grub_efi_memory_descriptor_t *memory_map_end,
> @@ -506,9 +506,9 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>  					   GRUB_EFI_ALLOCATE_ADDRESS,
>  					   GRUB_EFI_LOADER_CODE);
>        if (! addr)
> -	grub_fatal ("cannot allocate conventional memory %p with %u pages",
> -		    (void *) ((grub_addr_t) start),
> -		    (unsigned) pages);
> +	return grub_error (GRUB_ERR_OUT_OF_MEMORY,
> +			    "cannot allocate conventional memory %p with %u pages",
> +			    (void *) ((grub_addr_t) start), (unsigned) pages);
>
>        grub_mm_init_region (addr, PAGES_TO_BYTES (pages));
>
> @@ -518,7 +518,9 @@ add_memory_regions (grub_efi_memory_descriptor_t *memory_map,
>      }
>
>    if (required_pages > 0)
> -    grub_fatal ("too little memory");
> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "too little memory");

I agree with Glenn. The message is a bit imprecise here. Maybe we should
make more clear that the UEFI allocation somehow failed. I think I would
improve message above too.

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 13/19] efi: mm: Implement runtime addition of pages
  2021-10-12  7:30 ` [PATCH 13/19] efi: mm: Implement runtime addition of pages Daniel Axtens
@ 2021-11-09 16:13   ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-09 16:13 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:30:02PM +1100, Daniel Axtens wrote:
> From: Patrick Steinhardt <ps@pks.im>
>
> Adjust the interface of `grub_efi_mm_add_regions ()` to take a set of
> `GRUB_MM_ADD_REGION_*` flags, which most notably is currently only the
> `CONSECUTVE` flag. This allows us to set the function up as callback for
> the memory subsystem and have it call out to us in case there's not
> enough pages available in the current heap.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>

Please add your SOB here.

Otherwise Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 17/19] [not for merge] print more debug info in mm
  2021-10-12  7:30 ` [PATCH 17/19] [not for merge] print more debug info in mm Daniel Axtens
@ 2021-11-10 13:47   ` Daniel Kiper
  2021-11-10 19:35     ` Glenn Washburn
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Kiper @ 2021-11-10 13:47 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps, development

CC-ing Glenn...

On Tue, Oct 12, 2021 at 06:30:06PM +1100, Daniel Axtens wrote:
> This is handy for debugging - I'm including it in case anyone else hacking
> on this area finds it helpful.
>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>  grub-core/kern/mm.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
> index 58d5b89e8860..811df1ab5ebb 100644
> --- a/grub-core/kern/mm.c
> +++ b/grub-core/kern/mm.c
> @@ -135,6 +135,9 @@ grub_mm_init_region (void *addr, grub_size_t size)
>    for (p = &grub_mm_base, q = *p; q; p = &(q->next), q = *p)
>      {
>        /* Does this region come _before_ an existing region? */
> +      grub_printf ("Extending w/ before %p + %" PRIxGRUB_SIZE " + %" PRIxGRUB_SIZE " = %p ? %s\n",
> +		 (grub_uint8_t *)addr, size, q->pre_size, (grub_uint8_t *)q,
> +		 (grub_uint8_t *)addr + size + q->pre_size == (grub_uint8_t *) q ? "yes" : "no");

I think this kind of messages can be useful. Same applies to patch #18.
Though I would use grub_dprintf() instead which should be wrapped with
#ifdef MM_DEBUG. However, we have to be very careful with printing any
messages from mm and do not exeecec 255 chars message length. If we go
above that limit then we will trigger dynamic allocation in grub_dprintf()
from mm which may lead to a recursion...

Additionally, I think Glenn's patch allowing us to disable logging from
certain subsystem would be useful here. Glenn, could you take a look at
it once again?

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 19/19] RFC: Ignore REGION_CONSECUTIVE
  2021-10-12  7:30 ` [PATCH 19/19] RFC: Ignore REGION_CONSECUTIVE Daniel Axtens
@ 2021-11-10 14:00   ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-10 14:00 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Oct 12, 2021 at 06:30:08PM +1100, Daniel Axtens wrote:
> Looking into the region merging stuff, I started to wonder if we actually
> want to just try adding as much memory as we can (up to the limit requested),
> even if we cannot satisfy the full allocation (i.e. we are not CONSECUTIVE).
>
> On powerpc-ieee1275 this actually allows me to satisfy bigger allocations:
> while the FW cannot necessarily satisfy e.g. a 128MB allocation, it can
> satisfy e.g. a 32MB allocation, and that might merge with a region that's
> 100MB, and then we can satisfy our 128MB allocation.
>
> The way the call out to firmware operates in mm.c is that it preceeds a
> 'goto again;' - it kicks off a whole new round of attempting to satisfy
> allocation. mm.c doesn't fill the allocation only from the new memory from
> firmware.
>
> Perhaps an even better approach would be for the mm.c code to try going to
> firmware twice - once for the precise size of the allocation (rounded up etc)
> and then if that doesn't succeed go back and ask for 'anything you can give me'?

I think this is good idea. Though maybe we should allow platforms to
choose mode of operation, i.e. try to allocate one memory region with
precise size first and if this does not work then fulfill my request
returning memory to the GRUB mm in chunks or vice versa or maybe some
platforms would want to use only one allocation strategy...

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 17/19] [not for merge] print more debug info in mm
  2021-11-10 13:47   ` Daniel Kiper
@ 2021-11-10 19:35     ` Glenn Washburn
  2021-11-10 22:17       ` Daniel Kiper
  0 siblings, 1 reply; 49+ messages in thread
From: Glenn Washburn @ 2021-11-10 19:35 UTC (permalink / raw)
  To: Daniel Kiper; +Cc: Daniel Axtens, grub-devel, leif, stefanb, ps

On Wed, 10 Nov 2021 14:47:07 +0100
Daniel Kiper <dkiper@net-space.pl> wrote:

> CC-ing Glenn...
> 
> On Tue, Oct 12, 2021 at 06:30:06PM +1100, Daniel Axtens wrote:
> > This is handy for debugging - I'm including it in case anyone else hacking
> > on this area finds it helpful.
> >
> > Signed-off-by: Daniel Axtens <dja@axtens.net>
> > ---
> >  grub-core/kern/mm.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
> > index 58d5b89e8860..811df1ab5ebb 100644
> > --- a/grub-core/kern/mm.c
> > +++ b/grub-core/kern/mm.c
> > @@ -135,6 +135,9 @@ grub_mm_init_region (void *addr, grub_size_t size)
> >    for (p = &grub_mm_base, q = *p; q; p = &(q->next), q = *p)
> >      {
> >        /* Does this region come _before_ an existing region? */
> > +      grub_printf ("Extending w/ before %p + %" PRIxGRUB_SIZE " + %" PRIxGRUB_SIZE " = %p ? %s\n",
> > +		 (grub_uint8_t *)addr, size, q->pre_size, (grub_uint8_t *)q,
> > +		 (grub_uint8_t *)addr + size + q->pre_size == (grub_uint8_t *) q ? "yes" : "no");
> 
> I think this kind of messages can be useful. Same applies to patch #18.
> Though I would use grub_dprintf() instead which should be wrapped with
> #ifdef MM_DEBUG. However, we have to be very careful with printing any
> messages from mm and do not exeecec 255 chars message length. If we go
> above that limit then we will trigger dynamic allocation in grub_dprintf()
> from mm which may lead to a recursion...
> 
> Additionally, I think Glenn's patch allowing us to disable logging from
> certain subsystem would be useful here. Glenn, could you take a look at
> it once again?

Yes, I'm planning on it. Its been low priority for me and this month is
very busy fo me IRL. I don't think my patches should hold this up
though (or would this cause way to much logs if one didn't want them?)

Glenn


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 17/19] [not for merge] print more debug info in mm
  2021-11-10 19:35     ` Glenn Washburn
@ 2021-11-10 22:17       ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-10 22:17 UTC (permalink / raw)
  To: Glenn Washburn; +Cc: Daniel Axtens, grub-devel, leif, stefanb, ps

On Wed, Nov 10, 2021 at 01:35:12PM -0600, Glenn Washburn wrote:
> On Wed, 10 Nov 2021 14:47:07 +0100
> Daniel Kiper <dkiper@net-space.pl> wrote:
>
> > CC-ing Glenn...
> >
> > On Tue, Oct 12, 2021 at 06:30:06PM +1100, Daniel Axtens wrote:
> > > This is handy for debugging - I'm including it in case anyone else hacking
> > > on this area finds it helpful.
> > >
> > > Signed-off-by: Daniel Axtens <dja@axtens.net>
> > > ---
> > >  grub-core/kern/mm.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/grub-core/kern/mm.c b/grub-core/kern/mm.c
> > > index 58d5b89e8860..811df1ab5ebb 100644
> > > --- a/grub-core/kern/mm.c
> > > +++ b/grub-core/kern/mm.c
> > > @@ -135,6 +135,9 @@ grub_mm_init_region (void *addr, grub_size_t size)
> > >    for (p = &grub_mm_base, q = *p; q; p = &(q->next), q = *p)
> > >      {
> > >        /* Does this region come _before_ an existing region? */
> > > +      grub_printf ("Extending w/ before %p + %" PRIxGRUB_SIZE " + %" PRIxGRUB_SIZE " = %p ? %s\n",
> > > +		 (grub_uint8_t *)addr, size, q->pre_size, (grub_uint8_t *)q,
> > > +		 (grub_uint8_t *)addr + size + q->pre_size == (grub_uint8_t *) q ? "yes" : "no");
> >
> > I think this kind of messages can be useful. Same applies to patch #18.
> > Though I would use grub_dprintf() instead which should be wrapped with
> > #ifdef MM_DEBUG. However, we have to be very careful with printing any
> > messages from mm and do not exeecec 255 chars message length. If we go
> > above that limit then we will trigger dynamic allocation in grub_dprintf()
> > from mm which may lead to a recursion...
> >
> > Additionally, I think Glenn's patch allowing us to disable logging from
> > certain subsystem would be useful here. Glenn, could you take a look at
> > it once again?
>
> Yes, I'm planning on it. Its been low priority for me and this month is

Cool! Thanks!

> very busy fo me IRL. I don't think my patches should hold this up
> though (or would this cause way to much logs if one didn't want them?)

No rush...

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 07/19] Add memtool module with memory allocation stress-test
  2021-11-09 12:54   ` Daniel Kiper
@ 2021-11-10 22:29     ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2021-11-10 22:29 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Tue, Nov 09, 2021 at 01:54:04PM +0100, Daniel Kiper wrote:
> On Tue, Oct 12, 2021 at 06:29:56PM +1100, Daniel Axtens wrote:
> > When working on memory, it's nice to be able to test your work.
> >
> > Add a memtest module. When compiled with --enable-mm-debug, it exposes
> > 3 commands:
> >
> >  * lsmem - print all allocations and free space in all regions
> >  * lsfreemem - print free space in all regions
> >
> >  * stress_big_allocs - stress test large allocations:
> >   - how much memory can we allocate in one chunk?
> >   - how many 1MB chunks can we allocate?
> >   - check that gap-filling works with a 1MB aligned 900kB alloc + a
> >      100kB alloc.
> >
> > Signed-off-by: Daniel Axtens <dja@axtens.net>
> >
> > ---
> >
> > I've put this as copyright IBM for now - hopefully we can conclude on
> > whether we're still doing FSF copyright assignments?
>
> It seems to me we should do it. In the worst case, I think, we can put
> IBM and FSF copyright into the file.
>
> > ---
> >  grub-core/Makefile.core.def   |   5 ++
> >  grub-core/commands/memtools.c | 157 ++++++++++++++++++++++++++++++++++
> >  grub-core/kern/mm.c           |   4 +
> >  include/grub/mm.h             |   4 +-
> >  4 files changed, 168 insertions(+), 2 deletions(-)
> >  create mode 100644 grub-core/commands/memtools.c
> >
> > diff --git a/grub-core/Makefile.core.def b/grub-core/Makefile.core.def
> > index 8022e1c0a794..0cc3a4a500ec 100644
> > --- a/grub-core/Makefile.core.def
> > +++ b/grub-core/Makefile.core.def
> > @@ -2527,3 +2527,8 @@ module = {
> >    common = commands/i386/wrmsr.c;
> >    enable = x86;
> >  };
> > +
> > +module = {
> > +  name = memtools;
> > +  common = commands/memtools.c;
> > +};
> > diff --git a/grub-core/commands/memtools.c b/grub-core/commands/memtools.c
> > new file mode 100644
> > index 000000000000..6d5778f4a1b0
> > --- /dev/null
> > +++ b/grub-core/commands/memtools.c
> > @@ -0,0 +1,157 @@
> > +/*
> > + *  GRUB  --  GRand Unified Bootloader
> > + *  Copyright (C) 2021  IBM Corporation
> > + *
> > + *  GRUB is free software: you can redistribute it and/or modify
> > + *  it under the terms of the GNU General Public License as published by
> > + *  the Free Software Foundation, either version 3 of the License, or
> > + *  (at your option) any later version.
> > + *
> > + *  GRUB is distributed in the hope that it will be useful,
> > + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + *  GNU General Public License for more details.
> > + *
> > + *  You should have received a copy of the GNU General Public License
> > + *  along with GRUB.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include <config.h>
> > +#include <grub/dl.h>
> > +#include <grub/misc.h>
> > +#include <grub/command.h>
> > +#include <grub/i18n.h>
> > +#include <grub/memory.h>
> > +#include <grub/mm.h>
> > +
> > +GRUB_MOD_LICENSE ("GPLv3+");
> > +
> > +#ifdef MM_DEBUG
>
> Could we avoid this ifdefery and do everything in Makefile? Then
> non-debug builds will no contain even memtools stub module.
>
> s/memtools/memdebug/?
>
> > +static grub_err_t
> > +grub_cmd_lsmem (grub_command_t cmd __attribute__ ((unused)),
> > +		 int argc __attribute__ ((unused)),
> > +		 char **args __attribute__ ((unused)))
> > +
> > +{
> > +#ifndef GRUB_MACHINE_EMU
> > +  grub_mm_dump(0);
>
> Missing space between function name and "(". Please fix the same
> mistakes below.
>
> > +#endif
> > +
> > +  return 0;
> > +}
> > +
> > +static grub_err_t
> > +grub_cmd_lsfreemem (grub_command_t cmd __attribute__ ((unused)),
> > +		    int argc __attribute__ ((unused)),
> > +		    char **args __attribute__ ((unused)))
> > +
> > +{
> > +#ifndef GRUB_MACHINE_EMU
> > +  grub_mm_dump_free();
>
> Ditto...
>
> > +#endif
> > +
> > +  return 0;
>
> return GRUB_ERR_NONE
>
> > +}
> > +
> > +
> > +#define BIG_ALLOC (64 * 1024 * 1024)
>
> s/BIG_ALLOC/STRESS_BIG_ALLOC/?
>
> > +#define SMALL_ALLOC 32
>
> s/SMALL_ALLOC/STRESS_SMALL_ALLOC/?
>
> And could you define these constants immediately behind GRUB_MOD_LICENSE()?
>
> > +static grub_err_t
> > +grub_cmd_stress_big_allocs (grub_command_t cmd __attribute__ ((unused)),
> > +			    int argc __attribute__ ((unused)),
> > +			    char **args __attribute__ ((unused)))
> > +{
> > +  int i, max_mb, blocks_alloced;
> > +  void *mem;
> > +  void **blocklist;
> > +
> > +  grub_printf ("Test 1: increasingly sized allocs to 1GB block\n");
> > +  for (i = 1; i < 1024; i++) {
> > +    grub_printf ("%d MB . ", i);
> > +    mem = grub_malloc (i * 1024 * 1024);
> > +    if (mem == NULL)
> > +      {
> > +	grub_printf ("failed\n");
> > +	break;
> > +      }
> > +    else
> > +      grub_free (mem);
> > +
> > +    if (i % 10 == 0)
> > +      grub_printf ("\n");
> > +  }
> > +
> > +  max_mb = i - 1;
>
> I think i is not incremented when break is executed. So, max_mb value will be wrong.
>
> > +  grub_printf ("Max sized allocation we did was %d MB\n", max_mb);
> > +
> > +  grub_printf ("Test 2: 1MB at a time, max 4GB\n");
> > +  blocklist = grub_calloc (4096, sizeof (void *));
>
> The blocklist can be NULL here.
>
> > +  for (i = 0; i < 4096; i++)
> > +    {
> > +      blocklist[i] = grub_malloc (1024 * 1024);
> > +      if (!blocklist[i])
>
> if (blocklist[i] == NULL)
>
> And please fix similar things below...
>
> > +	{
> > +	  grub_printf ("Ran out of memory at iteration %d\n", i);
> > +	  break;
>
> Should not we print a dot after every 32 or 64 iterations?
>
> > +	}
> > +    }
> > +  blocks_alloced = i;
>
> Again, i value can be wrong after break.
>
> > +  for (i = 0; i < blocks_alloced; i++)
> > +    grub_free (blocklist[i]);
>
> I think each of these tests should have separate command assigned.

...and then everyone should get some parameters instead of being them
hard coded. This should ease testing with various memory allocation sizes
and number of iterations.

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 03/19] mm: document grub internal memory management structures
  2021-10-12 19:18   ` Glenn Washburn
@ 2021-11-24  0:57     ` Daniel Axtens
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Axtens @ 2021-11-24  0:57 UTC (permalink / raw)
  To: development; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

Glenn Washburn <development@efficientek.com> writes:

> On Tue, 12 Oct 2021 18:29:52 +1100
> Daniel Axtens <dja@axtens.net> wrote:
>
>> I spent more than a trivial quantity of time figuring out pre_size and
>> whether a memory region's size contains the header cell or not.
>> 
>> Document the meanings of all the properties. Hopefully now no-one else
>> has to figure it out!
>> 
>> Signed-off-by: Daniel Axtens <dja@axtens.net>
>> ---
>>  include/grub/mm_private.h | 21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>> 
>> diff --git a/include/grub/mm_private.h b/include/grub/mm_private.h
>> index c2c4cb1511c6..e80a059dd4e4 100644
>> --- a/include/grub/mm_private.h
>> +++ b/include/grub/mm_private.h
>> @@ -25,11 +25,18 @@
>>  #define GRUB_MM_FREE_MAGIC	0x2d3c2808
>>  #define GRUB_MM_ALLOC_MAGIC	0x6db08fa4
>>  
>> +/* A header describing a block of memory - either allocated or free */
>>  typedef struct grub_mm_header
>>  {
>> +  /* The 'next' free block in this region. A circular list.
>> +     Only meaningful if the block is free.
>> +   */
>>    struct grub_mm_header *next;
>
> This and the subsequent comments do not follow the multiline comment
> style[1].

Will fix, thanks.

> One nit, "region. A circular" -> "region's circular".

Sure, will do.

> This comment leads me to wonder if the block is not free, what is the
> value of next? Seems like it should be NULL, if its not meaningful.

In practice the value will be:

 - if the block was converted from an existing free block: whatever the
   'next' pointer was when the block was free

 - if the block was carved out of the middle or end of an existing free
   block: whatever happened to be at that offset in memory.

We certainly _could_ null it out. I don't think there's any real value
in doing so but I'm open to being convinced. I suppose you could make an
argument that it reduces the chance that we'll follow a pointer to
somewhere random in memory but we check the magic before we check next
so we're fairly safe.

> https://www.gnu.org/software/grub/manual/grub-dev/grub-dev.html#Multi_002dLine-Comments
>
>> +  /* The region size in cells (not bytes). Includes the header cell. */
>
> What exactly does "cell" mean in this context? I'm gathering cell is
> not defined as in this link[2], which is equivalent to "bit". Perhaps
> "size" should be renamed to "ncells" or something more descriptive.
>
> [2] https://en.wikipedia.org/wiki/Memory_cell_(computing)

The definition of cell is from mm.c:

  The memory space is used as cells instead of bytes for simplicity. This
  is important for some CPUs which may not access multiple bytes at a time
  when the first byte is not aligned at a certain boundary (typically,
  4-byte or 8-byte). The size of each cell is equal to the size of struct
  grub_mm_header, so the header of each allocated/free block fits into one
  cell precisely. One cell is 16 bytes on 32-bit platforms and 32 bytes
  on 64-bit platforms.

I don't want to change variable names in this patch but I will update
the comment.

Kind regards,
Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/19] mm: assert that we preserve header vs region alignment
  2021-10-20 17:43   ` Daniel Kiper
@ 2022-03-23  5:47     ` Daniel Axtens
  2022-03-24 15:20       ` Daniel Kiper
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Axtens @ 2022-03-23  5:47 UTC (permalink / raw)
  To: Daniel Kiper; +Cc: grub-devel, leif, stefanb, ps


> s/struct grub_mm_region/grub_mm_region_t/
> s/struct grub_mm_header/grub_mm_header_t/

The problem is that grub_mm_{region,header}_t is a pointer type, not a
struct type. So sizeof (grub_mm_region_t) == sizeof(void *). You also
can't do sizeof (*grub_mm_region_t), because you can't dereference a
type. If there's a better way to express this I am open to it, but for
now I will stick with the struct type...

>
>> +		       sizeof(struct grub_mm_header)) == 0);
>
> I think this check is insufficient. The GRUB_MM_ALIGN should be checked
> somehow too. E.g. if sizeof(struct grub_mm_region) == 16,
> sizeof(struct grub_mm_header) == 8 and GRUB_MM_ALIGN == 32 then assert
> above will pass but the GRUB will blow up later. Of course numbers are not
> real but I chose them to show the problem. Hmmm... If we could choose
> grub_mm_region_t and grub_mm_header_t sizes properly then probably we
> could drop GRUB_MM_ALIGN...
>

I have added that sizeof(struct header) == GRUB_MM_ALIGN. GRUB_MM_ALIGN
is supposed to align you to a cell, and a header is supposed to be 1
cell.

We probably _could_ do away with the constant but I think that requires
a bit more close thought.

Kind regards,
Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/19] mm: assert that we preserve header vs region alignment
  2022-03-23  5:47     ` Daniel Axtens
@ 2022-03-24 15:20       ` Daniel Kiper
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Kiper @ 2022-03-24 15:20 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: grub-devel, leif, stefanb, ps

On Wed, Mar 23, 2022 at 04:47:51PM +1100, Daniel Axtens wrote:
> > s/struct grub_mm_region/grub_mm_region_t/
> > s/struct grub_mm_header/grub_mm_header_t/
>
> The problem is that grub_mm_{region,header}_t is a pointer type, not a
> struct type. So sizeof (grub_mm_region_t) == sizeof(void *). You also
> can't do sizeof (*grub_mm_region_t), because you can't dereference a
> type. If there's a better way to express this I am open to it, but for
> now I will stick with the struct type...

OK.

> >> +		       sizeof(struct grub_mm_header)) == 0);
> >
> > I think this check is insufficient. The GRUB_MM_ALIGN should be checked
> > somehow too. E.g. if sizeof(struct grub_mm_region) == 16,
> > sizeof(struct grub_mm_header) == 8 and GRUB_MM_ALIGN == 32 then assert
> > above will pass but the GRUB will blow up later. Of course numbers are not
> > real but I chose them to show the problem. Hmmm... If we could choose
> > grub_mm_region_t and grub_mm_header_t sizes properly then probably we
> > could drop GRUB_MM_ALIGN...
> >
>
> I have added that sizeof(struct header) == GRUB_MM_ALIGN. GRUB_MM_ALIGN
> is supposed to align you to a cell, and a header is supposed to be 1
> cell.

Cool!

> We probably _could_ do away with the constant but I think that requires
> a bit more close thought.

Yeah, I think additional GRUB_MM_ALIGN check is enough right now.
It does not make sense to spent more time on this and block the
other series any longer.

Thanks,

Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 11/19] efi: mm: Extract function to add memory regions
  2021-10-19 21:58   ` Glenn Washburn
@ 2022-03-25  3:30     ` Daniel Axtens
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Axtens @ 2022-03-25  3:30 UTC (permalink / raw)
  To: development; +Cc: The development of GNU GRUB, leif, stefanb, ps, dkiper

>>    /* Prepare a memory region to store two memory maps.  */
>>    memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (MEMORY_MAP_SIZE));
>>    if (! memory_map)
>> -    grub_fatal ("cannot allocate memory");
>> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");
>
> Something more descriptive would be nice, may be even "cannot allocate
> memory for memory map"

Yes.

>>  
>>    /* Obtain descriptors for available memory.  */
>>    map_size = MEMORY_MAP_SIZE;
>> @@ -588,14 +588,14 @@ grub_efi_mm_init (void)
>>  
>>        memory_map = grub_efi_allocate_any_pages (2 * BYTES_TO_PAGES (map_size));
>>        if (! memory_map)
>> -	grub_fatal ("cannot allocate memory");
>> +	return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot allocate memory");
>
> Ditto. And maybe nice to have it slightly different, perhaps "cannot
> re-allocate memory map"

Yes.

>>  
>>        mm_status = grub_efi_get_memory_map (&map_size, memory_map, 0,
>>  					   &desc_size, 0);
>>      }
>>  
>>    if (mm_status < 0)
>> -    grub_fatal ("cannot get memory map");
>> +    return grub_error (GRUB_ERR_OUT_OF_MEMORY, "cannot get memory map");
>
> What are the range of values that mm_status could be? Would it be
> useful to include the status code in the error message? IOW, could a
> failure here be affected by a configuration that the user could change
> to make this work? Also I like the message "failed to retrieve memory
> map (status=%d)".
>

The function says:
/* Get the memory map as defined in the EFI spec. Return 1 if successful,
   return 0 if partial, or return -1 if an error occurs.  */

So no, I don't think we could print anything more interesting without
reworking the grub_efi_get_memory_map function.

>>  
>>    memory_map_end = NEXT_MEMORY_DESCRIPTOR (memory_map, map_size);
>>  
>> @@ -610,7 +610,7 @@ grub_efi_mm_init (void)
>>  
>>    /* Allocate memory regions for GRUB's memory management.  */
>>    add_memory_regions (filtered_memory_map, desc_size,
>> -		      filtered_memory_map_end, BYTES_TO_PAGES (DEFAULT_HEAP_SIZE));
>> +		      filtered_memory_map_end, BYTES_TO_PAGES (required_bytes));
>>  
>>  #if 0
>>    /* For debug.  */
>
> What about turning this on when MM_DEBUG is on? Seems like it could be
> a useful (would been to get rid of/change the grub_fatal calls).
>

Perhaps, but I'm not sufficiently across EFI to be confident making that
sort of change, and I'm not well set up to test EFI systems or evaluate
the correctness or usefulness of any output. I think this would be
better deferred to another patch from someone with more experience in
the field.

Kind regards,
Daniel


^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2022-03-25  3:31 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-12  7:29 [PATCH 00/19] Requesting more memory from firmware Daniel Axtens
2021-10-12  7:29 ` [PATCH 01/19] grub-shell: Boot PowerPC using PMU instead of CUDA for power management Daniel Axtens
2021-10-12 17:11   ` Glenn Washburn
2021-10-20 15:39     ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 02/19] grub-shell: pseries: don't pass fw_opt to qemu Daniel Axtens
2021-10-12 18:57   ` Glenn Washburn
2021-10-20 15:43     ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 03/19] mm: document grub internal memory management structures Daniel Axtens
2021-10-12 19:18   ` Glenn Washburn
2021-11-24  0:57     ` Daniel Axtens
2021-10-12  7:29 ` [PATCH 04/19] mm: assert that we preserve header vs region alignment Daniel Axtens
2021-10-20 17:43   ` Daniel Kiper
2022-03-23  5:47     ` Daniel Axtens
2022-03-24 15:20       ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 05/19] mm: when adding a region, merge with region after as well as before Daniel Axtens
2021-11-04 16:36   ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 06/19] configure: properly pass through MM_DEBUG Daniel Axtens
2021-11-04 18:00   ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 07/19] Add memtool module with memory allocation stress-test Daniel Axtens
2021-10-19 19:47   ` Glenn Washburn
2021-11-09 12:54   ` Daniel Kiper
2021-11-10 22:29     ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 08/19] mm: Drop unused unloading of modules on OOM Daniel Axtens
2021-11-09 13:15   ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 09/19] mm: Allow dynamically requesting additional memory regions Daniel Axtens
2021-11-09 13:25   ` Daniel Kiper
2021-10-12  7:29 ` [PATCH 10/19] efi: mm: Always request a fixed number of pages on init Daniel Axtens
2021-11-09 13:32   ` Daniel Kiper
2021-10-12  7:30 ` [PATCH 11/19] efi: mm: Extract function to add memory regions Daniel Axtens
2021-10-19 21:39   ` Glenn Washburn
2021-10-19 21:58   ` Glenn Washburn
2022-03-25  3:30     ` Daniel Axtens
2021-11-09 13:38   ` Daniel Kiper
2021-10-12  7:30 ` [PATCH 12/19] efi: mm: Pass up errors from `add_memory_regions ()` Daniel Axtens
2021-10-19 21:37   ` Glenn Washburn
2021-11-09 13:56     ` Daniel Kiper
2021-11-09 16:10   ` Daniel Kiper
2021-10-12  7:30 ` [PATCH 13/19] efi: mm: Implement runtime addition of pages Daniel Axtens
2021-11-09 16:13   ` Daniel Kiper
2021-10-12  7:30 ` [PATCH 14/19] ieee1275: request memory with ibm, client-architecture-support Daniel Axtens
2021-10-12  7:30 ` [PATCH 15/19] ieee1275: drop len -= 1 quirk in heap_init Daniel Axtens
2021-10-12  7:30 ` [PATCH 16/19] ieee1275: support runtime memory claiming Daniel Axtens
2021-10-12  7:30 ` [PATCH 17/19] [not for merge] print more debug info in mm Daniel Axtens
2021-11-10 13:47   ` Daniel Kiper
2021-11-10 19:35     ` Glenn Washburn
2021-11-10 22:17       ` Daniel Kiper
2021-10-12  7:30 ` [PATCH 18/19] [not for merge] ieee1275 debugging info Daniel Axtens
2021-10-12  7:30 ` [PATCH 19/19] RFC: Ignore REGION_CONSECUTIVE Daniel Axtens
2021-11-10 14:00   ` Daniel Kiper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.