* [PATCH v1 00/18] Hyperlaunch
@ 2022-07-06 21:04 Daniel P. Smith
  2022-07-06 21:04 ` [PATCH v1 01/18] kconfig: allow configuration of maximum modules Daniel P. Smith
                   ` (18 more replies)
  0 siblings, 19 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel P. Smith, scott.davis, christopher.clark

The work submitted in this series was made possible by the generous funding of
Star Lab Corporation, to whom great thanks are due.

The patch series is based on the existing xsm series for starting the idle
domain privileged. The first four commits were previously submitted as an RFC
and are now joined by an additional commit that refactors Xen command line
handling. The remaining preliminary patches are FDT refactoring and a doc
update. From that point on, the series begins to morph the x86 arch to support
building multiple domains at boot.

This series has been fairly well tested using qemu with a multiboot1 bootloader
and under EFI + GRUB multiboot2 boot. While there are likely some rough spots
remaining, the series is now at a point where it should be reviewed, exercised,
and tested for consideration into the tree.

Information, including documentation, meeting minutes, presentations, and past
series postings, can be found on the Xen wiki:

https://wiki.xenproject.org/wiki/Hyperlaunch

Daniel P. Smith (18):
  kconfig: allow configuration of maximum modules
  introduction of generalized boot info
  x86: adopt new boot info structures
  x86: refactor entrypoints to new boot info
  x86: refactor xen cmdline into general framework
  fdt: make fdt handling reusable across arch
  docs: update hyperlaunch device tree documentation
  kconfig: introduce domain builder config option
  x86: introduce abstractions for domain builder
  x86: introduce the domain builder
  x86: initial conversion to domain builder
  x86: convert dom0 creation to domain builder
  x86: generalize physmap logic
  x86: generalize vcpu for domain building
  x86: rework domain page allocation
  x86: add pv multidomain construction
  builder: introduce domain builder hypfs tree
  tools: introduce example late pv helper

 .gitignore                                    |   1 +
 .../designs/launch/hyperlaunch-devicetree.rst | 497 +++++++++++-------
 tools/helpers/Makefile                        |  11 +
 tools/helpers/builder-hypfs.c                 | 253 +++++++++
 tools/helpers/hypfs-helpers.h                 |   9 +
 tools/helpers/late-init-pv.c                  | 287 ++++++++++
 tools/helpers/late-init-pv.h                  |  29 +
 tools/helpers/xs-helpers.c                    | 117 +++++
 tools/helpers/xs-helpers.h                    |  27 +
 xen/arch/Kconfig                              |  12 +
 xen/arch/arm/bootfdt.c                        | 115 +---
 xen/arch/arm/include/asm/setup.h              |   5 +-
 xen/arch/x86/Makefile                         |   1 +
 xen/arch/x86/boot/boot_info32.h               |  97 ++++
 xen/arch/x86/boot/defs.h                      |  17 +-
 xen/arch/x86/boot/reloc.c                     | 187 +++++--
 xen/arch/x86/bzimage.c                        |  18 +-
 xen/arch/x86/cpu/microcode/core.c             | 133 +++--
 xen/arch/x86/dom0_build.c                     | 129 +----
 xen/arch/x86/domain_builder.c                 | 284 ++++++++++
 xen/arch/x86/efi/efi-boot.h                   |  96 ++--
 xen/arch/x86/guest/xen/pvh-boot.c             |  64 ++-
 xen/arch/x86/hvm/dom0_build.c                 |  62 +--
 xen/arch/x86/include/asm/bootdomain.h         |  30 ++
 xen/arch/x86/include/asm/bootinfo.h           |  99 ++++
 xen/arch/x86/include/asm/bzimage.h            |   5 +-
 xen/arch/x86/include/asm/dom0_build.h         |  27 +-
 xen/arch/x86/include/asm/guest/pvh-boot.h     |   6 +-
 xen/arch/x86/include/asm/setup.h              |  18 +-
 xen/arch/x86/pv/Makefile                      |   2 +-
 .../x86/pv/{dom0_build.c => domain_builder.c} | 141 ++---
 xen/arch/x86/pv/shim.c                        |   4 +-
 xen/arch/x86/setup.c                          | 392 ++++++--------
 xen/common/Kconfig                            |   5 +
 xen/common/Makefile                           |   4 +-
 xen/common/domain-builder/Kconfig             |  36 ++
 xen/common/domain-builder/Makefile            |   3 +
 xen/common/domain-builder/core.c              | 207 ++++++++
 xen/common/domain-builder/fdt.c               | 295 +++++++++++
 xen/common/domain-builder/fdt.h               |   7 +
 xen/common/domain-builder/hypfs.c             | 193 +++++++
 xen/common/efi/boot.c                         |   4 +-
 xen/common/fdt.c                              | 131 +++++
 xen/common/sched/core.c                       |  25 +-
 xen/include/xen/bootdomain.h                  |  58 ++
 xen/include/xen/bootinfo.h                    | 132 +++++
 xen/include/xen/device_tree.h                 |  50 +-
 xen/include/xen/domain_builder.h              |  88 ++++
 xen/include/xen/fdt.h                         |  79 +++
 xen/include/xen/sched.h                       |   3 +-
 xen/include/xsm/xsm.h                         |  26 +-
 xen/xsm/xsm_core.c                            |  43 +-
 xen/xsm/xsm_policy.c                          |  56 +-
 53 files changed, 3544 insertions(+), 1076 deletions(-)
 create mode 100644 tools/helpers/builder-hypfs.c
 create mode 100644 tools/helpers/hypfs-helpers.h
 create mode 100644 tools/helpers/late-init-pv.c
 create mode 100644 tools/helpers/late-init-pv.h
 create mode 100644 tools/helpers/xs-helpers.c
 create mode 100644 tools/helpers/xs-helpers.h
 create mode 100644 xen/arch/x86/boot/boot_info32.h
 create mode 100644 xen/arch/x86/domain_builder.c
 create mode 100644 xen/arch/x86/include/asm/bootdomain.h
 create mode 100644 xen/arch/x86/include/asm/bootinfo.h
 rename xen/arch/x86/pv/{dom0_build.c => domain_builder.c} (88%)
 create mode 100644 xen/common/domain-builder/Kconfig
 create mode 100644 xen/common/domain-builder/Makefile
 create mode 100644 xen/common/domain-builder/core.c
 create mode 100644 xen/common/domain-builder/fdt.c
 create mode 100644 xen/common/domain-builder/fdt.h
 create mode 100644 xen/common/domain-builder/hypfs.c
 create mode 100644 xen/common/fdt.c
 create mode 100644 xen/include/xen/bootdomain.h
 create mode 100644 xen/include/xen/bootinfo.h
 create mode 100644 xen/include/xen/domain_builder.h
 create mode 100644 xen/include/xen/fdt.h

-- 
2.20.1




* [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-07  1:44   ` Henry Wang
                     ` (2 more replies)
  2022-07-06 21:04 ` [PATCH v1 02/18] introduction of generalized boot info Daniel P. Smith
                   ` (17 subsequent siblings)
  18 siblings, 3 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Volodymyr Babchuk, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Roger Pau Monné

For x86 the number of allowable multiboot modules varies between the different
entry points: non-EFI boot, PVH boot, and EFI boot. In the case of both Arm and
x86 this value is fixed to values based on generalized assumptions. With
hyperlaunch for x86 and dom0less on Arm, use of static sizes results in large
allocations compiled into the hypervisor that will go unused by many use cases.

This commit introduces a Kconfig variable that is set with sane defaults based
on configuration selection. This variable is in turn used as the array size
wherever a statically allocated array of boot modules is declared.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
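For readers skimming the diff, the pattern this patch applies can be sketched
outside the tree as below. This is a minimal illustration, not code from the
series: struct module, modules[], clamp_mods_count() and module_slot() are
hypothetical names, and CONFIG_NR_BOOTMODS is hard-coded to the x86 default
here, whereas a real build would take it from Kconfig.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a real build gets this from Kconfig; 8 mirrors the x86 default. */
#ifndef CONFIG_NR_BOOTMODS
#define CONFIG_NR_BOOTMODS 8
#endif

/* Hypothetical stand-in for the multiboot module_t. */
struct module {
    unsigned long start;
    unsigned long end;
};

/* One spare slot, mirroring the "+ 1" used by the EFI and PVH arrays. */
static struct module modules[CONFIG_NR_BOOTMODS + 1];

#define NR_MODULE_SLOTS (sizeof(modules) / sizeof(modules[0]))

static const unsigned int max_mods = CONFIG_NR_BOOTMODS;

/* Clamp a loader-reported module count to the configured maximum. */
static unsigned int clamp_mods_count(unsigned int reported)
{
    return reported > max_mods ? max_mods : reported;
}

/* Bounds-checked access to a slot; the spare slot is addressable too. */
static struct module *module_slot(unsigned int idx)
{
    assert(idx < NR_MODULE_SLOTS);
    return &modules[idx];
}
```

The point of the sketch is that a single build-time constant drives both the
array size and the runtime clamp, so the two cannot drift apart the way the
previous per-entry-point literals (5, 8, 32) could.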
 xen/arch/Kconfig                  | 12 ++++++++++++
 xen/arch/arm/include/asm/setup.h  |  5 +++--
 xen/arch/x86/efi/efi-boot.h       |  2 +-
 xen/arch/x86/guest/xen/pvh-boot.c |  2 +-
 xen/arch/x86/setup.c              |  4 ++--
 5 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index f16eb0df43..24139057be 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -17,3 +17,15 @@ config NR_CPUS
 	  For CPU cores which support Simultaneous Multi-Threading or similar
 	  technologies, this the number of logical threads which Xen will
 	  support.
+
+config NR_BOOTMODS
+	int "Maximum number of boot modules that a loader can pass"
+	range 1 32768
+	default "8" if X86
+	default "32" if ARM
+	help
+	  Controls the build-time size of various arrays allocated for
+	  parsing the boot modules passed by a loader when starting Xen.
+
+	  This is of particular interest when using Xen's hypervisor domain
+	  capabilities such as dom0less.
diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
index 2bb01ecfa8..312a3e4209 100644
--- a/xen/arch/arm/include/asm/setup.h
+++ b/xen/arch/arm/include/asm/setup.h
@@ -10,7 +10,8 @@
 
 #define NR_MEM_BANKS 256
 
-#define MAX_MODULES 32 /* Current maximum useful modules */
+/* Current maximum useful modules */
+#define MAX_MODULES CONFIG_NR_BOOTMODS
 
 typedef enum {
     BOOTMOD_XEN,
@@ -38,7 +39,7 @@ struct meminfo {
  * The domU flag is set for kernels and ramdisks of "xen,domain" nodes.
  * The purpose of the domU flag is to avoid getting confused in
  * kernel_probe, where we try to guess which is the dom0 kernel and
- * initrd to be compatible with all versions of the multiboot spec. 
+ * initrd to be compatible with all versions of the multiboot spec.
  */
 #define BOOTMOD_MAX_CMDLINE 1024
 struct bootmodule {
diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
index 6e65b569b0..4e1a799749 100644
--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
  * The array size needs to be one larger than the number of modules we
  * support - see __start_xen().
  */
-static module_t __initdata mb_modules[5];
+static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
 
 static void __init edd_put_string(u8 *dst, size_t n, const char *src)
 {
diff --git a/xen/arch/x86/guest/xen/pvh-boot.c b/xen/arch/x86/guest/xen/pvh-boot.c
index 498625eae0..834b1ad16b 100644
--- a/xen/arch/x86/guest/xen/pvh-boot.c
+++ b/xen/arch/x86/guest/xen/pvh-boot.c
@@ -32,7 +32,7 @@ bool __initdata pvh_boot;
 uint32_t __initdata pvh_start_info_pa;
 
 static multiboot_info_t __initdata pvh_mbi;
-static module_t __initdata pvh_mbi_mods[8];
+static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMODS + 1];
 static const char *__initdata pvh_loader = "PVH Directboot";
 
 static void __init convert_pvh_info(multiboot_info_t **mbi,
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index f08b07b8de..2aa1e28c8f 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1020,9 +1020,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         panic("dom0 kernel not specified. Check bootloader configuration\n");
 
     /* Check that we don't have a silly number of modules. */
-    if ( mbi->mods_count > sizeof(module_map) * 8 )
+    if ( mbi->mods_count > CONFIG_NR_BOOTMODS )
     {
-        mbi->mods_count = sizeof(module_map) * 8;
+        mbi->mods_count = CONFIG_NR_BOOTMODS;
         printk("Excessive multiboot modules - using the first %u only\n",
                mbi->mods_count);
     }
-- 
2.20.1




* [PATCH v1 02/18] introduction of generalized boot info
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
  2022-07-06 21:04 ` [PATCH v1 01/18] kconfig: allow configuration of maximum modules Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-15 19:25   ` Julien Grall
  2022-07-19 13:11   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 03/18] x86: adopt new boot info structures Daniel P. Smith
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini

The x86 and Arm architectures represent the general boot information and boot
modules differently in memory, despite what they have in common. The x86
representations are bound to the multiboot v1 structures while the Arm
representations are a slightly generalized meta-data container for the boot
material. The multiboot structure does not lend itself well to being expanded
to accommodate additional metadata, both general and boot module specific. The
Arm structures are not bound to an external specification and thus are able to
be expanded for solutions such as dom0less.

This commit introduces a set of structures patterned off the Arm structures to
represent the boot information in a manner that captures common data. The
structures provide an arch field to allow arch specific expansions to the
structures. The intended goal of these new common structures is to enable
commonality between the different architectures, specifically to allow
dom0less and hyperlaunch to share a common representation of boot-time
constructed domains.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
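As a reading aid, the split between a common structure and its arch extension
can be sketched in isolation as below. This is a simplified illustration under
stated assumptions, not the headers from this patch: the types are trimmed to
the fields used, paddr_t is stood in by uint64_t, the __packed attribute is
dropped, the flag macro is parenthesised (the patch defines it without
parentheses), and bootmodule_image_start() and bootmodule_relocated() are
hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: simplified stand-in for Xen's paddr_t. */
typedef uint64_t paddr_t;

/* Parenthesised here so the flag is safe in any expression context. */
#define BOOTMOD_FLAG_X86_RELOCATED (1U << 0)

typedef enum {
    BOOTMOD_UNKNOWN,
    BOOTMOD_KERNEL,
    BOOTMOD_RAMDISK,
} bootmodule_kind;

/* Arch-specific extension, reached through the common structure. */
struct arch_bootmodule {
    uint32_t flags;
    uint32_t headroom;
};

/* Common part, shared by every architecture. */
struct boot_module {
    bootmodule_kind kind;
    paddr_t start;
    size_t size;
    struct arch_bootmodule *arch;
};

/* Start of the image proper: module start plus the arch-recorded headroom. */
static paddr_t bootmodule_image_start(const struct boot_module *mod)
{
    assert(mod != NULL && mod->arch != NULL);
    return mod->start + mod->arch->headroom;
}

/* True when the arch extension marks the module as relocated. */
static bool bootmodule_relocated(const struct boot_module *mod)
{
    assert(mod != NULL && mod->arch != NULL);
    return (mod->arch->flags & BOOTMOD_FLAG_X86_RELOCATED) != 0;
}
```

Common code only ever touches kind, start and size, while anything x86-only is
reached through the arch pointer. That is what should allow Arm to supply its
own struct arch_bootmodule without changing the common definition.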
 xen/arch/x86/include/asm/bootinfo.h | 48 +++++++++++++++++++++++++
 xen/include/xen/bootinfo.h          | 54 +++++++++++++++++++++++++++++
 2 files changed, 102 insertions(+)
 create mode 100644 xen/arch/x86/include/asm/bootinfo.h
 create mode 100644 xen/include/xen/bootinfo.h

diff --git a/xen/arch/x86/include/asm/bootinfo.h b/xen/arch/x86/include/asm/bootinfo.h
new file mode 100644
index 0000000000..b0754a3ed0
--- /dev/null
+++ b/xen/arch/x86/include/asm/bootinfo.h
@@ -0,0 +1,48 @@
+#ifndef __ARCH_X86_BOOTINFO_H__
+#define __ARCH_X86_BOOTINFO_H__
+
+/* unused for x86 */
+struct arch_bootstring { };
+
+struct __packed arch_bootmodule {
+#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0
+    uint32_t flags;
+    uint32_t headroom;
+};
+
+struct __packed arch_boot_info {
+    uint32_t flags;
+#define BOOTINFO_FLAG_X86_MEMLIMITS  	1U << 0
+#define BOOTINFO_FLAG_X86_BOOTDEV    	1U << 1
+#define BOOTINFO_FLAG_X86_CMDLINE    	1U << 2
+#define BOOTINFO_FLAG_X86_MODULES    	1U << 3
+#define BOOTINFO_FLAG_X86_AOUT_SYMS  	1U << 4
+#define BOOTINFO_FLAG_X86_ELF_SYMS   	1U << 5
+#define BOOTINFO_FLAG_X86_MEMMAP     	1U << 6
+#define BOOTINFO_FLAG_X86_DRIVES     	1U << 7
+#define BOOTINFO_FLAG_X86_BIOSCONFIG 	1U << 8
+#define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
+#define BOOTINFO_FLAG_X86_APM        	1U << 10
+
+    bool xen_guest;
+
+    char *boot_loader_name;
+    char *kextra;
+
+    uint32_t mem_lower;
+    uint32_t mem_upper;
+
+    uint32_t mmap_length;
+    paddr_t mmap_addr;
+};
+
+struct __packed mb_memmap {
+    uint32_t size;
+    uint32_t base_addr_low;
+    uint32_t base_addr_high;
+    uint32_t length_low;
+    uint32_t length_high;
+    uint32_t type;
+};
+
+#endif
diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
new file mode 100644
index 0000000000..42b53a3ca6
--- /dev/null
+++ b/xen/include/xen/bootinfo.h
@@ -0,0 +1,54 @@
+#ifndef __XEN_BOOTINFO_H__
+#define __XEN_BOOTINFO_H__
+
+#include <xen/mm.h>
+#include <xen/types.h>
+
+#include <asm/bootinfo.h>
+
+typedef enum {
+    BOOTMOD_UNKNOWN,
+    BOOTMOD_XEN,
+    BOOTMOD_FDT,
+    BOOTMOD_KERNEL,
+    BOOTMOD_RAMDISK,
+    BOOTMOD_XSM,
+    BOOTMOD_UCODE,
+    BOOTMOD_GUEST_DTB,
+}  bootmodule_kind;
+
+typedef enum {
+    BOOTSTR_EMPTY,
+    BOOTSTR_STRING,
+    BOOTSTR_CMDLINE,
+} bootstring_kind;
+
+#define BOOTMOD_MAX_STRING 1024
+struct __packed boot_string {
+    bootstring_kind kind;
+    struct arch_bootstring *arch;
+
+    char bytes[BOOTMOD_MAX_STRING];
+    size_t len;
+};
+
+struct __packed boot_module {
+    bootmodule_kind kind;
+    paddr_t start;
+    mfn_t mfn;
+    size_t size;
+
+    struct arch_bootmodule *arch;
+    struct boot_string string;
+};
+
+struct __packed boot_info {
+    char *cmdline;
+
+    uint32_t nr_mods;
+    struct boot_module *mods;
+
+    struct arch_boot_info *arch;
+};
+
+#endif
-- 
2.20.1




* [PATCH v1 03/18] x86: adopt new boot info structures
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
  2022-07-06 21:04 ` [PATCH v1 01/18] kconfig: allow configuration of maximum modules Daniel P. Smith
  2022-07-06 21:04 ` [PATCH v1 02/18] introduction of generalized boot info Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-19 13:19   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 04/18] x86: refactor entrypoints to new boot info Daniel P. Smith
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu, Daniel P. Smith
  Cc: scott.davis, christopher.clark, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, Daniel De Graaf

This commit replaces the use of the multiboot v1 structures starting
at __start_xen(). The majority of this commit is converting the fields
being accessed for the startup calculations. While adapting the ucode
boot module location logic, this code was refactored to reduce some
of the unnecessary complexity.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
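The reworked microcode scan below iterates over modules by kind instead of
consulting the module_map bitmap. The idiom can be sketched standalone as
follows. Everything here is an assumption made for illustration: the types are
trimmed, next_idx_by_kind() only guesses at the semantics of
bootmodule_next_idx_by_kind() from how the scan loop uses it (returning
nr_mods when nothing is left), and nonempty() is a hypothetical predicate
standing in for microcode_check_module().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    BOOTMOD_UNKNOWN,
    BOOTMOD_KERNEL,
    BOOTMOD_RAMDISK,
    BOOTMOD_UCODE,
} bootmodule_kind;

struct boot_module {
    bootmodule_kind kind;
    size_t size;
};

struct boot_info {
    uint32_t nr_mods;
    struct boot_module *mods;
};

/*
 * Assumed semantics: index of the first module of the given kind at or
 * after start, or bi->nr_mods when there is none.
 */
static uint32_t next_idx_by_kind(const struct boot_info *bi,
                                 bootmodule_kind kind, uint32_t start)
{
    uint32_t i;

    assert(bi != NULL);
    for ( i = start; i < bi->nr_mods; i++ )
        if ( bi->mods[i].kind == kind )
            return i;

    return bi->nr_mods;
}

/* Hypothetical predicate: accept any module that carries data. */
static int nonempty(const struct boot_module *mod)
{
    return mod->size != 0;
}

/* Claim the first unidentified module the predicate accepts, or return -1. */
static int claim_first_match(struct boot_info *bi,
                             int (*is_match)(const struct boot_module *),
                             bootmodule_kind claimed)
{
    uint32_t idx = next_idx_by_kind(bi, BOOTMOD_UNKNOWN, 0);

    while ( idx < bi->nr_mods )
    {
        if ( is_match(&bi->mods[idx]) )
        {
            bi->mods[idx].kind = claimed;
            return (int)idx;
        }
        idx = next_idx_by_kind(bi, BOOTMOD_UNKNOWN, idx + 1);
    }

    return -1;
}
```

Marking a module by rewriting its kind takes over the role of clearing its bit
in module_map: once claimed, the module no longer shows up in later scans for
BOOTMOD_UNKNOWN.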
 xen/arch/x86/bzimage.c                |  18 +-
 xen/arch/x86/cpu/microcode/core.c     | 133 ++++++++------
 xen/arch/x86/dom0_build.c             |  11 +-
 xen/arch/x86/hvm/dom0_build.c         |  42 ++---
 xen/arch/x86/include/asm/bootinfo.h   |   2 +-
 xen/arch/x86/include/asm/bzimage.h    |   5 +-
 xen/arch/x86/include/asm/dom0_build.h |  15 +-
 xen/arch/x86/include/asm/setup.h      |  14 +-
 xen/arch/x86/pv/dom0_build.c          |  34 ++--
 xen/arch/x86/setup.c                  | 245 +++++++++++++++-----------
 xen/include/xen/bootinfo.h            |  47 +++++
 xen/include/xsm/xsm.h                 |  26 ++-
 xen/xsm/xsm_core.c                    |  43 +++--
 xen/xsm/xsm_policy.c                  |  56 +++---
 14 files changed, 413 insertions(+), 278 deletions(-)

diff --git a/xen/arch/x86/bzimage.c b/xen/arch/x86/bzimage.c
index ac4fd428be..03cb372957 100644
--- a/xen/arch/x86/bzimage.c
+++ b/xen/arch/x86/bzimage.c
@@ -69,10 +69,8 @@ static __init int bzimage_check(struct setup_header *hdr, unsigned long len)
     return 1;
 }
 
-static unsigned long __initdata orig_image_len;
-
-unsigned long __init bzimage_headroom(void *image_start,
-                                      unsigned long image_length)
+unsigned long __init bzimage_headroom(
+    void *image_start, unsigned long image_length)
 {
     struct setup_header *hdr = (struct setup_header *)image_start;
     int err;
@@ -91,7 +89,6 @@ unsigned long __init bzimage_headroom(void *image_start,
     if ( elf_is_elfbinary(image_start, image_length) )
         return 0;
 
-    orig_image_len = image_length;
     headroom = output_length(image_start, image_length);
     if (gzip_check(image_start, image_length))
     {
@@ -104,12 +101,15 @@ unsigned long __init bzimage_headroom(void *image_start,
     return headroom;
 }
 
-int __init bzimage_parse(void *image_base, void **image_start,
-                         unsigned long *image_len)
+int __init bzimage_parse(
+    void *image_base, void **image_start, unsigned int headroom,
+    unsigned long *image_len)
 {
     struct setup_header *hdr = (struct setup_header *)(*image_start);
     int err = bzimage_check(hdr, *image_len);
-    unsigned long output_len;
+    unsigned long output_len, orig_image_len;
+
+    orig_image_len = *image_len - headroom;
 
     if ( err < 0 )
         return err;
@@ -125,7 +125,7 @@ int __init bzimage_parse(void *image_base, void **image_start,
 
     BUG_ON(!(image_base < *image_start));
 
-    output_len = output_length(*image_start, orig_image_len);
+    output_len = output_length(*image_start, *image_len);
 
     if ( (err = perform_gunzip(image_base, *image_start, orig_image_len)) > 0 )
         err = decompress(*image_start, orig_image_len, image_base);
diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c
index 452a7ca773..bfdba85796 100644
--- a/xen/arch/x86/cpu/microcode/core.c
+++ b/xen/arch/x86/cpu/microcode/core.c
@@ -22,6 +22,7 @@
  */
 
 #include <xen/alternative-call.h>
+#include <xen/bootinfo.h>
 #include <xen/cpu.h>
 #include <xen/earlycpio.h>
 #include <xen/err.h>
@@ -54,7 +55,6 @@
  */
 #define MICROCODE_UPDATE_TIMEOUT_US 1000000
 
-static module_t __initdata ucode_mod;
 static signed int __initdata ucode_mod_idx;
 static bool_t __initdata ucode_mod_forced;
 static unsigned int nr_cores;
@@ -147,74 +147,113 @@ static int __init cf_check parse_ucode(const char *s)
 }
 custom_param("ucode", parse_ucode);
 
-void __init microcode_scan_module(
-    unsigned long *module_map,
-    const multiboot_info_t *mbi)
+#define MICROCODE_MODULE_MATCH 1
+#define MICROCODE_MODULE_NONMATCH 0
+
+static int __init microcode_check_module(struct boot_module *mod)
 {
-    module_t *mod = (module_t *)__va(mbi->mods_addr);
     uint64_t *_blob_start;
     unsigned long _blob_size;
-    struct cpio_data cd;
+    struct cpio_data cd = { NULL, 0 };
     long offset;
     const char *p = NULL;
-    int i;
-
-    ucode_blob.size = 0;
-    if ( !ucode_scan )
-        return;
 
     if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
         p = "kernel/x86/microcode/AuthenticAMD.bin";
     else if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
         p = "kernel/x86/microcode/GenuineIntel.bin";
     else
+        return -EFAULT;
+
+    _blob_start = bootstrap_map(mod);
+    _blob_size = mod->size;
+    if ( !_blob_start )
+    {
+        printk("Could not map multiboot module (0x%lx) (size: %ld)\n",
+               mod->start, _blob_size);
+        /* Non-fatal error, so just say no match */
+        return MICROCODE_MODULE_NONMATCH;
+    }
+
+    cd = find_cpio_data(p, _blob_start, _blob_size, &offset /* ignore */);
+
+    if ( cd.data )
+    {
+        ucode_blob.size = cd.size;
+        ucode_blob.data = cd.data;
+
+        mod->kind = BOOTMOD_UCODE;
+        return MICROCODE_MODULE_MATCH;
+    }
+
+    bootstrap_map(NULL);
+
+    return MICROCODE_MODULE_NONMATCH;
+}
+
+void __init microcode_scan_module(struct boot_info *bi)
+{
+    int idx = 0;
+
+    if ( !ucode_scan )
         return;
 
     /*
-     * Try all modules and see whichever could be the microcode blob.
+     * Try unidentified modules and see which could be the microcode blob.
      */
-    for ( i = 1 /* Ignore dom0 kernel */; i < mbi->mods_count; i++ )
+    idx = bootmodule_next_idx_by_kind(bi, BOOTMOD_UNKNOWN, idx);
+    while ( idx < bi->nr_mods )
     {
-        if ( !test_bit(i, module_map) )
-            continue;
+        int ret;
 
-        _blob_start = bootstrap_map(&mod[i]);
-        _blob_size = mod[i].mod_end;
-        if ( !_blob_start )
+        ret = microcode_check_module(&bi->mods[idx]);
+        switch ( ret )
         {
-            printk("Could not map multiboot module #%d (size: %ld)\n",
-                   i, _blob_size);
+        case MICROCODE_MODULE_MATCH:
+            return;
+        case MICROCODE_MODULE_NONMATCH:
+            idx = bootmodule_next_idx_by_kind(bi, BOOTMOD_UNKNOWN, idx + 1);
             continue;
+        default:
+            printk("%s: (err: %d) unable to check microcode\n",
+                   __func__, ret);
+            return;
         }
-        cd.data = NULL;
-        cd.size = 0;
-        cd = find_cpio_data(p, _blob_start, _blob_size, &offset /* ignore */);
-        if ( cd.data )
-        {
-            ucode_blob.size = cd.size;
-            ucode_blob.data = cd.data;
-            break;
-        }
-        bootstrap_map(NULL);
     }
 }
-void __init microcode_grab_module(
-    unsigned long *module_map,
-    const multiboot_info_t *mbi)
+
+void __init microcode_grab_module(struct boot_info *bi)
 {
-    module_t *mod = (module_t *)__va(mbi->mods_addr);
+    ucode_blob.size = 0;
 
     if ( ucode_mod_idx < 0 )
-        ucode_mod_idx += mbi->mods_count;
-    if ( ucode_mod_idx <= 0 || ucode_mod_idx >= mbi->mods_count ||
-         !__test_and_clear_bit(ucode_mod_idx, module_map) )
-        goto scan;
-    ucode_mod = mod[ucode_mod_idx];
-scan:
+        ucode_mod_idx += bi->nr_mods;
+    if ( ucode_mod_idx >= 0 && ucode_mod_idx < bi->nr_mods &&
+         bi->mods[ucode_mod_idx].kind == BOOTMOD_UNKNOWN )
+    {
+        int ret = microcode_check_module(&bi->mods[ucode_mod_idx]);
+
+        switch ( ret )
+        {
+        case MICROCODE_MODULE_MATCH:
+            return;
+        case MICROCODE_MODULE_NONMATCH:
+            break;
+        default:
+            printk("%s: (err: %d) unable to check microcode\n",
+                   __func__, ret);
+            return;
+        }
+    }
+
     if ( ucode_scan )
-        microcode_scan_module(module_map, mbi);
+        microcode_scan_module(bi);
 }
 
+/* Undefining as they are not needed anymore */
+#undef MICROCODE_MODULE_MATCH
+#undef MICROCODE_MODULE_NONMATCH
+
 static struct microcode_ops __ro_after_init ucode_ops;
 
 static DEFINE_SPINLOCK(microcode_mutex);
@@ -711,11 +750,6 @@ static int __init cf_check microcode_init(void)
         ucode_blob.size = 0;
         ucode_blob.data = NULL;
     }
-    else if ( ucode_mod.mod_end )
-    {
-        bootstrap_map(NULL);
-        ucode_mod.mod_end = 0;
-    }
 
     return 0;
 }
@@ -745,11 +779,6 @@ static int __init early_microcode_update_cpu(void)
         len = ucode_blob.size;
         data = ucode_blob.data;
     }
-    else if ( ucode_mod.mod_end )
-    {
-        len = ucode_mod.mod_end;
-        data = bootstrap_map(&ucode_mod);
-    }
 
     if ( !data )
         return -ENOMEM;
@@ -799,7 +828,7 @@ int __init early_microcode_init(void)
 
     alternative_vcall(ucode_ops.collect_cpu_info);
 
-    if ( ucode_mod.mod_end || ucode_blob.size )
+    if ( ucode_blob.size )
         rc = early_microcode_update_cpu();
 
     return rc;
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 79234f18ff..9ca5a99510 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -4,6 +4,7 @@
  * Copyright (c) 2002-2005, K A Fraser
  */
 
+#include <xen/bootinfo.h>
 #include <xen/init.h>
 #include <xen/iocap.h>
 #include <xen/libelf.h>
@@ -574,9 +575,9 @@ int __init dom0_setup_permissions(struct domain *d)
     return rc;
 }
 
-int __init construct_dom0(struct domain *d, const module_t *image,
-                          unsigned long image_headroom, module_t *initrd,
-                          char *cmdline)
+int __init construct_dom0(
+    struct domain *d, const struct boot_module *image,
+    struct boot_module *initrd, char *cmdline)
 {
     int rc;
 
@@ -588,9 +589,9 @@ int __init construct_dom0(struct domain *d, const module_t *image,
     process_pending_softirqs();
 
     if ( is_hvm_domain(d) )
-        rc = dom0_construct_pvh(d, image, image_headroom, initrd, cmdline);
+        rc = dom0_construct_pvh(d, image, initrd, cmdline);
     else if ( is_pv_domain(d) )
-        rc = dom0_construct_pv(d, image, image_headroom, initrd, cmdline);
+        rc = dom0_construct_pv(d, image, initrd, cmdline);
     else
         panic("Cannot construct Dom0. No guest interface available\n");
 
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 1864d048a1..4e903a848d 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -19,9 +19,9 @@
  */
 
 #include <xen/acpi.h>
+#include <xen/bootinfo.h>
 #include <xen/init.h>
 #include <xen/libelf.h>
-#include <xen/multiboot.h>
 #include <xen/pci.h>
 #include <xen/softirq.h>
 
@@ -543,14 +543,13 @@ static paddr_t __init find_memory(
     return INVALID_PADDR;
 }
 
-static int __init pvh_load_kernel(struct domain *d, const module_t *image,
-                                  unsigned long image_headroom,
-                                  module_t *initrd, void *image_base,
-                                  char *cmdline, paddr_t *entry,
-                                  paddr_t *start_info_addr)
+static int __init pvh_load_kernel(
+    struct domain *d, const struct boot_module *image,
+    struct boot_module *initrd, void *image_base, char *cmdline,
+    paddr_t *entry, paddr_t *start_info_addr)
 {
-    void *image_start = image_base + image_headroom;
-    unsigned long image_len = image->mod_end;
+    void *image_start = image_base + image->arch->headroom;
+    unsigned long image_len = image->size;
     struct elf_binary elf;
     struct elf_dom_parms parms;
     paddr_t last_addr;
@@ -559,7 +558,9 @@ static int __init pvh_load_kernel(struct domain *d, const module_t *image,
     struct vcpu *v = d->vcpu[0];
     int rc;
 
-    if ( (rc = bzimage_parse(image_base, &image_start, &image_len)) != 0 )
+    if ( (rc =
+          bzimage_parse(image_base, &image_start, image->arch->headroom,
+                         &image_len)) != 0 )
     {
         printk("Error trying to detect bz compressed kernel\n");
         return rc;
@@ -606,7 +607,7 @@ static int __init pvh_load_kernel(struct domain *d, const module_t *image,
      * simplify it.
      */
     last_addr = find_memory(d, &elf, sizeof(start_info) +
-                            (initrd ? ROUNDUP(initrd->mod_end, PAGE_SIZE) +
+                            (initrd ? ROUNDUP(initrd->size, PAGE_SIZE) +
                                       sizeof(mod)
                                     : 0) +
                             (cmdline ? ROUNDUP(strlen(cmdline) + 1,
@@ -620,8 +621,8 @@ static int __init pvh_load_kernel(struct domain *d, const module_t *image,
 
     if ( initrd != NULL )
     {
-        rc = hvm_copy_to_guest_phys(last_addr, mfn_to_virt(initrd->mod_start),
-                                    initrd->mod_end, v);
+        rc = hvm_copy_to_guest_phys(last_addr, maddr_to_virt(initrd->start),
+                                    initrd->size, v);
         if ( rc )
         {
             printk("Unable to copy initrd to guest\n");
@@ -629,11 +630,11 @@ static int __init pvh_load_kernel(struct domain *d, const module_t *image,
         }
 
         mod.paddr = last_addr;
-        mod.size = initrd->mod_end;
-        last_addr += ROUNDUP(initrd->mod_end, elf_64bit(&elf) ? 8 : 4);
-        if ( initrd->string )
+        mod.size = initrd->size;
+        last_addr += ROUNDUP(initrd->size, elf_64bit(&elf) ? 8 : 4);
+        if ( initrd->string.kind == BOOTSTR_CMDLINE )
         {
-            char *str = __va(initrd->string);
+            char *str = initrd->string.bytes;
             size_t len = strlen(str) + 1;
 
             rc = hvm_copy_to_guest_phys(last_addr, str, len, v);
@@ -1216,10 +1217,9 @@ static void __hwdom_init pvh_setup_mmcfg(struct domain *d)
     }
 }
 
-int __init dom0_construct_pvh(struct domain *d, const module_t *image,
-                              unsigned long image_headroom,
-                              module_t *initrd,
-                              char *cmdline)
+int __init dom0_construct_pvh(
+    struct domain *d, const struct boot_module *image,
+    struct boot_module *initrd, char *cmdline)
 {
     paddr_t entry, start_info;
     int rc;
@@ -1249,7 +1249,7 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image,
         return rc;
     }
 
-    rc = pvh_load_kernel(d, image, image_headroom, initrd, bootstrap_map(image),
+    rc = pvh_load_kernel(d, image, initrd, bootstrap_map(image),
                          cmdline, &entry, &start_info);
     if ( rc )
     {
diff --git a/xen/arch/x86/include/asm/bootinfo.h b/xen/arch/x86/include/asm/bootinfo.h
index b0754a3ed0..e5135e402b 100644
--- a/xen/arch/x86/include/asm/bootinfo.h
+++ b/xen/arch/x86/include/asm/bootinfo.h
@@ -24,7 +24,7 @@ struct __packed arch_boot_info {
 #define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
 #define BOOTINFO_FLAG_X86_APM        	1U << 10
 
-    bool xen_guest;
+    bool xenguest;
 
     char *boot_loader_name;
     char *kextra;
diff --git a/xen/arch/x86/include/asm/bzimage.h b/xen/arch/x86/include/asm/bzimage.h
index 7ed69d3910..5a5a25b4d7 100644
--- a/xen/arch/x86/include/asm/bzimage.h
+++ b/xen/arch/x86/include/asm/bzimage.h
@@ -5,7 +5,8 @@
 
 unsigned long bzimage_headroom(void *image_start, unsigned long image_length);
 
-int bzimage_parse(void *image_base, void **image_start,
-                  unsigned long *image_len);
+int bzimage_parse(
+    void *image_base, void **image_start, unsigned int headroom,
+    unsigned long *image_len);
 
 #endif /* __X86_BZIMAGE_H__ */
diff --git a/xen/arch/x86/include/asm/dom0_build.h b/xen/arch/x86/include/asm/dom0_build.h
index a5f8c9e67f..ad33413710 100644
--- a/xen/arch/x86/include/asm/dom0_build.h
+++ b/xen/arch/x86/include/asm/dom0_build.h
@@ -1,6 +1,7 @@
 #ifndef _DOM0_BUILD_H_
 #define _DOM0_BUILD_H_
 
+#include <xen/bootinfo.h>
 #include <xen/libelf.h>
 #include <xen/sched.h>
 
@@ -13,15 +14,13 @@ unsigned long dom0_compute_nr_pages(struct domain *d,
                                     unsigned long initrd_len);
 int dom0_setup_permissions(struct domain *d);
 
-int dom0_construct_pv(struct domain *d, const module_t *image,
-                      unsigned long image_headroom,
-                      module_t *initrd,
-                      char *cmdline);
+int dom0_construct_pv(
+    struct domain *d, const struct boot_module *image,
+    struct boot_module *initrd, char *cmdline);
 
-int dom0_construct_pvh(struct domain *d, const module_t *image,
-                       unsigned long image_headroom,
-                       module_t *initrd,
-                       char *cmdline);
+int dom0_construct_pvh(
+    struct domain *d, const struct boot_module *image,
+    struct boot_module *initrd, char *cmdline);
 
 unsigned long dom0_paging_pages(const struct domain *d,
                                 unsigned long nr_pages);
diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h
index 21037b7f31..27c0d61819 100644
--- a/xen/arch/x86/include/asm/setup.h
+++ b/xen/arch/x86/include/asm/setup.h
@@ -1,7 +1,8 @@
 #ifndef __X86_SETUP_H_
 #define __X86_SETUP_H_
 
-#include <xen/multiboot.h>
+#include <xen/bootinfo.h>
+
 #include <asm/numa.h>
 
 extern const char __2M_text_start[], __2M_text_end[];
@@ -33,20 +34,17 @@ static inline void vesa_init(void) {};
 #endif
 
 int construct_dom0(
-    struct domain *d,
-    const module_t *kernel, unsigned long kernel_headroom,
-    module_t *initrd,
-    char *cmdline);
+    struct domain *d, const struct boot_module *image,
+    struct boot_module *initrd, char *cmdline);
 void setup_io_bitmap(struct domain *d);
 
 unsigned long initial_images_nrpages(nodeid_t node);
 void discard_initial_images(void);
-void *bootstrap_map(const module_t *mod);
+void *bootstrap_map(const struct boot_module *mod);
 
 int xen_in_range(unsigned long mfn);
 
-void microcode_grab_module(
-    unsigned long *, const multiboot_info_t *);
+void microcode_grab_module(struct boot_info *bi);
 
 extern uint8_t kbd_shift_flags;
 
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index e501979a86..f6131147ef 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -4,6 +4,7 @@
  * Copyright (c) 2002-2005, K A Fraser
  */
 
+#include <xen/bootinfo.h>
 #include <xen/console.h>
 #include <xen/domain.h>
 #include <xen/domain_page.h>
@@ -294,11 +295,9 @@ static struct page_info * __init alloc_chunk(struct domain *d,
     return page;
 }
 
-int __init dom0_construct_pv(struct domain *d,
-                             const module_t *image,
-                             unsigned long image_headroom,
-                             module_t *initrd,
-                             char *cmdline)
+int __init dom0_construct_pv(
+    struct domain *d, const struct boot_module *image,
+    struct boot_module *initrd, char *cmdline)
 {
     int i, rc, order, machine;
     bool compatible, compat;
@@ -314,9 +313,9 @@ int __init dom0_construct_pv(struct domain *d,
     start_info_t *si;
     struct vcpu *v = d->vcpu[0];
     void *image_base = bootstrap_map(image);
-    unsigned long image_len = image->mod_end;
-    void *image_start = image_base + image_headroom;
-    unsigned long initrd_len = initrd ? initrd->mod_end : 0;
+    unsigned long image_len = image->size;
+    void *image_start = image_base + image->arch->headroom;
+    unsigned long initrd_len = initrd ? initrd->size : 0;
     l4_pgentry_t *l4tab = NULL, *l4start = NULL;
     l3_pgentry_t *l3tab = NULL, *l3start = NULL;
     l2_pgentry_t *l2tab = NULL, *l2start = NULL;
@@ -355,7 +354,9 @@ int __init dom0_construct_pv(struct domain *d,
 
     d->max_pages = ~0U;
 
-    if ( (rc = bzimage_parse(image_base, &image_start, &image_len)) != 0 )
+    if ( (rc =
+          bzimage_parse(image_base, &image_start, image->arch->headroom,
+                         &image_len)) != 0 )
         return rc;
 
     if ( (rc = elf_init(&elf, image_start, image_len)) != 0 )
@@ -544,7 +545,7 @@ int __init dom0_construct_pv(struct domain *d,
         initrd_pfn = vinitrd_start ?
                      (vinitrd_start - v_start) >> PAGE_SHIFT :
                      domain_tot_pages(d);
-        initrd_mfn = mfn = initrd->mod_start;
+        initrd_mfn = mfn = mfn_x(initrd->mfn);
         count = PFN_UP(initrd_len);
         if ( d->arch.physaddr_bitsize &&
              ((mfn + count - 1) >> (d->arch.physaddr_bitsize - PAGE_SHIFT)) )
@@ -559,12 +560,13 @@ int __init dom0_construct_pv(struct domain *d,
                     free_domheap_pages(page, order);
                     page += 1UL << order;
                 }
-            memcpy(page_to_virt(page), mfn_to_virt(initrd->mod_start),
+            memcpy(page_to_virt(page), maddr_to_virt(initrd->start),
                    initrd_len);
-            mpt_alloc = (paddr_t)initrd->mod_start << PAGE_SHIFT;
+            mpt_alloc = initrd->start;
             init_domheap_pages(mpt_alloc,
                                mpt_alloc + PAGE_ALIGN(initrd_len));
-            initrd->mod_start = initrd_mfn = mfn_x(page_to_mfn(page));
+            bootmodule_update_mfn(initrd, page_to_mfn(page));
+            initrd_mfn = mfn_x(initrd->mfn);
         }
         else
         {
@@ -572,7 +574,7 @@ int __init dom0_construct_pv(struct domain *d,
                 if ( assign_pages(mfn_to_page(_mfn(mfn++)), 1, d, 0) )
                     BUG();
         }
-        initrd->mod_end = 0;
+        initrd->size = 0;
     }
 
     printk("PHYSICAL MEMORY ARRANGEMENT:\n"
@@ -583,7 +585,7 @@ int __init dom0_construct_pv(struct domain *d,
                nr_pages - domain_tot_pages(d));
     if ( initrd )
     {
-        mpt_alloc = (paddr_t)initrd->mod_start << PAGE_SHIFT;
+        mpt_alloc = initrd->start;
         printk("\n Init. ramdisk: %"PRIpaddr"->%"PRIpaddr,
                mpt_alloc, mpt_alloc + initrd_len);
     }
@@ -804,7 +806,7 @@ int __init dom0_construct_pv(struct domain *d,
         if ( pfn >= initrd_pfn )
         {
             if ( pfn < initrd_pfn + PFN_UP(initrd_len) )
-                mfn = initrd->mod_start + (pfn - initrd_pfn);
+                mfn = mfn_x(initrd->mfn) + (pfn - initrd_pfn);
             else
                 mfn -= PFN_UP(initrd_len);
         }
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 2aa1e28c8f..2700f4eb3e 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1,3 +1,4 @@
+#include <xen/bootinfo.h>
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/err.h>
@@ -270,8 +271,48 @@ static int __init cf_check parse_acpi_param(const char *s)
 }
 custom_param("acpi", parse_acpi_param);
 
-static const module_t *__initdata initial_images;
-static unsigned int __initdata nr_initial_images;
+struct boot_info *__initdata boot_info;
+
+static void __init mb_to_bootinfo(multiboot_info_t *mbi, module_t *mods)
+{
+    static struct boot_info       __initdata x86_binfo;
+    static struct arch_boot_info  __initdata arch_x86_binfo;
+    static struct boot_module     __initdata x86_mods[CONFIG_NR_BOOTMODS + 1];
+    static struct arch_bootmodule __initdata
+                                        arch_x86_mods[CONFIG_NR_BOOTMODS + 1];
+    int i;
+
+    x86_binfo.arch = &arch_x86_binfo;
+    x86_binfo.mods = x86_mods;
+
+    x86_binfo.cmdline = __va(mbi->cmdline);
+
+    /* The BOOTINFO_FLAG_X86_* flags are a 1-1 map to MBI_* */
+    arch_x86_binfo.flags = mbi->flags;
+    arch_x86_binfo.mem_upper = mbi->mem_upper;
+    arch_x86_binfo.mem_lower = mbi->mem_lower;
+    arch_x86_binfo.mmap_length = mbi->mmap_length;
+    arch_x86_binfo.mmap_addr = mbi->mmap_addr;
+    arch_x86_binfo.boot_loader_name = __va(mbi->boot_loader_name);
+
+    x86_binfo.nr_mods = mbi->mods_count;
+    for ( i = 0; i <= CONFIG_NR_BOOTMODS; i++ )
+    {
+        x86_mods[i].arch = &arch_x86_mods[i];
+
+        if ( i < x86_binfo.nr_mods )
+        {
+            bootmodule_update_start(&x86_mods[i], mods[i].mod_start);
+            x86_mods[i].size = mods[i].mod_end - mods[i].mod_start;
+
+            x86_mods[i].string.len = strlcpy(x86_mods[i].string.bytes,
+                                              __va(mods[i].string),
+                                              BOOTMOD_MAX_STRING);
+        }
+    }
+
+    boot_info = &x86_binfo;
+}
 
 unsigned long __init initial_images_nrpages(nodeid_t node)
 {
@@ -280,10 +321,10 @@ unsigned long __init initial_images_nrpages(nodeid_t node)
     unsigned long nr;
     unsigned int i;
 
-    for ( nr = i = 0; i < nr_initial_images; ++i )
+    for ( nr = i = 0; i < boot_info->nr_mods; ++i )
     {
-        unsigned long start = initial_images[i].mod_start;
-        unsigned long end = start + PFN_UP(initial_images[i].mod_end);
+        unsigned long start = mfn_x(boot_info->mods[i].mfn);
+        unsigned long end = start + PFN_UP(boot_info->mods[i].size);
 
         if ( end > node_start && node_end > start )
             nr += min(node_end, end) - max(node_start, start);
@@ -296,16 +337,15 @@ void __init discard_initial_images(void)
 {
     unsigned int i;
 
-    for ( i = 0; i < nr_initial_images; ++i )
+    for ( i = 0; i < boot_info->nr_mods; ++i )
     {
-        uint64_t start = (uint64_t)initial_images[i].mod_start << PAGE_SHIFT;
+        uint64_t start = (uint64_t)boot_info->mods[i].start;
 
         init_domheap_pages(start,
-                           start + PAGE_ALIGN(initial_images[i].mod_end));
+                           start + PAGE_ALIGN(boot_info->mods[i].size));
     }
 
-    nr_initial_images = 0;
-    initial_images = NULL;
+    boot_info->nr_mods = 0;
 }
 
 extern char __init_begin[], __init_end[], __bss_start[], __bss_end[];
@@ -392,14 +432,14 @@ static void __init normalise_cpu_order(void)
  * Ensure a given physical memory range is present in the bootstrap mappings.
  * Use superpage mappings to ensure that pagetable memory needn't be allocated.
  */
-void *__init bootstrap_map(const module_t *mod)
+void *__init bootstrap_map(const struct boot_module *mod)
 {
     static unsigned long __initdata map_cur = BOOTSTRAP_MAP_BASE;
     uint64_t start, end, mask = (1L << L2_PAGETABLE_SHIFT) - 1;
     void *ret;
 
     if ( system_state != SYS_STATE_early_boot )
-        return mod ? mfn_to_virt(mod->mod_start) : NULL;
+        return mod ? maddr_to_virt(mod->start) : NULL;
 
     if ( !mod )
     {
@@ -408,8 +448,8 @@ void *__init bootstrap_map(const module_t *mod)
         return NULL;
     }
 
-    start = (uint64_t)mod->mod_start << PAGE_SHIFT;
-    end = start + mod->mod_end;
+    start = (uint64_t)mod->start;
+    end = start + mod->size;
     if ( start >= end )
         return NULL;
 
@@ -436,25 +476,25 @@ static void *__init move_memory(
 
     while ( size )
     {
-        module_t mod;
+        struct boot_module mod;
         unsigned int soffs = src & mask;
         unsigned int doffs = dst & mask;
         unsigned int sz;
         void *d, *s;
 
-        mod.mod_start = (src - soffs) >> PAGE_SHIFT;
-        mod.mod_end = soffs + size;
-        if ( mod.mod_end > blksz )
-            mod.mod_end = blksz;
-        sz = mod.mod_end - soffs;
+        mod.start = src - soffs;
+        mod.size = soffs + size;
+        if ( mod.size > blksz )
+            mod.size = blksz;
+        sz = mod.size - soffs;
         s = bootstrap_map(&mod);
 
-        mod.mod_start = (dst - doffs) >> PAGE_SHIFT;
-        mod.mod_end = doffs + size;
-        if ( mod.mod_end > blksz )
-            mod.mod_end = blksz;
-        if ( sz > mod.mod_end - doffs )
-            sz = mod.mod_end - doffs;
+        mod.start = dst - doffs;
+        mod.size = doffs + size;
+        if ( mod.size > blksz )
+            mod.size = blksz;
+        if ( sz > mod.size - doffs )
+            sz = mod.size - doffs;
         d = bootstrap_map(&mod);
 
         memmove(d + doffs, s + soffs, sz);
@@ -475,7 +515,7 @@ static void *__init move_memory(
 #undef BOOTSTRAP_MAP_LIMIT
 
 static uint64_t __init consider_modules(
-    uint64_t s, uint64_t e, uint32_t size, const module_t *mod,
+    uint64_t s, uint64_t e, uint32_t size, const struct boot_module *mod,
     unsigned int nr_mods, unsigned int this_mod)
 {
     unsigned int i;
@@ -485,8 +525,8 @@ static uint64_t __init consider_modules(
 
     for ( i = 0; i < nr_mods ; ++i )
     {
-        uint64_t start = (uint64_t)mod[i].mod_start << PAGE_SHIFT;
-        uint64_t end = start + PAGE_ALIGN(mod[i].mod_end);
+        uint64_t start = (uint64_t)mod[i].start;
+        uint64_t end = start + PAGE_ALIGN(mod[i].size);
 
         if ( i == this_mod )
             continue;
@@ -756,10 +796,8 @@ static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int li
     return n;
 }
 
-static struct domain *__init create_dom0(const module_t *image,
-                                         unsigned long headroom,
-                                         module_t *initrd, const char *kextra,
-                                         const char *loader)
+static struct domain *__init create_dom0(
+    const struct boot_info *bi, const char *kextra, const char *loader)
 {
     struct xen_domctl_createdomain dom0_cfg = {
         .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
@@ -772,9 +810,14 @@ static struct domain *__init create_dom0(const module_t *image,
             .misc_flags = opt_dom0_msr_relaxed ? XEN_X86_MSR_RELAXED : 0,
         },
     };
+    struct boot_module *image = bootmodule_next_by_kind(bi, BOOTMOD_KERNEL, 0);
+    struct boot_module *initrd = bootmodule_next_by_kind(bi, BOOTMOD_RAMDISK, 0);
     struct domain *d;
     char *cmdline;
-    domid_t domid;
+    domid_t domid = 0;
+
+    if ( image == NULL )
+        panic("Error creating d%uv0\n", domid);
 
     if ( opt_dom0_pvh )
     {
@@ -801,7 +844,8 @@ static struct domain *__init create_dom0(const module_t *image,
         panic("Error creating d%uv0\n", domid);
 
     /* Grab the DOM0 command line. */
-    cmdline = image->string ? __va(image->string) : NULL;
+    cmdline = (image->string.kind == BOOTSTR_CMDLINE) ?
+              image->string.bytes : NULL;
     if ( cmdline || kextra )
     {
         static char __initdata dom0_cmdline[MAX_GUEST_CMDLINE];
@@ -841,7 +885,7 @@ static struct domain *__init create_dom0(const module_t *image,
         write_cr4(read_cr4() & ~X86_CR4_SMAP);
     }
 
-    if ( construct_dom0(d, image, headroom, initrd, cmdline) != 0 )
+    if ( construct_dom0(d, image, initrd, cmdline) != 0 )
         panic("Could not construct domain 0\n");
 
     if ( cpu_has_smap )
@@ -865,7 +909,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     unsigned int initrdidx, num_parked = 0;
     multiboot_info_t *mbi;
     module_t *mod;
-    unsigned long nr_pages, raw_max_page, modules_headroom, module_map[1];
+    unsigned long nr_pages, raw_max_page;
     int i, j, e820_warn = 0, bytes = 0;
     unsigned long eb_start, eb_end;
     bool acpi_boot_table_init_done = false, relocated = false;
@@ -910,12 +954,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         mod = __va(mbi->mods_addr);
     }
 
-    loader = (mbi->flags & MBI_LOADERNAME)
-        ? (char *)__va(mbi->boot_loader_name) : "unknown";
+    mb_to_bootinfo(mbi, mod);
+
+    loader = (boot_info->arch->flags & BOOTINFO_FLAG_X86_LOADERNAME)
+        ? boot_info->arch->boot_loader_name : "unknown";
 
     /* Parse the command-line options. */
-    cmdline = cmdline_cook((mbi->flags & MBI_CMDLINE) ?
-                           __va(mbi->cmdline) : NULL,
+    cmdline = cmdline_cook((boot_info->arch->flags & BOOTINFO_FLAG_X86_CMDLINE) ?
+                            boot_info->cmdline : NULL,
                            loader);
     if ( (kextra = strstr(cmdline, " -- ")) != NULL )
     {
@@ -1016,19 +1062,22 @@ void __init noreturn __start_xen(unsigned long mbi_p)
            bootsym(boot_edd_info_nr));
 
     /* Check that we have at least one Multiboot module. */
-    if ( !(mbi->flags & MBI_MODULES) || (mbi->mods_count == 0) )
+    if ( !(boot_info->arch->flags & BOOTINFO_FLAG_X86_MODULES) ||
+         (boot_info->nr_mods == 0) )
         panic("dom0 kernel not specified. Check bootloader configuration\n");
 
     /* Check that we don't have a silly number of modules. */
-    if ( mbi->mods_count > CONFIG_NR_BOOTMODS )
+    if ( boot_info->nr_mods > CONFIG_NR_BOOTMODS )
     {
-        mbi->mods_count = CONFIG_NR_BOOTMODS;
+        boot_info->nr_mods = CONFIG_NR_BOOTMODS;
         printk("Excessive multiboot modules - using the first %u only\n",
-               mbi->mods_count);
+               boot_info->nr_mods);
     }
 
-    bitmap_fill(module_map, mbi->mods_count);
-    __clear_bit(0, module_map); /* Dom0 kernel is always first */
+    /* Dom0 kernel is the first boot module */
+    boot_info->mods[0].kind = BOOTMOD_KERNEL;
+    if ( boot_info->mods[0].string.len )
+        boot_info->mods[0].string.kind = BOOTSTR_CMDLINE;
 
     if ( pvh_boot )
     {
@@ -1052,19 +1101,19 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     }
     else if ( efi_enabled(EFI_BOOT) )
         memmap_type = "EFI";
-    else if ( (e820_raw.nr_map = 
+    else if ( (e820_raw.nr_map =
                    copy_bios_e820(e820_raw.map,
                                   ARRAY_SIZE(e820_raw.map))) != 0 )
     {
         memmap_type = "Xen-e820";
     }
-    else if ( mbi->flags & MBI_MEMMAP )
+    else if ( boot_info->arch->flags & BOOTINFO_FLAG_X86_MEMMAP )
     {
         memmap_type = "Multiboot-e820";
-        while ( bytes < mbi->mmap_length &&
+        while ( bytes < boot_info->arch->mmap_length &&
                 e820_raw.nr_map < ARRAY_SIZE(e820_raw.map) )
         {
-            memory_map_t *map = __va(mbi->mmap_addr + bytes);
+            struct mb_memmap *map = __va(boot_info->arch->mmap_addr + bytes);
 
             /*
              * This is a gross workaround for a BIOS bug. Some bootloaders do
@@ -1165,17 +1214,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     set_kexec_crash_area_size((u64)nr_pages << PAGE_SHIFT);
     kexec_reserve_area(&boot_e820);
 
-    initial_images = mod;
-    nr_initial_images = mbi->mods_count;
-
-    for ( i = 0; !efi_enabled(EFI_LOADER) && i < mbi->mods_count; i++ )
-    {
-        if ( mod[i].mod_start & (PAGE_SIZE - 1) )
+    for ( i = 0; !efi_enabled(EFI_LOADER) && i < boot_info->nr_mods; i++ )
+        if ( boot_info->mods[i].start & (PAGE_SIZE - 1) )
             panic("Bootloader didn't honor module alignment request\n");
-        mod[i].mod_end -= mod[i].mod_start;
-        mod[i].mod_start >>= PAGE_SHIFT;
-        mod[i].reserved = 0;
-    }
 
     if ( xen_phys_start )
     {
@@ -1186,11 +1227,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
          * respective reserve_e820_ram() invocation below. No need to
          * query efi_boot_mem_unused() here, though.
          */
-        mod[mbi->mods_count].mod_start = virt_to_mfn(_stext);
-        mod[mbi->mods_count].mod_end = __2M_rwdata_end - _stext;
+        bootmodule_update_start(&boot_info->mods[boot_info->nr_mods],
+                                virt_to_maddr(_stext));
+        boot_info->mods[boot_info->nr_mods].size = __2M_rwdata_end - _stext;
     }
 
-    modules_headroom = bzimage_headroom(bootstrap_map(mod), mod->mod_end);
+    boot_info->mods[0].arch->headroom = bzimage_headroom(
+                                        bootstrap_map(&boot_info->mods[0]),
+                                        boot_info->mods[0].size);
     bootstrap_map(NULL);
 
 #ifndef highmem_start
@@ -1247,7 +1291,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         {
             /* Don't overlap with modules. */
             end = consider_modules(s, e, reloc_size + mask,
-                                   mod, mbi->mods_count, -1);
+                                   boot_info->mods, boot_info->nr_mods, -1);
             end &= ~mask;
         }
         else
@@ -1351,31 +1395,32 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         }
 
         /* Is the region suitable for relocating the multiboot modules? */
-        for ( j = mbi->mods_count - 1; j >= 0; j-- )
+        for ( j = boot_info->nr_mods - 1; j >= 0; j-- )
         {
-            unsigned long headroom = j ? 0 : modules_headroom;
-            unsigned long size = PAGE_ALIGN(headroom + mod[j].mod_end);
+            struct boot_module *mod = boot_info->mods;
+            unsigned long headroom = mod[j].arch->headroom;
+            unsigned long size = PAGE_ALIGN(headroom + mod[j].size);
 
-            if ( mod[j].reserved )
+            if ( mod[j].arch->flags & BOOTMOD_FLAG_X86_RELOCATED )
                 continue;
 
             /* Don't overlap with other modules (or Xen itself). */
             end = consider_modules(s, e, size, mod,
-                                   mbi->mods_count + relocated, j);
+                                   boot_info->nr_mods + relocated, j);
 
             if ( highmem_start && end > highmem_start )
                 continue;
 
             if ( s < end &&
                  (headroom ||
-                  ((end - size) >> PAGE_SHIFT) > mod[j].mod_start) )
+                  ((end - size) >> PAGE_SHIFT) > mfn_x(mod[j].mfn)) )
             {
                 move_memory(end - size + headroom,
-                            (uint64_t)mod[j].mod_start << PAGE_SHIFT,
-                            mod[j].mod_end, 0);
-                mod[j].mod_start = (end - size) >> PAGE_SHIFT;
-                mod[j].mod_end += headroom;
-                mod[j].reserved = 1;
+                            (uint64_t)mod[j].start,
+                            mod[j].size, 0);
+                bootmodule_update_start(&mod[j], end - size);
+                mod[j].size += headroom;
+                mod[j].arch->flags |= BOOTMOD_FLAG_X86_RELOCATED;
             }
         }
 
@@ -1387,8 +1432,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         while ( !kexec_crash_area.start )
         {
             /* Don't overlap with modules (or Xen itself). */
-            e = consider_modules(s, e, PAGE_ALIGN(kexec_crash_area.size), mod,
-                                 mbi->mods_count + relocated, -1);
+            e = consider_modules(s, e, PAGE_ALIGN(kexec_crash_area.size),
+                                 boot_info->mods,
+                                 boot_info->nr_mods + relocated, -1);
             if ( s >= e )
                 break;
             if ( e > kexec_crash_area_limit )
@@ -1401,13 +1447,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 #endif
     }
 
-    if ( modules_headroom && !mod->reserved )
+    if ( boot_info->mods[0].arch->headroom &&
+         !(boot_info->mods[0].arch->flags & BOOTMOD_FLAG_X86_RELOCATED) )
         panic("Not enough memory to relocate the dom0 kernel image\n");
-    for ( i = 0; i < mbi->mods_count; ++i )
+    for ( i = 0; i < boot_info->nr_mods; ++i )
     {
-        uint64_t s = (uint64_t)mod[i].mod_start << PAGE_SHIFT;
+        uint64_t s = (uint64_t)boot_info->mods[i].start;
 
-        reserve_e820_ram(&boot_e820, s, s + PAGE_ALIGN(mod[i].mod_end));
+        reserve_e820_ram(&boot_e820, s, s + PAGE_ALIGN(boot_info->mods[i].size));
     }
 
     if ( !xen_phys_start )
@@ -1472,10 +1519,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                     ASSERT(j);
                 }
                 map_e = boot_e820.map[j].addr + boot_e820.map[j].size;
-                for ( j = 0; j < mbi->mods_count; ++j )
+                for ( j = 0; j < boot_info->nr_mods; ++j )
                 {
-                    uint64_t end = pfn_to_paddr(mod[j].mod_start) +
-                                   mod[j].mod_end;
+                    uint64_t end = mfn_to_maddr(boot_info->mods[j].mfn) +
+                                   boot_info->mods[j].size;
 
                     if ( map_e < end )
                         map_e = end;
@@ -1548,13 +1595,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         }
     }
 
-    for ( i = 0; i < mbi->mods_count; ++i )
+    for ( i = 0; i < boot_info->nr_mods; ++i )
     {
-        set_pdx_range(mod[i].mod_start,
-                      mod[i].mod_start + PFN_UP(mod[i].mod_end));
-        map_pages_to_xen((unsigned long)mfn_to_virt(mod[i].mod_start),
-                         _mfn(mod[i].mod_start),
-                         PFN_UP(mod[i].mod_end), PAGE_HYPERVISOR);
+        set_pdx_range(mfn_x(boot_info->mods[i].mfn),
+                      mfn_x(boot_info->mods[i].mfn) +
+                      PFN_UP(boot_info->mods[i].size));
+        map_pages_to_xen((unsigned long)maddr_to_virt(boot_info->mods[i].start),
+                         boot_info->mods[i].mfn,
+                         PFN_UP(boot_info->mods[i].size), PAGE_HYPERVISOR);
     }
 
 #ifdef CONFIG_KEXEC
@@ -1704,7 +1752,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
                                   RANGESETF_prettyprint_hex);
 
-    xsm_multiboot_init(module_map, mbi);
+    xsm_bootmodule_init(boot_info);
 
     /*
      * IOMMU-related ACPI table parsing may require some of the system domains
@@ -1773,7 +1821,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     init_IRQ();
 
-    microcode_grab_module(module_map, mbi);
+    microcode_grab_module(boot_info);
 
     timer_init();
 
@@ -1921,8 +1969,11 @@ void __init noreturn __start_xen(unsigned long mbi_p)
            cpu_has_nx ? XENLOG_INFO : XENLOG_WARNING "Warning: ",
            cpu_has_nx ? "" : "not ");
 
-    initrdidx = find_first_bit(module_map, mbi->mods_count);
-    if ( bitmap_weight(module_map, mbi->mods_count) > 1 )
+    initrdidx = bootmodule_next_idx_by_kind(boot_info, BOOTMOD_UNKNOWN, 0);
+    if ( initrdidx < boot_info->nr_mods )
+        boot_info->mods[initrdidx].kind = BOOTMOD_RAMDISK;
+
+    if ( bootmodule_count_by_kind(boot_info, BOOTMOD_UNKNOWN) > 0 )
         printk(XENLOG_WARNING
                "Multiple initrd candidates, picking module #%u\n",
                initrdidx);
@@ -1931,9 +1982,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
      * We're going to setup domain0 using the module(s) that we stashed safely
      * above our heap. The second module, if present, is an initrd ramdisk.
      */
-    dom0 = create_dom0(mod, modules_headroom,
-                       initrdidx < mbi->mods_count ? mod + initrdidx : NULL,
-                       kextra, loader);
+    dom0 = create_dom0(boot_info, kextra, loader);
     if ( !dom0 )
         panic("Could not set up DOM0 guest OS\n");
 
diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
index 42b53a3ca6..dde8202f62 100644
--- a/xen/include/xen/bootinfo.h
+++ b/xen/include/xen/bootinfo.h
@@ -51,4 +51,51 @@ struct __packed boot_info {
     struct arch_boot_info *arch;
 };
 
+extern struct boot_info *boot_info;
+
+static inline unsigned long bootmodule_next_idx_by_kind(
+    const struct boot_info *bi, bootmodule_kind kind, unsigned long start)
+{
+    for ( ; start < bi->nr_mods; start++ )
+        if ( bi->mods[start].kind == kind )
+            return start;
+
+    return bi->nr_mods + 1;
+}
+
+static inline unsigned long bootmodule_count_by_kind(
+    const struct boot_info *bi, bootmodule_kind kind)
+{
+    unsigned long count = 0;
+    unsigned int i;
+
+    for ( i = 0; i < bi->nr_mods; i++ )
+        if ( bi->mods[i].kind == kind )
+            count++;
+
+    return count;
+}
+
+static inline struct boot_module *bootmodule_next_by_kind(
+    const struct boot_info *bi, bootmodule_kind kind, unsigned long start)
+{
+    for ( ; start < bi->nr_mods; start++ )
+        if ( bi->mods[start].kind == kind )
+            return &bi->mods[start];
+
+    return NULL;
+}
+
+static inline void bootmodule_update_start(struct boot_module *b, paddr_t new_start)
+{
+    b->start = new_start;
+    b->mfn = maddr_to_mfn(new_start);
+}
+
+static inline void bootmodule_update_mfn(struct boot_module *b, mfn_t new_mfn)
+{
+    b->mfn = new_mfn;
+    b->start = mfn_to_maddr(new_mfn);
+}
+
 #endif
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 8dad03fd3d..930939e925 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -16,8 +16,8 @@
 #define __XSM_H__
 
 #include <xen/alternative-call.h>
+#include <xen/bootinfo.h>
 #include <xen/sched.h>
-#include <xen/multiboot.h>
 
 /* policy magic number (defined by XSM_MAGIC) */
 typedef uint32_t xsm_magic_t;
@@ -776,15 +776,14 @@ static inline int xsm_argo_send(const struct domain *d, const struct domain *t)
 
 #endif /* XSM_NO_WRAPPERS */
 
-#ifdef CONFIG_MULTIBOOT
-int xsm_multiboot_init(
-    unsigned long *module_map, const multiboot_info_t *mbi);
-int xsm_multiboot_policy_init(
-    unsigned long *module_map, const multiboot_info_t *mbi,
-    void **policy_buffer, size_t *policy_size);
-#endif
+#ifndef CONFIG_HAS_DEVICE_TREE
+int xsm_bootmodule_init(const struct boot_info *bi);
+int xsm_bootmodule_policy_init(
+    const struct boot_info *bi, const unsigned char **policy_buffer,
+    size_t *policy_size);
+
+#else
 
-#ifdef CONFIG_HAS_DEVICE_TREE
 /*
  * Initialize XSM
  *
@@ -826,15 +825,14 @@ static const inline struct xsm_ops *silo_init(void)
 
 #include <xsm/dummy.h>
 
-#ifdef CONFIG_MULTIBOOT
-static inline int xsm_multiboot_init (
-    unsigned long *module_map, const multiboot_info_t *mbi)
+#ifndef CONFIG_HAS_DEVICE_TREE
+static inline int xsm_bootmodule_init(const struct boot_info *bi)
 {
     return 0;
 }
-#endif
 
-#ifdef CONFIG_HAS_DEVICE_TREE
+#else
+
 static inline int xsm_dt_init(void)
 {
     return 0;
diff --git a/xen/xsm/xsm_core.c b/xen/xsm/xsm_core.c
index 2286a502e3..8631f3e7bb 100644
--- a/xen/xsm/xsm_core.c
+++ b/xen/xsm/xsm_core.c
@@ -10,6 +10,9 @@
  *  as published by the Free Software Foundation.
  */
 
+#include <xen/bootinfo.h>
+#include <xen/errno.h>
+#include <xen/hypercall.h>
 #include <xen/init.h>
 #include <xen/errno.h>
 #include <xen/lib.h>
@@ -138,26 +141,34 @@ static int __init xsm_core_init(const void *policy_buffer, size_t policy_size)
     return 0;
 }
 
-#ifdef CONFIG_MULTIBOOT
-int __init xsm_multiboot_init(
-    unsigned long *module_map, const multiboot_info_t *mbi)
+/*
+ * ifdef'ing this against multiboot is no longer valid as the boot module
+ * is agnostic, and it will be possible to drop the ifndef should Arm
+ * adopt boot info.
+ */
+#ifndef CONFIG_HAS_DEVICE_TREE
+int __init xsm_bootmodule_init(const struct boot_info *bi)
 {
     int ret = 0;
-    void *policy_buffer = NULL;
+    const unsigned char *policy_buffer = NULL;
     size_t policy_size = 0;
 
     printk("XSM Framework v" XSM_FRAMEWORK_VERSION " initialized\n");
 
     if ( XSM_MAGIC )
     {
-        ret = xsm_multiboot_policy_init(module_map, mbi, &policy_buffer,
-                                        &policy_size);
-        if ( ret )
-        {
-            bootstrap_map(NULL);
-            printk(XENLOG_ERR "Error %d initializing XSM policy\n", ret);
-            return -EINVAL;
-        }
+        ret = xsm_bootmodule_policy_init(bi, &policy_buffer, &policy_size);
+        bootstrap_map(NULL);
+
+        if ( ret == -ENOENT )
+            /*
+             * The XSM module needs a policy file but one was not located.
+             * Report as a warning and continue as the XSM module may late
+             * load a policy file.
+             */
+            printk(XENLOG_WARNING "xsm: starting without a policy loaded!\n");
+        else if ( ret )
+            panic("Error %d initializing XSM policy\n", ret);
     }
 
     ret = xsm_core_init(policy_buffer, policy_size);
@@ -165,9 +176,9 @@ int __init xsm_multiboot_init(
 
     return 0;
 }
-#endif
 
-#ifdef CONFIG_HAS_DEVICE_TREE
+#else
+
 int __init xsm_dt_init(void)
 {
     int ret = 0;
@@ -215,9 +226,9 @@ bool __init has_xsm_magic(paddr_t start)
 
     return false;
 }
-#endif
+#endif /* CONFIG_HAS_DEVICE_TREE */
 
-#endif
+#endif /* CONFIG_XSM */
 
 long cf_check do_xsm_op(XEN_GUEST_HANDLE_PARAM(void) op)
 {
diff --git a/xen/xsm/xsm_policy.c b/xen/xsm/xsm_policy.c
index 8dafbc9381..c55ff2a574 100644
--- a/xen/xsm/xsm_policy.c
+++ b/xen/xsm/xsm_policy.c
@@ -18,61 +18,61 @@
  *
  */
 
-#include <xsm/xsm.h>
-#ifdef CONFIG_MULTIBOOT
-#include <xen/multiboot.h>
-#include <asm/setup.h>
-#endif
 #include <xen/bitops.h>
+#include <xen/bootinfo.h>
+#include <xsm/xsm.h>
+
 #ifdef CONFIG_HAS_DEVICE_TREE
-# include <asm/setup.h>
 # include <xen/device_tree.h>
 #endif
 
-#ifdef CONFIG_MULTIBOOT
-int __init xsm_multiboot_policy_init(
-    unsigned long *module_map, const multiboot_info_t *mbi,
-    void **policy_buffer, size_t *policy_size)
+# include <asm/setup.h>
+
+#ifndef CONFIG_HAS_DEVICE_TREE
+int __init xsm_bootmodule_policy_init(
+    const struct boot_info *bi, const unsigned char **policy_buffer,
+    size_t *policy_size)
 {
-    int i;
-    module_t *mod = (module_t *)__va(mbi->mods_addr);
-    int rc = 0;
+    unsigned long idx = 0;
+    int rc = -ENOENT;
     u32 *_policy_start;
     unsigned long _policy_len;
 
-    /*
-     * Try all modules and see whichever could be the binary policy.
-     * Adjust module_map for the module that is the binary policy.
-     */
-    for ( i = mbi->mods_count-1; i >= 1; i-- )
-    {
-        if ( !test_bit(i, module_map) )
-            continue;
+#ifdef CONFIG_XSM_FLASK_POLICY
+    /* Initially set to builtin policy, overridden if boot module is found. */
+    *policy_buffer = xsm_flask_init_policy;
+    *policy_size = xsm_flask_init_policy_size;
+    rc = 0;
+#endif
 
-        _policy_start = bootstrap_map(mod + i);
-        _policy_len   = mod[i].mod_end;
+    idx = bootmodule_next_idx_by_kind(bi, BOOTMOD_UNKNOWN, idx);
+    while ( idx < bi->nr_mods )
+    {
+        _policy_start = bootstrap_map(&bi->mods[idx]);
+        _policy_len   = bi->mods[idx].size;
 
         if ( (xsm_magic_t)(*_policy_start) == XSM_MAGIC )
         {
-            *policy_buffer = _policy_start;
+            *policy_buffer = (unsigned char *)_policy_start;
             *policy_size = _policy_len;
 
             printk("Policy len %#lx, start at %p.\n",
                    _policy_len,_policy_start);
 
-            __clear_bit(i, module_map);
+            bi->mods[idx].kind = BOOTMOD_XSM;
+            rc = 0;
             break;
-
         }
 
         bootstrap_map(NULL);
+        idx = bootmodule_next_idx_by_kind(bi, BOOTMOD_UNKNOWN, idx + 1);
     }
 
     return rc;
 }
-#endif
 
-#ifdef CONFIG_HAS_DEVICE_TREE
+#else
+
 int __init xsm_dt_policy_init(void **policy_buffer, size_t *policy_size)
 {
     struct bootmodule *mod = boot_module_find_by_kind(BOOTMOD_XSM);
-- 
2.20.1




* [PATCH v1 04/18] x86: refactor entrypoints to new boot info
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (2 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 03/18] x86: adopt new boot info structures Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-18 13:58   ` Smith, Jackson
  2022-07-06 21:04 ` [PATCH v1 05/18] x86: refactor xen cmdline into general framework Daniel P. Smith
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné

The previous commit added a transition point from multiboot v1 structures to
the new boot info structures at the earliest common point for all the x86
entrypoints. The result was that each entrypoint constructed a multiboot v1
structure from its own native structures. This meant that multiboot2, EFI,
and PVH all converted their structures over to multiboot v1, only to have them
converted again upon entering __start_xen().

This commit drops the translation function and moves the population of the new
boot info structures down into the various entrypoints.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
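A minimal sketch of the direct population described above, for readers who
want the per-module conversion in isolation. It is not the code in this
patch: the sketch_* names, the simplified structure layouts, and the
64-byte string limit are assumptions made for illustration (the patch uses
struct boot_module and BOOTMOD_MAX_STRING). The sketch also bounds and
NUL-terminates the string copy, which is an addition of the sketch rather
than something the copy loops in reloc.c do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit for the sketch; the patch uses BOOTMOD_MAX_STRING. */
#define SKETCH_MAX_STRING 64

/* Simplified stand-in for a multiboot v1 module entry (module_t). */
struct sketch_mb1_module {
    uint32_t mod_start;   /* physical address of first byte */
    uint32_t mod_end;     /* physical address one past the last byte */
    const char *string;   /* real multiboot stores a 32-bit address here */
};

/* Simplified stand-in for struct boot_module. */
struct sketch_boot_module {
    uint64_t start;
    uint64_t size;
    char string[SKETCH_MAX_STRING];
    uint64_t string_len;  /* bytes including the NUL, 0 if no string */
};

/* Convert one module: record start and size, then copy the string. */
void sketch_fill_module(struct sketch_boot_module *dst,
                        const struct sketch_mb1_module *src)
{
    size_t j = 0;

    dst->start = src->mod_start;
    dst->size = src->mod_end - src->mod_start;
    dst->string[0] = '\0';
    dst->string_len = 0;

    if ( src->string )
    {
        /* Bounded copy so an over-long string cannot overrun the buffer. */
        while ( src->string[j] != '\0' && j < SKETCH_MAX_STRING - 1 )
        {
            dst->string[j] = src->string[j];
            j++;
        }
        dst->string[j] = '\0';
        dst->string_len = j + 1;
    }
}

/* Convert a whole module list; returns the number of modules filled. */
unsigned int sketch_fill_modules(struct sketch_boot_module *dst,
                                 const struct sketch_mb1_module *src,
                                 unsigned int count)
{
    unsigned int i;

    for ( i = 0; i < count; i++ )
        sketch_fill_module(&dst[i], &src[i]);

    return count;
}
```

The point of the sketch is that the entrypoint writes start/size/string
straight into the boot module record, so no intermediate multiboot v1
copy has to be built and re-parsed later.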
 xen/arch/x86/boot/boot_info32.h           |  94 +++++++++++
 xen/arch/x86/boot/defs.h                  |  17 +-
 xen/arch/x86/boot/reloc.c                 | 187 +++++++++++++++-------
 xen/arch/x86/efi/efi-boot.h               |  96 ++++++-----
 xen/arch/x86/guest/xen/pvh-boot.c         |  64 +++++---
 xen/arch/x86/include/asm/guest/pvh-boot.h |   6 +-
 xen/arch/x86/setup.c                      |  71 +++-----
 xen/common/efi/boot.c                     |   4 +-
 8 files changed, 359 insertions(+), 180 deletions(-)
 create mode 100644 xen/arch/x86/boot/boot_info32.h

diff --git a/xen/arch/x86/boot/boot_info32.h b/xen/arch/x86/boot/boot_info32.h
new file mode 100644
index 0000000000..01af950efc
--- /dev/null
+++ b/xen/arch/x86/boot/boot_info32.h
@@ -0,0 +1,94 @@
+#ifndef __BOOT_INFO32_H__
+#define __BOOT_INFO32_H__
+
+#include "defs.h"
+
+typedef enum {
+    BOOTMOD_UNKNOWN,
+    BOOTMOD_XEN,
+    BOOTMOD_FDT,
+    BOOTMOD_KERNEL,
+    BOOTMOD_RAMDISK,
+    BOOTMOD_XSM,
+    BOOTMOD_UCODE,
+    BOOTMOD_GUEST_DTB,
+} bootmodule_kind;
+
+typedef enum {
+    BOOTSTR_EMPTY,
+    BOOTSTR_STRING,
+    BOOTSTR_CMDLINE,
+} bootstring_kind;
+
+#define BOOTMOD_MAX_STRING 1024
+struct __packed boot_string {
+    u32 kind;
+    u64 arch;
+
+    char bytes[BOOTMOD_MAX_STRING];
+    u64 len;
+};
+
+struct __packed arch_bootmodule {
+    bool relocated;
+    u32 flags;
+#define BOOTMOD_FLAG_X86_RELOCATED      (1U << 0)
+    u32 headroom;
+};
+
+struct __packed boot_module {
+    u32 kind;
+    u64 start;
+    u64 mfn;
+    u64 size;
+
+    u64 arch;
+    struct boot_string string;
+};
+
+struct __packed arch_boot_info {
+    /* uint32_t */
+    u32 flags;
+#define BOOTINFO_FLAG_X86_MEMLIMITS     (1U << 0)
+#define BOOTINFO_FLAG_X86_BOOTDEV       (1U << 1)
+#define BOOTINFO_FLAG_X86_CMDLINE       (1U << 2)
+#define BOOTINFO_FLAG_X86_MODULES       (1U << 3)
+#define BOOTINFO_FLAG_X86_AOUT_SYMS     (1U << 4)
+#define BOOTINFO_FLAG_X86_ELF_SYMS      (1U << 5)
+#define BOOTINFO_FLAG_X86_MEMMAP        (1U << 6)
+#define BOOTINFO_FLAG_X86_DRIVES        (1U << 7)
+#define BOOTINFO_FLAG_X86_BIOSCONFIG    (1U << 8)
+#define BOOTINFO_FLAG_X86_LOADERNAME    (1U << 9)
+#define BOOTINFO_FLAG_X86_APM           (1U << 10)
+
+    /* bool */
+    u8 xen_guest;
+
+    /* char* */
+    u64 boot_loader_name;
+    u64 kextra;
+
+    /* uint32_t */
+    u32 mem_lower;
+    u32 mem_upper;
+
+    /* uint32_t */
+    u32 mmap_length;
+    /* paddr_t */
+    u64 mmap_addr;
+};
+
+struct __packed boot_info {
+    /* char* */
+    u64 cmdline;
+
+    /* uint32_t */
+    u32 nr_mods;
+    /* struct boot_module* */
+    u64 mods;
+
+    /* struct arch_boot_info* */
+    u64 arch;
+};
+
+#endif
diff --git a/xen/arch/x86/boot/defs.h b/xen/arch/x86/boot/defs.h
index f9840044ec..d742a2b52a 100644
--- a/xen/arch/x86/boot/defs.h
+++ b/xen/arch/x86/boot/defs.h
@@ -22,11 +22,11 @@
 
 #include "../../../include/xen/stdbool.h"
 
-#define __maybe_unused	__attribute__((__unused__))
-#define __packed	__attribute__((__packed__))
-#define __stdcall	__attribute__((__stdcall__))
+#define __maybe_unused  __attribute__((__unused__))
+#define __packed        __attribute__((__packed__))
+#define __stdcall       __attribute__((__stdcall__))
 
-#define NULL		((void *)0)
+#define NULL            ((void *)0)
 
 #define ALIGN_UP(arg, align) \
                 (((arg) + (align) - 1) & ~((typeof(arg))(align) - 1))
@@ -43,9 +43,10 @@
         (void) (&_x == &_y);            \
         _x > _y ? _x : _y; })
 
-#define _p(val)		((void *)(unsigned long)(val))
+#define _p(val)     ((void *)(unsigned long)(val))
+#define _addr(val)  ((unsigned long)(void *)(val))
 
-#define tolower(c)	((c) | 0x20)
+#define tolower(c)  ((c) | 0x20)
 
 typedef unsigned char u8;
 typedef unsigned short u16;
@@ -57,7 +58,7 @@ typedef u16 uint16_t;
 typedef u32 uint32_t;
 typedef u64 uint64_t;
 
-#define U16_MAX		((u16)(~0U))
-#define UINT_MAX	(~0U)
+#define U16_MAX     ((u16)(~0U))
+#define UINT_MAX    (~0U)
 
 #endif /* __BOOT_DEFS_H__ */
diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index e22bb974bf..4c40cadff6 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -27,6 +27,7 @@ asm (
     );
 
 #include "defs.h"
+#include "boot_info32.h"
 #include "../../../include/xen/multiboot.h"
 #include "../../../include/xen/multiboot2.h"
 
@@ -138,65 +139,116 @@ static struct hvm_start_info *pvh_info_reloc(u32 in)
     return out;
 }
 
-static multiboot_info_t *mbi_reloc(u32 mbi_in)
+static struct boot_info *mbi_reloc(u32 mbi_in)
 {
+    const multiboot_info_t *mbi = _p(mbi_in);
+    struct boot_info *binfo;
+    struct arch_boot_info *arch_binfo;
     int i;
-    multiboot_info_t *mbi_out;
+    uint32_t ptr;
 
-    mbi_out = _p(copy_mem(mbi_in, sizeof(*mbi_out)));
+    ptr = alloc_mem(sizeof(*binfo));
+    zero_mem(ptr, sizeof(*binfo));
+    binfo = _p(ptr);
 
-    if ( mbi_out->flags & MBI_CMDLINE )
-        mbi_out->cmdline = copy_string(mbi_out->cmdline);
+    ptr = alloc_mem(sizeof(*arch_binfo));
+    zero_mem(ptr, sizeof(*arch_binfo));
+    binfo->arch = ptr;
+    arch_binfo = _p(ptr);
 
-    if ( mbi_out->flags & MBI_MODULES )
+    if ( mbi->flags & MBI_CMDLINE )
+    {
+        ptr = copy_string(mbi->cmdline);
+        binfo->cmdline = ptr;
+        arch_binfo->flags |= BOOTINFO_FLAG_X86_CMDLINE;
+    }
+
+    if ( mbi->flags & MBI_MODULES )
     {
         module_t *mods;
+        struct boot_module *bi_mods;
+        struct arch_bootmodule *arch_bi_mods;
+
+        /*
+         * We have to allocate one more module slot here. At some point
+         * __start_xen() may put Xen image placement into it.
+         */
+        ptr = alloc_mem((mbi->mods_count + 1) * sizeof(*bi_mods));
+        binfo->nr_mods = mbi->mods_count;
+        binfo->mods = ptr;
+        bi_mods = _p(ptr);
 
-        mbi_out->mods_addr = copy_mem(mbi_out->mods_addr,
-                                      mbi_out->mods_count * sizeof(module_t));
+        ptr = alloc_mem((mbi->mods_count + 1) * sizeof(*arch_bi_mods));
+        arch_bi_mods = _p(ptr);
 
-        mods = _p(mbi_out->mods_addr);
+        /* map the +1 allocated for Xen image */
+        bi_mods[mbi->mods_count].arch = _addr(&arch_bi_mods[mbi->mods_count]);
 
-        for ( i = 0; i < mbi_out->mods_count; i++ )
+        arch_binfo->flags |= BOOTINFO_FLAG_X86_MODULES;
+
+        mods = _p(mbi->mods_addr);
+
+        for ( i = 0; i < mbi->mods_count; i++ )
         {
+            bi_mods[i].start = mods[i].mod_start;
+            bi_mods[i].size = mods[i].mod_end - mods[i].mod_start;
+
             if ( mods[i].string )
-                mods[i].string = copy_string(mods[i].string);
+            {
+                int j;
+                char *c = _p(mods[i].string);
+
+                for ( j = 0; *c != '\0'; j++, c++ )
+                    bi_mods[i].string.bytes[j] = *c;
+
+                bi_mods[i].string.len = j + 1;
+            }
+
+            bi_mods[i].arch = _addr(&arch_bi_mods[i]);
         }
     }
 
-    if ( mbi_out->flags & MBI_MEMMAP )
-        mbi_out->mmap_addr = copy_mem(mbi_out->mmap_addr, mbi_out->mmap_length);
-
-    if ( mbi_out->flags & MBI_LOADERNAME )
-        mbi_out->boot_loader_name = copy_string(mbi_out->boot_loader_name);
+    if ( mbi->flags & MBI_MEMMAP )
+    {
+        arch_binfo->mmap_addr = copy_mem(mbi->mmap_addr, mbi->mmap_length);
+        arch_binfo->mmap_length = mbi->mmap_length;
+        arch_binfo->flags |= BOOTINFO_FLAG_X86_MEMMAP;
+    }
 
-    /* Mask features we don't understand or don't relocate. */
-    mbi_out->flags &= (MBI_MEMLIMITS |
-                       MBI_CMDLINE |
-                       MBI_MODULES |
-                       MBI_MEMMAP |
-                       MBI_LOADERNAME);
+    if ( mbi->flags & MBI_LOADERNAME )
+    {
+        ptr = copy_string(mbi->boot_loader_name);
+        arch_binfo->boot_loader_name = ptr;
+        arch_binfo->flags |= BOOTINFO_FLAG_X86_LOADERNAME;
+    }
 
-    return mbi_out;
+    return binfo;
 }
 
-static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out)
+static struct boot_info *mbi2_reloc(uint32_t mbi_in, uint32_t video_out)
 {
     const multiboot2_fixed_t *mbi_fix = _p(mbi_in);
     const multiboot2_memory_map_t *mmap_src;
     const multiboot2_tag_t *tag;
-    module_t *mbi_out_mods = NULL;
     memory_map_t *mmap_dst;
-    multiboot_info_t *mbi_out;
+    struct boot_info *binfo;
+    struct arch_boot_info *arch_binfo;
+    struct boot_module *bi_mods;
+    struct arch_bootmodule *arch_bi_mods;
 #ifdef CONFIG_VIDEO
     struct boot_video_info *video = NULL;
 #endif
     u32 ptr;
     unsigned int i, mod_idx = 0;
 
-    ptr = alloc_mem(sizeof(*mbi_out));
-    mbi_out = _p(ptr);
-    zero_mem(ptr, sizeof(*mbi_out));
+    ptr = alloc_mem(sizeof(*binfo));
+    zero_mem(ptr, sizeof(*binfo));
+    binfo = _p(ptr);
+
+    ptr = alloc_mem(sizeof(*arch_binfo));
+    zero_mem(ptr, sizeof(*arch_binfo));
+    binfo->arch = ptr;
+    arch_binfo = _p(ptr);
 
     /* Skip Multiboot2 information fixed part. */
     ptr = ALIGN_UP(mbi_in + sizeof(*mbi_fix), MULTIBOOT2_TAG_ALIGN);
@@ -206,21 +258,28 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out)
           tag = _p(ALIGN_UP((u32)tag + tag->size, MULTIBOOT2_TAG_ALIGN)) )
     {
         if ( tag->type == MULTIBOOT2_TAG_TYPE_MODULE )
-            ++mbi_out->mods_count;
+            ++binfo->nr_mods;
         else if ( tag->type == MULTIBOOT2_TAG_TYPE_END )
             break;
     }
 
-    if ( mbi_out->mods_count )
+    if ( binfo->nr_mods )
     {
-        mbi_out->flags |= MBI_MODULES;
         /*
          * We have to allocate one more module slot here. At some point
          * __start_xen() may put Xen image placement into it.
          */
-        mbi_out->mods_addr = alloc_mem((mbi_out->mods_count + 1) *
-                                       sizeof(*mbi_out_mods));
-        mbi_out_mods = _p(mbi_out->mods_addr);
+        ptr = alloc_mem((binfo->nr_mods + 1) * sizeof(*bi_mods));
+        binfo->mods = ptr;
+        bi_mods = _p(ptr);
+
+        ptr = alloc_mem((binfo->nr_mods + 1) * sizeof(*arch_bi_mods));
+        arch_bi_mods = _p(ptr);
+
+        /* map the +1 allocated for Xen image */
+        bi_mods[binfo->nr_mods].arch = _addr(&arch_bi_mods[binfo->nr_mods]);
+
+        arch_binfo->flags |= BOOTINFO_FLAG_X86_MODULES;
     }
 
     /* Skip Multiboot2 information fixed part. */
@@ -232,39 +291,38 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out)
         switch ( tag->type )
         {
         case MULTIBOOT2_TAG_TYPE_BOOT_LOADER_NAME:
-            mbi_out->flags |= MBI_LOADERNAME;
             ptr = get_mb2_string(tag, string, string);
-            mbi_out->boot_loader_name = copy_string(ptr);
+            arch_binfo->boot_loader_name = copy_string(ptr);
+            arch_binfo->flags |= BOOTINFO_FLAG_X86_LOADERNAME;
             break;
 
         case MULTIBOOT2_TAG_TYPE_CMDLINE:
-            mbi_out->flags |= MBI_CMDLINE;
             ptr = get_mb2_string(tag, string, string);
-            mbi_out->cmdline = copy_string(ptr);
+            binfo->cmdline = copy_string(ptr);
+            arch_binfo->flags |= BOOTINFO_FLAG_X86_CMDLINE;
             break;
 
         case MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO:
-            mbi_out->flags |= MBI_MEMLIMITS;
-            mbi_out->mem_lower = get_mb2_data(tag, basic_meminfo, mem_lower);
-            mbi_out->mem_upper = get_mb2_data(tag, basic_meminfo, mem_upper);
+            arch_binfo->mem_lower = get_mb2_data(tag, basic_meminfo, mem_lower);
+            arch_binfo->mem_upper = get_mb2_data(tag, basic_meminfo, mem_upper);
             break;
 
         case MULTIBOOT2_TAG_TYPE_MMAP:
             if ( get_mb2_data(tag, mmap, entry_size) < sizeof(*mmap_src) )
                 break;
 
-            mbi_out->flags |= MBI_MEMMAP;
-            mbi_out->mmap_length = get_mb2_data(tag, mmap, size);
-            mbi_out->mmap_length -= sizeof(multiboot2_tag_mmap_t);
-            mbi_out->mmap_length /= get_mb2_data(tag, mmap, entry_size);
-            mbi_out->mmap_length *= sizeof(*mmap_dst);
+            arch_binfo->mmap_length = get_mb2_data(tag, mmap, size);
+            arch_binfo->mmap_length -= sizeof(multiboot2_tag_mmap_t);
+            arch_binfo->mmap_length /= get_mb2_data(tag, mmap, entry_size);
+            arch_binfo->mmap_length *= sizeof(*mmap_dst);
 
-            mbi_out->mmap_addr = alloc_mem(mbi_out->mmap_length);
+            arch_binfo->mmap_addr = alloc_mem(arch_binfo->mmap_length);
+            arch_binfo->flags |= BOOTINFO_FLAG_X86_MEMMAP;
 
             mmap_src = get_mb2_data(tag, mmap, entries);
-            mmap_dst = _p(mbi_out->mmap_addr);
+            mmap_dst = _p(arch_binfo->mmap_addr);
 
-            for ( i = 0; i < mbi_out->mmap_length / sizeof(*mmap_dst); i++ )
+            for ( i = 0; i < arch_binfo->mmap_length / sizeof(*mmap_dst); i++ )
             {
                 /* Init size member properly. */
                 mmap_dst[i].size = sizeof(*mmap_dst);
@@ -280,14 +338,27 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out)
             break;
 
         case MULTIBOOT2_TAG_TYPE_MODULE:
-            if ( mod_idx >= mbi_out->mods_count )
+            if ( mod_idx >= binfo->nr_mods )
                 break;
 
-            mbi_out_mods[mod_idx].mod_start = get_mb2_data(tag, module, mod_start);
-            mbi_out_mods[mod_idx].mod_end = get_mb2_data(tag, module, mod_end);
+            bi_mods[mod_idx].start = get_mb2_data(tag, module, mod_start);
+            bi_mods[mod_idx].size = get_mb2_data(tag, module, mod_end)
+                                            - bi_mods[mod_idx].start;
+
             ptr = get_mb2_string(tag, module, cmdline);
-            mbi_out_mods[mod_idx].string = copy_string(ptr);
-            mbi_out_mods[mod_idx].reserved = 0;
+            if ( ptr )
+            {
+                int i;
+                char *c = _p(ptr);
+
+                for ( i = 0; *c != '\0'; i++, c++ )
+                    bi_mods[mod_idx].string.bytes[i] = *c;
+
+                bi_mods[mod_idx].string.len = i + 1;
+            }
+
+            bi_mods[mod_idx].arch = _addr(&arch_bi_mods[mod_idx]);
+
             ++mod_idx;
             break;
 
@@ -344,11 +415,11 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out)
         video->orig_video_isVGA = 0x23;
 #endif
 
-    return mbi_out;
+    return binfo;
 }
 
-void *__stdcall reloc(uint32_t magic, uint32_t in, uint32_t trampoline,
-                      uint32_t video_info)
+void *__stdcall reloc(
+    uint32_t magic, uint32_t in, uint32_t trampoline, uint32_t video_info)
 {
     alloc = trampoline;
 
diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
index 4e1a799749..933eb30a28 100644
--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -11,14 +11,17 @@
 #include <asm/setup.h>
 
 static struct file __initdata ucode;
-static multiboot_info_t __initdata mbi = {
-    .flags = MBI_MODULES | MBI_LOADERNAME
-};
+
+static struct boot_info __initdata efi_bi;
+static struct arch_boot_info __initdata efi_bi_arch;
 /*
  * The array size needs to be one larger than the number of modules we
  * support - see __start_xen().
  */
-static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
+static struct boot_module __initdata efi_mods[CONFIG_NR_BOOTMODS + 1];
+static struct arch_bootmodule __initdata efi_arch_mods[CONFIG_NR_BOOTMODS + 1];
+
+static const char *__initdata efi_loader = "EFI";
 
 static void __init edd_put_string(u8 *dst, size_t n, const char *src)
 {
@@ -269,20 +272,37 @@ static void __init noreturn efi_arch_post_exit_boot(void)
                    : [cr3] "r" (idle_pg_table),
                      [cs] "i" (__HYPERVISOR_CS),
                      [ds] "r" (__HYPERVISOR_DS),
-                     "D" (&mbi)
+                     "D" (&efi_bi)
                    : "memory" );
     unreachable();
 }
 
-static void __init efi_arch_cfg_file_early(const EFI_LOADED_IMAGE *image,
-                                           EFI_FILE_HANDLE dir_handle,
-                                           const char *section)
+static struct boot_info __init *efi_arch_bootinfo_init(void)
 {
+    int i;
+
+    efi_bi.arch = &efi_bi_arch;
+    efi_bi.mods = efi_mods;
+
+    for ( i = 0; i <= CONFIG_NR_BOOTMODS; i++ )
+        efi_bi.mods[i].arch = &efi_arch_mods[i];
+
+    efi_bi_arch.boot_loader_name = _p(efi_loader);
+
+    efi_bi_arch.flags = BOOTINFO_FLAG_X86_MODULES |
+                        BOOTINFO_FLAG_X86_LOADERNAME;
+    return &efi_bi;
 }
 
-static void __init efi_arch_cfg_file_late(const EFI_LOADED_IMAGE *image,
-                                          EFI_FILE_HANDLE dir_handle,
-                                          const char *section)
+static void __init efi_arch_cfg_file_early(
+    const EFI_LOADED_IMAGE *image, EFI_FILE_HANDLE dir_handle,
+    const char *section)
+{
+}
+
+static void __init efi_arch_cfg_file_late(
+    const EFI_LOADED_IMAGE *image, EFI_FILE_HANDLE dir_handle,
+    const char *section)
 {
     union string name;
 
@@ -294,16 +314,15 @@ static void __init efi_arch_cfg_file_late(const EFI_LOADED_IMAGE *image,
         name.s = get_value(&cfg, "global", "ucode");
     if ( name.s )
     {
-        microcode_set_module(mbi.mods_count);
+        microcode_set_module(efi_bi.nr_mods);
         split_string(name.s);
         read_file(dir_handle, s2w(&name), &ucode, NULL);
         efi_bs->FreePool(name.w);
     }
 }
 
-static void __init efi_arch_handle_cmdline(CHAR16 *image_name,
-                                           CHAR16 *cmdline_options,
-                                           const char *cfgfile_options)
+static void __init efi_arch_handle_cmdline(
+    CHAR16 *image_name, CHAR16 *cmdline_options, const char *cfgfile_options)
 {
     union string name;
 
@@ -311,10 +330,10 @@ static void __init efi_arch_handle_cmdline(CHAR16 *image_name,
     {
         name.w = cmdline_options;
         w2s(&name);
-        place_string(&mbi.cmdline, name.s);
+        place_string((uint32_t *)efi_bi.cmdline, name.s);
     }
     if ( cfgfile_options )
-        place_string(&mbi.cmdline, cfgfile_options);
+        place_string((uint32_t *)efi_bi.cmdline, cfgfile_options);
     /* Insert image name last, as it gets prefixed to the other options. */
     if ( image_name )
     {
@@ -323,16 +342,10 @@ static void __init efi_arch_handle_cmdline(CHAR16 *image_name,
     }
     else
         name.s = "xen";
-    place_string(&mbi.cmdline, name.s);
+    place_string((uint32_t *)efi_bi.cmdline, name.s);
 
-    if ( mbi.cmdline )
-        mbi.flags |= MBI_CMDLINE;
-    /*
-     * These must not be initialized statically, since the value must
-     * not get relocated when processing base relocations later.
-     */
-    mbi.boot_loader_name = (long)"EFI";
-    mbi.mods_addr = (long)mb_modules;
+    if ( efi_bi.cmdline )
+        efi_bi_arch.flags |= BOOTINFO_FLAG_X86_CMDLINE;
 }
 
 static void __init efi_arch_edd(void)
@@ -695,9 +708,8 @@ static void __init efi_arch_memory_setup(void)
 #undef l2_4G_offset
 }
 
-static void __init efi_arch_handle_module(const struct file *file,
-                                          const CHAR16 *name,
-                                          const char *options)
+static void __init efi_arch_handle_module(
+    const struct file *file, const CHAR16 *name, const char *options)
 {
     union string local_name;
     void *ptr;
@@ -715,17 +727,25 @@ static void __init efi_arch_handle_module(const struct file *file,
     w2s(&local_name);
 
     /*
-     * If options are provided, put them in
-     * mb_modules[mbi.mods_count].string after the filename, with a space
-     * separating them.  place_string() prepends strings and adds separating
-     * spaces, so the call order is reversed.
+     * Set module string to filename and if options are provided, put them in
+     * after the filename, with a space separating them.
      */
+    strlcpy(efi_bi.mods[efi_bi.nr_mods].string.bytes, local_name.s,
+            BOOTMOD_MAX_STRING);
     if ( options )
-        place_string(&mb_modules[mbi.mods_count].string, options);
-    place_string(&mb_modules[mbi.mods_count].string, local_name.s);
-    mb_modules[mbi.mods_count].mod_start = file->addr >> PAGE_SHIFT;
-    mb_modules[mbi.mods_count].mod_end = file->size;
-    ++mbi.mods_count;
+    {
+        strlcat(efi_bi.mods[efi_bi.nr_mods].string.bytes, " ",
+                BOOTMOD_MAX_STRING);
+        strlcat(efi_bi.mods[efi_bi.nr_mods].string.bytes, options,
+                BOOTMOD_MAX_STRING);
+    }
+    efi_bi.mods[efi_bi.nr_mods].string.kind = BOOTSTR_CMDLINE;
+
+    efi_bi.mods[efi_bi.nr_mods].start = file->addr;
+    efi_bi.mods[efi_bi.nr_mods].mfn = maddr_to_mfn(file->addr);
+    efi_bi.mods[efi_bi.nr_mods].size = file->size;
+
+    ++efi_bi.nr_mods;
     efi_bs->FreePool(ptr);
 }
 
diff --git a/xen/arch/x86/guest/xen/pvh-boot.c b/xen/arch/x86/guest/xen/pvh-boot.c
index 834b1ad16b..28cf5df0a3 100644
--- a/xen/arch/x86/guest/xen/pvh-boot.c
+++ b/xen/arch/x86/guest/xen/pvh-boot.c
@@ -18,6 +18,7 @@
  *
  * Copyright (c) 2017 Citrix Systems Ltd.
  */
+#include <xen/bootinfo.h>
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/mm.h>
@@ -31,12 +32,28 @@
 bool __initdata pvh_boot;
 uint32_t __initdata pvh_start_info_pa;
 
-static multiboot_info_t __initdata pvh_mbi;
-static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];
-static const char *__initdata pvh_loader = "PVH Directboot";
+static struct boot_info __initdata pvh_bi;
+static struct arch_boot_info __initdata arch_pvh_bi;
+static struct boot_module __initdata pvh_mods[CONFIG_NR_BOOTMODS + 1];
+static struct arch_bootmodule __initdata arch_pvh_mods[CONFIG_NR_BOOTMODS + 1];
+static char __initdata *pvh_loader = "PVH Directboot";
 
-static void __init convert_pvh_info(multiboot_info_t **mbi,
-                                    module_t **mod)
+static struct boot_info __init *init_pvh_info(void)
+{
+    int i;
+
+    pvh_bi.arch = &arch_pvh_bi;
+    pvh_bi.mods = pvh_mods;
+
+    for ( i = 0; i <= CONFIG_NR_BOOTMODS; i++ )
+        pvh_bi.mods[i].arch = &arch_pvh_mods[i];
+
+    pvh_bi.arch->boot_loader_name = pvh_loader;
+
+    return &pvh_bi;
+}
+
+static void __init convert_pvh_info(struct boot_info *bi)
 {
     const struct hvm_start_info *pvh_info = __va(pvh_start_info_pa);
     const struct hvm_modlist_entry *entry;
@@ -50,23 +67,22 @@ static void __init convert_pvh_info(multiboot_info_t **mbi,
      * required. The extra element is used to aid relocation. See
      * arch/x86/setup.c:__start_xen().
      */
-    if ( ARRAY_SIZE(pvh_mbi_mods) <= pvh_info->nr_modules )
+    if ( ARRAY_SIZE(pvh_mods) <= pvh_info->nr_modules )
         panic("The module array is too small, size %zu, requested %u\n",
-              ARRAY_SIZE(pvh_mbi_mods), pvh_info->nr_modules);
+              ARRAY_SIZE(pvh_mods), pvh_info->nr_modules);
 
     /*
      * Turn hvm_start_info into mbi. Luckily all modules are placed under 4GB
      * boundary on x86.
      */
-    pvh_mbi.flags = MBI_CMDLINE | MBI_MODULES | MBI_LOADERNAME;
+    bi->arch->flags = BOOTINFO_FLAG_X86_CMDLINE | BOOTINFO_FLAG_X86_MODULES
+                      | BOOTINFO_FLAG_X86_LOADERNAME;
 
     BUG_ON(pvh_info->cmdline_paddr >> 32);
-    pvh_mbi.cmdline = pvh_info->cmdline_paddr;
-    pvh_mbi.boot_loader_name = __pa(pvh_loader);
+    bi->cmdline = _p(__va(pvh_info->cmdline_paddr));
 
-    BUG_ON(pvh_info->nr_modules >= ARRAY_SIZE(pvh_mbi_mods));
-    pvh_mbi.mods_count = pvh_info->nr_modules;
-    pvh_mbi.mods_addr = __pa(pvh_mbi_mods);
+    BUG_ON(pvh_info->nr_modules >= ARRAY_SIZE(pvh_mods));
+    bi->nr_mods = pvh_info->nr_modules;
 
     entry = __va(pvh_info->modlist_paddr);
     for ( i = 0; i < pvh_info->nr_modules; i++ )
@@ -74,15 +90,18 @@ static void __init convert_pvh_info(multiboot_info_t **mbi,
         BUG_ON(entry[i].paddr >> 32);
         BUG_ON(entry[i].cmdline_paddr >> 32);
 
-        pvh_mbi_mods[i].mod_start = entry[i].paddr;
-        pvh_mbi_mods[i].mod_end   = entry[i].paddr + entry[i].size;
-        pvh_mbi_mods[i].string    = entry[i].cmdline_paddr;
+        bi->mods[i].start = entry[i].paddr;
+        bi->mods[i].size  = entry[i].size;
+        if ( entry[i].cmdline_paddr )
+        {
+            char *c = _p(__va(entry[i].cmdline_paddr));
+
+            safe_strcpy(bi->mods[i].string.bytes, c);
+            bi->mods[i].string.kind = BOOTSTR_CMDLINE;
+        }
     }
 
     rsdp_hint = pvh_info->rsdp_paddr;
-
-    *mbi = &pvh_mbi;
-    *mod = pvh_mbi_mods;
 }
 
 static void __init get_memory_map(void)
@@ -99,13 +118,16 @@ static void __init get_memory_map(void)
     sanitize_e820_map(e820_raw.map, &e820_raw.nr_map);
 }
 
-void __init pvh_init(multiboot_info_t **mbi, module_t **mod)
+void __init pvh_init(struct boot_info **bi)
 {
-    convert_pvh_info(mbi, mod);
+    *bi = init_pvh_info();
+    convert_pvh_info(*bi);
 
     hypervisor_probe();
     ASSERT(xen_guest);
 
+    (*bi)->arch->xen_guest = xen_guest;
+
     get_memory_map();
 }
 
diff --git a/xen/arch/x86/include/asm/guest/pvh-boot.h b/xen/arch/x86/include/asm/guest/pvh-boot.h
index 48ffd1a0b1..120baf4ebb 100644
--- a/xen/arch/x86/include/asm/guest/pvh-boot.h
+++ b/xen/arch/x86/include/asm/guest/pvh-boot.h
@@ -19,13 +19,13 @@
 #ifndef __X86_PVH_BOOT_H__
 #define __X86_PVH_BOOT_H__
 
-#include <xen/multiboot.h>
+#include <xen/bootinfo.h>
 
 #ifdef CONFIG_PVH_GUEST
 
 extern bool pvh_boot;
 
-void pvh_init(multiboot_info_t **mbi, module_t **mod);
+void __init pvh_init(struct boot_info **bi);
 void pvh_print_info(void);
 
 #else
@@ -34,7 +34,7 @@ void pvh_print_info(void);
 
 #define pvh_boot 0
 
-static inline void pvh_init(multiboot_info_t **mbi, module_t **mod)
+static inline void __init pvh_init(struct boot_info **bi)
 {
     ASSERT_UNREACHABLE();
 }
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 2700f4eb3e..ad37f4a658 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -13,7 +13,6 @@
 #include <xen/console.h>
 #include <xen/serial.h>
 #include <xen/trace.h>
-#include <xen/multiboot.h>
 #include <xen/domain_page.h>
 #include <xen/version.h>
 #include <xen/hypercall.h>
@@ -273,47 +272,6 @@ custom_param("acpi", parse_acpi_param);
 
 struct boot_info __initdata *boot_info;
 
-static void __init mb_to_bootinfo(multiboot_info_t *mbi, module_t *mods)
-{
-    static struct boot_info       __initdata x86_binfo;
-    static struct arch_boot_info  __initdata arch_x86_binfo;
-    static struct boot_module     __initdata x86_mods[CONFIG_NR_BOOTMODS + 1];
-    static struct arch_bootmodule __initdata
-                                        arch_x86_mods[CONFIG_NR_BOOTMODS + 1];
-    int i;
-
-    x86_binfo.arch = &arch_x86_binfo;
-    x86_binfo.mods = x86_mods;
-
-    x86_binfo.cmdline = __va(mbi->cmdline);
-
-    /* The BOOTINFO_FLAG_X86_* flags are a 1-1 map to MBI_* */
-    arch_x86_binfo.flags = mbi->flags;
-    arch_x86_binfo.mem_upper = mbi->mem_upper;
-    arch_x86_binfo.mem_lower = mbi->mem_lower;
-    arch_x86_binfo.mmap_length = mbi->mmap_length;
-    arch_x86_binfo.mmap_addr = mbi->mmap_addr;
-    arch_x86_binfo.boot_loader_name = __va(mbi->boot_loader_name);
-
-    x86_binfo.nr_mods = mbi->mods_count;
-    for ( i = 0; i <= CONFIG_NR_BOOTMODS; i++)
-    {
-        x86_mods[i].arch = &arch_x86_mods[i];
-
-        if ( i < x86_binfo.nr_mods )
-        {
-            bootmodule_update_start(&x86_mods[i], mods[i].mod_start);
-            x86_mods[i].size = mods[i].mod_end - mods[i].mod_start;
-
-            x86_mods[i].string.len = strlcpy(x86_mods[i].string.bytes,
-                                              __va(mods[i].string),
-                                              BOOTMOD_MAX_STRING);
-        }
-    }
-
-    boot_info = &x86_binfo;
-}
-
 unsigned long __init initial_images_nrpages(nodeid_t node)
 {
     unsigned long node_start = node_start_pfn(node);
@@ -900,15 +858,13 @@ static struct domain *__init create_dom0(
 /* How much of the directmap is prebuilt at compile time. */
 #define PREBUILT_MAP_LIMIT (1 << L2_PAGETABLE_SHIFT)
 
-void __init noreturn __start_xen(unsigned long mbi_p)
+void __init noreturn __start_xen(unsigned long bi_p)
 {
     char *memmap_type = NULL;
     char *cmdline, *kextra, *loader;
     void *bsp_stack;
     struct cpu_info *info = get_cpu_info(), *bsp_info;
     unsigned int initrdidx, num_parked = 0;
-    multiboot_info_t *mbi;
-    module_t *mod;
     unsigned long nr_pages, raw_max_page;
     int i, j, e820_warn = 0, bytes = 0;
     unsigned long eb_start, eb_end;
@@ -945,16 +901,29 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     if ( pvh_boot )
     {
-        ASSERT(mbi_p == 0);
-        pvh_init(&mbi, &mod);
+        ASSERT(bi_p == 0);
+        pvh_init(&boot_info);
     }
     else
     {
-        mbi = __va(mbi_p);
-        mod = __va(mbi->mods_addr);
-    }
+        /*
+         * Since addresses were setup before virtual addressing was enabled,
+         * fixup pointers to virtual addresses for proper dereferencing.
+         */
+        boot_info = __va(bi_p);
+        boot_info->cmdline = __va(boot_info->cmdline);
+        boot_info->mods = __va(boot_info->mods);
+        boot_info->arch = __va(boot_info->arch);
+
+        boot_info->arch->boot_loader_name =
+            __va(boot_info->arch->boot_loader_name);
 
-    mb_to_bootinfo(mbi, mod);
+        for ( i = 0; i <= boot_info->nr_mods; i++ )
+        {
+            boot_info->mods[i].mfn = maddr_to_mfn(boot_info->mods[i].start);
+            boot_info->mods[i].arch = __va(boot_info->mods[i].arch);
+        }
+    }
 
     loader = (boot_info->arch->flags & BOOTINFO_FLAG_X86_LOADERNAME)
         ? boot_info->arch->boot_loader_name : "unknown";
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index a25e1d29f1..287e48b49a 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -3,6 +3,7 @@
 #include <efi/efipciio.h>
 #include <public/xen.h>
 #include <xen/bitops.h>
+#include <xen/bootinfo.h>
 #include <xen/compile.h>
 #include <xen/ctype.h>
 #include <xen/dmi.h>
@@ -11,7 +12,6 @@
 #include <xen/keyhandler.h>
 #include <xen/lib.h>
 #include <xen/mm.h>
-#include <xen/multiboot.h>
 #include <xen/param.h>
 #include <xen/pci_regs.h>
 #include <xen/pfn.h>
@@ -1222,6 +1222,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 
     efi_arch_relocate_image(0);
 
+    efi_arch_bootinfo_init();
+
     if ( use_cfg_file )
     {
         EFI_FILE_HANDLE dir_handle;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v1 05/18] x86: refactor xen cmdline into general framework
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (3 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 04/18] x86: refactor entrypoints to new boot info Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-19 13:26   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 06/18] fdt: make fdt handling reusable across arch Daniel P. Smith
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini

This refactors xen cmdline processing into a general framework
under the new boot info abstraction.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/include/asm/bootinfo.h | 49 ++++++++++++++++++++++++
 xen/arch/x86/setup.c                | 58 ++++-------------------------
 xen/include/xen/bootinfo.h          | 11 ++++++
 3 files changed, 68 insertions(+), 50 deletions(-)

diff --git a/xen/arch/x86/include/asm/bootinfo.h b/xen/arch/x86/include/asm/bootinfo.h
index e5135e402b..2fcd576023 100644
--- a/xen/arch/x86/include/asm/bootinfo.h
+++ b/xen/arch/x86/include/asm/bootinfo.h
@@ -45,4 +45,53 @@ struct __packed mb_memmap {
     uint32_t type;
 };
 
+static inline bool loader_is_grub2(const char *loader_name)
+{
+    /* GRUB1="GNU GRUB 0.xx"; GRUB2="GRUB 1.xx" */
+    const char *p = strstr(loader_name, "GRUB ");
+    return (p != NULL) && (p[5] != '0');
+}
+
+static inline char *arch_prepare_cmdline(char *p, struct arch_boot_info *arch)
+{
+    p = p ? : "";
+
+    /* Strip leading whitespace. */
+    while ( *p == ' ' )
+        p++;
+
+    /* GRUB2 and PVH don't include image name as first item on command line. */
+    if ( !(arch->xenguest || loader_is_grub2(arch->boot_loader_name)) )
+    {
+        /* Strip image name plus whitespace. */
+        while ( (*p != ' ') && (*p != '\0') )
+            p++;
+        while ( *p == ' ' )
+            p++;
+    }
+
+    return p;
+}
+
+static inline char *arch_bootinfo_prepare_cmdline(
+    char *cmdline, struct arch_boot_info *arch)
+{
+    if ( !cmdline )
+        return "";
+
+    if ( (arch->kextra = strstr(cmdline, " -- ")) != NULL )
+    {
+        /*
+         * Options after ' -- ' separator belong to dom0.
+         *  1. Orphan dom0's options from Xen's command line.
+         *  2. Skip all but final leading space from dom0's options.
+         */
+        *arch->kextra = '\0';
+        arch->kextra += 3;
+        while ( arch->kextra[1] == ' ' ) arch->kextra++;
+    }
+
+
+    return arch_prepare_cmdline(cmdline, arch);
+}
 #endif
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index ad37f4a658..e4060d6219 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -716,34 +716,6 @@ ignore_param("edid");
  */
 ignore_param("placeholder");
 
-static bool __init loader_is_grub2(const char *loader_name)
-{
-    /* GRUB1="GNU GRUB 0.xx"; GRUB2="GRUB 1.xx" */
-    const char *p = strstr(loader_name, "GRUB ");
-    return (p != NULL) && (p[5] != '0');
-}
-
-static char * __init cmdline_cook(char *p, const char *loader_name)
-{
-    p = p ? : "";
-
-    /* Strip leading whitespace. */
-    while ( *p == ' ' )
-        p++;
-
-    /* GRUB2 and PVH don't not include image name as first item on command line. */
-    if ( xen_guest || loader_is_grub2(loader_name) )
-        return p;
-
-    /* Strip image name plus whitespace. */
-    while ( (*p != ' ') && (*p != '\0') )
-        p++;
-    while ( *p == ' ' )
-        p++;
-
-    return p;
-}
-
 static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int limit)
 {
     unsigned int n = min(bootsym(bios_e820nr), limit);
@@ -754,8 +726,7 @@ static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int li
     return n;
 }
 
-static struct domain *__init create_dom0(
-    const struct boot_info *bi, const char *kextra, const char *loader)
+static struct domain *__init create_dom0(const struct boot_info *bi)
 {
     struct xen_domctl_createdomain dom0_cfg = {
         .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
@@ -804,16 +775,16 @@ static struct domain *__init create_dom0(
     /* Grab the DOM0 command line. */
     cmdline = (image->string.kind == BOOTSTR_CMDLINE) ?
               image->string.bytes : NULL;
-    if ( cmdline || kextra )
+    if ( cmdline || bi->arch->kextra )
     {
         static char __initdata dom0_cmdline[MAX_GUEST_CMDLINE];
 
-        cmdline = cmdline_cook(cmdline, loader);
+        cmdline = arch_prepare_cmdline(cmdline, bi->arch);
         safe_strcpy(dom0_cmdline, cmdline);
 
-        if ( kextra )
+        if ( bi->arch->kextra )
             /* kextra always includes exactly one leading space. */
-            safe_strcat(dom0_cmdline, kextra);
+            safe_strcat(dom0_cmdline, bi->arch->kextra);
 
         /* Append any extra parameters. */
         if ( skip_ioapic_setup && !strstr(dom0_cmdline, "noapic") )
@@ -861,7 +832,7 @@ static struct domain *__init create_dom0(
 void __init noreturn __start_xen(unsigned long bi_p)
 {
     char *memmap_type = NULL;
-    char *cmdline, *kextra, *loader;
+    char *cmdline, *loader;
     void *bsp_stack;
     struct cpu_info *info = get_cpu_info(), *bsp_info;
     unsigned int initrdidx, num_parked = 0;
@@ -929,20 +900,7 @@ void __init noreturn __start_xen(unsigned long bi_p)
         ? boot_info->arch->boot_loader_name : "unknown";
 
     /* Parse the command-line options. */
-    cmdline = cmdline_cook((boot_info->arch->flags & BOOTINFO_FLAG_X86_CMDLINE) ?
-                            boot_info->cmdline : NULL,
-                           loader);
-    if ( (kextra = strstr(cmdline, " -- ")) != NULL )
-    {
-        /*
-         * Options after ' -- ' separator belong to dom0.
-         *  1. Orphan dom0's options from Xen's command line.
-         *  2. Skip all but final leading space from dom0's options.
-         */
-        *kextra = '\0';
-        kextra += 3;
-        while ( kextra[1] == ' ' ) kextra++;
-    }
+    cmdline = bootinfo_prepare_cmdline(boot_info);
     cmdline_parse(cmdline);
 
     /* Must be after command line argument parsing and before
@@ -1951,7 +1909,7 @@ void __init noreturn __start_xen(unsigned long bi_p)
      * We're going to setup domain0 using the module(s) that we stashed safely
      * above our heap. The second module, if present, is an initrd ramdisk.
      */
-    dom0 = create_dom0(boot_info, kextra, loader);
+    dom0 = create_dom0(boot_info);
     if ( !dom0 )
         panic("Could not set up DOM0 guest OS\n");
 
diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
index dde8202f62..477294dc10 100644
--- a/xen/include/xen/bootinfo.h
+++ b/xen/include/xen/bootinfo.h
@@ -53,6 +53,17 @@ struct __packed boot_info {
 
 extern struct boot_info *boot_info;
 
+static inline char *bootinfo_prepare_cmdline(struct boot_info *bi)
+{
+    bi->cmdline = arch_bootinfo_prepare_cmdline(bi->cmdline, bi->arch);
+
+    if ( *bi->cmdline == ' ' )
+        printk(XENLOG_WARNING "%s: leading whitespace left on cmdline\n",
+               __func__);
+
+    return bi->cmdline;
+}
+
 static inline unsigned long bootmodule_next_idx_by_kind(
     const struct boot_info *bi, bootmodule_kind kind, unsigned long start)
 {
-- 
2.20.1
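
For readers following the command-line handling moved by this patch, the
two steps (splitting off dom0's options at the ' -- ' separator, then
stripping leading whitespace and an optional image name) can be sketched
as a stand-alone program. This is only an illustration under stated
assumptions: split_kextra() and skip_image_name() are hypothetical names
that do not exist in Xen, and the has_image_name flag stands in for the
patch's check of the xen guest flag and loader_is_grub2().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a Xen function): split a command line in place
 * at the first " -- " separator. Returns the dom0 portion with exactly one
 * leading space, or NULL when no separator is present.
 */
char *split_kextra(char *cmdline)
{
    char *kextra;

    assert(cmdline != NULL);

    kextra = strstr(cmdline, " -- ");
    if ( kextra == NULL )
        return NULL;

    *kextra = '\0';             /* terminate the hypervisor's portion */
    kextra += 3;                /* now at the space following "--" */
    while ( kextra[1] == ' ' )  /* keep only the final leading space */
        kextra++;

    return kextra;
}

/*
 * Hypothetical helper (not a Xen function): strip leading spaces and, when
 * the loader is assumed to prepend the image name, that first word as well.
 */
char *skip_image_name(char *p, int has_image_name)
{
    assert(p != NULL);

    while ( *p == ' ' )
        p++;

    if ( has_image_name )
    {
        while ( (*p != ' ') && (*p != '\0') )
            p++;
        while ( *p == ' ' )
            p++;
    }

    return p;
}
```

As in the patch, the split has to happen before the image name is
stripped, since both operate on the same buffer and the split writes a
terminator into it.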



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v1 06/18] fdt: make fdt handling reusable across arch
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (4 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 05/18] x86: refactor xen cmdline into general framework Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-07  1:44   ` Henry Wang
  2022-07-19  9:36   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 07/18] docs: update hyperlaunch device tree documentation Daniel P. Smith
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Volodymyr Babchuk
  Cc: Daniel P. Smith, scott.davis, christopher.clark,
	Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Andrew Cooper, George Dunlap, Jan Beulich, Wei Liu

This moves the general FDT handling code out of Arm's bootfdt.c and
device_tree.h so that it can be reused across architectures.  The Kconfig
parameter CORE_DEVICE_TREE is introduced for cases where the ability to parse
DTB files is needed by a capability such as hyperlaunch.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/arm/bootfdt.c        | 115 +----------------------------
 xen/common/Kconfig            |   4 ++
 xen/common/Makefile           |   3 +-
 xen/common/fdt.c              | 131 ++++++++++++++++++++++++++++++++++
 xen/include/xen/device_tree.h |  50 +------------
 xen/include/xen/fdt.h         |  79 ++++++++++++++++++++
 6 files changed, 218 insertions(+), 164 deletions(-)
 create mode 100644 xen/common/fdt.c
 create mode 100644 xen/include/xen/fdt.h

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index ec81a45de9..ddedb55fe7 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -14,53 +14,11 @@
 #include <xen/efi.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
+#include <xen/fdt.h>
 #include <xen/sort.h>
 #include <xsm/xsm.h>
 #include <asm/setup.h>
 
-static bool __init device_tree_node_matches(const void *fdt, int node,
-                                            const char *match)
-{
-    const char *name;
-    size_t match_len;
-
-    name = fdt_get_name(fdt, node, NULL);
-    match_len = strlen(match);
-
-    /* Match both "match" and "match@..." patterns but not
-       "match-foo". */
-    return strncmp(name, match, match_len) == 0
-        && (name[match_len] == '@' || name[match_len] == '\0');
-}
-
-static bool __init device_tree_node_compatible(const void *fdt, int node,
-                                               const char *match)
-{
-    int len, l;
-    const void *prop;
-
-    prop = fdt_getprop(fdt, node, "compatible", &len);
-    if ( prop == NULL )
-        return false;
-
-    while ( len > 0 ) {
-        if ( !dt_compat_cmp(prop, match) )
-            return true;
-        l = strlen(prop) + 1;
-        prop += l;
-        len -= l;
-    }
-
-    return false;
-}
-
-void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
-                                u32 size_cells, u64 *start, u64 *size)
-{
-    *start = dt_next_cell(address_cells, cell);
-    *size = dt_next_cell(size_cells, cell);
-}
-
 static int __init device_tree_get_meminfo(const void *fdt, int node,
                                           const char *prop_name,
                                           u32 address_cells, u32 size_cells,
@@ -108,77 +66,6 @@ static int __init device_tree_get_meminfo(const void *fdt, int node,
     return 0;
 }
 
-u32 __init device_tree_get_u32(const void *fdt, int node,
-                               const char *prop_name, u32 dflt)
-{
-    const struct fdt_property *prop;
-
-    prop = fdt_get_property(fdt, node, prop_name, NULL);
-    if ( !prop || prop->len < sizeof(u32) )
-        return dflt;
-
-    return fdt32_to_cpu(*(uint32_t*)prop->data);
-}
-
-/**
- * device_tree_for_each_node - iterate over all device tree sub-nodes
- * @fdt: flat device tree.
- * @node: parent node to start the search from
- * @func: function to call for each sub-node.
- * @data: data to pass to @func.
- *
- * Any nodes nested at DEVICE_TREE_MAX_DEPTH or deeper are ignored.
- *
- * Returns 0 if all nodes were iterated over successfully.  If @func
- * returns a value different from 0, that value is returned immediately.
- */
-int __init device_tree_for_each_node(const void *fdt, int node,
-                                     device_tree_node_func func,
-                                     void *data)
-{
-    /*
-     * We only care about relative depth increments, assume depth of
-     * node is 0 for simplicity.
-     */
-    int depth = 0;
-    const int first_node = node;
-    u32 address_cells[DEVICE_TREE_MAX_DEPTH];
-    u32 size_cells[DEVICE_TREE_MAX_DEPTH];
-    int ret;
-
-    do {
-        const char *name = fdt_get_name(fdt, node, NULL);
-        u32 as, ss;
-
-        if ( depth >= DEVICE_TREE_MAX_DEPTH )
-        {
-            printk("Warning: device tree node `%s' is nested too deep\n",
-                   name);
-            continue;
-        }
-
-        as = depth > 0 ? address_cells[depth-1] : DT_ROOT_NODE_ADDR_CELLS_DEFAULT;
-        ss = depth > 0 ? size_cells[depth-1] : DT_ROOT_NODE_SIZE_CELLS_DEFAULT;
-
-        address_cells[depth] = device_tree_get_u32(fdt, node,
-                                                   "#address-cells", as);
-        size_cells[depth] = device_tree_get_u32(fdt, node,
-                                                "#size-cells", ss);
-
-        /* skip the first node */
-        if ( node != first_node )
-        {
-            ret = func(fdt, node, name, depth, as, ss, data);
-            if ( ret != 0 )
-                return ret;
-        }
-
-        node = fdt_next_node(fdt, node, &depth);
-    } while ( node >= 0 && depth > 0 );
-
-    return 0;
-}
-
 static int __init process_memory_node(const void *fdt, int node,
                                       const char *name, int depth,
                                       u32 address_cells, u32 size_cells,
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 41a67891bc..9fc6683932 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -31,8 +31,12 @@ config HAS_ALTERNATIVE
 config HAS_COMPAT
 	bool
 
+config CORE_DEVICE_TREE
+	bool
+
 config HAS_DEVICE_TREE
 	bool
+	select CORE_DEVICE_TREE
 
 config HAS_EX_TABLE
 	bool
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 3baf83d527..ebd3e2d659 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -10,6 +10,7 @@ obj-y += domain.o
 obj-y += event_2l.o
 obj-y += event_channel.o
 obj-y += event_fifo.o
+obj-$(CONFIG_CORE_DEVICE_TREE) += fdt.o
 obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
 obj-$(CONFIG_GRANT_TABLE) += grant_table.o
 obj-y += guestcopy.o
@@ -73,7 +74,7 @@ obj-y += sched/
 obj-$(CONFIG_UBSAN) += ubsan/
 
 obj-$(CONFIG_NEEDS_LIBELF) += libelf/
-obj-$(CONFIG_HAS_DEVICE_TREE) += libfdt/
+obj-$(CONFIG_CORE_DEVICE_TREE) += libfdt/
 
 CONF_FILE := $(if $(patsubst /%,,$(KCONFIG_CONFIG)),$(objtree)/)$(KCONFIG_CONFIG)
 $(obj)/config.gz: $(CONF_FILE)
diff --git a/xen/common/fdt.c b/xen/common/fdt.c
new file mode 100644
index 0000000000..ed9347e5f7
--- /dev/null
+++ b/xen/common/fdt.c
@@ -0,0 +1,131 @@
+/*
+ * Flattened Device Tree
+ *
+ * Copyright (C) 2012-2014 Citrix Systems, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <xen/fdt.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/types.h>
+
+bool __init device_tree_node_matches(
+    const void *fdt, int node, const char *match)
+{
+    const char *name;
+    size_t match_len;
+
+    name = fdt_get_name(fdt, node, NULL);
+    match_len = strlen(match);
+
+    /* Match both "match" and "match@..." patterns but not
+       "match-foo". */
+    return strncmp(name, match, match_len) == 0
+        && (name[match_len] == '@' || name[match_len] == '\0');
+}
+
+bool __init device_tree_node_compatible(
+    const void *fdt, int node, const char *match)
+{
+    int len, l;
+    int mlen;
+    const void *prop;
+
+    mlen = strlen(match);
+
+    prop = fdt_getprop(fdt, node, "compatible", &len);
+    if ( prop == NULL )
+        return false;
+
+    while ( len > 0 ) {
+        if ( !dt_compat_cmp(prop, match) )
+            return true;
+        l = strlen(prop) + 1;
+        prop += l;
+        len -= l;
+    }
+
+    return false;
+}
+
+void __init device_tree_get_reg(
+    const __be32 **cell, u32 address_cells, u32 size_cells, u64 *start,
+    u64 *size)
+{
+    *start = dt_next_cell(address_cells, cell);
+    *size = dt_next_cell(size_cells, cell);
+}
+
+u32 __init device_tree_get_u32(
+    const void *fdt, int node, const char *prop_name, u32 dflt)
+{
+    const struct fdt_property *prop;
+
+    prop = fdt_get_property(fdt, node, prop_name, NULL);
+    if ( !prop || prop->len < sizeof(u32) )
+        return dflt;
+
+    return fdt32_to_cpu(*(uint32_t*)prop->data);
+}
+
+/**
+ * device_tree_for_each_node - iterate over all device tree sub-nodes
+ * @fdt: flat device tree.
+ * @node: parent node to start the search from
+ * @func: function to call for each sub-node.
+ * @data: data to pass to @func.
+ *
+ * Any nodes nested at DEVICE_TREE_MAX_DEPTH or deeper are ignored.
+ *
+ * Returns 0 if all nodes were iterated over successfully.  If @func
+ * returns a value different from 0, that value is returned immediately.
+ */
+int __init device_tree_for_each_node(
+    const void *fdt, int node, device_tree_node_func func, void *data)
+{
+    /*
+     * We only care about relative depth increments, assume depth of
+     * node is 0 for simplicity.
+     */
+    int depth = 0;
+    const int first_node = node;
+    u32 address_cells[DEVICE_TREE_MAX_DEPTH];
+    u32 size_cells[DEVICE_TREE_MAX_DEPTH];
+    int ret;
+
+    do {
+        const char *name = fdt_get_name(fdt, node, NULL);
+        u32 as, ss;
+
+        if ( depth >= DEVICE_TREE_MAX_DEPTH )
+        {
+            printk("Warning: device tree node `%s' is nested too deep\n",
+                   name);
+            continue;
+        }
+
+        as = depth > 0 ? address_cells[depth-1] : DT_ROOT_NODE_ADDR_CELLS_DEFAULT;
+        ss = depth > 0 ? size_cells[depth-1] : DT_ROOT_NODE_SIZE_CELLS_DEFAULT;
+
+        address_cells[depth] = device_tree_get_u32(fdt, node,
+                                                   "#address-cells", as);
+        size_cells[depth] = device_tree_get_u32(fdt, node,
+                                                "#size-cells", ss);
+
+        /* skip the first node */
+        if ( node != first_node )
+        {
+            ret = func(fdt, node, name, depth, as, ss, data);
+            if ( ret != 0 )
+                return ret;
+        }
+
+        node = fdt_next_node(fdt, node, &depth);
+    } while ( node >= 0 && depth > 0 );
+
+    return 0;
+}
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 430a1ef445..c98c898ffc 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -14,13 +14,12 @@
 #include <asm/device.h>
 #include <public/xen.h>
 #include <public/device_tree_defs.h>
+#include <xen/fdt.h>
 #include <xen/kernel.h>
 #include <xen/string.h>
 #include <xen/types.h>
 #include <xen/list.h>
 
-#define DEVICE_TREE_MAX_DEPTH 16
-
 /*
  * Struct used for matching a device
  */
@@ -158,17 +157,8 @@ struct dt_raw_irq {
 #define dt_irq(irq) ((irq)->irq)
 #define dt_irq_flags(irq) ((irq)->flags)
 
-typedef int (*device_tree_node_func)(const void *fdt,
-                                     int node, const char *name, int depth,
-                                     u32 address_cells, u32 size_cells,
-                                     void *data);
-
 extern const void *device_tree_flattened;
 
-int device_tree_for_each_node(const void *fdt, int node,
-                              device_tree_node_func func,
-                              void *data);
-
 /**
  * dt_unflatten_host_device_tree - Unflatten the host device tree
  *
@@ -213,14 +203,6 @@ extern const struct dt_device_node *dt_interrupt_controller;
 struct dt_device_node *
 dt_find_interrupt_controller(const struct dt_device_match *matches);
 
-#define dt_prop_cmp(s1, s2) strcmp((s1), (s2))
-#define dt_node_cmp(s1, s2) strcasecmp((s1), (s2))
-#define dt_compat_cmp(s1, s2) strcasecmp((s1), (s2))
-
-/* Default #address and #size cells */
-#define DT_ROOT_NODE_ADDR_CELLS_DEFAULT 2
-#define DT_ROOT_NODE_SIZE_CELLS_DEFAULT 1
-
 #define dt_for_each_property_node(dn, pp)                   \
     for ( pp = dn->properties; pp != NULL; pp = pp->next )
 
@@ -230,36 +212,6 @@ dt_find_interrupt_controller(const struct dt_device_match *matches);
 #define dt_for_each_child_node(dt, dn)                      \
     for ( dn = dt->child; dn != NULL; dn = dn->sibling )
 
-/* Helper to read a big number; size is in cells (not bytes) */
-static inline u64 dt_read_number(const __be32 *cell, int size)
-{
-    u64 r = 0;
-
-    while ( size-- )
-        r = (r << 32) | be32_to_cpu(*(cell++));
-    return r;
-}
-
-/* Helper to convert a number of cells to bytes */
-static inline int dt_cells_to_size(int size)
-{
-    return (size * sizeof (u32));
-}
-
-/* Helper to convert a number of bytes to cells, rounds down */
-static inline int dt_size_to_cells(int bytes)
-{
-    return (bytes / sizeof(u32));
-}
-
-static inline u64 dt_next_cell(int s, const __be32 **cellp)
-{
-    const __be32 *p = *cellp;
-
-    *cellp = p + s;
-    return dt_read_number(p, s);
-}
-
 static inline const char *dt_node_full_name(const struct dt_device_node *np)
 {
     return (np && np->full_name) ? np->full_name : "<no-node>";
diff --git a/xen/include/xen/fdt.h b/xen/include/xen/fdt.h
new file mode 100644
index 0000000000..00f9f3792f
--- /dev/null
+++ b/xen/include/xen/fdt.h
@@ -0,0 +1,79 @@
+/*
+ * Flattened Device Tree
+ *
+ * Copyright (C) 2012 Citrix Systems, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef __XEN_FDT_H__
+#define __XEN_FDT_H__
+
+#include <xen/init.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/types.h>
+
+#define DEVICE_TREE_MAX_DEPTH 16
+
+/* Default #address and #size cells */
+#define DT_ROOT_NODE_ADDR_CELLS_DEFAULT 2
+#define DT_ROOT_NODE_SIZE_CELLS_DEFAULT 1
+
+#define dt_prop_cmp(s1, s2) strcmp((s1), (s2))
+#define dt_node_cmp(s1, s2) strcasecmp((s1), (s2))
+#define dt_compat_cmp(s1, s2) strcasecmp((s1), (s2))
+
+/* Helper to read a big number; size is in cells (not bytes) */
+static inline u64 dt_read_number(const __be32 *cell, int size)
+{
+    u64 r = 0;
+
+    while ( size-- )
+        r = (r << 32) | be32_to_cpu(*(cell++));
+    return r;
+}
+
+/* Helper to convert a number of cells to bytes */
+static inline int dt_cells_to_size(int size)
+{
+    return (size * sizeof (u32));
+}
+
+/* Helper to convert a number of bytes to cells, rounds down */
+static inline int dt_size_to_cells(int bytes)
+{
+    return (bytes / sizeof(u32));
+}
+
+static inline u64 dt_next_cell(int s, const __be32 **cellp)
+{
+    const __be32 *p = *cellp;
+
+    *cellp = p + s;
+    return dt_read_number(p, s);
+}
+
+
+bool __init device_tree_node_matches(
+    const void *fdt, int node, const char *match);
+
+bool __init device_tree_node_compatible(
+    const void *fdt, int node, const char *match);
+
+void __init device_tree_get_reg(
+    const __be32 **cell, u32 address_cells, u32 size_cells, u64 *start,
+    u64 *size);
+
+u32 __init device_tree_get_u32(
+    const void *fdt, int node, const char *prop_name, u32 dflt);
+
+typedef int (*device_tree_node_func)(
+    const void *fdt, int node, const char *name, int depth, u32 address_cells,
+    u32 size_cells, void *data);
+
+int device_tree_for_each_node(
+    const void *fdt, int node, device_tree_node_func func, void *data);
+
+
+#endif /* __XEN_FDT_H__ */
-- 
2.20.1
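
To illustrate what two of the helpers relocated by this patch do, here is
a minimal stand-alone sketch that does not depend on libfdt or Xen
headers. The names read_cells() and node_name_matches() are hypothetical;
they mirror the logic of dt_read_number() and device_tree_node_matches()
but take raw bytes and a plain node name, which is an assumption made so
the example can be compiled on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for dt_read_number(): combine 'cells' big-endian
 * 32-bit cells into one 64-bit value, most significant cell first.
 */
uint64_t read_cells(const uint8_t *p, int cells)
{
    uint64_t r = 0;

    assert(p != NULL);

    while ( cells-- > 0 )
    {
        uint32_t cell = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                        ((uint32_t)p[2] << 8) | (uint32_t)p[3];

        r = (r << 32) | cell;
        p += 4;
    }

    return r;
}

/*
 * Hypothetical stand-in for device_tree_node_matches(): accept "match"
 * and "match@..." but reject names such as "match-foo".
 */
bool node_name_matches(const char *name, const char *match)
{
    size_t len = strlen(match);

    return strncmp(name, match, len) == 0 &&
           (name[len] == '@' || name[len] == '\0');
}
```

With the root defaults defined in fdt.h (two address cells, one size
cell), a "reg" entry would be decoded by reading two cells for the start
address followed by one cell for the size.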



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v1 07/18] docs: update hyperlaunch device tree documentation
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (5 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 06/18] fdt: make fdt handling reusable across arch Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-18 13:57   ` Smith, Jackson
  2022-07-06 21:04 ` [PATCH v1 08/18] kconfig: introduce domain builder config option Daniel P. Smith
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu

This commit is to update the hyperlaunch device tree documentation to align
with the DTB parsing implementation.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 .../designs/launch/hyperlaunch-devicetree.rst | 497 +++++++++++-------
 1 file changed, 306 insertions(+), 191 deletions(-)

diff --git a/docs/designs/launch/hyperlaunch-devicetree.rst b/docs/designs/launch/hyperlaunch-devicetree.rst
index b49c98cfbd..ae1a786d0b 100644
--- a/docs/designs/launch/hyperlaunch-devicetree.rst
+++ b/docs/designs/launch/hyperlaunch-devicetree.rst
@@ -13,12 +13,268 @@ difference is the introduction of the ``hypervisor`` node that is under the
 2. Allows for the domain construction information to easily be sanitized by
    simple removing the ``/chosen/hypervisor`` node.
 
+
+The Hypervisor node
+-------------------
+
+The ``hypervisor`` node is a top-level container for the domains that will be built
+by the hypervisor on start up. The node will be named ``hypervisor`` with a ``compatible``
+property to identify which hypervisors the configuration is intended for. The hypervisor
+node will consist of one or more config nodes and one or more domain nodes.
+
+Properties
+""""""""""
+
+compatible
+  Identifies which hypervisors the configuration is compatible with. Required.
+
+  Format: "hypervisor,<hypervisor name>", e.g. "hypervisor,xen"
+
+Child Nodes
+"""""""""""
+
+* config
+* domain
+
+Config Node
+-----------
+
+A ``config`` node is for passing configuration data and identifying any boot
+modules that are of interest to the hypervisor.  For example, this would be where
+Xen would be informed of microcode or XSM policy locations. Each ``config``
+node will require a unique device-tree compliant name as there may be one or
+more ``config`` nodes present in a single dtb file. To identify which
+hypervisor the configuration is intended for, the required ``compatible`` property
+must be present.
+
+While the config node is not meant to replace the hypervisor commandline, there
+may be cases where it is better suited for passing configuration details at
+boot time.  This additional information may be carried in properties assigned
+to a ``config`` node. If there are any boot modules that are intended for the
+hypervisor, then a ``module`` child node should be provided to identify the
+boot module.
+
+Properties
+""""""""""
+
+compatible
+  Identifies the hypervisor the configuration is intended for. Required.
+
+  Format: "<hypervisor name>,config", e.g. "xen,config"
+
+bootargs
+  This is used to provide the boot params for Xen.
+
+  Format: String, e.g. "flask=silo"
+
+Child Nodes
+"""""""""""
+
+* module
+
+Domain Node
+-----------
+
+A ``domain`` node is for describing the construction of a domain. Since there
+may be one or more domain nodes, each one requires a unique, DTB compliant name
+and a ``compatible`` property to identify it as a domain node.
+
+A ``domain`` node may provide a ``domid`` property which will be used as the
+requested domain id for the domain, with a value of “0” signifying that the
+next available domain id is to be used, which is the default behavior if omitted.
+It should be noted that a domain configuration is not able to request a domid of “0”.
+Beyond that, a domain node may have any of the following properties.
+
+Properties
+""""""""""
+
+compatible
+  Identifies the node as a domain node and for which hypervisor. Required.
+
+  Format: "<hypervisor name>,domain", e.g. "xen,domain"
+
+domid
+  Identifies the requested domid to assign to the domain.
+
+  Format: Integer, e.g. <0>
+
+permissions
+  This sets what Discretionary Access Control permissions
+  a domain is assigned. Optional, default is none.
+
+  Format: Bitfield, e.g. <3> or <0x00000003>
+
+          PERMISSION_NONE          (0)
+          PERMISSION_CONTROL       (1 << 0)
+          PERMISSION_HARDWARE      (1 << 1)
+
+functions
+  This identifies what system functions a domain will fulfill.
+  Optional, the default is none.
+
+  Format: Bitfield, e.g. <3221225479> or <0xC0000007>
+
+          FUNCTION_NONE            (0)
+          FUNCTION_BOOT            (1 << 0)
+          FUNCTION_CRASH           (1 << 1)
+          FUNCTION_CONSOLE         (1 << 2)
+          FUNCTION_XENSTORE        (1 << 30)
+          FUNCTION_LEGACY_DOM0     (1 << 31)
+
+.. note::  The `functions` bits that have been selected to indicate
+   ``FUNCTION_XENSTORE`` and ``FUNCTION_LEGACY_DOM0`` are the last two bits
+   (30, 31) such that should these features ever be fully replaced or retired,
+   the flags may be dropped without leaving a gap in the flag set.
+
+mode
+  The mode the domain will be executed under. Required.
+
+  Format: Bitfield, e.g. <5> or <0x00000005>
+
+          MODE_PARAVIRTUALIZED     (1 << 0) PV | PVH/HVM
+          MODE_ENABLE_DEVICE_MODEL (1 << 1) HVM | PVH
+          MODE_LONG                (1 << 2) 64 BIT | 32 BIT
+
+domain-uuid
+  A globally unique identifier for the domain. Optional,
+  the default is NULL.
+
+  Format: Byte Array, e.g. [B3 FB 98 FB 8F 9F 67 A3]
+
+cpus
+  The number of vCPUs to be assigned to the domain. Optional,
+  the default is “1”.
+
+  Format: Integer, e.g. <1>
+
+memory
+  The amount of memory to assign to the domain, in KBs. This field uses a DTB
+  Reg which contains a start and size. For memory allocation, start may or may
+  not have significance, but size will always be used for the amount of memory.
+  Required.
+
+  Format: String  min:<sz> | max:<sz> | <sz>, e.g. "256M"
+
+security-id
+  The security identity to be assigned to the domain when XSM
+  is the access control mechanism being used. Optional,
+  the default is “system_u:system_r:domU_t”.
+
+  Format: string, e.g. "system_u:system_r:domU_t"
+
+Child Nodes
+"""""""""""
+
+* module
+
+Module node
+-----------
+
+This node describes a boot module loaded by the boot loader. A ``module`` node
+will often appear repeatedly and will require a unique and DTB compliant name
+for each instance. The compatible property is required to identify that the
+node is a ``module`` node, the type of boot module, and what it represents.
+
+Depending on the type of boot module, the ``module`` node requires either a
+``module-index`` or a ``module-addr`` property. They provide the boot module
+specific way of locating the boot module in memory.
+
+Properties
+""""""""""
+
+compatible
+  This identifies what the module is and thus what the hypervisor
+  should use the module for during domain construction. Required.
+
+  Format: "module,<module type>"[, "module,<locating type>"]
+          module type: kernel, ramdisk, device-tree, microcode, xsm-policy,
+                       config
+
+          locating type: index, addr
+
+module-index
+  This identifies the index for this module when in a module chain.
+  Required for multiboot environments.
+
+  Format: Integer, e.g. <0>
+
+module-addr
+  This identifies where in memory this module is located. Required for
+  non-multiboot environments.
+
+  Format: DTB Reg <start size>, e.g. <0x0 0x20000>
+
+bootargs
+  This is used to provide the boot params to kernel modules.
+
+  Format: String, e.g. "ro quiet"
+
+.. note::  The bootargs property is intended for situations where the same kernel multiboot module is used for more than one domain.
+
 Example Configuration
 ---------------------
 
-Below are two example device tree definitions for the hypervisor node. The
-first is an example of a multiboot-based configuration for x86 and the second
-is a module-based configuration for Arm.
+Below are example device tree definitions for the hypervisor node. The first
+is an example of booting a dom0-only configuration. After that are a
+multiboot-based configuration for x86 and a module-based configuration for Arm.
+
+Multiboot x86 Configuration Dom0-only:
+""""""""""""""""""""""""""""""""""""""
+The following dts file can be provided to the Device Tree compiler, ``dtc``, to
+produce a dtb file.
+::
+
+  /dts-v1/;
+
+  / {
+      chosen {
+          hypervisor {
+              compatible = "hypervisor,xen";
+
+              dom0 {
+                  compatible = "xen,domain";
+
+                  domid = <0>;
+
+                  permissions = <3>;
+                  functions = <0xC000000F>;
+                  mode = <5>;
+
+                  domain-uuid = [B3 FB 98 FB 8F 9F 67 A3 8A 6E 62 5A 09 13 F0 8C];
+
+                  cpus = <1>;
+                  memory = <0x0 0x20000000>;
+
+                  kernel {
+                      compatible = "module,kernel", "module,index";
+                      module-index = <1>;
+                  };
+              };
+
+          };
+      };
+  };
+
+The resulting dtb file, in this case dom0-only.dtb, can then be used with a
+GRUB menuentry as such,
+::
+
+  menuentry 'Devuan GNU/Linux, with Xen hyperlaunch' {
+        insmod part_gpt
+        insmod ext2
+        set root='hd0,gpt2'
+
+        echo    'Loading Xen hyperlaunch ...'
+
+        multiboot2      /xen.gz placeholder sync_console
+        echo    'Loading Dom0 hyperlaunch dtb ...'
+        module2 --nounzip   /dom0-only.dtb
+        echo    'Loading Linux 5.4.36+ ...'
+        module2 /vmlinuz-5.4.36+ placeholder root=/dev/mapper/test01--vg-root ro  quiet
+        echo    'Loading initial ramdisk ...'
+        module2 --nounzip   /initrd.img-5.4.36+
+  }
+
 
 Multiboot x86 Configuration:
 """"""""""""""""""""""""""""
@@ -31,89 +287,70 @@ Multiboot x86 Configuration:
         compatible = “hypervisor,xen”
 
         // Configuration container
-        config {
+        xen-config {
             compatible = "xen,config";
 
-            module {
-                compatible = "module,microcode", "multiboot,module";
-                mb-index = <1>;
+            bootargs = "console=com1,vga com1=115200,8n1 loglvl=all";
+
+            microcode {
+                compatible = "module,microcode", "module,index";
+                module-index = <1>;
             };
 
-            module {
-                compatible = "module,xsm-policy", "multiboot,module";
-                mb-index = <2>;
+            policy {
+                compatible = "module,xsm-policy", "module,index";
+                module-index = <2>;
             };
         };
 
         // Boot Domain definition
-        domain {
+        domB {
             compatible = "xen,domain";
 
             domid = <0x7FF5>;
 
-            // FUNCTION_NONE            (0)
-            // FUNCTION_BOOT            (1 << 0)
-            // FUNCTION_CRASH           (1 << 1)
-            // FUNCTION_CONSOLE         (1 << 2)
-            // FUNCTION_XENSTORE        (1 << 30)
-            // FUNCTION_LEGACY_DOM0     (1 << 31)
             functions = <0x00000001>;
 
             memory = <0x0 0x20000>;
             cpus = <1>;
-            module {
-                compatible = "module,kernel", "multiboot,module";
-                mb-index = <3>;
-            };
 
-            module {
-                compatible = "module,ramdisk", "multiboot,module";
-                mb-index = <4>;
+            kernel {
+                compatible = "module,kernel", "module,index";
+                module-index = <3>;
             };
-            module {
-                compatible = "module,config", "multiboot,module";
-                mb-index = <5>;
+            initrd {
+                compatible = "module,ramdisk", "module,index";
+                module-index = <4>;
+            };
+            dom-config {
+                compatible = "module,config", "module,index";
+                module-index = <5>;
             };
 
         // Classic Dom0 definition
-        domain {
+        dom0 {
             compatible = "xen,domain";
 
             domid = <0>;
 
-            // PERMISSION_NONE          (0)
-            // PERMISSION_CONTROL       (1 << 0)
-            // PERMISSION_HARDWARE      (1 << 1)
             permissions = <3>;
-
-            // FUNCTION_NONE            (0)
-            // FUNCTION_BOOT            (1 << 0)
-            // FUNCTION_CRASH           (1 << 1)
-            // FUNCTION_CONSOLE         (1 << 2)
-            // FUNCTION_XENSTORE        (1 << 30)
-            // FUNCTION_LEGACY_DOM0     (1 << 31)
             functions = <0xC0000006>;
-
-            // MODE_PARAVIRTUALIZED     (1 << 0) /* PV | PVH/HVM */
-            // MODE_ENABLE_DEVICE_MODEL (1 << 1) /* HVM | PVH */
-            // MODE_LONG                (1 << 2) /* 64 BIT | 32 BIT */
             mode = <5>; /* 64 BIT, PV */
 
-            // UUID
             domain-uuid = [B3 FB 98 FB 8F 9F 67 A3];
 
             cpus = <1>;
             memory = <0x0 0x20000>;
-            security-id = “dom0_t;
+            security-id = "system_u:system_r:dom0_t";
 
-            module {
-                compatible = "module,kernel", "multiboot,module";
-                mb-index = <6>;
+            kernel {
+                compatible = "module,kernel", "module,index";
+                module-index = <6>;
                 bootargs = "console=hvc0";
             };
-            module {
-                compatible = "module,ramdisk", "multiboot,module";
-                mb-index = <7>;
+            initrd {
+                compatible = "module,ramdisk", "module,index";
+                module-index = <7>;
             };
     };
 
@@ -137,89 +374,68 @@ Module Arm Configuration:
         compatible = “hypervisor,xen”
 
         // Configuration container
-        config {
+        xen-config {
             compatible = "xen,config";
 
-            module {
-                compatible = "module,microcode”;
+            microcode {
+                compatible = "module,microcode","module,addr";
                 module-addr = <0x0000ff00 0x80>;
             };
 
-            module {
-                compatible = "module,xsm-policy";
+            policy {
+                compatible = "module,xsm-policy","module,addr";
                 module-addr = <0x0000ff00 0x80>;
 
             };
         };
 
         // Boot Domain definition
-        domain {
+        domB {
             compatible = "xen,domain";
 
             domid = <0x7FF5>;
 
-            // FUNCTION_NONE            (0)
-            // FUNCTION_BOOT            (1 << 0)
-            // FUNCTION_CRASH           (1 << 1)
-            // FUNCTION_CONSOLE         (1 << 2)
-            // FUNCTION_XENSTORE        (1 << 30)
-            // FUNCTION_LEGACY_DOM0     (1 << 31)
             functions = <0x00000001>;
 
             memory = <0x0 0x20000>;
             cpus = <1>;
-            module {
-                compatible = "module,kernel";
+
+            kernel {
+                compatible = "module,kernel","module,addr";
                 module-addr = <0x0000ff00 0x80>;
             };
-
-            module {
-                compatible = "module,ramdisk";
+            initrd {
+                compatible = "module,ramdisk","module,addr";
                 module-addr = <0x0000ff00 0x80>;
             };
-            module {
-                compatible = "module,config";
+            dom-config {
+                compatible = "module,config","module,addr";
                 module-addr = <0x0000ff00 0x80>;
             };
 
         // Classic Dom0 definition
-        domain@0 {
+        dom0 {
             compatible = "xen,domain";
 
             domid = <0>;
 
-            // PERMISSION_NONE          (0)
-            // PERMISSION_CONTROL       (1 << 0)
-            // PERMISSION_HARDWARE      (1 << 1)
             permissions = <3>;
-
-            // FUNCTION_NONE            (0)
-            // FUNCTION_BOOT            (1 << 0)
-            // FUNCTION_CRASH           (1 << 1)
-            // FUNCTION_CONSOLE         (1 << 2)
-            // FUNCTION_XENSTORE        (1 << 30)
-            // FUNCTION_LEGACY_DOM0     (1 << 31)
             functions = <0xC0000006>;
-
-            // MODE_PARAVIRTUALIZED     (1 << 0) /* PV | PVH/HVM */
-            // MODE_ENABLE_DEVICE_MODEL (1 << 1) /* HVM | PVH */
-            // MODE_LONG                (1 << 2) /* 64 BIT | 32 BIT */
             mode = <5>; /* 64 BIT, PV */
 
-            // UUID
             domain-uuid = [B3 FB 98 FB 8F 9F 67 A3];
 
             cpus = <1>;
             memory = <0x0 0x20000>;
-            security-id = “dom0_t”;
+            security-id = "system_u:system_r:dom0_t";
 
-            module {
-                compatible = "module,kernel";
+            kernel {
+                compatible = "module,kernel","module,addr";
                 module-addr = <0x0000ff00 0x80>;
                 bootargs = "console=hvc0";
             };
-            module {
-                compatible = "module,ramdisk";
+            initrd {
+                compatible = "module,ramdisk","module,addr";
                 module-addr = <0x0000ff00 0x80>;
             };
     };
@@ -240,104 +456,3 @@ provided to Xen using the standard method currently in use. The remaining
 modules would need to be loaded in the respective addresses specified in the
 `module-addr` property.
 
-
-The Hypervisor node
--------------------
-
-The hypervisor node is a top level container for the domains that will be built
-by hypervisor on start up. On the ``hypervisor`` node the ``compatible``
-property is used to identify the type of hypervisor node present..
-
-compatible
-  Identifies the type of node. Required.
-
-The Config node
----------------
-
-A config node is for detailing any modules that are of interest to Xen itself.
-For example this would be where Xen would be informed of microcode or XSM
-policy locations. If the modules are multiboot modules and are able to be
-located by index within the module chain, the ``mb-index`` property should be
-used to specify the index in the multiboot module chain.. If the module will be
-located by physical memory address, then the ``module-addr`` property should be
-used to identify the location and size of the module.
-
-compatible
-  Identifies the type of node. Required.
-
-The Domain node
----------------
-
-A domain node is for describing the construction of a domain. It may provide a
-domid property which will be used as the requested domain id for the domain
-with a value of “0” signifying to use the next available domain id, which is
-the default behavior if omitted. A domain configuration is not able to request
-a domid of “0”. After that a domain node may have any of the following
-parameters,
-
-compatible
-  Identifies the type of node. Required.
-
-domid
-  Identifies the domid requested to assign to the domain. Required.
-
-permissions
-  This sets what Discretionary Access Control permissions
-  a domain is assigned. Optional, default is none.
-
-functions
-  This identifies what system functions a domain will fulfill.
-  Optional, the default is none.
-
-.. note::  The `functions` bits that have been selected to indicate
-   ``FUNCTION_XENSTORE`` and ``FUNCTION_LEGACY_DOM0`` are the last two bits
-   (30, 31) such that should these features ever be fully retired, the flags may
-   be dropped without leaving a gap in the flag set.
-
-mode
-  The mode the domain will be executed under. Required.
-
-domain-uuid
-  A globally unique identifier for the domain. Optional,
-  the default is NULL.
-
-cpus
-  The number of vCPUs to be assigned to the domain. Optional,
-  the default is “1”.
-
-memory
-  The amount of memory to assign to the domain, in KBs.
-  Required.
-
-security-id
-  The security identity to be assigned to the domain when XSM
-  is the access control mechanism being used. Optional,
-  the default is “domu_t”.
-
-The Module node
----------------
-
-This node describes a boot module loaded by the boot loader. The required
-compatible property follows the format: module,<type> where type can be
-“kernel”, “ramdisk”, “device-tree”, “microcode”, “xsm-policy” or “config”. In
-the case the module is a multiboot module, the additional property string
-“multiboot,module” may be present. One of two properties is required and
-identifies how to locate the module. They are the mb-index, used for multiboot
-modules, and the module-addr for memory address based location.
-
-compatible
-  This identifies what the module is and thus what the hypervisor
-  should use the module for during domain construction. Required.
-
-mb-index
-  This identifies the index for this module in the multiboot module chain.
-  Required for multiboot environments.
-
-module-addr
-  This identifies where in memory this module is located. Required for
-  non-multiboot environments.
-
-bootargs
-  This is used to provide the boot params to kernel modules.
-
-.. note::  The bootargs property is intended for situations where the same kernel multiboot module is used for more than one domain.
-- 
2.20.1
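
The ``functions`` and ``mode`` bitfields documented in this patch can be sanity-checked with a small standalone C sketch. The flag values below are taken from the documentation text; the helper names (classic_dom0_functions, mode_name, mode_is_64bit) are hypothetical and exist only for illustration, they are not part of the series. The sketch assumes that a set MODE_PARAVIRTUALIZED bit takes precedence over MODE_ENABLE_DEVICE_MODEL, which the documentation does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as listed in the device tree documentation. */
#define FUNCTION_BOOT            (UINT32_C(1) << 0)
#define FUNCTION_CRASH           (UINT32_C(1) << 1)
#define FUNCTION_CONSOLE         (UINT32_C(1) << 2)
#define FUNCTION_XENSTORE        (UINT32_C(1) << 30)
#define FUNCTION_LEGACY_DOM0     (UINT32_C(1) << 31)

#define MODE_PARAVIRTUALIZED     (UINT32_C(1) << 0) /* PV     | PVH/HVM */
#define MODE_ENABLE_DEVICE_MODEL (UINT32_C(1) << 1) /* HVM    | PVH     */
#define MODE_LONG                (UINT32_C(1) << 2) /* 64 BIT | 32 BIT  */

/* Hypothetical helper: compose the value used by the "Classic Dom0" examples. */
static uint32_t classic_dom0_functions(void)
{
    uint32_t v = FUNCTION_CRASH | FUNCTION_CONSOLE |
                 FUNCTION_XENSTORE | FUNCTION_LEGACY_DOM0;

    /* Same value as "functions = <0xC0000006>" in the example dts. */
    assert(v == UINT32_C(0xC0000006));
    return v;
}

/*
 * Hypothetical helper: name the guest type selected by a mode value.
 * Assumption: the PV bit wins if both PV and device-model bits are set.
 */
static const char *mode_name(uint32_t mode)
{
    if ( mode & MODE_PARAVIRTUALIZED )
        return "PV";

    return (mode & MODE_ENABLE_DEVICE_MODEL) ? "HVM" : "PVH";
}

/* Hypothetical helper: true when the 64-bit flag is set. */
static bool mode_is_64bit(uint32_t mode)
{
    return mode & MODE_LONG;
}
```

With these definitions, ``mode = <5>`` decodes as a 64-bit PV domain, which agrees with the "64 BIT, PV" comment in the example dts.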




* [PATCH v1 08/18] kconfig: introduce domain builder config option
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (6 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 07/18] docs: update hyperlaunch device tree documentation Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-07  1:44   ` Henry Wang
  2022-07-19 13:29   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 09/18] x86: introduce abstractions for domain builder Daniel P. Smith
                   ` (10 subsequent siblings)
  18 siblings, 2 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu

Hyperlaunch domain builder is the consolidated boot time domain building logic
framework.  This commit introduces the first config option for the domain
builder to control support for loading the domain configurations via the
flattened device tree.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/common/Kconfig                |  1 +
 xen/common/domain-builder/Kconfig | 15 +++++++++++++++
 2 files changed, 16 insertions(+)
 create mode 100644 xen/common/domain-builder/Kconfig

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 9fc6683932..5a1c40e392 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -355,6 +355,7 @@ config ARGO
 
 	  If unsure, say N.
 
+source "common/domain-builder/Kconfig"
 source "common/sched/Kconfig"
 
 config CRYPTO
diff --git a/xen/common/domain-builder/Kconfig b/xen/common/domain-builder/Kconfig
new file mode 100644
index 0000000000..893038cab3
--- /dev/null
+++ b/xen/common/domain-builder/Kconfig
@@ -0,0 +1,15 @@
+
+menu "Domain Builder Features"
+
+config BUILDER_FDT
+	bool "Domain builder device tree (UNSUPPORTED)" if UNSUPPORTED
+	select CORE_DEVICE_TREE
+	---help---
+	  Enables the ability to configure the domain builder using a
+	  flattened device tree.
+
+	  This feature is currently experimental.
+
+	  If unsure, say N.
+
+endmenu
-- 
2.20.1
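
As a rough illustration of how a boolean Kconfig option such as BUILDER_FDT can gate a code path at compile time, here is a standalone C sketch. It does not use Xen's own IS_ENABLED() macro; BUILDER_FDT_ENABLED and builder_uses_fdt() are hypothetical stand-ins introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a Kconfig-generated symbol: the build system
 * would define CONFIG_BUILDER_FDT when the option is selected.
 */
#ifdef CONFIG_BUILDER_FDT
#define BUILDER_FDT_ENABLED 1
#else
#define BUILDER_FDT_ENABLED 0
#endif

/*
 * Hypothetical helper: the device tree path is only taken when support
 * is compiled in and a device tree was actually provided at boot.
 */
static bool builder_uses_fdt(bool fdt_present)
{
    assert(BUILDER_FDT_ENABLED == 0 || BUILDER_FDT_ENABLED == 1);
    return BUILDER_FDT_ENABLED && fdt_present;
}
```

Because the condition is a compile-time constant, a compiler can drop the device tree branch entirely when the option is off, while the code is still parsed and type-checked in both configurations.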




* [PATCH v1 09/18] x86: introduce abstractions for domain builder
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (7 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 08/18] kconfig: introduce domain builder config option Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-26 14:22   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 10/18] x86: introduce the " Daniel P. Smith
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini

This commit expands the new boot info structs to provide the initial
abstractions for domain builder.  Additionally, it reuses the memory allocation
structures previously used for dom0, bringing the structures and helper
functions under the domain builder.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/boot/boot_info32.h       |  3 ++
 xen/arch/x86/dom0_build.c             | 21 +---------
 xen/arch/x86/include/asm/bootdomain.h | 30 +++++++++++++++
 xen/arch/x86/include/asm/bootinfo.h   |  2 +
 xen/include/xen/bootdomain.h          | 52 +++++++++++++++++++++++++
 xen/include/xen/bootinfo.h            |  4 ++
 xen/include/xen/domain_builder.h      | 55 +++++++++++++++++++++++++++
 7 files changed, 147 insertions(+), 20 deletions(-)
 create mode 100644 xen/arch/x86/include/asm/bootdomain.h
 create mode 100644 xen/include/xen/bootdomain.h
 create mode 100644 xen/include/xen/domain_builder.h

diff --git a/xen/arch/x86/boot/boot_info32.h b/xen/arch/x86/boot/boot_info32.h
index 01af950efc..0e7821efb3 100644
--- a/xen/arch/x86/boot/boot_info32.h
+++ b/xen/arch/x86/boot/boot_info32.h
@@ -87,6 +87,9 @@ struct __packed boot_info {
     /* struct boot_module* */
     u64 mods;
 
+    /* struct domain_builder* */
+    u64 builder;
+
     /* struct arch_boot_info* */
     u64 arch;
 };
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 9ca5a99510..e44f7f3c43 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -4,6 +4,7 @@
  * Copyright (c) 2002-2005, K A Fraser
  */
 
+#include <xen/bootdomain.h>
 #include <xen/bootinfo.h>
 #include <xen/init.h>
 #include <xen/iocap.h>
@@ -22,31 +23,11 @@
 #include <asm/setup.h>
 #include <asm/spec_ctrl.h>
 
-struct memsize {
-    long nr_pages;
-    unsigned int percent;
-    bool minus;
-};
-
 static struct memsize __initdata dom0_size;
 static struct memsize __initdata dom0_min_size;
 static struct memsize __initdata dom0_max_size = { .nr_pages = LONG_MAX };
 static bool __initdata dom0_mem_set;
 
-static bool __init memsize_gt_zero(const struct memsize *sz)
-{
-    return !sz->minus && sz->nr_pages;
-}
-
-static unsigned long __init get_memsize(const struct memsize *sz,
-                                        unsigned long avail)
-{
-    unsigned long pages;
-
-    pages = sz->nr_pages + sz->percent * avail / 100;
-    return sz->minus ? avail - pages : pages;
-}
-
 /*
  * dom0_mem=[min:<min_amt>,][max:<max_amt>,][<amt>]
  *
diff --git a/xen/arch/x86/include/asm/bootdomain.h b/xen/arch/x86/include/asm/bootdomain.h
new file mode 100644
index 0000000000..6f37ac99dc
--- /dev/null
+++ b/xen/arch/x86/include/asm/bootdomain.h
@@ -0,0 +1,30 @@
+#ifndef __ARCH_X86_BOOTDOMAIN_H__
+#define __ARCH_X86_BOOTDOMAIN_H__
+
+struct memsize {
+    long nr_pages;
+    unsigned int percent;
+    bool minus;
+};
+
+static inline bool memsize_gt_zero(const struct memsize *sz)
+{
+    return !sz->minus && sz->nr_pages;
+}
+
+static inline unsigned long get_memsize(
+    const struct memsize *sz, unsigned long avail)
+{
+    unsigned long pages;
+
+    pages = sz->nr_pages + sz->percent * avail / 100;
+    return sz->minus ? avail - pages : pages;
+}
+
+struct arch_domain_mem {
+    struct memsize mem_size;
+    struct memsize mem_min;
+    struct memsize mem_max;
+};
+
+#endif
diff --git a/xen/arch/x86/include/asm/bootinfo.h b/xen/arch/x86/include/asm/bootinfo.h
index 2fcd576023..f02f4edcd7 100644
--- a/xen/arch/x86/include/asm/bootinfo.h
+++ b/xen/arch/x86/include/asm/bootinfo.h
@@ -45,6 +45,8 @@ struct __packed mb_memmap {
     uint32_t type;
 };
 
+struct arch_domain_builder { };
+
 static inline bool loader_is_grub2(const char *loader_name)
 {
     /* GRUB1="GNU GRUB 0.xx"; GRUB2="GRUB 1.xx" */
diff --git a/xen/include/xen/bootdomain.h b/xen/include/xen/bootdomain.h
new file mode 100644
index 0000000000..b172d16f4e
--- /dev/null
+++ b/xen/include/xen/bootdomain.h
@@ -0,0 +1,52 @@
+#ifndef __XEN_BOOTDOMAIN_H__
+#define __XEN_BOOTDOMAIN_H__
+
+#include <xen/bootinfo.h>
+#include <xen/types.h>
+
+#include <public/xen.h>
+
+#include <asm/bootdomain.h>
+
+struct domain;
+
+struct boot_domain {
+#define BUILD_PERMISSION_NONE          (0)
+#define BUILD_PERMISSION_CONTROL       (1 << 0)
+#define BUILD_PERMISSION_HARDWARE      (1 << 1)
+    uint32_t permissions;
+
+#define BUILD_FUNCTION_NONE            (0)
+#define BUILD_FUNCTION_BOOT            (1 << 0)
+#define BUILD_FUNCTION_CRASH           (1 << 1)
+#define BUILD_FUNCTION_CONSOLE         (1 << 2)
+#define BUILD_FUNCTION_STUBDOM         (1 << 3)
+#define BUILD_FUNCTION_XENSTORE        (1 << 30)
+#define BUILD_FUNCTION_INITIAL_DOM     (1U << 31)
+    uint32_t functions;
+                                                /* On     | Off    */
+#define BUILD_MODE_PARAVIRTUALIZED     (1 << 0) /* PV     | PVH/HVM */
+#define BUILD_MODE_ENABLE_DEVICE_MODEL (1 << 1) /* HVM    | PVH     */
+#define BUILD_MODE_LONG                (1 << 2) /* 64 BIT | 32 BIT  */
+    uint32_t mode;
+
+    domid_t domid;
+    uint8_t uuid[16];
+
+    uint32_t ncpus;
+    struct arch_domain_mem meminfo;
+
+#define BUILD_MAX_SECID_LEN 64
+    unsigned char secid[BUILD_MAX_SECID_LEN];
+
+    struct boot_module *kernel;
+    struct boot_module *ramdisk;
+#define BUILD_MAX_CONF_MODS 2
+#define BUILD_DTB_CONF_IDX 0
+#define BUILD_DOM_CONF_IDX 1
+    struct boot_module *configs[BUILD_MAX_CONF_MODS];
+
+    struct domain *domain;
+};
+
+#endif
diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
index 477294dc10..1d76d99a40 100644
--- a/xen/include/xen/bootinfo.h
+++ b/xen/include/xen/bootinfo.h
@@ -1,6 +1,7 @@
 #ifndef __XEN_BOOTINFO_H__
 #define __XEN_BOOTINFO_H__
 
+#include <xen/bootdomain.h>
 #include <xen/mm.h>
 #include <xen/types.h>
 
@@ -15,6 +16,7 @@ typedef enum {
     BOOTMOD_XSM,
     BOOTMOD_UCODE,
     BOOTMOD_GUEST_DTB,
+    BOOTMOD_GUEST_CONF,
 }  bootmodule_kind;
 
 typedef enum {
@@ -48,6 +50,8 @@ struct __packed boot_info {
     uint32_t nr_mods;
     struct boot_module *mods;
 
+    struct domain_builder *builder;
+
     struct arch_boot_info *arch;
 };
 
diff --git a/xen/include/xen/domain_builder.h b/xen/include/xen/domain_builder.h
new file mode 100644
index 0000000000..79785ef251
--- /dev/null
+++ b/xen/include/xen/domain_builder.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef XEN_DOMAIN_BUILDER_H
+#define XEN_DOMAIN_BUILDER_H
+
+#include <xen/bootdomain.h>
+#include <xen/bootinfo.h>
+
+#include <asm/setup.h>
+
+struct domain_builder {
+    bool fdt_enabled;
+#define BUILD_MAX_BOOT_DOMAINS 64
+    uint16_t nr_doms;
+    struct boot_domain domains[BUILD_MAX_BOOT_DOMAINS];
+
+    struct arch_domain_builder *arch;
+};
+
+static inline bool builder_is_initdom(struct boot_domain *bd)
+{
+    return bd->functions & BUILD_FUNCTION_INITIAL_DOM;
+}
+
+static inline bool builder_is_ctldom(struct boot_domain *bd)
+{
+    return (bd->functions & BUILD_FUNCTION_INITIAL_DOM) ||
+           (bd->permissions & BUILD_PERMISSION_CONTROL);
+}
+
+static inline bool builder_is_hwdom(struct boot_domain *bd)
+{
+    return (bd->functions & BUILD_FUNCTION_INITIAL_DOM) ||
+           (bd->permissions & BUILD_PERMISSION_HARDWARE);
+}
+
+static inline struct domain *builder_get_hwdom(struct boot_info *info)
+{
+    int i;
+
+    for ( i = 0; i < info->builder->nr_doms; i++ )
+    {
+        struct boot_domain *d = &info->builder->domains[i];
+
+        if ( builder_is_hwdom(d) )
+            return d->domain;
+    }
+
+    return NULL;
+}
+
+void builder_init(struct boot_info *info);
+uint32_t builder_create_domains(struct boot_info *info);
+
+#endif /* XEN_DOMAIN_BUILDER_H */
-- 
2.20.1
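
To make the semantics of the relocated memsize helpers easier to follow, here is a standalone C sketch that copies struct memsize and the two helpers from this patch so they can be exercised outside of Xen. The only additions are the NULL check and the comments; the example values mentioned afterwards are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copied from the patch: a size given in pages, a percentage, or both. */
struct memsize {
    long nr_pages;
    unsigned int percent;
    bool minus;          /* true: the amount is subtracted from 'avail' */
};

/* True for a positive absolute page count that is not a "minus" amount. */
static bool memsize_gt_zero(const struct memsize *sz)
{
    return !sz->minus && sz->nr_pages;
}

/*
 * Resolve a memsize against the number of available pages:
 * pages = nr_pages + percent% of avail, optionally subtracted from avail.
 */
static unsigned long get_memsize(const struct memsize *sz,
                                 unsigned long avail)
{
    unsigned long pages;

    assert(sz != NULL); /* added for the sketch, not in the patch */

    pages = sz->nr_pages + sz->percent * avail / 100;
    return sz->minus ? avail - pages : pages;
}
```

For example, with 10000 pages available, a memsize of 1000 pages resolves to 1000, a memsize of 25 percent resolves to 2500, and a "minus" memsize of 1000 pages resolves to 9000.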




* [PATCH v1 10/18] x86: introduce the domain builder
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (8 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 09/18] x86: introduce abstractions for domain builder Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-18 13:59   ` Smith, Jackson
  2022-07-26 14:46   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 11/18] x86: initial conversion to " Daniel P. Smith
                   ` (8 subsequent siblings)
  18 siblings, 2 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini

This commit introduces the domain builder configuration FDT parser along with
the domain builder core for domain creation. To enable domain builder to be a
cross architecture internal API, a new arch domain creation call is introduced
for use by the domain builder.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/setup.c               |   9 +
 xen/common/Makefile                |   1 +
 xen/common/domain-builder/Makefile |   2 +
 xen/common/domain-builder/core.c   |  96 ++++++++++
 xen/common/domain-builder/fdt.c    | 295 +++++++++++++++++++++++++++++
 xen/common/domain-builder/fdt.h    |   7 +
 xen/include/xen/bootinfo.h         |  16 ++
 xen/include/xen/domain_builder.h   |   1 +
 8 files changed, 427 insertions(+)
 create mode 100644 xen/common/domain-builder/Makefile
 create mode 100644 xen/common/domain-builder/core.c
 create mode 100644 xen/common/domain-builder/fdt.c
 create mode 100644 xen/common/domain-builder/fdt.h

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index e4060d6219..28dbfcd209 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1,4 +1,6 @@
+#include <xen/bootdomain.h>
 #include <xen/bootinfo.h>
+#include <xen/domain_builder.h>
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/err.h>
@@ -826,6 +828,13 @@ static struct domain *__init create_dom0(const struct boot_info *bi)
     return d;
 }
 
+void __init arch_create_dom(
+    const struct boot_info *bi, struct boot_domain *bd)
+{
+    if ( builder_is_initdom(bd) )
+        create_dom0(bi);
+}
+
 /* How much of the directmap is prebuilt at compile time. */
 #define PREBUILT_MAP_LIMIT (1 << L2_PAGETABLE_SHIFT)
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index ebd3e2d659..eb108fa107 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -72,6 +72,7 @@ extra-y := symbols-dummy.o
 obj-$(CONFIG_COVERAGE) += coverage/
 obj-y += sched/
 obj-$(CONFIG_UBSAN) += ubsan/
+obj-y += domain-builder/
 
 obj-$(CONFIG_NEEDS_LIBELF) += libelf/
 obj-$(CONFIG_CORE_DEVICE_TREE) += libfdt/
diff --git a/xen/common/domain-builder/Makefile b/xen/common/domain-builder/Makefile
new file mode 100644
index 0000000000..9561602502
--- /dev/null
+++ b/xen/common/domain-builder/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_BUILDER_FDT) += fdt.o
+obj-y += core.o
diff --git a/xen/common/domain-builder/core.c b/xen/common/domain-builder/core.c
new file mode 100644
index 0000000000..b030b07d71
--- /dev/null
+++ b/xen/common/domain-builder/core.c
@@ -0,0 +1,96 @@
+#include <xen/bootdomain.h>
+#include <xen/bootinfo.h>
+#include <xen/domain_builder.h>
+#include <xen/init.h>
+#include <xen/types.h>
+
+#include <asm/bzimage.h>
+#include <asm/setup.h>
+
+#include "fdt.h"
+
+static struct domain_builder __initdata builder;
+
+void __init builder_init(struct boot_info *info)
+{
+    struct boot_domain *d = NULL;
+
+    info->builder = &builder;
+
+    if ( IS_ENABLED(CONFIG_BUILDER_FDT) )
+    {
+        /* fdt is required to be module 0 */
+        switch ( check_fdt(info, __va(info->mods[0].start)) )
+        {
+        case 0:
+            printk("Domain Builder: initialized from config\n");
+            info->builder->fdt_enabled = true;
+            return;
+        case -EINVAL:
+            info->builder->fdt_enabled = false;
+            break;
+        case -ENODATA:
+        default:
+            panic("%s: error occurred processing DTB\n", __func__);
+        }
+    }
+
+    /*
+     * No FDT config support or an FDT wasn't present, do an initial
+     * domain construction
+     */
+    printk("Domain Builder: falling back to initial domain build\n");
+    info->builder->nr_doms = 1;
+    d = &info->builder->domains[0];
+
+    d->mode = opt_dom0_pvh ? 0 : BUILD_MODE_PARAVIRTUALIZED;
+
+    d->kernel = &info->mods[0];
+    d->kernel->kind = BOOTMOD_KERNEL;
+
+    d->permissions = BUILD_PERMISSION_CONTROL | BUILD_PERMISSION_HARDWARE;
+    d->functions = BUILD_FUNCTION_CONSOLE | BUILD_FUNCTION_XENSTORE |
+                     BUILD_FUNCTION_INITIAL_DOM;
+
+    d->kernel->arch->headroom = bzimage_headroom(bootstrap_map(d->kernel),
+                                                   d->kernel->size);
+    bootstrap_map(NULL);
+
+    if ( d->kernel->string.len )
+        d->kernel->string.kind = BOOTSTR_CMDLINE;
+}
+
+uint32_t __init builder_create_domains(struct boot_info *info)
+{
+    uint32_t build_count = 0, functions_built = 0;
+    int i;
+
+    for ( i = 0; i < info->builder->nr_doms; i++ )
+    {
+        struct boot_domain *d = &info->builder->domains[i];
+
+        if ( ! IS_ENABLED(CONFIG_MULTIDOM_BUILDER) &&
+             ! builder_is_initdom(d) &&
+             functions_built & BUILD_FUNCTION_INITIAL_DOM )
+            continue;
+
+        if ( d->kernel == NULL )
+        {
+            if ( builder_is_initdom(d) )
+                panic("%s: initial domain missing kernel\n", __func__);
+
+            printk(XENLOG_ERR "%s: Dom%d definition has no kernel\n", __func__,
+                    d->domid);
+            continue;
+        }
+
+        arch_create_dom(info, d);
+        if ( d->domain )
+        {
+            functions_built |= d->functions;
+            build_count++;
+        }
+    }
+
+    return build_count;
+}
diff --git a/xen/common/domain-builder/fdt.c b/xen/common/domain-builder/fdt.c
new file mode 100644
index 0000000000..937cc61e7a
--- /dev/null
+++ b/xen/common/domain-builder/fdt.c
@@ -0,0 +1,295 @@
+#include <xen/bootdomain.h>
+#include <xen/bootinfo.h>
+#include <xen/domain_builder.h>
+#include <xen/fdt.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/page-size.h>
+#include <xen/pfn.h>
+#include <xen/types.h>
+
+#include <asm/bzimage.h>
+#include <asm/setup.h>
+
+#include "fdt.h"
+
+#define BUILDER_FDT_TARGET_UNK 0
+#define BUILDER_FDT_TARGET_X86 1
+#define BUILDER_FDT_TARGET_ARM 2
+static int __initdata target_arch = BUILDER_FDT_TARGET_UNK;
+
+static struct boot_module *read_module(
+    const void *fdt, int node, uint32_t address_cells, uint32_t size_cells,
+    struct boot_info *info)
+{
+    const struct fdt_property *prop;
+    const __be32 *cell;
+    struct boot_module *bm;
+    bootmodule_kind kind = BOOTMOD_UNKNOWN;
+    int len;
+
+    if ( device_tree_node_compatible(fdt, node, "module,kernel") )
+        kind = BOOTMOD_KERNEL;
+
+    if ( device_tree_node_compatible(fdt, node, "module,ramdisk") )
+        kind = BOOTMOD_RAMDISK;
+
+    if ( device_tree_node_compatible(fdt, node, "module,microcode") )
+        kind = BOOTMOD_UCODE;
+
+    if ( device_tree_node_compatible(fdt, node, "module,xsm-policy") )
+        kind = BOOTMOD_XSM;
+
+    if ( device_tree_node_compatible(fdt, node, "module,config") )
+        kind = BOOTMOD_GUEST_CONF;
+
+    if ( device_tree_node_compatible(fdt, node, "module,index") )
+    {
+        uint32_t idx;
+
+        idx = (uint32_t)device_tree_get_u32(fdt, node, "module-index", 0);
+        if ( idx == 0 )
+            return NULL;
+
+        bm = &info->mods[idx];
+
+        bm->kind = kind;
+
+        return bm;
+    }
+
+    if ( device_tree_node_compatible(fdt, node, "module,addr") )
+    {
+        uint64_t addr, size;
+
+        prop = fdt_get_property(fdt, node, "module-addr", &len);
+        if ( !prop )
+            return NULL;
+
+        if ( len < dt_cells_to_size(address_cells + size_cells) )
+            return NULL;
+
+        cell = (const __be32 *)prop->data;
+        device_tree_get_reg(
+            &cell, address_cells, size_cells, &addr, &size);
+
+        bm = bootmodule_next_by_addr(info, addr, NULL);
+
+        bm->kind = kind;
+
+        return bm;
+    }
+
+    printk(XENLOG_WARNING
+           "builder fdt: module node %d, no index or addr provided\n",
+           node);
+
+    return NULL;
+}
+
+static int process_config_node(
+    const void *fdt, int node, const char *name, int depth,
+    uint32_t address_cells, uint32_t size_cells, void *data)
+{
+    struct boot_info *info = (struct boot_info *)data;
+    int node_next;
+
+    if ( !info )
+        return -1;
+
+    for ( node_next = fdt_first_subnode(fdt, node);
+          node_next > 0;
+          node_next = fdt_next_subnode(fdt, node_next))
+        read_module(fdt, node_next, address_cells, size_cells, info);
+
+    return 0;
+}
+
+static int process_domain_node(
+    const void *fdt, int node, const char *name, int depth,
+    uint32_t address_cells, uint32_t size_cells, void *data)
+{
+    struct boot_info *info = (struct boot_info *)data;
+    const struct fdt_property *prop;
+    struct boot_domain *domain;
+    int node_next, i, plen;
+
+    if ( !info )
+        return -1;
+
+    if ( info->builder->nr_doms >= BUILD_MAX_BOOT_DOMAINS )
+        return -1;
+
+    domain = &info->builder->domains[info->builder->nr_doms];
+
+    domain->domid = (domid_t)device_tree_get_u32(fdt, node, "domid", 0);
+    domain->permissions = device_tree_get_u32(fdt, node, "permissions", 0);
+    domain->functions = device_tree_get_u32(fdt, node, "functions", 0);
+    domain->mode = device_tree_get_u32(fdt, node, "mode", 0);
+
+    prop = fdt_get_property(fdt, node, "domain-uuid", &plen);
+    if ( prop )
+        for ( i = 0; i < sizeof(domain->uuid) / sizeof(uint32_t); i++ )
+            *(domain->uuid + i) = fdt32_to_cpu((uint32_t)prop->data[i]);
+
+    domain->ncpus = device_tree_get_u32(fdt, node, "cpus", 1);
+
+    if ( target_arch == BUILDER_FDT_TARGET_X86 )
+    {
+        prop = fdt_get_property(fdt, node, "memory", &plen);
+        if ( prop )
+        {
+            int sz = fdt32_to_cpu(prop->len);
+            char s[64];
+            unsigned long val;
+
+            if ( sz >= 64 )
+                panic("node %s invalid `memory' property\n", name);
+
+            memcpy(s, prop->data, sz);
+            s[sz] = '\0';
+            val = parse_size_and_unit(s, NULL);
+
+            domain->meminfo.mem_size.nr_pages = PFN_UP(val);
+            domain->meminfo.mem_max.nr_pages = PFN_UP(val);
+        }
+        else
+            panic("node %s missing `memory' property\n", name);
+    }
+    else
+        panic("%s: only x86 memory parsing supported\n", __func__);
+
+    prop = fdt_get_property(fdt, node, "security-id",
+                                &plen);
+    if ( prop )
+    {
+        int sz = fdt32_to_cpu(prop->len);
+        sz = sz > BUILD_MAX_SECID_LEN ?  BUILD_MAX_SECID_LEN : sz;
+        memcpy(domain->secid, prop->data, sz);
+    }
+
+    for ( node_next = fdt_first_subnode(fdt, node);
+          node_next > 0;
+          node_next = fdt_next_subnode(fdt, node_next))
+    {
+        struct boot_module *bm = read_module(fdt, node_next, address_cells,
+                                             size_cells, info);
+
+        switch ( bm->kind )
+        {
+        case BOOTMOD_KERNEL:
+            /* kernel was already found */
+            if ( domain->kernel != NULL )
+                continue;
+
+            bm->arch->headroom = bzimage_headroom(bootstrap_map(bm), bm->size);
+            bootstrap_map(NULL);
+
+            if ( bm->string.len )
+                bm->string.kind = BOOTSTR_CMDLINE;
+            else
+            {
+                prop = fdt_get_property(fdt, node_next, "bootargs", &plen);
+                if ( prop )
+                {
+                    int size = fdt32_to_cpu(prop->len);
+                    size = size > BOOTMOD_MAX_STRING ?
+                           BOOTMOD_MAX_STRING : size;
+                    memcpy(bm->string.bytes, prop->data, size);
+                    bm->string.kind = BOOTSTR_CMDLINE;
+                }
+            }
+
+            domain->kernel = bm;
+
+            break;
+        case BOOTMOD_RAMDISK:
+            /* ramdisk was already found */
+            if ( domain->ramdisk != NULL )
+                continue;
+
+            domain->ramdisk = bm;
+
+            break;
+        case BOOTMOD_GUEST_CONF:
+            /* guest config was already found */
+            if ( domain->configs[BUILD_DOM_CONF_IDX] != NULL )
+                continue;
+
+            domain->configs[BUILD_DOM_CONF_IDX] = bm;
+
+            break;
+        default:
+            continue;
+        }
+    }
+
+    info->builder->nr_doms++;
+
+    return 0;
+}
+
+static int __init scan_node(
+    const void *fdt, int node, const char *name, int depth, u32 address_cells,
+    u32 size_cells, void *data)
+{
+    int rc = -1;
+
+    /* skip nodes that are not direct children of the hyperlaunch node */
+    if ( depth > 1 )
+        return 0;
+
+    if ( device_tree_node_compatible(fdt, node, "xen,config") )
+        rc = process_config_node(fdt, node, name, depth,
+                                 address_cells, size_cells, data);
+    else if ( device_tree_node_compatible(fdt, node, "xen,domain") )
+        rc = process_domain_node(fdt, node, name, depth,
+                                 address_cells, size_cells, data);
+
+    if ( rc < 0 )
+        printk("hyperlaunch fdt: node `%s' failed to parse\n", name);
+
+    return rc;
+}
+
+/* check_fdt
+ *   Attempts to initialize hyperlaunch config
+ *
+ * Returns:
+ *    -EINVAL: Not a valid DTB
+ *   -ENODATA: Valid DTB but not a valid hyperlaunch device tree
+ *          0: Valid hyperlaunch device tree
+ */
+int __init check_fdt(struct boot_info *info, void *fdt)
+{
+    int hv_node, ret;
+
+    ret = fdt_check_header(fdt);
+    if ( ret < 0 )
+        return -EINVAL;
+
+    hv_node = fdt_path_offset(fdt, "/chosen/hypervisor");
+    if ( hv_node < 0 )
+        return -ENODATA;
+
+    if ( !device_tree_node_compatible(fdt, hv_node, "hypervisor,xen") )
+        return -EINVAL;
+
+    if ( IS_ENABLED(CONFIG_X86) &&
+         device_tree_node_compatible(fdt, hv_node, "xen,x86") )
+        target_arch = BUILDER_FDT_TARGET_X86;
+    else if ( IS_ENABLED(CONFIG_ARM) &&
+              device_tree_node_compatible(fdt, hv_node, "xen,arm") )
+        target_arch = BUILDER_FDT_TARGET_ARM;
+
+    if ( target_arch != BUILDER_FDT_TARGET_X86 &&
+         target_arch != BUILDER_FDT_TARGET_ARM )
+        return -EINVAL;
+
+    ret = device_tree_for_each_node(fdt, hv_node, scan_node, info);
+    if ( ret > 0 )
+        return -ENODATA;
+
+    return 0;
+}
diff --git a/xen/common/domain-builder/fdt.h b/xen/common/domain-builder/fdt.h
new file mode 100644
index 0000000000..b185718412
--- /dev/null
+++ b/xen/common/domain-builder/fdt.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef COMMON_BUILDER_FDT_H
+#define COMMON_BUILDER_FDT_H
+
+int __init check_fdt(struct boot_info *info, void *fdt);
+#endif
diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
index 1d76d99a40..07b151e318 100644
--- a/xen/include/xen/bootinfo.h
+++ b/xen/include/xen/bootinfo.h
@@ -101,6 +101,22 @@ static inline struct boot_module *bootmodule_next_by_kind(
     return NULL;
 }
 
+static inline struct boot_module *bootmodule_next_by_addr(
+    const struct boot_info *bi, paddr_t addr, struct boot_module *start)
+{
+    /* point end at the entry for xen */
+    struct boot_module *end = &bi->mods[bi->nr_mods];
+
+    if ( !start )
+        start = bi->mods;
+
+    for ( ; start < end; start++ )
+        if ( start->start == addr )
+            return start;
+
+    return NULL;
+}
+
 static inline void bootmodule_update_start(struct boot_module *b, paddr_t new_start)
 {
     b->start = new_start;
diff --git a/xen/include/xen/domain_builder.h b/xen/include/xen/domain_builder.h
index 79785ef251..c0d997f7bd 100644
--- a/xen/include/xen/domain_builder.h
+++ b/xen/include/xen/domain_builder.h
@@ -51,5 +51,6 @@ static inline struct domain *builder_get_hwdom(struct boot_info *info)
 
 void builder_init(struct boot_info *info);
 uint32_t builder_create_domains(struct boot_info *info);
+void arch_create_dom(const struct boot_info *bi, struct boot_domain *bd);
 
 #endif /* XEN_DOMAIN_BUILDER_H */
-- 
2.20.1
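
For readers following the flow of builder_create_domains() in core.c above, here is a minimal standalone sketch of the loop, not the patch's code. All sketch_* names, the flag value, and the struct layout are hypothetical stand-ins. The sketch assumes, as arch_create_dom() does at this point in the series, that the arch hook only builds the initial domain, and it omits the panic for a kernel-less initial domain and the CONFIG_MULTIDOM_BUILDER check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value; the real BUILD_FUNCTION_* bits are not shown here. */
#define SKETCH_FUNCTION_INITIAL_DOM (1u << 0)

/* Hypothetical, reduced stand-in for struct boot_domain. */
struct sketch_boot_domain {
    uint32_t functions;
    bool has_kernel;   /* stands in for bd->kernel != NULL */
    bool built;        /* stands in for bd->domain != NULL */
};

/* Hypothetical arch hook: only the initial domain can be built. */
static void sketch_arch_create_dom(struct sketch_boot_domain *bd)
{
    if ( bd->functions & SKETCH_FUNCTION_INITIAL_DOM )
        bd->built = true;
}

/* Walk every domain definition and return how many were actually built. */
uint32_t sketch_create_domains(struct sketch_boot_domain *doms, unsigned int nr)
{
    uint32_t build_count = 0;
    unsigned int i;

    for ( i = 0; i < nr; i++ )
    {
        /* A definition without a kernel cannot be built, so skip it. */
        if ( !doms[i].has_kernel )
            continue;

        sketch_arch_create_dom(&doms[i]);

        /* Count only domains the arch hook really created. */
        if ( doms[i].built )
            build_count++;
    }

    return build_count;
}
```

The return value counts successful builds rather than definitions, which is why a caller still has to check separately that a hardware domain exists.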




* [PATCH v1 11/18] x86: initial conversion to domain builder
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (9 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 10/18] x86: introduce the " Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-26 15:01   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 12/18] x86: convert dom0 creation " Daniel P. Smith
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné

This commit is the first step in adopting the domain builder. It converts the
dom0 creation and construction functions to consume struct boot_domain and
changes the startup sequence to use the domain builder to create and construct
dom0.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
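
As a reading aid for the signature change in this patch, here is a minimal standalone sketch of the idea: the old (d, image, initrd, cmdline) argument list collapses into one descriptor, and the constructor dispatches on guest type. The struct layouts and every sketch_* name are hypothetical stand-ins, not Xen's definitions; only the field names domid, domain, kernel and ramdisk are taken from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's types; only the fields used here. */
struct boot_module {
    const char *cmdline;        /* stands in for string.bytes */
};

struct domain {
    bool is_hvm;                /* stands in for is_hvm_domain(d) */
    bool vcpu0_initialised;
};

/* One descriptor replaces the old (d, image, initrd, cmdline) arguments. */
struct boot_domain {
    unsigned int domid;
    struct domain *domain;
    struct boot_module *kernel;
    struct boot_module *ramdisk;    /* NULL when no ramdisk was supplied */
};

/* Placeholder builders: real ones would load the kernel and set up vCPU 0. */
static int sketch_construct_pvh(struct boot_domain *bd)
{
    bd->domain->vcpu0_initialised = true;
    return 0;
}

static int sketch_construct_pv(struct boot_domain *bd)
{
    bd->domain->vcpu0_initialised = true;
    return 0;
}

/* Dispatch on guest type, reading everything through the descriptor. */
int sketch_construct_domain(struct boot_domain *bd)
{
    int rc;

    /* A kernel is mandatory; the ramdisk is optional. */
    if ( bd->domain == NULL || bd->kernel == NULL )
        return -1;

    rc = bd->domain->is_hvm ? sketch_construct_pvh(bd)
                            : sketch_construct_pv(bd);
    if ( rc )
        return rc;

    /* Mirrors the closing BUG_ON() sanity check in the patch. */
    assert(bd->domain->vcpu0_initialised);

    return 0;
}
```

Passing a single descriptor means later patches can add per-domain fields without touching every constructor prototype again.
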
 xen/arch/x86/dom0_build.c             |  30 +++----
 xen/arch/x86/hvm/dom0_build.c         |  10 +--
 xen/arch/x86/include/asm/dom0_build.h |   8 +-
 xen/arch/x86/include/asm/setup.h      |   5 +-
 xen/arch/x86/pv/dom0_build.c          |  39 ++++-----
 xen/arch/x86/setup.c                  | 114 +++++++++++++++-----------
 6 files changed, 109 insertions(+), 97 deletions(-)
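
The command-line handling in arch_builder_apply_cmdline() below appends "noapic" and "acpi=<value>" only when the kernel command line does not already carry them. A minimal standalone sketch of that append-if-absent pattern follows. The helper names are hypothetical, and unlike the strlcat() calls in the patch, which truncate silently, this sketch refuses an append that does not fit and reports it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append only if the result fits, else leave buf as is. */
static bool sketch_append_bounded(char *buf, size_t cap, const char *extra)
{
    size_t used = strlen(buf);
    size_t need = strlen(extra);

    if ( used + need + 1 > cap )
        return false;

    memcpy(buf + used, extra, need + 1);   /* copies the terminating NUL too */
    return true;
}

/*
 * Add " noapic" and " acpi=<param>" unless an equivalent option is already
 * present. Returns false if any option had to be dropped for lack of space.
 */
bool sketch_apply_cmdline(char *cmdline, size_t cap, bool skip_ioapic,
                          const char *acpi_param)
{
    bool ok = true;

    if ( skip_ioapic && !strstr(cmdline, "noapic") )
        ok = sketch_append_bounded(cmdline, cap, " noapic") && ok;

    if ( acpi_param[0] != '\0' && !strstr(cmdline, "acpi=") )
    {
        char opt[32];
        int n = snprintf(opt, sizeof(opt), " acpi=%s", acpi_param);

        /* Build the whole option first so it is never appended in halves. */
        if ( n < 0 || (size_t)n >= sizeof(opt) ||
             !sketch_append_bounded(cmdline, cap, opt) )
            ok = false;
    }

    return ok;
}
```

Calling it twice on the same buffer leaves the command line unchanged the second time, which is the property the strstr() checks in the patch are there to provide.
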

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index e44f7f3c43..216c9e3590 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -6,6 +6,7 @@
 
 #include <xen/bootdomain.h>
 #include <xen/bootinfo.h>
+#include <xen/domain_builder.h>
 #include <xen/init.h>
 #include <xen/iocap.h>
 #include <xen/libelf.h>
@@ -556,31 +557,32 @@ int __init dom0_setup_permissions(struct domain *d)
     return rc;
 }
 
-int __init construct_dom0(
-    struct domain *d, const struct boot_module *image,
-    struct boot_module *initrd, char *cmdline)
+int __init construct_domain(struct boot_domain *bd)
 {
-    int rc;
+    int rc = 0;
 
     /* Sanity! */
-    BUG_ON(!pv_shim && d->domain_id != 0);
-    BUG_ON(d->vcpu[0] == NULL);
-    BUG_ON(d->vcpu[0]->is_initialised);
+    BUG_ON(!pv_shim && bd->domid != 0);
+    BUG_ON(bd->domain->vcpu[0] == NULL);
+    BUG_ON(bd->domain->vcpu[0]->is_initialised);
 
     process_pending_softirqs();
 
-    if ( is_hvm_domain(d) )
-        rc = dom0_construct_pvh(d, image, initrd, cmdline);
-    else if ( is_pv_domain(d) )
-        rc = dom0_construct_pv(d, image, initrd, cmdline);
-    else
-        panic("Cannot construct Dom0. No guest interface available\n");
+    if ( builder_is_initdom(bd) )
+    {
+        if ( is_hvm_domain(bd->domain) )
+            rc = dom0_construct_pvh(bd);
+        else if ( is_pv_domain(bd->domain) )
+            rc = dom0_construct_pv(bd);
+        else
+            panic("Cannot construct Dom0. No guest interface available\n");
+    }
 
     if ( rc )
         return rc;
 
     /* Sanity! */
-    BUG_ON(!d->vcpu[0]->is_initialised);
+    BUG_ON(!bd->domain->vcpu[0]->is_initialised);
 
     return 0;
 }
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 4e903a848d..2fee2ed926 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -19,6 +19,7 @@
  */
 
 #include <xen/acpi.h>
+#include <xen/bootdomain.h>
 #include <xen/bootinfo.h>
 #include <xen/init.h>
 #include <xen/libelf.h>
@@ -1217,10 +1218,9 @@ static void __hwdom_init pvh_setup_mmcfg(struct domain *d)
     }
 }
 
-int __init dom0_construct_pvh(
-    struct domain *d, const struct boot_module *image,
-    struct boot_module *initrd, char *cmdline)
+int __init dom0_construct_pvh(struct boot_domain *bd)
 {
+    struct domain *d = bd->domain;
     paddr_t entry, start_info;
     int rc;
 
@@ -1249,8 +1249,8 @@ int __init dom0_construct_pvh(
         return rc;
     }
 
-    rc = pvh_load_kernel(d, image, initrd, bootstrap_map(image),
-                         cmdline, &entry, &start_info);
+    rc = pvh_load_kernel(d, bd->kernel, bd->ramdisk, bootstrap_map(bd->kernel),
+                         bd->kernel->string.bytes, &entry, &start_info);
     if ( rc )
     {
         printk("Failed to load Dom0 kernel\n");
diff --git a/xen/arch/x86/include/asm/dom0_build.h b/xen/arch/x86/include/asm/dom0_build.h
index ad33413710..571b25ea71 100644
--- a/xen/arch/x86/include/asm/dom0_build.h
+++ b/xen/arch/x86/include/asm/dom0_build.h
@@ -14,13 +14,9 @@ unsigned long dom0_compute_nr_pages(struct domain *d,
                                     unsigned long initrd_len);
 int dom0_setup_permissions(struct domain *d);
 
-int __init dom0_construct_pv(
-    struct domain *d, const struct boot_module *image,
-    struct boot_module *initrd, char *cmdline);
+int dom0_construct_pv(struct boot_domain *bd);
 
-int __init dom0_construct_pvh(
-    struct domain *d, const struct boot_module *image,
-    struct boot_module *initrd, char *cmdline);
+int dom0_construct_pvh(struct boot_domain *bd);
 
 unsigned long dom0_paging_pages(const struct domain *d,
                                 unsigned long nr_pages);
diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h
index 27c0d61819..f9c1468fcc 100644
--- a/xen/arch/x86/include/asm/setup.h
+++ b/xen/arch/x86/include/asm/setup.h
@@ -33,9 +33,8 @@ void vesa_init(void);
 static inline void vesa_init(void) {};
 #endif
 
-int construct_dom0(
-    struct domain *d, const struct boot_module *image,
-    struct boot_module *initrd, char *cmdline);
+int construct_domain(struct boot_domain *bd);
+
 void setup_io_bitmap(struct domain *d);
 
 unsigned long initial_images_nrpages(nodeid_t node);
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index f6131147ef..78ebb03b1b 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -4,6 +4,7 @@
  * Copyright (c) 2002-2005, K A Fraser
  */
 
+#include <xen/bootdomain.h>
 #include <xen/bootinfo.h>
 #include <xen/console.h>
 #include <xen/domain.h>
@@ -295,9 +296,7 @@ static struct page_info * __init alloc_chunk(struct domain *d,
     return page;
 }
 
-int __init dom0_construct_pv(
-    struct domain *d, const struct boot_module *image,
-    struct boot_module *initrd, char *cmdline)
+int __init dom0_construct_pv(struct boot_domain *bd)
 {
     int i, rc, order, machine;
     bool compatible, compat;
@@ -311,11 +310,12 @@ int __init dom0_construct_pv(
     unsigned long count;
     struct page_info *page = NULL;
     start_info_t *si;
+    struct domain *d = bd->domain;
     struct vcpu *v = d->vcpu[0];
-    void *image_base = bootstrap_map(image);
-    unsigned long image_len = image->size;
-    void *image_start = image_base + image->arch->headroom;
-    unsigned long initrd_len = initrd ? initrd->size : 0;
+    void *image_base = bootstrap_map(bd->kernel);
+    unsigned long image_len = bd->kernel->size;
+    void *image_start = image_base + bd->kernel->arch->headroom;
+    unsigned long initrd_len = bd->ramdisk ? bd->ramdisk->size : 0;
     l4_pgentry_t *l4tab = NULL, *l4start = NULL;
     l3_pgentry_t *l3tab = NULL, *l3start = NULL;
     l2_pgentry_t *l2tab = NULL, *l2start = NULL;
@@ -355,7 +355,7 @@ int __init dom0_construct_pv(
     d->max_pages = ~0U;
 
     if ( (rc =
-          bzimage_parse(image_base, &image_start, image->arch->headroom,
+          bzimage_parse(image_base, &image_start, bd->kernel->arch->headroom,
                          &image_len)) != 0 )
         return rc;
 
@@ -545,7 +545,7 @@ int __init dom0_construct_pv(
         initrd_pfn = vinitrd_start ?
                      (vinitrd_start - v_start) >> PAGE_SHIFT :
                      domain_tot_pages(d);
-        initrd_mfn = mfn = mfn_x(initrd->mfn);
+        initrd_mfn = mfn = mfn_x(bd->ramdisk->mfn);
         count = PFN_UP(initrd_len);
         if ( d->arch.physaddr_bitsize &&
              ((mfn + count - 1) >> (d->arch.physaddr_bitsize - PAGE_SHIFT)) )
@@ -560,13 +560,13 @@ int __init dom0_construct_pv(
                     free_domheap_pages(page, order);
                     page += 1UL << order;
                 }
-            memcpy(page_to_virt(page), maddr_to_virt(initrd->start),
+            memcpy(page_to_virt(page), maddr_to_virt(bd->ramdisk->start),
                    initrd_len);
-            mpt_alloc = initrd->start;
+            mpt_alloc = bd->ramdisk->start;
             init_domheap_pages(mpt_alloc,
                                mpt_alloc + PAGE_ALIGN(initrd_len));
-            bootmodule_update_mfn(initrd, page_to_mfn(page));
-            initrd_mfn = mfn_x(initrd->mfn);
+            bootmodule_update_mfn(bd->ramdisk, page_to_mfn(page));
+            initrd_mfn = mfn_x(bd->ramdisk->mfn);
         }
         else
         {
@@ -574,7 +574,7 @@ int __init dom0_construct_pv(
                 if ( assign_pages(mfn_to_page(_mfn(mfn++)), 1, d, 0) )
                     BUG();
         }
-        initrd->size = 0;
+        bd->ramdisk->size = 0;
     }
 
     printk("PHYSICAL MEMORY ARRANGEMENT:\n"
@@ -583,9 +583,9 @@ int __init dom0_construct_pv(
     if ( domain_tot_pages(d) < nr_pages )
         printk(" (%lu pages to be allocated)",
                nr_pages - domain_tot_pages(d));
-    if ( initrd )
+    if ( bd->ramdisk )
     {
-        mpt_alloc = initrd->start;
+        mpt_alloc = bd->ramdisk->start;
         printk("\n Init. ramdisk: %"PRIpaddr"->%"PRIpaddr,
                mpt_alloc, mpt_alloc + initrd_len);
     }
@@ -806,7 +806,7 @@ int __init dom0_construct_pv(
         if ( pfn >= initrd_pfn )
         {
             if ( pfn < initrd_pfn + PFN_UP(initrd_len) )
-                mfn = mfn_x(initrd->mfn) + (pfn - initrd_pfn);
+                mfn = mfn_x(bd->ramdisk->mfn) + (pfn - initrd_pfn);
             else
                 mfn -= PFN_UP(initrd_len);
         }
@@ -866,8 +866,9 @@ int __init dom0_construct_pv(
     }
 
     memset(si->cmd_line, 0, sizeof(si->cmd_line));
-    if ( cmdline != NULL )
-        strlcpy((char *)si->cmd_line, cmdline, sizeof(si->cmd_line));
+    if ( strlen(bd->kernel->string.bytes) > 0 )
+        strlcpy((char *)si->cmd_line, bd->kernel->string.bytes,
+                sizeof(si->cmd_line));
 
 #ifdef CONFIG_VIDEO
     if ( !pv_shim && fill_console_start_info((void *)(si + 1)) )
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 28dbfcd209..860b9e3d64 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -45,7 +45,6 @@
 #include <asm/edd.h>
 #include <xsm/xsm.h>
 #include <asm/tboot.h>
-#include <asm/bzimage.h> /* for bzimage_headroom */
 #include <asm/mach-generic/mach_apic.h> /* for generic_apic_probe */
 #include <asm/setup.h>
 #include <xen/cpu.h>
@@ -272,6 +271,24 @@ static int __init cf_check parse_acpi_param(const char *s)
 }
 custom_param("acpi", parse_acpi_param);
 
+void __init arch_builder_apply_cmdline(
+    struct boot_info *info, struct boot_domain *bd)
+{
+    if ( skip_ioapic_setup && !strstr(bd->kernel->string.bytes, "noapic") )
+        strlcat(bd->kernel->string.bytes, " noapic", MAX_GUEST_CMDLINE);
+    if ( (strlen(acpi_param) == 0) && acpi_disabled )
+    {
+        printk("ACPI is disabled, notifying Domain 0 (acpi=off)\n");
+        strlcpy(acpi_param, "off", sizeof(acpi_param));
+    }
+    if ( (strlen(acpi_param) != 0) &&
+         !strstr(bd->kernel->string.bytes, "acpi=") )
+    {
+        strlcat(bd->kernel->string.bytes, " acpi=", MAX_GUEST_CMDLINE);
+        strlcat(bd->kernel->string.bytes, acpi_param, MAX_GUEST_CMDLINE);
+    }
+}
+
 struct boot_info __initdata *boot_info;
 
 unsigned long __init initial_images_nrpages(nodeid_t node)
@@ -728,7 +745,8 @@ static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int li
     return n;
 }
 
-static struct domain *__init create_dom0(const struct boot_info *bi)
+static struct domain *__init create_dom0(
+    const struct boot_info *bi, struct boot_domain *bd)
 {
     struct xen_domctl_createdomain dom0_cfg = {
         .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
@@ -741,14 +759,10 @@ static struct domain *__init create_dom0(const struct boot_info *bi)
             .misc_flags = opt_dom0_msr_relaxed ? XEN_X86_MSR_RELAXED : 0,
         },
     };
-    struct boot_module *image = bootmodule_next_by_kind(bi, BOOTMOD_KERNEL, 0);
-    struct boot_module *initrd = bootmodule_next_by_kind(bi, BOOTMOD_RAMDISK, 0);
-    struct domain *d;
     char *cmdline;
-    domid_t domid = 0;
 
-    if ( image == NULL )
-        panic("Error creating d%uv0\n", domid);
+    if ( bd->kernel == NULL )
+        panic("Error creating d%uv0\n", bd->domid);
 
     if ( opt_dom0_pvh )
     {
@@ -764,45 +778,46 @@ static struct domain *__init create_dom0(const struct boot_info *bi)
         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
     /* Create initial domain.  Not d0 for pvshim. */
-    domid = get_initial_domain_id();
-    d = domain_create(domid, &dom0_cfg, pv_shim ? 0 : CDF_privileged);
-    if ( IS_ERR(d) )
-        panic("Error creating d%u: %ld\n", domid, PTR_ERR(d));
+    bd->domid = get_initial_domain_id();
+    bd->domain = domain_create(bd->domid, &dom0_cfg, pv_shim ?
+                               0 : CDF_privileged);
+    if ( IS_ERR(bd->domain) )
+        panic("Error creating d%u: %ld\n", bd->domid, PTR_ERR(bd->domain));
 
-    init_dom0_cpuid_policy(d);
+    init_dom0_cpuid_policy(bd->domain);
 
-    if ( alloc_dom0_vcpu0(d) == NULL )
-        panic("Error creating d%uv0\n", domid);
+    if ( alloc_dom0_vcpu0(bd->domain) == NULL )
+        panic("Error creating d%uv0\n", bd->domid);
 
     /* Grab the DOM0 command line. */
-    cmdline = (image->string.kind == BOOTSTR_CMDLINE) ?
-              image->string.bytes : NULL;
+    cmdline = (bd->kernel->string.kind == BOOTSTR_CMDLINE) ?
+              bd->kernel->string.bytes : NULL;
     if ( cmdline || bi->arch->kextra )
     {
-        static char __initdata dom0_cmdline[MAX_GUEST_CMDLINE];
+        char dom0_cmdline[MAX_GUEST_CMDLINE];
 
         cmdline = arch_prepare_cmdline(cmdline, bi->arch);
-        safe_strcpy(dom0_cmdline, cmdline);
+        strlcpy(dom0_cmdline, cmdline, MAX_GUEST_CMDLINE);
 
         if ( bi->arch->kextra )
             /* kextra always includes exactly one leading space. */
-            safe_strcat(dom0_cmdline, bi->arch->kextra);
+            strlcat(dom0_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
 
         /* Append any extra parameters. */
         if ( skip_ioapic_setup && !strstr(dom0_cmdline, "noapic") )
-            safe_strcat(dom0_cmdline, " noapic");
+            strlcat(dom0_cmdline, " noapic", MAX_GUEST_CMDLINE);
         if ( (strlen(acpi_param) == 0) && acpi_disabled )
         {
             printk("ACPI is disabled, notifying Domain 0 (acpi=off)\n");
-            safe_strcpy(acpi_param, "off");
+            strlcpy(acpi_param, "off", sizeof(acpi_param));
         }
         if ( (strlen(acpi_param) != 0) && !strstr(dom0_cmdline, "acpi=") )
         {
-            safe_strcat(dom0_cmdline, " acpi=");
-            safe_strcat(dom0_cmdline, acpi_param);
+            strlcat(dom0_cmdline, " acpi=", MAX_GUEST_CMDLINE);
+            strlcat(dom0_cmdline, acpi_param, MAX_GUEST_CMDLINE);
         }
 
-        cmdline = dom0_cmdline;
+        strlcpy(bd->kernel->string.bytes, dom0_cmdline, MAX_GUEST_CMDLINE);
     }
 
     /*
@@ -816,7 +831,7 @@ static struct domain *__init create_dom0(const struct boot_info *bi)
         write_cr4(read_cr4() & ~X86_CR4_SMAP);
     }
 
-    if ( construct_dom0(d, image, initrd, cmdline) != 0 )
+    if ( construct_domain(bd) != 0 )
         panic("Could not construct domain 0\n");
 
     if ( cpu_has_smap )
@@ -825,14 +840,14 @@ static struct domain *__init create_dom0(const struct boot_info *bi)
         cr4_pv32_mask |= X86_CR4_SMAP;
     }
 
-    return d;
+    return bd->domain;
 }
 
 void __init arch_create_dom(
     const struct boot_info *bi, struct boot_domain *bd)
 {
     if ( builder_is_initdom(bd) )
-        create_dom0(bi);
+        create_dom0(bi, bd);
 }
 
 /* How much of the directmap is prebuilt at compile time. */
@@ -1010,10 +1025,7 @@ void __init noreturn __start_xen(unsigned long bi_p)
                boot_info->nr_mods);
     }
 
-    /* Dom0 kernel is the first boot module */
-    boot_info->mods[0].kind = BOOTMOD_KERNEL;
-    if ( boot_info->mods[0].string.len )
-        boot_info->mods[0].string.kind = BOOTSTR_CMDLINE;
+    builder_init(boot_info);
 
     if ( pvh_boot )
     {
@@ -1168,11 +1180,6 @@ void __init noreturn __start_xen(unsigned long bi_p)
         boot_info->mods[boot_info->nr_mods].size = __2M_rwdata_end - _stext;
     }
 
-    boot_info->mods[0].arch->headroom = bzimage_headroom(
-                                        bootstrap_map(&boot_info->mods[0]),
-                                        boot_info->mods[0].size);
-    bootstrap_map(NULL);
-
 #ifndef highmem_start
     /* Don't allow split below 4Gb. */
     if ( highmem_start < GB(4) )
@@ -1905,22 +1912,29 @@ void __init noreturn __start_xen(unsigned long bi_p)
            cpu_has_nx ? XENLOG_INFO : XENLOG_WARNING "Warning: ",
            cpu_has_nx ? "" : "not ");
 
-    initrdidx = bootmodule_next_idx_by_kind(boot_info, BOOTMOD_UNKNOWN, 0);
-    if ( initrdidx < boot_info->nr_mods )
-        boot_info->mods[initrdidx].kind = BOOTMOD_RAMDISK;
-
-    if ( bootmodule_count_by_kind(boot_info, BOOTMOD_UNKNOWN) > 1 )
-        printk(XENLOG_WARNING
-               "Multiple initrd candidates, picking module #%u\n",
-               initrdidx);
-
     /*
-     * We're going to setup domain0 using the module(s) that we stashed safely
-     * above our heap. The second module, if present, is an initrd ramdisk.
+     * No boot description was provided: check for any remaining boot
+     * modules; the first one found is used as the ramdisk.
      */
-    dom0 = create_dom0(boot_info);
+    if ( ! boot_info->builder->fdt_enabled )
+    {
+        initrdidx = bootmodule_next_idx_by_kind(boot_info, BOOTMOD_UNKNOWN, 0);
+        if ( initrdidx < boot_info->nr_mods )
+        {
+            boot_info->builder->domains[0].ramdisk = &boot_info->mods[initrdidx];
+            boot_info->mods[initrdidx].kind = BOOTMOD_RAMDISK;
+        }
+        if ( bootmodule_count_by_kind(boot_info, BOOTMOD_UNKNOWN) > 1 )
+            printk(XENLOG_WARNING
+                   "Multiple initrd candidates, picking module #%u\n",
+                   initrdidx);
+    }
+
+    builder_create_domains(boot_info);
+
+    dom0 = builder_get_hwdom(boot_info);
     if ( !dom0 )
-        panic("Could not set up DOM0 guest OS\n");
+        panic("No hardware domain was built\n");
 
     heap_init_late();
 
-- 
2.20.1




* [PATCH v1 12/18] x86: convert dom0 creation to domain builder
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (10 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 11/18] x86: initial conversion to " Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-27 12:25   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 13/18] x86: generalize physmap logic Daniel P. Smith
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné

This commit begins the transition over to the domain builder by converting
the dom0 creation logic into generalized domain creation logic.
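
As a rough illustration of the generalized flag selection, here is a
standalone sketch rather than the code in this patch. The MODE_* and CDF_*
constants and select_flags() are hypothetical stand-ins for the
BUILD_MODE_* bits and XEN_DOMCTL_CDF_* flags. A domain with neither the PV
nor the device-model bit set is treated as PVH, and HAP is cleared again
with "&= ~" when shadow paging is requested:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BUILD_MODE_* bits. */
#define MODE_PARAVIRTUALIZED      (1u << 0)
#define MODE_ENABLE_DEVICE_MODEL  (1u << 1)

/* Hypothetical stand-ins for the XEN_DOMCTL_CDF_* flags. */
#define CDF_HVM  (1u << 0)
#define CDF_HAP  (1u << 1)

/* Pick creation flags from the build mode; illustration only. */
static uint32_t select_flags(uint32_t mode, bool hap_supported,
                             bool want_shadow)
{
    uint32_t flags = 0;

    /* Neither PV nor device model requested: treat the domain as PVH. */
    if ( !(mode & (MODE_PARAVIRTUALIZED | MODE_ENABLE_DEVICE_MODEL)) )
    {
        flags |= CDF_HVM | (hap_supported ? CDF_HAP : 0);

        /* Shadow paging requested: clear HAP if it was just set. */
        if ( want_shadow )
            flags &= ~CDF_HAP;
    }

    return flags;
}
```

Note that clearing a bit needs "&= ~BIT"; "|= ~BIT" would instead set
every other bit in the word.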

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/Makefile            |   1 +
 xen/arch/x86/domain_builder.c    | 128 +++++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/setup.h |   1 +
 xen/arch/x86/setup.c             | 109 +++-----------------------
 4 files changed, 141 insertions(+), 98 deletions(-)
 create mode 100644 xen/arch/x86/domain_builder.c

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 177a2ff742..2d5d398551 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -27,6 +27,7 @@ obj-y += desc.o
 obj-bin-y += dmi_scan.init.o
 obj-y += domain.o
 obj-bin-y += dom0_build.init.o
+obj-bin-y += domain_builder.init.o
 obj-y += domain_page.o
 obj-y += e820.o
 obj-y += emul-i8254.o
diff --git a/xen/arch/x86/domain_builder.c b/xen/arch/x86/domain_builder.c
new file mode 100644
index 0000000000..308e1a1c67
--- /dev/null
+++ b/xen/arch/x86/domain_builder.c
@@ -0,0 +1,128 @@
+#include <xen/bootdomain.h>
+#include <xen/bootinfo.h>
+#include <xen/domain.h>
+#include <xen/domain_builder.h>
+#include <xen/err.h>
+#include <xen/grant_table.h>
+#include <xen/iommu.h>
+#include <xen/sched.h>
+
+#include <asm/pv/shim.h>
+#include <asm/setup.h>
+
+extern unsigned long cr4_pv32_mask;
+
+static unsigned int __init dom_max_vcpus(struct boot_domain *bd)
+{
+    unsigned int limit;
+
+    if ( builder_is_initdom(bd) )
+        return dom0_max_vcpus();
+
+    limit = bd->mode & BUILD_MODE_PARAVIRTUALIZED ?
+                MAX_VIRT_CPUS : HVM_MAX_VCPUS;
+
+    if ( bd->ncpus > limit )
+        return limit;
+    else
+        return bd->ncpus;
+}
+
+void __init arch_create_dom(
+    const struct boot_info *bi, struct boot_domain *bd)
+{
+    struct xen_domctl_createdomain dom_cfg = {
+        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
+        .max_evtchn_port = -1,
+        .max_grant_frames = -1,
+        .max_maptrack_frames = -1,
+        .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
+        .max_vcpus = dom_max_vcpus(bd),
+        .arch = {
+            .misc_flags = bd->functions & BUILD_FUNCTION_INITIAL_DOM &&
+                           opt_dom0_msr_relaxed ? XEN_X86_MSR_RELAXED : 0,
+        },
+    };
+    unsigned int is_privileged = 0;
+    char *cmdline;
+
+    if ( bd->kernel == NULL )
+        panic("Error creating d%uv0\n", bd->domid);
+
+    /* mask out PV and device model bits, if 0 then the domain is PVH */
+    if ( !(bd->mode &
+           (BUILD_MODE_PARAVIRTUALIZED|BUILD_MODE_ENABLE_DEVICE_MODEL)) )
+    {
+        dom_cfg.flags |= (XEN_DOMCTL_CDF_hvm |
+                         (hvm_hap_supported() ? XEN_DOMCTL_CDF_hap : 0));
+
+        /*
+         * If shadow paging is enabled for the initial domain, mask out
+         * HAP if it was just enabled.
+         */
+        if ( builder_is_initdom(bd) )
+            if ( opt_dom0_shadow )
+                dom_cfg.flags &= ~XEN_DOMCTL_CDF_hap;
+
+        /* TODO: review which flags should be present */
+        dom_cfg.arch.emulation_flags |=
+            XEN_X86_EMU_LAPIC | XEN_X86_EMU_IOAPIC | XEN_X86_EMU_VPCI;
+    }
+
+    if ( iommu_enabled && builder_is_hwdom(bd) )
+        dom_cfg.flags |= XEN_DOMCTL_CDF_iommu;
+
+    if ( !pv_shim && builder_is_ctldom(bd) )
+        is_privileged = CDF_privileged;
+
+    /* Create initial domain.  Not d0 for pvshim. */
+    bd->domid = get_initial_domain_id();
+    bd->domain = domain_create(bd->domid, &dom_cfg, is_privileged);
+    if ( IS_ERR(bd->domain) )
+        panic("Error creating d%u: %ld\n", bd->domid, PTR_ERR(bd->domain));
+
+    init_dom0_cpuid_policy(bd->domain);
+
+    if ( alloc_dom0_vcpu0(bd->domain) == NULL )
+        panic("Error creating d%uv0\n", bd->domid);
+
+    /* Grab the DOM0 command line. */
+    cmdline = (bd->kernel->string.kind == BOOTSTR_CMDLINE) ?
+              bd->kernel->string.bytes : NULL;
+    if ( cmdline || bi->arch->kextra )
+    {
+        char dom_cmdline[MAX_GUEST_CMDLINE];
+
+        cmdline = arch_prepare_cmdline(cmdline, bi->arch);
+        strlcpy(dom_cmdline, cmdline, MAX_GUEST_CMDLINE);
+
+        if ( bi->arch->kextra )
+            /* kextra always includes exactly one leading space. */
+            strlcat(dom_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
+
+        apply_xen_cmdline(dom_cmdline);
+
+        strlcpy(bd->kernel->string.bytes, dom_cmdline, MAX_GUEST_CMDLINE);
+    }
+
+    /*
+     * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
+     * This saves a large number of corner cases interactions with
+     * copy_from_user().
+     */
+    if ( cpu_has_smap )
+    {
+        cr4_pv32_mask &= ~X86_CR4_SMAP;
+        write_cr4(read_cr4() & ~X86_CR4_SMAP);
+    }
+
+    if ( construct_domain(bd) != 0 )
+        panic("Could not construct domain 0\n");
+
+    if ( cpu_has_smap )
+    {
+        write_cr4(read_cr4() | X86_CR4_SMAP);
+        cr4_pv32_mask |= X86_CR4_SMAP;
+    }
+}
+
diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h
index f9c1468fcc..6f53623fb3 100644
--- a/xen/arch/x86/include/asm/setup.h
+++ b/xen/arch/x86/include/asm/setup.h
@@ -33,6 +33,7 @@ void vesa_init(void);
 static inline void vesa_init(void) {};
 #endif
 
+void apply_xen_cmdline(char *cmdline);
 int construct_domain(struct boot_domain *bd);
 
 void setup_io_bitmap(struct domain *d);
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 860b9e3d64..479b9fa149 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -8,6 +8,7 @@
 #include <xen/param.h>
 #include <xen/sched.h>
 #include <xen/domain.h>
+#include <xen/domain_builder.h>
 #include <xen/serial.h>
 #include <xen/softirq.h>
 #include <xen/acpi.h>
@@ -745,109 +746,21 @@ static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int li
     return n;
 }
 
-static struct domain *__init create_dom0(
-    const struct boot_info *bi, struct boot_domain *bd)
+void __init apply_xen_cmdline(char *cmdline)
 {
-    struct xen_domctl_createdomain dom0_cfg = {
-        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
-        .max_evtchn_port = -1,
-        .max_grant_frames = -1,
-        .max_maptrack_frames = -1,
-        .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
-        .max_vcpus = dom0_max_vcpus(),
-        .arch = {
-            .misc_flags = opt_dom0_msr_relaxed ? XEN_X86_MSR_RELAXED : 0,
-        },
-    };
-    char *cmdline;
-
-    if ( bd->kernel == NULL )
-        panic("Error creating d%uv0\n", bd->domid);
-
-    if ( opt_dom0_pvh )
-    {
-        dom0_cfg.flags |= (XEN_DOMCTL_CDF_hvm |
-                           ((hvm_hap_supported() && !opt_dom0_shadow) ?
-                            XEN_DOMCTL_CDF_hap : 0));
-
-        dom0_cfg.arch.emulation_flags |=
-            XEN_X86_EMU_LAPIC | XEN_X86_EMU_IOAPIC | XEN_X86_EMU_VPCI;
-    }
-
-    if ( iommu_enabled )
-        dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
-
-    /* Create initial domain.  Not d0 for pvshim. */
-    bd->domid = get_initial_domain_id();
-    bd->domain = domain_create(bd->domid, &dom0_cfg, pv_shim ?
-                               0 : CDF_privileged);
-    if ( IS_ERR(bd->domain) )
-        panic("Error creating d%u: %ld\n", bd->domid, PTR_ERR(bd->domain));
-
-    init_dom0_cpuid_policy(bd->domain);
-
-    if ( alloc_dom0_vcpu0(bd->domain) == NULL )
-        panic("Error creating d%uv0\n", bd->domid);
-
-    /* Grab the DOM0 command line. */
-    cmdline = (bd->kernel->string.kind == BOOTSTR_CMDLINE) ?
-              bd->kernel->string.bytes : NULL;
-    if ( cmdline || bi->arch->kextra )
-    {
-        char dom0_cmdline[MAX_GUEST_CMDLINE];
-
-        cmdline = arch_prepare_cmdline(cmdline, bi->arch);
-        strlcpy(dom0_cmdline, cmdline, MAX_GUEST_CMDLINE);
-
-        if ( bi->arch->kextra )
-            /* kextra always includes exactly one leading space. */
-            strlcat(dom0_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
-
-        /* Append any extra parameters. */
-        if ( skip_ioapic_setup && !strstr(dom0_cmdline, "noapic") )
-            strlcat(dom0_cmdline, " noapic", MAX_GUEST_CMDLINE);
-        if ( (strlen(acpi_param) == 0) && acpi_disabled )
-        {
-            printk("ACPI is disabled, notifying Domain 0 (acpi=off)\n");
-            strlcpy(acpi_param, "off", sizeof(acpi_param));
-        }
-        if ( (strlen(acpi_param) != 0) && !strstr(dom0_cmdline, "acpi=") )
-        {
-            strlcat(dom0_cmdline, " acpi=", MAX_GUEST_CMDLINE);
-            strlcat(dom0_cmdline, acpi_param, MAX_GUEST_CMDLINE);
-        }
-
-        strlcpy(bd->kernel->string.bytes, dom0_cmdline, MAX_GUEST_CMDLINE);
-    }
-
-    /*
-     * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
-     * This saves a large number of corner cases interactions with
-     * copy_from_user().
-     */
-    if ( cpu_has_smap )
+    if ( skip_ioapic_setup && !strstr(cmdline, "noapic") )
+        strlcat(cmdline, " noapic", MAX_GUEST_CMDLINE);
+    if ( (strlen(acpi_param) == 0) && acpi_disabled )
     {
-        cr4_pv32_mask &= ~X86_CR4_SMAP;
-        write_cr4(read_cr4() & ~X86_CR4_SMAP);
+        printk("ACPI is disabled, notifying Domain 0 (acpi=off)\n");
+        strlcpy(acpi_param, "off", sizeof(acpi_param));
     }
-
-    if ( construct_domain(bd) != 0 )
-        panic("Could not construct domain 0\n");
-
-    if ( cpu_has_smap )
+    if ( (strlen(acpi_param) != 0) &&
+         !strstr(cmdline, "acpi=") )
     {
-        write_cr4(read_cr4() | X86_CR4_SMAP);
-        cr4_pv32_mask |= X86_CR4_SMAP;
+        strlcat(cmdline, " acpi=", MAX_GUEST_CMDLINE);
+        strlcat(cmdline, acpi_param, MAX_GUEST_CMDLINE);
     }
-
-    return bd->domain;
-}
-
-void __init arch_create_dom(
-    const struct boot_info *bi, struct boot_domain *bd)
-{
-    if ( builder_is_initdom(bd) )
-        create_dom0(bi, bd);
 }
 
 /* How much of the directmap is prebuilt at compile time. */
-- 
2.20.1




* [PATCH v1 13/18] x86: generalize physmap logic
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (11 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 12/18] x86: convert dom0 creation " Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-27 12:33   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 14/18] x86: generalize vcpu for domain building Daniel P. Smith
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné

The existing physmap code is specific to dom0. In this commit, the dom0 physmap
code is generalized for any domain and functions are renamed to reflect their
new general nature.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/include/asm/dom0_build.h |  2 +-
 xen/arch/x86/pv/dom0_build.c          | 10 +++++-----
 xen/arch/x86/pv/shim.c                |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/include/asm/dom0_build.h b/xen/arch/x86/include/asm/dom0_build.h
index 571b25ea71..f30e4b860a 100644
--- a/xen/arch/x86/include/asm/dom0_build.h
+++ b/xen/arch/x86/include/asm/dom0_build.h
@@ -21,7 +21,7 @@ int dom0_construct_pvh(struct boot_domain *bd);
 unsigned long dom0_paging_pages(const struct domain *d,
                                 unsigned long nr_pages);
 
-void dom0_update_physmap(bool compat, unsigned long pfn,
+void dom_update_physmap(bool compat, unsigned long pfn,
                          unsigned long mfn, unsigned long vphysmap_s);
 
 #endif	/* _DOM0_BUILD_H_ */
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index 78ebb03b1b..f1ea0575f0 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -34,8 +34,8 @@
 #define L3_PROT (BASE_PROT|_PAGE_DIRTY)
 #define L4_PROT (BASE_PROT|_PAGE_DIRTY)
 
-void __init dom0_update_physmap(bool compat, unsigned long pfn,
-                                unsigned long mfn, unsigned long vphysmap_s)
+void __init dom_update_physmap(
+    bool compat, unsigned long pfn, unsigned long mfn, unsigned long vphysmap_s)
 {
     if ( !compat )
         ((unsigned long *)vphysmap_s)[pfn] = mfn;
@@ -815,7 +815,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
         if ( pfn > REVERSE_START && (vinitrd_start || pfn < initrd_pfn) )
             mfn = alloc_epfn - (pfn - REVERSE_START);
 #endif
-        dom0_update_physmap(compat, pfn, mfn, vphysmap_start);
+        dom_update_physmap(compat, pfn, mfn, vphysmap_start);
         if ( !(pfn & 0xfffff) )
             process_pending_softirqs();
     }
@@ -831,7 +831,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
                  !get_page_and_type(page, d, PGT_writable_page) )
                 BUG();
 
-            dom0_update_physmap(compat, pfn, mfn, vphysmap_start);
+            dom_update_physmap(compat, pfn, mfn, vphysmap_start);
             ++pfn;
             if ( !(pfn & 0xfffff) )
                 process_pending_softirqs();
@@ -851,7 +851,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
 #ifndef NDEBUG
 #define pfn (nr_pages - 1 - (pfn - (alloc_epfn - alloc_spfn)))
 #endif
-            dom0_update_physmap(compat, pfn, mfn, vphysmap_start);
+            dom_update_physmap(compat, pfn, mfn, vphysmap_start);
 #undef pfn
             page++; pfn++;
             if ( !(pfn & 0xfffff) )
diff --git a/xen/arch/x86/pv/shim.c b/xen/arch/x86/pv/shim.c
index 2ee290a392..fb2a7ef393 100644
--- a/xen/arch/x86/pv/shim.c
+++ b/xen/arch/x86/pv/shim.c
@@ -210,7 +210,7 @@ void __init pv_shim_setup_dom(struct domain *d, l4_pgentry_t *l4start,
     {                                                                          \
         share_xen_page_with_guest(mfn_to_page(_mfn(param)), d, SHARE_rw);      \
         replace_va_mapping(d, l4start, va, _mfn(param));                       \
-        dom0_update_physmap(compat,                                            \
+        dom_update_physmap(compat,                                             \
                             PFN_DOWN((va) - va_start), param, vphysmap);       \
     }                                                                          \
     else                                                                       \
@@ -238,7 +238,7 @@ void __init pv_shim_setup_dom(struct domain *d, l4_pgentry_t *l4start,
         si->console.domU.mfn = mfn_x(console_mfn);
         share_xen_page_with_guest(mfn_to_page(console_mfn), d, SHARE_rw);
         replace_va_mapping(d, l4start, console_va, console_mfn);
-        dom0_update_physmap(compat, (console_va - va_start) >> PAGE_SHIFT,
+        dom_update_physmap(compat, (console_va - va_start) >> PAGE_SHIFT,
                             mfn_x(console_mfn), vphysmap);
         consoled_set_ring_addr(page);
     }
-- 
2.20.1




* [PATCH v1 14/18] x86: generalize vcpu for domain building
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (12 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 13/18] x86: generalize physmap logic Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-27 12:46   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 15/18] x86: rework domain page allocation Daniel P. Smith
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, Dario Faggioli

Here, the vcpu initialization code for dom0 creation is generalized so that it
can be used for other domains.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/domain_builder.c | 14 +++++++++++++-
 xen/arch/x86/hvm/dom0_build.c |  7 ++++---
 xen/arch/x86/pv/dom0_build.c  |  2 +-
 xen/common/sched/core.c       | 25 ++++++++++++++++---------
 xen/include/xen/sched.h       |  3 ++-
 5 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/domain_builder.c b/xen/arch/x86/domain_builder.c
index 308e1a1c67..1a4a6b1ca7 100644
--- a/xen/arch/x86/domain_builder.c
+++ b/xen/arch/x86/domain_builder.c
@@ -28,6 +28,18 @@ static unsigned int __init dom_max_vcpus(struct boot_domain *bd)
         return bd->ncpus;
 }
 
+struct vcpu *__init alloc_dom_vcpu0(struct boot_domain *bd)
+{
+    if ( bd->functions & BUILD_FUNCTION_INITIAL_DOM )
+        return alloc_dom0_vcpu0(bd->domain);
+
+    bd->domain->node_affinity = node_online_map;
+    bd->domain->auto_node_affinity = true;
+
+    return vcpu_create(bd->domain, 0);
+}
+
+
 void __init arch_create_dom(
     const struct boot_info *bi, struct boot_domain *bd)
 {
@@ -83,7 +95,7 @@ void __init arch_create_dom(
 
     init_dom0_cpuid_policy(bd->domain);
 
-    if ( alloc_dom0_vcpu0(bd->domain) == NULL )
+    if ( alloc_dom_vcpu0(bd) == NULL )
         panic("Error creating d%uv0\n", bd->domid);
 
     /* Grab the DOM0 command line. */
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 2fee2ed926..ae3ffc614d 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -696,9 +696,10 @@ static int __init pvh_load_kernel(
     return 0;
 }
 
-static int __init pvh_setup_cpus(struct domain *d, paddr_t entry,
+static int __init pvh_setup_cpus(struct boot_domain *bd, paddr_t entry,
                                  paddr_t start_info)
 {
+    struct domain *d = bd->domain;
     struct vcpu *v = d->vcpu[0];
     int rc;
     /*
@@ -722,7 +723,7 @@ static int __init pvh_setup_cpus(struct domain *d, paddr_t entry,
         .cpu_regs.x86_32.tr_ar = 0x8b,
     };
 
-    sched_setup_dom0_vcpus(d);
+    sched_setup_dom_vcpus(bd);
 
     rc = arch_set_info_hvm_guest(v, &cpu_ctx);
     if ( rc )
@@ -1257,7 +1258,7 @@ int __init dom0_construct_pvh(struct boot_domain *bd)
         return rc;
     }
 
-    rc = pvh_setup_cpus(d, entry, start_info);
+    rc = pvh_setup_cpus(bd, entry, start_info);
     if ( rc )
     {
         printk("Failed to setup Dom0 CPUs: %d\n", rc);
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index f1ea0575f0..9d1c9fb8b0 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -729,7 +729,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
 
     printk("Dom%u has maximum %u VCPUs\n", d->domain_id, d->max_vcpus);
 
-    sched_setup_dom0_vcpus(d);
+    sched_setup_dom_vcpus(bd);
 
     d->arch.paging.mode = 0;
 
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 250207038e..029f5ea24e 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -14,6 +14,8 @@
  */
 
 #ifndef COMPAT
+#include <xen/bootdomain.h>
+#include <xen/domain_builder.h>
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/param.h>
@@ -3399,13 +3401,13 @@ void wait(void)
 }
 
 #ifdef CONFIG_X86
-void __init sched_setup_dom0_vcpus(struct domain *d)
+void __init sched_setup_dom_vcpus(struct boot_domain *bd)
 {
     unsigned int i;
     struct sched_unit *unit;
 
-    for ( i = 1; i < d->max_vcpus; i++ )
-        vcpu_create(d, i);
+    for ( i = 1; i < bd->domain->max_vcpus; i++ )
+        vcpu_create(bd->domain, i);
 
     /*
      * PV-shim: vcpus are pinned 1:1.
@@ -3413,19 +3415,24 @@ void __init sched_setup_dom0_vcpus(struct domain *d)
      * onlining them. This avoids pinning a vcpu to a not yet online cpu here.
      */
     if ( pv_shim )
-        sched_set_affinity(d->vcpu[0]->sched_unit,
+        sched_set_affinity(bd->domain->vcpu[0]->sched_unit,
                            cpumask_of(0), cpumask_of(0));
     else
     {
-        for_each_sched_unit ( d, unit )
+        for_each_sched_unit ( bd->domain, unit )
         {
-            if ( !opt_dom0_vcpus_pin && !dom0_affinity_relaxed )
-                sched_set_affinity(unit, &dom0_cpus, NULL);
-            sched_set_affinity(unit, NULL, &dom0_cpus);
+            if ( builder_is_initdom(bd) )
+            {
+                if ( !opt_dom0_vcpus_pin && !dom0_affinity_relaxed )
+                    sched_set_affinity(unit, &dom0_cpus, NULL);
+                sched_set_affinity(unit, NULL, &dom0_cpus);
+            }
+            else
+                sched_set_affinity(unit, NULL, cpupool_valid_cpus(cpupool0));
         }
     }
 
-    domain_update_node_affinity(d);
+    domain_update_node_affinity(bd->domain);
 }
 #endif
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b9515eb497..6ab7d69cbd 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -2,6 +2,7 @@
 #ifndef __SCHED_H__
 #define __SCHED_H__
 
+#include <xen/bootdomain.h>
 #include <xen/types.h>
 #include <xen/spinlock.h>
 #include <xen/rwlock.h>
@@ -1003,7 +1004,7 @@ static inline bool sched_has_urgent_vcpu(void)
 }
 
 void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value);
-void sched_setup_dom0_vcpus(struct domain *d);
+void sched_setup_dom_vcpus(struct boot_domain *bd);
 int vcpu_temporary_affinity(struct vcpu *v, unsigned int cpu, uint8_t reason);
 int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity);
 void restore_vcpu_affinity(struct domain *d);
-- 
2.20.1




* [PATCH v1 15/18] x86: rework domain page allocation
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (13 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 14/18] x86: generalize vcpu for domain building Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-27 13:22   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 16/18] x86: add pv multidomain construction Daniel P. Smith
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné

This reworks the dom0 page allocation functions for general domain
construction. Where possible, logic common to the dom0 path and the general
domain path was split into a separate function that both paths reuse.
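
The final clamping step for dom0 can be sketched as below. This is a
minimal standalone illustration, with clamp_nr_pages() as a hypothetical
helper name, of the max()/min() sequence applied to nr_pages: raise to the
minimum, cap at the maximum, then cap at what is actually available.

```c
/*
 * Clamp a requested page count; illustration only.
 * The order matters: available memory is applied last, so it wins
 * even over the configured minimum.
 */
static unsigned long clamp_nr_pages(unsigned long nr_pages,
                                    unsigned long min_pages,
                                    unsigned long max_pages,
                                    unsigned long avail)
{
    if ( nr_pages < min_pages )
        nr_pages = min_pages;
    if ( nr_pages > max_pages )
        nr_pages = max_pages;
    if ( nr_pages > avail )
        nr_pages = avail;

    return nr_pages;
}
```

For a non-initial domain only the last cap, against available memory, is
applied in this patch.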

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/dom0_build.c             |  76 ++++++-------------
 xen/arch/x86/domain_builder.c         | 102 ++++++++++++++++++++++++++
 xen/arch/x86/hvm/dom0_build.c         |  11 +--
 xen/arch/x86/include/asm/dom0_build.h |  14 +++-
 xen/arch/x86/pv/dom0_build.c          |   2 +-
 5 files changed, 142 insertions(+), 63 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 216c9e3590..0600773b8f 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -320,69 +320,31 @@ static unsigned long __init default_nr_pages(unsigned long avail)
 }
 
 unsigned long __init dom0_compute_nr_pages(
-    struct domain *d, struct elf_dom_parms *parms, unsigned long initrd_len)
+    struct boot_domain *bd, struct elf_dom_parms *parms,
+    unsigned long initrd_len)
 {
-    nodeid_t node;
-    unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 0;
+    unsigned long avail, nr_pages, min_pages, max_pages;
 
     /* The ordering of operands is to work around a clang5 issue. */
     if ( CONFIG_DOM0_MEM[0] && !dom0_mem_set )
         parse_dom0_mem(CONFIG_DOM0_MEM);
 
-    for_each_node_mask ( node, dom0_nodes )
-        avail += avail_domheap_pages_region(node, 0, 0) +
-                 initial_images_nrpages(node);
-
-    /* Reserve memory for further dom0 vcpu-struct allocations... */
-    avail -= (d->max_vcpus - 1UL)
-             << get_order_from_bytes(sizeof(struct vcpu));
-    /* ...and compat_l4's, if needed. */
-    if ( is_pv_32bit_domain(d) )
-        avail -= d->max_vcpus - 1;
-
-    /* Reserve memory for iommu_dom0_init() (rough estimate). */
-    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
-    {
-        unsigned int s;
-
-        for ( s = 9; s < BITS_PER_LONG; s += 9 )
-            iommu_pages += max_pdx >> s;
-
-        avail -= iommu_pages;
-    }
+    avail = dom_avail_nr_pages(bd, dom0_nodes);
 
-    if ( paging_mode_enabled(d) || opt_dom0_shadow || opt_pv_l1tf_hwdom )
+    /* command line overrides configuration */
+    if ( dom0_mem_set )
     {
-        unsigned long cpu_pages;
-
-        nr_pages = get_memsize(&dom0_size, avail) ?: default_nr_pages(avail);
-
-        /*
-         * Clamp according to min/max limits and available memory
-         * (preliminary).
-         */
-        nr_pages = max(nr_pages, get_memsize(&dom0_min_size, avail));
-        nr_pages = min(nr_pages, get_memsize(&dom0_max_size, avail));
-        nr_pages = min(nr_pages, avail);
-
-        cpu_pages = dom0_paging_pages(d, nr_pages);
-
-        if ( !iommu_use_hap_pt(d) )
-            avail -= cpu_pages;
-        else if ( cpu_pages > iommu_pages )
-            avail -= cpu_pages - iommu_pages;
+        bd->meminfo.mem_size = dom0_size;
+        bd->meminfo.mem_min = dom0_min_size;
+        bd->meminfo.mem_max = dom0_max_size;
     }
 
-    nr_pages = get_memsize(&dom0_size, avail) ?: default_nr_pages(avail);
-    min_pages = get_memsize(&dom0_min_size, avail);
-    max_pages = get_memsize(&dom0_max_size, avail);
-
-    /* Clamp according to min/max limits and available memory (final). */
-    nr_pages = max(nr_pages, min_pages);
-    nr_pages = min(nr_pages, max_pages);
-    nr_pages = min(nr_pages, avail);
+    nr_pages = get_memsize(&bd->meminfo.mem_size, avail) ?
+               : default_nr_pages(avail);
+    min_pages = get_memsize(&bd->meminfo.mem_min, avail);
+    max_pages = get_memsize(&bd->meminfo.mem_max, avail);
 
-    if ( is_pv_domain(d) &&
+    if ( is_pv_domain(bd->domain) &&
          (parms->p2m_base == UNSET_ADDR) && !memsize_gt_zero(&dom0_size) &&
          (!memsize_gt_zero(&dom0_min_size) || (nr_pages > min_pages)) )
     {
@@ -395,7 +357,8 @@ unsigned long __init dom0_compute_nr_pages(
          * available between params.virt_base and the address space end.
          */
         unsigned long vstart, vend, end;
-        size_t sizeof_long = is_pv_32bit_domain(d) ? sizeof(int) : sizeof(long);
+        size_t sizeof_long = is_pv_32bit_domain(bd->domain) ?
+                             sizeof(int) : sizeof(long);
 
         vstart = parms->virt_base;
         vend = round_pgup(parms->virt_kend);
@@ -416,7 +379,12 @@ unsigned long __init dom0_compute_nr_pages(
         }
     }
 
-    d->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
+    /* Clamp according to min/max limits and available memory (final). */
+    nr_pages = max(nr_pages, min_pages);
+    nr_pages = min(nr_pages, max_pages);
+    nr_pages = min(nr_pages, avail);
+
+    bd->domain->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
 
     return nr_pages;
 }
diff --git a/xen/arch/x86/domain_builder.c b/xen/arch/x86/domain_builder.c
index 1a4a6b1ca7..d8babb1090 100644
--- a/xen/arch/x86/domain_builder.c
+++ b/xen/arch/x86/domain_builder.c
@@ -8,7 +8,9 @@
 #include <xen/sched.h>
 
 #include <asm/pv/shim.h>
+#include <asm/dom0_build.h>
 #include <asm/setup.h>
+#include <asm/spec_ctrl.h>
 
 extern unsigned long cr4_pv32_mask;
 
@@ -40,6 +42,106 @@ struct vcpu *__init alloc_dom_vcpu0(struct boot_domain *bd)
 }
 
 
+unsigned long __init dom_avail_nr_pages(
+    struct boot_domain *bd, nodemask_t nodes)
+{
+    unsigned long avail = 0, iommu_pages = 0;
+    bool is_ctldom = false, is_hwdom = false;
+    unsigned long nr_pages = bd->meminfo.mem_size.nr_pages;
+    nodeid_t node;
+
+    if ( builder_is_ctldom(bd) )
+        is_ctldom = true;
+    if ( builder_is_hwdom(bd) )
+        is_hwdom = true;
+
+    for_each_node_mask ( node, nodes )
+        avail += avail_domheap_pages_region(node, 0, 0) +
+                 initial_images_nrpages(node);
+
+    /* Reserve memory for further dom0 vcpu-struct allocations... */
+    avail -= (bd->domain->max_vcpus - 1UL)
+             << get_order_from_bytes(sizeof(struct vcpu));
+    /* ...and compat_l4's, if needed. */
+    if ( is_pv_32bit_domain(bd->domain) )
+        avail -= bd->domain->max_vcpus - 1;
+
+    /* Reserve memory for iommu_dom0_init() (rough estimate). */
+    if ( is_hwdom && is_iommu_enabled(bd->domain) && !iommu_hwdom_passthrough )
+    {
+        unsigned int s;
+
+        for ( s = 9; s < BITS_PER_LONG; s += 9 )
+            iommu_pages += max_pdx >> s;
+
+        avail -= iommu_pages;
+    }
+
+    if ( paging_mode_enabled(bd->domain) ||
+         (is_ctldom && opt_dom0_shadow) ||
+         (is_hwdom && opt_pv_l1tf_hwdom) )
+    {
+        unsigned long cpu_pages = dom0_paging_pages(bd->domain, nr_pages);
+
+        if ( !iommu_use_hap_pt(bd->domain) )
+            avail -= cpu_pages;
+        else if ( cpu_pages > iommu_pages )
+            avail -= cpu_pages - iommu_pages;
+    }
+
+    return avail;
+}
+
+unsigned long __init dom_compute_nr_pages(
+    struct boot_domain *bd, struct elf_dom_parms *parms,
+    unsigned long initrd_len)
+{
+    unsigned long avail, nr_pages = bd->meminfo.mem_size.nr_pages;
+
+    if ( builder_is_initdom(bd) )
+        return dom0_compute_nr_pages(bd, parms, initrd_len);
+
+    avail = dom_avail_nr_pages(bd, node_online_map);
+
+    if ( is_pv_domain(bd->domain) && (parms->p2m_base == UNSET_ADDR) )
+    {
+        /*
+         * Legacy Linux kernels (i.e. such without a XEN_ELFNOTE_INIT_P2M
+         * note) require that there is enough virtual space beyond the initial
+         * allocation to set up their initial page tables. This space is
+         * roughly the same size as the p2m table, so make sure the initial
+         * allocation doesn't consume more than about half the space that's
+         * available between params.virt_base and the address space end.
+         */
+        unsigned long vstart, vend, end;
+        size_t sizeof_long = is_pv_32bit_domain(bd->domain) ?
+                             sizeof(int) : sizeof(long);
+
+        vstart = parms->virt_base;
+        vend = round_pgup(parms->virt_kend);
+        if ( !parms->unmapped_initrd )
+            vend += round_pgup(initrd_len);
+        end = vend + nr_pages * sizeof_long;
+
+        if ( end > vstart )
+            end += end - vstart;
+        if ( end <= vstart ||
+             (sizeof_long < sizeof(end) && end > (1UL << (8 * sizeof_long))) )
+        {
+            end = sizeof_long >= sizeof(end) ? 0 : 1UL << (8 * sizeof_long);
+            nr_pages = (end - vend) / (2 * sizeof_long);
+            printk("Dom0 memory clipped to %lu pages\n", nr_pages);
+        }
+    }
+
+    /* Clamp according to available memory (final). */
+    nr_pages = min(nr_pages, avail);
+
+    bd->domain->max_pages = min_t(unsigned long, nr_pages, UINT_MAX);
+
+    return nr_pages;
+}
+
 void __init arch_create_dom(
     const struct boot_info *bi, struct boot_domain *bd)
 {
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index ae3ffc614d..e94392be07 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -412,15 +412,16 @@ static __init void pvh_setup_e820(struct domain *d, unsigned long nr_pages)
     ASSERT(cur_pages == nr_pages);
 }
 
-static void __init pvh_init_p2m(struct domain *d)
+static void __init pvh_init_p2m(struct boot_domain *bd)
 {
-    unsigned long nr_pages = dom0_compute_nr_pages(d, NULL, 0);
+    unsigned long nr_pages = dom_compute_nr_pages(bd, NULL, 0);
     bool preempted;
 
-    pvh_setup_e820(d, nr_pages);
+    pvh_setup_e820(bd->domain, nr_pages);
     do {
         preempted = false;
-        paging_set_allocation(d, dom0_paging_pages(d, nr_pages),
+        paging_set_allocation(bd->domain,
+                              dom0_paging_pages(bd->domain, nr_pages),
                               &preempted);
         process_pending_softirqs();
     } while ( preempted );
@@ -1239,7 +1240,7 @@ int __init dom0_construct_pvh(struct boot_domain *bd)
      * be done before the iommu initializion, since iommu initialization code
      * will likely add mappings required by devices to the p2m (ie: RMRRs).
      */
-    pvh_init_p2m(d);
+    pvh_init_p2m(bd);
 
     iommu_hwdom_init(d);
 
diff --git a/xen/arch/x86/include/asm/dom0_build.h b/xen/arch/x86/include/asm/dom0_build.h
index f30e4b860a..6c26ab0878 100644
--- a/xen/arch/x86/include/asm/dom0_build.h
+++ b/xen/arch/x86/include/asm/dom0_build.h
@@ -9,9 +9,17 @@
 
 extern unsigned int dom0_memflags;
 
-unsigned long dom0_compute_nr_pages(struct domain *d,
-                                    struct elf_dom_parms *parms,
-                                    unsigned long initrd_len);
+unsigned long dom_avail_nr_pages(
+    struct boot_domain *bd, nodemask_t nodes);
+
+unsigned long dom0_compute_nr_pages(
+    struct boot_domain *bd, struct elf_dom_parms *parms,
+    unsigned long initrd_len);
+
+unsigned long dom_compute_nr_pages(
+    struct boot_domain *bd, struct elf_dom_parms *parms,
+    unsigned long initrd_len);
+
 int dom0_setup_permissions(struct domain *d);
 
 int dom0_construct_pv(struct boot_domain *bd);
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index 9d1c9fb8b0..ff5c93fa14 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -428,7 +428,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
         }
     }
 
-    nr_pages = dom0_compute_nr_pages(d, &parms, initrd_len);
+    nr_pages = dom_compute_nr_pages(bd, &parms, initrd_len);
 
     if ( parms.pae == XEN_PAE_EXTCR3 )
             set_bit(VMASST_TYPE_pae_extended_cr3, &d->vm_assist);
-- 
2.20.1
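
The legacy-kernel clamp in dom_compute_nr_pages() above is easier to follow
with concrete numbers. What follows is a minimal standalone sketch, not Xen
code: every sketch_ name, the fixed 4 KiB page size, and the assumption that
the initrd is mapped are illustrative. It reproduces only the arithmetic of
the clamp, with guest_long standing in for sizeof_long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096ULL

static uint64_t sketch_round_pgup(uint64_t addr)
{
    return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Hypothetical model of the clamp for kernels without an INIT_P2M note.
 * guest_long is 4 for a 32-bit PV guest and 8 for a 64-bit one.
 * The initrd is assumed to be mapped, so its size is always added.
 */
static uint64_t sketch_clip_nr_pages(uint64_t nr_pages, uint64_t virt_base,
                                     uint64_t virt_kend, uint64_t initrd_len,
                                     size_t guest_long)
{
    uint64_t vstart, vend, end;

    assert(guest_long == 4 || guest_long == 8);

    vstart = virt_base;
    vend = sketch_round_pgup(virt_kend) + sketch_round_pgup(initrd_len);

    /* End of the p2m table: one guest long per page. */
    end = vend + nr_pages * guest_long;

    /* Reserve roughly the same amount again for initial page tables. */
    if ( end > vstart )
        end += end - vstart;

    /* Wrapped around, or past the end of a 32-bit address space? */
    if ( end <= vstart ||
         (guest_long < sizeof(end) && end > (1ULL << (8 * guest_long))) )
    {
        end = guest_long >= sizeof(end) ? 0 : 1ULL << (8 * guest_long);
        nr_pages = (end - vend) / (2 * guest_long);
    }

    return nr_pages;
}
```

With guest_long of 4, a virt_base of 0xC0000000 and a kernel ending at
0xC1000000, a request for 2^28 pages is clipped to 0x7E00000 pages, while a
request for 2^20 pages passes through unchanged.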




* [PATCH v1 16/18] x86: add pv multidomain construction
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (14 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 15/18] x86: rework domain page allocation Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-27 14:12   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 17/18] builder: introduce domain builder hypfs tree Daniel P. Smith
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel, Wei Liu
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Jan Beulich,
	Andrew Cooper, Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini

This adds to the domain builder the ability to construct multiple PV domains
at boot.
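
As an illustration of how domain ids are chosen once more than one domain is
built, here is a minimal standalone sketch. It is not the Xen implementation:
the sketch_ names, the small in-use table, and the reserved-id limit are
stand-ins, and marking an id as used models what a successful
domain_create() would do. The selection rule itself follows the patch: the
initial domain takes the initial domain id, other domains keep a configured
non-zero id, and the rest take the next free id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the Xen definitions. */
typedef uint16_t sketch_domid_t;
#define SKETCH_DOMID_FIRST_RESERVED 0x7FF0U
#define SKETCH_MAX_TRACKED          64U

static bool sketch_in_use[SKETCH_MAX_TRACKED]; /* ids that already exist */
static sketch_domid_t sketch_last_domid;

static bool sketch_domid_taken(unsigned int id)
{
    return id < SKETCH_MAX_TRACKED && sketch_in_use[id];
}

/* First free id above the last one handed out; 0 means none was found. */
static sketch_domid_t sketch_next_domid(void)
{
    unsigned int next;

    for ( next = sketch_last_domid + 1U;
          next < SKETCH_DOMID_FIRST_RESERVED; next++ )
    {
        if ( !sketch_domid_taken(next) )
        {
            sketch_last_domid = (sketch_domid_t)next;
            return (sketch_domid_t)next;
        }
    }

    return 0;
}

/* Pick an id for one domain and record it as existing. */
static sketch_domid_t sketch_pick_domid(bool is_initdom,
                                        sketch_domid_t requested,
                                        sketch_domid_t initial_id)
{
    sketch_domid_t id;

    if ( is_initdom )
        id = initial_id;
    else
        id = requested ? requested : sketch_next_domid();

    assert(id < SKETCH_DOMID_FIRST_RESERVED);

    if ( id < SKETCH_MAX_TRACKED )
        sketch_in_use[id] = true;

    return id;
}
```

Note that a return value of 0 from the search is ambiguous with the id of the
initial domain, so a caller would need a separate check for exhaustion.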

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/arch/x86/dom0_build.c                     |  31 -----
 xen/arch/x86/domain_builder.c                 |  58 ++++++--
 xen/arch/x86/include/asm/dom0_build.h         |   2 -
 xen/arch/x86/include/asm/setup.h              |   2 +
 xen/arch/x86/pv/Makefile                      |   2 +-
 .../x86/pv/{dom0_build.c => domain_builder.c} |  86 +++++++-----
 xen/common/domain-builder/Kconfig             |  10 ++
 xen/common/domain-builder/core.c              | 130 ++++++++++++++++--
 xen/include/xen/bootdomain.h                  |   6 +
 xen/include/xen/domain_builder.h              |  19 +++
 10 files changed, 259 insertions(+), 87 deletions(-)
 rename xen/arch/x86/pv/{dom0_build.c => domain_builder.c} (92%)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 0600773b8f..85a10f63aa 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -524,37 +524,6 @@ int __init dom0_setup_permissions(struct domain *d)
 
     return rc;
 }
-
-int __init construct_domain(struct boot_domain *bd)
-{
-    int rc = 0;
-
-    /* Sanity! */
-    BUG_ON(!pv_shim && bd->domid != 0);
-    BUG_ON(bd->domain->vcpu[0] == NULL);
-    BUG_ON(bd->domain->vcpu[0]->is_initialised);
-
-    process_pending_softirqs();
-
-    if ( builder_is_initdom(bd) )
-    {
-        if ( is_hvm_domain(bd->domain) )
-            rc = dom0_construct_pvh(bd);
-        else if ( is_pv_domain(bd->domain) )
-            rc = dom0_construct_pv(bd);
-        else
-            panic("Cannot construct Dom0. No guest interface available\n");
-    }
-
-    if ( rc )
-        return rc;
-
-    /* Sanity! */
-    BUG_ON(!bd->domain->vcpu[0]->is_initialised);
-
-    return 0;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/domain_builder.c b/xen/arch/x86/domain_builder.c
index d8babb1090..94f3ff4d5a 100644
--- a/xen/arch/x86/domain_builder.c
+++ b/xen/arch/x86/domain_builder.c
@@ -6,6 +6,7 @@
 #include <xen/grant_table.h>
 #include <xen/iommu.h>
 #include <xen/sched.h>
+#include <xen/softirq.h>
 
 #include <asm/pv/shim.h>
 #include <asm/dom0_build.h>
@@ -189,18 +190,22 @@ void __init arch_create_dom(
     if ( !pv_shim && builder_is_ctldom(bd) )
         is_privileged = CDF_privileged;
 
-    /* Create initial domain.  Not d0 for pvshim. */
-    bd->domid = get_initial_domain_id();
+    /* Determine proper domain id. */
+    if ( builder_is_initdom(bd) )
+        bd->domid = get_initial_domain_id();
+    else
+        bd->domid = bd->domid ? bd->domid : get_next_domid();
     bd->domain = domain_create(bd->domid, &dom_cfg, is_privileged);
     if ( IS_ERR(bd->domain) )
         panic("Error creating d%u: %ld\n", bd->domid, PTR_ERR(bd->domain));
 
-    init_dom0_cpuid_policy(bd->domain);
+    if ( builder_is_initdom(bd) )
+        init_dom0_cpuid_policy(bd->domain);
 
     if ( alloc_dom_vcpu0(bd) == NULL )
         panic("Error creating d%uv0\n", bd->domid);
 
-    /* Grab the DOM0 command line. */
+    /* Grab the command line. */
     cmdline = (bd->kernel->string.kind == BOOTSTR_CMDLINE) ?
               bd->kernel->string.bytes : NULL;
     if ( cmdline || bi->arch->kextra )
@@ -210,15 +215,23 @@ void __init arch_create_dom(
         cmdline = arch_prepare_cmdline(cmdline, bi->arch);
         strlcpy(dom_cmdline, cmdline, MAX_GUEST_CMDLINE);
 
-        if ( bi->arch->kextra )
-            /* kextra always includes exactly one leading space. */
-            strlcat(dom_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
+        if ( builder_is_initdom(bd) )
+        {
+            if ( bi->arch->kextra )
+                /* kextra always includes exactly one leading space. */
+                strlcat(dom_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
 
-        apply_xen_cmdline(dom_cmdline);
+            apply_xen_cmdline(dom_cmdline);
+        }
 
         strlcpy(bd->kernel->string.bytes, dom_cmdline, MAX_GUEST_CMDLINE);
     }
 
+    if ( alloc_system_evtchn(bi, bd) != 0 )
+        printk(XENLOG_WARNING "%s: "
+               "unable to set up system event channels for Dom%d\n",
+               __func__, bd->domid);
+
     /*
      * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
      * This saves a large number of corner cases interactions with
@@ -240,3 +253,32 @@ void __init arch_create_dom(
     }
 }
 
+int __init construct_domain(struct boot_domain *bd)
+{
+    int rc = 0;
+
+    /* Sanity! */
+    BUG_ON(bd->domid != bd->domain->domain_id);
+    BUG_ON(bd->domain->vcpu[0] == NULL);
+    BUG_ON(bd->domain->vcpu[0]->is_initialised);
+
+    process_pending_softirqs();
+
+    if ( is_hvm_domain(bd->domain) )
+        if ( builder_is_initdom(bd) )
+            rc = dom0_construct_pvh(bd);
+        else
+            panic("Cannot construct HVM DomU. Not supported.\n");
+    else if ( is_pv_domain(bd->domain) )
+        rc = dom_construct_pv(bd);
+    else
+        panic("Cannot construct Dom0. No guest interface available\n");
+
+    if ( rc )
+        return rc;
+
+    /* Sanity! */
+    BUG_ON(!bd->domain->vcpu[0]->is_initialised);
+
+    return 0;
+}
diff --git a/xen/arch/x86/include/asm/dom0_build.h b/xen/arch/x86/include/asm/dom0_build.h
index 6c26ab0878..3624a57641 100644
--- a/xen/arch/x86/include/asm/dom0_build.h
+++ b/xen/arch/x86/include/asm/dom0_build.h
@@ -22,8 +22,6 @@ unsigned long dom_compute_nr_pages(
 
 int dom0_setup_permissions(struct domain *d);
 
-int dom0_construct_pv(struct boot_domain *bd);
-
 int dom0_construct_pvh(struct boot_domain *bd);
 
 unsigned long dom0_paging_pages(const struct domain *d,
diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h
index 6f53623fb3..328f9a8611 100644
--- a/xen/arch/x86/include/asm/setup.h
+++ b/xen/arch/x86/include/asm/setup.h
@@ -36,6 +36,8 @@ static inline void vesa_init(void) {};
 void apply_xen_cmdline(char *cmdline);
 int construct_domain(struct boot_domain *bd);
 
+int dom_construct_pv(struct boot_domain *bd);
+
 void setup_io_bitmap(struct domain *d);
 
 unsigned long initial_images_nrpages(nodeid_t node);
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 6cda354cc4..d06a3c1de1 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -15,5 +15,5 @@ obj-$(CONFIG_PV_SHIM) += shim.o
 obj-$(CONFIG_TRACEBUFFER) += trace.o
 obj-y += traps.o
 
-obj-bin-y += dom0_build.init.o
+obj-bin-y += domain_builder.init.o
 obj-bin-y += gpr_switch.o
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/domain_builder.c
similarity index 92%
rename from xen/arch/x86/pv/dom0_build.c
rename to xen/arch/x86/pv/domain_builder.c
index ff5c93fa14..2ecf1b0ae3 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/domain_builder.c
@@ -1,5 +1,5 @@
 /******************************************************************************
- * pv/dom0_build.c
+ * pv/domain_builder.c
  *
  * Copyright (c) 2002-2005, K A Fraser
  */
@@ -8,6 +8,7 @@
 #include <xen/bootinfo.h>
 #include <xen/console.h>
 #include <xen/domain.h>
+#include <xen/domain_builder.h>
 #include <xen/domain_page.h>
 #include <xen/init.h>
 #include <xen/libelf.h>
@@ -296,7 +297,7 @@ static struct page_info * __init alloc_chunk(struct domain *d,
     return page;
 }
 
-int __init dom0_construct_pv(struct boot_domain *bd)
+int __init dom_construct_pv(struct boot_domain *bd)
 {
     int i, rc, order, machine;
     bool compatible, compat;
@@ -350,7 +351,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     /* Machine address of next candidate page-table page. */
     paddr_t mpt_alloc;
 
-    printk(XENLOG_INFO "*** Building a PV Dom%d ***\n", d->domain_id);
+    printk(XENLOG_INFO "*** Constructing a PV Dom%d ***\n", d->domain_id);
 
     d->max_pages = ~0U;
 
@@ -362,7 +363,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     if ( (rc = elf_init(&elf, image_start, image_len)) != 0 )
         return rc;
 
-    if ( opt_dom0_verbose )
+    if ( builder_is_ctldom(bd) && opt_dom0_verbose )
         elf_set_verbose(&elf);
 
     elf_parse_binary(&elf);
@@ -384,7 +385,8 @@ int __init dom0_construct_pv(struct boot_domain *bd)
         {
             if ( unlikely(rc = switch_compat(d)) )
             {
-                printk("Dom0 failed to switch to compat: %d\n", rc);
+                printk("Dom%d failed to switch to compat: %d\n",
+                        d->domain_id, rc);
                 return rc;
             }
 
@@ -404,22 +406,23 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     if ( elf_msb(&elf) )
         compatible = false;
 
-    printk(" Dom0 kernel: %s-bit%s, %s, paddr %#" PRIx64 " -> %#" PRIx64 "\n",
-           elf_64bit(&elf) ? "64" : elf_32bit(&elf) ? "32" : "??",
+    printk(" Dom%d kernel: %s-bit%s, %s, paddr %#" PRIx64 " -> %#" PRIx64 "\n",
+           d->domain_id, elf_64bit(&elf) ? "64" : elf_32bit(&elf) ? "32" : "??",
            parms.pae       ? ", PAE" : "",
            elf_msb(&elf)   ? "msb"   : "lsb",
            elf.pstart, elf.pend);
     if ( elf.bsd_symtab_pstart )
-        printk(" Dom0 symbol map %#" PRIx64 " -> %#" PRIx64 "\n",
-               elf.bsd_symtab_pstart, elf.bsd_symtab_pend);
+        printk(" Dom%d symbol map %#" PRIx64 " -> %#" PRIx64 "\n",
+               d->domain_id, elf.bsd_symtab_pstart, elf.bsd_symtab_pend);
 
     if ( !compatible )
     {
-        printk("Mismatch between Xen and DOM0 kernel\n");
+        printk("Mismatch between Xen and DOM%d kernel\n", d->domain_id);
         return -EINVAL;
     }
 
-    if ( parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE )
+    if ( builder_is_initdom(bd) &&
+         parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE )
     {
         if ( !pv_shim && !test_bit(XENFEAT_dom0, parms.f_supported) )
         {
@@ -443,7 +446,8 @@ int __init dom0_construct_pv(struct boot_domain *bd)
 
             if ( value > __HYPERVISOR_COMPAT_VIRT_START )
             {
-                printk("Dom0 expects too high a hypervisor start address\n");
+                printk("Dom%d expects too high a hypervisor start address\n",
+                       d->domain_id);
                 return -ERANGE;
             }
             HYPERVISOR_COMPAT_VIRT_START(d) =
@@ -487,7 +491,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     vstartinfo_start = round_pgup(vphysmap_end);
     vstartinfo_end   = vstartinfo_start + sizeof(struct start_info);
 
-    if ( pv_shim )
+    if ( pv_shim || ! builder_is_initdom(bd) )
     {
         vxenstore_start  = round_pgup(vstartinfo_end);
         vxenstore_end    = vxenstore_start + PAGE_SIZE;
@@ -578,8 +582,8 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     }
 
     printk("PHYSICAL MEMORY ARRANGEMENT:\n"
-           " Dom0 alloc.:   %"PRIpaddr"->%"PRIpaddr,
-           pfn_to_paddr(alloc_spfn), pfn_to_paddr(alloc_epfn));
+           " Dom%d alloc.:   %"PRIpaddr"->%"PRIpaddr,
+           d->domain_id, pfn_to_paddr(alloc_spfn), pfn_to_paddr(alloc_epfn));
     if ( domain_tot_pages(d) < nr_pages )
         printk(" (%lu pages to be allocated)",
                nr_pages - domain_tot_pages(d));
@@ -596,7 +600,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
         printk(" Init. ramdisk: %p->%p\n", _p(vinitrd_start), _p(vinitrd_end));
     printk(" Phys-Mach map: %p->%p\n", _p(vphysmap_start), _p(vphysmap_end));
     printk(" Start info:    %p->%p\n", _p(vstartinfo_start), _p(vstartinfo_end));
-    if ( pv_shim )
+    if ( pv_shim || ! builder_is_initdom(bd) )
     {
         printk(" Xenstore ring: %p->%p\n", _p(vxenstore_start), _p(vxenstore_end));
         printk(" Console ring:  %p->%p\n", _p(vconsole_start), _p(vconsole_end));
@@ -617,7 +621,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
          ? v_end > HYPERVISOR_COMPAT_VIRT_START(d)
          : (v_start < HYPERVISOR_VIRT_END) && (v_end > HYPERVISOR_VIRT_START) )
     {
-        printk("DOM0 image overlaps with Xen private area.\n");
+        printk("DOM%d image overlaps with Xen private area.\n", d->domain_id);
         return -EINVAL;
     }
 
@@ -768,9 +772,6 @@ int __init dom0_construct_pv(struct boot_domain *bd)
         init_hypercall_page(d, _p(parms.virt_hypercall));
     }
 
-    /* Free temporary buffers. */
-    discard_initial_images();
-
     /* Set up start info area. */
     si = (start_info_t *)vstartinfo_start;
     clear_page(si);
@@ -778,7 +779,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
 
     si->shared_info = virt_to_maddr(d->shared_info);
 
-    if ( !pv_shim )
+    if ( !pv_shim && builder_is_ctldom(bd) )
         si->flags    = SIF_PRIVILEGED | SIF_INITDOMAIN;
     if ( !vinitrd_start && initrd_len )
         si->flags   |= SIF_MOD_START_PFN;
@@ -789,6 +790,19 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     snprintf(si->magic, sizeof(si->magic), "xen-3.0-x86_%d%s",
              elf_64bit(&elf) ? 64 : 32, parms.pae ? "p" : "");
 
+    if ( !builder_is_initdom(bd) )
+    {
+        si->store_mfn = ((vxenstore_start - v_start) >> PAGE_SHIFT)
+                        + alloc_spfn;
+        bd->store.mfn = si->store_mfn;
+        si->store_evtchn = bd->store.evtchn;
+
+        si->console.domU.mfn = ((vconsole_start - v_start) >> PAGE_SHIFT)
+                               + alloc_spfn;
+        bd->console.mfn = si->console.domU.mfn;
+        si->console.domU.evtchn = bd->console.evtchn;
+    }
+
     count = domain_tot_pages(d);
 
     /* Set up the phys->machine table if not part of the initial mapping. */
@@ -871,23 +885,24 @@ int __init dom0_construct_pv(struct boot_domain *bd)
                 sizeof(si->cmd_line));
 
 #ifdef CONFIG_VIDEO
-    if ( !pv_shim && fill_console_start_info((void *)(si + 1)) )
-    {
-        si->console.dom0.info_off  = sizeof(struct start_info);
-        si->console.dom0.info_size = sizeof(struct dom0_vga_console_info);
-    }
+    if ( builder_is_hwdom(bd) )
+        if ( !pv_shim && fill_console_start_info((void *)(si + 1)) )
+        {
+            si->console.dom0.info_off  = sizeof(struct start_info);
+            si->console.dom0.info_size = sizeof(struct dom0_vga_console_info);
+        }
 #endif
 
     /*
      * TODO: provide an empty stub for fill_console_start_info in the
      * !CONFIG_VIDEO case so the logic here can be simplified.
      */
-    if ( pv_shim )
+    if ( builder_is_hwdom(bd) && pv_shim )
         pv_shim_setup_dom(d, l4start, v_start, vxenstore_start, vconsole_start,
                           vphysmap_start, si);
 
 #ifdef CONFIG_COMPAT
-    if ( compat )
+    if ( builder_is_hwdom(bd) && compat )
         xlat_start_info(si, pv_shim ? XLAT_start_info_console_domU
                                     : XLAT_start_info_console_dom0);
 #endif
@@ -926,15 +941,18 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     if ( test_bit(XENFEAT_supervisor_mode_kernel, parms.f_required) )
         panic("Dom0 requires supervisor-mode execution\n");
 
-    rc = dom0_setup_permissions(d);
-    BUG_ON(rc != 0);
+    if ( builder_is_hwdom(bd) )
+    {
+        rc = dom0_setup_permissions(d);
+        BUG_ON(rc != 0);
+    }
 
     if ( d->domain_id == hardware_domid )
         iommu_hwdom_init(d);
 
 #ifdef CONFIG_SHADOW_PAGING
     /* Fill the shadow pool if necessary. */
-    if ( opt_dom0_shadow || opt_pv_l1tf_hwdom )
+    if ( builder_is_hwdom(bd) && (opt_dom0_shadow || opt_pv_l1tf_hwdom) )
     {
         bool preempted;
 
@@ -948,7 +966,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
     }
 
     /* Activate shadow mode, if requested.  Reuse the pv_l1tf tasklet. */
-    if ( opt_dom0_shadow )
+    if ( builder_is_hwdom(bd) && opt_dom0_shadow )
     {
         printk("Switching dom0 to using shadow paging\n");
         tasklet_schedule(&d->arch.paging.shadow.pv_l1tf_tasklet);
@@ -960,8 +978,8 @@ int __init dom0_construct_pv(struct boot_domain *bd)
 
 out:
     if ( elf_check_broken(&elf) )
-        printk(XENLOG_WARNING "Dom0 kernel broken ELF: %s\n",
-               elf_check_broken(&elf));
+        printk(XENLOG_WARNING "Dom%d kernel broken ELF: %s\n",
+               d->domain_id, elf_check_broken(&elf));
 
     return rc;
 }
diff --git a/xen/common/domain-builder/Kconfig b/xen/common/domain-builder/Kconfig
index 893038cab3..0232e1ed8a 100644
--- a/xen/common/domain-builder/Kconfig
+++ b/xen/common/domain-builder/Kconfig
@@ -12,4 +12,14 @@ config BUILDER_FDT
 
 	  If unsure, say N.
 
+config MULTIDOM_BUILDER
+	bool "Multidomain building (UNSUPPORTED)" if UNSUPPORTED
+	depends on BUILDER_FDT
+	---help---
+	  Enables the domain builder to construct multiple domains.
+
+	  This feature is currently experimental.
+
+	  If unsure, say N.
+
 endmenu
diff --git a/xen/common/domain-builder/core.c b/xen/common/domain-builder/core.c
index b030b07d71..c6a268eb96 100644
--- a/xen/common/domain-builder/core.c
+++ b/xen/common/domain-builder/core.c
@@ -1,6 +1,7 @@
 #include <xen/bootdomain.h>
 #include <xen/bootinfo.h>
 #include <xen/domain_builder.h>
+#include <xen/event.h>
 #include <xen/init.h>
 #include <xen/types.h>
 
@@ -60,37 +61,144 @@ void __init builder_init(struct boot_info *info)
         d->kernel->string.kind = BOOTSTR_CMDLINE;
 }
 
+static bool __init build_domain(struct boot_info *info, struct boot_domain *bd)
+{
+    if ( bd == NULL || bd->kernel == NULL )
+        return false;
+
+    if ( bd->constructed )
+        return true;
+
+    printk(XENLOG_INFO "*** Building Dom%d ***\n", bd->domid);
+
+    arch_create_dom(info, bd);
+    if ( bd->domain )
+    {
+        bd->constructed = true;
+        return true;
+    }
+
+    return false;
+}
+
 uint32_t __init builder_create_domains(struct boot_info *info)
 {
     uint32_t build_count = 0, functions_built = 0;
+    struct boot_domain *bd;
     int i;
 
+    if ( IS_ENABLED(CONFIG_MULTIDOM_BUILDER) )
+    {
+        bd = builder_dom_by_function(info, BUILD_FUNCTION_XENSTORE);
+        if ( build_domain(info, bd) )
+        {
+            functions_built |= bd->functions;
+            build_count++;
+        }
+        else
+            printk(XENLOG_WARNING "Xenstore build failed, system may be unusable\n");
+
+        bd = builder_dom_by_function(info, BUILD_FUNCTION_CONSOLE);
+        if ( build_domain(info, bd) )
+        {
+            functions_built |= bd->functions;
+            build_count++;
+        }
+        else
+            printk(XENLOG_WARNING "Console build failed, system may be unusable\n");
+    }
+
     for ( i = 0; i < info->builder->nr_doms; i++ )
     {
-        struct boot_domain *d = &info->builder->domains[i];
+        bd = &info->builder->domains[i];
 
         if ( ! IS_ENABLED(CONFIG_MULTIDOM_BUILDER) &&
-             ! builder_is_initdom(d) &&
+             ! builder_is_initdom(bd) &&
              functions_built & BUILD_FUNCTION_INITIAL_DOM )
             continue;
 
-        if ( d->kernel == NULL )
+        if ( !build_domain(info, bd) )
         {
-            if ( builder_is_initdom(d) )
+            if ( builder_is_initdom(bd) )
                 panic("%s: intial domain missing kernel\n", __func__);
 
-            printk(XENLOG_ERR "%s:Dom%d definiton has no kernel\n", __func__,
-                    d->domid);
+            printk(XENLOG_WARNING "Dom%d build failed, skipping\n", bd->domid);
             continue;
         }
 
-        arch_create_dom(info, d);
-        if ( d->domain )
+        functions_built |= bd->functions;
+        build_count++;
+    }
+
+    if ( IS_ENABLED(CONFIG_X86) )
+        /* Free temporary buffers. */
+        discard_initial_images();
+
+    return build_count;
+}
+
+domid_t __init get_next_domid(void)
+{
+    static domid_t __initdata last_domid = 0;
+    domid_t next;
+
+    for ( next = last_domid + 1; next < DOMID_FIRST_RESERVED; next++ )
+    {
+        struct domain *d;
+
+        if ( (d = rcu_lock_domain_by_id(next)) == NULL )
         {
-            functions_built |= d->functions;
-            build_count++;
+            last_domid = next;
+            return next;
         }
+
+        rcu_unlock_domain(d);
     }
 
-    return build_count;
+    return 0;
+}
+
+int __init alloc_system_evtchn(
+    const struct boot_info *info, struct boot_domain *bd)
+{
+    evtchn_alloc_unbound_t evtchn_req;
+    struct boot_domain *c = builder_dom_by_function(info,
+                                                    BUILD_FUNCTION_CONSOLE);
+    struct boot_domain *s = builder_dom_by_function(info,
+                                                    BUILD_FUNCTION_XENSTORE);
+    int rc;
+
+    evtchn_req.dom = bd->domid;
+
+    if ( c != NULL && c != bd && c->constructed )
+    {
+        evtchn_req.remote_dom = c->domid;
+
+        rc = evtchn_alloc_unbound(&evtchn_req);
+        if ( rc )
+        {
+            printk("Failed allocating console event channel for domain %d\n",
+                   bd->domid);
+            return rc;
+        }
+
+        bd->console.evtchn = evtchn_req.port;
+    }
+
+    if ( s != NULL && s != bd && s->constructed )
+    {
+        evtchn_req.remote_dom = s->domid;
+
+        rc = evtchn_alloc_unbound(&evtchn_req);
+        if ( rc )
+        {
+            printk("Failed allocating xenstore event channel for domain %d\n",
+                   bd->domid);
+            return rc;
+        }
+
+        bd->store.evtchn = evtchn_req.port;
+    }
+
+    return 0;
 }
diff --git a/xen/include/xen/bootdomain.h b/xen/include/xen/bootdomain.h
index b172d16f4e..9c5d4d385e 100644
--- a/xen/include/xen/bootdomain.h
+++ b/xen/include/xen/bootdomain.h
@@ -47,6 +47,12 @@ struct boot_domain {
     struct boot_module *configs[BUILD_MAX_CONF_MODS];
 
     struct domain *domain;
+    struct {
+        xen_pfn_t mfn;
+        unsigned int evtchn;
+    } store, console;
+    bool constructed;
+
 };
 
 #endif
diff --git a/xen/include/xen/domain_builder.h b/xen/include/xen/domain_builder.h
index c0d997f7bd..f9e43c9689 100644
--- a/xen/include/xen/domain_builder.h
+++ b/xen/include/xen/domain_builder.h
@@ -34,6 +34,22 @@ static inline bool builder_is_hwdom(struct boot_domain *bd)
             bd->permissions & BUILD_PERMISSION_HARDWARE );
 }
 
+static inline struct boot_domain *builder_dom_by_function(
+    const struct boot_info *info, uint32_t func)
+{
+    int i;
+
+    for ( i = 0; i < info->builder->nr_doms; i++ )
+    {
+        struct boot_domain *bd = &info->builder->domains[i];
+
+        if ( bd->functions & func )
+            return bd;
+    }
+
+    return NULL;
+}
+
 static inline struct domain *builder_get_hwdom(struct boot_info *info)
 {
     int i;
@@ -51,6 +67,9 @@ static inline struct domain *builder_get_hwdom(struct boot_info *info)
 
 void builder_init(struct boot_info *info);
 uint32_t builder_create_domains(struct boot_info *info);
+domid_t get_next_domid(void);
+int alloc_system_evtchn(
+    const struct boot_info *info, struct boot_domain *bd);
 void arch_create_dom(const struct boot_info *bi, struct boot_domain *bd);
 
 #endif /* XEN_DOMAIN_BUILDER_H */
-- 
2.20.1
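
The build order in builder_create_domains() above matters because
alloc_system_evtchn() only connects a domain to a console or xenstore domain
that has already been constructed and is not the domain itself. The following
standalone sketch models just that gating. All sketch_ names and flag values
are hypothetical, and booleans stand in for the allocated event channel ports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, not the BUILD_FUNCTION_* constants. */
#define SKETCH_FN_XENSTORE (1U << 0)
#define SKETCH_FN_CONSOLE  (1U << 1)

struct sketch_dom {
    uint32_t functions;
    bool constructed;
    bool has_store_evtchn;   /* stands in for store.evtchn being set */
    bool has_console_evtchn; /* stands in for console.evtchn being set */
};

static struct sketch_dom *sketch_by_function(struct sketch_dom *doms,
                                             size_t n, uint32_t func)
{
    size_t i;

    for ( i = 0; i < n; i++ )
        if ( doms[i].functions & func )
            return &doms[i];

    return NULL;
}

/* Connect bd only to service domains that already exist. */
static void sketch_wire(struct sketch_dom *doms, size_t n,
                        struct sketch_dom *bd)
{
    struct sketch_dom *c = sketch_by_function(doms, n, SKETCH_FN_CONSOLE);
    struct sketch_dom *s = sketch_by_function(doms, n, SKETCH_FN_XENSTORE);

    if ( c != NULL && c != bd && c->constructed )
        bd->has_console_evtchn = true;
    if ( s != NULL && s != bd && s->constructed )
        bd->has_store_evtchn = true;
}

static void sketch_build(struct sketch_dom *doms, size_t n,
                         struct sketch_dom *bd)
{
    if ( bd == NULL || bd->constructed )
        return;

    sketch_wire(doms, n, bd);
    bd->constructed = true;
}

/* Service domains first, then everything else in array order. */
static void sketch_build_all(struct sketch_dom *doms, size_t n)
{
    size_t i;

    sketch_build(doms, n, sketch_by_function(doms, n, SKETCH_FN_XENSTORE));
    sketch_build(doms, n, sketch_by_function(doms, n, SKETCH_FN_CONSOLE));

    for ( i = 0; i < n; i++ )
        sketch_build(doms, n, &doms[i]);
}
```

In this model a plain guest listed before the service domains still ends up
with both channels, whereas the xenstore domain, being built first, gets no
console channel because the console domain does not exist yet.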




* [PATCH v1 17/18] builder: introduce domain builder hypfs tree
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (15 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 16/18] x86: add pv multidomain construction Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-27 14:30   ` Jan Beulich
  2022-07-06 21:04 ` [PATCH v1 18/18] tools: introduce example late pv helper Daniel P. Smith
  2022-07-19 17:06 ` [PATCH v1 00/18] Hyperlaunch Smith, Jackson
  18 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu

This enables the domain builder to construct a hypfs tree that exposes relevant
domain creation information for use by the boot domain and/or the runtime system.
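
The leaves of the tree are filled in by macros such as INIT_HYPFS_FIXEDSIZE in
hypfs.c, which record the size of the backing variable and give a read-only
leaf a max_size of 0. A minimal standalone sketch of that pattern follows. The
struct, the macro, and the "ncpus" leaf name are hypothetical simplifications
and not the hypfs API. The sketch wraps the assignments in do/while so that
the macro behaves as a single statement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a hypfs leaf entry. */
struct sketch_leaf {
    const char *name;
    size_t size;        /* current content size */
    size_t max_size;    /* 0 means the leaf is read-only */
    const void *content;
};

/* Initialise a leaf backed by a fixed-size variable. */
#define SKETCH_INIT_FIXEDSIZE(var, nam, contvar, wr)        \
    do {                                                    \
        (var).name = (nam);                                 \
        (var).size = sizeof(contvar);                       \
        (var).max_size = (wr) ? sizeof(contvar) : 0;        \
        (var).content = &(contvar);                         \
    } while ( 0 )

/* Read back a 32-bit unsigned leaf. */
static uint32_t sketch_read_uint(const struct sketch_leaf *leaf)
{
    assert(leaf->size == sizeof(uint32_t));
    return *(const uint32_t *)leaf->content;
}
```

The do/while wrapper is a design choice of the sketch: a macro that expands to
several bare statements would misbehave when used as the body of an unbraced
if or else.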

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
 xen/common/domain-builder/Kconfig  |  11 ++
 xen/common/domain-builder/Makefile |   1 +
 xen/common/domain-builder/core.c   |   3 +
 xen/common/domain-builder/hypfs.c  | 193 +++++++++++++++++++++++++++++
 xen/include/xen/domain_builder.h   |  13 ++
 5 files changed, 221 insertions(+)
 create mode 100644 xen/common/domain-builder/hypfs.c

diff --git a/xen/common/domain-builder/Kconfig b/xen/common/domain-builder/Kconfig
index 0232e1ed8a..4b98cccfab 100644
--- a/xen/common/domain-builder/Kconfig
+++ b/xen/common/domain-builder/Kconfig
@@ -22,4 +22,15 @@ config MULTIDOM_BUILDER
 
 	  If unsure, say N.
 
+config BUILDER_HYPFS
+	bool "Domain builder hypfs support (UNSUPPORTED)" if UNSUPPORTED
+	depends on HYPFS
+	---help---
+	  Exposes the domain builder construction information
+	  through hypfs.
+
+	  This feature is currently experimental.
+
+	  If unsure, say N.
+
 endmenu
diff --git a/xen/common/domain-builder/Makefile b/xen/common/domain-builder/Makefile
index 9561602502..7aa2ea2a53 100644
--- a/xen/common/domain-builder/Makefile
+++ b/xen/common/domain-builder/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_BUILDER_FDT) += fdt.o
+obj-$(CONFIG_BUILDER_HYPFS) += hypfs.o
 obj-y += core.o
diff --git a/xen/common/domain-builder/core.c b/xen/common/domain-builder/core.c
index c6a268eb96..f41ca3ed35 100644
--- a/xen/common/domain-builder/core.c
+++ b/xen/common/domain-builder/core.c
@@ -134,6 +134,9 @@ uint32_t __init builder_create_domains(struct boot_info *info)
         /* Free temporary buffers. */
         discard_initial_images();
 
+    if ( IS_ENABLED(CONFIG_BUILDER_HYPFS) )
+        builder_hypfs(info);
+
     return build_count;
 }
 
diff --git a/xen/common/domain-builder/hypfs.c b/xen/common/domain-builder/hypfs.c
new file mode 100644
index 0000000000..28f8b13d85
--- /dev/null
+++ b/xen/common/domain-builder/hypfs.c
@@ -0,0 +1,193 @@
+#include <xen/bootinfo.h>
+#include <xen/domain_builder.h>
+#include <xen/hypfs.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/list.h>
+#include <xen/string.h>
+#include <xen/xmalloc.h>
+
+#define INIT_HYPFS_DIR(var, nam)                 \
+    var.e.type = XEN_HYPFS_TYPE_DIR;             \
+    var.e.encoding = XEN_HYPFS_ENC_PLAIN;        \
+    var.e.name = (nam);                          \
+    var.e.size = 0;                              \
+    var.e.max_size = 0;                          \
+    INIT_LIST_HEAD(&var.e.list);                 \
+    var.e.funcs = (&hypfs_dir_funcs);            \
+    INIT_LIST_HEAD(&var.dirlist)
+
+#define INIT_HYPFS_FIXEDSIZE(var, typ, nam, contvar, fn, wr) \
+    var.e.type = (typ);                                      \
+    var.e.encoding = XEN_HYPFS_ENC_PLAIN;                    \
+    var.e.name = (nam);                                      \
+    var.e.size = sizeof(contvar);                            \
+    var.e.max_size = (wr) ? sizeof(contvar) : 0;             \
+    var.e.funcs = (fn);                                      \
+    var.u.content = &(contvar)
+
+#define INIT_HYPFS_UINT(var, nam, contvar)                       \
+    INIT_HYPFS_FIXEDSIZE(var, XEN_HYPFS_TYPE_UINT, nam, contvar, \
+                         &hypfs_leaf_ro_funcs, 0)
+
+#define INIT_HYPFS_BOOL(var, nam, contvar)                       \
+    INIT_HYPFS_FIXEDSIZE(var, XEN_HYPFS_TYPE_BOOL, nam, contvar, \
+                         &hypfs_leaf_ro_funcs, 0)
+
+#define INIT_HYPFS_VARSIZE(var, typ, nam, msz, fn) \
+    var.e.type = (typ);                            \
+    var.e.encoding = XEN_HYPFS_ENC_PLAIN;          \
+    var.e.name = (nam);                            \
+    var.e.max_size = (msz);                        \
+    var.e.funcs = (fn)
+
+#define INIT_HYPFS_STRING(var, nam)               \
+    INIT_HYPFS_VARSIZE(var, XEN_HYPFS_TYPE_STRING, nam, 0, &hypfs_leaf_ro_funcs)
+
+struct device_node {
+    struct hypfs_entry_dir dir;
+
+    uint32_t evtchn;
+    struct hypfs_entry_leaf evtchn_leaf;
+
+    xen_pfn_t mfn;
+    struct hypfs_entry_leaf mfn_leaf;
+};
+
+struct domain_node {
+    char dir_name[HYPFS_DYNDIR_ID_NAMELEN];
+    struct hypfs_entry_dir dir;
+
+    char uuid[40];
+    struct hypfs_entry_leaf uuid_leaf;
+
+    uint16_t functions;
+    struct hypfs_entry_leaf func_leaf;
+
+    uint32_t ncpus;
+    struct hypfs_entry_leaf ncpus_leaf;
+
+    uint32_t mem_size;
+    struct hypfs_entry_leaf mem_sz_leaf;
+
+    uint32_t mem_max;
+    struct hypfs_entry_leaf mem_mx_leaf;
+
+    bool constructed;
+    struct hypfs_entry_leaf const_leaf;
+
+    struct device_node xs;
+
+    struct hypfs_entry_dir dev_dir;
+
+    struct device_node con_dev;
+};
+
+static struct hypfs_entry_dir __read_mostly *builder_dir;
+static struct domain_node __read_mostly *entries;
+
+static int __init alloc_hypfs(struct boot_info *info)
+{
+    if ( !(builder_dir = (struct hypfs_entry_dir *)xmalloc_bytes(
+                        sizeof(struct hypfs_entry_dir))) )
+    {
+        printk(XENLOG_WARNING "%s: unable to allocate hypfs dir\n", __func__);
+        return -ENOMEM;
+    }
+
+    builder_dir->e.type = XEN_HYPFS_TYPE_DIR;
+    builder_dir->e.encoding = XEN_HYPFS_ENC_PLAIN;
+    builder_dir->e.name = "builder";
+    builder_dir->e.size = 0;
+    builder_dir->e.max_size = 0;
+    INIT_LIST_HEAD(&builder_dir->e.list);
+    builder_dir->e.funcs = &hypfs_dir_funcs;
+    INIT_LIST_HEAD(&builder_dir->dirlist);
+
+    if ( !(entries = (struct domain_node *)xmalloc_bytes(
+                        sizeof(struct domain_node) * info->builder->nr_doms)) )
+    {
+        printk(XENLOG_WARNING "%s: unable to allocate hypfs nodes\n", __func__);
+        return -ENOMEM;
+    }
+
+    return 0;
+}
+
+void __init builder_hypfs(struct boot_info *info)
+{
+    int i;
+
+    printk("Domain Builder: creating hypfs nodes\n");
+
+    if ( alloc_hypfs(info) != 0 )
+        return;
+
+    for ( i = 0; i < info->builder->nr_doms; i++ )
+    {
+        struct domain_node *e = &entries[i];
+        struct boot_domain *bd = &info->builder->domains[i];
+        uint8_t *uuid = bd->uuid;
+
+        snprintf(e->dir_name, sizeof(e->dir_name), "%d", bd->domid);
+
+        snprintf(e->uuid, sizeof(e->uuid), "%08x-%04x-%04x-%04x-%04x%08x",
+                 *(uint32_t *)uuid, *(uint16_t *)(uuid+4),
+                 *(uint16_t *)(uuid+6), *(uint16_t *)(uuid+8),
+                 *(uint16_t *)(uuid+10), *(uint32_t *)(uuid+12));
+
+        e->functions = bd->functions;
+        e->constructed = bd->constructed;
+
+        e->ncpus = bd->ncpus;
+        e->mem_size = (bd->meminfo.mem_size.nr_pages * PAGE_SIZE)/1024;
+        e->mem_max = (bd->meminfo.mem_max.nr_pages * PAGE_SIZE)/1024;
+
+        e->xs.evtchn = bd->store.evtchn;
+        e->xs.mfn = bd->store.mfn;
+
+        e->con_dev.evtchn = bd->console.evtchn;
+        e->con_dev.mfn = bd->console.mfn;
+
+        /* Initialize and construct builder hypfs tree */
+        INIT_HYPFS_DIR(e->dir, e->dir_name);
+        INIT_HYPFS_DIR(e->xs.dir, "xenstore");
+        INIT_HYPFS_DIR(e->dev_dir, "devices");
+        INIT_HYPFS_DIR(e->con_dev.dir, "console");
+
+        INIT_HYPFS_STRING(e->uuid_leaf, "uuid");
+        hypfs_string_set_reference(&e->uuid_leaf, e->uuid);
+        INIT_HYPFS_UINT(e->func_leaf, "functions", e->functions);
+        INIT_HYPFS_UINT(e->ncpus_leaf, "ncpus", e->ncpus);
+        INIT_HYPFS_UINT(e->mem_sz_leaf, "mem_size", e->mem_size);
+        INIT_HYPFS_UINT(e->mem_mx_leaf, "mem_max", e->mem_max);
+        INIT_HYPFS_BOOL(e->const_leaf, "constructed", e->constructed);
+
+        INIT_HYPFS_UINT(e->xs.evtchn_leaf, "evtchn", e->xs.evtchn);
+        INIT_HYPFS_UINT(e->xs.mfn_leaf, "mfn", e->xs.mfn);
+
+        INIT_HYPFS_UINT(e->con_dev.evtchn_leaf, "evtchn", e->con_dev.evtchn);
+        INIT_HYPFS_UINT(e->con_dev.mfn_leaf, "mfn", e->con_dev.mfn);
+
+        hypfs_add_leaf(&e->con_dev.dir, &e->con_dev.evtchn_leaf, true);
+        hypfs_add_leaf(&e->con_dev.dir, &e->con_dev.mfn_leaf, true);
+        hypfs_add_dir(&e->dev_dir, &e->con_dev.dir, true);
+
+        hypfs_add_dir(&e->dir, &e->dev_dir, true);
+
+        hypfs_add_leaf(&e->xs.dir, &e->xs.evtchn_leaf, true);
+        hypfs_add_leaf(&e->xs.dir, &e->xs.mfn_leaf, true);
+        hypfs_add_dir(&e->dir, &e->xs.dir, true);
+
+        hypfs_add_leaf(&e->dir, &e->uuid_leaf, true);
+        hypfs_add_leaf(&e->dir, &e->func_leaf, true);
+        hypfs_add_leaf(&e->dir, &e->ncpus_leaf, true);
+        hypfs_add_leaf(&e->dir, &e->mem_sz_leaf, true);
+        hypfs_add_leaf(&e->dir, &e->mem_mx_leaf, true);
+        hypfs_add_leaf(&e->dir, &e->const_leaf, true);
+
+        hypfs_add_dir(builder_dir, &e->dir, true);
+    }
+
+    hypfs_add_dir(&hypfs_root, builder_dir, true);
+}
diff --git a/xen/include/xen/domain_builder.h b/xen/include/xen/domain_builder.h
index f9e43c9689..086968b0fe 100644
--- a/xen/include/xen/domain_builder.h
+++ b/xen/include/xen/domain_builder.h
@@ -72,4 +72,17 @@ int alloc_system_evtchn(
     const struct boot_info *info, struct boot_domain *bd);
 void arch_create_dom(const struct boot_info *bi, struct boot_domain *bd);
 
+#ifdef CONFIG_HYPFS
+
+void builder_hypfs(struct boot_info *info);
+
+#else
+
+static inline void builder_hypfs(struct boot_info *info)
+{
+    return;
+}
+
+#endif
+
 #endif /* XEN_DOMAIN_BUILDER_H */
-- 
2.20.1




* [PATCH v1 18/18] tools: introduce example late pv helper
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (16 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 17/18] builder: introduce domain builder hypfs tree Daniel P. Smith
@ 2022-07-06 21:04 ` Daniel P. Smith
  2022-07-19 17:06 ` [PATCH v1 00/18] Hyperlaunch Smith, Jackson
  18 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-06 21:04 UTC (permalink / raw)
  To: xen-devel
  Cc: Daniel P. Smith, scott.davis, christopher.clark, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu, Anthony PERARD

The late pv helper is an example helper tool for late setup of Xenstore for a
domain that was created by the hypervisor using hyperlaunch.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
---
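For readers following the thread, here is a minimal sketch of the hypfs
layout this helper depends on. The leaf names are the ones read by
read_hypfs_tree() in builder-hypfs.c below; builder_leaf_path(),
builder_leaves and print_builder_paths() are hypothetical names used only
for illustration and are not part of the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Same bound the patch uses for its static path buffers. */
#define HYPFS_MAX_PATH 100

/*
 * Hypothetical helper (not part of the patch): format the hypfs path of one
 * leaf under /builder/<domid>/. Returns 0 on success, -1 if the path would
 * not fit in the buffer.
 */
static int builder_leaf_path(char *buf, size_t len, uint16_t domid,
                             const char *leaf)
{
    int n = snprintf(buf, len, "/builder/%u/%s", (unsigned int)domid, leaf);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Leaves read by read_hypfs_tree(), relative to the per-domain directory. */
static const char *const builder_leaves[] = {
    "constructed", "uuid", "ncpus", "mem_size", "mem_max",
    "xenstore/evtchn", "xenstore/mfn",
    "devices/console/evtchn", "devices/console/mfn",
};

/* Print every hypfs path the helper would query for the given domain. */
static int print_builder_paths(uint16_t domid)
{
    char path[HYPFS_MAX_PATH];
    size_t i;

    for ( i = 0; i < sizeof(builder_leaves) / sizeof(builder_leaves[0]); i++ )
    {
        if ( builder_leaf_path(path, sizeof(path), domid, builder_leaves[i]) )
            return -1;
        puts(path);
    }

    return 0;
}
```

Going by the usage text in late-init-pv.c, an invocation for domain 1 with a
console would presumably look like "late-init-pv --domain 1 --console".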
 .gitignore                    |   1 +
 tools/helpers/Makefile        |  11 ++
 tools/helpers/builder-hypfs.c | 253 ++++++++++++++++++++++++++++++
 tools/helpers/hypfs-helpers.h |   9 ++
 tools/helpers/late-init-pv.c  | 287 ++++++++++++++++++++++++++++++++++
 tools/helpers/late-init-pv.h  |  29 ++++
 tools/helpers/xs-helpers.c    | 117 ++++++++++++++
 tools/helpers/xs-helpers.h    |  27 ++++
 8 files changed, 734 insertions(+)
 create mode 100644 tools/helpers/builder-hypfs.c
 create mode 100644 tools/helpers/hypfs-helpers.h
 create mode 100644 tools/helpers/late-init-pv.c
 create mode 100644 tools/helpers/late-init-pv.h
 create mode 100644 tools/helpers/xs-helpers.c
 create mode 100644 tools/helpers/xs-helpers.h

diff --git a/.gitignore b/.gitignore
index 18ef56a780..0e5d5ceaab 100644
--- a/.gitignore
+++ b/.gitignore
@@ -206,6 +206,7 @@ tools/fuzz/x86_instruction_emulator/x86_emulate
 tools/fuzz/x86_instruction_emulator/x86-emulate.[ch]
 tools/helpers/init-xenstore-domain
 tools/helpers/xen-init-dom0
+tools/helpers/late-init-pv
 tools/hotplug/common/hotplugpath.sh
 tools/hotplug/FreeBSD/rc.d/xencommons
 tools/hotplug/FreeBSD/rc.d/xendriverdomain
diff --git a/tools/helpers/Makefile b/tools/helpers/Makefile
index 8d78ab1e90..c32481202d 100644
--- a/tools/helpers/Makefile
+++ b/tools/helpers/Makefile
@@ -14,6 +14,7 @@ ifeq ($(CONFIG_ARM),y)
 PROGS += init-dom0less
 endif
 endif
+PROGS += late-init-pv
 
 XEN_INIT_DOM0_OBJS = xen-init-dom0.o init-dom-json.o
 $(XEN_INIT_DOM0_OBJS): CFLAGS += $(CFLAGS_libxentoollog)
@@ -36,6 +37,13 @@ $(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenlight)
 $(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenctrl)
 $(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenevtchn)
 
+LATE_INIT_PV_OBJS = late-init-pv.o builder-hypfs.o xs-helpers.o
+$(LATE_INIT_PV_OBJS): CFLAGS += $(CFLAGS_libxentoollog)
+$(LATE_INIT_PV_OBJS): CFLAGS += $(CFLAGS_libxenguest)
+$(LATE_INIT_PV_OBJS): CFLAGS += $(CFLAGS_libxenctrl)
+$(LATE_INIT_PV_OBJS): CFLAGS += $(CFLAGS_libxenhypfs)
+$(LATE_INIT_PV_OBJS): CFLAGS += $(CFLAGS_libxenstore)
+
 .PHONY: all
 all: $(PROGS)
 
@@ -48,6 +56,9 @@ init-xenstore-domain: $(INIT_XENSTORE_DOMAIN_OBJS)
 init-dom0less: $(INIT_DOM0LESS_OBJS)
 	$(CC) $(LDFLAGS) -o $@ $(INIT_DOM0LESS_OBJS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenevtchn) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenlight) $(LDLIBS_libxenguest) $(LDLIBS_libxenforeignmemory) $(APPEND_LDFLAGS)
 
+late-init-pv: $(LATE_INIT_PV_OBJS)
+	$(CC) $(LDFLAGS) -o $@ $(LATE_INIT_PV_OBJS) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenhypfs) $(APPEND_LDFLAGS)
+
 .PHONY: install
 install: all
 	$(INSTALL_DIR) $(DESTDIR)$(LIBEXEC_BIN)
diff --git a/tools/helpers/builder-hypfs.c b/tools/helpers/builder-hypfs.c
new file mode 100644
index 0000000000..d123426cfa
--- /dev/null
+++ b/tools/helpers/builder-hypfs.c
@@ -0,0 +1,253 @@
+
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <xenhypfs.h>
+
+#include "late-init-pv.h"
+
+/* general size for static path array */
+#define HYPFS_MAX_PATH 100
+
+bool has_builder_hypfs(xenhypfs_handle *hdl, uint32_t domid)
+{
+    struct xenhypfs_dirent *ent;
+    char path[HYPFS_MAX_PATH];
+    unsigned int n;
+
+    snprintf(path, HYPFS_MAX_PATH, "/builder/%d", domid);
+
+    ent = xenhypfs_readdir(hdl, path, &n);
+    if ( ent )
+    {
+        free(ent);
+        return true;
+    }
+
+    return false;
+}
+
+static int read_hypfs_bool(xenhypfs_handle *fshdl, const char *path, bool *val)
+{
+    struct xenhypfs_dirent *dirent;
+    void *raw_value;
+
+    errno = 0;
+
+    raw_value = xenhypfs_read_raw(fshdl, path, &dirent);
+    if ( raw_value == NULL )
+    {
+        errno = EIO;
+        return false;
+    }
+
+    if ( dirent->type != xenhypfs_type_bool )
+    {
+        errno = EINVAL;
+        return false;
+    }
+
+    *val = *(bool *)raw_value;
+
+    free(raw_value); free(dirent);
+    return true;
+}
+
+static bool read_hypfs_uint(
+    xenhypfs_handle *fshdl, const char *path, size_t sz, void *val)
+{
+    struct xenhypfs_dirent *dirent;
+    void *raw_value;
+
+    errno = 0;
+
+    raw_value = xenhypfs_read_raw(fshdl, path, &dirent);
+    if ( raw_value == NULL )
+    {
+        errno = EIO;
+        return false;
+    }
+
+    if ( (dirent->type != xenhypfs_type_uint) ||
+         (dirent->size != sz) )
+    {
+        errno = EINVAL;
+        return false;
+    }
+
+    switch ( sz )
+    {
+    case sizeof(uint8_t):
+        *(uint8_t *)val = *(uint8_t *)raw_value;
+        break;
+    case sizeof(uint16_t):
+        *(uint16_t *)val = *(uint16_t *)raw_value;
+        break;
+    case sizeof(uint32_t):
+        *(uint32_t *)val = *(uint32_t *)raw_value;
+        break;
+    case sizeof(uint64_t):
+        *(uint64_t *)val = *(uint64_t *)raw_value;
+        break;
+    default:
+        free(raw_value); free(dirent);
+        errno = EINVAL;
+        return false;
+    }
+
+    free(raw_value); free(dirent);
+    return true;
+}
+
+static uint8_t read_hypfs_uint8(xenhypfs_handle *fshdl, const char *path)
+{
+    uint8_t value;
+
+    if ( !read_hypfs_uint(fshdl, path, sizeof(value), &value) )
+    {
+        fprintf(stderr, "error: unable to read uint8_t from %s \n", path);
+        return 0;
+    }
+
+    return value;
+}
+
+static uint16_t read_hypfs_uint16(xenhypfs_handle *fshdl, const char *path)
+{
+    uint16_t value;
+
+    if ( !read_hypfs_uint(fshdl, path, sizeof(value), &value) )
+    {
+        fprintf(stderr, "error: unable to read uint16_t from %s \n", path);
+        return 0;
+    }
+
+    return value;
+}
+
+static uint32_t read_hypfs_uint32(xenhypfs_handle *fshdl, const char *path)
+{
+    uint32_t value;
+
+    if ( !read_hypfs_uint(fshdl, path, sizeof(value), &value) )
+    {
+        fprintf(stderr, "error: unable to read uint32_t from %s \n", path);
+        return 0;
+    }
+
+    return value;
+}
+
+static uint64_t read_hypfs_uint64(xenhypfs_handle *fshdl, const char *path)
+{
+    uint64_t value;
+
+    if ( !read_hypfs_uint(fshdl, path, sizeof(value), &value) )
+    {
+        fprintf(stderr, "error: unable to read uint64_t from %s \n", path);
+        return 0;
+    }
+
+    return value;
+}
+
+static bool is_constructed(xenhypfs_handle *fshdl, uint32_t domid)
+{
+    char path[HYPFS_MAX_PATH];
+    bool constructed;
+
+    snprintf(path, HYPFS_MAX_PATH, "/builder/%d/constructed", domid);
+
+    if ( !read_hypfs_bool(fshdl, path, &constructed) )
+    {
+        fprintf(stderr, "error: unable to read constructed field\n");
+        return false;
+    }
+
+    return constructed;
+}
+
+#define XS_PATH   "/builder/%d/xenstore"
+#define CONS_PATH "/builder/%d/devices/console"
+
+int read_hypfs_tree(xenhypfs_handle *hdl, struct domain_info *di)
+{
+    char path[HYPFS_MAX_PATH];
+
+    if ( !is_constructed(hdl, di->domid) )
+    {
+        fprintf(stderr, "error: domain %d did not get constructed\n",
+                di->domid);
+        return -EEXIST;
+    }
+
+    if ( !di->override_uuid )
+    {
+        snprintf(path, HYPFS_MAX_PATH, "/builder/%d/uuid", di->domid);
+        di->uuid = xenhypfs_read(hdl, path);
+    }
+
+    snprintf(path, HYPFS_MAX_PATH, "/builder/%d/ncpus", di->domid);
+    di->num_cpu = read_hypfs_uint32(hdl, path);
+    if ( errno != 0 )
+    {
+        fprintf(stderr, "error: unable to read number of cpus\n");
+        return -errno;
+    }
+
+    snprintf(path, HYPFS_MAX_PATH, "/builder/%d/mem_size", di->domid);
+    di->mem_info.target = read_hypfs_uint32(hdl, path);
+    if ( errno != 0 )
+    {
+        fprintf(stderr, "error: unable to read memory size\n");
+        return -errno;
+    }
+
+    snprintf(path, HYPFS_MAX_PATH, "/builder/%d/mem_max", di->domid);
+    di->mem_info.max = read_hypfs_uint32(hdl, path);
+    if ( errno != 0 )
+    {
+        fprintf(stderr, "error: unable to read max memory\n");
+        return -errno;
+    }
+
+    /* Xenstore */
+    snprintf(path, HYPFS_MAX_PATH, XS_PATH "/evtchn", di->domid);
+    di->xs_info.evtchn_port = read_hypfs_uint32(hdl, path);
+    if ( errno != 0 )
+    {
+        fprintf(stderr, "error: unable to read xenstore event channel port\n");
+        return -errno;
+    }
+
+    snprintf(path, HYPFS_MAX_PATH, XS_PATH "/mfn", di->domid);
+    di->xs_info.mfn = read_hypfs_uint64(hdl, path);
+    if ( errno != 0 )
+    {
+        fprintf(stderr, "error: unable to read xenstore page mfn\n");
+        return -errno;
+    }
+
+    /* Console */
+    if ( di->cons_info.enable )
+    {
+        snprintf(path, HYPFS_MAX_PATH, CONS_PATH "/evtchn", di->domid);
+        di->cons_info.evtchn_port = read_hypfs_uint32(hdl, path);
+        if ( errno != 0 )
+        {
+            fprintf(stderr, "error: unable to read console event channel port\n");
+            return -errno;
+        }
+
+        snprintf(path, HYPFS_MAX_PATH, CONS_PATH "/mfn", di->domid);
+        di->cons_info.mfn = read_hypfs_uint64(hdl, path);
+        if ( errno != 0 )
+        {
+            fprintf(stderr, "error: unable to read console page mfn\n");
+            return -errno;
+        }
+    }
+
+    return 0;
+}
+
diff --git a/tools/helpers/hypfs-helpers.h b/tools/helpers/hypfs-helpers.h
new file mode 100644
index 0000000000..2b2de5967f
--- /dev/null
+++ b/tools/helpers/hypfs-helpers.h
@@ -0,0 +1,9 @@
+#ifndef __HYPFS_HELPERS_H
+#define __HYPFS_HELPERS_H
+
+#include "late-init-pv.h"
+
+bool has_builder_hypfs(xenhypfs_handle *hdl, uint32_t domid);
+int read_hypfs_tree(xenhypfs_handle *hdl, struct domain_info *di);
+
+#endif
diff --git a/tools/helpers/late-init-pv.c b/tools/helpers/late-init-pv.c
new file mode 100644
index 0000000000..e1602be6d5
--- /dev/null
+++ b/tools/helpers/late-init-pv.c
@@ -0,0 +1,287 @@
+
+#include <errno.h>
+#include <getopt.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <xenctrl.h>
+#include <xenguest.h>
+#include <xenhypfs.h>
+#include <xenstore.h>
+#include <xentoollog.h>
+#include <xen/io/xenbus.h>
+
+#include "hypfs-helpers.h"
+#include "late-init-pv.h"
+#include "xs-helpers.h"
+
+static struct option options[] = {
+    { "uuid", 1, NULL, 'u' },
+    { "console", 0, NULL, 'c' },
+    { "force", 0, NULL, 'f' },
+    { "domain", 1, NULL, 'd' },
+    { "verbose", 0, NULL, 'v' },
+    { NULL, 0, NULL, 0 }
+};
+
+static void usage(void)
+{
+    fprintf(stderr,
+"Usage:\n"
+"\n"
+"late-init-pv <options>\n"
+"\n"
+"where options may include:\n"
+"\n"
+"  --uuid <UUID string>     override the UUID to use for the domain\n"
+"  --console                configure the console\n"
+"  --force                  force @introduceDomain even if xenstore entries exist\n"
+"  --domain <domain id>     domain id of the domain to be initialized\n"
+"  -v[v[v]]                 verbosity constructing xenstore tree\n");
+}
+
+#define XS_DOM_PERM(x, d, k, v)                                             \
+    ret = do_xs_write_dom_with_perm(x, d, k, v, perms, num_perms);          \
+    if ( ret != 0 ) return ret                                              \
+
+#define XS_DIR_PERM(x, p, k, v)                                             \
+    ret = do_xs_write_dir_node_with_perm(x, p, k, v, perms, num_perms);     \
+    if ( ret != 0 ) return ret                                              \
+
+static int create_xs_entries(
+    struct xs_handle *xsh, uint16_t curr_domid, struct domain_info *di)
+{
+    char value[16];
+    struct xs_permissions perms[2] = {
+        {.id = curr_domid, .perms = XS_PERM_NONE},
+        {.id = di->domid, .perms = XS_PERM_READ},
+    };
+    uint32_t num_perms = (sizeof(perms) / sizeof((perms)[0]));
+    int ret = 0;
+
+    while ( do_xs_start_transaction(xsh) == 0 )
+    {
+        XS_DOM_PERM(xsh, di->domid, "", "");
+
+        snprintf(value, 16, "%d", di->domid);
+        XS_DOM_PERM(xsh, di->domid, "domid", value);
+
+        XS_DOM_PERM(xsh, di->domid, "memory", "");
+        snprintf(value, 16, "%d", di->mem_info.target);
+        XS_DOM_PERM(xsh, di->domid, "memory/target", value);
+
+        if ( di->mem_info.max )
+            snprintf(value, 16, "%d", di->mem_info.max);
+        else
+            snprintf(value, 16, "%d", di->mem_info.target);
+        XS_DOM_PERM(xsh, di->domid, "memory/static-max", value);
+
+        XS_DOM_PERM(xsh, di->domid, "store", "");
+        snprintf(value, 16, "%d", di->xs_info.evtchn_port);
+        XS_DOM_PERM(xsh, di->domid, "store/port", value);
+
+        snprintf(value, 16, "%ld", di->xs_info.mfn);
+        XS_DOM_PERM(xsh, di->domid, "store/ring-ref", value);
+
+        if ( di->cons_info.enable )
+        {
+            char be_path[64], fe_path[64];
+
+            snprintf(fe_path, 64, "/local/domain/%d/console", di->domid);
+            snprintf(be_path, 64, "/local/domain/%d/backend/console/%d/0",
+                     di->cons_info.be_domid, di->domid);
+
+            /* Backend entries */
+            XS_DIR_PERM(xsh, be_path, "", "");
+            snprintf(value, 16, "%d", di->domid);
+            XS_DIR_PERM(xsh, be_path, "frontend-id", value);
+            XS_DIR_PERM(xsh, be_path, "frontend", fe_path);
+            XS_DIR_PERM(xsh, be_path, "online", "1");
+            XS_DIR_PERM(xsh, be_path, "protocol", "vt100");
+
+            snprintf(value, 16, "%d", XenbusStateInitialising);
+            XS_DIR_PERM(xsh, be_path, "state", value);
+
+            /* Frontend entries */
+            XS_DOM_PERM(xsh, di->domid, "console", "");
+            snprintf(value, 16, "%d", di->cons_info.be_domid);
+            XS_DIR_PERM(xsh, fe_path, "backend", be_path);
+            XS_DIR_PERM(xsh, fe_path, "backend-id", value);
+            XS_DIR_PERM(xsh, fe_path, "limit", "1048576");
+            XS_DIR_PERM(xsh, fe_path, "type", "xenconsoled");
+            XS_DIR_PERM(xsh, fe_path, "output", "pty");
+            XS_DIR_PERM(xsh, fe_path, "tty", "");
+
+            snprintf(value, 16, "%d", di->cons_info.evtchn_port);
+            XS_DIR_PERM(xsh, fe_path, "port", value);
+
+            snprintf(value, 16, "%ld", di->cons_info.mfn);
+            XS_DIR_PERM(xsh, fe_path, "ring-ref", value);
+
+        }
+
+        ret = do_xs_end_transaction(xsh);
+        switch ( ret )
+        {
+        case 0:
+            break; /* proceed to loop break */
+        case -EAGAIN:
+            continue; /* try again */
+        default:
+            return ret; /* failed */
+        }
+
+        break;
+    }
+
+    return ret;
+}
+
+static bool init_domain(struct xs_handle *xsh, struct domain_info *di)
+{
+    xc_interface *xch = xc_interface_open(0, 0, 0);
+    xen_pfn_t con_mfn = 0L;
+    /* xc_dom_gnttab_seed will do nothing if front == back */
+    uint32_t con_domid = di->domid;
+    int ret;
+
+    /* console */
+    if ( di->cons_info.enable )
+    {
+        con_domid = di->cons_info.be_domid;
+        con_mfn = di->cons_info.mfn;
+    }
+
+    ret = xc_dom_gnttab_seed(xch, di->domid, di->is_hvm, con_mfn,
+            di->xs_info.mfn, con_domid, di->xs_info.be_domid);
+    if ( ret != 0 )
+    {
+        fprintf(stderr, "error (%d) setting up grant tables for dom%d\n",
+                ret, di->domid);
+        xc_interface_close(xch);
+        return false;
+    }
+
+    xc_interface_close(xch);
+
+    return xs_introduce_domain(xsh, di->domid, di->xs_info.mfn,
+                               di->xs_info.evtchn_port);
+}
+
+int main(int argc, char** argv)
+{
+    int opt, rv;
+    bool force = false;
+    struct xs_handle *xsh = NULL;
+    xenhypfs_handle *xhfs = NULL;
+    xentoollog_level minmsglevel = XTL_PROGRESS;
+    xentoollog_logger *logger = NULL;
+    struct domain_info di = { .domid = ~0 };
+
+    while ( (opt = getopt_long(argc, argv, "cfd:v", options, NULL)) != -1 )
+    {
+        switch ( opt )
+        {
+        case 'u':
+            di.override_uuid = true;
+            di.uuid = optarg;
+            break;
+        case 'c':
+            di.cons_info.enable = true;
+            break;
+        case 'f':
+            force = true;
+            break;
+        case 'd':
+            di.domid = strtol(optarg, NULL, 10);
+            break;
+        case 'v':
+            if ( minmsglevel )
+                minmsglevel--;
+            break;
+        default:
+            usage();
+            return 2;
+        }
+    }
+
+    if ( optind != argc || di.domid == (uint16_t)~0 )
+    {
+        usage();
+        return 1;
+    }
+
+    logger = (xentoollog_logger *)xtl_createlogger_stdiostream(stderr,
+                                                               minmsglevel, 0);
+
+    xhfs = xenhypfs_open(logger, 0);
+    if ( !xhfs )
+    {
+        fprintf(stderr, "error: unable to access xen hypfs\n");
+        rv = 2;
+        goto out;
+    }
+
+    if ( !has_builder_hypfs(xhfs, di.domid) )
+    {
+        fprintf(stderr, "error: hypfs entry for domain %d not present\n",
+                di.domid);
+        rv = 3;
+        goto out;
+    }
+
+    if ( read_hypfs_tree(xhfs, &di) != 0 )
+    {
+        fprintf(stderr, "error: unable to parse hypfs for domain %d\n",
+                di.domid);
+        rv = 4;
+        goto out;
+    }
+
+    xsh = xs_open(0);
+    if ( xsh == NULL )
+    {
+        fprintf(stderr, "error: unable to connect to xenstored\n");
+        rv = 5;
+        goto out;
+    }
+
+    if ( xs_is_domain_introduced(xsh, di.domid) )
+    {
+        if ( !force )
+        {
+            fprintf(stderr, "error: domain %d already introduced\n", di.domid);
+            rv = 6;
+            goto out;
+        }
+        else
+        {
+            fprintf(stderr, "warning: re-introducing domain %d\n", di.domid);
+        }
+    }
+
+    /* TODO: hardcoding local domain to 0 for testing purposes */
+    if ( (rv = create_xs_entries(xsh, 0, &di)) != 0 )
+    {
+        fprintf(stderr, "error(%d): unable to create xenstore entries\n", rv);
+        rv = 7;
+        goto out;
+    }
+
+    init_domain(xsh, &di);
+    rv = 0;
+
+out:
+    if ( xsh )
+        xs_close(xsh);
+
+    if ( xhfs )
+        xenhypfs_close(xhfs);
+
+    if ( logger )
+        xtl_logger_destroy(logger);
+
+    return rv;
+}
diff --git a/tools/helpers/late-init-pv.h b/tools/helpers/late-init-pv.h
new file mode 100644
index 0000000000..5d66e7870f
--- /dev/null
+++ b/tools/helpers/late-init-pv.h
@@ -0,0 +1,29 @@
+#ifndef __LATE_INIT_PV_H
+#define __LATE_INIT_PV_H
+
+struct domain_info {
+    uint16_t domid;
+    bool is_hvm;
+    bool override_uuid;
+    const char *uuid;
+    uint32_t num_cpu;
+    uint32_t max_cpu;
+    struct {
+        uint32_t target;
+        uint32_t max;
+        uint32_t video;
+    } mem_info;
+    struct {
+        uint16_t be_domid;
+        uint32_t evtchn_port;
+        uint64_t mfn;
+    } xs_info;
+    struct {
+        bool enable;
+        uint16_t be_domid;
+        uint32_t evtchn_port;
+        uint64_t mfn;
+    } cons_info;
+};
+
+#endif
diff --git a/tools/helpers/xs-helpers.c b/tools/helpers/xs-helpers.c
new file mode 100644
index 0000000000..a4d2bebbbd
--- /dev/null
+++ b/tools/helpers/xs-helpers.c
@@ -0,0 +1,117 @@
+
+#include <errno.h>
+#include <stdio.h>
+#include <string.h>
+#include <xenstore.h>
+
+#define MAX_XS_PAATH 100
+
+static xs_transaction_t t_id = XBT_NULL;
+
+int do_xs_start_transaction(struct xs_handle *xsh)
+{
+    t_id = xs_transaction_start(xsh);
+    if (t_id == XBT_NULL)
+        return -errno;
+
+    return 0;
+}
+
+int do_xs_end_transaction(struct xs_handle *xsh)
+{
+    if ( t_id == XBT_NULL )
+        return -EINVAL;
+
+    if (!xs_transaction_end(xsh, t_id, false))
+        return -errno;
+
+    return 0;
+}
+
+int do_xs_write(struct xs_handle *xsh, char *path, char *val)
+{
+    if ( !xs_write(xsh, t_id, path, val, strlen(val)) )
+    {
+        fprintf(stderr, "failed write: %s\n", path);
+        return -errno;
+    }
+
+    return 0;
+}
+
+int do_xs_perms(
+    struct xs_handle *xsh, char *path, struct xs_permissions *perms,
+    uint32_t num_perms)
+{
+    if ( !xs_set_permissions(xsh, t_id, path, perms, num_perms) )
+    {
+        fprintf(stderr, "failed set perm: %s\n", path);
+        return -errno;
+    }
+
+    return 0;
+}
+
+int do_xs_write_dir_node_with_perm(
+    struct xs_handle *xsh, char *dir, char *node, char *val,
+    struct xs_permissions *perms, uint32_t num_perms)
+{
+    char full_path[MAX_XS_PAATH];
+    int ret = 0;
+
+    /*
+     * mainly for creating a value holding node, but
+     * also support creating directory nodes.
+     */
+    if ( strlen(node) != 0 )
+        snprintf(full_path, MAX_XS_PAATH, "%s/%s", dir, node);
+    else
+        snprintf(full_path, MAX_XS_PAATH, "%s", dir);
+
+    ret = do_xs_write(xsh, full_path, val);
+    if ( ret < 0 )
+        return ret;
+
+    if ( perms != NULL && num_perms > 0 )
+        ret = do_xs_perms(xsh, full_path, perms, num_perms);
+
+    return ret;
+}
+
+int do_xs_write_dir_node(
+    struct xs_handle *xsh, char *dir, char *node, char *val)
+{
+    return do_xs_write_dir_node_with_perm(xsh, dir, node, val, NULL, 0);
+}
+
+int do_xs_write_dom_with_perm(
+    struct xs_handle *xsh, uint32_t domid, char *path, char *val,
+    struct xs_permissions *perms, uint32_t num_perms)
+{
+    char full_path[MAX_XS_PAATH];
+    int ret = 0;
+
+    /*
+     * mainly for creating a value holding node, but
+     * also support creating directory nodes.
+     */
+    if ( strlen(path) != 0 )
+        snprintf(full_path, MAX_XS_PAATH, "/local/domain/%d/%s", domid, path);
+    else
+        snprintf(full_path, MAX_XS_PAATH, "/local/domain/%d", domid);
+
+    ret = do_xs_write(xsh, full_path, val);
+    if ( ret < 0 )
+        return ret;
+
+    if ( perms != NULL && num_perms > 0 )
+        ret = do_xs_perms(xsh, full_path, perms, num_perms);
+
+    return ret;
+}
+
+int do_xs_write_dom(
+    struct xs_handle *xsh, uint32_t domid, char *path, char *val)
+{
+    return do_xs_write_dom_with_perm(xsh, domid, path, val, NULL, 0);
+}
diff --git a/tools/helpers/xs-helpers.h b/tools/helpers/xs-helpers.h
new file mode 100644
index 0000000000..f57fcab843
--- /dev/null
+++ b/tools/helpers/xs-helpers.h
@@ -0,0 +1,27 @@
+#ifndef __XS_HELPERS_H
+#define __XS_HELPERS_H
+
+#include <xenstore.h>
+
+int do_xs_start_transaction(struct xs_handle *xsh);
+int do_xs_end_transaction(struct xs_handle *xsh);
+
+int do_xs_write(struct xs_handle *xsh, char *path, char *val);
+int do_xs_perms(
+    struct xs_handle *xsh, char *path, struct xs_permissions *perms,
+    uint32_t num_perms);
+
+int do_xs_write_dir_node_with_perm(
+    struct xs_handle *xsh, char *dir, char *node, char *val,
+    struct xs_permissions *perms, uint32_t num_perms);
+int do_xs_write_dir_node(
+    struct xs_handle *xsh, char *dir, char *node, char *val);
+
+int do_xs_write_dom_with_perm(
+    struct xs_handle *xsh, uint32_t domid, char *path, char *val,
+    struct xs_permissions *perms, uint32_t num_perms);
+int do_xs_write_dom(
+    struct xs_handle *xsh, uint32_t domid, char *path, char *val);
+
+#endif
+
-- 
2.20.1




* RE: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-06 21:04 ` [PATCH v1 01/18] kconfig: allow configuration of maximum modules Daniel P. Smith
@ 2022-07-07  1:44   ` Henry Wang
  2022-07-15 19:16   ` Julien Grall
  2022-07-19  9:32   ` Jan Beulich
  2 siblings, 0 replies; 66+ messages in thread
From: Henry Wang @ 2022-07-07  1:44 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel
  Cc: scott.davis, christopher.clark, Andrew Cooper, Wei Liu,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Roger Pau Monné,
	Volodymyr Babchuk

Hi Daniel,

> -----Original Message-----
> Subject: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
> 
> For x86 the number of allowable multiboot modules varies between the
> different entry points, non-efi boot, pvh boot, and efi boot. In the case of
> both Arm and x86 this value is fixed to values based on generalized
> assumptions. With hyperlaunch for x86 and dom0less on Arm, use of static
> sizes results in large allocations compiled into the hypervisor that will go
> unused by many use cases.
> 
> This commit introduces a Kconfig variable that is set with sane defaults based
> on configuration selection. This variable is in turn used as the array size
> for the cases where a statically allocated array of boot modules is declared.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>

Reviewed-by: Henry Wang <Henry.Wang@arm.com>
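
As a rough illustration of the mechanism described above, here is a minimal
sketch of a static array sized from the Kconfig value. Only
CONFIG_NR_BOOTMODS and MAX_MODULES come from the patch; the fallback define,
struct example_bootmod and example_add_bootmod() are assumptions made so the
sketch compiles on its own.

```c
#include <stddef.h>

/*
 * Assumption: in a real Xen build CONFIG_NR_BOOTMODS is generated by Kconfig
 * (default 8 on x86, 32 on Arm per this patch). The fallback below exists
 * only so that this sketch builds standalone.
 */
#ifndef CONFIG_NR_BOOTMODS
#define CONFIG_NR_BOOTMODS 8
#endif

#define MAX_MODULES CONFIG_NR_BOOTMODS

/* Hypothetical stand-in for the hypervisor's boot module record. */
struct example_bootmod {
    unsigned long start;
    unsigned long size;
};

/* Sized from the Kconfig value instead of a hard-coded constant. */
static struct example_bootmod example_mods[MAX_MODULES];
static size_t example_nr_mods;

/* Record a module; returns 0 on success, -1 once the static array is full. */
static int example_add_bootmod(unsigned long start, unsigned long size)
{
    if ( example_nr_mods >= (size_t)MAX_MODULES )
        return -1;

    example_mods[example_nr_mods].start = start;
    example_mods[example_nr_mods].size = size;
    example_nr_mods++;

    return 0;
}
```

A build that needs fewer modules can lower the value and shrink the array,
while a hyperlaunch or dom0less configuration can raise it.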

> ---
>  xen/arch/Kconfig                  | 12 ++++++++++++
>  xen/arch/arm/include/asm/setup.h  |  5 +++--
>  xen/arch/x86/efi/efi-boot.h       |  2 +-
>  xen/arch/x86/guest/xen/pvh-boot.c |  2 +-
>  xen/arch/x86/setup.c              |  4 ++--
>  5 files changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
> index f16eb0df43..24139057be 100644
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -17,3 +17,15 @@ config NR_CPUS
>  	  For CPU cores which support Simultaneous Multi-Threading or
> similar
>  	  technologies, this the number of logical threads which Xen will
>  	  support.
> +
> +config NR_BOOTMODS
> +	int "Maximum number of boot modules that a loader can pass"
> +	range 1 32768
> +	default "8" if X86
> +	default "32" if ARM
> +	help
> +	  Controls the build-time size of various arrays allocated for
> +	  parsing the boot modules passed by a loader when starting Xen.
> +
> +	  This is of particular interest when using Xen's hypervisor domain
> +	  capabilities such as dom0less.
> diff --git a/xen/arch/arm/include/asm/setup.h
> b/xen/arch/arm/include/asm/setup.h
> index 2bb01ecfa8..312a3e4209 100644
> --- a/xen/arch/arm/include/asm/setup.h
> +++ b/xen/arch/arm/include/asm/setup.h
> @@ -10,7 +10,8 @@
> 
>  #define NR_MEM_BANKS 256
> 
> -#define MAX_MODULES 32 /* Current maximum useful modules */
> +/* Current maximum useful modules */
> +#define MAX_MODULES CONFIG_NR_BOOTMODS
> 
>  typedef enum {
>      BOOTMOD_XEN,
> @@ -38,7 +39,7 @@ struct meminfo {
>   * The domU flag is set for kernels and ramdisks of "xen,domain" nodes.
>   * The purpose of the domU flag is to avoid getting confused in
>   * kernel_probe, where we try to guess which is the dom0 kernel and
> - * initrd to be compatible with all versions of the multiboot spec.
> + * initrd to be compatible with all versions of the multiboot spec.

Thanks for taking the opportunity to remove the trailing space at the end of the sentence.

Kind regards,
Henry




^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH v1 06/18] fdt: make fdt handling reusable across arch
  2022-07-06 21:04 ` [PATCH v1 06/18] fdt: make fdt handling reusable across arch Daniel P. Smith
@ 2022-07-07  1:44   ` Henry Wang
  2022-07-19  9:36   ` Jan Beulich
  1 sibling, 0 replies; 66+ messages in thread
From: Henry Wang @ 2022-07-07  1:44 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel, Volodymyr Babchuk
  Cc: scott.davis, christopher.clark, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

Hi Daniel,

> -----Original Message-----
> Subject: [PATCH v1 06/18] fdt: make fdt handling reusable across arch
> 
> This refactors reusable code from Arm's bootfdt.c and device-tree.h that is
> general fdt handling code.  The Kconfig parameter CORE_DEVICE_TREE is
> introduced for when the ability of parsing DTB files is needed by a capability
> such as hyperlaunch.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>

Reviewed-by: Henry Wang <Henry.Wang@arm.com>

Kind regards,
Henry

> ---
>  xen/arch/arm/bootfdt.c        | 115 +----------------------------
>  xen/common/Kconfig            |   4 ++
>  xen/common/Makefile           |   3 +-
>  xen/common/fdt.c              | 131 ++++++++++++++++++++++++++++++++++
>  xen/include/xen/device_tree.h |  50 +------------
>  xen/include/xen/fdt.h         |  79 ++++++++++++++++++++



^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH v1 08/18] kconfig: introduce domain builder config option
  2022-07-06 21:04 ` [PATCH v1 08/18] kconfig: introduce domain builder config option Daniel P. Smith
@ 2022-07-07  1:44   ` Henry Wang
  2022-07-19 13:29   ` Jan Beulich
  1 sibling, 0 replies; 66+ messages in thread
From: Henry Wang @ 2022-07-07  1:44 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

Hi Daniel,

> -----Original Message-----
> Subject: [PATCH v1 08/18] kconfig: introduce domain builder config option
> 
> Hyperlaunch domain builder is the consolidated boot time domain building
> logic
> framework.  This commit introduces the first config option for the domain
> builder to control support for loading the domain configurations via the
> flattened device tree.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>

Reviewed-by: Henry Wang <Henry.Wang@arm.com>

Kind regards,
Henry

> ---
>  xen/common/Kconfig                |  1 +
>  xen/common/domain-builder/Kconfig | 15 +++++++++++++++



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-06 21:04 ` [PATCH v1 01/18] kconfig: allow configuration of maximum modules Daniel P. Smith
  2022-07-07  1:44   ` Henry Wang
@ 2022-07-15 19:16   ` Julien Grall
  2022-07-19 16:36     ` Daniel P. Smith
  2022-07-19  9:32   ` Jan Beulich
  2 siblings, 1 reply; 66+ messages in thread
From: Julien Grall @ 2022-07-15 19:16 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel, Volodymyr Babchuk, Wei Liu
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Jan Beulich, Stefano Stabellini, Bertrand Marquis,
	Roger Pau Monné

Hi Daniel,

On 06/07/2022 22:04, Daniel P. Smith wrote:
> For x86 the number of allowable multiboot modules varies between the different
> entry points, non-efi boot, pvh boot, and efi boot. In the case of both Arm and
> x86 this value is fixed to values based on generalized assumptions. With
> hyperlaunch for x86 and dom0less on Arm, use of static sizes results in large
> allocations compiled into the hypervisor that will go unused by many use cases.
> 
> This commit introduces a Kconfig variable that is set with sane defaults based
> on configuration selection. This variable is in turned used as the array size
> for the cases where a static allocated array of boot modules is declared.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>

I am not entirely sure where this reviewed-by is coming from. Is this 
from internal review?

If yes, my recommendation would be to provide the reviewed-by on the
mailing list. Ideally, the review should also be done in the open, but I
understand some companies wish to do a fully internal review first.

At least from a committer perspective, this helps me to know whether the
reviewed-by still applies. For example, if you send a v2, I would
not be able to know whether Christopher still agrees with the change.

> ---
>   xen/arch/Kconfig                  | 12 ++++++++++++
>   xen/arch/arm/include/asm/setup.h  |  5 +++--
>   xen/arch/x86/efi/efi-boot.h       |  2 +-
>   xen/arch/x86/guest/xen/pvh-boot.c |  2 +-
>   xen/arch/x86/setup.c              |  4 ++--
>   5 files changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
> index f16eb0df43..24139057be 100644
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -17,3 +17,15 @@ config NR_CPUS
>   	  For CPU cores which support Simultaneous Multi-Threading or similar
>   	  technologies, this the number of logical threads which Xen will
>   	  support.
> +
> +config NR_BOOTMODS
> +	int "Maximum number of boot modules that a loader can pass"
> +	range 1 32768
> +	default "8" if X86
> +	default "32" if ARM
> +	help
> +	  Controls the build-time size of various arrays allocated for
> +	  parsing the boot modules passed by a loader when starting Xen.
> +
> +	  This is of particular interest when using Xen's hypervisor domain
> +	  capabilities such as dom0less.
> diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
> index 2bb01ecfa8..312a3e4209 100644
> --- a/xen/arch/arm/include/asm/setup.h
> +++ b/xen/arch/arm/include/asm/setup.h
> @@ -10,7 +10,8 @@
>   
>   #define NR_MEM_BANKS 256
>   
> -#define MAX_MODULES 32 /* Current maximum useful modules */
> +/* Current maximum useful modules */
> +#define MAX_MODULES CONFIG_NR_BOOTMODS
>   
>   typedef enum {
>       BOOTMOD_XEN,
> @@ -38,7 +39,7 @@ struct meminfo {
>    * The domU flag is set for kernels and ramdisks of "xen,domain" nodes.
>    * The purpose of the domU flag is to avoid getting confused in
>    * kernel_probe, where we try to guess which is the dom0 kernel and
> - * initrd to be compatible with all versions of the multiboot spec.
> + * initrd to be compatible with all versions of the multiboot spec.

In general, I much prefer if coding style changes are done separately 
because it helps the review (I don't have to stare at the line to figure 
out what changed).

I am not going to force this here. However, the strict minimum is to 
mention the change in the commit message.

>    */
>   #define BOOTMOD_MAX_CMDLINE 1024
>   struct bootmodule {
> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
> index 6e65b569b0..4e1a799749 100644
> --- a/xen/arch/x86/efi/efi-boot.h
> +++ b/xen/arch/x86/efi/efi-boot.h
> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>    * The array size needs to be one larger than the number of modules we
>    * support - see __start_xen().
>    */
> -static module_t __initdata mb_modules[5];
> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];

Please explain in the commit message why the number of modules was 
bumped from 5 to 9.

>   
>   static void __init edd_put_string(u8 *dst, size_t n, const char *src)
>   {
> diff --git a/xen/arch/x86/guest/xen/pvh-boot.c b/xen/arch/x86/guest/xen/pvh-boot.c
> index 498625eae0..834b1ad16b 100644
> --- a/xen/arch/x86/guest/xen/pvh-boot.c
> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>   uint32_t __initdata pvh_start_info_pa;
>   
>   static multiboot_info_t __initdata pvh_mbi;
> -static module_t __initdata pvh_mbi_mods[8];
> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];

What's the +1 for?

>   static const char *__initdata pvh_loader = "PVH Directboot";
>   
>   static void __init convert_pvh_info(multiboot_info_t **mbi,
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index f08b07b8de..2aa1e28c8f 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1020,9 +1020,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>           panic("dom0 kernel not specified. Check bootloader configuration\n");
>   
>       /* Check that we don't have a silly number of modules. */
> -    if ( mbi->mods_count > sizeof(module_map) * 8 )
> +    if ( mbi->mods_count > CONFIG_NR_BOOTMODS )
>       {
> -        mbi->mods_count = sizeof(module_map) * 8;
> +        mbi->mods_count = CONFIG_NR_BOOTMODS;
>           printk("Excessive multiboot modules - using the first %u only\n",
>                  mbi->mods_count);
>       }

AFAIU, this check is to make sure that we will not overrun module_map in 
the next line:

bitmap_fill(module_map, mbi->mods_count);

The current definition of module_map will allow 64 modules. But you are 
allowing 32768. So I think you either want to keep the check or define 
module_map as:

DECLARE_BITMAP(module_map, CONFIG_NR_BOOTMODS);
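
To make the sizing concern concrete, here is a minimal standalone sketch
in plain C (not Xen code). NR_BOOTMODS, BITS_TO_LONGS() and
clamp_mods_count() are hypothetical stand-ins written for this example;
the value 128 is arbitrary. The point is only that the clamp is safe when
the bitmap is sized from the same limit that the clamp uses.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NR_BOOTMODS; the value is arbitrary. */
#define NR_BOOTMODS 128

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Bitmap sized from the limit, in the spirit of DECLARE_BITMAP(). */
static unsigned long module_map[BITS_TO_LONGS(NR_BOOTMODS)];

/* Clamp a loader-supplied module count to what the bitmap can hold. */
static unsigned int clamp_mods_count(unsigned int mods_count)
{
    size_t capacity = sizeof(module_map) * CHAR_BIT;

    if ( mods_count > NR_BOOTMODS )
        mods_count = NR_BOOTMODS;

    /* Holds only because module_map was sized from NR_BOOTMODS. */
    assert(mods_count <= capacity);

    return mods_count;
}
```

With a fixed single-word bitmap instead, the same clamp would let a count
larger than the bitmap capacity through whenever the limit exceeds the
number of bits in an unsigned long.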

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 02/18] introduction of generalized boot info
  2022-07-06 21:04 ` [PATCH v1 02/18] introduction of generalized boot info Daniel P. Smith
@ 2022-07-15 19:25   ` Julien Grall
  2022-07-20 18:32     ` Daniel P. Smith
  2022-07-19 13:11   ` Jan Beulich
  1 sibling, 1 reply; 66+ messages in thread
From: Julien Grall @ 2022-07-15 19:25 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel, Wei Liu
  Cc: scott.davis, christopher.clark, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Stefano Stabellini

Hi Daniel,

On 06/07/2022 22:04, Daniel P. Smith wrote:
> The x86 and Arm architectures represent in memory the general boot information
> and boot modules differently despite having commonality. The x86
> representations are bound to the multiboot v1 structures while the Arm
> representations are a slightly generalized meta-data container for the boot
> material. The multiboot structure does not lend itself well to being expanded
> to accommodate additional metadata, both general and boot module specific. The
> Arm structures are not bound to an external specification and thus are able to
> be expanded for solutions such as dom0less.
> 
> This commit introduces a set of structures patterned off the Arm structures to
> represent the boot information in a manner that captures common data. The
> structures provide an arch field to allow arch specific expansions to the
> structures. The intended goal of these new common structures is to enable
> commonality between the different architectures.  Specifically to enable
> dom0less and hyperlaunch to have a common representation of boot-time
> constructed domains.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
> ---
>   xen/arch/x86/include/asm/bootinfo.h | 48 +++++++++++++++++++++++++
>   xen/include/xen/bootinfo.h          | 54 +++++++++++++++++++++++++++++
>   2 files changed, 102 insertions(+)
>   create mode 100644 xen/arch/x86/include/asm/bootinfo.h
>   create mode 100644 xen/include/xen/bootinfo.h
> 
> diff --git a/xen/arch/x86/include/asm/bootinfo.h b/xen/arch/x86/include/asm/bootinfo.h
> new file mode 100644
> index 0000000000..b0754a3ed0
> --- /dev/null
> +++ b/xen/arch/x86/include/asm/bootinfo.h
> @@ -0,0 +1,48 @@
> +#ifndef __ARCH_X86_BOOTINFO_H__
> +#define __ARCH_X86_BOOTINFO_H__
> +
> +/* unused for x86 */
> +struct arch_bootstring { };
> +
> +struct __packed arch_bootmodule {
> +#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0
> +    uint32_t flags;
> +    uint32_t headroom;
> +};
> +
> +struct __packed arch_boot_info {
> +    uint32_t flags;
> +#define BOOTINFO_FLAG_X86_MEMLIMITS  	1U << 0
> +#define BOOTINFO_FLAG_X86_BOOTDEV    	1U << 1
> +#define BOOTINFO_FLAG_X86_CMDLINE    	1U << 2
> +#define BOOTINFO_FLAG_X86_MODULES    	1U << 3
> +#define BOOTINFO_FLAG_X86_AOUT_SYMS  	1U << 4
> +#define BOOTINFO_FLAG_X86_ELF_SYMS   	1U << 5
> +#define BOOTINFO_FLAG_X86_MEMMAP     	1U << 6
> +#define BOOTINFO_FLAG_X86_DRIVES     	1U << 7
> +#define BOOTINFO_FLAG_X86_BIOSCONFIG 	1U << 8
> +#define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
> +#define BOOTINFO_FLAG_X86_APM        	1U << 10
> +
> +    bool xen_guest;
> +
> +    char *boot_loader_name;
> +    char *kextra;
> +
> +    uint32_t mem_lower;
> +    uint32_t mem_upper;
> +
> +    uint32_t mmap_length;
> +    paddr_t mmap_addr;
> +};
> +
> +struct __packed mb_memmap {
> +    uint32_t size;
> +    uint32_t base_addr_low;
> +    uint32_t base_addr_high;
> +    uint32_t length_low;
> +    uint32_t length_high;
> +    uint32_t type;
> +};
> +
> +#endif

NIT: Missing emacs magics.

> diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
> new file mode 100644
> index 0000000000..42b53a3ca6
> --- /dev/null
> +++ b/xen/include/xen/bootinfo.h
> @@ -0,0 +1,54 @@
> +#ifndef __XEN_BOOTINFO_H__
> +#define __XEN_BOOTINFO_H__
> +
> +#include <xen/mm.h>
> +#include <xen/types.h>
> +
> +#include <asm/bootinfo.h>
> +
> +typedef enum {
> +    BOOTMOD_UNKNOWN,
> +    BOOTMOD_XEN,
> +    BOOTMOD_FDT,
> +    BOOTMOD_KERNEL,
> +    BOOTMOD_RAMDISK,
> +    BOOTMOD_XSM,
> +    BOOTMOD_UCODE,
> +    BOOTMOD_GUEST_DTB,
> +}  bootmodule_kind;
> +
> +typedef enum {
> +    BOOTSTR_EMPTY,
> +    BOOTSTR_STRING,
> +    BOOTSTR_CMDLINE,
> +} bootstring_kind;
> +
> +#define BOOTMOD_MAX_STRING 1024
> +struct __packed boot_string {

As you use __packed, the fields...

> +    bootstring_kind kind;
> +    struct arch_bootstring *arch;

... may not be naturally aligned anymore. Here it will depend on the
size of bootstring_kind (this is an enum and I don't think C guarantees
the size). This...

> +
> +    char bytes[BOOTMOD_MAX_STRING];
> +    size_t len;
> +};
> +
> +struct __packed boot_module {
> +    bootmodule_kind kind;
> +    paddr_t start;
> +    mfn_t mfn;
> +    size_t size;
> +
> +    struct arch_bootmodule *arch;
> +    struct boot_string string;
> +};
> +
> +struct __packed boot_info {
> +    char *cmdline;
> +
> +    uint32_t nr_mods;
> +    struct boot_module *mods;

... is more obvious on this one because on a 64-bit arch there will be no
32-bit padding. So 'mods' will only be 32-bit aligned even though the value
is 64-bit.

This is going to be a problem on any architecture that forbids unaligned
access (or lets the software decide).

In this case, I don't think any of the structures you defined warrants being
__packed.
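
A small standalone sketch of the layout difference, assuming a GCC or
Clang style __attribute__((packed)). The struct and function names are
hypothetical and only mirror the shape of boot_info (pointer, 32-bit
count, pointer); this is not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as boot_info, with padding removed. */
struct packed_info {
    char *cmdline;
    uint32_t nr_mods;
    void *mods;
} __attribute__((packed));

/* Same fields, default (natural) alignment. */
struct natural_info {
    char *cmdline;
    uint32_t nr_mods;
    void *mods;
};

/* Offset of 'mods' when the struct is packed: directly after nr_mods. */
static size_t packed_mods_offset(void)
{
    return offsetof(struct packed_info, mods);
}

/* Offset of 'mods' with natural alignment: padded up to pointer alignment. */
static size_t natural_mods_offset(void)
{
    return offsetof(struct natural_info, mods);
}
```

On a typical 64-bit target the packed offset comes out as 12 and the
natural one as 16, so a plain load of 'mods' from the packed struct is an
unaligned access.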

> +
> +    struct arch_boot_info *arch;
> +};
> +
> +#endif


NIT: Missing emacs magics.

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH v1 07/18] docs: update hyperlaunch device tree documentation
  2022-07-06 21:04 ` [PATCH v1 07/18] docs: update hyperlaunch device tree documentation Daniel P. Smith
@ 2022-07-18 13:57   ` Smith, Jackson
  2022-07-22 13:34     ` Daniel P. Smith
  0 siblings, 1 reply; 66+ messages in thread
From: Smith, Jackson @ 2022-07-18 13:57 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

[-- Attachment #1: Type: text/plain, Size: 2589 bytes --]

Hi Daniel,

> -----Original Message-----
> Subject: [PATCH v1 07/18] docs: update hyperlaunch device tree
> documentation


> diff --git a/docs/designs/launch/hyperlaunch-devicetree.rst
> b/docs/designs/launch/hyperlaunch-devicetree.rst
> index b49c98cfbd..ae1a786d0b 100644
> --- a/docs/designs/launch/hyperlaunch-devicetree.rst
> +++ b/docs/designs/launch/hyperlaunch-devicetree.rst
> @@ -13,12 +13,268 @@ difference is the introduction of the ``hypervisor``

> +
> +The Hypervisor node
> +-------------------
> +
> +The ``hypervisor`` node is a top level container for the domains that
> +will be
> built
> +by hypervisor on start up. The node will be named ``hypervisor``  with
> +a
> ``compatible``
> +property to identify which hypervisors the configuration is intended.
^^^ Should there be a note here that the hypervisor node also needs a compatible
"xen,<arch>"?

> +The
> hypervisor
> +node will consist of one or more config nodes and one or more domain
> nodes.
> +
> +Properties
> +""""""""""
> +
> +compatible
> +  Identifies which hypervisors the configuration is compatible. Required.
> +
> +  Format: "hypervisor,<hypervisor name>", e.g "hypervisor,xen"
^^^ Same here: compatible "<hypervisor name>,<arch>"?

>  Example Configuration
>  ---------------------
> +
> +Multiboot x86 Configuration Dom0-only:
> +""""""""""""""""""""""""""""""""""""""
> +The following dts file can be provided to the Device Tree compiler,
> +``dtc``,
> to
> +produce a dtb file.
> +::
> +
> +  /dts-v1/;
> +
> +  / {
> +      chosen {
> +          hypervisor {
> +              compatible = "hypervisor,xen";
^^^^^^^^  compatible = "hypervisor,xen", "xen,x86";

> +
> +              dom0 {
> +                  compatible = "xen,domain";
> +
> +                  domid = <0>;
> +
> +                  permissions = <3>;
> +                  functions = <0xC000000F>;
> +                  mode = <5>;
> +
> +                  domain-uuid = [B3 FB 98 FB 8F 9F 67 A3 8A 6E 62 5A 09
> + 13 F0
> 8C];
> +
> +                  cpus = <1>;
> +                  memory = <0x0 0x20000000>;
^^^^^^^^^^ memory = "2048M";
Needs to be updated to new format for mem.

> +
> +                  kernel {
> +                      compatible = "module,kernel", "module,index";
> +                      module-index = <1>;
> +                  };
> +              };
> +
> +          };
> +      };
> +  };
> +

Similar adjustments are needed for the rest of the examples I believe.

Also, two typos:
Line 287 is missing a line ending semi-colon.
Line 82 has a double space between 'node' and 'may'.

Best,
Jackson

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5317 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH v1 04/18] x86: refactor entrypoints to new boot info
  2022-07-06 21:04 ` [PATCH v1 04/18] x86: refactor entrypoints to new boot info Daniel P. Smith
@ 2022-07-18 13:58   ` Smith, Jackson
  2022-07-22 12:59     ` Daniel P. Smith
  0 siblings, 1 reply; 66+ messages in thread
From: Smith, Jackson @ 2022-07-18 13:58 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel
  Cc: scott.davis, christopher.clark, Jan Beulich, Andrew Cooper,
	Roger Pau Monné

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

Hi Daniel,

I hope Outlook gets this reply right.

> -----Original Message-----
> Subject: [PATCH v1 04/18] x86: refactor entrypoints to new boot info

> diff --git a/xen/arch/x86/guest/xen/pvh-boot.c
> b/xen/arch/x86/guest/xen/pvh-boot.c
> index 834b1ad16b..28cf5df0a3 100644
> --- a/xen/arch/x86/guest/xen/pvh-boot.c
> +++ b/xen/arch/x86/guest/xen/pvh-boot.c

> @@ -99,13 +118,16 @@ static void __init get_memory_map(void)
>      sanitize_e820_map(e820_raw.map, &e820_raw.nr_map);
>  }
>
> -void __init pvh_init(multiboot_info_t **mbi, module_t **mod)
> +void __init pvh_init(struct boot_info **bi)
>  {
> -    convert_pvh_info(mbi, mod);
> +    *bi = init_pvh_info();
> +    convert_pvh_info(*bi);
>
>      hypervisor_probe();
>      ASSERT(xen_guest);
>
> +    (*bi)->arch->xen_guest = xen_guest;

I think you may have a typo/missed refactoring here?
I changed this line to "(*bi)->arch->xenguest = xen_guest;" to get the 
patchset to build.

The arch_boot_info struct in boot_info32.h has a field 'xen_guest' but the 
same field in asm/bootinfo.h was re-named from 'xen_guest' to 'xenguest' in 
the 'x86: adopt new boot info structures' commit.

What was your intent?

> +
>      get_memory_map();
>  }
>

Thanks,
Jackson Smith

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5317 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH v1 10/18] x86: introduce the domain builder
  2022-07-06 21:04 ` [PATCH v1 10/18] x86: introduce the " Daniel P. Smith
@ 2022-07-18 13:59   ` Smith, Jackson
  2022-07-22 14:36     ` Daniel P. Smith
  2022-07-26 14:46   ` Jan Beulich
  1 sibling, 1 reply; 66+ messages in thread
From: Smith, Jackson @ 2022-07-18 13:59 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel
  Cc: scott.davis, christopher.clark, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini

[-- Attachment #1: Type: text/plain, Size: 2217 bytes --]

Hi Daniel,

> -----Original Message-----
> Subject: [PATCH v1 10/18] x86: introduce the domain builder
> 
> This commit introduces the domain builder configuration FDT parser along
> with the domain builder core for domain creation. To enable domain builder
> to be a cross architecture internal API, a new arch domain creation call is
> introduced for use by the domain builder.

> diff --git a/xen/common/domain-builder/core.c

> +void __init builder_init(struct boot_info *info) {
> +    struct boot_domain *d = NULL;
> +
> +    info->builder = &builder;
> +
> +    if ( IS_ENABLED(CONFIG_BUILDER_FDT) )
> +    {

> +    }
> +
> +    /*
> +     * No FDT config support or an FDT wasn't present, do an initial
> +     * domain construction
> +     */
> +    printk("Domain Builder: falling back to initial domain build\n");
> +    info->builder->nr_doms = 1;
> +    d = &info->builder->domains[0];
> +
> +    d->mode = opt_dom0_pvh ? 0 : BUILD_MODE_PARAVIRTUALIZED;
> +
> +    d->kernel = &info->mods[0];
> +    d->kernel->kind = BOOTMOD_KERNEL;
> +
> +    d->permissions = BUILD_PERMISSION_CONTROL | BUILD_PERMISSION_HARDWARE;
> +    d->functions = BUILD_FUNCTION_CONSOLE | BUILD_FUNCTION_XENSTORE |
> +                     BUILD_FUNCTION_INITIAL_DOM;
> +
> +    d->kernel->arch->headroom = bzimage_headroom(bootstrap_map(d->kernel),
> +                                                   d->kernel->size);
> +    bootstrap_map(NULL);
> +
> +    if ( d->kernel->string.len )
> +        d->kernel->string.kind = BOOTSTR_CMDLINE; }

Forgive me if I'm incorrect, but I believe there is an issue with this
fallback logic for the case where no FDT was provided.

If dom0_mem is not supplied on the Xen command line, then d->meminfo is never
initialized. (See dom0_compute_nr_pages/dom0_build.c:335)
This was giving me trouble because bd->meminfo.mem_max.nr_pages was left at
0, effectively clamping dom0 to 0 pages of RAM.

I'm not sure what the best solution is but one (easy) possibility is just
initializing meminfo to the dom0 defaults near the end of this function:
        d->meminfo.mem_size = dom0_size;
        d->meminfo.mem_min = dom0_min_size;
        d->meminfo.mem_max = dom0_max_size;

Thanks,
Jackson

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5317 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-06 21:04 ` [PATCH v1 01/18] kconfig: allow configuration of maximum modules Daniel P. Smith
  2022-07-07  1:44   ` Henry Wang
  2022-07-15 19:16   ` Julien Grall
@ 2022-07-19  9:32   ` Jan Beulich
  2022-07-19 17:02     ` Daniel P. Smith
  2 siblings, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-19  9:32 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Bertrand Marquis,
	Roger Pau Monné,
	xen-devel, Volodymyr Babchuk, Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -17,3 +17,15 @@ config NR_CPUS
>  	  For CPU cores which support Simultaneous Multi-Threading or similar
>  	  technologies, this the number of logical threads which Xen will
>  	  support.
> +
> +config NR_BOOTMODS
> +	int "Maximum number of boot modules that a loader can pass"
> +	range 1 32768
> +	default "8" if X86
> +	default "32" if ARM

Any reason for the larger default on Arm, irrespective of dom0less
actually being in use? (I'm actually surprised I can't spot a Kconfig
option controlling inclusion of dom0less. The default here imo isn't
supposed to depend on the architecture, but on whether dom0less is
supported. That way if another arch gained dom0less support, the
higher default would apply to it without needing further adjustment.)

> --- a/xen/arch/x86/efi/efi-boot.h
> +++ b/xen/arch/x86/efi/efi-boot.h
> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>   * The array size needs to be one larger than the number of modules we
>   * support - see __start_xen().
>   */
> -static module_t __initdata mb_modules[5];
> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];

If the build admin selected 1, I'm pretty sure just about nothing would work.
I think you want max(5, CONFIG_NR_BOOTMODS) or
max(4, CONFIG_NR_BOOTMODS) + 1 here and ...

> --- a/xen/arch/x86/guest/xen/pvh-boot.c
> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>  uint32_t __initdata pvh_start_info_pa;
>  
>  static multiboot_info_t __initdata pvh_mbi;
> -static module_t __initdata pvh_mbi_mods[8];
> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];

... max(8, CONFIG_NR_BOOTMODS) here (albeit the 8 may have room for
lowering - I don't recall why 8 was chosen rather than going with
the minimum possible value covering all module kinds known at that
time).
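
A sketch of the lower-bound idea in standalone C, under the assumption
that a compile-time MAX() is acceptable in an array size. NR_BOOTMODS,
struct module, efi_slots() and pvh_slots() are hypothetical names for the
example, and NR_BOOTMODS is deliberately set to the smallest value the
Kconfig range allows.

```c
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NR_BOOTMODS at its minimum of 1. */
#define NR_BOOTMODS 1

/* Compile-time maximum; both arguments must be integer constants. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder element type; the real code uses module_t. */
struct module { unsigned long start, end; };

/* EFI path: never fewer than 4 usable slots, plus the one extra entry. */
static struct module efi_modules[MAX(4, NR_BOOTMODS) + 1];

/* PVH path: never fewer than the 8 entries the array had before. */
static struct module pvh_modules[MAX(8, NR_BOOTMODS)];

static size_t efi_slots(void) { return ARRAY_SIZE(efi_modules); }
static size_t pvh_slots(void) { return ARRAY_SIZE(pvh_modules); }
```

Even with the configured value at 1, the arrays keep their previous sizes
of 5 and 8, and they grow only once the configured value exceeds the
floor.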

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 06/18] fdt: make fdt handling reusable across arch
  2022-07-06 21:04 ` [PATCH v1 06/18] fdt: make fdt handling reusable across arch Daniel P. Smith
  2022-07-07  1:44   ` Henry Wang
@ 2022-07-19  9:36   ` Jan Beulich
  2022-07-22 13:18     ` Daniel P. Smith
  1 sibling, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-19  9:36 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel, Volodymyr Babchuk

On 06.07.2022 23:04, Daniel P. Smith wrote:
> This refactors reusable code from Arm's bootfdt.c and device-tree.h that is
> general fdt handling code.  The Kconfig parameter CORE_DEVICE_TREE is
> introduced for when the ability of parsing DTB files is needed by a capability
> such as hyperlaunch.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
> ---
>  xen/arch/arm/bootfdt.c        | 115 +----------------------------
>  xen/common/Kconfig            |   4 ++
>  xen/common/Makefile           |   3 +-
>  xen/common/fdt.c              | 131 ++++++++++++++++++++++++++++++++++
>  xen/include/xen/device_tree.h |  50 +------------
>  xen/include/xen/fdt.h         |  79 ++++++++++++++++++++
>  6 files changed, 218 insertions(+), 164 deletions(-)
>  create mode 100644 xen/common/fdt.c
>  create mode 100644 xen/include/xen/fdt.h

I think this wants to be accompanied by an update to ./MAINTAINERS,
so maintainership doesn't silently transition to THE REST.

I further think that the moved code would want to have style adjusted
to match present guidelines - I've noticed a number of u<N> uses which
should be uint<N>_t. I didn't look closely to see whether other style
violations are also retained in the moved code.

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 02/18] introduction of generalized boot info
  2022-07-06 21:04 ` [PATCH v1 02/18] introduction of generalized boot info Daniel P. Smith
  2022-07-15 19:25   ` Julien Grall
@ 2022-07-19 13:11   ` Jan Beulich
  2022-07-21 14:28     ` Daniel P. Smith
  1 sibling, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-19 13:11 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- /dev/null
> +++ b/xen/arch/x86/include/asm/bootinfo.h
> @@ -0,0 +1,48 @@
> +#ifndef __ARCH_X86_BOOTINFO_H__
> +#define __ARCH_X86_BOOTINFO_H__
> +
> +/* unused for x86 */
> +struct arch_bootstring { };
> +
> +struct __packed arch_bootmodule {
> +#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0

Such macro expansions need parenthesizing.
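
A short standalone illustration of why the parentheses matter. The macro
and function names are hypothetical; only the "1U << n" shape is taken
from the patch. Unary ~ binds tighter than <<, so clearing a flag with
the unparenthesized form masks the wrong bits.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_UNPAREN 1U << 3    /* shape used in the patch */
#define FLAG_PAREN   (1U << 3)  /* parenthesized form */

/* Expands to flags & ((~1U) << 3): clears bits 0-3 instead of bit 3. */
static uint32_t clear_unparen(uint32_t flags)
{
    return flags & ~FLAG_UNPAREN;
}

/* Expands to flags & ~(1U << 3): clears only bit 3, as intended. */
static uint32_t clear_paren(uint32_t flags)
{
    return flags & ~FLAG_PAREN;
}
```

Starting from 0xFF, the first helper yields 0xF0 while the second yields
0xF7.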

> +    uint32_t flags;
> +    uint32_t headroom;
> +};

Since you're not following any external spec, on top of what Julien
said about the __packed attribute I'd also like to point out that
in many cases here there's no need to use fixed-width types.

> +struct __packed arch_boot_info {
> +    uint32_t flags;
> +#define BOOTINFO_FLAG_X86_MEMLIMITS  	1U << 0
> +#define BOOTINFO_FLAG_X86_BOOTDEV    	1U << 1
> +#define BOOTINFO_FLAG_X86_CMDLINE    	1U << 2
> +#define BOOTINFO_FLAG_X86_MODULES    	1U << 3
> +#define BOOTINFO_FLAG_X86_AOUT_SYMS  	1U << 4
> +#define BOOTINFO_FLAG_X86_ELF_SYMS   	1U << 5
> +#define BOOTINFO_FLAG_X86_MEMMAP     	1U << 6
> +#define BOOTINFO_FLAG_X86_DRIVES     	1U << 7
> +#define BOOTINFO_FLAG_X86_BIOSCONFIG 	1U << 8
> +#define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
> +#define BOOTINFO_FLAG_X86_APM        	1U << 10
> +
> +    bool xen_guest;

As an example of this: with just the header files being introduced
here, it is not really possible to figure out what these fields are to
be used for and hence whether they're legitimately represented here.

> +    char *boot_loader_name;
> +    char *kextra;

const?

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 03/18] x86: adopt new boot info structures
  2022-07-06 21:04 ` [PATCH v1 03/18] x86: adopt new boot info structures Daniel P. Smith
@ 2022-07-19 13:19   ` Jan Beulich
  2022-07-22 12:34     ` Daniel P. Smith
  0 siblings, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-19 13:19 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, Daniel De Graaf,
	xen-devel, Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> This commit replaces the use of the multiboot v1 structures starting
> at __start_xen(). The majority of this commit is converting the fields
> being accessed for the startup calculations. While adapting the ucode
> boot module location logic, this code was refactored to reduce some
> of the unnecessary complexity.

Things like this or ...

> --- a/xen/arch/x86/bzimage.c
> +++ b/xen/arch/x86/bzimage.c
> @@ -69,10 +69,8 @@ static __init int bzimage_check(struct setup_header *hdr, unsigned long len)
>      return 1;
>  }
>  
> -static unsigned long __initdata orig_image_len;
> -
> -unsigned long __init bzimage_headroom(void *image_start,
> -                                      unsigned long image_length)
> +unsigned long __init bzimage_headroom(
> +    void *image_start, unsigned long image_length)
>  {
>      struct setup_header *hdr = (struct setup_header *)image_start;
>      int err;
> @@ -91,7 +89,6 @@ unsigned long __init bzimage_headroom(void *image_start,
>      if ( elf_is_elfbinary(image_start, image_length) )
>          return 0;
>  
> -    orig_image_len = image_length;
>      headroom = output_length(image_start, image_length);
>      if (gzip_check(image_start, image_length))
>      {
> @@ -104,12 +101,15 @@ unsigned long __init bzimage_headroom(void *image_start,
>      return headroom;
>  }
>  
> -int __init bzimage_parse(void *image_base, void **image_start,
> -                         unsigned long *image_len)
> +int __init bzimage_parse(
> +    void *image_base, void **image_start, unsigned int headroom,
> +    unsigned long *image_len)
>  {
>      struct setup_header *hdr = (struct setup_header *)(*image_start);
>      int err = bzimage_check(hdr, *image_len);
> -    unsigned long output_len;
> +    unsigned long output_len, orig_image_len;
> +
> +    orig_image_len = *image_len - headroom;
>  
>      if ( err < 0 )
>          return err;
> @@ -125,7 +125,7 @@ int __init bzimage_parse(void *image_base, void **image_start,
>  
>      BUG_ON(!(image_base < *image_start));
>  
> -    output_len = output_length(*image_start, orig_image_len);
> +    output_len = output_length(*image_start, *image_len);
>  
>      if ( (err = perform_gunzip(image_base, *image_start, orig_image_len)) > 0 )
>          err = decompress(*image_start, orig_image_len, image_base);

... whatever the deal is here want factoring out. Also you want to avoid
making formatting changes (like in the function headers here) in an
already large patch, when you don't otherwise touch the functions. I'm
not even convinced the formatting changes are desirable here, so I'd
like to ask that even on code you do touch for other reasons you do so
only if the existing layout ends up really awkward.

I have not looked in any further detail at this patch, sorry. Together
with my comment on the earlier patch I conclude that it might be best
if you moved things to the new representation field by field (or set of
related fields), introducing the new fields in the abstraction struct
as they are being made use of.

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 05/18] x86: refactor xen cmdline into general framework
  2022-07-06 21:04 ` [PATCH v1 05/18] x86: refactor xen cmdline into general framework Daniel P. Smith
@ 2022-07-19 13:26   ` Jan Beulich
  2022-07-22 13:12     ` Daniel P. Smith
  0 siblings, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-19 13:26 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- a/xen/include/xen/bootinfo.h
> +++ b/xen/include/xen/bootinfo.h
> @@ -53,6 +53,17 @@ struct __packed boot_info {
>  
>  extern struct boot_info *boot_info;
>  
> +static inline char *bootinfo_prepare_cmdline(struct boot_info *bi)
> +{
> +    bi->cmdline = arch_bootinfo_prepare_cmdline(bi->cmdline, bi->arch);
> +
> +    if ( *bi->cmdline == ' ' )
> +        printk(XENLOG_WARNING "%s: leading whitespace left on cmdline\n",
> +               __func__);

Just a remark and a question on this one: I don't view the use of
__func__ here (and in fact in many other cases as well) as very
useful. And why do we need such a warning all of a sudden in the
first place?

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 08/18] kconfig: introduce domain builder config option
  2022-07-06 21:04 ` [PATCH v1 08/18] kconfig: introduce domain builder config option Daniel P. Smith
  2022-07-07  1:44   ` Henry Wang
@ 2022-07-19 13:29   ` Jan Beulich
  2022-07-22 13:47     ` Daniel P. Smith
  1 sibling, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-19 13:29 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, xen-devel

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- /dev/null
> +++ b/xen/common/domain-builder/Kconfig
> @@ -0,0 +1,15 @@
> +
> +menu "Domain Builder Features"
> +
> +config BUILDER_FDT
> +	bool "Domain builder device tree (UNSUPPORTED)" if UNSUPPORTED
> +	select CORE_DEVICE_TREE
> +	---help---

Nit: No new ---help--- please anymore.

> +	  Enables the ability to configure the domain builder using a
> +	  flattened device tree.

Is this about both Dom0 and DomU? Especially if not, this wants making
explicit. But perhaps even if so it wants saying, for the avoidance of
doubt.

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-15 19:16   ` Julien Grall
@ 2022-07-19 16:36     ` Daniel P. Smith
  2022-07-26 18:07       ` Julien Grall
  0 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-19 16:36 UTC (permalink / raw)
  To: Julien Grall, xen-devel, Volodymyr Babchuk, Wei Liu
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Jan Beulich, Stefano Stabellini, Bertrand Marquis,
	Roger Pau Monné


On 7/15/22 15:16, Julien Grall wrote:
> Hi Daniel,
> 
> On 06/07/2022 22:04, Daniel P. Smith wrote:
>> For x86 the number of allowable multiboot modules varies between the
>> different
>> entry points, non-efi boot, pvh boot, and efi boot. In the case of
>> both Arm and
>> x86 this value is fixed to values based on generalized assumptions. With
>> hyperlaunch for x86 and dom0less on Arm, use of static sizes results
>> in large
>> allocations compiled into the hypervisor that will go unused by many
>> use cases.
>>
>> This commit introduces a Kconfig variable that is set with sane
>> defaults based
>> on configuration selection. This variable is in turned used as the
>> array size
>> for the cases where a static allocated array of boot modules is declared.
>>
>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
> 
> I am not entirely sure where this reviewed-by is coming from. Is this
> from internal review?

Yes.

> If yes, my recommendation would be to provide the reviewed-by on the
> mailing list. Ideally, the review should also be done in the open, but I
> understand some company wish to do a fully internal review first.

Since this capability is being jointly developed by Christopher and me,
with myself being the author of the code, Christopher reviewed the code as
the co-developer. He did so as a second pair of eyes for any obvious
mistakes and to concur that the implementation was in line with the
approach the two of us architected. Perhaps a SoB line might be more
appropriate than an R-b line.

> At least from a committer perspective, this helps me to know whether the
> reviewed-by still apply. An example would be if you send a v2, I would
> not be able to know whether Christoffer still agreed on the change.

If an SoB line is more appropriate, then on the next version I can
switch it.

>> ---
>>   xen/arch/Kconfig                  | 12 ++++++++++++
>>   xen/arch/arm/include/asm/setup.h  |  5 +++--
>>   xen/arch/x86/efi/efi-boot.h       |  2 +-
>>   xen/arch/x86/guest/xen/pvh-boot.c |  2 +-
>>   xen/arch/x86/setup.c              |  4 ++--
>>   5 files changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
>> index f16eb0df43..24139057be 100644
>> --- a/xen/arch/Kconfig
>> +++ b/xen/arch/Kconfig
>> @@ -17,3 +17,15 @@ config NR_CPUS
>>         For CPU cores which support Simultaneous Multi-Threading or
>> similar
>>         technologies, this the number of logical threads which Xen will
>>         support.
>> +
>> +config NR_BOOTMODS
>> +    int "Maximum number of boot modules that a loader can pass"
>> +    range 1 32768
>> +    default "8" if X86
>> +    default "32" if ARM
>> +    help
>> +      Controls the build-time size of various arrays allocated for
>> +      parsing the boot modules passed by a loader when starting Xen.
>> +
>> +      This is of particular interest when using Xen's hypervisor domain
>> +      capabilities such as dom0less.
>> diff --git a/xen/arch/arm/include/asm/setup.h
>> b/xen/arch/arm/include/asm/setup.h
>> index 2bb01ecfa8..312a3e4209 100644
>> --- a/xen/arch/arm/include/asm/setup.h
>> +++ b/xen/arch/arm/include/asm/setup.h
>> @@ -10,7 +10,8 @@
>>     #define NR_MEM_BANKS 256
>>   -#define MAX_MODULES 32 /* Current maximum useful modules */
>> +/* Current maximum useful modules */
>> +#define MAX_MODULES CONFIG_NR_BOOTMODS
>>     typedef enum {
>>       BOOTMOD_XEN,
>> @@ -38,7 +39,7 @@ struct meminfo {
>>    * The domU flag is set for kernels and ramdisks of "xen,domain" nodes.
>>    * The purpose of the domU flag is to avoid getting confused in
>>    * kernel_probe, where we try to guess which is the dom0 kernel and
>> - * initrd to be compatible with all versions of the multiboot spec.
>> + * initrd to be compatible with all versions of the multiboot spec.
> 
> In general, I much prefer if coding style changes are done separately
> because it helps the review (I don't have to stare at the line to figure
> out what changed).

Actually, on a past review of another series I got dinged on this, and I
did try to get most of them out of this series. This is just a straggler
that I missed. I will clean it up in the next revision.

> I am not going to force this here. However, the strict minimum is to
> mention the change in the commit message.
> 
>>    */
>>   #define BOOTMOD_MAX_CMDLINE 1024
>>   struct bootmodule {
>> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
>> index 6e65b569b0..4e1a799749 100644
>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>>    * The array size needs to be one larger than the number of modules we
>>    * support - see __start_xen().
>>    */
>> -static module_t __initdata mb_modules[5];
>> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
> 
> Please explain in the commit message why the number of modules was
> bumped from 5 to 9.

The number of modules was inconsistent between the different entry
points into __start_xen(). Switching to a Kconfig variable, whose
default was set to the largest value used across the entry points,
results in a change for the locations that used another value.

See below for +1 explanation.

>>     static void __init edd_put_string(u8 *dst, size_t n, const char *src)
>>   {
>> diff --git a/xen/arch/x86/guest/xen/pvh-boot.c
>> b/xen/arch/x86/guest/xen/pvh-boot.c
>> index 498625eae0..834b1ad16b 100644
>> --- a/xen/arch/x86/guest/xen/pvh-boot.c
>> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
>> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>>   uint32_t __initdata pvh_start_info_pa;
>>     static multiboot_info_t __initdata pvh_mbi;
>> -static module_t __initdata pvh_mbi_mods[8];
>> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];
> 
> What's the +1 for?

I should clarify in the commit message, but the value set in
CONFIG_NR_BOOTMODS is the maximum number of modules that Xen will accept
from a bootloader. Xen startup code expects to be able to append Xen
itself to the array. The +1 allocates an additional entry to store Xen in
the array should a bootloader actually pass CONFIG_NR_BOOTMODS modules to
Xen. There is an existing comment in one of these locations that
explains it.
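
To illustrate the +1 described above, here is a minimal sketch. It assumes
a simplified module record and a stand-in NR_BOOTMODS constant; struct
module and append_xen() are hypothetical names, not the code in
__start_xen().

```c
#include <assert.h>
#include <stddef.h>

#define NR_BOOTMODS 8   /* stand-in for CONFIG_NR_BOOTMODS */

/* Simplified, hypothetical module record. */
struct module {
    unsigned long start;
    unsigned long end;
};

/* One extra slot so that Xen itself can always be appended. */
static struct module mods[NR_BOOTMODS + 1];

/*
 * Clamp the loader-supplied module count to NR_BOOTMODS, then record
 * Xen in the slot right after the last accepted module. Returns the
 * index that was used for Xen.
 */
static unsigned int append_xen(unsigned int loader_count,
                               unsigned long xen_start,
                               unsigned long xen_end)
{
    unsigned int count =
        loader_count > NR_BOOTMODS ? NR_BOOTMODS : loader_count;

    /* Holds by construction because of the extra slot. */
    assert(count < sizeof(mods) / sizeof(mods[0]));

    mods[count].start = xen_start;
    mods[count].end = xen_end;

    return count;
}
```

Even when a loader passes the full NR_BOOTMODS modules, the entry for Xen
lands in the last slot rather than past the end of the array.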

>>   static const char *__initdata pvh_loader = "PVH Directboot";
>>     static void __init convert_pvh_info(multiboot_info_t **mbi,
>> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
>> index f08b07b8de..2aa1e28c8f 100644
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -1020,9 +1020,9 @@ void __init noreturn __start_xen(unsigned long
>> mbi_p)
>>           panic("dom0 kernel not specified. Check bootloader
>> configuration\n");
>>         /* Check that we don't have a silly number of modules. */
>> -    if ( mbi->mods_count > sizeof(module_map) * 8 )
>> +    if ( mbi->mods_count > CONFIG_NR_BOOTMODS )
>>       {
>> -        mbi->mods_count = sizeof(module_map) * 8;
>> +        mbi->mods_count = CONFIG_NR_BOOTMODS;
>>           printk("Excessive multiboot modules - using the first %u
>> only\n",
>>                  mbi->mods_count);
>>       }
> 
> AFAIU, this check is to make sure that we will not overrun module_map in
> the next line:
> 
> bitmap_fill(module_map, mbi->mods_count);
> 
> The current definition of module_map will allow 64 modules. But you are
> allowing 32768. So I think you either want to keep the check or define
> module_map as:
> 
> DECLARE_BITMAP(module_map, CONFIG_NR_BOOTMODS);

Yes, in the RFC I had it capped to 64 and lost track of this related
change when it was bumped to 32768 per the review discussion. Later in
the series, module_map goes away. To ensure stability at this point I
would be inclined to restore the 64-module clamp-down check. Thoughts?
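
As a rough sketch of what restoring the clamp could look like, the helper
below bounds the count by both the configured maximum and the number of
bits in the bitmap. It assumes module_map is a single unsigned long, which
matches the 64-module figure Julien quotes for x86-64; clamp_mods_count()
and the NR_BOOTMODS stand-in are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

#define NR_BOOTMODS 32768   /* upper end of the proposed Kconfig range */

/* Assumed shape: a one-word bitmap, as implied by the 64-module limit. */
static unsigned long module_map[1];

#define MODULE_MAP_BITS (sizeof(module_map) * CHAR_BIT)

/*
 * Bound the loader-supplied count so that a later fill of module_map
 * cannot run past the bitmap, whatever NR_BOOTMODS is set to.
 */
static unsigned int clamp_mods_count(unsigned int count)
{
    if ( count > NR_BOOTMODS )
        count = NR_BOOTMODS;

    if ( count > MODULE_MAP_BITS )
        count = (unsigned int)MODULE_MAP_BITS;

    return count;
}
```

The alternative Julien suggests, sizing the bitmap from the config value,
would make the second comparison unnecessary.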

v/r,
dps


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-19  9:32   ` Jan Beulich
@ 2022-07-19 17:02     ` Daniel P. Smith
  2022-07-20  7:27       ` Jan Beulich
  0 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-19 17:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Bertrand Marquis,
	Roger Pau Monné,
	xen-devel, Volodymyr Babchuk, Wei Liu

On 7/19/22 05:32, Jan Beulich wrote:
> On 06.07.2022 23:04, Daniel P. Smith wrote:
>> --- a/xen/arch/Kconfig
>> +++ b/xen/arch/Kconfig
>> @@ -17,3 +17,15 @@ config NR_CPUS
>>  	  For CPU cores which support Simultaneous Multi-Threading or similar
>>  	  technologies, this the number of logical threads which Xen will
>>  	  support.
>> +
>> +config NR_BOOTMODS
>> +	int "Maximum number of boot modules that a loader can pass"
>> +	range 1 32768
>> +	default "8" if X86
>> +	default "32" if ARM
> 
> Any reason for the larger default on Arm, irrespective of dom0less
> actually being in use? (I'm actually surprised I can't spot a Kconfig
> option controlling inclusion of dom0less. The default here imo isn't
> supposed to depend on the architecture, but on whether dom0less is
> supported. That way if another arch gained dom0less support, the
> higher default would apply to it without needing further adjustment.)

Yes, multidomain construction is always on for Arm and the only
configurable item is a command-line parameter to enforce that dom0 is not
created. As for the default, it was selected based on the largest value
used in the locations replaced by the Kconfig variable. Since there was
a significant difference between Arm and x86, I did not feel it was
appropriate to reduce/increase either, since it drives multiple static
array allocations for x86.

I have no attachment to any specific value, so I will freely adjust to
whatever consensus the community might come to.

>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>>   * The array size needs to be one larger than the number of modules we
>>   * support - see __start_xen().
>>   */
>> -static module_t __initdata mb_modules[5];
>> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
> 
> If the build admin selected 1, I'm pretty sure about nothing would work.
> I think you want max(5, CONFIG_NR_BOOTMODS) or
> max(4, CONFIG_NR_BOOTMODS) + 1 here and ...

Actually, I reasoned this out and 1 is in fact a valid value. It would
mean Xen + Dom0 Linux kernel with embedded initramfs with no externally
loaded XSM policy and no boot time microcode patching. This is a working
configuration, but open to debate if it is a desirable configuration.
The question is whether it is desired to block someone from building
such a configuration, or any number between 1 and 4. If the answer is
yes, then why not just set the lower bound of the range in the Kconfig
file instead of having to maintain a hard-coded lower bound in a max
macro across multiple locations?
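
For comparison, the per-site approach Jan suggests might look roughly like
the sketch below. MAX_CONST, struct module, and the array names are
hypothetical, and NR_BOOTMODS stands in for CONFIG_NR_BOOTMODS set to its
smallest permitted value.

```c
#include <stddef.h>

#define NR_BOOTMODS 1   /* build admin picked the smallest value */

/* Constant-expression max, usable in an array declarator. */
#define MAX_CONST(a, b) ((a) > (b) ? (a) : (b))

/* Simplified, hypothetical module record. */
struct module {
    unsigned long start;
    unsigned long end;
};

/* Each site carries its own hard-coded floor. */
static struct module efi_mods[MAX_CONST(4, NR_BOOTMODS) + 1];
static struct module pvh_mods[MAX_CONST(8, NR_BOOTMODS)];

static size_t efi_slots(void)
{
    return sizeof(efi_mods) / sizeof(efi_mods[0]);
}

static size_t pvh_slots(void)
{
    return sizeof(pvh_mods) / sizeof(pvh_mods[0]);
}
```

With the floor expressed once in the Kconfig range instead, both
declarations could be written directly in terms of NR_BOOTMODS and the
two literals would not need to be kept in sync by hand.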

>> --- a/xen/arch/x86/guest/xen/pvh-boot.c
>> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
>> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>>  uint32_t __initdata pvh_start_info_pa;
>>  
>>  static multiboot_info_t __initdata pvh_mbi;
>> -static module_t __initdata pvh_mbi_mods[8];
>> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];
> 
> ... max(8, CONFIG_NR_BOOTMODS) here (albeit the 8 may have room for
> lowering - I don't recall why 8 was chosen rather than going with
> the minimum possible value covering all module kinds known at that
> time).

This is what drove the default for x86 in Kconfig to be 8. I thought it
was excessive but assumed there was some reason for the value. And see
my comment above on whether it should be max({n},CONFIG_NR_BOOTMODS) vs
range {n}..32768.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH v1 00/18] Hyperlaunch
  2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
                   ` (17 preceding siblings ...)
  2022-07-06 21:04 ` [PATCH v1 18/18] tools: introduce example late pv helper Daniel P. Smith
@ 2022-07-19 17:06 ` Smith, Jackson
  2022-07-22 14:51   ` Daniel P. Smith
  18 siblings, 1 reply; 66+ messages in thread
From: Smith, Jackson @ 2022-07-19 17:06 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel; +Cc: scott.davis, christopher.clark

Hi Daniel,

> -----Original Message-----
> Subject: [PATCH v1 00/18] Hyperlaunch

With the adjustments that I suggested in other messages, this patch builds and boots for me on x86 (including a device tree with a domU). I will continue to poke around and see if I discover any other rough edges.

One strange behavior I see is that xen fails to start the Dom0 kernel on a warm reboot. I'm using qemu_system_x86 with the KVM backend to test out the patch. After starting qemu, xen will boot correctly only once. If I attempt to reboot the virtual system (through the 'reboot' command in dom0 or the 'system_reset' qemu monitor command) without exiting/starting a new qemu process on the host machine, xen panics while booting after printing this:

(XEN) *** Building Dom0 ***
(XEN) Dom0 has maximum 856 PIRQs
(XEN) *** Constructing a PV Dom0 ***
(XEN) ELF: not an ELF binary
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Could not construct domain 0
(XEN) ****************************************

This happens with the BUILDER_FDT config option on and off, and regardless of what dtb (if any) I pass to xen. I don't see this behavior if I switch back to xen's master branch.

Hopefully that explanation made sense. Let me know if I can provide any further information about my setup.

Thanks,
Jackson

Also, I apologize that my last messages included a digital signature. Should be fixed now.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-19 17:02     ` Daniel P. Smith
@ 2022-07-20  7:27       ` Jan Beulich
  2022-07-22 15:00         ` Daniel P. Smith
  0 siblings, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-20  7:27 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Bertrand Marquis,
	Roger Pau Monné,
	xen-devel, Volodymyr Babchuk, Wei Liu

On 19.07.2022 19:02, Daniel P. Smith wrote:
> On 7/19/22 05:32, Jan Beulich wrote:
>> On 06.07.2022 23:04, Daniel P. Smith wrote:
>>> --- a/xen/arch/x86/efi/efi-boot.h
>>> +++ b/xen/arch/x86/efi/efi-boot.h
>>> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>>>   * The array size needs to be one larger than the number of modules we
>>>   * support - see __start_xen().
>>>   */
>>> -static module_t __initdata mb_modules[5];
>>> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
>>
>> If the build admin selected 1, I'm pretty sure about nothing would work.
>> I think you want max(5, CONFIG_NR_BOOTMODS) or
>> max(4, CONFIG_NR_BOOTMODS) + 1 here and ...
> 
> Actually, I reasoned this out and 1 is in fact a valid value. It would
> mean Xen + Dom0 Linux kernel with embedded initramfs with no externally
> loaded XSM policy and no boot time microcode patching. This is a working
> configuration, but open to debate if it is a desirable configuration.
> The question is whether it is desired to block someone from building
> such a configuration, or any number between 1 and 4. If the answer is
> yes, then why not just set the lower bound of the range in the Kconfig
> file instead of having to maintain a hard-coded lower bound in a max
> marco across multiple locations?

While I'd be fine with the lower bounds being raised, I wouldn't be very
happy with seeing those lower bounds becoming arch-specific.

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 02/18] introduction of generalized boot info
  2022-07-15 19:25   ` Julien Grall
@ 2022-07-20 18:32     ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-20 18:32 UTC (permalink / raw)
  To: Julien Grall, xen-devel, Wei Liu
  Cc: scott.davis, christopher.clark, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Stefano Stabellini

On 7/15/22 15:25, Julien Grall wrote:
> Hi Daniel,
> 
> On 06/07/2022 22:04, Daniel P. Smith wrote:
>> The x86 and Arm architectures represent in memory the general boot
>> information
>> and boot modules differently despite having commonality. The x86
>> representations are bound to the multiboot v1 structures while the Arm
>> representations are a slightly generalized meta-data container for the
>> boot
>> material. The multiboot structure does not lend itself well to being
>> expanded
>> to accommodate additional metadata, both general and boot module
>> specific. The
>> Arm structures are not bound to an external specification and thus are
>> able to
>> be expanded for solutions such as dom0less.
>>
>> This commit introduces a set of structures patterned off the Arm
>> structures to
>> represent the boot information in a manner that captures common data. The
>> structures provide an arch field to allow arch specific expansions to the
>> structures. The intended goal of these new common structures is to enable
>> commonality between the different architectures.  Specifically to enable
>> dom0less and hyperlaunch to have a common representation of boot-time
>> constructed domains.
>>
>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
>> ---
>>   xen/arch/x86/include/asm/bootinfo.h | 48 +++++++++++++++++++++++++
>>   xen/include/xen/bootinfo.h          | 54 +++++++++++++++++++++++++++++
>>   2 files changed, 102 insertions(+)
>>   create mode 100644 xen/arch/x86/include/asm/bootinfo.h
>>   create mode 100644 xen/include/xen/bootinfo.h
>>
>> diff --git a/xen/arch/x86/include/asm/bootinfo.h
>> b/xen/arch/x86/include/asm/bootinfo.h
>> new file mode 100644
>> index 0000000000..b0754a3ed0
>> --- /dev/null
>> +++ b/xen/arch/x86/include/asm/bootinfo.h
>> @@ -0,0 +1,48 @@
>> +#ifndef __ARCH_X86_BOOTINFO_H__
>> +#define __ARCH_X86_BOOTINFO_H__
>> +
>> +/* unused for x86 */
>> +struct arch_bootstring { };
>> +
>> +struct __packed arch_bootmodule {
>> +#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0
>> +    uint32_t flags;
>> +    uint32_t headroom;
>> +};
>> +
>> +struct __packed arch_boot_info {
>> +    uint32_t flags;
>> +#define BOOTINFO_FLAG_X86_MEMLIMITS      1U << 0
>> +#define BOOTINFO_FLAG_X86_BOOTDEV        1U << 1
>> +#define BOOTINFO_FLAG_X86_CMDLINE        1U << 2
>> +#define BOOTINFO_FLAG_X86_MODULES        1U << 3
>> +#define BOOTINFO_FLAG_X86_AOUT_SYMS      1U << 4
>> +#define BOOTINFO_FLAG_X86_ELF_SYMS       1U << 5
>> +#define BOOTINFO_FLAG_X86_MEMMAP         1U << 6
>> +#define BOOTINFO_FLAG_X86_DRIVES         1U << 7
>> +#define BOOTINFO_FLAG_X86_BIOSCONFIG     1U << 8
>> +#define BOOTINFO_FLAG_X86_LOADERNAME     1U << 9
>> +#define BOOTINFO_FLAG_X86_APM            1U << 10
>> +
>> +    bool xen_guest;
>> +
>> +    char *boot_loader_name;
>> +    char *kextra;
>> +
>> +    uint32_t mem_lower;
>> +    uint32_t mem_upper;
>> +
>> +    uint32_t mmap_length;
>> +    paddr_t mmap_addr;
>> +};
>> +
>> +struct __packed mb_memmap {
>> +    uint32_t size;
>> +    uint32_t base_addr_low;
>> +    uint32_t base_addr_high;
>> +    uint32_t length_low;
>> +    uint32_t length_high;
>> +    uint32_t type;
>> +};
>> +
>> +#endif
> 
> NIT: Missing emacs magics.

As a devoted vim user, begrudged ack. ( ^_^)

>> diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
>> new file mode 100644
>> index 0000000000..42b53a3ca6
>> --- /dev/null
>> +++ b/xen/include/xen/bootinfo.h
>> @@ -0,0 +1,54 @@
>> +#ifndef __XEN_BOOTINFO_H__
>> +#define __XEN_BOOTINFO_H__
>> +
>> +#include <xen/mm.h>
>> +#include <xen/types.h>
>> +
>> +#include <asm/bootinfo.h>
>> +
>> +typedef enum {
>> +    BOOTMOD_UNKNOWN,
>> +    BOOTMOD_XEN,
>> +    BOOTMOD_FDT,
>> +    BOOTMOD_KERNEL,
>> +    BOOTMOD_RAMDISK,
>> +    BOOTMOD_XSM,
>> +    BOOTMOD_UCODE,
>> +    BOOTMOD_GUEST_DTB,
>> +}  bootmodule_kind;
>> +
>> +typedef enum {
>> +    BOOTSTR_EMPTY,
>> +    BOOTSTR_STRING,
>> +    BOOTSTR_CMDLINE,
>> +} bootstring_kind;
>> +
>> +#define BOOTMOD_MAX_STRING 1024
>> +struct __packed boot_string {
> 
> As you use __packed, the fields...
> 
>> +    bootstring_kind kind;
>> +    struct arch_bootstring *arch;
> 
> ... may not be naturally aligned anymore. Here it will depend on the
> size of bootstring_kind (this is an enum and it don't think C guarantees
> the size). This...

Ack.

>> +
>> +    char bytes[BOOTMOD_MAX_STRING];
>> +    size_t len;
>> +};
>> +
>> +struct __packed boot_module {
>> +    bootmodule_kind kind;
>> +    paddr_t start;
>> +    mfn_t mfn;
>> +    size_t size;
>> +
>> +    struct arch_bootmodule *arch;
>> +    struct boot_string string;
>> +};
>> +
>> +struct __packed boot_info {
>> +    char *cmdline;
>> +
>> +    uint32_t nr_mods;
>> +    struct boot_module *mods;
> 
> ... more obvious on this one because on 64-bit arch, there will be no
> 32-bit padding. So 'mods' will be 32-bit aligned even if the value 64-bit.
> 
> This is going to be a problem on any architecture that forbid unaligned
> access (or let the software decide).
> 
> In this case, I don't think any structures you defined warrant to be
> __packed.

Ack. I was too focused on 32bit alignment for the x86 bootstrap entry
point when I was laying out the structure; that was short-sighted on my
part. I will go back and rework it to be 64bit aligned.
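
The alignment effect Julien describes can be seen with a small sketch. It
uses the GCC/Clang packed attribute directly and copies only the first
three fields of boot_info; the _packed and _natural struct names and the
two helper functions are made up for the comparison.

```c
#include <stddef.h>
#include <stdint.h>

struct boot_module;   /* opaque for this sketch */

/* Packed, as posted: no padding is inserted after nr_mods. */
struct __attribute__((packed)) boot_info_packed {
    char *cmdline;
    uint32_t nr_mods;
    struct boot_module *mods;
};

/* Same fields with natural alignment. */
struct boot_info_natural {
    char *cmdline;
    uint32_t nr_mods;
    struct boot_module *mods;
};

/* Offset of the mods pointer in each layout. */
static size_t packed_mods_offset(void)
{
    return offsetof(struct boot_info_packed, mods);
}

static size_t natural_mods_offset(void)
{
    return offsetof(struct boot_info_natural, mods);
}
```

On a typical LP64 target the packed layout places mods at offset 12, which
is only 4-byte aligned, while the natural layout pads it out to offset 16.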

>> +
>> +    struct arch_boot_info *arch;
>> +};
>> +
>> +#endif
> 
> 
> NIT: Missing emacs magics.

Ack.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 02/18] introduction of generalized boot info
  2022-07-19 13:11   ` Jan Beulich
@ 2022-07-21 14:28     ` Daniel P. Smith
  2022-07-21 16:00       ` Jan Beulich
  2022-07-21 16:00       ` Jan Beulich
  0 siblings, 2 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-21 14:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 7/19/22 09:11, Jan Beulich wrote:
> On 06.07.2022 23:04, Daniel P. Smith wrote:
>> --- /dev/null
>> +++ b/xen/arch/x86/include/asm/bootinfo.h
>> @@ -0,0 +1,48 @@
>> +#ifndef __ARCH_X86_BOOTINFO_H__
>> +#define __ARCH_X86_BOOTINFO_H__
>> +
>> +/* unused for x86 */
>> +struct arch_bootstring { };
>> +
>> +struct __packed arch_bootmodule {
>> +#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0
> 
> Such macro expansions need parenthesizing.

Ack.
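
To show why the parentheses matter, here is a minimal sketch with made-up
macro names (FLAG_BARE and FLAG_PAREN are not from the patch). Clearing a
flag with ~ is one place where the bare form silently does something else.

```c
#include <stdint.h>

#define FLAG_BARE   1U << 3     /* expansion style used in the patch */
#define FLAG_PAREN  (1U << 3)   /* parenthesized expansion */

/*
 * Intended to clear bit 3. Expands to flags & ~1U << 3, which parses
 * as flags & ((~1U) << 3) and clears bits 0-2 instead.
 */
static uint32_t clear_bare(uint32_t flags)
{
    return flags & ~FLAG_BARE;
}

/* Expands to flags & ~(1U << 3): clears bit 3 only. */
static uint32_t clear_paren(uint32_t flags)
{
    return flags & ~FLAG_PAREN;
}
```

Starting from 0xF, clear_bare() yields 0x0 while clear_paren() yields 0x7.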

>> +    uint32_t flags;
>> +    uint32_t headroom;
>> +};
> 
> Since you're not following any external spec, on top of what Julien
> said about the __packed attribute I'd also like to point out that
> in many cases here there's no need to use fixed-width types.

Oh, I forgot to mention that in the reply to Julien. Yes, the __packed
is needed to correctly cross the 32bit to 64bit bridge from the x86
bootstrap in patch 4.

>> +struct __packed arch_boot_info {
>> +    uint32_t flags;
>> +#define BOOTINFO_FLAG_X86_MEMLIMITS  	1U << 0
>> +#define BOOTINFO_FLAG_X86_BOOTDEV    	1U << 1
>> +#define BOOTINFO_FLAG_X86_CMDLINE    	1U << 2
>> +#define BOOTINFO_FLAG_X86_MODULES    	1U << 3
>> +#define BOOTINFO_FLAG_X86_AOUT_SYMS  	1U << 4
>> +#define BOOTINFO_FLAG_X86_ELF_SYMS   	1U << 5
>> +#define BOOTINFO_FLAG_X86_MEMMAP     	1U << 6
>> +#define BOOTINFO_FLAG_X86_DRIVES     	1U << 7
>> +#define BOOTINFO_FLAG_X86_BIOSCONFIG 	1U << 8
>> +#define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
>> +#define BOOTINFO_FLAG_X86_APM        	1U << 10
>> +
>> +    bool xen_guest;
> 
> As the example of this, with just the header files being introduced
> here it is not really possible to figure what these fields are to
> be used for and hence whether they're legitimately represented here.

I can add a comment to clarify these are a mirror of the multiboot
flags. They were mirrored so that the multiboot flags can be copied
directly, which eased replacement at the locations where an mb flag is
checked.
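
A minimal sketch of that mirroring, assuming parenthesized flag
definitions and two cut-down structs (mbi_sketch and arch_sketch are
hypothetical; the bit positions follow the multiboot v1 info flags):

```c
#include <stdint.h>

/* Multiboot v1 info flag bits (only two shown). */
#define MBI_CMDLINE                (1U << 2)
#define MBI_MODULES                (1U << 3)

/* Mirrored values, so no per-bit translation is needed. */
#define BOOTINFO_FLAG_X86_CMDLINE  (1U << 2)
#define BOOTINFO_FLAG_X86_MODULES  (1U << 3)

struct mbi_sketch  { uint32_t flags; };   /* stands in for multiboot_info_t */
struct arch_sketch { uint32_t flags; };   /* stands in for arch_boot_info */

/* Because the bit layout matches, the flags word is copied as-is. */
static void copy_flags(struct arch_sketch *arch,
                       const struct mbi_sketch *mbi)
{
    arch->flags = mbi->flags;
}

/* A former "mbi->flags & MBI_MODULES" check becomes this. */
static int has_modules(const struct arch_sketch *arch)
{
    return !!(arch->flags & BOOTINFO_FLAG_X86_MODULES);
}
```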

>> +    char *boot_loader_name;
>> +    char *kextra;
> 
> const?

I want to say const will not work based on usage, but I will double-check.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 02/18] introduction of generalized boot info
  2022-07-21 14:28     ` Daniel P. Smith
  2022-07-21 16:00       ` Jan Beulich
@ 2022-07-21 16:00       ` Jan Beulich
  2022-07-22 16:01         ` Daniel P. Smith
  1 sibling, 1 reply; 66+ messages in thread
From: Jan Beulich @ 2022-07-21 16:00 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 21.07.2022 16:28, Daniel P. Smith wrote:
> On 7/19/22 09:11, Jan Beulich wrote:
>> On 06.07.2022 23:04, Daniel P. Smith wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/x86/include/asm/bootinfo.h
>>> @@ -0,0 +1,48 @@
>>> +#ifndef __ARCH_X86_BOOTINFO_H__
>>> +#define __ARCH_X86_BOOTINFO_H__
>>> +
>>> +/* unused for x86 */
>>> +struct arch_bootstring { };
>>> +
>>> +struct __packed arch_bootmodule {
>>> +#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0
>>
>> Such macro expansions need parenthesizing.
> 
> Ack.
> 
>>> +    uint32_t flags;
>>> +    uint32_t headroom;
>>> +};
>>
>> Since you're not following any external spec, on top of what Julien
>> said about the __packed attribute I'd also like to point out that
>> in many cases here there's no need to use fixed-width types.
> 
> Oh, I forgot to mention that in the reply to Julien. Yes, the __packed
> is needed to correctly cross the 32bit to 64bit bridge from the x86
> bootstrap in patch 4.

I'm afraid I don't follow you here. I did briefly look at patch 4 (but
that really also falls in the "wants to be split" category), but I
can't see why a purely internally used struct may need packing. I'd
appreciate if you could expand on that.

>>> +struct __packed arch_boot_info {
>>> +    uint32_t flags;
>>> +#define BOOTINFO_FLAG_X86_MEMLIMITS  	1U << 0
>>> +#define BOOTINFO_FLAG_X86_BOOTDEV    	1U << 1
>>> +#define BOOTINFO_FLAG_X86_CMDLINE    	1U << 2
>>> +#define BOOTINFO_FLAG_X86_MODULES    	1U << 3
>>> +#define BOOTINFO_FLAG_X86_AOUT_SYMS  	1U << 4
>>> +#define BOOTINFO_FLAG_X86_ELF_SYMS   	1U << 5
>>> +#define BOOTINFO_FLAG_X86_MEMMAP     	1U << 6
>>> +#define BOOTINFO_FLAG_X86_DRIVES     	1U << 7
>>> +#define BOOTINFO_FLAG_X86_BIOSCONFIG 	1U << 8
>>> +#define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
>>> +#define BOOTINFO_FLAG_X86_APM        	1U << 10
>>> +
>>> +    bool xen_guest;
>>
>> As the example of this, with just the header files being introduced
>> here it is not really possible to figure what these fields are to
>> be used for and hence whether they're legitimately represented here.
> 
> I can add a comment to clarify these are a mirror of the multiboot
> flags. These were mirrored to allow the multiboot flags to be direct
> copied and eased the replacement locations where an mb flag is checked.

Multiboot flags? The context here is the "xen_guest" field.

Jan
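
For reference, the parenthesization remark above can be shown with a
small sketch. The macro and function names below are hypothetical; only
the unparenthesized "1U << n" form is taken from the patch. The point is
that a unary operator applied to the macro binds tighter than the shift:

```c
#include <stdint.h>

/* Hypothetical names; only the "1U << 3" form mirrors the patch. */
#define FLAG_UNPAREN 1U << 3      /* as written in the patch */
#define FLAG_PAREN   (1U << 3)    /* parenthesized form */

/* Try to clear bit 3 using the unparenthesized macro. */
static uint32_t clear_unparen(uint32_t flags)
{
    /* Expands to flags & ~1U << 3, i.e. flags & ((~1U) << 3). */
    return flags & ~FLAG_UNPAREN;
}

/* Clear bit 3 using the parenthesized macro. */
static uint32_t clear_paren(uint32_t flags)
{
    /* Expands to flags & ~(1U << 3), which clears only bit 3. */
    return flags & ~FLAG_PAREN;
}
```

With a 32-bit unsigned int, the first helper clears the low four bits
instead of bit 3 alone, while the second clears only the intended bit.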


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 03/18] x86: adopt new boot info structures
  2022-07-19 13:19   ` Jan Beulich
@ 2022-07-22 12:34     ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 12:34 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, Daniel De Graaf,
	xen-devel, Wei Liu

On 7/19/22 09:19, Jan Beulich wrote:
> On 06.07.2022 23:04, Daniel P. Smith wrote:
>> This commit replaces the use of the multiboot v1 structures starting
>> at __start_xen(). The majority of this commit is converting the fields
>> being accessed for the startup calculations. While adapting the ucode
>> boot module location logic, this code was refactored to reduce some
>> of the unnecessary complexity.
> 
> Things like this or ...
> 
>> --- a/xen/arch/x86/bzimage.c
>> +++ b/xen/arch/x86/bzimage.c
>> @@ -69,10 +69,8 @@ static __init int bzimage_check(struct setup_header *hdr, unsigned long len)
>>      return 1;
>>  }
>>  
>> -static unsigned long __initdata orig_image_len;
>> -
>> -unsigned long __init bzimage_headroom(void *image_start,
>> -                                      unsigned long image_length)
>> +unsigned long __init bzimage_headroom(
>> +    void *image_start, unsigned long image_length)
>>  {
>>      struct setup_header *hdr = (struct setup_header *)image_start;
>>      int err;
>> @@ -91,7 +89,6 @@ unsigned long __init bzimage_headroom(void *image_start,
>>      if ( elf_is_elfbinary(image_start, image_length) )
>>          return 0;
>>  
>> -    orig_image_len = image_length;
>>      headroom = output_length(image_start, image_length);
>>      if (gzip_check(image_start, image_length))
>>      {
>> @@ -104,12 +101,15 @@ unsigned long __init bzimage_headroom(void *image_start,
>>      return headroom;
>>  }
>>  
>> -int __init bzimage_parse(void *image_base, void **image_start,
>> -                         unsigned long *image_len)
>> +int __init bzimage_parse(
>> +    void *image_base, void **image_start, unsigned int headroom,
>> +    unsigned long *image_len)
>>  {
>>      struct setup_header *hdr = (struct setup_header *)(*image_start);
>>      int err = bzimage_check(hdr, *image_len);
>> -    unsigned long output_len;
>> +    unsigned long output_len, orig_image_len;
>> +
>> +    orig_image_len = *image_len - headroom;
>>  
>>      if ( err < 0 )
>>          return err;
>> @@ -125,7 +125,7 @@ int __init bzimage_parse(void *image_base, void **image_start,
>>  
>>      BUG_ON(!(image_base < *image_start));
>>  
>> -    output_len = output_length(*image_start, orig_image_len);
>> +    output_len = output_length(*image_start, *image_len);
>>  
>>      if ( (err = perform_gunzip(image_base, *image_start, orig_image_len)) > 0 )
>>          err = decompress(*image_start, orig_image_len, image_base);
> 
> ... whatever the deal is here want factoring out. Also you want to avoid
> making formatting changes (like in the function headers here) in an
> already large patch, when you don't otherwise touch the functions. I'm
> not even convinced the formatting changes are desirable here, so I'd
> like to ask that even on code you do touch for other reasons you do so
> only if the existing layout ends up really awkward.

Ack. As I mentioned, I tried dropping these based on past reviews. I
will do another pass to catch the formatting-only changes and drop them.

> I have not looked in any further detail at this patch, sorry. Together
> with my comment on the earlier patch I conclude that it might be best
> if you moved things to the new representation field by field (or set of
> related fields), introducing the new fields in the abstraction struct
> as they are being made use of.

I am not sure whether it is possible to do this field by field. This is
why I was asking on IRC about dealing with this kind of situation. As
soon as multiboot_info_t/module_t are replaced with struct
boot_info{}/struct boot_module{}, a wholesale replacement must be done.

v/r,
dps


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 04/18] x86: refactor entrypoints to new boot info
  2022-07-18 13:58   ` Smith, Jackson
@ 2022-07-22 12:59     ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 12:59 UTC (permalink / raw)
  To: Smith, Jackson, xen-devel
  Cc: scott.davis, christopher.clark, Jan Beulich, Andrew Cooper,
	Roger Pau Monné


On 7/18/22 09:58, Smith, Jackson wrote:
> Hi Daniel,
> 
> I hope outlook gets this reply right.

Looks good to me, thank you for taking the time to review.

>> -----Original Message-----
>> Subject: [PATCH v1 04/18] x86: refactor entrypoints to new boot info
> 
>> diff --git a/xen/arch/x86/guest/xen/pvh-boot.c
>> b/xen/arch/x86/guest/xen/pvh-boot.c
>> index 834b1ad16b..28cf5df0a3 100644
>> --- a/xen/arch/x86/guest/xen/pvh-boot.c
>> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
> 
>> @@ -99,13 +118,16 @@ static void __init get_memory_map(void)
>>      sanitize_e820_map(e820_raw.map, &e820_raw.nr_map);
>>  }
>>
>> -void __init pvh_init(multiboot_info_t **mbi, module_t **mod)
>> +void __init pvh_init(struct boot_info **bi)
>>  {
>> -    convert_pvh_info(mbi, mod);
>> +    *bi = init_pvh_info();
>> +    convert_pvh_info(*bi);
>>
>>      hypervisor_probe();
>>      ASSERT(xen_guest);
>>
>> +    (*bi)->arch->xen_guest = xen_guest;
> 
> I think you may have a typo/missed refactoring here?
> I changed this line to "(*bi)->arch->xenguest = xen_guest;" to get the 
> patchset to build.

Hmm, I guess I missed one. I originally was going to mimic the name
xen_guest in the structure definition, but when Xen guest support is
disabled the xen_guest global turns into a #define, which replaces the
field reference and results in a compilation error.

> The arch_boot_info struct in boot_info32.h has a field 'xen_guest' but the 
> same field in asm/bootinfo.h was re-named from 'xen_guest' to 'xenguest' in 
> the 'x86: adopt new boot info structures' commit.
> 
> What was your intent?

As I mentioned above, the renaming was intentional, and it looks like I
did a poor job of catching every place where the renaming needed to be
done.

>> +
>>      get_memory_map();
>>  }
>>
> 
> Thanks,
> Jackson Smith

v/r,
dps
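
A minimal sketch of the name collision described above, under the
assumption that with Xen guest support compiled out the global is
replaced by a constant macro such as "#define xen_guest 0". The struct
and function names here are hypothetical:

```c
#include <stdbool.h>

/*
 * Assumption: this is roughly what the global turns into when Xen guest
 * support is disabled.
 */
#define xen_guest 0

struct arch_boot_info_sketch {
    /*
     * Declaring "bool xen_guest;" here would expand to "bool 0;" and
     * fail to compile, hence the differently spelled field name.
     */
    bool xenguest;
};

/* Record the guest state; the right-hand side expands to the constant. */
static void record_guest_state(struct arch_boot_info_sketch *arch)
{
    arch->xenguest = xen_guest;
}
```

Because the preprocessor rewrites every xen_guest token, any field
access spelled "->xen_guest" hits the same problem, which is why every
reference has to use the renamed field.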


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 05/18] x86: refactor xen cmdline into general framework
  2022-07-19 13:26   ` Jan Beulich
@ 2022-07-22 13:12     ` Daniel P. Smith
  2022-07-25  7:09       ` Jan Beulich
  0 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 13:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu


On 7/19/22 09:26, Jan Beulich wrote:
> On 06.07.2022 23:04, Daniel P. Smith wrote:
>> --- a/xen/include/xen/bootinfo.h
>> +++ b/xen/include/xen/bootinfo.h
>> @@ -53,6 +53,17 @@ struct __packed boot_info {
>>  
>>  extern struct boot_info *boot_info;
>>  
>> +static inline char *bootinfo_prepare_cmdline(struct boot_info *bi)
>> +{
>> +    bi->cmdline = arch_bootinfo_prepare_cmdline(bi->cmdline, bi->arch);
>> +
>> +    if ( *bi->cmdline == ' ' )
>> +        printk(XENLOG_WARNING "%s: leading whitespace left on cmdline\n",
>> +               __func__);
> 
> Just a remark and a question on this one: I don't view the use of
> __func__ here (and in fact in many other cases as well) as very
> useful. And why do we need such a warning all of the sudden in the
> first place?

This started as just a debug print, which is why __func__ is in place,
and I later decided to leave it. The reason is that after this point the
code assumes all leading space was stripped, but there was never a check
that the trimming logic did the job correctly. I don't believe a failure
to do so warrants halting the boot process, but it should at least
produce a warning in the log. That would give an admin a clue should
unexpected behavior occur as a result of leading space making it through
and breaking the fully trimmed assumption made elsewhere.

v/r,
dps
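
The check-after-trim idea above can be sketched in isolation as
follows. Both helpers are hypothetical stand-ins (the real trimming is
done by the arch hook, which is not shown in this thread), and fprintf
stands in for printk:

```c
#include <stdio.h>

/* Hypothetical stand-in for the arch-specific trimming step. */
static const char *skip_leading_spaces(const char *s)
{
    while ( *s == ' ' )
        s++;
    return s;
}

/* Returns 1 (after warning) if leading whitespace survived, else 0. */
static int check_trimmed(const char *cmdline)
{
    if ( *cmdline == ' ' )
    {
        fprintf(stderr, "warning: leading whitespace left on cmdline\n");
        return 1;
    }
    return 0;
}
```

The check does not stop the boot; it only records that the assumption
relied on by later code did not hold.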


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 06/18] fdt: make fdt handling reusable across arch
  2022-07-19  9:36   ` Jan Beulich
@ 2022-07-22 13:18     ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 13:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel, Volodymyr Babchuk


On 7/19/22 05:36, Jan Beulich wrote:
> On 06.07.2022 23:04, Daniel P. Smith wrote:
>> This refactors reusable code from Arm's bootfdt.c and device-tree.h that is
>> general fdt handling code.  The Kconfig parameter CORE_DEVICE_TREE is
>> introduced for when the ability of parsing DTB files is needed by a capability
>> such as hyperlaunch.
>>
>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
>> ---
>>  xen/arch/arm/bootfdt.c        | 115 +----------------------------
>>  xen/common/Kconfig            |   4 ++
>>  xen/common/Makefile           |   3 +-
>>  xen/common/fdt.c              | 131 ++++++++++++++++++++++++++++++++++
>>  xen/include/xen/device_tree.h |  50 +------------
>>  xen/include/xen/fdt.h         |  79 ++++++++++++++++++++
>>  6 files changed, 218 insertions(+), 164 deletions(-)
>>  create mode 100644 xen/common/fdt.c
>>  create mode 100644 xen/include/xen/fdt.h
> 
> I think this wants to be accompanied by an update to ./MAINTAINERS,
> so maintainership doesn't silently transition to THE REST.

ack

> I further think that the moved code would want to have style adjusted
> to match present guidelines - I've noticed a number of u<N> uses which
> should be uint<N>_t. I didn't look closely to see whether other style
> violations are also retained in the moved code.

ack

v/r,
dps


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 07/18] docs: update hyperlaunch device tree documentation
  2022-07-18 13:57   ` Smith, Jackson
@ 2022-07-22 13:34     ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 13:34 UTC (permalink / raw)
  To: Smith, Jackson, xen-devel
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu


On 7/18/22 09:57, Smith, Jackson wrote:
> Hi Daniel,
> 
>> -----Original Message-----
>> Subject: [PATCH v1 07/18] docs: update hyperlaunch device tree
>> documentation
> 
> 
>> diff --git a/docs/designs/launch/hyperlaunch-devicetree.rst
>> b/docs/designs/launch/hyperlaunch-devicetree.rst
>> index b49c98cfbd..ae1a786d0b 100644
>> --- a/docs/designs/launch/hyperlaunch-devicetree.rst
>> +++ b/docs/designs/launch/hyperlaunch-devicetree.rst
>> @@ -13,12 +13,268 @@ difference is the introduction of the ``hypervisor``
> 
>> +
>> +The Hypervisor node
>> +-------------------
>> +
>> +The ``hypervisor`` node is a top level container for the domains that will be built
>> +by hypervisor on start up. The node will be named ``hypervisor``  with a ``compatible``
>> +property to identify which hypervisors the configuration is intended.
> ^^^ Should there be a note here that hypervisor node also needs a compatible 
> "xen,<arch>"?

Ack.

>> +The hypervisor
>> +node will consist of one or more config nodes and one or more domain nodes.
>> +
>> +Properties
>> +""""""""""
>> +
>> +compatible
>> +  Identifies which hypervisors the configuration is compatible. Required.
>> +
>> +  Format: "hypervisor,<hypervisor name>", e.g "hypervisor,xen"
> ^^^ Same here: compatible "<hypervisor name>,<arch>"?

Ack.

>>  Example Configuration
>>  ---------------------
>> +
>> +Multiboot x86 Configuration Dom0-only:
>> +""""""""""""""""""""""""""""""""""""""
>> +The following dts file can be provided to the Device Tree compiler, ``dtc``, to
>> +produce a dtb file.
>> +::
>> +
>> +  /dts-v1/;
>> +
>> +  / {
>> +      chosen {
>> +          hypervisor {
>> +              compatible = "hypervisor,xen";
> ^^^^^^^^  compatible = "hypervisor,xen", "xen,x86";

Ack.

>> +
>> +              dom0 {
>> +                  compatible = "xen,domain";
>> +
>> +                  domid = <0>;
>> +
>> +                  permissions = <3>;
>> +                  functions = <0xC000000F>;
>> +                  mode = <5>;
>> +
>> +                  domain-uuid = [B3 FB 98 FB 8F 9F 67 A3 8A 6E 62 5A 09 13 F0 8C];
>> +
>> +                  cpus = <1>;
>> +                  memory = <0x0 0x20000000>;
> ^^^^^^^^^^ memory = "2048M";
> Needs to be updated to new format for mem.

Ack.

>> +
>> +                  kernel {
>> +                      compatible = "module,kernel", "module,index";
>> +                      module-index = <1>;
>> +                  };
>> +              };
>> +
>> +          };
>> +      };
>> +  };
>> +
> 
> Similar adjustments are needed for the rest of the examples I believe.
> 
> Also, two typos:
> Line 287 is missing a line ending semi-colon.
> Line 82 has a double space between 'node' and 'may'.

Ack.

v/r,
dps


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 08/18] kconfig: introduce domain builder config option
  2022-07-19 13:29   ` Jan Beulich
@ 2022-07-22 13:47     ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 13:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, xen-devel


On 7/19/22 09:29, Jan Beulich wrote:
> On 06.07.2022 23:04, Daniel P. Smith wrote:
>> --- /dev/null
>> +++ b/xen/common/domain-builder/Kconfig
>> @@ -0,0 +1,15 @@
>> +
>> +menu "Domain Builder Features"
>> +
>> +config BUILDER_FDT
>> +	bool "Domain builder device tree (UNSUPPORTED)" if UNSUPPORTED
>> +	select CORE_DEVICE_TREE
>> +	---help---
> 
> Nit: No new ---help--- please anymore.

Ack.

>> +	  Enables the ability to configure the domain builder using a
>> +	  flattened device tree.
> 
> Is this about both Dom0 and DomU? Especially if not, this wants making
> explicit. But perhaps even if so it wants saying, for the avoidance of
> doubt.

The following patches will end with full conversion of both Dom0 and
DomU construction to be handled by a core domain construction framework.
If a device tree configuration is not present or this Kconfig option is
not set, then the domain builder will construct a Dom0 as it does today.
Turning this option on enables controlling the domain builder using a
device tree configuration, which (eventually) will be able to construct
any combination of Dom0, HWDom, CtlDom, DomU, etc. So I can add a
qualifier of, 'configure what domains will be constructed at boot by the
domain builder using a flattened device tree'. I can even add an
explanation that on x86 the FDT must be provided as the first multiboot
module.

v/r,
dps


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 10/18] x86: introduce the domain builder
  2022-07-18 13:59   ` Smith, Jackson
@ 2022-07-22 14:36     ` Daniel P. Smith
  2022-07-22 20:33       ` Smith, Jackson
  0 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 14:36 UTC (permalink / raw)
  To: Smith, Jackson, xen-devel
  Cc: scott.davis, christopher.clark, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini


On 7/18/22 09:59, Smith, Jackson wrote:
> Hi Daniel,
> 
>> -----Original Message-----
>> Subject: [PATCH v1 10/18] x86: introduce the domain builder
>>
>> This commit introduces the domain builder configuration FDT parser along
>> with the domain builder core for domain creation. To enable domain builder
>> to be a cross architecture internal API, a new arch domain creation call is
>> introduced for use by the domain builder.
> 
>> diff --git a/xen/common/domain-builder/core.c
> 
>> +void __init builder_init(struct boot_info *info)
>> +{
>> +    struct boot_domain *d = NULL;
>> +
>> +    info->builder = &builder;
>> +
>> +    if ( IS_ENABLED(CONFIG_BUILDER_FDT) )
>> +    {
> 
>> +    }
>> +
>> +    /*
>> +     * No FDT config support or an FDT wasn't present, do an initial
>> +     * domain construction
>> +     */
>> +    printk("Domain Builder: falling back to initial domain build\n");
>> +    info->builder->nr_doms = 1;
>> +    d = &info->builder->domains[0];
>> +
>> +    d->mode = opt_dom0_pvh ? 0 : BUILD_MODE_PARAVIRTUALIZED;
>> +
>> +    d->kernel = &info->mods[0];
>> +    d->kernel->kind = BOOTMOD_KERNEL;
>> +
>> +    d->permissions = BUILD_PERMISSION_CONTROL | BUILD_PERMISSION_HARDWARE;
>> +    d->functions = BUILD_FUNCTION_CONSOLE | BUILD_FUNCTION_XENSTORE |
>> +                     BUILD_FUNCTION_INITIAL_DOM;
>> +
>> +    d->kernel->arch->headroom = bzimage_headroom(bootstrap_map(d->kernel),
>> +                                                   d->kernel->size);
>> +    bootstrap_map(NULL);
>> +
>> +    if ( d->kernel->string.len )
>> +        d->kernel->string.kind = BOOTSTR_CMDLINE;
>> +}
> 
> Forgive me if I'm incorrect, but I believe there is an issue with this
> fallback logic for the case where no FDT was provided.

IIUC, the issue at hand has to do with patch #15.

> If dom0_mem is not supplied to the xen cmd line, then d->meminfo is never
> initialized. (See dom0_compute_nr_pages/dom0_build.c:335)
> This was giving me trouble because bd->meminfo.mem_max.nr_pages was left at
> 0, effectively clamping dom0 to 0 pages of ram.
> 
> I'm not sure what the best solution is but one (easy) possibility is just
> initializing meminfo to the dom0 defaults near the end of this function:
>         d->meminfo.mem_size = dom0_size;
>         d->meminfo.mem_min = dom0_min_size;
>         d->meminfo.mem_max = dom0_max_size;

I believe the correct fix is to this hunk,

@@ -416,7 +379,12 @@ unsigned long __init dom0_compute_nr_pages(
         }
     }

-    d->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
+    /* Clamp according to min/max limits and available memory (final). */
+    nr_pages = max(nr_pages, min_pages);
+    nr_pages = min(nr_pages, max_pages);
+    nr_pages = min(nr_pages, avail);
+
+    bd->domain->max_pages = min_t(unsigned long, max_pages, UINT_MAX);

Before that last line, there should be a clamp up of max_pages, e.g.

    nr_pages = max(nr_pages, min_pages);
    nr_pages = min(nr_pages, max_pages);
    nr_pages = min(nr_pages, avail);

    max_pages = max(nr_pages, max_pages);

    bd->domain->max_pages = min_t(unsigned long, max_pages, UINT_MAX);

v/r,
dps
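
The clamp sequence proposed above can be tried standalone with the
sketch below. The struct, the function name, and the MIN/MAX macros are
hypothetical; only the order of the four clamp steps and the final
UINT_MAX limit are taken from the hunk:

```c
#include <limits.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical result type for the sketch. */
struct page_limits {
    unsigned long nr_pages;   /* pages that would be allocated */
    unsigned long max_pages;  /* value that would be stored as max_pages */
};

static struct page_limits clamp_pages(unsigned long nr_pages,
                                      unsigned long min_pages,
                                      unsigned long max_pages,
                                      unsigned long avail)
{
    struct page_limits out;

    /* Clamp according to min/max limits and available memory. */
    nr_pages = MAX(nr_pages, min_pages);
    nr_pages = MIN(nr_pages, max_pages);
    nr_pages = MIN(nr_pages, avail);

    /* Proposed extra step: max_pages never ends up below nr_pages. */
    max_pages = MAX(nr_pages, max_pages);

    out.nr_pages = nr_pages;
    out.max_pages = MIN(max_pages, (unsigned long)UINT_MAX);
    return out;
}
```

In this standalone form a max_pages of 0 still yields 0 pages, which
matches the reported symptom, and the extra step leaves max_pages
unchanged because nr_pages has already been limited to it. Whether that
also holds in the real function depends on surrounding code not shown
here.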


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 00/18] Hyperlaunch
  2022-07-19 17:06 ` [PATCH v1 00/18] Hyperlaunch Smith, Jackson
@ 2022-07-22 14:51   ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 14:51 UTC (permalink / raw)
  To: Smith, Jackson, xen-devel; +Cc: scott.davis, christopher.clark

On 7/19/22 13:06, Smith, Jackson wrote:
> Hi Daniel,
> 
>> -----Original Message-----
>> Subject: [PATCH v1 00/18] Hyperlaunch
> 
> With the adjustments that I suggested in other messages, this patch builds and boots for me on x86 (including a device tree with a domU). I will continue to poke around and see if I discover any other rough edges.

Thank you so much for reviewing and testing!

> One strange behavior I see is that xen fails to start the Dom0 kernel on a warm reboot. I'm using qemu_system_x86 with the KVM backend to test out the patch. After starting qemu, xen will boot correctly only once. If I attempt to reboot the virtual system (through the 'reboot' command in dom0 or the 'system_reset' qemu monitor command) without exiting/starting a new qemu process on the host machine, xen panics while booting after printing this:
> 
> (XEN) *** Building Dom0 ***
> (XEN) Dom0 has maximum 856 PIRQs
> (XEN) *** Constructing a PV Dom0 ***
> (XEN) ELF: not an ELF binary
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Could not construct domain 0
> (XEN) ****************************************
> 
> This happens with the BUILDER_FDT config option on and off, and regardless of what dtb (if any) I pass to xen. I don't see this behavior if I switch back to xen's master branch.
> 
> Hopefully that explanation made sense. Let me know if I can provide any further information about my setup.

That is certainly a very strange behavior. I never tested reboot as I
assumed it would just go through the same process as a cold boot. I will
add this to my tests to run and see if I can track down why it is happening.

V/r,
Daniel P. Smith



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-20  7:27       ` Jan Beulich
@ 2022-07-22 15:00         ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 15:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Bertrand Marquis,
	Roger Pau Monné,
	xen-devel, Volodymyr Babchuk, Wei Liu

On 7/20/22 03:27, Jan Beulich wrote:
> On 19.07.2022 19:02, Daniel P. Smith wrote:
>> On 7/19/22 05:32, Jan Beulich wrote:
>>> On 06.07.2022 23:04, Daniel P. Smith wrote:
>>>> --- a/xen/arch/x86/efi/efi-boot.h
>>>> +++ b/xen/arch/x86/efi/efi-boot.h
>>>> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>>>>   * The array size needs to be one larger than the number of modules we
>>>>   * support - see __start_xen().
>>>>   */
>>>> -static module_t __initdata mb_modules[5];
>>>> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
>>>
>>> If the build admin selected 1, I'm pretty sure about nothing would work.
>>> I think you want max(5, CONFIG_NR_BOOTMODS) or
>>> max(4, CONFIG_NR_BOOTMODS) + 1 here and ...
>>
>> Actually, I reasoned this out and 1 is in fact a valid value. It would
>> mean Xen + Dom0 Linux kernel with embedded initramfs with no externally
>> loaded XSM policy and no boot time microcode patching. This is a working
>> configuration, but open to debate if it is a desirable configuration.
>> The question is whether it is desired to block someone from building
>> such a configuration, or any number between 1 and 4. If the answer is
>> yes, then why not just set the lower bound of the range in the Kconfig
>> file instead of having to maintain a hard-coded lower bound in a max
>> macro across multiple locations?
> 
> While I'd be fine with the lower bounds being raised, I wouldn't be very
> happy with seeing those lower bounds becoming arch-specific.

Okay, and I am not sure how changing the range in Kconfig would make it
arch-specific. I was not proposing making the existing range conditioned
and having arch specific instances. There is one range, it will have a
lower bound of 4 and the upper bound of 31768.

v/r,
dps
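
To illustrate the single-range argument above: if Kconfig already
enforces the lower bound, the array can be sized from the option
directly, with no max() at each use. This is only a sketch; the default
value, the struct, and the accessor are assumptions made for the
example:

```c
#include <stddef.h>

#ifndef CONFIG_NR_BOOTMODS
#define CONFIG_NR_BOOTMODS 4   /* assumed value for this sketch */
#endif

/* Mirrors the lower bound that the Kconfig range would enforce. */
_Static_assert(CONFIG_NR_BOOTMODS >= 4, "Kconfig range lower bound");

/* Hypothetical stand-in for module_t. */
struct module_sketch {
    unsigned long start, end;
};

/* One slot more than the number of supported modules, as in the patch. */
static struct module_sketch mb_modules_sketch[CONFIG_NR_BOOTMODS + 1];

/* Bounds-checked accessor; returns NULL for an out-of-range index. */
static struct module_sketch *module_slot(unsigned int i)
{
    return i <= CONFIG_NR_BOOTMODS ? &mb_modules_sketch[i] : NULL;
}
```

The static assertion is redundant if the Kconfig range is correct; it
is there only to make the assumed bound visible in the C code.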


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 02/18] introduction of generalized boot info
  2022-07-21 16:00       ` Jan Beulich
@ 2022-07-22 16:01         ` Daniel P. Smith
  2022-07-25  7:05           ` Jan Beulich
  0 siblings, 1 reply; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-22 16:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 7/21/22 12:00, Jan Beulich wrote:
> On 21.07.2022 16:28, Daniel P. Smith wrote:
>> On 7/19/22 09:11, Jan Beulich wrote:
>>> On 06.07.2022 23:04, Daniel P. Smith wrote:
>>>> --- /dev/null
>>>> +++ b/xen/arch/x86/include/asm/bootinfo.h
>>>> @@ -0,0 +1,48 @@
>>>> +#ifndef __ARCH_X86_BOOTINFO_H__
>>>> +#define __ARCH_X86_BOOTINFO_H__
>>>> +
>>>> +/* unused for x86 */
>>>> +struct arch_bootstring { };
>>>> +
>>>> +struct __packed arch_bootmodule {
>>>> +#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0
>>>
>>> Such macro expansions need parenthesizing.
>>
>> Ack.
>>
>>>> +    uint32_t flags;
>>>> +    uint32_t headroom;
>>>> +};
>>>
>>> Since you're not following any external spec, on top of what Julien
>>> said about the __packed attribute I'd also like to point out that
>>> in many cases here there's no need to use fixed-width types.
>>
>> Oh, I forgot to mention that in the reply to Julien. Yes, the __packed
>> is needed to correctly cross the 32bit to 64bit bridge from the x86
>> bootstrap in patch 4.
> 
> I'm afraid I don't follow you here. I did briefly look at patch 4 (but
> that really also falls in the "wants to be split" category), but I
> can't see why a purely internally used struct may need packing. I'd
> appreciate if you could expand on that.

Originally, patch 3 and patch 4 were a single patch, which was obviously
way too large. To split them, I realized I could introduce a temporary
conversion function that would allow the patch to be split into a post
start_xen() patch (patch 3) and a pre start_xen() patch (patch 4). For
x86, pre start_xen() consists of 3 different entry points: the
classic/traditional/old multiboot1/2 entry, the EFI entry, and the PVH
entry (aka Xen Guest). The latter two are both internal and 64bit, but
the former is located in arch/x86/boot and is compiled as 32bit. I tried
different approaches to support using a single header between these two
environments. Ultimately, IMHO, the cleanest approach is what is
introduced in patch 4, as it enables the use of Xen types in the
structures and maintains a single structure that needs to be passed
around. To do this, a 32bit specific version of the structures was
defined in arch/x86/boot/boot_info32.h and is populated in 32bit mode;
the structures can then be fixed up after getting into start_xen() and
64bit code. To ensure no unexpected insertion of padding, I focused on
ensuring everything was 32bit aligned and packed. As Julien pointed out,
I messed up with the use of enum, as its size is not guaranteed as the
enum list grows, and I forgot to consider keeping pointers 64bit aligned.

Does that help?

>>>> +struct __packed arch_boot_info {
>>>> +    uint32_t flags;
>>>> +#define BOOTINFO_FLAG_X86_MEMLIMITS  	1U << 0
>>>> +#define BOOTINFO_FLAG_X86_BOOTDEV    	1U << 1
>>>> +#define BOOTINFO_FLAG_X86_CMDLINE    	1U << 2
>>>> +#define BOOTINFO_FLAG_X86_MODULES    	1U << 3
>>>> +#define BOOTINFO_FLAG_X86_AOUT_SYMS  	1U << 4
>>>> +#define BOOTINFO_FLAG_X86_ELF_SYMS   	1U << 5
>>>> +#define BOOTINFO_FLAG_X86_MEMMAP     	1U << 6
>>>> +#define BOOTINFO_FLAG_X86_DRIVES     	1U << 7
>>>> +#define BOOTINFO_FLAG_X86_BIOSCONFIG 	1U << 8
>>>> +#define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
>>>> +#define BOOTINFO_FLAG_X86_APM        	1U << 10
>>>> +
>>>> +    bool xen_guest;
>>>
>>> As the example of this, with just the header files being introduced
>>> here it is not really possible to figure what these fields are to
>>> be used for and hence whether they're legitimately represented here.
>>
>> I can add a comment to clarify these are a mirror of the multiboot
>> flags. These were mirrored to allow the multiboot flags to be direct
>> copied and eased the replacement locations where an mb flag is checked.
> 
> Multiboot flags? The context here is the "xen_guest" field.

Apologies, I thought you were referring to all the fields and I forgot
to explain xen_guest. So to clarify, flags is to carry the MB flags
passed up from the MB entry point and xen_guest is meant to carry the
xen_guest bool passed up from the PVH/Xen Guest entry point.

v/r,
dps
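
As a rough illustration of the layout concern described above, the
sketch below compares a packed struct built from fixed-width fields
with the same fields left unpacked. The struct and field names are
hypothetical and are not the ones from boot_info32.h:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; field names are illustrative only. */
struct __attribute__((packed)) boot_module_sketch {
    uint32_t flags;
    uint32_t headroom;
    uint64_t start;   /* address kept in a fixed 64-bit slot, not a pointer */
    uint32_t size;
};

/* Same fields without packing, for comparison. */
struct boot_module_unpacked {
    uint32_t flags;
    uint32_t headroom;
    uint64_t start;
    uint32_t size;
};

/* The packed form has the same size and offsets in 32-bit and 64-bit code. */
_Static_assert(sizeof(struct boot_module_sketch) == 20,
               "no padding expected");
_Static_assert(offsetof(struct boot_module_sketch, start) == 8,
               "64-bit field sits on an 8-byte offset");
```

The unpacked variant is typically 24 bytes when compiled for x86-64 and
20 bytes when compiled for i386, because the two ABIs align uint64_t
differently inside structs. That kind of mismatch is what makes sharing
one definition between the 32bit entry code and the 64bit hypervisor
fragile unless the layout is pinned down.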


^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH v1 10/18] x86: introduce the domain builder
  2022-07-22 14:36     ` Daniel P. Smith
@ 2022-07-22 20:33       ` Smith, Jackson
  2022-07-23 10:45         ` Daniel P. Smith
  0 siblings, 1 reply; 66+ messages in thread
From: Smith, Jackson @ 2022-07-22 20:33 UTC (permalink / raw)
  To: Daniel P. Smith, Xen-devel

> -----Original Message-----
> From: Daniel P. Smith <dpsmith@apertussolutions.com>
>
> On 7/18/22 09:59, Smith, Jackson wrote:
> > Hi Daniel,
> >
> >> -----Original Message-----
> >> Subject: [PATCH v1 10/18] x86: introduce the domain builder
> >>
> >> This commit introduces the domain builder configuration FDT parser
> >> along with the domain builder core for domain creation. To enable
> >> domain builder to be a cross architecture internal API, a new arch
> >> domain creation call
> > is
> >> introduced for use by the domain builder.
> >
> >> diff --git a/xen/common/domain-builder/core.c
> >
> >> +void __init builder_init(struct boot_info *info) {
> >> +    struct boot_domain *d = NULL;
> >> +
> >> +    info->builder = &builder;
> >> +
> >> +    if ( IS_ENABLED(CONFIG_BUILDER_FDT) )
> >> +    {
> >
> >> +    }
> >> +
> >> +    /*
> >> +     * No FDT config support or an FDT wasn't present, do an initial
> >> +     * domain construction
> >> +     */
> >> +    printk("Domain Builder: falling back to initial domain build\n");
> >> +    info->builder->nr_doms = 1;
> >> +    d = &info->builder->domains[0];
> >> +
> >> +    d->mode = opt_dom0_pvh ? 0 : BUILD_MODE_PARAVIRTUALIZED;
> >> +
> >> +    d->kernel = &info->mods[0];
> >> +    d->kernel->kind = BOOTMOD_KERNEL;
> >> +
> >> +    d->permissions = BUILD_PERMISSION_CONTROL |
> >> BUILD_PERMISSION_HARDWARE;
> >> +    d->functions = BUILD_FUNCTION_CONSOLE |
> >> BUILD_FUNCTION_XENSTORE |
> >> +                     BUILD_FUNCTION_INITIAL_DOM;
> >> +
> >> +    d->kernel->arch->headroom = bzimage_headroom(bootstrap_map(d-
> >>> kernel),
> >> +                                                   d->kernel->size);
> >> +    bootstrap_map(NULL);
> >> +
> >> +    if ( d->kernel->string.len )
> >> +        d->kernel->string.kind = BOOTSTR_CMDLINE; }
> >
> > Forgive me if I'm incorrect, but I believe there is an issue with this
> > fallback logic for the case where no FDT was provided.
>
> IIUC, the issue at hand has to deal with patch #15.
>
> > If dom0_mem is not supplied to the xen cmd line, then d->meminfo is
> > never initialized. (See dom0_compute_nr_pages/dom0_build.c:335)
> > This was giving me trouble because bd->meminfo.mem_max.nr_pages was
> > left at 0, effectivity clamping dom0 to 0 pages of ram.
> >

I realize I never shared the exact panic message I was experiencing. Sorry about that.
It's "Domain 0 allocation is too small for kernel image" on xen/arch/x86/pv/domain_builder.c:534

I think you should be able to consistently reproduce what I'm seeing as long as these two conditions are met:
- the dom0_mem cmdline option is _not_ set
- no domain builder device tree is passed to xen (the fallback case I identified above)

> > I'm not sure what the best solution is but one (easy) possibility is
> > just initializing meminfo to the dom0 defaults near the end of this function:
> >         d->meminfo.mem_size = dom0_size;
> >         d->meminfo.mem_min = dom0_min_size;
> >         d->meminfo.mem_max = dom0_max_size;
>
> I believe the correct fix is to this hunk,
>
> @@ -416,7 +379,12 @@ unsigned long __init dom0_compute_nr_pages(
>          }
>      }
>
> -    d->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
> +    /* Clamp according to min/max limits and available memory (final). */
> +    nr_pages = max(nr_pages, min_pages);
> +    nr_pages = min(nr_pages, max_pages);
> +    nr_pages = min(nr_pages, avail);
> +
> +    bd->domain->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
>
> Before that last line, there should be a clamp up of max_pages, e.g.
>
>     nr_pages = max(nr_pages, min_pages);
>     nr_pages = min(nr_pages, max_pages);
>     nr_pages = min(nr_pages, avail);
>
>     max_pages = max(nr_pages, max_pages);
>
>     bd->domain->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
>
> v/r,
> dps

I don't believe this resolves my issue.

If max_pages is 0 before these 5 lines, then the second line will still clamp nr_pages to 0 and the panic on line 534 will be hit.

Before patch 15, this max limit came directly from dom0_max_size, which has a default value of { .nr_pages = LONG_MAX }, so no clamping will occur unless overridden by the cmd line.

After patch 15, bd->meminfo.mem_max is used as the max limit. (unless overridden by the cmdline)
I'm assuming it will eventually be specified in the device tree, but for now, the max limit is just set equal to the size (xen/common/domain-builder/fdt.c:155) so no down-clamping will occur.

The only exception is the initial domain construction fallback. In this case, there is no device tree and bd->meminfo is never initialized.
If bd->meminfo.mem_size is zero, the code will try to compute a reasonable default for nr_pages, but there is no such logic for max_pages. It remains 0, and clamps nr_pages to zero.

Does this help clarify?
The core issue is that without a device tree or command line option to specify the max limit, the max limit is left uninitialized, which clamps dom0's memory to 0. I think it should be initialized to LONG_MAX in that case, like it was before this patch set.
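
As a standalone illustration (this is not Xen code; demo_clamp_nr_pages(), the demo_min()/demo_max() helpers, and the page counts are made up), the three clamp lines from the hunk behave like this when the max limit is 0 versus LONG_MAX:

```c
#include <assert.h>
#include <limits.h>

static unsigned long demo_min(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

static unsigned long demo_max(unsigned long a, unsigned long b)
{
    return a > b ? a : b;
}

/* Same order of operations as the three clamp lines in the quoted hunk. */
static unsigned long demo_clamp_nr_pages(unsigned long nr_pages,
                                         unsigned long min_pages,
                                         unsigned long max_pages,
                                         unsigned long avail)
{
    nr_pages = demo_max(nr_pages, min_pages);
    nr_pages = demo_min(nr_pages, max_pages);
    nr_pages = demo_min(nr_pages, avail);
    return nr_pages;
}

/* The two cases discussed in this thread. */
static void demo_check(void)
{
    /* Uninitialized (zero) max limit: the result collapses to 0 pages. */
    assert(demo_clamp_nr_pages(1000, 0, 0, 5000) == 0);
    /* LONG_MAX as the max limit: the computed default survives. */
    assert(demo_clamp_nr_pages(1000, 0, LONG_MAX, 5000) == 1000);
}
```

Clamping max_pages up afterwards does not help, because nr_pages has already been reduced to 0 by the second line.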

Thanks,
Jackson


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 10/18] x86: introduce the domain builder
  2022-07-22 20:33       ` Smith, Jackson
@ 2022-07-23 10:45         ` Daniel P. Smith
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel P. Smith @ 2022-07-23 10:45 UTC (permalink / raw)
  To: Smith, Jackson, Xen-devel

On 7/22/22 16:33, Smith, Jackson wrote:
>> -----Original Message-----
>> From: Daniel P. Smith <dpsmith@apertussolutions.com>
>>
>> On 7/18/22 09:59, Smith, Jackson wrote:
>>> Hi Daniel,
>>>
>>>> -----Original Message-----
>>>> Subject: [PATCH v1 10/18] x86: introduce the domain builder
>>>>
>>>> This commit introduces the domain builder configuration FDT parser
>>>> along with the domain builder core for domain creation. To enable
>>>> domain builder to be a cross architecture internal API, a new arch
>>>> domain creation call
>>> is
>>>> introduced for use by the domain builder.
>>>
>>>> diff --git a/xen/common/domain-builder/core.c
>>>
>>>> +void __init builder_init(struct boot_info *info) {
>>>> +    struct boot_domain *d = NULL;
>>>> +
>>>> +    info->builder = &builder;
>>>> +
>>>> +    if ( IS_ENABLED(CONFIG_BUILDER_FDT) )
>>>> +    {
>>>
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * No FDT config support or an FDT wasn't present, do an initial
>>>> +     * domain construction
>>>> +     */
>>>> +    printk("Domain Builder: falling back to initial domain build\n");
>>>> +    info->builder->nr_doms = 1;
>>>> +    d = &info->builder->domains[0];
>>>> +
>>>> +    d->mode = opt_dom0_pvh ? 0 : BUILD_MODE_PARAVIRTUALIZED;
>>>> +
>>>> +    d->kernel = &info->mods[0];
>>>> +    d->kernel->kind = BOOTMOD_KERNEL;
>>>> +
>>>> +    d->permissions = BUILD_PERMISSION_CONTROL |
>>>> BUILD_PERMISSION_HARDWARE;
>>>> +    d->functions = BUILD_FUNCTION_CONSOLE |
>>>> BUILD_FUNCTION_XENSTORE |
>>>> +                     BUILD_FUNCTION_INITIAL_DOM;
>>>> +
>>>> +    d->kernel->arch->headroom = bzimage_headroom(bootstrap_map(d-
>>>>> kernel),
>>>> +                                                   d->kernel->size);
>>>> +    bootstrap_map(NULL);
>>>> +
>>>> +    if ( d->kernel->string.len )
>>>> +        d->kernel->string.kind = BOOTSTR_CMDLINE; }
>>>
>>> Forgive me if I'm incorrect, but I believe there is an issue with this
>>> fallback logic for the case where no FDT was provided.
>>
>> IIUC, the issue at hand has to deal with patch #15.
>>
>>> If dom0_mem is not supplied to the xen cmd line, then d->meminfo is
>>> never initialized. (See dom0_compute_nr_pages/dom0_build.c:335)
>>> This was giving me trouble because bd->meminfo.mem_max.nr_pages was
>>> left at 0, effectivity clamping dom0 to 0 pages of ram.
>>>
> 
> I realize I never shared the exact panic message I was experiencing. Sorry about that.
> It's "Domain 0 allocation is too small for kernel image" on xen/arch/x86/pv/domain_builder.c:534

Yep, I ran into this one before and thought I had it addressed.

> I think you should be able to consistently reproduce what I'm seeing as long as these two conditions are met:
> - the dom0_mem cmdline option is _not_ set
> - no domain builder device tree is passed to xen (the fallback case I identified above)

Ack

>>> I'm not sure what the best solution is but one (easy) possibility is
>>> just initializing meminfo to the dom0 defaults near the end of this function:
>>>          d->meminfo.mem_size = dom0_size;
>>>          d->meminfo.mem_min = dom0_min_size;
>>>          d->meminfo.mem_max = dom0_max_size;
>>
>> I believe the correct fix is to this hunk,
>>
>> @@ -416,7 +379,12 @@ unsigned long __init dom0_compute_nr_pages(
>>           }
>>       }
>>
>> -    d->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
>> +    /* Clamp according to min/max limits and available memory (final). */
>> +    nr_pages = max(nr_pages, min_pages);
>> +    nr_pages = min(nr_pages, max_pages);
>> +    nr_pages = min(nr_pages, avail);
>> +
>> +    bd->domain->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
>>
>> Before that last line, there should be a clamp up of max_pages, e.g.
>>
>>      nr_pages = max(nr_pages, min_pages);
>>      nr_pages = min(nr_pages, max_pages);
>>      nr_pages = min(nr_pages, avail);
>>
>>      max_pages = max(nr_pages, max_pages);
>>
>>      bd->domain->max_pages = min_t(unsigned long, max_pages, UINT_MAX);
>>
>> v/r,
>> dps
> 
> I don't believe this resolves my issue.
> 
> If max_pages is 0 before these 5 lines, then the second line will still clamp nr_pages to 0 and the panic on line 534 will be hit.
> 
> Before patch 15, this max limit came directly from dom0_max_size, which has a default value of { .nr_pages = LONG_MAX }, so no clamping will occur unless overridden by the cmd line.
> 
> After patch 15, bd->meminfo.mem_max is used as the max limit. (unless overridden by the cmdline)
> I'm assuming it will eventually be specified in the device tree, but for now, the max limit is just set equal to the size (xen/common/domain-builder/fdt.c:155) so no down-clamping will occur.
> 
> The only exception is the initial domain construction fallback. In this case, there is no device tree and bd->meminfo is never initialized.
> If bd->meminfo.mem_size is zero, the code will try to compute a reasonable default for nr_pages, but there is no such logic for max_pages. It remains 0, and clamps nr_pages to zero.
> 
> Does this help clarify?
> The core issue is that without a device tree or command line option to specify the max limit, the max limit is left uninitialized, which clamps dom0's memory to 0. I think it should be initialized to LONG_MAX in that case, like it was before this patch set.

You are correct, my apologies. Thank you!

> Thanks,
> Jackson


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 02/18] introduction of generalized boot info
  2022-07-22 16:01         ` Daniel P. Smith
@ 2022-07-25  7:05           ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-25  7:05 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 22.07.2022 18:01, Daniel P. Smith wrote:
> On 7/21/22 12:00, Jan Beulich wrote:
>> On 21.07.2022 16:28, Daniel P. Smith wrote:
>>> On 7/19/22 09:11, Jan Beulich wrote:
>>>> On 06.07.2022 23:04, Daniel P. Smith wrote:
>>>>> --- /dev/null
>>>>> +++ b/xen/arch/x86/include/asm/bootinfo.h
>>>>> @@ -0,0 +1,48 @@
>>>>> +#ifndef __ARCH_X86_BOOTINFO_H__
>>>>> +#define __ARCH_X86_BOOTINFO_H__
>>>>> +
>>>>> +/* unused for x86 */
>>>>> +struct arch_bootstring { };
>>>>> +
>>>>> +struct __packed arch_bootmodule {
>>>>> +#define BOOTMOD_FLAG_X86_RELOCATED      1U << 0
>>>>
>>>> Such macro expansions need parenthesizing.
>>>
>>> Ack.
>>>
>>>>> +    uint32_t flags;
>>>>> +    uint32_t headroom;
>>>>> +};
>>>>
>>>> Since you're not following any external spec, on top of what Julien
>>>> said about the __packed attribute I'd also like to point out that
>>>> in many cases here there's no need to use fixed-width types.
>>>
>>> Oh, I forgot to mention that in the reply to Julien. Yes, the __packed
>>> is needed to correctly cross the 32bit to 64bit bridge from the x86
>>> bootstrap in patch 4.
>>
>> I'm afraid I don't follow you here. I did briefly look at patch 4 (but
>> that really also falls in the "wants to be split" category), but I
>> can't see why a purely internally used struct may need packing. I'd
>> appreciate if you could expand on that.
> 
> Originally, patch 3 and patch 4 were a single patch, and obviously was
> way too large. To split them, I realized I could introduce a temporary
> conversion function that would allow the patch to be split into a post
> start_xen() patch (patch 3) and a pre start_xen() patch, (patch 4). For
> x86, pre start_xen() consists of 3 different entry points. These being
> the classic/traditional/old multiboot1/2 entry, EFI entry, and PVH entry
> (aka Xen Guest). The latter two are all internal, 64bit, but the former
> is located in arch/x86/boot and is compiled as 32bit. I tried different
> approaches to support using a single header between these two
> environments. Ultimately, IMHO, the cleanest approach is what is
> introduced in patch 4 as it enabled the use of Xen types in the
> structures and maintain a single structure that need to be passed
> around. To do this, a 32bit specific version of the structures were
> defined in arch/x86/boot/boot_info32.h that is populated under 32bit
> mode, then they can be fixed up after getting into start_xen() and in
> 64bit code. To ensure no unexpected insertion of padding, I focused on
> ensuring everything was 32bit aligned and packed. As Julien pointed out,
> I messed up with the use of enum as its size is not guaranteed as the
> enum list grows and I forgot to consider keeping pointers 64bit aligned.
> 
> Does that help?

It helps as background info, yes, but I continue to be unhappy with the
new uses of the __packed attribute.

>>>>> +struct __packed arch_boot_info {
>>>>> +    uint32_t flags;
>>>>> +#define BOOTINFO_FLAG_X86_MEMLIMITS  	1U << 0
>>>>> +#define BOOTINFO_FLAG_X86_BOOTDEV    	1U << 1
>>>>> +#define BOOTINFO_FLAG_X86_CMDLINE    	1U << 2
>>>>> +#define BOOTINFO_FLAG_X86_MODULES    	1U << 3
>>>>> +#define BOOTINFO_FLAG_X86_AOUT_SYMS  	1U << 4
>>>>> +#define BOOTINFO_FLAG_X86_ELF_SYMS   	1U << 5
>>>>> +#define BOOTINFO_FLAG_X86_MEMMAP     	1U << 6
>>>>> +#define BOOTINFO_FLAG_X86_DRIVES     	1U << 7
>>>>> +#define BOOTINFO_FLAG_X86_BIOSCONFIG 	1U << 8
>>>>> +#define BOOTINFO_FLAG_X86_LOADERNAME 	1U << 9
>>>>> +#define BOOTINFO_FLAG_X86_APM        	1U << 10
>>>>> +
>>>>> +    bool xen_guest;
>>>>
>>>> As the example of this, with just the header files being introduced
>>>> here it is not really possible to figure what these fields are to
>>>> be used for and hence whether they're legitimately represented here.
>>>
>>> I can add a comment to clarify these are a mirror of the multiboot
>>> flags. These were mirrored to allow the multiboot flags to be direct
>>> copied and eased the replacement locations where an mb flag is checked.
>>
>> Multiboot flags? The context here is the "xen_guest" field.
> 
> Apologies, I thought you were referring to all the fields and I forgot
> to explain xen_guest. So to clarify, flags is to carry the MB flags
> passed up from the MB entry point and xen_guest is meant to carry the
> xen_guest bool passed up from the PVH/Xen Guest entry point.

That was my guess, but then my request stands: The fields should be
added to the struct at the time they're being made use of.

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 05/18] x86: refactor xen cmdline into general framework
  2022-07-22 13:12     ` Daniel P. Smith
@ 2022-07-25  7:09       ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-25  7:09 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 22.07.2022 15:12, Daniel P. Smith wrote:
> On 7/19/22 09:26, Jan Beulich wrote:
>> On 06.07.2022 23:04, Daniel P. Smith wrote:
>>> --- a/xen/include/xen/bootinfo.h
>>> +++ b/xen/include/xen/bootinfo.h
>>> @@ -53,6 +53,17 @@ struct __packed boot_info {
>>>  
>>>  extern struct boot_info *boot_info;
>>>  
>>> +static inline char *bootinfo_prepare_cmdline(struct boot_info *bi)
>>> +{
>>> +    bi->cmdline = arch_bootinfo_prepare_cmdline(bi->cmdline, bi->arch);
>>> +
>>> +    if ( *bi->cmdline == ' ' )
>>> +        printk(XENLOG_WARNING "%s: leading whitespace left on cmdline\n",
>>> +               __func__);
>>
>> Just a remark and a question on this one: I don't view the use of
>> __func__ here (and in fact in many other cases as well) as very
>> useful. And why do we need such a warning all of the sudden in the
>> first place?
> 
> This started as just a debug print, thus why __func__ is in place, but
> later decided to leave it. This is because after this point, the code
> assumes that all leading space was stripped, but there was never a check
> that logic did the job correct. I don't believe a failure to do so
> warranted a halt to the boot process, but at least provide a warning
> into the log should the trimming fail. Doing so would allow an admin to
> have a clue should an unexpected behavior occur as a result of leading
> space making it through and breaking the fully trimmed assumption made
> elsewhere.

All fine, but then such a change wants doing on its own, not in the middle
of pretty involved refactoring work.

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 09/18] x86: introduce abstractions for domain builder
  2022-07-06 21:04 ` [PATCH v1 09/18] x86: introduce abstractions for domain builder Daniel P. Smith
@ 2022-07-26 14:22   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-26 14:22 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- /dev/null
> +++ b/xen/arch/x86/include/asm/bootdomain.h
> @@ -0,0 +1,30 @@
> +#ifndef __ARCH_X86_BOOTDOMAIN_H__
> +#define __ARCH_X86_BOOTDOMAIN_H__
> +
> +struct memsize {
> +    long nr_pages;
> +    unsigned int percent;
> +    bool minus;
> +};
> +
> +static inline bool memsize_gt_zero(const struct memsize *sz)
> +{
> +    return !sz->minus && sz->nr_pages;
> +}
> +
> +static inline unsigned long get_memsize(
> +    const struct memsize *sz, unsigned long avail)
> +{
> +    unsigned long pages;
> +
> +    pages = sz->nr_pages + sz->percent * avail / 100;
> +    return sz->minus ? avail - pages : pages;
> +}

For both functions I think you should retain the __init, just in case
the compiler decides against actually inlining them (according to my
observations Clang frequently won't).

> +struct arch_domain_mem {
> +    struct memsize mem_size;
> +    struct memsize mem_min;
> +    struct memsize mem_max;
> +};

How come this is introduced here without the three respective Dom0
variables being replaced by an instance of this struct? At which
point a further question would be: What about dom0_mem_set?

> --- /dev/null
> +++ b/xen/include/xen/bootdomain.h
> @@ -0,0 +1,52 @@
> +#ifndef __XEN_BOOTDOMAIN_H__
> +#define __XEN_BOOTDOMAIN_H__
> +
> +#include <xen/bootinfo.h>
> +#include <xen/types.h>
> +
> +#include <public/xen.h>
> +#include <asm/bootdomain.h>
> +
> +struct domain;

Why the forward decl? There's no function being declared here, and
this is not C++.

> +struct boot_domain {
> +#define BUILD_PERMISSION_NONE          (0)
> +#define BUILD_PERMISSION_CONTROL       (1 << 0)
> +#define BUILD_PERMISSION_HARDWARE      (1 << 1)
> +    uint32_t permissions;

Why a fixed width type? And why no 'u' suffixes on the 1s being left
shifted above? (Same further down from here.)
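
To spell out the concern with a small sketch (the DEMO_* names are
stand-ins, not the macros from the patch): with a 32-bit int, 1 << 31
shifts into the sign bit of a signed type, which C leaves undefined,
whereas 1U << 31 is well defined and fits the uint32_t field:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned constants: bit 31 is representable, so the shift is well defined. */
#define DEMO_FUNCTION_XENSTORE    (1U << 30)
#define DEMO_FUNCTION_INITIAL_DOM (1U << 31)

static_assert(DEMO_FUNCTION_INITIAL_DOM == UINT32_C(0x80000000),
              "bit 31 must be representable in a uint32_t");

/* Combine the two top bits the way a flags field would hold them. */
static uint32_t demo_initial_functions(void)
{
    return DEMO_FUNCTION_XENSTORE | DEMO_FUNCTION_INITIAL_DOM;
}
```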

> +#define BUILD_FUNCTION_NONE            (0)
> +#define BUILD_FUNCTION_BOOT            (1 << 0)
> +#define BUILD_FUNCTION_CRASH           (1 << 1)
> +#define BUILD_FUNCTION_CONSOLE         (1 << 2)
> +#define BUILD_FUNCTION_STUBDOM         (1 << 3)
> +#define BUILD_FUNCTION_XENSTORE        (1 << 30)
> +#define BUILD_FUNCTION_INITIAL_DOM     (1 << 31)
> +    uint32_t functions;
> +                                                /* On     | Off    */
> +#define BUILD_MODE_PARAVIRTUALIZED     (1 << 0) /* PV     | PVH/HVM */
> +#define BUILD_MODE_ENABLE_DEVICE_MODEL (1 << 1) /* HVM    | PVH     */
> +#define BUILD_MODE_LONG                (1 << 2) /* 64 BIT | 32 BIT  */

I guess bitness would better not be a boolean-like value (and "LONG"
is kind of odd anyway) - see RISC-V having provisions right away for
128-bit mode.

> --- /dev/null
> +++ b/xen/include/xen/domain_builder.h
> @@ -0,0 +1,55 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef XEN_DOMAIN_BUILDER_H
> +#define XEN_DOMAIN_BUILDER_H
> +
> +#include <xen/bootdomain.h>
> +#include <xen/bootinfo.h>
> +
> +#include <asm/setup.h>
> +
> +struct domain_builder {
> +    bool fdt_enabled;
> +#define BUILD_MAX_BOOT_DOMAINS 64
> +    uint16_t nr_doms;
> +    struct boot_domain domains[BUILD_MAX_BOOT_DOMAINS];
> +
> +    struct arch_domain_builder *arch;
> +};
> +
> +static inline bool builder_is_initdom(struct boot_domain *bd)

const wherever possible, please.

> +{
> +    return bd->functions & BUILD_FUNCTION_INITIAL_DOM;
> +}
> +
> +static inline bool builder_is_ctldom(struct boot_domain *bd)
> +{
> +    return (bd->functions & BUILD_FUNCTION_INITIAL_DOM ||
> +            bd->permissions & BUILD_PERMISSION_CONTROL );

Please parenthesize the operands of &, |, or ^ inside && or ||.
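
A sketch of the shape I have in mind, also covering the const remark
above (struct demo_boot_domain and the DEMO_* constants are stand-ins,
not the definitions from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PERMISSION_CONTROL   (1U << 0)
#define DEMO_FUNCTION_INITIAL_DOM (1U << 31)

/* Stand-in carrying only the two fields the predicate reads. */
struct demo_boot_domain {
    uint32_t permissions;
    uint32_t functions;
};

/* Pointer-to-const parameter; each & operand of || is parenthesized. */
static inline bool demo_is_ctldom(const struct demo_boot_domain *bd)
{
    return (bd->functions & DEMO_FUNCTION_INITIAL_DOM) ||
           (bd->permissions & DEMO_PERMISSION_CONTROL);
}

/* Quick self-check of the three interesting cases. */
static void demo_check_ctldom(void)
{
    const struct demo_boot_domain ctl  = { .permissions = DEMO_PERMISSION_CONTROL };
    const struct demo_boot_domain init = { .functions = DEMO_FUNCTION_INITIAL_DOM };
    const struct demo_boot_domain none = { 0 };

    assert(demo_is_ctldom(&ctl));
    assert(demo_is_ctldom(&init));
    assert(!demo_is_ctldom(&none));
}
```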

> +}
> +
> +static inline bool builder_is_hwdom(struct boot_domain *bd)
> +{
> +    return (bd->functions & BUILD_FUNCTION_INITIAL_DOM ||
> +            bd->permissions & BUILD_PERMISSION_HARDWARE );
> +}
> +
> +static inline struct domain *builder_get_hwdom(struct boot_info *info)
> +{
> +    int i;

unsigned int please when the value can't go negative.

> +    for ( i = 0; i < info->builder->nr_doms; i++ )
> +    {
> +        struct boot_domain *d = &info->builder->domains[i];
> +
> +        if ( builder_is_hwdom(d) )
> +            return d->domain;
> +    }
> +
> +    return NULL;
> +}
> +
> +void builder_init(struct boot_info *info);
> +uint32_t builder_create_domains(struct boot_info *info);

Both for these and for the inline functions - how is one to judge they
are (a) needed and (b) fit their purpose without seeing even a single
caller? And for the prototypes not even the implementation is there:
What's wrong with adding those at the time they're actually implemented
(and hopefully also used)?

Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v1 10/18] x86: introduce the domain builder
  2022-07-06 21:04 ` [PATCH v1 10/18] x86: introduce the " Daniel P. Smith
  2022-07-18 13:59   ` Smith, Jackson
@ 2022-07-26 14:46   ` Jan Beulich
  1 sibling, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-26 14:46 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> This commit introduces the domain builder configuration FDT parser along with
> the domain builder core for domain creation. To enable domain builder to be a
> cross architecture internal API, a new arch domain creation call is introduced
> for use by the domain builder.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
> ---
>  xen/arch/x86/setup.c               |   9 +
>  xen/common/Makefile                |   1 +
>  xen/common/domain-builder/Makefile |   2 +
>  xen/common/domain-builder/core.c   |  96 ++++++++++
>  xen/common/domain-builder/fdt.c    | 295 +++++++++++++++++++++++++++++
>  xen/common/domain-builder/fdt.h    |   7 +
>  xen/include/xen/bootinfo.h         |  16 ++
>  xen/include/xen/domain_builder.h   |   1 +
>  8 files changed, 427 insertions(+)

With this diffstat - why the x86: prefix in the subject?

Also note the naming inconsistency: domain-builder/ (preferred) vs
domain_builder.h (adjustment would require touching earlier patches).

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1,4 +1,6 @@
> +#include <xen/bootdomain.h>
>  #include <xen/bootinfo.h>
> +#include <xen/domain_builder.h>
>  #include <xen/init.h>
>  #include <xen/lib.h>
>  #include <xen/err.h>
> @@ -826,6 +828,13 @@ static struct domain *__init create_dom0(const struct boot_info *bi)
>      return d;
>  }
>  
> +void __init arch_create_dom(
> +    const struct boot_info *bi, struct boot_domain *bd)
> +{
> +    if ( builder_is_initdom(bd) )
> +        create_dom0(bi);
> +}

You're not removing any code in exchange - is Dom0 now being built twice?
Or is the function above effectively dead code?

> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -72,6 +72,7 @@ extra-y := symbols-dummy.o
>  obj-$(CONFIG_COVERAGE) += coverage/
>  obj-y += sched/
>  obj-$(CONFIG_UBSAN) += ubsan/
> +obj-y += domain-builder/

At least as long as all of this is still experimental I would really like
to see a way to disable all of it via Kconfig.

> --- /dev/null
> +++ b/xen/common/domain-builder/core.c
> @@ -0,0 +1,96 @@
> +#include <xen/bootdomain.h>
> +#include <xen/bootinfo.h>
> +#include <xen/domain_builder.h>
> +#include <xen/init.h>
> +#include <xen/types.h>
> +
> +#include <asm/bzimage.h>
> +#include <asm/setup.h>
> +
> +#include "fdt.h"
> +
> +static struct domain_builder __initdata builder;
> +
> +void __init builder_init(struct boot_info *info)
> +{
> +    struct boot_domain *d = NULL;
> +
> +    info->builder = &builder;
> +
> +    if ( IS_ENABLED(CONFIG_BUILDER_FDT) )
> +    {
> +        /* fdt is required to be module 0 */
> +        switch ( check_fdt(info, __va(info->mods[0].start)) )

Besides requiring fixed order looking inflexible to me, what guarantees
there is at least one module? (Perhaps there is, but once again -
without seeing where this function is being called from, how am I to
judge?)

> +        {
> +        case 0:
> +            printk("Domain Builder: initialized from config\n");
> +            info->builder->fdt_enabled = true;
> +            return;
> +        case -EINVAL:
> +            info->builder->fdt_enabled = false;
> +            break;

Aiui this is the case where no FDT is present. I'd strongly suggest
to use a less common / ambiguous error code to cover that case. Maybe
-ENODEV or -EOPNOTSUPP or ...

> +        case -ENODATA:

... -ENODATA, albeit you having that here suggests this has some
other specific meaning already.

> +        default:
> +            panic("%s: error occured processing DTB\n", __func__);
> +        }
> +    }
> +
> +    /*
> +     * No FDT config support or an FDT wasn't present, do an initial
> +     * domain construction
> +     */
> +    printk("Domain Builder: falling back to initial domain build\n");
> +    info->builder->nr_doms = 1;
> +    d = &info->builder->domains[0];
> +
> +    d->mode = opt_dom0_pvh ? 0 : BUILD_MODE_PARAVIRTUALIZED;
> +
> +    d->kernel = &info->mods[0];
> +    d->kernel->kind = BOOTMOD_KERNEL;
> +
> +    d->permissions = BUILD_PERMISSION_CONTROL | BUILD_PERMISSION_HARDWARE;
> +    d->functions = BUILD_FUNCTION_CONSOLE | BUILD_FUNCTION_XENSTORE |
> +                     BUILD_FUNCTION_INITIAL_DOM;

Nit: Indentation.

> +    d->kernel->arch->headroom = bzimage_headroom(bootstrap_map(d->kernel),
> +                                                   d->kernel->size);

bzimage isn't an arch-agnostic concept afaict, so I don't see this
function legitimately being called from here.

And nit again: Indentation. (And at least one more further down.)

> +    bootstrap_map(NULL);
> +
> +    if ( d->kernel->string.len )
> +        d->kernel->string.kind = BOOTSTR_CMDLINE;
> +}
> +
> +uint32_t __init builder_create_domains(struct boot_info *info)
> +{
> +    uint32_t build_count = 0, functions_built = 0;
> +    int i;
> +
> +    for ( i = 0; i < info->builder->nr_doms; i++ )
> +    {
> +        struct boot_domain *d = &info->builder->domains[i];

Can variables of this type please not be named "d", but e.g. "bd"?

> +        if ( ! IS_ENABLED(CONFIG_MULTIDOM_BUILDER) &&
> +             ! builder_is_initdom(d) &&

Nit: Stray blanks after ! .

> --- /dev/null
> +++ b/xen/common/domain-builder/fdt.c
> @@ -0,0 +1,295 @@
> +#include <xen/bootdomain.h>
> +#include <xen/bootinfo.h>
> +#include <xen/domain_builder.h>
> +#include <xen/fdt.h>
> +#include <xen/init.h>
> +#include <xen/lib.h>
> +#include <xen/libfdt/libfdt.h>
> +#include <xen/page-size.h>
> +#include <xen/pfn.h>
> +#include <xen/types.h>
> +
> +#include <asm/bzimage.h>
> +#include <asm/setup.h>
> +
> +#include "fdt.h"
> +
> +#define BUILDER_FDT_TARGET_UNK 0
> +#define BUILDER_FDT_TARGET_X86 1
> +#define BUILDER_FDT_TARGET_ARM 2
> +static int __initdata target_arch = BUILDER_FDT_TARGET_UNK;
> +
> +static struct boot_module *read_module(
> +    const void *fdt, int node, uint32_t address_cells, uint32_t size_cells,
> +    struct boot_info *info)
> +{
> +    const struct fdt_property *prop;
> +    const __be32 *cell;
> +    struct boot_module *bm;
> +    bootmodule_kind kind = BOOTMOD_UNKNOWN;
> +    int len;
> +
> +    if ( device_tree_node_compatible(fdt, node, "module,kernel") )
> +        kind = BOOTMOD_KERNEL;
> +
> +    if ( device_tree_node_compatible(fdt, node, "module,ramdisk") )
> +        kind = BOOTMOD_RAMDISK;
> +
> +    if ( device_tree_node_compatible(fdt, node, "module,microcode") )
> +        kind = BOOTMOD_UCODE;
> +
> +    if ( device_tree_node_compatible(fdt, node, "module,xsm-policy") )
> +        kind = BOOTMOD_XSM;
> +
> +    if ( device_tree_node_compatible(fdt, node, "module,config") )
> +        kind = BOOTMOD_GUEST_CONF;
> +
> +    if ( device_tree_node_compatible(fdt, node, "module,index") )
> +    {
> +        uint32_t idx;
> +
> +        idx = (uint32_t)device_tree_get_u32(fdt, node, "module-index", 0);

Why the cast?

> +static int process_domain_node(

__init?

> +    const void *fdt, int node, const char *name, int depth,
> +    uint32_t address_cells, uint32_t size_cells, void *data)
> +{
> +    struct boot_info *info = (struct boot_info *)data;
> +    const struct fdt_property *prop;
> +    struct boot_domain *domain;
> +    int node_next, i, plen;
> +
> +    if ( !info )
> +        return -1;
> +
> +    if ( info->builder->nr_doms >= BUILD_MAX_BOOT_DOMAINS )
> +        return -1;
> +
> +    domain = &info->builder->domains[info->builder->nr_doms];
> +
> +    domain->domid = (domid_t)device_tree_get_u32(fdt, node, "domid", 0);
> +    domain->permissions = device_tree_get_u32(fdt, node, "permissions", 0);
> +    domain->functions = device_tree_get_u32(fdt, node, "functions", 0);
> +    domain->mode = device_tree_get_u32(fdt, node, "mode", 0);
> +
> +    prop = fdt_get_property(fdt, node, "domain-uuid", &plen);
> +    if ( prop )
> +        for ( i=0; i < sizeof(domain->uuid) % sizeof(uint32_t); i++ )
> +            *(domain->uuid + i) = fdt32_to_cpu((uint32_t)prop->data[i]);
> +
> +    domain->ncpus = device_tree_get_u32(fdt, node, "cpus", 1);
> +
> +    if ( target_arch == BUILDER_FDT_TARGET_X86 )
> +    {
> +        prop = fdt_get_property(fdt, node, "memory", &plen);
> +        if ( prop )
> +        {
> +            int sz = fdt32_to_cpu(prop->len);
> +            char s[64];
> +            unsigned long val;
> +
> +            if ( sz >= 64 )
> +                panic("node %s invalid `memory' property\n", name);
> +
> +            memcpy(s, prop->data, sz);
> +            s[sz] = '\0';
> +            val = parse_size_and_unit(s, NULL);
> +
> +            domain->meminfo.mem_size.nr_pages = PFN_UP(val);
> +            domain->meminfo.mem_max.nr_pages = PFN_UP(val);
> +        }
> +        else
> +            panic("node %s missing `memory' property\n", name);
> +    }
> +    else
> +            panic("%s: only x86 memory parsing supported\n", __func__);
> +
> +    prop = fdt_get_property(fdt, node, "security-id",
> +                                &plen);
> +    if ( prop )
> +    {
> +        int sz = fdt32_to_cpu(prop->len);
> +        sz = sz > BUILD_MAX_SECID_LEN ?  BUILD_MAX_SECID_LEN : sz;
> +        memcpy(domain->secid, prop->data, sz);
> +    }
> +
> +    for ( node_next = fdt_first_subnode(fdt, node);
> +          node_next > 0;
> +          node_next = fdt_next_subnode(fdt, node_next))
> +    {
> +        struct boot_module *bm = read_module(fdt, node_next, address_cells,
> +                                             size_cells, info);
> +
> +        switch ( bm->kind )
> +        {
> +        case BOOTMOD_KERNEL:
> +            /* kernel was already found */
> +            if ( domain->kernel != NULL )
> +                continue;
> +
> +            bm->arch->headroom = bzimage_headroom(bootstrap_map(bm), bm->size);
> +            bootstrap_map(NULL);
> +
> +            if ( bm->string.len )
> +                bm->string.kind = BOOTSTR_CMDLINE;
> +            else
> +            {
> +                prop = fdt_get_property(fdt, node_next, "bootargs", &plen);
> +                if ( prop )
> +                {
> +                    int size = fdt32_to_cpu(prop->len);
> +                    size = size > BOOTMOD_MAX_STRING ?
> +                           BOOTMOD_MAX_STRING : size;
> +                    memcpy(bm->string.bytes, prop->data, size);
> +                    bm->string.kind = BOOTSTR_CMDLINE;
> +                }
> +            }
> +
> +            domain->kernel = bm;
> +
> +            break;
> +        case BOOTMOD_RAMDISK:
> +            /* ramdisk was already found */
> +            if ( domain->ramdisk != NULL )
> +                continue;
> +
> +            domain->ramdisk = bm;
> +
> +            break;
> +        case BOOTMOD_GUEST_CONF:
> +            /* guest config was already found */
> +            if ( domain->configs[BUILD_DOM_CONF_IDX] != NULL )
> +                continue;
> +
> +            domain->configs[BUILD_DOM_CONF_IDX] = bm;
> +
> +            break;
> +        default:
> +            continue;
> +        }

For larger switch() statements please have blank lines between non-fall-
through case blocks.

> +/* check_fdt
> + *   Attempts to initialize hyperlaunch config
> + *
> + * Returns:
> + *    -EINVAL: Not a valid DTB
> + *   -ENODATA: Valid DTB but not a valid hyperlaunch device tree
> + *          0: Valid hyperlaunch device tree
> + */
> +int __init check_fdt(struct boot_info *info, void *fdt)
> +{
> +    int hv_node, ret;
> +
> +    ret = fdt_check_header(fdt);
> +    if ( ret < 0 )
> +        return -EINVAL;
> +
> +    hv_node = fdt_path_offset(fdt, "/chosen/hypervisor");
> +    if ( hv_node < 0 )
> +        return -ENODATA;
> +
> +    if ( !device_tree_node_compatible(fdt, hv_node, "hypervisor,xen") )
> +        return -EINVAL;
> +
> +    if ( IS_ENABLED(CONFIG_X86) &&
> +         device_tree_node_compatible(fdt, hv_node, "xen,x86") )
> +        target_arch = BUILDER_FDT_TARGET_X86;
> +    else if ( IS_ENABLED(CONFIG_ARM) &&
> +              device_tree_node_compatible(fdt, hv_node, "xen,arm") )
> +        target_arch = BUILDER_FDT_TARGET_ARM;
> +
> +    if ( target_arch != BUILDER_FDT_TARGET_X86 &&
> +         target_arch != BUILDER_FDT_TARGET_ARM )
> +        return -EINVAL;

So you'd happily accept BUILDER_FDT_TARGET_ARM on x86 or
BUILDER_FDT_TARGET_X86 on Arm? And there's no distinction between
Arm32 and Arm64?
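
(Illustration only: a standalone sketch, not Xen code, of one way the
check could accept just the architecture the binary was built for.
BUILD_IS_X86, BUILD_IS_ARM, enum fdt_target and select_target() are
hypothetical stand-ins for IS_ENABLED(CONFIG_X86), IS_ENABLED(CONFIG_ARM),
the BUILDER_FDT_TARGET_* values and the compatible-string tests. The
Arm32 vs Arm64 question is deliberately left open here.)

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum fdt_target { TARGET_NONE, TARGET_X86, TARGET_ARM };

/* Hypothetical build-time switches; exactly one would be 1 in a real build. */
#define BUILD_IS_X86 1
#define BUILD_IS_ARM 0

/*
 * Select the target only if the tree is marked compatible with the
 * architecture this binary was built for. Anything else is an error,
 * and *out is left untouched in that case.
 */
static int select_target(bool compat_x86, bool compat_arm,
                         enum fdt_target *out)
{
    assert(out != NULL);

    if ( BUILD_IS_X86 && compat_x86 )
        *out = TARGET_X86;
    else if ( BUILD_IS_ARM && compat_arm )
        *out = TARGET_ARM;
    else
        return -EINVAL;

    return 0;
}
```

With this shape no separate "is it one of the known targets" test is
needed afterwards, because the error is returned at the point where no
match was found.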

> --- /dev/null
> +++ b/xen/common/domain-builder/fdt.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef COMMON_BUILDER_FDT_H
> +#define COMMON_BUILDER_FDT_H
> +
> +int __init check_fdt(struct boot_info *info, void *fdt);
> +#endif

Nit: Please put another blank line before #endif.

Jan



* Re: [PATCH v1 11/18] x86: initial conversion to domain builder
  2022-07-06 21:04 ` [PATCH v1 11/18] x86: initial conversion to " Daniel P. Smith
@ 2022-07-26 15:01   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-26 15:01 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	xen-devel, Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> This commit is the first step in adopting domain builder. It goes through the
> dom0 creation and construction functions, converting them over to consume
> struct boot_domain and changes the startup sequence to use the domain builder
> to create and construct dom0.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
> ---
>  xen/arch/x86/dom0_build.c             |  30 +++----
>  xen/arch/x86/hvm/dom0_build.c         |  10 +--
>  xen/arch/x86/include/asm/dom0_build.h |   8 +-
>  xen/arch/x86/include/asm/setup.h      |   5 +-
>  xen/arch/x86/pv/dom0_build.c          |  39 ++++-----
>  xen/arch/x86/setup.c                  | 114 +++++++++++++++-----------
>  6 files changed, 109 insertions(+), 97 deletions(-)
> 
> diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
> index e44f7f3c43..216c9e3590 100644
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -6,6 +6,7 @@
>  
>  #include <xen/bootdomain.h>
>  #include <xen/bootinfo.h>
> +#include <xen/domain_builder.h>
>  #include <xen/init.h>
>  #include <xen/iocap.h>
>  #include <xen/libelf.h>
> @@ -556,31 +557,32 @@ int __init dom0_setup_permissions(struct domain *d)
>      return rc;
>  }
>  
> -int __init construct_dom0(
> -    struct domain *d, const struct boot_module *image,
> -    struct boot_module *initrd, char *cmdline)
> +int __init construct_domain(struct boot_domain *bd)
>  {
> -    int rc;
> +    int rc = 0;
>  
>      /* Sanity! */
> -    BUG_ON(!pv_shim && d->domain_id != 0);
> -    BUG_ON(d->vcpu[0] == NULL);
> -    BUG_ON(d->vcpu[0]->is_initialised);
> +    BUG_ON(!pv_shim && bd->domid != 0);
> +    BUG_ON(bd->domain->vcpu[0] == NULL);
> +    BUG_ON(bd->domain->vcpu[0]->is_initialised);
>  
>      process_pending_softirqs();
>  
> -    if ( is_hvm_domain(d) )
> -        rc = dom0_construct_pvh(d, image, initrd, cmdline);
> -    else if ( is_pv_domain(d) )
> -        rc = dom0_construct_pv(d, image, initrd, cmdline);
> -    else
> -        panic("Cannot construct Dom0. No guest interface available\n");
> +    if ( builder_is_initdom(bd) )
> +    {
> +        if ( is_hvm_domain(bd->domain) )
> +            rc = dom0_construct_pvh(bd);
> +        else if ( is_pv_domain(bd->domain) )
> +            rc = dom0_construct_pv(bd);
> +        else
> +            panic("Cannot construct Dom0. No guest interface available\n");
> +    }

Isn't there an "else" missing here, even if just to ASSERT_UNREACHABLE()
and set rc to, say, -EOPNOTSUPP?
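
(Illustration only: a standalone sketch, not Xen code, of the suggested
shape. struct bd_sketch, build_pv() and build_pvh() are hypothetical
stand-ins for struct boot_domain, dom0_construct_pv() and
dom0_construct_pvh(); Xen's ASSERT_UNREACHABLE() is only mentioned in a
comment since it does not exist outside the hypervisor.)

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum guest_kind { GUEST_PV, GUEST_HVM };

/* Hypothetical, cut-down stand-in for struct boot_domain. */
struct bd_sketch {
    bool is_initdom;
    enum guest_kind kind;
};

/* Stubs standing in for the PV and PVH construction functions. */
static int build_pv(const struct bd_sketch *bd)  { (void)bd; return 0; }
static int build_pvh(const struct bd_sketch *bd) { (void)bd; return 0; }

static int construct_domain_sketch(const struct bd_sketch *bd)
{
    int rc;

    assert(bd != NULL);

    if ( bd->is_initdom )
        rc = (bd->kind == GUEST_HVM) ? build_pvh(bd) : build_pv(bd);
    else
        /* Xen proper would also ASSERT_UNREACHABLE() on this path. */
        rc = -EOPNOTSUPP;

    return rc;
}
```

The point is that a non-initial domain no longer falls through with
rc == 0, which would read as success to the caller.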

> @@ -311,11 +310,12 @@ int __init dom0_construct_pv(
>      unsigned long count;
>      struct page_info *page = NULL;
>      start_info_t *si;
> +    struct domain *d = bd->domain;
>      struct vcpu *v = d->vcpu[0];
> -    void *image_base = bootstrap_map(image);
> -    unsigned long image_len = image->size;
> -    void *image_start = image_base + image->arch->headroom;
> -    unsigned long initrd_len = initrd ? initrd->size : 0;
> +    void *image_base = bootstrap_map(bd->kernel);
> +    unsigned long image_len = bd->kernel->size;
> +    void *image_start = image_base + bd->kernel->arch->headroom;
> +    unsigned long initrd_len = bd->ramdisk ? bd->ramdisk->size : 0;
>      l4_pgentry_t *l4tab = NULL, *l4start = NULL;
>      l3_pgentry_t *l3tab = NULL, *l3start = NULL;
>      l2_pgentry_t *l2tab = NULL, *l2start = NULL;
> @@ -355,7 +355,7 @@ int __init dom0_construct_pv(
>      d->max_pages = ~0U;
>  
>      if ( (rc =
> -          bzimage_parse(image_base, &image_start, image->arch->headroom,
> +          bzimage_parse(image_base, &image_start, bd->kernel->arch->headroom,
>                           &image_len)) != 0 )
>          return rc;
>  
> @@ -545,7 +545,7 @@ int __init dom0_construct_pv(
>          initrd_pfn = vinitrd_start ?
>                       (vinitrd_start - v_start) >> PAGE_SHIFT :
>                       domain_tot_pages(d);
> -        initrd_mfn = mfn = mfn_x(initrd->mfn);
> +        initrd_mfn = mfn = mfn_x(bd->ramdisk->mfn);
>          count = PFN_UP(initrd_len);
>          if ( d->arch.physaddr_bitsize &&
>               ((mfn + count - 1) >> (d->arch.physaddr_bitsize - PAGE_SHIFT)) )
> @@ -560,13 +560,13 @@ int __init dom0_construct_pv(
>                      free_domheap_pages(page, order);
>                      page += 1UL << order;
>                  }
> -            memcpy(page_to_virt(page), maddr_to_virt(initrd->start),
> +            memcpy(page_to_virt(page), maddr_to_virt(bd->ramdisk->start),
>                     initrd_len);
> -            mpt_alloc = initrd->start;
> +            mpt_alloc = bd->ramdisk->start;
>              init_domheap_pages(mpt_alloc,
>                                 mpt_alloc + PAGE_ALIGN(initrd_len));
> -            bootmodule_update_mfn(initrd, page_to_mfn(page));
> -            initrd_mfn = mfn_x(initrd->mfn);
> +            bootmodule_update_mfn(bd->ramdisk, page_to_mfn(page));
> +            initrd_mfn = mfn_x(bd->ramdisk->mfn);
>          }
>          else
>          {
> @@ -574,7 +574,7 @@ int __init dom0_construct_pv(
>                  if ( assign_pages(mfn_to_page(_mfn(mfn++)), 1, d, 0) )
>                      BUG();
>          }
> -        initrd->size = 0;
> +        bd->ramdisk->size = 0;

From an abstract pov: Is it legitimate to alter values under bd-> ? I
would have assumed bd and everything under it is r/o at this point
(and could/should be const-qualified).

> @@ -272,6 +271,24 @@ static int __init cf_check parse_acpi_param(const char *s)
>  }
>  custom_param("acpi", parse_acpi_param);
>  
> +void __init arch_builder_apply_cmdline(
> +    struct boot_info *info, struct boot_domain *bd)
> +{
> +    if ( skip_ioapic_setup && !strstr(bd->kernel->string.bytes, "noapic") )
> +        strlcat(bd->kernel->string.bytes, " noapic", MAX_GUEST_CMDLINE);
> +    if ( (strlen(acpi_param) == 0) && acpi_disabled )
> +    {
> +        printk("ACPI is disabled, notifying Domain 0 (acpi=off)\n");
> +        strlcpy(acpi_param, "off", sizeof(acpi_param));
> +    }
> +    if ( (strlen(acpi_param) != 0) &&
> +         !strstr(bd->kernel->string.bytes, "acpi=") )
> +    {
> +        strlcat(bd->kernel->string.bytes, " acpi=", MAX_GUEST_CMDLINE);
> +        strlcat(bd->kernel->string.bytes, acpi_param, MAX_GUEST_CMDLINE);
> +    }
> +}

This duplicates existing code rather than replacing it. How do
you envision the two to remain in sync? Such things should live in
exactly one place imo.

> @@ -816,7 +831,7 @@ static struct domain *__init create_dom0(const struct boot_info *bi)
>          write_cr4(read_cr4() & ~X86_CR4_SMAP);
>      }
>  
> -    if ( construct_dom0(d, image, initrd, cmdline) != 0 )
> +    if ( construct_domain(bd) != 0 )
>          panic("Could not construct domain 0\n");

You leave the log message text in place here, but ...

> @@ -1905,22 +1912,29 @@ void __init noreturn __start_xen(unsigned long bi_p)
>             cpu_has_nx ? XENLOG_INFO : XENLOG_WARNING "Warning: ",
>             cpu_has_nx ? "" : "not ");
>  
> -    initrdidx = bootmodule_next_idx_by_kind(boot_info, BOOTMOD_UNKNOWN, 0);
> -    if ( initrdidx < boot_info->nr_mods )
> -        boot_info->mods[initrdidx].kind = BOOTMOD_RAMDISK;
> -
> -    if ( bootmodule_count_by_kind(boot_info, BOOTMOD_UNKNOWN) > 1 )
> -        printk(XENLOG_WARNING
> -               "Multiple initrd candidates, picking module #%u\n",
> -               initrdidx);
> -
>      /*
> -     * We're going to setup domain0 using the module(s) that we stashed safely
> -     * above our heap. The second module, if present, is an initrd ramdisk.
> +     * Boot description not provided, check to see if there are any remaining
> +     * boot modules, the first one found will be provided as the ramdisk.
>       */
> -    dom0 = create_dom0(boot_info);
> +    if ( ! boot_info->builder->fdt_enabled )
> +    {
> +        initrdidx = bootmodule_next_idx_by_kind(boot_info, BOOTMOD_UNKNOWN, 0);
> +        if ( initrdidx < boot_info->nr_mods )
> +        {
> +            boot_info->builder->domains[0].ramdisk = &boot_info->mods[initrdidx];
> +            boot_info->mods[initrdidx].kind = BOOTMOD_RAMDISK;
> +        }
> +        if ( bootmodule_count_by_kind(boot_info, BOOTMOD_UNKNOWN) > 1 )
> +            printk(XENLOG_WARNING
> +                   "Multiple initrd candidates, picking module #%u\n",
> +                   initrdidx);
> +    }
> +
> +    builder_create_domains(boot_info);
> +
> +    dom0 = builder_get_hwdom(boot_info);
>      if ( !dom0 )
> -        panic("Could not set up DOM0 guest OS\n");
> +        panic("No hardware domain was built\n");

... you change it here, neglecting that in the late-hwdom case what is
being built here is only Dom0, not hwdom. This may also affect the name
of the function that you call.

Jan



* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-19 16:36     ` Daniel P. Smith
@ 2022-07-26 18:07       ` Julien Grall
  2022-07-27  6:12         ` Jan Beulich
  0 siblings, 1 reply; 66+ messages in thread
From: Julien Grall @ 2022-07-26 18:07 UTC (permalink / raw)
  To: Daniel P. Smith, xen-devel, Volodymyr Babchuk, Wei Liu
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Jan Beulich, Stefano Stabellini, Bertrand Marquis,
	Roger Pau Monné

Hi Daniel,

On 19/07/2022 17:36, Daniel P. Smith wrote:
> 
> On 7/15/22 15:16, Julien Grall wrote:
>> Hi Daniel,
>>
>> On 06/07/2022 22:04, Daniel P. Smith wrote:
>>> For x86 the number of allowable multiboot modules varies between the
>>> different
>>> entry points, non-efi boot, pvh boot, and efi boot. In the case of
>>> both Arm and
>>> x86 this value is fixed to values based on generalized assumptions. With
>>> hyperlaunch for x86 and dom0less on Arm, use of static sizes results
>>> in large
>>> allocations compiled into the hypervisor that will go unused by many
>>> use cases.
>>>
>>> This commit introduces a Kconfig variable that is set with sane
>>> defaults based
>>> on configuration selection. This variable is in turn used as the
>>> array size
>>> for the cases where a static allocated array of boot modules is declared.
>>>
>>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>>> Reviewed-by: Christopher Clark <christopher.clark@starlab.io>
>>
>> I am not entirely sure where this reviewed-by is coming from. Is this
>> from internal review?
> 
> Yes.
> 
>> If yes, my recommendation would be to provide the reviewed-by on the
>> mailing list. Ideally, the review should also be done in the open, but I
>> understand some companies wish to do a fully internal review first.
> 
> Since this capability is being jointly developed by Christopher and me,
> with me being the author of the code, Christopher reviewed the code as
> the co-developer. He did so as a second pair of eyes for any obvious
> mistakes and to concur that the implementation was in line with the
> approach the two of us architected. Perhaps a SoB line might be more
> appropriate than an R-b line.
> 
>> At least from a committer perspective, this helps me to know whether the
>> reviewed-by still applies. An example would be if you send a v2, I would
>> not be able to know whether Christopher still agreed on the change.
> 
> If an SoB line is more appropriate, then on the next version I can
> switch it

Thanks for the explanation. To me "signed-off-by" means the person wrote
some code (or sent the patches). So from the above, it sounds more like
Christopher did a review.

So I think it is more suitable for him to provide a reviewed-by. For
follow-ups, my preference would be for Christopher to provide the
reviewed-by on the ML.

If it is too much overhead, I would suggest logging the latest version
Christopher reviewed in the changelog. I usually do:

Changes in vX:
   - Add Christopher's reviewed-by

Or if he will be reviewing every version, just mention it in the cover letter.

>>
>> Please explain in the commit message why the number of modules was
>> bumped from 5 to 9.
> 
> The number of modules was inconsistent between the different entry
> points into __start_xen(). Switching to a Kconfig variable, whose
> default was set to the largest value used across the entry points,
> results in a change for the locations using another value.

Ok. Can you add something like: "For x86, the number of modules is not 
consistent across the code base. Use the maximum"?

> 
> See below for +1 explanation.
> 
>>>      static void __init edd_put_string(u8 *dst, size_t n, const char *src)
>>>    {
>>> diff --git a/xen/arch/x86/guest/xen/pvh-boot.c
>>> b/xen/arch/x86/guest/xen/pvh-boot.c
>>> index 498625eae0..834b1ad16b 100644
>>> --- a/xen/arch/x86/guest/xen/pvh-boot.c
>>> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
>>> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>>>    uint32_t __initdata pvh_start_info_pa;
>>>      static multiboot_info_t __initdata pvh_mbi;
>>> -static module_t __initdata pvh_mbi_mods[8];
>>> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];
>>
>> What's the +1 for?
> 
> I should clarify in the commit message, but the value set in
> CONFIG_NR_BOOTMOD is the max modules that Xen would accept from a
> bootloader. Xen startup code expects to be able to append Xen itself to
> the array. The +1 allocates an additional entry to store Xen in the
> array should a bootloader actually pass CONFIG_NR_BOOTMOD modules to
> Xen. There is an existing comment floating in one of these locations
> that explained it.

This makes sense. So every use of CONFIG_NR_BOOTMOD would end up
requiring +1. Is that correct?

If yes, then I think it would be better to require CONFIG_NR_BOOTMOD to
be at minimum 1. This would reduce the risk of having different array sizes
again. That said, this is x86 code, so the call is for the x86 maintainers.

> 
>>>    static const char *__initdata pvh_loader = "PVH Directboot";
>>>      static void __init convert_pvh_info(multiboot_info_t **mbi,
>>> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
>>> index f08b07b8de..2aa1e28c8f 100644
>>> --- a/xen/arch/x86/setup.c
>>> +++ b/xen/arch/x86/setup.c
>>> @@ -1020,9 +1020,9 @@ void __init noreturn __start_xen(unsigned long
>>> mbi_p)
>>>            panic("dom0 kernel not specified. Check bootloader
>>> configuration\n");
>>>          /* Check that we don't have a silly number of modules. */
>>> -    if ( mbi->mods_count > sizeof(module_map) * 8 )
>>> +    if ( mbi->mods_count > CONFIG_NR_BOOTMODS )
>>>        {
>>> -        mbi->mods_count = sizeof(module_map) * 8;
>>> +        mbi->mods_count = CONFIG_NR_BOOTMODS;
>>>            printk("Excessive multiboot modules - using the first %u
>>> only\n",
>>>                   mbi->mods_count);
>>>        }
>>
>> AFAIU, this check is to make sure that we will not overrun module_map in
>> the next line:
>>
>> bitmap_fill(module_map, mbi->mods_count);
>>
>> The current definition of module_map will allow 64 modules. But you are
>> allowing 32768. So I think you either want to keep the check or define
>> module_map as:
>>
>> DECLARE_BITMAP(module_map, CONFIG_NR_BOOTMODS);
> 
> Yes, in the RFC I had it capped to 64 and lost track of this related
> change when it was bumped to 32768 per the review discussion. Later in
> the series, module_map goes away. To ensure stability at this point I
> would be inclined to restore the 64 module clamp down check. Thoughts?

I don't know what would be a sensible value for x86. I will leave this
question to the x86 maintainers.
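
Whatever value is chosen, the invariant is that the count handed to
bitmap_fill() must not exceed the bitmap's capacity. A standalone sketch
of that clamp (not Xen code; it assumes module_map is a one-word bitmap,
which is what the "64 modules" figure above implies on a 64-bit build,
and NR_BOOTMODS_SKETCH and clamp_mods_count() are hypothetical names):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the Kconfig value discussed above. */
#define NR_BOOTMODS_SKETCH 32768u

/* One-word bitmap: capacity is the number of bits in unsigned long. */
static unsigned long module_map[1];

/* Clamp the bootloader-supplied count to what both limits allow. */
static unsigned int clamp_mods_count(unsigned int mods_count)
{
    unsigned int cap = sizeof(module_map) * CHAR_BIT;

    if ( cap > NR_BOOTMODS_SKETCH )
        cap = NR_BOOTMODS_SKETCH;

    if ( mods_count > cap )
        mods_count = cap;

    /* A later fill of mods_count bits cannot overrun module_map. */
    assert(mods_count <= sizeof(module_map) * CHAR_BIT);

    return mods_count;
}
```

Sizing the bitmap from the Kconfig value instead, as suggested with
DECLARE_BITMAP() above, would make the two limits identical and the
extra comparison unnecessary.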

Cheers,

-- 
Julien Grall



* Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules
  2022-07-26 18:07       ` Julien Grall
@ 2022-07-27  6:12         ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-27  6:12 UTC (permalink / raw)
  To: Julien Grall, Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Stefano Stabellini, Bertrand Marquis, Roger Pau Monné,
	xen-devel, Volodymyr Babchuk, Wei Liu

On 26.07.2022 20:07, Julien Grall wrote:
> On 19/07/2022 17:36, Daniel P. Smith wrote:
>> On 7/15/22 15:16, Julien Grall wrote:
>>> On 06/07/2022 22:04, Daniel P. Smith wrote:
>>>> index 498625eae0..834b1ad16b 100644
>>>> --- a/xen/arch/x86/guest/xen/pvh-boot.c
>>>> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
>>>> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>>>>    uint32_t __initdata pvh_start_info_pa;
>>>>      static multiboot_info_t __initdata pvh_mbi;
>>>> -static module_t __initdata pvh_mbi_mods[8];
>>>> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];
>>>
>>> What's the +1 for?
>>
>> I should clarify in the commit message, but the value set in
>> CONFIG_NR_BOOTMOD is the max modules that Xen would accept from a
>> bootloader. Xen startup code expects to be able to append Xen itself to
>> the array. The +1 allocates an additional entry to store Xen in the
>> array should a bootloader actually pass CONFIG_NR_BOOTMOD modules to
>> Xen. There is an existing comment floating in one of these locations
>> that explained it.
> 
> This makes sense. So every use of CONFIG_NR_BOOTMOD would end up
> requiring +1. Is that correct?
> 
> If yes, then I think it would be better to require CONFIG_NR_BOOTMOD to
> be at minimum 1. This would reduce the risk of having different array sizes
> again. That said, this is x86 code, so the call is for the x86 maintainers.

I think the Kconfig setting should stand for "true" modules. Anywhere that
x86 code internally uses one extra slot this should be expressed by an
explicit "+ 1" imo.
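
(Illustration only: a standalone sketch, not Xen code. The configured
value counts bootloader-provided modules, and the one place that needs
room for Xen itself says so with an explicit "+ 1". NR_BOOTMOD_SKETCH,
struct module_sketch and append_xen() are hypothetical names.)

```c
#include <assert.h>

/* Hypothetical Kconfig value: modules accepted from the bootloader. */
#define NR_BOOTMOD_SKETCH 8

struct module_sketch {
    unsigned long start, size;
};

/* One extra slot so Xen can append itself after a full set of modules. */
static struct module_sketch mods[NR_BOOTMOD_SKETCH + 1];

/* Store Xen's own image after the nr_mods real modules; returns new count. */
static unsigned int append_xen(unsigned int nr_mods, unsigned long start,
                               unsigned long size)
{
    /* nr_mods never exceeds the configured maximum, so the slot exists. */
    assert(nr_mods <= NR_BOOTMOD_SKETCH);

    mods[nr_mods].start = start;
    mods[nr_mods].size = size;

    return nr_mods + 1;
}
```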

>>>>    static const char *__initdata pvh_loader = "PVH Directboot";
>>>>      static void __init convert_pvh_info(multiboot_info_t **mbi,
>>>> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
>>>> index f08b07b8de..2aa1e28c8f 100644
>>>> --- a/xen/arch/x86/setup.c
>>>> +++ b/xen/arch/x86/setup.c
>>>> @@ -1020,9 +1020,9 @@ void __init noreturn __start_xen(unsigned long
>>>> mbi_p)
>>>>            panic("dom0 kernel not specified. Check bootloader
>>>> configuration\n");
>>>>          /* Check that we don't have a silly number of modules. */
>>>> -    if ( mbi->mods_count > sizeof(module_map) * 8 )
>>>> +    if ( mbi->mods_count > CONFIG_NR_BOOTMODS )
>>>>        {
>>>> -        mbi->mods_count = sizeof(module_map) * 8;
>>>> +        mbi->mods_count = CONFIG_NR_BOOTMODS;
>>>>            printk("Excessive multiboot modules - using the first %u
>>>> only\n",
>>>>                   mbi->mods_count);
>>>>        }
>>>
>>> AFAIU, this check is to make sure that we will not overrun module_map in
>>> the next line:
>>>
>>> bitmap_fill(module_map, mbi->mods_count);
>>>
>>> The current definition of module_map will allow 64 modules. But you are
>>> allowing 32768. So I think you either want to keep the check or define
>>> module_map as:
>>>
>>> DECLARE_BITMAP(module_map, CONFIG_NR_BOOTMODS);
>>
>> Yes, in the RFC I had it capped to 64 and lost track of this related
>> change when it was bumped to 32768 per the review discussion. Later in
>> the series, module_map goes away. To ensure stability at this point I
>> would be inclined to restore the 64 module clamp down check. Thoughts?
> 
> I don't know what would be a sensible value for x86. I will leave this
> question to the x86 maintainers.

I guess I'd be fine either way, as long as the code is correct.

Jan



* Re: [PATCH v1 12/18] x86: convert dom0 creation to domain builder
  2022-07-06 21:04 ` [PATCH v1 12/18] x86: convert dom0 creation " Daniel P. Smith
@ 2022-07-27 12:25   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-27 12:25 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	xen-devel, Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- /dev/null
> +++ b/xen/arch/x86/domain_builder.c
> @@ -0,0 +1,128 @@
> +#include <xen/bootdomain.h>
> +#include <xen/bootinfo.h>
> +#include <xen/domain.h>
> +#include <xen/domain_builder.h>
> +#include <xen/err.h>
> +#include <xen/grant_table.h>
> +#include <xen/iommu.h>
> +#include <xen/sched.h>
> +
> +#include <asm/pv/shim.h>
> +#include <asm/setup.h>
> +
> +extern unsigned long cr4_pv32_mask;

Such declarations need to go in a header which both producer and
consumer(s) include.

> +static unsigned int __init dom_max_vcpus(struct boot_domain *bd)
> +{
> +    unsigned int limit;
> +
> +    if ( builder_is_initdom(bd) )
> +        return dom0_max_vcpus();
> +
> +    limit = bd->mode & BUILD_MODE_PARAVIRTUALIZED ?
> +                MAX_VIRT_CPUS : HVM_MAX_VCPUS;

Nit: Indentation.

> +    if ( bd->ncpus > limit )
> +        return limit;
> +    else
> +        return bd->ncpus;

    return min(bd->ncpus, limit);

> +}
> +
> +void __init arch_create_dom(
> +    const struct boot_info *bi, struct boot_domain *bd)
> +{
> +    struct xen_domctl_createdomain dom_cfg = {
> +        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
> +        .max_evtchn_port = -1,
> +        .max_grant_frames = -1,
> +        .max_maptrack_frames = -1,
> +        .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
> +        .max_vcpus = dom_max_vcpus(bd),
> +        .arch = {
> +            .misc_flags = bd->functions & BUILD_FUNCTION_INITIAL_DOM &&
> +                           opt_dom0_msr_relaxed ? XEN_X86_MSR_RELAXED : 0,
> +        },
> +    };
> +    unsigned int is_privileged = 0;

Either this is bool and retains its name, or it remains unsigned int
and changes its name (to e.g. "cdf").

> +    char *cmdline;
> +
> +    if ( bd->kernel == NULL )
> +        panic("Error creating d%uv0\n", bd->domid);

This gives too little information and, by mentioning vCPU 0, is imo
actively misleading.

> +    /* mask out PV and device model bits, if 0 then the domain is PVH */
> +    if ( !(bd->mode &
> +           (BUILD_MODE_PARAVIRTUALIZED|BUILD_MODE_ENABLE_DEVICE_MODEL)) )

Shouldn't you outright reject BUILD_MODE_ENABLE_DEVICE_MODEL, since
you can't fulfill that request?

> +    {
> +        dom_cfg.flags |= (XEN_DOMCTL_CDF_hvm |
> +                         (hvm_hap_supported() ? XEN_DOMCTL_CDF_hap : 0));
> +
> +        /*
> +         * If shadow paging is enabled for the initial domain, mask out
> +         * HAP if it was just enabled.
> +         */
> +        if ( builder_is_initdom(bd) )
> +            if ( opt_dom0_shadow )
> +                dom_cfg.flags |= ~XEN_DOMCTL_CDF_hap;

Please combine such if()s into a single one using &&. And I suppose
you mean &= ? Furthermore - how would a DomU be started without using
HAP when HAP is available?
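
(Illustration only: a standalone sketch, not Xen code, of why the
operator matters. CDF_HVM and CDF_HAP are hypothetical bit values, not
the real XEN_DOMCTL_CDF_* constants.)

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration. */
#define CDF_HVM (UINT32_C(1) << 0)
#define CDF_HAP (UINT32_C(1) << 1)

/* As written in the patch: OR with the complement sets every other bit. */
static uint32_t drop_hap_as_written(uint32_t flags)
{
    flags |= ~CDF_HAP;
    return flags;
}

/* Presumably intended: AND with the complement clears only CDF_HAP. */
static uint32_t drop_hap_intended(uint32_t flags)
{
    flags &= ~CDF_HAP;
    return flags;
}

/* Small self-check showing the difference on the same input. */
static void drop_hap_demo(void)
{
    uint32_t in = CDF_HVM | CDF_HAP;

    /* HAP is cleared, HVM is preserved. */
    assert(drop_hap_intended(in) == CDF_HVM);

    /* HAP is still set, and so is every other bit. */
    assert(drop_hap_as_written(in) == UINT32_MAX);
}
```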

> +        /* TODO: review which flags should be present */
> +        dom_cfg.arch.emulation_flags |=
> +            XEN_X86_EMU_LAPIC | XEN_X86_EMU_IOAPIC | XEN_X86_EMU_VPCI;
> +    }
> +
> +    if ( iommu_enabled && builder_is_hwdom(bd) )
> +        dom_cfg.flags |= XEN_DOMCTL_CDF_iommu;

Why would only hwdom get an IOMMU?

> +    if ( !pv_shim && builder_is_ctldom(bd) )
> +        is_privileged = CDF_privileged;
> +
> +    /* Create initial domain.  Not d0 for pvshim. */

Up to here I was assuming this function would deal with more than just
Dom0, based on conditionals seen. What's the intention? Mixing things
is at best confusing.

> +    bd->domid = get_initial_domain_id();

Higher up in the panic() invocation you did use bd->domid already.

> +    bd->domain = domain_create(bd->domid, &dom_cfg, is_privileged);
> +    if ( IS_ERR(bd->domain) )
> +        panic("Error creating d%u: %ld\n", bd->domid, PTR_ERR(bd->domain));
> +
> +    init_dom0_cpuid_policy(bd->domain);
> +
> +    if ( alloc_dom0_vcpu0(bd->domain) == NULL )
> +        panic("Error creating d%uv0\n", bd->domid);
> +
> +    /* Grab the DOM0 command line. */
> +    cmdline = (bd->kernel->string.kind == BOOTSTR_CMDLINE) ?
> +              bd->kernel->string.bytes : NULL;
> +    if ( cmdline || bi->arch->kextra )
> +    {
> +        char dom_cmdline[MAX_GUEST_CMDLINE];
> +
> +        cmdline = arch_prepare_cmdline(cmdline, bi->arch);
> +        strlcpy(dom_cmdline, cmdline, MAX_GUEST_CMDLINE);
> +
> +        if ( bi->arch->kextra )
> +            /* kextra always includes exactly one leading space. */
> +            strlcat(dom_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);

I don't think the comment belongs here - there's no insertion of a blank
in sight anywhere.

> +        apply_xen_cmdline(dom_cmdline);
> +
> +        strlcpy(bd->kernel->string.bytes, dom_cmdline, MAX_GUEST_CMDLINE);

Further up using MAX_GUEST_CMDLINE is acceptable, because it's easy to see
that this is the array's size. But here this isn't the case - you want to
use ARRAY_SIZE() at least in this one case (ideally everywhere).
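
(Illustration only: a standalone sketch, not Xen code, of deriving the
bound from the destination array. struct boot_string_sketch,
copy_bounded() and set_cmdline() are hypothetical, and copy_bounded()
merely approximates strlcpy() using snprintf() so that the example
builds with any C library.)

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, cut-down stand-in for the module string storage. */
struct boot_string_sketch {
    char bytes[16];
};

/* Bounded copy that always NUL-terminates, truncating if necessary. */
static void copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst_size > 0);
    snprintf(dst, dst_size, "%s", src);
}

static void set_cmdline(struct boot_string_sketch *s, const char *cmdline)
{
    /* The bound comes from the destination itself, not a separate macro. */
    copy_bounded(s->bytes, ARRAY_SIZE(s->bytes), cmdline);
}
```

If the array's size is ever changed, the copy stays correct without
anyone having to remember which constant it was declared with.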

> +    }
> +
> +    /*
> +     * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().

dom0 here, but ...

> +     * This saves a large number of corner cases interactions with
> +     * copy_from_user().
> +     */
> +    if ( cpu_has_smap )
> +    {
> +        cr4_pv32_mask &= ~X86_CR4_SMAP;
> +        write_cr4(read_cr4() & ~X86_CR4_SMAP);
> +    }
> +
> +    if ( construct_domain(bd) != 0 )

... domain here, yet then ...

> +        panic("Could not construct domain 0\n");

... domain 0 again here.

> @@ -745,109 +746,21 @@ static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int li
>      return n;
>  }
>  
> -static struct domain *__init create_dom0(
> -    const struct boot_info *bi, struct boot_domain *bd)
> +void __init apply_xen_cmdline(char *cmdline)
>  {
> -    struct xen_domctl_createdomain dom0_cfg = {
> -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
> -        .max_evtchn_port = -1,
> -        .max_grant_frames = -1,
> -        .max_maptrack_frames = -1,
> -        .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
> -        .max_vcpus = dom0_max_vcpus(),
> -        .arch = {
> -            .misc_flags = opt_dom0_msr_relaxed ? XEN_X86_MSR_RELAXED : 0,
> -        },
> -    };
> -    char *cmdline;
> -
> -    if ( bd->kernel == NULL )
> -        panic("Error creating d%uv0\n", bd->domid);
> -
> -    if ( opt_dom0_pvh )
> -    {
> -        dom0_cfg.flags |= (XEN_DOMCTL_CDF_hvm |
> -                           ((hvm_hap_supported() && !opt_dom0_shadow) ?
> -                            XEN_DOMCTL_CDF_hap : 0));
> -
> -        dom0_cfg.arch.emulation_flags |=
> -            XEN_X86_EMU_LAPIC | XEN_X86_EMU_IOAPIC | XEN_X86_EMU_VPCI;
> -    }
> -
> -    if ( iommu_enabled )
> -        dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> -
> -    /* Create initial domain.  Not d0 for pvshim. */
> -    bd->domid = get_initial_domain_id();
> -    bd->domain = domain_create(bd->domid, &dom0_cfg, pv_shim ?
> -                               0 : CDF_privileged);
> -    if ( IS_ERR(bd->domain) )
> -        panic("Error creating d%u: %ld\n", bd->domid, PTR_ERR(bd->domain));
> -
> -    init_dom0_cpuid_policy(bd->domain);
> -
> -    if ( alloc_dom0_vcpu0(bd->domain) == NULL )
> -        panic("Error creating d%uv0\n", bd->domid);
> -
> -    /* Grab the DOM0 command line. */
> -    cmdline = (bd->kernel->string.kind == BOOTSTR_CMDLINE) ?
> -              bd->kernel->string.bytes : NULL;
> -    if ( cmdline || bi->arch->kextra )
> -    {
> -        char dom0_cmdline[MAX_GUEST_CMDLINE];
> -
> -        cmdline = arch_prepare_cmdline(cmdline, bi->arch);
> -        strlcpy(dom0_cmdline, cmdline, MAX_GUEST_CMDLINE);
> -
> -        if ( bi->arch->kextra )
> -            /* kextra always includes exactly one leading space. */
> -            strlcat(dom0_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
> -
> -        /* Append any extra parameters. */
> -        if ( skip_ioapic_setup && !strstr(dom0_cmdline, "noapic") )
> -            strlcat(dom0_cmdline, " noapic", MAX_GUEST_CMDLINE);
> -        if ( (strlen(acpi_param) == 0) && acpi_disabled )
> -        {
> -            printk("ACPI is disabled, notifying Domain 0 (acpi=off)\n");
> -            strlcpy(acpi_param, "off", sizeof(acpi_param));
> -        }
> -        if ( (strlen(acpi_param) != 0) && !strstr(dom0_cmdline, "acpi=") )
> -        {
> -            strlcat(dom0_cmdline, " acpi=", MAX_GUEST_CMDLINE);
> -            strlcat(dom0_cmdline, acpi_param, MAX_GUEST_CMDLINE);
> -        }
> -
> -        strlcpy(bd->kernel->string.bytes, dom0_cmdline, MAX_GUEST_CMDLINE);
> -    }
> -
> -    /*
> -     * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
> -     * This saves a large number of corner cases interactions with
> -     * copy_from_user().
> -     */
> -    if ( cpu_has_smap )
> +    if ( skip_ioapic_setup && !strstr(cmdline, "noapic") )
> +        strlcat(cmdline, " noapic", MAX_GUEST_CMDLINE);
> +    if ( (strlen(acpi_param) == 0) && acpi_disabled )
>      {
> -        cr4_pv32_mask &= ~X86_CR4_SMAP;
> -        write_cr4(read_cr4() & ~X86_CR4_SMAP);
> +        printk("ACPI is disabled, notifying Domain 0 (acpi=off)\n");
> +        strlcpy(acpi_param, "off", sizeof(acpi_param));
>      }
> -
> -    if ( construct_domain(bd) != 0 )
> -        panic("Could not construct domain 0\n");
> -
> -    if ( cpu_has_smap )
> +    if ( (strlen(acpi_param) != 0) &&
> +         !strstr(cmdline, "acpi=") )
>      {
> -        write_cr4(read_cr4() | X86_CR4_SMAP);
> -        cr4_pv32_mask |= X86_CR4_SMAP;
> +        strlcat(cmdline, " acpi=", MAX_GUEST_CMDLINE);
> +        strlcat(cmdline, acpi_param, MAX_GUEST_CMDLINE);
>      }
> -
> -    return bd->domain;
> -}
> -
> -void __init arch_create_dom(
> -    const struct boot_info *bi, struct boot_domain *bd)
> -{
> -    if ( builder_is_initdom(bd) )
> -        create_dom0(bi, bd);
>  }

Earlier on a function was introduced to deal with this cmdline handling.
And now you introduce a 2nd such function?

Jan



* Re: [PATCH v1 13/18] x86: generalize physmap logic
  2022-07-06 21:04 ` [PATCH v1 13/18] x86: generalize physmap logic Daniel P. Smith
@ 2022-07-27 12:33   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-27 12:33 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	xen-devel, Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> The existing physmap code is specific to dom0.

I think this needs better wording. Either you name the function or you
explain what piece of code you're talking about. "physmap" alone is
just not meaningful enough. (Also applies to the title.)

> --- a/xen/arch/x86/include/asm/dom0_build.h
> +++ b/xen/arch/x86/include/asm/dom0_build.h
> @@ -21,7 +21,7 @@ int dom0_construct_pvh(struct boot_domain *bd);
>  unsigned long dom0_paging_pages(const struct domain *d,
>                                  unsigned long nr_pages);
>  
> -void dom0_update_physmap(bool compat, unsigned long pfn,
> +void dom_update_physmap(bool compat, unsigned long pfn,
>                           unsigned long mfn, unsigned long vphysmap_s);

So my initial inclination was to suggest domain_ as a name prefix,
matching what we have elsewhere. But when we're already giving the
thing a new name, its PV-only nature also wants expressing. Hence
I'd like to suggest pv_update_physmap(). And then please fix
indentation of the continuation lines here and below.

> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/dom0_build.c
> @@ -34,8 +34,8 @@
>  #define L3_PROT (BASE_PROT|_PAGE_DIRTY)
>  #define L4_PROT (BASE_PROT|_PAGE_DIRTY)
>  
> -void __init dom0_update_physmap(bool compat, unsigned long pfn,
> -                                unsigned long mfn, unsigned long vphysmap_s)
> +void __init dom_update_physmap(
> +    bool compat, unsigned long pfn, unsigned long mfn, unsigned long vphysmap_s)
>  {

Personally I dislike this further change to re-flow the parameter
list, as I see no particular reason for doing so.

Jan



* Re: [PATCH v1 14/18] x86: generalize vcpu for domain building
  2022-07-06 21:04 ` [PATCH v1 14/18] x86: generalize vcpu for domain building Daniel P. Smith
@ 2022-07-27 12:46   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-27 12:46 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, Dario Faggioli,
	xen-devel, Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> Here, the vcpu initialization code for dom0 creation is generalized for use for
> other domains.

Yet with "other domains" still only ones created during boot, aiui.
Imo such details want spelling out.

The title also is too generic / imprecise.

> --- a/xen/arch/x86/domain_builder.c
> +++ b/xen/arch/x86/domain_builder.c
> @@ -28,6 +28,18 @@ static unsigned int __init dom_max_vcpus(struct boot_domain *bd)
>          return bd->ncpus;
>  }
>  
> +struct vcpu *__init alloc_dom_vcpu0(struct boot_domain *bd)

domain_alloc_vcpu0()?

> +{
> +    if ( bd->functions & BUILD_FUNCTION_INITIAL_DOM )
> +        return alloc_dom0_vcpu0(bd->domain);
> +
> +    bd->domain->node_affinity = node_online_map;
> +    bd->domain->auto_node_affinity = true;

I can spot neither consumers of nor code being replaced by this.

> +    return vcpu_create(bd->domain, 0);
> +}
> +
> +
>  void __init arch_create_dom(

No double blank lines please.

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -14,6 +14,8 @@
>   */
>  
>  #ifndef COMPAT
> +#include <xen/bootdomain.h>
> +#include <xen/domain_builder.h>
>  #include <xen/init.h>
>  #include <xen/lib.h>
>  #include <xen/param.h>
> @@ -3399,13 +3401,13 @@ void wait(void)
>  }
>  
>  #ifdef CONFIG_X86
> -void __init sched_setup_dom0_vcpus(struct domain *d)
> +void __init sched_setup_dom_vcpus(struct boot_domain *bd)

Perhaps simply drop the original _dom0 infix?

>  {
>      unsigned int i;
>      struct sched_unit *unit;
>  
> -    for ( i = 1; i < d->max_vcpus; i++ )
> -        vcpu_create(d, i);
> +    for ( i = 1; i < bd->domain->max_vcpus; i++ )
> +        vcpu_create(bd->domain, i);

Seeing the further uses below, perhaps better introduce a local variable
"d", like you do elsewhere?

> @@ -3413,19 +3415,24 @@ void __init sched_setup_dom0_vcpus(struct domain *d)
>       * onlining them. This avoids pinning a vcpu to a not yet online cpu here.
>       */
>      if ( pv_shim )
> -        sched_set_affinity(d->vcpu[0]->sched_unit,
> +        sched_set_affinity(bd->domain->vcpu[0]->sched_unit,
>                             cpumask_of(0), cpumask_of(0));
>      else
>      {
> -        for_each_sched_unit ( d, unit )
> +        for_each_sched_unit ( bd->domain, unit )
>          {
> -            if ( !opt_dom0_vcpus_pin && !dom0_affinity_relaxed )
> -                sched_set_affinity(unit, &dom0_cpus, NULL);
> -            sched_set_affinity(unit, NULL, &dom0_cpus);
> +            if ( builder_is_initdom(bd) )
> +            {
> +                if ( !opt_dom0_vcpus_pin && !dom0_affinity_relaxed )
> +                    sched_set_affinity(unit, &dom0_cpus, NULL);
> +                sched_set_affinity(unit, NULL, &dom0_cpus);
> +            }
> +            else
> +                sched_set_affinity(unit, NULL, cpupool_valid_cpus(cpupool0));

Hard-coded cpupool0?

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -2,6 +2,7 @@
>  #ifndef __SCHED_H__
>  #define __SCHED_H__
>  
> +#include <xen/bootdomain.h>

Please don't - this header already has too many dependencies. All you really
need ...

> @@ -1003,7 +1004,7 @@ static inline bool sched_has_urgent_vcpu(void)
>  }
>  
>  void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value);
> -void sched_setup_dom0_vcpus(struct domain *d);
> +void sched_setup_dom_vcpus(struct boot_domain *d);

... for this is a forward declaration of struct boot_domain.
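
For illustration, a minimal sketch of the forward-declaration approach.
The header/file split is indicated by comments only, and
sched_count_extra_vcpus() and the max_vcpus field are hypothetical
stand-ins, not code from the series:

```c
/* sched.h (sketch): no #include of the boot-domain header is needed. */
struct boot_domain;                 /* forward declaration only */

/* A prototype may take a pointer to the still-incomplete type. */
unsigned int sched_count_extra_vcpus(const struct boot_domain *bd);

/* bootdomain.h (sketch): the full definition lives elsewhere. */
struct boot_domain {
    unsigned int max_vcpus;         /* hypothetical field for this demo */
};

/* sched/core.c (sketch): only the definition site needs the full type. */
unsigned int sched_count_extra_vcpus(const struct boot_domain *bd)
{
    /* vcpu 0 is created separately, so only the remainder is counted. */
    return bd->max_vcpus ? bd->max_vcpus - 1 : 0;
}
```

Since the prototype only passes a pointer, the compiler does not need the
structure layout at that point, which is what keeps the header's
dependency list from growing.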

Jan



* Re: [PATCH v1 15/18] x86: rework domain page allocation
  2022-07-06 21:04 ` [PATCH v1 15/18] x86: rework domain page allocation Daniel P. Smith
@ 2022-07-27 13:22   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-27 13:22 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	xen-devel, Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> This reworks all the dom0 page allocation functions for general domain
> construction. Where possible, common logic between the two was split into a
> separate function for reuse by the two functions.

You absolutely need to mention what behavioral / functional changes there
are (intended), even in case it is "none".

> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -320,69 +320,31 @@ static unsigned long __init default_nr_pages(unsigned long avail)
>  }
>  
>  unsigned long __init dom0_compute_nr_pages(
> -    struct domain *d, struct elf_dom_parms *parms, unsigned long initrd_len)
> +    struct boot_domain *bd, struct elf_dom_parms *parms,
> +    unsigned long initrd_len)
>  {
> -    nodeid_t node;
> -    unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 0;
> +    unsigned long avail, nr_pages, min_pages, max_pages;
>  
>      /* The ordering of operands is to work around a clang5 issue. */
>      if ( CONFIG_DOM0_MEM[0] && !dom0_mem_set )
>          parse_dom0_mem(CONFIG_DOM0_MEM);
>  
> -    for_each_node_mask ( node, dom0_nodes )
> -        avail += avail_domheap_pages_region(node, 0, 0) +
> -                 initial_images_nrpages(node);
> -
> -    /* Reserve memory for further dom0 vcpu-struct allocations... */
> -    avail -= (d->max_vcpus - 1UL)
> -             << get_order_from_bytes(sizeof(struct vcpu));
> -    /* ...and compat_l4's, if needed. */
> -    if ( is_pv_32bit_domain(d) )
> -        avail -= d->max_vcpus - 1;
> -
> -    /* Reserve memory for iommu_dom0_init() (rough estimate). */
> -    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
> -    {
> -        unsigned int s;
> -
> -        for ( s = 9; s < BITS_PER_LONG; s += 9 )
> -            iommu_pages += max_pdx >> s;
> -
> -        avail -= iommu_pages;
> -    }
> +    avail = dom_avail_nr_pages(bd, dom0_nodes);
>  
> -    if ( paging_mode_enabled(d) || opt_dom0_shadow || opt_pv_l1tf_hwdom )
> +    /* command line overrides configuration */
> +    if (  dom0_mem_set )

Nit: Stray double blanks.

>      {
> -        unsigned long cpu_pages;
> -
> -        nr_pages = get_memsize(&dom0_size, avail) ?: default_nr_pages(avail);
> -
> -        /*
> -         * Clamp according to min/max limits and available memory
> -         * (preliminary).
> -         */
> -        nr_pages = max(nr_pages, get_memsize(&dom0_min_size, avail));
> -        nr_pages = min(nr_pages, get_memsize(&dom0_max_size, avail));
> -        nr_pages = min(nr_pages, avail);
> -
> -        cpu_pages = dom0_paging_pages(d, nr_pages);
> -
> -        if ( !iommu_use_hap_pt(d) )
> -            avail -= cpu_pages;
> -        else if ( cpu_pages > iommu_pages )
> -            avail -= cpu_pages - iommu_pages;

I can't see any of this represented in the new code. Have you gone through
the history of this code, to understand why things are the way they are,
and hence what (corner) cases need to remain behaviorally unchanged?

> @@ -40,6 +42,106 @@ struct vcpu *__init alloc_dom_vcpu0(struct boot_domain *bd)
>  }
>  
>  
> +unsigned long __init dom_avail_nr_pages(
> +    struct boot_domain *bd, nodemask_t nodes)
> +{
> +    unsigned long avail = 0, iommu_pages = 0;
> +    bool is_ctldom = false, is_hwdom = false;
> +    unsigned long nr_pages = bd->meminfo.mem_size.nr_pages;
> +    nodeid_t node;
> +
> +    if ( builder_is_ctldom(bd) )
> +        is_ctldom = true;
> +    if ( builder_is_hwdom(bd) )
> +        is_hwdom = true;
> +
> +    for_each_node_mask ( node, nodes )
> +        avail += avail_domheap_pages_region(node, 0, 0) +
> +                 initial_images_nrpages(node);

I don't think this is suitable for anything other than Dom0, so I question
the splitting out and generalizing of this logic. For "ordinary" domains
their memory size should be well-defined rather than inferred from
host capacity.

Starting from host capacity also means you become ordering dependent
when it comes to creating (not starting) all the domains: Which one
is to come first? And even with this limited to just Dom0 - is its
size calculated before or after all the other domains were created?
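
The ordering concern can be shown with a toy model. This is only a
sketch under stated assumptions: the "take half of what is free" policy,
struct toy_dom and toy_create_in_order() are all hypothetical and do not
reflect how the series or Xen actually sizes domains:

```c
/* Toy policy (assumption): an unsized domain takes half of the free pages. */
static unsigned long toy_size_from_avail(unsigned long avail)
{
    return avail / 2;
}

struct toy_dom {
    unsigned long requested;   /* 0 means "infer from host capacity" */
    unsigned long granted;     /* filled in at creation time */
};

/*
 * Create domains in array order, deducting each grant from the free pool.
 * Returns the number of pages still free afterwards.
 */
static unsigned long toy_create_in_order(struct toy_dom *doms, unsigned int n,
                                         unsigned long free_pages)
{
    unsigned int i;

    for ( i = 0; i < n; i++ )
    {
        unsigned long want = doms[i].requested
                             ? doms[i].requested
                             : toy_size_from_avail(free_pages);

        if ( want > free_pages )
            want = free_pages;

        doms[i].granted = want;
        free_pages -= want;
    }

    return free_pages;
}
```

With 4000 free pages and one fixed-size 1000-page domain, the inferred
domain is granted 2000 pages when created first but only 1500 when
created second, so its size depends purely on creation order.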

> +    /* Reserve memory for further dom0 vcpu-struct allocations... */

dom0?

> +    avail -= (bd->domain->max_vcpus - 1UL)
> +             << get_order_from_bytes(sizeof(struct vcpu));
> +    /* ...and compat_l4's, if needed. */
> +    if ( is_pv_32bit_domain(bd->domain) )
> +        avail -= bd->domain->max_vcpus - 1;
> +
> +    /* Reserve memory for iommu_dom0_init() (rough estimate). */
> +    if ( is_hwdom && is_iommu_enabled(bd->domain) && !iommu_hwdom_passthrough )

Again the question why this would be Dom0-only.

> +    {
> +        unsigned int s;
> +
> +        for ( s = 9; s < BITS_PER_LONG; s += 9 )
> +            iommu_pages += max_pdx >> s;
> +
> +        avail -= iommu_pages;
> +    }
> +
> +    if ( paging_mode_enabled(bd->domain) ||
> +         (is_ctldom && opt_dom0_shadow) ||
> +         (is_hwdom && opt_pv_l1tf_hwdom) )

An interesting combination of conditions. It (again) looks to me as if
it first needs properly separating Dom0 from hwdom, in an abstract
sense.

> +    {
> +        unsigned long cpu_pages = dom0_paging_pages(bd->domain, nr_pages);
> +
> +        if ( !iommu_use_hap_pt(bd->domain) )
> +            avail -= cpu_pages;
> +        else if ( cpu_pages > iommu_pages )
> +            avail -= cpu_pages - iommu_pages;
> +    }
> +
> +    return avail;
> +}
> +
> +unsigned long __init dom_compute_nr_pages(
> +    struct boot_domain *bd, struct elf_dom_parms *parms,
> +    unsigned long initrd_len)
> +{
> +    unsigned long avail, nr_pages = bd->meminfo.mem_size.nr_pages;
> +
> +    if ( builder_is_initdom(bd) )
> +        return dom0_compute_nr_pages(bd, parms, initrd_len);
> +
> +    avail = dom_avail_nr_pages(bd, node_online_map);
> +
> +    if ( is_pv_domain(bd->domain) && (parms->p2m_base == UNSET_ADDR) )
> +    {
> +        /*
> +         * Legacy Linux kernels (i.e. such without a XEN_ELFNOTE_INIT_P2M
> +         * note) require that there is enough virtual space beyond the initial
> +         * allocation to set up their initial page tables. This space is
> +         * roughly the same size as the p2m table, so make sure the initial
> +         * allocation doesn't consume more than about half the space that's
> +         * available between params.virt_base and the address space end.
> +         */

This duplicates an existing comment (and hence below likely also
existing code) rather than replacing / moving the original. As in
an earlier case - how are the two going to remain in sync?

Jan



* Re: [PATCH v1 16/18] x86: add pv multidomain construction
  2022-07-06 21:04 ` [PATCH v1 16/18] x86: add pv multidomain construction Daniel P. Smith
@ 2022-07-27 14:12   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-27 14:12 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper,
	Roger Pau Monné,
	George Dunlap, Julien Grall, Stefano Stabellini, xen-devel,
	Wei Liu

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -524,37 +524,6 @@ int __init dom0_setup_permissions(struct domain *d)
>  
>      return rc;
>  }
> -
> -int __init construct_domain(struct boot_domain *bd)
> -{
> -    int rc = 0;
> -
> -    /* Sanity! */
> -    BUG_ON(!pv_shim && bd->domid != 0);
> -    BUG_ON(bd->domain->vcpu[0] == NULL);
> -    BUG_ON(bd->domain->vcpu[0]->is_initialised);
> -
> -    process_pending_softirqs();
> -
> -    if ( builder_is_initdom(bd) )
> -    {
> -        if ( is_hvm_domain(bd->domain) )
> -            rc = dom0_construct_pvh(bd);
> -        else if ( is_pv_domain(bd->domain) )
> -            rc = dom0_construct_pv(bd);
> -        else
> -            panic("Cannot construct Dom0. No guest interface available\n");
> -    }
> -
> -    if ( rc )
> -        return rc;
> -
> -    /* Sanity! */
> -    BUG_ON(!bd->domain->vcpu[0]->is_initialised);
> -
> -    return 0;
> -}

Iirc this function was introduced earlier in the series. Just for it
to now be moved around? Why can't it be introduced in the intended
source file right away?

> @@ -189,18 +190,22 @@ void __init arch_create_dom(
>      if ( !pv_shim && builder_is_ctldom(bd) )
>          is_privileged = CDF_privileged;
>  
> -    /* Create initial domain.  Not d0 for pvshim. */
> -    bd->domid = get_initial_domain_id();
> +    /* Determine proper domain id. */
> +    if ( builder_is_initdom(bd) )
> +        bd->domid = get_initial_domain_id();
> +    else
> +        bd->domid = bd->domid ? bd->domid : get_next_domid();

We prefer to omit the middle operand in such cases.

Where do you guarantee no two domains would use the same domain ID?
I can't help thinking that a predetermined one may have been
assigned earlier on to a domain which got it from get_next_domid().
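
A sketch of the collision scenario, and of one possible way to detect it.
Everything here is hypothetical (the toy_ names, the fixed-size table, and
using 0 as the failure value); it is not the allocator from the series:

```c
#include <stdbool.h>

#define TOY_MAX_DOMID 16u

static bool toy_used[TOY_MAX_DOMID];   /* which toy IDs are taken */

/* Return the lowest free ID above 0, or 0 if none is left. */
static unsigned int toy_next_domid(void)
{
    unsigned int id;

    for ( id = 1; id < TOY_MAX_DOMID; id++ )
        if ( !toy_used[id] )
            return id;

    return 0;
}

/* Claim an ID; fails if it is out of range or already taken. */
static bool toy_claim_domid(unsigned int id)
{
    if ( id >= TOY_MAX_DOMID || toy_used[id] )
        return false;

    toy_used[id] = true;
    return true;
}

/*
 * Use the configured ID if there is one, else pick the next free one.
 * Returns 0 when no ID could be assigned, including on a collision.
 */
static unsigned int toy_assign_domid(unsigned int configured)
{
    unsigned int id = configured ? configured : toy_next_domid();

    return (id && toy_claim_domid(id)) ? id : 0;
}
```

In this model a domain without a configured ID is handed 1; a later
domain configured with ID 1 is then rejected rather than silently
sharing it.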

>      bd->domain = domain_create(bd->domid, &dom_cfg, is_privileged);
>      if ( IS_ERR(bd->domain) )
>          panic("Error creating d%u: %ld\n", bd->domid, PTR_ERR(bd->domain));
>  
> -    init_dom0_cpuid_policy(bd->domain);
> +    if ( builder_is_initdom(bd) )
> +        init_dom0_cpuid_policy(bd->domain);

What about domains other than Dom0?

> @@ -210,15 +215,23 @@ void __init arch_create_dom(
>          cmdline = arch_prepare_cmdline(cmdline, bi->arch);
>          strlcpy(dom_cmdline, cmdline, MAX_GUEST_CMDLINE);
>  
> -        if ( bi->arch->kextra )
> -            /* kextra always includes exactly one leading space. */
> -            strlcat(dom_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
> +        if ( builder_is_initdom(bd) )
> +        {
> +            if ( bi->arch->kextra )
> +                /* kextra always includes exactly one leading space. */
> +                strlcat(dom_cmdline, bi->arch->kextra, MAX_GUEST_CMDLINE);
>  
> -        apply_xen_cmdline(dom_cmdline);
> +            apply_xen_cmdline(dom_cmdline);
> +        }

Why is kextra applicable to Dom0 only? Shouldn't each domain have a
way to append to its command line?

>          strlcpy(bd->kernel->string.bytes, dom_cmdline, MAX_GUEST_CMDLINE);
>      }
>  
> +    if ( alloc_system_evtchn(bi, bd) != 0 )
> +        printk(XENLOG_WARNING "%s: "
> +               "unable set up system event channels for Dom%d\n",
> +               __func__, bd->domid);

So if Dom0 is created after e.g. a separate xenstore domain, it'll
also have a xenstore event channel assigned (changing behavior from
what we have today)?

> @@ -240,3 +253,32 @@ void __init arch_create_dom(
>      }
>  }
>  
> +int __init construct_domain(struct boot_domain *bd)
> +{
> +    int rc = 0;
> +
> +    /* Sanity! */
> +    BUG_ON(bd->domid != bd->domain->domain_id);
> +    BUG_ON(bd->domain->vcpu[0] == NULL);
> +    BUG_ON(bd->domain->vcpu[0]->is_initialised);
> +
> +    process_pending_softirqs();
> +
> +    if ( is_hvm_domain(bd->domain) )
> +        if ( builder_is_initdom(bd) )
> +            rc = dom0_construct_pvh(bd);
> +        else
> +            panic("Cannot construct HVM DomU. Not supported.\n");
> +    else if ( is_pv_domain(bd->domain) )

Please properly use braces to enclose the inner if/else pair.
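
A sketch of the braced form, reduced to a toy selector so the control flow
stands alone. The enum and toy_pick_builder() are hypothetical and only
mirror the shape of the quoted code:

```c
#include <stdbool.h>

enum toy_builder {
    TOY_BUILD_PVH,       /* HVM-type initial domain */
    TOY_BUILD_PV,        /* any PV domain */
    TOY_UNSUPPORTED,     /* HVM-type domain that is not the initial one */
    TOY_NO_INTERFACE,    /* neither HVM nor PV */
};

static enum toy_builder toy_pick_builder(bool is_hvm, bool is_pv,
                                         bool is_initdom)
{
    if ( is_hvm )
    {
        /* The braces make it explicit which "if" each "else" pairs with. */
        if ( is_initdom )
            return TOY_BUILD_PVH;
        else
            return TOY_UNSUPPORTED;
    }
    else if ( is_pv )
        return TOY_BUILD_PV;
    else
        return TOY_NO_INTERFACE;
}
```

The unbraced version relies on the reader pairing each else with the
right if; the braces remove that burden without changing behavior.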

> +            rc = dom_construct_pv(bd);
> +    else
> +        panic("Cannot construct Dom0. No guest interface available\n");

Dom0?

> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/domain_builder.c
> @@ -1,5 +1,5 @@
>  /******************************************************************************
> - * pv/dom0_build.c
> + * pv/domain_builder.c
>   *
>   * Copyright (c) 2002-2005, K A Fraser
>   */
> @@ -8,6 +8,7 @@
>  #include <xen/bootinfo.h>
>  #include <xen/console.h>
>  #include <xen/domain.h>
> +#include <xen/domain_builder.h>
>  #include <xen/domain_page.h>
>  #include <xen/init.h>
>  #include <xen/libelf.h>
> @@ -296,7 +297,7 @@ static struct page_info * __init alloc_chunk(struct domain *d,
>      return page;
>  }
>  
> -int __init dom0_construct_pv(struct boot_domain *bd)
> +int __init dom_construct_pv(struct boot_domain *bd)
>  {
>      int i, rc, order, machine;
>      bool compatible, compat;
> @@ -350,7 +351,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>      /* Machine address of next candidate page-table page. */
>      paddr_t mpt_alloc;
>  
> -    printk(XENLOG_INFO "*** Building a PV Dom%d ***\n", d->domain_id);
> +    printk(XENLOG_INFO "*** Constructing a PV Dom%d ***\n", d->domain_id);

When touching things like here and even more so when ...

> @@ -384,7 +385,8 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>          {
>              if ( unlikely(rc = switch_compat(d)) )
>              {
> -                printk("Dom0 failed to switch to compat: %d\n", rc);
> +                printk("Dom%d failed to switch to compat: %d\n",
> +                        d->domain_id, rc);

... adding new logging of domain IDs, please use %pd whenever possible.

> @@ -404,22 +406,23 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>      if ( elf_msb(&elf) )
>          compatible = false;
>  
> -    printk(" Dom0 kernel: %s-bit%s, %s, paddr %#" PRIx64 " -> %#" PRIx64 "\n",
> -           elf_64bit(&elf) ? "64" : elf_32bit(&elf) ? "32" : "??",
> +    printk(" Dom%d kernel: %s-bit%s, %s, paddr %#" PRIx64 " -> %#" PRIx64 "\n",
> +           d->domain_id, elf_64bit(&elf) ? "64" : elf_32bit(&elf) ? "32" : "??",
>             parms.pae       ? ", PAE" : "",
>             elf_msb(&elf)   ? "msb"   : "lsb",
>             elf.pstart, elf.pend);
>      if ( elf.bsd_symtab_pstart )
> -        printk(" Dom0 symbol map %#" PRIx64 " -> %#" PRIx64 "\n",
> -               elf.bsd_symtab_pstart, elf.bsd_symtab_pend);
> +        printk(" Dom%d symbol map %#" PRIx64 " -> %#" PRIx64 "\n",
> +               d->domain_id, elf.bsd_symtab_pstart, elf.bsd_symtab_pend);
>  
>      if ( !compatible )
>      {
> -        printk("Mismatch between Xen and DOM0 kernel\n");
> +        printk("Mismatch between Xen and DOM%d kernel\n", d->domain_id);
>          return -EINVAL;
>      }
>  
> -    if ( parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE )
> +    if ( builder_is_initdom(bd) &&
> +         parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE )
>      {
>          if ( !pv_shim && !test_bit(XENFEAT_dom0, parms.f_supported) )
>          {
> @@ -443,7 +446,8 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>  
>              if ( value > __HYPERVISOR_COMPAT_VIRT_START )
>              {
> -                printk("Dom0 expects too high a hypervisor start address\n");
> +                printk("Dom%d expects too high a hypervisor start address\n",
> +                       d->domain_id);
>                  return -ERANGE;
>              }
>              HYPERVISOR_COMPAT_VIRT_START(d) =
> @@ -487,7 +491,7 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>      vstartinfo_start = round_pgup(vphysmap_end);
>      vstartinfo_end   = vstartinfo_start + sizeof(struct start_info);
>  
> -    if ( pv_shim )
> +    if ( pv_shim || ! builder_is_initdom(bd) )

As elsewhere - stray blank after ! . Also wouldn't it make sense for
builder_is_initdom() to return false in the pv_shim case, thus making
the || here (and again below) unnecessary?

> @@ -789,6 +790,19 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>      snprintf(si->magic, sizeof(si->magic), "xen-3.0-x86_%d%s",
>               elf_64bit(&elf) ? 64 : 32, parms.pae ? "p" : "");
>  
> +    if ( !builder_is_initdom(bd) )
> +    {
> +        si->store_mfn = ((vxenstore_start - v_start) >> PAGE_SHIFT)
> +                        + alloc_spfn;
> +        bd->store.mfn = si->store_mfn;
> +        si->store_evtchn = bd->store.evtchn;
> +
> +        si->console.domU.mfn = ((vconsole_start - v_start) >> PAGE_SHIFT)
> +                               + alloc_spfn;
> +        bd->console.mfn = si->console.domU.mfn;
> +        si->console.domU.evtchn = bd->console.evtchn;
> +    }

While elsewhere you allow separate hwdom and ctrldom, aiui only one
of the two would fail the entry condition to this if(). Which one
would that be? And to what extent are there kernels which know how to deal
with the situation? I'm not even certain this can be properly
expressed in the present start_info structure, as a non-Dom0
control domain would e.g. need to have xenstore coordinates passed,
but might act as the domain handling consoles.

> @@ -871,23 +885,24 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>                  sizeof(si->cmd_line));
>  
>  #ifdef CONFIG_VIDEO
> -    if ( !pv_shim && fill_console_start_info((void *)(si + 1)) )
> -    {
> -        si->console.dom0.info_off  = sizeof(struct start_info);
> -        si->console.dom0.info_size = sizeof(struct dom0_vga_console_info);
> -    }
> +    if ( builder_is_hwdom(bd) )
> +        if ( !pv_shim && fill_console_start_info((void *)(si + 1)) )

As before - please combine such if()s.

> +        {
> +            si->console.dom0.info_off  = sizeof(struct start_info);
> +            si->console.dom0.info_size = sizeof(struct dom0_vga_console_info);
> +        }

I don't view it as a given that hwdom is the domain to have access to
the physical VGA. While it may follow from its name, it may be more
useful to have the control domain direct its output there.

>  #endif
>  
>      /*
>       * TODO: provide an empty stub for fill_console_start_info in the
>       * !CONFIG_VIDEO case so the logic here can be simplified.
>       */
> -    if ( pv_shim )
> +    if ( builder_is_hwdom(bd) && pv_shim )
>          pv_shim_setup_dom(d, l4start, v_start, vxenstore_start, vconsole_start,
>                            vphysmap_start, si);

???

>  #ifdef CONFIG_COMPAT
> -    if ( compat )
> +    if ( builder_is_hwdom(bd) && compat )
>          xlat_start_info(si, pv_shim ? XLAT_start_info_console_domU
>                                      : XLAT_start_info_console_dom0);

Even more so here: ??? (This is a clear sign that your commit messages
are lacking helpful detail.)

> @@ -926,15 +941,18 @@ int __init dom0_construct_pv(struct boot_domain *bd)
>      if ( test_bit(XENFEAT_supervisor_mode_kernel, parms.f_required) )
>          panic("Dom0 requires supervisor-mode execution\n");
>  
> -    rc = dom0_setup_permissions(d);
> -    BUG_ON(rc != 0);
> +    if ( builder_is_hwdom(bd) )
> +    {
> +        rc = dom0_setup_permissions(d);
> +        BUG_ON(rc != 0);
> +    }

What about other domains?

>      if ( d->domain_id == hardware_domid )
>          iommu_hwdom_init(d);
>  
>  #ifdef CONFIG_SHADOW_PAGING
>      /* Fill the shadow pool if necessary. */
> -    if ( opt_dom0_shadow || opt_pv_l1tf_hwdom )
> +    if ( builder_is_hwdom(bd) && (opt_dom0_shadow || opt_pv_l1tf_hwdom) )

With this I'd like to refer you back to my "An interesting combination
of conditions" comment on patch 15.

> --- a/xen/common/domain-builder/Kconfig
> +++ b/xen/common/domain-builder/Kconfig
> @@ -12,4 +12,14 @@ config BUILDER_FDT
>  
>  	  If unsure, say N.
>  
> +config MULTIDOM_BUILDER
> +	bool "Multidomain building (UNSUPPORTED)" if UNSUPPORTED
> +	depends on BUILDER_FDT

Shouldn't this be "select", with that other option perhaps not even
needing a prompt?

> --- a/xen/common/domain-builder/core.c
> +++ b/xen/common/domain-builder/core.c
> @@ -1,6 +1,7 @@
>  #include <xen/bootdomain.h>
>  #include <xen/bootinfo.h>
>  #include <xen/domain_builder.h>
> +#include <xen/event.h>
>  #include <xen/init.h>
>  #include <xen/types.h>
>  
> @@ -60,37 +61,144 @@ void __init builder_init(struct boot_info *info)
>          d->kernel->string.kind = BOOTSTR_CMDLINE;
>  }
>  
> +static bool __init build_domain(struct boot_info *info, struct boot_domain *bd)
> +{
> +    if ( bd->constructed == true )

Please omit "== true" (and likewise replace "== false" with !).

> +        return true;
> +
> +    if ( bd->kernel == NULL )
> +        return false;
> +
> +    printk(XENLOG_INFO "*** Building Dom%d ***\n", bd->domid);
> +
> +    arch_create_dom(info, bd);
> +    if ( bd->domain )
> +    {
> +        bd->constructed = true;
> +        return true;
> +    }
> +
> +    return false;
> +}
> +
>  uint32_t __init builder_create_domains(struct boot_info *info)
>  {
>      uint32_t build_count = 0, functions_built = 0;
> +    struct boot_domain *bd;
>      int i;
>  
> +    if ( IS_ENABLED(CONFIG_MULTIDOM_BUILDER) )
> +    {
> +        bd = builder_dom_by_function(info, BUILD_FUNCTION_XENSTORE);
> +        if ( build_domain(info, bd) )
> +        {
> +            functions_built |= bd->functions;
> +            build_count++;
> +        }
> +        else
> +            printk(XENLOG_WARNING "Xenstore build failed, system may be unusable\n");
> +
> +        bd = builder_dom_by_function(info, BUILD_FUNCTION_CONSOLE);
> +        if ( build_domain(info, bd) )
> +        {
> +            functions_built |= bd->functions;
> +            build_count++;

If both are the same domain, you'll end up with a count of 2 here even
though only one domain was built. This looks misleading.
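
A sketch of counting each domain once, assuming a hypothetical helper
that reports whether this particular call did the build. The toy_ names
and the reduced structure are illustrative only:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_FN_XENSTORE (1u << 0)
#define TOY_FN_CONSOLE  (1u << 1)

struct toy_boot_domain {
    uint32_t functions;    /* bitmask of TOY_FN_* */
    bool has_kernel;
    bool constructed;
};

/* Returns true only when this call performed the build. */
static bool toy_build_once(struct toy_boot_domain *bd)
{
    if ( bd->constructed || !bd->has_kernel )
        return false;

    bd->constructed = true;
    return true;
}

/* Build the xenstore and console domains, counting each domain once. */
static unsigned int toy_build_service_domains(struct toy_boot_domain *store,
                                              struct toy_boot_domain *console)
{
    unsigned int built = 0;

    if ( toy_build_once(store) )
        built++;
    if ( toy_build_once(console) )   /* no-op if it is the same domain */
        built++;

    return built;
}
```

A real version would still need to tell "already built" apart from
"build failed", so that the warning is not printed for a domain that was
in fact constructed on the earlier call.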

> +        }
> +        else
> +            printk(XENLOG_WARNING "Console build failed, system may be unusable\n");

I think this and the similar earlier message want to include the word
"domain". You also want to split the lines at the start of the literal
strings.

> +    }
> +
>      for ( i = 0; i < info->builder->nr_doms; i++ )
>      {
> -        struct boot_domain *d = &info->builder->domains[i];
> +        bd = &info->builder->domains[i];
>  
>          if ( ! IS_ENABLED(CONFIG_MULTIDOM_BUILDER) &&

Interesting - this config option is being introduced only here.

> -             ! builder_is_initdom(d) &&
> +             ! builder_is_initdom(bd) &&
>               functions_built & BUILD_FUNCTION_INITIAL_DOM )
>              continue;
>  
> -        if ( d->kernel == NULL )
> +        if ( !build_domain(info, bd) )
>          {
> -            if ( builder_is_initdom(d) )
> +            if ( builder_is_initdom(bd) )
>                  panic("%s: intial domain missing kernel\n", __func__);
>  
> -            printk(XENLOG_ERR "%s:Dom%d definiton has no kernel\n", __func__,
> -                    d->domid);
> +            printk(XENLOG_WARNING "Dom%d build failed, skipping\n", bd->domid);
>              continue;
>          }
>  
> -        arch_create_dom(info, d);
> -        if ( d->domain )
> +        functions_built |= bd->functions;
> +        build_count++;
> +    }
> +
> +    if ( IS_ENABLED(CONFIG_X86) )
> +        /* Free temporary buffers. */
> +        discard_initial_images();

I guess this won't build on Arm. Plus Arm has a similarly named function
(discard_initial_modules()) which likely wants calling here (or rather:
adding suitable abstraction for the right function to be called).
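
One possible shape for such an abstraction, as a sketch only: each
architecture supplies a hook under a common name and common code calls just
that. arch_builder_discard() and finish_domain_builds() are hypothetical
names, and the bodies are stand-ins that merely record which function ran:

```c
#include <assert.h>

/* Records which arch function ran; stands in for the real side effects. */
static const char *discarded_by;

#if defined(CONFIG_ARM)
/* Arm's existing function name, with a stand-in body. */
static void discard_initial_modules(void)
{
    discarded_by = "discard_initial_modules";
}

/* Hypothetical per-arch hook; would live in an arch header. */
static inline void arch_builder_discard(void)
{
    discard_initial_modules();
}
#else
/* x86's existing function name, with a stand-in body. */
static void discard_initial_images(void)
{
    discarded_by = "discard_initial_images";
}

/* Hypothetical per-arch hook; would live in an arch header. */
static inline void arch_builder_discard(void)
{
    discard_initial_images();
}
#endif

/* Common code: no IS_ENABLED(CONFIG_X86) check is needed at the call site. */
static unsigned int finish_domain_builds(unsigned int build_count)
{
    /* Free temporary buffers. */
    arch_builder_discard();

    return build_count;
}
```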

> +    return build_count;
> +}
> +
> +domid_t __init get_next_domid(void)
> +{
> +    static domid_t __initdata last_domid = 0;
> +    domid_t next;
> +
> +    for ( next = last_domid + 1; next < DOMID_FIRST_RESERVED; next++ )
> +    {
> +        struct domain *d;
> +
> +        if ( (d = rcu_lock_domain_by_id(next)) == NULL )
>          {
> -            functions_built |= d->functions;
> -            build_count++;
> +            last_domid = next;
> +            return next;
>          }
> +
> +        rcu_unlock_domain(d);
>      }
>  
> -    return build_count;
> +    return 0;
> +}

This looks suspiciously similar to code in common/domctl.c. Perhaps
you want to make a function usable by both (introduced in a separate
patch)?

> --- a/xen/include/xen/bootdomain.h
> +++ b/xen/include/xen/bootdomain.h
> @@ -47,6 +47,12 @@ struct boot_domain {
>      struct boot_module *configs[BUILD_MAX_CONF_MODS];
>  
>      struct domain *domain;
> +    struct {
> +        xen_pfn_t mfn;
> +        unsigned int evtchn;
> +    } store, console;
> +    bool constructed;
> +
>  };

Stray blank line? Or maybe you meant it to go in front of "constructed"?

Jan



* Re: [PATCH v1 17/18] builder: introduce domain builder hypfs tree
  2022-07-06 21:04 ` [PATCH v1 17/18] builder: introduce domain builder hypfs tree Daniel P. Smith
@ 2022-07-27 14:30   ` Jan Beulich
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Beulich @ 2022-07-27 14:30 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: scott.davis, christopher.clark, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, xen-devel

On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- a/xen/common/domain-builder/core.c
> +++ b/xen/common/domain-builder/core.c
> @@ -134,6 +134,9 @@ uint32_t __init builder_create_domains(struct boot_info *info)
>          /* Free temporary buffers. */
>          discard_initial_images();
>  
> +    if ( IS_ENABLED(CONFIG_BUILDER_HYPFS) )
> +        builder_hypfs(info);

No need for the if() here when you provide a stub in the header file.
Just that of course the stub vs prototype there needs to depend on
CONFIG_BUILDER_HYPFS, not CONFIG_HYPFS.
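
A sketch of the suggested arrangement: with the stub keyed on
CONFIG_BUILDER_HYPFS, the call site can be unconditional. The caller name
builder_finish() is hypothetical:

```c
#include <assert.h>

struct boot_info;

#ifdef CONFIG_BUILDER_HYPFS

/* Real implementation is built only when the option is enabled. */
void builder_hypfs(struct boot_info *info);

#else

/* Empty stub: the compiler drops the call when the option is off. */
static inline void builder_hypfs(struct boot_info *info)
{
    (void)info;
}

#endif

/* Hypothetical caller: no IS_ENABLED() check is needed here. */
static unsigned int builder_finish(struct boot_info *info,
                                   unsigned int build_count)
{
    builder_hypfs(info);

    return build_count;
}
```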

> +static int __init alloc_hypfs(struct boot_info *info)
> +{
> +    if ( !(builder_dir = (struct hypfs_entry_dir *)xmalloc_bytes(
> +                        sizeof(struct hypfs_entry_dir))) )

Why not xmalloc() (or xzalloc()), which specifically exists to avoid
open-coded casts like the one here?

> +    {
> +        printk(XENLOG_WARNING "%s: unable to allocate hypfs dir\n", __func__);
> +        return -ENOMEM;
> +    }
> +
> +    builder_dir->e.type = XEN_HYPFS_TYPE_DIR;
> +    builder_dir->e.encoding = XEN_HYPFS_ENC_PLAIN;
> +    builder_dir->e.name = "builder";
> +    builder_dir->e.size = 0;
> +    builder_dir->e.max_size = 0;
> +    INIT_LIST_HEAD(&builder_dir->e.list);
> +    builder_dir->e.funcs = &hypfs_dir_funcs;
> +    INIT_LIST_HEAD(&builder_dir->dirlist);
> +
> +    if ( !(entries = (struct domain_node *)xmalloc_bytes(
> +                        sizeof(struct domain_node) * info->builder->nr_doms)) )

xmalloc_array()?
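
To illustrate both allocation remarks, here is a sketch using the typed
helpers. The macros below are libc-based stand-ins with the same shape as
Xen's xzalloc() and xmalloc_array(), so the example builds outside the
hypervisor; the two structs are cut-down, hypothetical versions of the ones
in the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in: array allocation with a multiplication overflow check. */
static void *alloc_array_(size_t size, size_t num)
{
    if ( size != 0 && num > SIZE_MAX / size )
        return NULL;

    return malloc(size * num);
}

/* Stand-ins shaped like the typed helpers: the cast lives in one place. */
#define xzalloc(type)            ((type *)calloc(1, sizeof(type)))
#define xmalloc_array(type, num) ((type *)alloc_array_(sizeof(type), num))

/* Hypothetical, cut-down versions of the structures in the patch. */
struct dir_entry   { const char *name; unsigned int size; };
struct domain_node { char dir_name[16]; unsigned int functions; };

static struct dir_entry *builder_dir;
static struct domain_node *entries;

static int alloc_hypfs(unsigned int nr_doms)
{
    /* No open-coded cast or sizeof at the call sites. */
    builder_dir = xzalloc(struct dir_entry);
    if ( !builder_dir )
        return -ENOMEM;

    entries = xmalloc_array(struct domain_node, nr_doms);
    if ( !entries )
    {
        /* Added here for tidiness; the quoted patch does not free this. */
        free(builder_dir);
        builder_dir = NULL;
        return -ENOMEM;
    }

    return 0;
}
```

Besides dropping the casts, the array form keeps the size multiplication and
its overflow check inside the helper.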

> +    {
> +        printk(XENLOG_WARNING "%s: unable to allocate hypfs nodes\n", __func__);
> +        return -ENOMEM;
> +    }
> +
> +    return 0;
> +}
> +
> +void __init builder_hypfs(struct boot_info *info)
> +{
> +    int i;
> +
> +    printk("Domain Builder: creating hypfs nodes\n");

If at all, then dprintk().

> +    if ( alloc_hypfs(info) != 0 )
> +        return;
> +
> +    for ( i = 0; i < info->builder->nr_doms; i++ )
> +    {
> +        struct domain_node *e = &entries[i];
> +        struct boot_domain *bd = &info->builder->domains[i];
> +        uint8_t *uuid = bd->uuid;
> +
> +        snprintf(e->dir_name, sizeof(e->dir_name), "%d", bd->domid);
> +
> +        snprintf(e->uuid, sizeof(e->uuid), "%08x-%04x-%04x-%04x-%04x%08x",
> +                 *(uint32_t *)uuid, *(uint16_t *)(uuid+4),
> +                 *(uint16_t *)(uuid+6), *(uint16_t *)(uuid+8),
> +                 *(uint16_t *)(uuid+10), *(uint32_t *)(uuid+12));

Perhaps better introduce a properly typed structure? Endianness-wise
I'm also unsure about the last 12 nibbles: Isn't this really an array of
bytes? Actually the second-to-last 16-bit item is an array of two
bytes as well, if Linux's %pU vsprintf() formatting is to be trusted.
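
A sketch of a byte-wise alternative, assuming bd->uuid holds the 16 bytes in
the order they should be printed (the usual big-endian text form). Treating
the UUID purely as bytes sidesteps both the host-endianness question and the
unaligned casts. format_uuid() and UUID_STR_LEN are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UUID_STR_LEN 37 /* 36 characters plus the terminating NUL */

/*
 * Format a 16-byte UUID as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
 * Bytes are printed in storage order, so the result does not depend on
 * the endianness of the machine doing the formatting.
 */
static int format_uuid(char *buf, size_t len, const uint8_t uuid[16])
{
    assert(buf != NULL && uuid != NULL);

    return snprintf(buf, len,
                    "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
                    "%02x%02x-%02x%02x%02x%02x%02x%02x",
                    uuid[0], uuid[1], uuid[2], uuid[3],
                    uuid[4], uuid[5], uuid[6], uuid[7],
                    uuid[8], uuid[9],
                    uuid[10], uuid[11], uuid[12],
                    uuid[13], uuid[14], uuid[15]);
}
```

If some fields are instead stored little-endian, the byte indices for those
fields would need swapping; the source does not say which layout applies.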

> +        e->functions = bd->functions;
> +        e->constructed = bd->constructed;
> +
> +        e->ncpus = bd->ncpus;
> +        e->mem_size = (bd->meminfo.mem_size.nr_pages * PAGE_SIZE)/1024;
> +        e->mem_max = (bd->meminfo.mem_max.nr_pages * PAGE_SIZE)/1024;

Nit: Blanks around / please.

> +        e->xs.evtchn = bd->store.evtchn;
> +        e->xs.mfn = bd->store.mfn;
> +
> +        e->con_dev.evtchn = bd->console.evtchn;
> +        e->con_dev.mfn = bd->console.mfn;
> +
> +        /* Initialize and construct builder hypfs tree */
> +        INIT_HYPFS_DIR(e->dir, e->dir_name);
> +        INIT_HYPFS_DIR(e->xs.dir, "xenstore");
> +        INIT_HYPFS_DIR(e->dev_dir, "devices");
> +        INIT_HYPFS_DIR(e->con_dev.dir, "console");
> +
> +        INIT_HYPFS_STRING(e->uuid_leaf, "uuid");
> +        hypfs_string_set_reference(&e->uuid_leaf, e->uuid);
> +        INIT_HYPFS_UINT(e->func_leaf, "functions", e->functions);
> +        INIT_HYPFS_UINT(e->ncpus_leaf, "ncpus", e->ncpus);
> +        INIT_HYPFS_UINT(e->mem_sz_leaf, "mem_size", e->mem_size);
> +        INIT_HYPFS_UINT(e->mem_mx_leaf, "mem_max", e->mem_max);

May I suggest to prefer - over _ in node names?

> --- a/xen/include/xen/domain_builder.h
> +++ b/xen/include/xen/domain_builder.h
> @@ -72,4 +72,17 @@ int alloc_system_evtchn(
>      const struct boot_info *info, struct boot_domain *bd);
>  void arch_create_dom(const struct boot_info *bi, struct boot_domain *bd);
>  
> +#ifdef CONFIG_HYPFS
> +
> +void builder_hypfs(struct boot_info *info);
> +
> +#else
> +
> +static inline void builder_hypfs(struct boot_info *info)
> +{
> +    return;
> +}
> +
> +#endif

This would better go in a private header in xen/common/domain-builder/.

Jan



Thread overview: 66+ messages
2022-07-06 21:04 [PATCH v1 00/18] Hyperlaunch Daniel P. Smith
2022-07-06 21:04 ` [PATCH v1 01/18] kconfig: allow configuration of maximum modules Daniel P. Smith
2022-07-07  1:44   ` Henry Wang
2022-07-15 19:16   ` Julien Grall
2022-07-19 16:36     ` Daniel P. Smith
2022-07-26 18:07       ` Julien Grall
2022-07-27  6:12         ` Jan Beulich
2022-07-19  9:32   ` Jan Beulich
2022-07-19 17:02     ` Daniel P. Smith
2022-07-20  7:27       ` Jan Beulich
2022-07-22 15:00         ` Daniel P. Smith
2022-07-06 21:04 ` [PATCH v1 02/18] introduction of generalized boot info Daniel P. Smith
2022-07-15 19:25   ` Julien Grall
2022-07-20 18:32     ` Daniel P. Smith
2022-07-19 13:11   ` Jan Beulich
2022-07-21 14:28     ` Daniel P. Smith
2022-07-21 16:00       ` Jan Beulich
2022-07-21 16:00       ` Jan Beulich
2022-07-22 16:01         ` Daniel P. Smith
2022-07-25  7:05           ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 03/18] x86: adopt new boot info structures Daniel P. Smith
2022-07-19 13:19   ` Jan Beulich
2022-07-22 12:34     ` Daniel P. Smith
2022-07-06 21:04 ` [PATCH v1 04/18] x86: refactor entrypoints to new boot info Daniel P. Smith
2022-07-18 13:58   ` Smith, Jackson
2022-07-22 12:59     ` Daniel P. Smith
2022-07-06 21:04 ` [PATCH v1 05/18] x86: refactor xen cmdline into general framework Daniel P. Smith
2022-07-19 13:26   ` Jan Beulich
2022-07-22 13:12     ` Daniel P. Smith
2022-07-25  7:09       ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 06/18] fdt: make fdt handling reusable across arch Daniel P. Smith
2022-07-07  1:44   ` Henry Wang
2022-07-19  9:36   ` Jan Beulich
2022-07-22 13:18     ` Daniel P. Smith
2022-07-06 21:04 ` [PATCH v1 07/18] docs: update hyperlaunch device tree documentation Daniel P. Smith
2022-07-18 13:57   ` Smith, Jackson
2022-07-22 13:34     ` Daniel P. Smith
2022-07-06 21:04 ` [PATCH v1 08/18] kconfig: introduce domain builder config option Daniel P. Smith
2022-07-07  1:44   ` Henry Wang
2022-07-19 13:29   ` Jan Beulich
2022-07-22 13:47     ` Daniel P. Smith
2022-07-06 21:04 ` [PATCH v1 09/18] x86: introduce abstractions for domain builder Daniel P. Smith
2022-07-26 14:22   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 10/18] x86: introduce the " Daniel P. Smith
2022-07-18 13:59   ` Smith, Jackson
2022-07-22 14:36     ` Daniel P. Smith
2022-07-22 20:33       ` Smith, Jackson
2022-07-23 10:45         ` Daniel P. Smith
2022-07-26 14:46   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 11/18] x86: initial conversion to " Daniel P. Smith
2022-07-26 15:01   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 12/18] x86: convert dom0 creation " Daniel P. Smith
2022-07-27 12:25   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 13/18] x86: generalize physmap logic Daniel P. Smith
2022-07-27 12:33   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 14/18] x86: generalize vcpu for domain building Daniel P. Smith
2022-07-27 12:46   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 15/18] x86: rework domain page allocation Daniel P. Smith
2022-07-27 13:22   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 16/18] x86: add pv multidomain construction Daniel P. Smith
2022-07-27 14:12   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 17/18] builder: introduce domain builder hypfs tree Daniel P. Smith
2022-07-27 14:30   ` Jan Beulich
2022-07-06 21:04 ` [PATCH v1 18/18] tools: introduce example late pv helper Daniel P. Smith
2022-07-19 17:06 ` [PATCH v1 00/18] Hyperlaunch Smith, Jackson
2022-07-22 14:51   ` Daniel P. Smith
