linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes
@ 2019-10-09 10:53 Daniel Kiper
  2019-10-09 10:53 ` [PATCH v3 1/3] x86/boot: Introduce the kernel_info Daniel Kiper
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Daniel Kiper @ 2019-10-09 10:53 UTC
  To: linux-efi, linux-kernel, x86, xen-devel
  Cc: ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, hpa, jgross, konrad.wilk, mingo,
	ross.philipson, tglx

Hi,

Because space in the setup_header is very limited, this patch series
introduces a new kernel_info struct, which will be used to convey information
from the kernel to the bootloader. This way the boot protocol can be extended
regardless of the setup_header limitations. Additionally, the patch series
introduces some convenience features: the setup_indirect struct and the
kernel_info.setup_type_max field.

Daniel

 Documentation/x86/boot.rst             | 168 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/boot/Makefile                 |   2 +-
 arch/x86/boot/compressed/Makefile      |   4 +-
 arch/x86/boot/compressed/kaslr.c       |  12 ++++++
 arch/x86/boot/compressed/kernel_info.S |  22 +++++++++++
 arch/x86/boot/header.S                 |   3 +-
 arch/x86/boot/tools/build.c            |   5 +++
 arch/x86/include/uapi/asm/bootparam.h  |  16 +++++++-
 arch/x86/kernel/e820.c                 |  11 ++++++
 arch/x86/kernel/kdebugfs.c             |  20 ++++++++--
 arch/x86/kernel/ksysfs.c               |  30 ++++++++++----
 arch/x86/kernel/setup.c                |   4 ++
 arch/x86/mm/ioremap.c                  |  11 ++++++
 13 files changed, 292 insertions(+), 16 deletions(-)

Daniel Kiper (3):
      x86/boot: Introduce the kernel_info
      x86/boot: Introduce the kernel_info.setup_type_max
      x86/boot: Introduce the setup_indirect



* [PATCH v3 1/3] x86/boot: Introduce the kernel_info
  2019-10-09 10:53 [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
@ 2019-10-09 10:53 ` Daniel Kiper
  2019-10-10  0:43   ` Randy Dunlap
  2019-10-09 10:53 ` [PATCH v3 2/3] x86/boot: Introduce the kernel_info.setup_type_max Daniel Kiper
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Daniel Kiper @ 2019-10-09 10:53 UTC
  To: linux-efi, linux-kernel, x86, xen-devel
  Cc: ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, hpa, jgross, konrad.wilk, mingo,
	ross.philipson, tglx

The relationships between the headers are analogous to the various data
sections:

  setup_header = .data
  boot_params/setup_data = .bss

What is missing from the above list? That's right:

  kernel_info = .rodata

We have been (ab)using .data for things that could go into .rodata or .bss for
a long time, for lack of alternatives and -- especially early on -- inertia.
Also, the BIOS stub is responsible for creating boot_params, so it isn't
available to a BIOS-based loader (setup_data is, though).

setup_header is permanently limited to 144 bytes due to the reach of the
2-byte jump field, which doubles as a length field for the structure, combined
with the size of the "hole" in struct boot_params that a protected-mode loader
or the BIOS stub has to copy it into. It is currently 119 bytes long, which
leaves us with 25 very precious bytes. This isn't something that can be fixed
without revising the boot protocol entirely, breaking backwards compatibility.

boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
by adding setup_data entries. It cannot be used to communicate properties of
the kernel image, because it is .bss and has no image-provided content.

kernel_info solves this by providing an extensible place for information about
the kernel image. It is readonly, because the kernel cannot rely on a
bootloader copying its contents anywhere, but that is OK; if it becomes
necessary it can still contain data items that an enabled bootloader would be
expected to copy into a setup_data chunk.

This patch does not bump the setup_header version in arch/x86/boot/header.S
because it will be followed by additional changes to the Linux/x86 boot
protocol.

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
---
v3 - suggestions/fixes:
   - split kernel_info data into fixed and variable sized regions,
     (suggested by H. Peter Anvin),
   - change kernel_info.header value to "LToP" (0x506f544c),
     (suggested by H. Peter Anvin),
   - improve the comments,
   - improve the documentation.

v2 - suggestions/fixes:
   - rename setup_header2 to kernel_info,
     (suggested by H. Peter Anvin),
   - change kernel_info.header value to "InfO" (0x4f666e49),
   - new kernel_info description in Documentation/x86/boot.rst,
     (suggested by H. Peter Anvin),
   - drop kernel_info_offset_update() as an overkill and
     update kernel_info offset directly from main(),
     (suggested by Eric Snowberg),
   - new commit message
     (suggested by H. Peter Anvin),
   - fix some commit message misspellings
     (suggested by Eric Snowberg).
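
For illustration only, a boot loader might validate the kernel_info header
documented in the boot.rst changes below roughly as follows. This is a minimal
sketch and not code from this series: struct kernel_info_hdr,
kernel_info_parse() and get_le32() are hypothetical names, and the caller is
assumed to pass the base address that kernel_info_offset is relative to,
together with the number of bytes available from that base.

```c
#include <stddef.h>
#include <stdint.h>

#define KERNEL_INFO_MAGIC	0x506f544cU	/* "LToP", read as little-endian */
#define KERNEL_INFO_MIN_SIZE	12U		/* header + size + size_total */

/* Hypothetical loader-side copy of the three leading fixed size fields. */
struct kernel_info_hdr {
	uint32_t header;
	uint32_t size;		/* fixed size part only */
	uint32_t size_total;	/* fixed plus variable size parts */
};

/* Read a 32-bit little-endian value without alignment assumptions. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 if a plausible kernel_info lies at base + offset, -1 otherwise. */
int kernel_info_parse(const uint8_t *base, size_t avail, uint32_t offset,
		      struct kernel_info_hdr *out)
{
	const uint8_t *p;

	/* The three leading fields must fit into the available bytes. */
	if (offset > avail || avail - offset < KERNEL_INFO_MIN_SIZE)
		return -1;

	p = base + offset;
	out->header = get_le32(p);
	out->size = get_le32(p + 4);
	out->size_total = get_le32(p + 8);

	if (out->header != KERNEL_INFO_MAGIC)
		return -1;
	/* size covers at least the leading fields; size_total covers size. */
	if (out->size < KERNEL_INFO_MIN_SIZE || out->size_total < out->size)
		return -1;
	/* The whole blob must lie inside the available bytes. */
	if (out->size_total > avail - offset)
		return -1;

	return 0;
}
```

The sketch rejects anything it cannot bound, so a loader using it would simply
fall back to setup_header-only behaviour when the call fails.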
---
 Documentation/x86/boot.rst             | 121 +++++++++++++++++++++++++++++++++
 arch/x86/boot/Makefile                 |   2 +-
 arch/x86/boot/compressed/Makefile      |   4 +-
 arch/x86/boot/compressed/kernel_info.S |  17 +++++
 arch/x86/boot/header.S                 |   1 +
 arch/x86/boot/tools/build.c            |   5 ++
 arch/x86/include/uapi/asm/bootparam.h  |   1 +
 7 files changed, 148 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/boot/compressed/kernel_info.S

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 08a2f100c0e6..d5323a39f5e3 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -68,8 +68,25 @@ Protocol 2.12	(Kernel 3.8) Added the xloadflags field and extension fields
 Protocol 2.13	(Kernel 3.14) Support 32- and 64-bit flags being set in
 		xloadflags to support booting a 64-bit kernel from 32-bit
 		EFI
+
+Protocol 2.14:	BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c1889
+		(x86/boot: Add ACPI RSDP address to setup_header)
+		DO NOT USE!!! ASSUME SAME AS 2.13.
+
+Protocol 2.15:	(Kernel 5.5) Added the kernel_info.
 =============	============================================================
 
+.. note::
+     The protocol version number should be changed only if the setup header
+     is changed. There is no need to update the version number if boot_params
+     or kernel_info are changed. Additionally, it is recommended to use
+     xloadflags (in this case the protocol version number should not be
+     updated either) or kernel_info to communicate supported Linux kernel
+     features to the boot loader. Due to very limited space available in
+     the original setup header every update to it should be considered
+     with great care. Starting from protocol 2.15, the primary way to
+     communicate things to the boot loader is the kernel_info.
+
 
 Memory Layout
 =============
@@ -207,6 +224,7 @@ Offset/Size	Proto		Name			Meaning
 0258/8		2.10+		pref_address		Preferred loading address
 0260/4		2.10+		init_size		Linear memory required during initialization
 0264/4		2.11+		handover_offset		Offset of handover entry point
+0268/4		2.15+		kernel_info_offset	Offset of the kernel_info
 ===========	========	=====================	============================================
 
 .. note::
@@ -855,6 +873,109 @@ Offset/size:	0x264/4
 
   See EFI HANDOVER PROTOCOL below for more details.
 
+============	==================
+Field name:	kernel_info_offset
+Type:		read
+Offset/size:	0x268/4
+Protocol:	2.15+
+============	==================
+
+  This field is the offset from the beginning of the kernel image to the
+  kernel_info. The kernel_info structure is embedded in the Linux image in
+  the uncompressed protected mode region.
+
+
+The kernel_info
+===============
+
+The relationships between the headers are analogous to the various data
+sections:
+
+  setup_header = .data
+  boot_params/setup_data = .bss
+
+What is missing from the above list? That's right:
+
+  kernel_info = .rodata
+
+We have been (ab)using .data for things that could go into .rodata or .bss for
+a long time, for lack of alternatives and -- especially early on -- inertia.
+Also, the BIOS stub is responsible for creating boot_params, so it isn't
+available to a BIOS-based loader (setup_data is, though).
+
+setup_header is permanently limited to 144 bytes due to the reach of the
+2-byte jump field, which doubles as a length field for the structure, combined
+with the size of the "hole" in struct boot_params that a protected-mode loader
+or the BIOS stub has to copy it into. It is currently 119 bytes long, which
+leaves us with 25 very precious bytes. This isn't something that can be fixed
+without revising the boot protocol entirely, breaking backwards compatibility.
+
+boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
+by adding setup_data entries. It cannot be used to communicate properties of
+the kernel image, because it is .bss and has no image-provided content.
+
+kernel_info solves this by providing an extensible place for information about
+the kernel image. It is readonly, because the kernel cannot rely on a
+bootloader copying its contents anywhere, but that is OK; if it becomes
+necessary it can still contain data items that an enabled bootloader would be
+expected to copy into a setup_data chunk.
+
+All kernel_info data should be part of this structure. Fixed size data have to
+be put before the kernel_info_var_len_data label. Variable size data have to be
+put after the kernel_info_var_len_data label. Each chunk of variable size data
+has to be prefixed with a header/magic and its size, e.g.:
+
+  kernel_info:
+          .ascii  "LToP"          /* Header, Linux top (structure). */
+          .long   kernel_info_var_len_data - kernel_info
+          .long   kernel_info_end - kernel_info
+          .long   0x01234567      /* Some fixed size data for the bootloaders. */
+  kernel_info_var_len_data:
+  example_struct:                 /* Some variable size data for the bootloaders. */
+          .ascii  "EsTT"          /* Header/Magic. */
+          .long   example_struct_end - example_struct
+          .ascii  "Struct"
+          .long   0x89012345
+  example_struct_end:
+  example_strings:                /* Some variable size data for the bootloaders. */
+          .ascii  "EsTs"          /* Header/Magic. */
+          .long   example_strings_end - example_strings
+          .asciz  "String_0"
+          .asciz  "String_1"
+  example_strings_end:
+  kernel_info_end:
+
+This way the kernel_info is a self-contained blob.
+
+
+Details of the kernel_info Fields
+=================================
+
+============	========
+Field name:	header
+Offset/size:	0x0000/4
+============	========
+
+  Contains the magic number "LToP" (0x506f544c).
+
+============	========
+Field name:	size
+Offset/size:	0x0004/4
+============	========
+
+  This field contains the size of the kernel_info including kernel_info.header.
+  It does not count kernel_info.kernel_info_var_len_data size. This field should be
+  used by the bootloaders to detect supported fixed size fields in the kernel_info
+  and beginning of kernel_info.kernel_info_var_len_data.
+
+============	========
+Field name:	size_total
+Offset/size:	0x0008/4
+============	========
+
+  This field contains the size of the kernel_info including kernel_info.header
+  and kernel_info.kernel_info_var_len_data.
+
 
 The Image Checksum
 ==================
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index e2839b5c246c..c30a9b642a86 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -87,7 +87,7 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
 
 SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
 
-sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|_end\|_ehead\|_text\|z_.*\)$$/\#define ZO_\2 0x\1/p'
+sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|kernel_info\|_end\|_ehead\|_text\|z_.*\)$$/\#define ZO_\2 0x\1/p'
 
 quiet_cmd_zoffset = ZOFFSET $@
       cmd_zoffset = $(NM) $< | sed -n $(sed-zoffset) > $@
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 6b84afdd7538..fad3b18e2cc3 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -72,8 +72,8 @@ $(obj)/../voffset.h: vmlinux FORCE
 
 $(obj)/misc.o: $(obj)/../voffset.h
 
-vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
-	$(obj)/string.o $(obj)/cmdline.o $(obj)/error.o \
+vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/kernel_info.o $(obj)/head_$(BITS).o \
+	$(obj)/misc.o $(obj)/string.o $(obj)/cmdline.o $(obj)/error.o \
 	$(obj)/piggy.o $(obj)/cpuflags.o
 
 vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
diff --git a/arch/x86/boot/compressed/kernel_info.S b/arch/x86/boot/compressed/kernel_info.S
new file mode 100644
index 000000000000..8ea6f6e3feef
--- /dev/null
+++ b/arch/x86/boot/compressed/kernel_info.S
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+	.section ".rodata.kernel_info", "a"
+
+	.global kernel_info
+
+kernel_info:
+	/* Header, Linux top (structure). */
+	.ascii	"LToP"
+	/* Size. */
+	.long	kernel_info_var_len_data - kernel_info
+	/* Size total. */
+	.long	kernel_info_end - kernel_info
+
+kernel_info_var_len_data:
+	/* Empty for time being... */
+kernel_info_end:
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 2c11c0f45d49..22dcecaaa898 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -567,6 +567,7 @@ pref_address:		.quad LOAD_PHYSICAL_ADDR	# preferred load addr
 
 init_size:		.long INIT_SIZE		# kernel initialization size
 handover_offset:	.long 0			# Filled in by build.c
+kernel_info_offset:	.long 0			# Filled in by build.c
 
 # End of setup header #####################################################
 
diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c
index a93d44e58f9c..55e669d29e54 100644
--- a/arch/x86/boot/tools/build.c
+++ b/arch/x86/boot/tools/build.c
@@ -56,6 +56,7 @@ u8 buf[SETUP_SECT_MAX*512];
 unsigned long efi32_stub_entry;
 unsigned long efi64_stub_entry;
 unsigned long efi_pe_entry;
+unsigned long kernel_info;
 unsigned long startup_64;
 
 /*----------------------------------------------------------------------*/
@@ -321,6 +322,7 @@ static void parse_zoffset(char *fname)
 		PARSE_ZOFS(p, efi32_stub_entry);
 		PARSE_ZOFS(p, efi64_stub_entry);
 		PARSE_ZOFS(p, efi_pe_entry);
+		PARSE_ZOFS(p, kernel_info);
 		PARSE_ZOFS(p, startup_64);
 
 		p = strchr(p, '\n');
@@ -410,6 +412,9 @@ int main(int argc, char ** argv)
 
 	efi_stub_entry_update();
 
+	/* Update kernel_info offset. */
+	put_unaligned_le32(kernel_info, &buf[0x268]);
+
 	crc = partial_crc32(buf, i, crc);
 	if (fwrite(buf, 1, i, dest) != i)
 		die("Writing setup failed");
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index c895df5482c5..a1ebcd7a991c 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -88,6 +88,7 @@ struct setup_header {
 	__u64	pref_address;
 	__u32	init_size;
 	__u32	handover_offset;
+	__u32	kernel_info_offset;
 } __attribute__((packed));
 
 struct sys_desc_table {
-- 
2.11.0



* [PATCH v3 2/3] x86/boot: Introduce the kernel_info.setup_type_max
  2019-10-09 10:53 [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
  2019-10-09 10:53 ` [PATCH v3 1/3] x86/boot: Introduce the kernel_info Daniel Kiper
@ 2019-10-09 10:53 ` Daniel Kiper
  2019-10-09 10:53 ` [PATCH v3 3/3] x86/boot: Introduce the setup_indirect Daniel Kiper
  2019-10-16 11:06 ` [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
  3 siblings, 0 replies; 10+ messages in thread
From: Daniel Kiper @ 2019-10-09 10:53 UTC
  To: linux-efi, linux-kernel, x86, xen-devel
  Cc: ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, hpa, jgross, konrad.wilk, mingo,
	ross.philipson, tglx

This field contains the maximal allowed type for setup_data.

Now bump the setup_header version in arch/x86/boot/header.S.

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
---
 Documentation/x86/boot.rst             | 9 ++++++++-
 arch/x86/boot/compressed/kernel_info.S | 5 +++++
 arch/x86/boot/header.S                 | 2 +-
 arch/x86/include/uapi/asm/bootparam.h  | 3 +++
 4 files changed, 17 insertions(+), 2 deletions(-)
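
As a rough illustration of how a boot loader could consume the new field, the
sketch below first uses kernel_info.size to check that setup_type_max is
present (kernel_info.S places it right after size_total, i.e. in bytes 12..15)
and then tests a setup_data type against it. The helper names are hypothetical,
and the way the SETUP_INDIRECT bit from patch 3/3 is split off before the
comparison is an assumed interpretation of "maximal allowed type", not
something this series defines.

```c
#include <stdbool.h>
#include <stdint.h>

#define SETUP_INDIRECT			(1U << 31)	/* from patch 3/3 */
#define KERNEL_INFO_SETUP_TYPE_MAX_END	16U		/* field is bytes 12..15 */

/* True if the fixed size part is long enough to contain setup_type_max. */
bool kernel_info_has_setup_type_max(uint32_t kernel_info_size)
{
	return kernel_info_size >= KERNEL_INFO_SETUP_TYPE_MAX_END;
}

/*
 * Assumed interpretation: the low bits are compared as a plain maximum,
 * and indirect types are accepted only if setup_type_max itself carries
 * the SETUP_INDIRECT bit.
 */
bool setup_type_allowed(uint32_t type, uint32_t setup_type_max)
{
	uint32_t base = type & ~SETUP_INDIRECT;
	uint32_t base_max = setup_type_max & ~SETUP_INDIRECT;

	if ((type & SETUP_INDIRECT) && !(setup_type_max & SETUP_INDIRECT))
		return false;

	return base <= base_max;
}
```

A plain "type <= setup_type_max" comparison would accept an unknown direct
type such as 7 once the maximum has the top bit set, which is why the sketch
compares the two parts separately.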

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index d5323a39f5e3..4c536bc8816d 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -73,7 +73,7 @@ Protocol 2.14:	BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c188
 		(x86/boot: Add ACPI RSDP address to setup_header)
 		DO NOT USE!!! ASSUME SAME AS 2.13.
 
-Protocol 2.15:	(Kernel 5.5) Added the kernel_info.
+Protocol 2.15:	(Kernel 5.5) Added the kernel_info and kernel_info.setup_type_max.
 =============	============================================================
 
 .. note::
@@ -976,6 +976,13 @@ Offset/size:	0x0008/4
   This field contains the size of the kernel_info including kernel_info.header
   and kernel_info.kernel_info_var_len_data.
 
+============	==============
+Field name:	setup_type_max
+Offset/size:	0x000c/4
+============	==============
+
+  This field contains the maximal allowed type for setup_data and setup_indirect structs.
+
 
 The Image Checksum
 ==================
diff --git a/arch/x86/boot/compressed/kernel_info.S b/arch/x86/boot/compressed/kernel_info.S
index 8ea6f6e3feef..f818ee8fba38 100644
--- a/arch/x86/boot/compressed/kernel_info.S
+++ b/arch/x86/boot/compressed/kernel_info.S
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#include <asm/bootparam.h>
+
 	.section ".rodata.kernel_info", "a"
 
 	.global kernel_info
@@ -12,6 +14,9 @@ kernel_info:
 	/* Size total. */
 	.long	kernel_info_end - kernel_info
 
+	/* Maximal allowed type for setup_data and setup_indirect structs. */
+	.long	SETUP_TYPE_MAX
+
 kernel_info_var_len_data:
 	/* Empty for time being... */
 kernel_info_end:
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 22dcecaaa898..97d9b6d6c1af 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -300,7 +300,7 @@ _start:
 	# Part 2 of the header, from the old setup.S
 
 		.ascii	"HdrS"		# header signature
-		.word	0x020d		# header version number (>= 0x0105)
+		.word	0x020f		# header version number (>= 0x0105)
 					# or else old loadlin-1.5 will fail)
 		.globl realmode_swtch
 realmode_swtch:	.word	0, 0		# default_switch, SETUPSEG
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index a1ebcd7a991c..dbb41128e5a0 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -11,6 +11,9 @@
 #define SETUP_APPLE_PROPERTIES		5
 #define SETUP_JAILHOUSE			6
 
+/* max(SETUP_*) */
+#define SETUP_TYPE_MAX			SETUP_JAILHOUSE
+
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK	0x07FF
 #define RAMDISK_PROMPT_FLAG		0x8000
-- 
2.11.0



* [PATCH v3 3/3] x86/boot: Introduce the setup_indirect
  2019-10-09 10:53 [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
  2019-10-09 10:53 ` [PATCH v3 1/3] x86/boot: Introduce the kernel_info Daniel Kiper
  2019-10-09 10:53 ` [PATCH v3 2/3] x86/boot: Introduce the kernel_info.setup_type_max Daniel Kiper
@ 2019-10-09 10:53 ` Daniel Kiper
  2019-10-16 11:06 ` [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
  3 siblings, 0 replies; 10+ messages in thread
From: Daniel Kiper @ 2019-10-09 10:53 UTC
  To: linux-efi, linux-kernel, x86, xen-devel
  Cc: ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, hpa, jgross, konrad.wilk, mingo,
	ross.philipson, tglx

The setup_data is a bit awkward to use for extremely large data objects,
both because the setup_data header has to be adjacent to the data object
and because it has a 32-bit length field. However, it is important that
intermediate stages of the boot process have a way to identify which
chunks of memory are occupied by kernel data. Thus we introduce a uniform
way to specify such indirect data: the setup_indirect struct and the
SETUP_INDIRECT type.

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
---
v3 - suggestions/fixes:
   - add setup_indirect mapping/KASLR avoidance/etc. code
     (suggested by H. Peter Anvin),
   - SETUP_INDIRECT now sets the most significant bit;
     this way it is possible to differentiate regular setup_data
     and setup_indirect objects in the debugfs filesystem.

v2 - suggestions/fixes:
   - add setup_indirect usage example
     (suggested by Eric Snowberg and Ross Philipson).
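
To make the boot.rst example below more concrete, here is a minimal sketch of
how a boot loader might fill in such a node. It is not part of this series:
struct setup_data_hdr, struct setup_indirect_node and setup_indirect_fill()
are hypothetical names, and the structs are local copies using stdint types.
The outer len is set to the payload size, sizeof(struct setup_indirect), which
matches how e820__reserve_setup_data() reserves sizeof(*data) + data->len.

```c
#include <stdint.h>
#include <string.h>

#define SETUP_E820_EXT	1
#define SETUP_INDIRECT	(1U << 31)

/* Local copy of the fixed part of struct setup_data (without data[]). */
struct setup_data_hdr {
	uint64_t next;
	uint32_t type;
	uint32_t len;
};

/* Local copy of struct setup_indirect from this patch. */
struct setup_indirect {
	uint32_t type;
	uint32_t reserved;	/* Reserved, must be set to zero. */
	uint64_t len;
	uint64_t addr;
};

/* Hypothetical container: header immediately followed by its payload. */
struct setup_indirect_node {
	struct setup_data_hdr hdr;
	struct setup_indirect ind;
};

/*
 * Fill a node pointing at len bytes of inner_type data at addr.
 * Returns -1 for SETUP_NONE and for SETUP_INDIRECT itself, since
 * neither can be expressed as an indirect object.
 */
int setup_indirect_fill(struct setup_indirect_node *node, uint32_t inner_type,
			uint64_t addr, uint64_t len, uint64_t next)
{
	if ((inner_type & ~SETUP_INDIRECT) == 0)
		return -1;

	memset(node, 0, sizeof(*node));
	node->hdr.next = next;
	node->hdr.type = SETUP_INDIRECT;
	node->hdr.len = sizeof(node->ind);
	node->ind.type = SETUP_INDIRECT | inner_type;
	node->ind.len = len;
	node->ind.addr = addr;

	return 0;
}
```

A loader would then link the node into the setup_data list as usual, for
example setup_indirect_fill(&node, SETUP_E820_EXT, e820_addr, e820_len, 0)
with e820_addr and e820_len standing in for the real buffer.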
---
 Documentation/x86/boot.rst            | 40 +++++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/kaslr.c      | 12 +++++++++++
 arch/x86/include/uapi/asm/bootparam.h | 16 +++++++++++---
 arch/x86/kernel/e820.c                | 11 ++++++++++
 arch/x86/kernel/kdebugfs.c            | 20 ++++++++++++++----
 arch/x86/kernel/ksysfs.c              | 30 ++++++++++++++++++++------
 arch/x86/kernel/setup.c               |  4 ++++
 arch/x86/mm/ioremap.c                 | 11 ++++++++++
 8 files changed, 130 insertions(+), 14 deletions(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 4c536bc8816d..d6d03b00b594 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -827,6 +827,46 @@ Protocol:	2.09+
   sure to consider the case where the linked list already contains
   entries.
 
+  The setup_data is a bit awkward to use for extremely large data objects,
+  both because the setup_data header has to be adjacent to the data object
+  and because it has a 32-bit length field. However, it is important that
+  intermediate stages of the boot process have a way to identify which
+  chunks of memory are occupied by kernel data.
+
+  Thus the setup_indirect struct and the SETUP_INDIRECT type were introduced
+  in protocol 2.15.
+
+  struct setup_indirect {
+    __u32 type;
+    __u32 reserved;  /* Reserved, must be set to zero. */
+    __u64 len;
+    __u64 addr;
+  };
+
+  The type member is a SETUP_INDIRECT | SETUP_* type. However, it cannot be
+  SETUP_INDIRECT itself since making the setup_indirect a tree structure
+  could require a lot of stack space in something that needs to parse it
+  and stack space can be limited in boot contexts.
+
+  Here is an example of how to point to SETUP_E820_EXT data using
+  setup_indirect. In this case setup_data and setup_indirect look like this:
+
+  struct setup_data {
+    __u64 next = 0 or <addr_of_next_setup_data_struct>;
+    __u32 type = SETUP_INDIRECT;
+    __u32 len = sizeof(setup_indirect);
+    __u8 data[sizeof(setup_indirect)] = struct setup_indirect {
+      __u32 type = SETUP_INDIRECT | SETUP_E820_EXT;
+      __u32 reserved = 0;
+      __u64 len = <len_of_SETUP_E820_EXT_data>;
+      __u64 addr = <addr_of_SETUP_E820_EXT_data>;
+    }
+  }
+
+  Note: SETUP_INDIRECT | SETUP_NONE objects cannot be properly distinguished
+        from SETUP_INDIRECT itself. So, objects of this kind cannot be provided
+        by the bootloaders.
+
 ============	============
 Field name:	pref_address
 Type:		read (reloc)
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 2e53c056ba20..bb9bfef174ae 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -459,6 +459,18 @@ static bool mem_avoid_overlap(struct mem_vector *img,
 			is_overlapping = true;
 		}
 
+		if (ptr->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)ptr->data)->type != SETUP_INDIRECT) {
+			avoid.start = ((struct setup_indirect *)ptr->data)->addr;
+			avoid.size = ((struct setup_indirect *)ptr->data)->len;
+
+			if (mem_overlaps(img, &avoid) && (avoid.start < earliest)) {
+				*overlap = avoid;
+				earliest = overlap->start;
+				is_overlapping = true;
+			}
+		}
+
 		ptr = (struct setup_data *)(unsigned long)ptr->next;
 	}
 
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index dbb41128e5a0..949066b5398a 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_X86_BOOTPARAM_H
 #define _ASM_X86_BOOTPARAM_H
 
-/* setup_data types */
+/* setup_data/setup_indirect types */
 #define SETUP_NONE			0
 #define SETUP_E820_EXT			1
 #define SETUP_DTB			2
@@ -11,8 +11,10 @@
 #define SETUP_APPLE_PROPERTIES		5
 #define SETUP_JAILHOUSE			6
 
-/* max(SETUP_*) */
-#define SETUP_TYPE_MAX			SETUP_JAILHOUSE
+#define SETUP_INDIRECT			(1<<31)
+
+/* SETUP_INDIRECT | max(SETUP_*) */
+#define SETUP_TYPE_MAX			(SETUP_INDIRECT | SETUP_JAILHOUSE)
 
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK	0x07FF
@@ -52,6 +54,14 @@ struct setup_data {
 	__u8 data[0];
 };
 
+/* extensible setup indirect data node */
+struct setup_indirect {
+	__u32 type;
+	__u32 reserved;  /* Reserved, must be set to zero. */
+	__u64 len;
+	__u64 addr;
+};
+
 struct setup_header {
 	__u8	setup_sects;
 	__u16	root_flags;
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 7da2bcd2b8eb..0bfe9a685b3b 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -999,6 +999,17 @@ void __init e820__reserve_setup_data(void)
 		data = early_memremap(pa_data, sizeof(*data));
 		e820__range_update(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
 		e820__range_update_kexec(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+			e820__range_update(((struct setup_indirect *)data->data)->addr,
+					   ((struct setup_indirect *)data->data)->len,
+					   E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+			e820__range_update_kexec(((struct setup_indirect *)data->data)->addr,
+						 ((struct setup_indirect *)data->data)->len,
+						 E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+		}
+
 		pa_data = data->next;
 		early_memunmap(data, sizeof(*data));
 	}
diff --git a/arch/x86/kernel/kdebugfs.c b/arch/x86/kernel/kdebugfs.c
index edaa30b20841..701a98300f86 100644
--- a/arch/x86/kernel/kdebugfs.c
+++ b/arch/x86/kernel/kdebugfs.c
@@ -44,7 +44,11 @@ static ssize_t setup_data_read(struct file *file, char __user *user_buf,
 	if (count > node->len - pos)
 		count = node->len - pos;
 
-	pa = node->paddr + sizeof(struct setup_data) + pos;
+	pa = node->paddr + pos;
+
+	if (!(node->type & SETUP_INDIRECT) || node->type == SETUP_INDIRECT)
+		pa += sizeof(struct setup_data);
+
 	p = memremap(pa, count, MEMREMAP_WB);
 	if (!p)
 		return -ENOMEM;
@@ -108,9 +112,17 @@ static int __init create_setup_data_nodes(struct dentry *parent)
 			goto err_dir;
 		}
 
-		node->paddr = pa_data;
-		node->type = data->type;
-		node->len = data->len;
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+			node->paddr = ((struct setup_indirect *)data->data)->addr;
+			node->type = ((struct setup_indirect *)data->data)->type;
+			node->len = ((struct setup_indirect *)data->data)->len;
+		} else {
+			node->paddr = pa_data;
+			node->type = data->type;
+			node->len = data->len;
+		}
+
 		create_setup_data_node(d, no, node);
 		pa_data = data->next;
 
diff --git a/arch/x86/kernel/ksysfs.c b/arch/x86/kernel/ksysfs.c
index 7969da939213..14ef8121aa53 100644
--- a/arch/x86/kernel/ksysfs.c
+++ b/arch/x86/kernel/ksysfs.c
@@ -100,7 +100,11 @@ static int __init get_setup_data_size(int nr, size_t *size)
 		if (!data)
 			return -ENOMEM;
 		if (nr == i) {
-			*size = data->len;
+			if (data->type == SETUP_INDIRECT &&
+			    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT)
+				*size = ((struct setup_indirect *)data->data)->len;
+			else
+				*size = data->len;
 			memunmap(data);
 			return 0;
 		}
@@ -130,7 +134,10 @@ static ssize_t type_show(struct kobject *kobj,
 	if (!data)
 		return -ENOMEM;
 
-	ret = sprintf(buf, "0x%x\n", data->type);
+	if (data->type == SETUP_INDIRECT)
+		ret = sprintf(buf, "0x%x\n", ((struct setup_indirect *)data->data)->type);
+	else
+		ret = sprintf(buf, "0x%x\n", data->type);
 	memunmap(data);
 	return ret;
 }
@@ -142,7 +149,7 @@ static ssize_t setup_data_data_read(struct file *fp,
 				    loff_t off, size_t count)
 {
 	int nr, ret = 0;
-	u64 paddr;
+	u64 paddr, len;
 	struct setup_data *data;
 	void *p;
 
@@ -157,19 +164,28 @@ static ssize_t setup_data_data_read(struct file *fp,
 	if (!data)
 		return -ENOMEM;
 
-	if (off > data->len) {
+	if (data->type == SETUP_INDIRECT &&
+	    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+		paddr = ((struct setup_indirect *)data->data)->addr;
+		len = ((struct setup_indirect *)data->data)->len;
+	} else {
+		paddr += sizeof(*data);
+		len = data->len;
+	}
+
+	if (off > len) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	if (count > data->len - off)
-		count = data->len - off;
+	if (count > len - off)
+		count = len - off;
 
 	if (!count)
 		goto out;
 
 	ret = count;
-	p = memremap(paddr + sizeof(*data), data->len, MEMREMAP_WB);
+	p = memremap(paddr, len, MEMREMAP_WB);
 	if (!p) {
 		ret = -ENOMEM;
 		goto out;
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 77ea96b794bd..4603702dbfc1 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -438,6 +438,10 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 	while (pa_data) {
 		data = early_memremap(pa_data, sizeof(*data));
 		memblock_reserve(pa_data, sizeof(*data) + data->len);
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT)
+			memblock_reserve(((struct setup_indirect *)data->data)->addr,
+					 ((struct setup_indirect *)data->data)->len);
 		pa_data = data->next;
 		early_memunmap(data, sizeof(*data));
 	}
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index a39dcdb5ae34..1ff9c2030b4f 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -626,6 +626,17 @@ static bool memremap_is_setup_data(resource_size_t phys_addr,
 		paddr_next = data->next;
 		len = data->len;
 
+		if ((phys_addr > paddr) && (phys_addr < (paddr + len))) {
+			memunmap(data);
+			return true;
+		}
+
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+			paddr = ((struct setup_indirect *)data->data)->addr;
+			len = ((struct setup_indirect *)data->data)->len;
+		}
+
 		memunmap(data);
 
 		if ((phys_addr > paddr) && (phys_addr < (paddr + len)))
-- 
2.11.0



* Re: [PATCH v3 1/3] x86/boot: Introduce the kernel_info
  2019-10-09 10:53 ` [PATCH v3 1/3] x86/boot: Introduce the kernel_info Daniel Kiper
@ 2019-10-10  0:43   ` Randy Dunlap
  2019-10-10  9:43     ` Daniel Kiper
  0 siblings, 1 reply; 10+ messages in thread
From: Randy Dunlap @ 2019-10-10  0:43 UTC (permalink / raw)
  To: Daniel Kiper, linux-efi, linux-kernel, x86, xen-devel
  Cc: ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, hpa, jgross, konrad.wilk, mingo,
	ross.philipson, tglx

Hi,

Questions and comments below...
Thanks.


On 10/9/19 3:53 AM, Daniel Kiper wrote:

> Suggested-by: H. Peter Anvin <hpa@zytor.com>
> Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
> ---

> ---
>  Documentation/x86/boot.rst             | 121 +++++++++++++++++++++++++++++++++
>  arch/x86/boot/Makefile                 |   2 +-
>  arch/x86/boot/compressed/Makefile      |   4 +-
>  arch/x86/boot/compressed/kernel_info.S |  17 +++++
>  arch/x86/boot/header.S                 |   1 +
>  arch/x86/boot/tools/build.c            |   5 ++
>  arch/x86/include/uapi/asm/bootparam.h  |   1 +
>  7 files changed, 148 insertions(+), 3 deletions(-)
>  create mode 100644 arch/x86/boot/compressed/kernel_info.S
> 
> diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
> index 08a2f100c0e6..d5323a39f5e3 100644
> --- a/Documentation/x86/boot.rst
> +++ b/Documentation/x86/boot.rst
> @@ -68,8 +68,25 @@ Protocol 2.12	(Kernel 3.8) Added the xloadflags field and extension fields
>  Protocol 2.13	(Kernel 3.14) Support 32- and 64-bit flags being set in
>  		xloadflags to support booting a 64-bit kernel from 32-bit
>  		EFI
> +
> +Protocol 2.14:	BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c1889
> +		(x86/boot: Add ACPI RSDP address to setup_header)
> +		DO NOT USE!!! ASSUME SAME AS 2.13.
> +
> +Protocol 2.15:	(Kernel 5.5) Added the kernel_info.
>  =============	============================================================
>  
> +.. note::
> +     The protocol version number should be changed only if the setup header
> +     is changed. There is no need to update the version number if boot_params
> +     or kernel_info are changed. Additionally, it is recommended to use
> +     xloadflags (in this case the protocol version number should not be
> +     updated either) or kernel_info to communicate supported Linux kernel
> +     features to the boot loader. Due to very limited space available in
> +     the original setup header every update to it should be considered
> +     with great care. Starting from the protocol 2.15 the primary way to
> +     communicate things to the boot loader is the kernel_info.
> +
>  
>  Memory Layout
>  =============
> @@ -207,6 +224,7 @@ Offset/Size	Proto		Name			Meaning
>  0258/8		2.10+		pref_address		Preferred loading address
>  0260/4		2.10+		init_size		Linear memory required during initialization
>  0264/4		2.11+		handover_offset		Offset of handover entry point
> +0268/4		2.15+		kernel_info_offset	Offset of the kernel_info
>  ===========	========	=====================	============================================
>  
>  .. note::
> @@ -855,6 +873,109 @@ Offset/size:	0x264/4
>  
>    See EFI HANDOVER PROTOCOL below for more details.
>  
> +============	==================
> +Field name:	kernel_info_offset
> +Type:		read
> +Offset/size:	0x268/4
> +Protocol:	2.15+
> +============	==================
> +
> +  This field is the offset from the beginning of the kernel image to the
> +  kernel_info. It is embedded in the Linux image in the uncompressed
                  ^^
   What does      It   refer to, please?

> +  protected mode region.
> +
> +
> +The kernel_info
> +===============
> +
> +The relationships between the headers are analogous to the various data
> +sections:
> +
> +  setup_header = .data
> +  boot_params/setup_data = .bss
> +
> +What is missing from the above list? That's right:
> +
> +  kernel_info = .rodata
> +
> +We have been (ab)using .data for things that could go into .rodata or .bss for
> +a long time, for lack of alternatives and -- especially early on -- inertia.
> +Also, the BIOS stub is responsible for creating boot_params, so it isn't
> +available to a BIOS-based loader (setup_data is, though).
> +
> +setup_header is permanently limited to 144 bytes due to the reach of the
> +2-byte jump field, which doubles as a length field for the structure, combined
> +with the size of the "hole" in struct boot_params that a protected-mode loader
> +or the BIOS stub has to copy it into. It is currently 119 bytes long, which
> +leaves us with 25 very precious bytes. This isn't something that can be fixed
> +without revising the boot protocol entirely, breaking backwards compatibility.
> +
> +boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
> +by adding setup_data entries. It cannot be used to communicate properties of
> +the kernel image, because it is .bss and has no image-provided content.
> +
> +kernel_info solves this by providing an extensible place for information about
> +the kernel image. It is readonly, because the kernel cannot rely on a
> +bootloader copying its contents anywhere, but that is OK; if it becomes
> +necessary it can still contain data items that an enabled bootloader would be
> +expected to copy into a setup_data chunk.
> +
> +All kernel_info data should be part of this structure. Fixed size data have to
> +be put before kernel_info_var_len_data label. Variable size data have to be put
> +behind kernel_info_var_len_data label. Each chunk of variable size data has to

   s/behind/after/

> +be prefixed with header/magic and its size, e.g.:
> +
> +  kernel_info:
> +          .ascii  "LToP"          /* Header, Linux top (structure). */
> +          .long   kernel_info_var_len_data - kernel_info
> +          .long   kernel_info_end - kernel_info
> +          .long   0x01234567      /* Some fixed size data for the bootloaders. */
> +  kernel_info_var_len_data:
> +  example_struct:                 /* Some variable size data for the bootloaders. */
> +          .ascii  "EsTT"          /* Header/Magic. */
> +          .long   example_struct_end - example_struct
> +          .ascii  "Struct"
> +          .long   0x89012345
> +  example_struct_end:
> +  example_strings:                /* Some variable size data for the bootloaders. */
> +          .ascii  "EsTs"          /* Header/Magic. */

Where do the Magic values "EsTT" and "EsTs" come from?
where are they defined?

> +          .long   example_strings_end - example_strings
> +          .asciz  "String_0"
> +          .asciz  "String_1"
> +  example_strings_end:
> +  kernel_info_end:
> +
> +This way the kernel_info is self-contained blob.
> +
> +
> +Details of the kernel_info Fields
> +=================================
> +
> +============	========
> +Field name:	header
> +Offset/size:	0x0000/4
> +============	========
> +
> +  Contains the magic number "LToP" (0x506f544c).
> +
> +============	========
> +Field name:	size
> +Offset/size:	0x0004/4
> +============	========
> +
> +  This field contains the size of the kernel_info including kernel_info.header.
> +  It does not count kernel_info.kernel_info_var_len_data size. This field should be
> +  used by the bootloaders to detect supported fixed size fields in the kernel_info
> +  and beginning of kernel_info.kernel_info_var_len_data.
> +
> +============	========
> +Field name:	size_total
> +Offset/size:	0x0008/4
> +============	========
> +
> +  This field contains the size of the kernel_info including kernel_info.header
> +  and kernel_info.kernel_info_var_len_data.
> +
>  
>  The Image Checksum
>  ==================


-- 
~Randy

* Re: [PATCH v3 1/3] x86/boot: Introduce the kernel_info
  2019-10-10  0:43   ` Randy Dunlap
@ 2019-10-10  9:43     ` Daniel Kiper
  2019-10-10 14:43       ` Randy Dunlap
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Kiper @ 2019-10-10  9:43 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-efi, linux-kernel, x86, xen-devel, ard.biesheuvel,
	boris.ostrovsky, bp, corbet, dave.hansen, luto, peterz,
	eric.snowberg, hpa, jgross, konrad.wilk, mingo, ross.philipson,
	tglx

On Wed, Oct 09, 2019 at 05:43:31PM -0700, Randy Dunlap wrote:
> Hi,
>
> Questions and comments below...
> Thanks.
>
> On 10/9/19 3:53 AM, Daniel Kiper wrote:
>
> > Suggested-by: H. Peter Anvin <hpa@zytor.com>
> > Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
> > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
> > ---
>
> > ---
> >  Documentation/x86/boot.rst             | 121 +++++++++++++++++++++++++++++++++
> >  arch/x86/boot/Makefile                 |   2 +-
> >  arch/x86/boot/compressed/Makefile      |   4 +-
> >  arch/x86/boot/compressed/kernel_info.S |  17 +++++
> >  arch/x86/boot/header.S                 |   1 +
> >  arch/x86/boot/tools/build.c            |   5 ++
> >  arch/x86/include/uapi/asm/bootparam.h  |   1 +
> >  7 files changed, 148 insertions(+), 3 deletions(-)
> >  create mode 100644 arch/x86/boot/compressed/kernel_info.S
> >
> > diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
> > index 08a2f100c0e6..d5323a39f5e3 100644
> > --- a/Documentation/x86/boot.rst
> > +++ b/Documentation/x86/boot.rst
> > @@ -68,8 +68,25 @@ Protocol 2.12	(Kernel 3.8) Added the xloadflags field and extension fields
> >  Protocol 2.13	(Kernel 3.14) Support 32- and 64-bit flags being set in
> >  		xloadflags to support booting a 64-bit kernel from 32-bit
> >  		EFI
> > +
> > +Protocol 2.14:	BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c1889
> > +		(x86/boot: Add ACPI RSDP address to setup_header)
> > +		DO NOT USE!!! ASSUME SAME AS 2.13.
> > +
> > +Protocol 2.15:	(Kernel 5.5) Added the kernel_info.
> >  =============	============================================================
> >
> > +.. note::
> > +     The protocol version number should be changed only if the setup header
> > +     is changed. There is no need to update the version number if boot_params
> > +     or kernel_info are changed. Additionally, it is recommended to use
> > +     xloadflags (in this case the protocol version number should not be
> > +     updated either) or kernel_info to communicate supported Linux kernel
> > +     features to the boot loader. Due to very limited space available in
> > +     the original setup header every update to it should be considered
> > +     with great care. Starting from the protocol 2.15 the primary way to
> > +     communicate things to the boot loader is the kernel_info.
> > +
> >
> >  Memory Layout
> >  =============
> > @@ -207,6 +224,7 @@ Offset/Size	Proto		Name			Meaning
> >  0258/8		2.10+		pref_address		Preferred loading address
> >  0260/4		2.10+		init_size		Linear memory required during initialization
> >  0264/4		2.11+		handover_offset		Offset of handover entry point
> > +0268/4		2.15+		kernel_info_offset	Offset of the kernel_info
> >  ===========	========	=====================	============================================
> >
> >  .. note::
> > @@ -855,6 +873,109 @@ Offset/size:	0x264/4
> >
> >    See EFI HANDOVER PROTOCOL below for more details.
> >
> > +============	==================
> > +Field name:	kernel_info_offset
> > +Type:		read
> > +Offset/size:	0x268/4
> > +Protocol:	2.15+
> > +============	==================
> > +
> > +  This field is the offset from the beginning of the kernel image to the
> > +  kernel_info. It is embedded in the Linux image in the uncompressed
>                   ^^
>    What does      It   refer to, please?

s/It/The kernel_info structure/ Is it better?

> > +  protected mode region.
> > +
> > +
> > +The kernel_info
> > +===============
> > +
> > +The relationships between the headers are analogous to the various data
> > +sections:
> > +
> > +  setup_header = .data
> > +  boot_params/setup_data = .bss
> > +
> > +What is missing from the above list? That's right:
> > +
> > +  kernel_info = .rodata
> > +
> > +We have been (ab)using .data for things that could go into .rodata or .bss for
> > +a long time, for lack of alternatives and -- especially early on -- inertia.
> > +Also, the BIOS stub is responsible for creating boot_params, so it isn't
> > +available to a BIOS-based loader (setup_data is, though).
> > +
> > +setup_header is permanently limited to 144 bytes due to the reach of the
> > +2-byte jump field, which doubles as a length field for the structure, combined
> > +with the size of the "hole" in struct boot_params that a protected-mode loader
> > +or the BIOS stub has to copy it into. It is currently 119 bytes long, which
> > +leaves us with 25 very precious bytes. This isn't something that can be fixed
> > +without revising the boot protocol entirely, breaking backwards compatibility.
> > +
> > +boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
> > +by adding setup_data entries. It cannot be used to communicate properties of
> > +the kernel image, because it is .bss and has no image-provided content.
> > +
> > +kernel_info solves this by providing an extensible place for information about
> > +the kernel image. It is readonly, because the kernel cannot rely on a
> > +bootloader copying its contents anywhere, but that is OK; if it becomes
> > +necessary it can still contain data items that an enabled bootloader would be
> > +expected to copy into a setup_data chunk.
> > +
> > +All kernel_info data should be part of this structure. Fixed size data have to
> > +be put before kernel_info_var_len_data label. Variable size data have to be put
> > +behind kernel_info_var_len_data label. Each chunk of variable size data has to
>
>    s/behind/after/

OK.

> > +be prefixed with header/magic and its size, e.g.:
> > +
> > +  kernel_info:
> > +          .ascii  "LToP"          /* Header, Linux top (structure). */
> > +          .long   kernel_info_var_len_data - kernel_info
> > +          .long   kernel_info_end - kernel_info
> > +          .long   0x01234567      /* Some fixed size data for the bootloaders. */
> > +  kernel_info_var_len_data:
> > +  example_struct:                 /* Some variable size data for the bootloaders. */
> > +          .ascii  "EsTT"          /* Header/Magic. */
> > +          .long   example_struct_end - example_struct
> > +          .ascii  "Struct"
> > +          .long   0x89012345
> > +  example_struct_end:
> > +  example_strings:                /* Some variable size data for the bootloaders. */
> > +          .ascii  "EsTs"          /* Header/Magic. */
>
> Where do the Magic values "EsTT" and "EsTs" come from?
> where are they defined?

EsTT == Example STrucT
EsTs == Example STringS

Anyway, it can be anything which does not collide with existing variable
length data magics. There are none right now. So, it can be anything.
Maybe I should add something saying that.

Daniel
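
To make the layout discussed above concrete, here is a minimal sketch of how
a bootloader might walk a kernel_info blob. It relies only on what the
documentation patch states: the "LToP" magic (0x506f544c) at offset 0, size
at offset 4, size_total at offset 8, and variable-length chunks that each
start with a 4-byte magic followed by a 4-byte size. Two things are
assumptions: fields are read as little-endian, and a chunk's size covers its
own 8-byte prefix (as in the example, where it is computed as
example_struct_end - example_struct). The names rd32(), chunk_fn and
kernel_info_walk() are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KERNEL_INFO_MAGIC   0x506f544cu /* "LToP" as a little-endian u32 */
#define KERNEL_INFO_HDR_LEN 12u         /* header + size + size_total */
#define CHUNK_HDR_LEN       8u          /* chunk magic + chunk size */

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t rd32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical callback invoked once per variable-length chunk. */
typedef void (*chunk_fn)(const uint8_t magic[4], const uint8_t *payload,
			 uint32_t payload_len, void *ctx);

/*
 * Walk the blob at buf (avail readable bytes). Returns the number of
 * variable-length chunks found, or -1 if the blob is malformed.
 */
static int kernel_info_walk(const uint8_t *buf, size_t avail,
			    chunk_fn fn, void *ctx)
{
	uint32_t size, size_total, off;
	int n = 0;

	assert(buf != NULL);

	if (avail < KERNEL_INFO_HDR_LEN || rd32(buf) != KERNEL_INFO_MAGIC)
		return -1;

	size = rd32(buf + 4);        /* fixed-size part only */
	size_total = rd32(buf + 8);  /* fixed part plus variable-length data */
	if (size < KERNEL_INFO_HDR_LEN || size > size_total ||
	    size_total > avail)
		return -1;

	/* Variable-length data starts right after the fixed-size part. */
	for (off = size; off < size_total; n++) {
		uint32_t len;

		if (size_total - off < CHUNK_HDR_LEN)
			return -1;
		len = rd32(buf + off + 4);
		if (len < CHUNK_HDR_LEN || len > size_total - off)
			return -1;
		if (fn)
			fn(buf + off, buf + off + CHUNK_HDR_LEN,
			   len - CHUNK_HDR_LEN, ctx);
		off += len;
	}
	return n;
}
```

A loader that only understands some fixed-size fields can compare size
against the offsets it knows about and ignore the rest. Unknown chunk magics
can be skipped the same way, because every chunk carries its own length.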

* Re: [PATCH v3 1/3] x86/boot: Introduce the kernel_info
  2019-10-10  9:43     ` Daniel Kiper
@ 2019-10-10 14:43       ` Randy Dunlap
  2019-10-11 18:43         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 10+ messages in thread
From: Randy Dunlap @ 2019-10-10 14:43 UTC (permalink / raw)
  To: Daniel Kiper
  Cc: linux-efi, linux-kernel, x86, xen-devel, ard.biesheuvel,
	boris.ostrovsky, bp, corbet, dave.hansen, luto, peterz,
	eric.snowberg, hpa, jgross, konrad.wilk, mingo, ross.philipson,
	tglx

On 10/10/19 2:43 AM, Daniel Kiper wrote:
> On Wed, Oct 09, 2019 at 05:43:31PM -0700, Randy Dunlap wrote:
>> Hi,
>>
>> Questions and comments below...
>> Thanks.
>>
>> On 10/9/19 3:53 AM, Daniel Kiper wrote:
>>
>>> Suggested-by: H. Peter Anvin <hpa@zytor.com>
>>> Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
>>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
>>> ---
>>
>>> ---
>>>  Documentation/x86/boot.rst             | 121 +++++++++++++++++++++++++++++++++
>>>  arch/x86/boot/Makefile                 |   2 +-
>>>  arch/x86/boot/compressed/Makefile      |   4 +-
>>>  arch/x86/boot/compressed/kernel_info.S |  17 +++++
>>>  arch/x86/boot/header.S                 |   1 +
>>>  arch/x86/boot/tools/build.c            |   5 ++
>>>  arch/x86/include/uapi/asm/bootparam.h  |   1 +
>>>  7 files changed, 148 insertions(+), 3 deletions(-)
>>>  create mode 100644 arch/x86/boot/compressed/kernel_info.S
>>>
>>> diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
>>> index 08a2f100c0e6..d5323a39f5e3 100644
>>> --- a/Documentation/x86/boot.rst
>>> +++ b/Documentation/x86/boot.rst
>>> @@ -68,8 +68,25 @@ Protocol 2.12	(Kernel 3.8) Added the xloadflags field and extension fields
>>>  Protocol 2.13	(Kernel 3.14) Support 32- and 64-bit flags being set in
>>>  		xloadflags to support booting a 64-bit kernel from 32-bit
>>>  		EFI
>>> +
>>> +Protocol 2.14:	BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c1889
>>> +		(x86/boot: Add ACPI RSDP address to setup_header)
>>> +		DO NOT USE!!! ASSUME SAME AS 2.13.
>>> +
>>> +Protocol 2.15:	(Kernel 5.5) Added the kernel_info.
>>>  =============	============================================================
>>>
>>> +.. note::
>>> +     The protocol version number should be changed only if the setup header
>>> +     is changed. There is no need to update the version number if boot_params
>>> +     or kernel_info are changed. Additionally, it is recommended to use
>>> +     xloadflags (in this case the protocol version number should not be
>>> +     updated either) or kernel_info to communicate supported Linux kernel
>>> +     features to the boot loader. Due to very limited space available in
>>> +     the original setup header every update to it should be considered
>>> +     with great care. Starting from the protocol 2.15 the primary way to
>>> +     communicate things to the boot loader is the kernel_info.
>>> +
>>>
>>>  Memory Layout
>>>  =============
>>> @@ -207,6 +224,7 @@ Offset/Size	Proto		Name			Meaning
>>>  0258/8		2.10+		pref_address		Preferred loading address
>>>  0260/4		2.10+		init_size		Linear memory required during initialization
>>>  0264/4		2.11+		handover_offset		Offset of handover entry point
>>> +0268/4		2.15+		kernel_info_offset	Offset of the kernel_info
>>>  ===========	========	=====================	============================================
>>>
>>>  .. note::
>>> @@ -855,6 +873,109 @@ Offset/size:	0x264/4
>>>
>>>    See EFI HANDOVER PROTOCOL below for more details.
>>>
>>> +============	==================
>>> +Field name:	kernel_info_offset
>>> +Type:		read
>>> +Offset/size:	0x268/4
>>> +Protocol:	2.15+
>>> +============	==================
>>> +
>>> +  This field is the offset from the beginning of the kernel image to the
>>> +  kernel_info. It is embedded in the Linux image in the uncompressed
>>                   ^^
>>    What does      It   refer to, please?
> 
> s/It/The kernel_info structure/ Is it better?

Yes.

>>> +  protected mode region.
>>> +
>>> +
>>> +The kernel_info
>>> +===============
>>> +
>>> +The relationships between the headers are analogous to the various data
>>> +sections:
>>> +
>>> +  setup_header = .data
>>> +  boot_params/setup_data = .bss
>>> +
>>> +What is missing from the above list? That's right:
>>> +
>>> +  kernel_info = .rodata
>>> +
>>> +We have been (ab)using .data for things that could go into .rodata or .bss for
>>> +a long time, for lack of alternatives and -- especially early on -- inertia.
>>> +Also, the BIOS stub is responsible for creating boot_params, so it isn't
>>> +available to a BIOS-based loader (setup_data is, though).
>>> +
>>> +setup_header is permanently limited to 144 bytes due to the reach of the
>>> +2-byte jump field, which doubles as a length field for the structure, combined
>>> +with the size of the "hole" in struct boot_params that a protected-mode loader
>>> +or the BIOS stub has to copy it into. It is currently 119 bytes long, which
>>> +leaves us with 25 very precious bytes. This isn't something that can be fixed
>>> +without revising the boot protocol entirely, breaking backwards compatibility.
>>> +
>>> +boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
>>> +by adding setup_data entries. It cannot be used to communicate properties of
>>> +the kernel image, because it is .bss and has no image-provided content.
>>> +
>>> +kernel_info solves this by providing an extensible place for information about
>>> +the kernel image. It is readonly, because the kernel cannot rely on a
>>> +bootloader copying its contents anywhere, but that is OK; if it becomes
>>> +necessary it can still contain data items that an enabled bootloader would be
>>> +expected to copy into a setup_data chunk.
>>> +
>>> +All kernel_info data should be part of this structure. Fixed size data have to
>>> +be put before kernel_info_var_len_data label. Variable size data have to be put
>>> +behind kernel_info_var_len_data label. Each chunk of variable size data has to
>>
>>    s/behind/after/
> 
> OK.
> 
>>> +be prefixed with header/magic and its size, e.g.:
>>> +
>>> +  kernel_info:
>>> +          .ascii  "LToP"          /* Header, Linux top (structure). */
>>> +          .long   kernel_info_var_len_data - kernel_info
>>> +          .long   kernel_info_end - kernel_info
>>> +          .long   0x01234567      /* Some fixed size data for the bootloaders. */
>>> +  kernel_info_var_len_data:
>>> +  example_struct:                 /* Some variable size data for the bootloaders. */
>>> +          .ascii  "EsTT"          /* Header/Magic. */
>>> +          .long   example_struct_end - example_struct
>>> +          .ascii  "Struct"
>>> +          .long   0x89012345
>>> +  example_struct_end:
>>> +  example_strings:                /* Some variable size data for the bootloaders. */
>>> +          .ascii  "EsTs"          /* Header/Magic. */
>>
>> Where do the Magic values "EsTT" and "EsTs" come from?
>> where are they defined?
> 
> EsTT == Example STrucT
> EsTs == Example STringS
> 
> Anyway, it can be anything which does not collide with existing variable
> length data magics. There are none right now. So, it can be anything.
> Maybe I should add something saying that.

Yes, please.

thanks.
-- 
~Randy

* Re: [PATCH v3 1/3] x86/boot: Introduce the kernel_info
  2019-10-10 14:43       ` Randy Dunlap
@ 2019-10-11 18:43         ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-10-11 18:43 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Daniel Kiper, linux-efi, linux-kernel, x86, xen-devel,
	ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, hpa, jgross, mingo, ross.philipson, tglx

> >>> +be prefixed with header/magic and its size, e.g.:
> >>> +
> >>> +  kernel_info:
> >>> +          .ascii  "LToP"          /* Header, Linux top (structure). */
> >>> +          .long   kernel_info_var_len_data - kernel_info
> >>> +          .long   kernel_info_end - kernel_info
> >>> +          .long   0x01234567      /* Some fixed size data for the bootloaders. */
> >>> +  kernel_info_var_len_data:
> >>> +  example_struct:                 /* Some variable size data for the bootloaders. */
> >>> +          .ascii  "EsTT"          /* Header/Magic. */
> >>> +          .long   example_struct_end - example_struct
> >>> +          .ascii  "Struct"
> >>> +          .long   0x89012345
> >>> +  example_struct_end:
> >>> +  example_strings:                /* Some variable size data for the bootloaders. */
> >>> +          .ascii  "EsTs"          /* Header/Magic. */
> >>
> >> Where do the Magic values "EsTT" and "EsTs" come from?
> >> where are they defined?
> > 
> > EsTT == Example STrucT
> > EsTs == Example STringS
> > 
> > Anyway, it can be anything which does not collide with existing variable
> > length data magics. There are none right now. So, it can be anything.
> > Maybe I should add something saying that.
> 
> Yes, please.

Or make it very clear they are examples, say "1234" or "ABCD" or such.

> 
> thanks.
> -- 
> ~Randy

* Re: [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes
  2019-10-09 10:53 [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
                   ` (2 preceding siblings ...)
  2019-10-09 10:53 ` [PATCH v3 3/3] x86/boot: Introduce the setup_indirect Daniel Kiper
@ 2019-10-16 11:06 ` Daniel Kiper
  2019-10-23 20:44   ` H. Peter Anvin
  3 siblings, 1 reply; 10+ messages in thread
From: Daniel Kiper @ 2019-10-16 11:06 UTC (permalink / raw)
  To: linux-efi, linux-kernel, x86, xen-devel
  Cc: ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, hpa, jgross, konrad.wilk, mingo,
	ross.philipson, tglx

On Wed, Oct 09, 2019 at 12:53:55PM +0200, Daniel Kiper wrote:
> Hi,
>
> Due to very limited space in the setup_header this patch series introduces new
> kernel_info struct which will be used to convey information from the kernel to
> the bootloader. This way the boot protocol can be extended regardless of the
> setup_header limitations. Additionally, the patch series introduces some
> convenience features like the setup_indirect struct and the
> kernel_info.setup_type_max field.
>
> Daniel
>
>  Documentation/x86/boot.rst             | 168 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/boot/Makefile                 |   2 +-
>  arch/x86/boot/compressed/Makefile      |   4 +-
>  arch/x86/boot/compressed/kaslr.c       |  12 ++++++
>  arch/x86/boot/compressed/kernel_info.S |  22 +++++++++++
>  arch/x86/boot/header.S                 |   3 +-
>  arch/x86/boot/tools/build.c            |   5 +++
>  arch/x86/include/uapi/asm/bootparam.h  |  16 +++++++-
>  arch/x86/kernel/e820.c                 |  11 ++++++
>  arch/x86/kernel/kdebugfs.c             |  20 ++++++++--
>  arch/x86/kernel/ksysfs.c               |  30 ++++++++++----
>  arch/x86/kernel/setup.c                |   4 ++
>  arch/x86/mm/ioremap.c                  |  11 ++++++
>  13 files changed, 292 insertions(+), 16 deletions(-)
>
> Daniel Kiper (3):
>       x86/boot: Introduce the kernel_info
>       x86/boot: Introduce the kernel_info.setup_type_max
>       x86/boot: Introduce the setup_indirect

hpa, ping?

Daniel

* Re: [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes
  2019-10-16 11:06 ` [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
@ 2019-10-23 20:44   ` H. Peter Anvin
  0 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2019-10-23 20:44 UTC (permalink / raw)
  To: Daniel Kiper, linux-efi, linux-kernel, x86, xen-devel
  Cc: ard.biesheuvel, boris.ostrovsky, bp, corbet, dave.hansen, luto,
	peterz, eric.snowberg, jgross, konrad.wilk, mingo,
	ross.philipson, tglx, Randy Dunlap

On 2019-10-16 04:06, Daniel Kiper wrote:
> On Wed, Oct 09, 2019 at 12:53:55PM +0200, Daniel Kiper wrote:
>> Hi,
>>
>> Due to very limited space in the setup_header this patch series introduces new
>> kernel_info struct which will be used to convey information from the kernel to
>> the bootloader. This way the boot protocol can be extended regardless of the
>> setup_header limitations. Additionally, the patch series introduces some
>> convenience features like the setup_indirect struct and the
>> kernel_info.setup_type_max field.
>>
>> Daniel
>>
>>  Documentation/x86/boot.rst             | 168 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  arch/x86/boot/Makefile                 |   2 +-
>>  arch/x86/boot/compressed/Makefile      |   4 +-
>>  arch/x86/boot/compressed/kaslr.c       |  12 ++++++
>>  arch/x86/boot/compressed/kernel_info.S |  22 +++++++++++
>>  arch/x86/boot/header.S                 |   3 +-
>>  arch/x86/boot/tools/build.c            |   5 +++
>>  arch/x86/include/uapi/asm/bootparam.h  |  16 +++++++-
>>  arch/x86/kernel/e820.c                 |  11 ++++++
>>  arch/x86/kernel/kdebugfs.c             |  20 ++++++++--
>>  arch/x86/kernel/ksysfs.c               |  30 ++++++++++----
>>  arch/x86/kernel/setup.c                |   4 ++
>>  arch/x86/mm/ioremap.c                  |  11 ++++++
>>  13 files changed, 292 insertions(+), 16 deletions(-)
>>
>> Daniel Kiper (3):
>>       x86/boot: Introduce the kernel_info
>>       x86/boot: Introduce the kernel_info.setup_type_max
>>       x86/boot: Introduce the setup_indirect
> 
> hpa, ping?
> 

Looks really good to me, modulo the feedback Randy already brought up.

Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>

	-hpa


Thread overview: 10+ messages
2019-10-09 10:53 [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
2019-10-09 10:53 ` [PATCH v3 1/3] x86/boot: Introduce the kernel_info Daniel Kiper
2019-10-10  0:43   ` Randy Dunlap
2019-10-10  9:43     ` Daniel Kiper
2019-10-10 14:43       ` Randy Dunlap
2019-10-11 18:43         ` Konrad Rzeszutek Wilk
2019-10-09 10:53 ` [PATCH v3 2/3] x86/boot: Introduce the kernel_info.setup_type_max Daniel Kiper
2019-10-09 10:53 ` [PATCH v3 3/3] x86/boot: Introduce the setup_indirect Daniel Kiper
2019-10-16 11:06 ` [PATCH v3 0/3] x86/boot: Introduce the kernel_info et consortes Daniel Kiper
2019-10-23 20:44   ` H. Peter Anvin
