* [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high
@ 2012-11-21  7:15 Yinghai Lu
  2012-11-21  7:15 ` [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200 Yinghai Lu
                   ` (11 more replies)
  0 siblings, 12 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:15 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

Currently the kdump reserved region is limited to below 896M because of a
kexec limitation, and bzImage also needs to stay under 4G.

To let kexec/kdump use ranges above 4G, bzImage and the ramdisk must be
loadable above 4G.
During boot, bzImage is unpacked at the same position and stays high.

The patches add fields to the boot header to
1. get the ramdisk position above 4G from the bootloader/kexec
2. set xloadflags bit0 in the header for bzImage, so that the bootloader/kexec
   loader can check it to decide whether to put bzImage high.

These patches were tested with kexec tools with local changes, which have been
sent to the kexec list.

They can be found at:
        git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-boot

and are on top of for-x86-mm.

-v2: add ext_cmd_line_ptr support, and handle the case where
     boot_param/cmd_line is above 4G.
-v3: according to hpa, use xloadflags instead of code32_start_offset.
     0x200 will not be changed...

Thanks

Yinghai

Yinghai Lu (12):
  x86, boot: move verify_cpu.S after 0x200
  x86, boot: Move lldt/ltr out of 64bit code section
  x86, 64bit: set extra ident page table for whole kernel range
  x86, 64bit: add support for loading kernel above 512G
  x86: Merge early_reserve_initrd for 32bit and 64bit
  x86: add get_ramdisk_image/size
  x86, boot: add get_cmd_line_ptr()
  x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor()
  x86, boot: update cmd_line_ptr to unsigned long
  x86: use io_remap to access real_mode_data
  x86, boot: add fields to support load bzImage and ramdisk high
  x86: remove 1024g limitation for kexec buffer on 64bit

 Documentation/x86/boot.txt         |   40 +++++++++++++++++++++++++-
 arch/x86/boot/boot.h               |   18 +++++++++--
 arch/x86/boot/cmdline.c            |   12 ++++----
 arch/x86/boot/compressed/cmdline.c |   13 +++++++-
 arch/x86/boot/compressed/head_64.S |   14 ++++++---
 arch/x86/boot/header.S             |   16 +++++++++-
 arch/x86/include/asm/bootparam.h   |    6 +++-
 arch/x86/include/asm/kexec.h       |    6 ++--
 arch/x86/kernel/head32.c           |   11 -------
 arch/x86/kernel/head64.c           |   42 +++++++++++++++++----------
 arch/x86/kernel/head_64.S          |   49 +++++++++++++++++++++++++------
 arch/x86/kernel/setup.c            |   55 +++++++++++++++++++++++++++++------
 12 files changed, 212 insertions(+), 70 deletions(-)

-- 
1.7.7



* [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
@ 2012-11-21  7:15 ` Yinghai Lu
  2012-11-21 17:23   ` H. Peter Anvin
  2012-11-21  7:16 ` [PATCH v3 02/12] x86, boot: Move lldt/ltr out of 64bit code section Yinghai Lu
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:15 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu, Matt Fleming

We are short of space before 0x200, which is the entry point for startup_64.

And we cannot change startup_64 to another value (ABI?).

We can move the verify_cpu function down instead, which avoids the extra
code for jumping back and forth.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
---
 arch/x86/boot/compressed/head_64.S |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 2c4b171..2c3cee4 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -182,8 +182,6 @@ no_longmode:
 	hlt
 	jmp     1b
 
-#include "../../kernel/verify_cpu.S"
-
 	/*
 	 * Be careful here startup_64 needs to be at a predictable
 	 * address so I can export it in an ELF header.  Bootloaders
@@ -349,6 +347,9 @@ relocated:
  */
 	jmp	*%rbp
 
+	.code32
+#include "../../kernel/verify_cpu.S"
+
 	.data
 gdt:
 	.word	gdt_end - gdt
-- 
1.7.7



* [PATCH v3 02/12] x86, boot: Move lldt/ltr out of 64bit code section
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
  2012-11-21  7:15 ` [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200 Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 03/12] x86, 64bit: set extra ident page table for whole kernel range Yinghai Lu
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu, Zachary Amsden,
	Matt Fleming

commit 08da5a2ca

    x86_64: Early segment setup for VT

added lldt/ltr to clean more segments.

That code was put in code64, but it uses the gdt that is only
loaded in the code32 path.

That breaks booting with a 64bit bootloader that does not go through
the code32 path and enters at startup_64 directly, because it has a
different gdt.

Move those lines into code32, after the gdt is loaded.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Zachary Amsden <zamsden@gmail.com>
Cc: Matt Fleming <matt.fleming@intel.com>
---
 arch/x86/boot/compressed/head_64.S |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 2c3cee4..375af23 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -154,6 +154,12 @@ ENTRY(startup_32)
 	btsl	$_EFER_LME, %eax
 	wrmsr
 
+	/* After gdt is loaded */
+	xorl	%eax, %eax
+	lldt	%ax
+	movl    $0x20, %eax
+	ltr	%ax
+
 	/*
 	 * Setup for the jump to 64bit mode
 	 *
@@ -245,9 +251,6 @@ preferred_addr:
 	movl	%eax, %ss
 	movl	%eax, %fs
 	movl	%eax, %gs
-	lldt	%ax
-	movl    $0x20, %eax
-	ltr	%ax
 
 	/*
 	 * Compute the decompressed kernel start address.  It is where
-- 
1.7.7



* [PATCH v3 03/12] x86, 64bit: set extra ident page table for whole kernel range
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
  2012-11-21  7:15 ` [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200 Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 02/12] x86, boot: Move lldt/ltr out of 64bit code section Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 04/12] x86, 64bit: add support for loading kernel above 512G Yinghai Lu
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

Currently, when the kernel is loaded above 1G, only [_text, _text+2M]
is set up with an extra ident page table.
That is not enough: some variables that could be used early are
outside that range (like gdt...).

Just set up the mapping for [_text, _end], which includes text/data/bss/brk...
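
As a rough illustration of the range the new loop in the diff walks, the
sketch below computes the first and last level2 (PMD) index for a physical
range in plain C. It assumes the usual x86-64 values (2M per PMD entry, 512
entries per table) and that the range stays within one 1G region; the helper
names are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Assumed x86-64 constants: one PMD entry maps 2 MiB, 512 entries per table. */
#define PMD_SHIFT	21
#define PTRS_PER_PMD	512

/* Index of the level2 (PMD) entry that maps the given physical address. */
static unsigned int pmd_index_of(uint64_t phys)
{
	return (unsigned int)((phys >> PMD_SHIFT) & (PTRS_PER_PMD - 1));
}

/*
 * Number of 2 MiB entries needed to cover [text, end).
 * Hypothetical helper; assumes the range does not cross a 1 GiB boundary,
 * so the index does not wrap around.
 */
static unsigned int pmd_entries_needed(uint64_t text, uint64_t end)
{
	/* Like the asm loop, the last index is taken from (end - 1), inclusive. */
	return pmd_index_of(end - 1) - pmd_index_of(text) + 1;
}
```

With only [_text, _text+2M] mapped, a single entry is written; covering up to
_end needs one entry per 2M step, which is what the added loop does.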

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/x86/kernel/head_64.S |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 94bf9cc..efc0c08 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -115,7 +115,16 @@ startup_64:
 	andq	$(PTRS_PER_PMD - 1), %rax
 	leaq	__PAGE_KERNEL_IDENT_LARGE_EXEC(%rdi), %rdx
 	leaq	level2_spare_pgt(%rip), %rbx
-	movq	%rdx, 0(%rbx, %rax, 8)
+	leaq	_end(%rip), %r8
+	decq	%r8
+	shrq	$PMD_SHIFT, %r8
+	andq	$(PTRS_PER_PMD - 1), %r8
+1:	movq	%rdx, 0(%rbx, %rax, 8)
+	addq	$PMD_SIZE, %rdx
+	incq	%rax
+	cmp	%r8, %rax
+	jle	1b
+
 ident_complete:
 
 	/*
-- 
1.7.7



* [PATCH v3 04/12] x86, 64bit: add support for loading kernel above 512G
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (2 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 03/12] x86, 64bit: set extra ident page table for whole kernel range Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 05/12] x86: Merge early_reserve_initrd for 32bit and 64bit Yinghai Lu
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

Currently the kernel is not allowed to be loaded above 512G; it thinks
that address is too big.

We only need to add one extra spare page for the needed level3 table to
point to another 512G range.

We need to check the _text range and set the level4 entry to point to that
spare level3 page, and set level3 to point to the level2 page that covers
[_text, _end] with the extra mapping.

We need this to put a relocatable bzImage high, above 512G.
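
The index arithmetic behind that decision can be sketched in C as follows.
This is only an illustration under the usual x86-64 4-level paging
assumptions (512G per level4 entry, 1G per level3 entry); the function names
are hypothetical and the real work is done in assembly in head_64.S.

```c
#include <stdint.h>

/* Assumed x86-64 4-level paging constants. */
#define PUD_SHIFT	30	/* one level3 entry maps 1 GiB   */
#define PGDIR_SHIFT	39	/* one level4 entry maps 512 GiB */
#define PTRS_PER_PUD	512
#define PTRS_PER_PGD	512

/* level4 slot for a physical load address in the identity mapping */
static unsigned int pgd_index_of(uint64_t phys)
{
	return (unsigned int)((phys >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/* level3 slot inside the selected 512 GiB range */
static unsigned int pud_index_of(uint64_t phys)
{
	return (unsigned int)((phys >> PUD_SHIFT) & (PTRS_PER_PUD - 1));
}

/*
 * A non-zero level4 index means the address is at or above 512 GiB, so the
 * existing level3 ident table cannot be reused and a spare one is needed.
 */
static int needs_spare_level3(uint64_t phys)
{
	return pgd_index_of(phys) != 0;
}
```

For a kernel loaded at 600G, for example, the level4 index is 1 and the
level3 index is 88, so the spare level3 page is hooked into level4 slot 1.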

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/x86/kernel/head_64.S |   34 +++++++++++++++++++++++++++-------
 1 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index efc0c08..32fa9d0 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -78,12 +78,6 @@ startup_64:
 	testl	%eax, %eax
 	jnz	bad_address
 
-	/* Is the address too large? */
-	leaq	_text(%rip), %rdx
-	movq	$PGDIR_SIZE, %rax
-	cmpq	%rax, %rdx
-	jae	bad_address
-
 	/* Fixup the physical addresses in the page table
 	 */
 	addq	%rbp, init_level4_pgt + 0(%rip)
@@ -102,12 +96,35 @@ startup_64:
 	andq	$PMD_PAGE_MASK, %rdi
 
 	movq	%rdi, %rax
+	shrq	$PGDIR_SHIFT, %rax
+	andq	$(PTRS_PER_PGD - 1), %rax
+	jz	skip_level3_spare
+
+	/* Set level3 at first */
+	leaq	(level3_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	leaq	init_level4_pgt(%rip), %rbx
+	movq	%rdx, 0(%rbx, %rax, 8)
+	addq	$L4_PAGE_OFFSET, %rax
+	movq	%rdx, 0(%rbx, %rax, 8)
+
+	/* always need to set level2 */
+	movq	%rdi, %rax
+	shrq	$PUD_SHIFT, %rax
+	andq	$(PTRS_PER_PUD - 1), %rax
+	leaq	level3_spare_pgt(%rip), %rbx
+	jmp	set_level2_spare
+
+skip_level3_spare:
+	movq	%rdi, %rax
 	shrq	$PUD_SHIFT, %rax
 	andq	$(PTRS_PER_PUD - 1), %rax
 	jz	ident_complete
 
-	leaq	(level2_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	/* only set level2 with out level3 spare */
 	leaq	level3_ident_pgt(%rip), %rbx
+
+set_level2_spare:
+	leaq	(level2_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
 	movq	%rdx, 0(%rbx, %rax, 8)
 
 	movq	%rdi, %rax
@@ -435,6 +452,9 @@ NEXT_PAGE(level2_kernel_pgt)
 	PMDS(0, __PAGE_KERNEL_LARGE_EXEC,
 		KERNEL_IMAGE_SIZE/PMD_SIZE)
 
+NEXT_PAGE(level3_spare_pgt)
+	.fill   512, 8, 0
+
 NEXT_PAGE(level2_spare_pgt)
 	.fill   512, 8, 0
 
-- 
1.7.7



* [PATCH v3 05/12] x86: Merge early_reserve_initrd for 32bit and 64bit
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (3 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 04/12] x86, 64bit: add support for loading kernel above 512G Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:40   ` Pekka Enberg
  2012-11-21  7:16 ` [PATCH v3 06/12] x86: add get_ramdisk_image/size Yinghai Lu
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

They are the same, so they can be moved out of head32/64.c into setup.c.

We are using memblock, and it handles overlapping properly, so we don't
need to reserve the initrd first just to hold the location; we only need
to make sure we reserve it before we use memblock to find free memory.
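
For illustration, the range that ends up being reserved can be computed as in
the standalone sketch below. It assumes 4K pages; struct mem_range and
initrd_reserve_range() are hypothetical names used only here, while the
checks mirror the ones in early_reserve_initrd() in the diff.

```c
#include <stdint.h>

#define PAGE_SIZE		4096ULL	/* assumed 4 KiB pages */
#define PAGE_ALIGN_UP(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Hypothetical type for this sketch: a physical range to reserve. */
struct mem_range {
	uint64_t base;
	uint64_t size;
};

/*
 * Returns the range to reserve for the initrd, or a zero-sized range when
 * the bootloader provided none. Only the end is rounded up to a page
 * boundary, matching the "assume only end is not page aligned" comment.
 */
static struct mem_range initrd_reserve_range(uint8_t type_of_loader,
					     uint64_t image, uint64_t size)
{
	struct mem_range r = { 0, 0 };

	if (!type_of_loader || !image || !size)
		return r;	/* no initrd provided by bootloader */

	r.base = image;
	r.size = PAGE_ALIGN_UP(image + size) - image;
	return r;
}
```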

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/head32.c |   11 -----------
 arch/x86/kernel/head64.c |   11 -----------
 arch/x86/kernel/setup.c  |   22 ++++++++++++++++++----
 3 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index c18f59d..4c52efc 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -33,17 +33,6 @@ void __init i386_start_kernel(void)
 	memblock_reserve(__pa_symbol(&_text),
 			 __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
 
-#ifdef CONFIG_BLK_DEV_INITRD
-	/* Reserve INITRD */
-	if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
-		/* Assume only end is not page aligned */
-		u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-		u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
-		u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-		memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
-	}
-#endif
-
 	/* Call the subarch specific early setup function */
 	switch (boot_params.hdr.hardware_subarch) {
 	case X86_SUBARCH_MRST:
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 037df57..00e612a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -100,17 +100,6 @@ void __init x86_64_start_reservations(char *real_mode_data)
 	memblock_reserve(__pa_symbol(&_text),
 			 __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
 
-#ifdef CONFIG_BLK_DEV_INITRD
-	/* Reserve INITRD */
-	if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
-		/* Assume only end is not page aligned */
-		unsigned long ramdisk_image = boot_params.hdr.ramdisk_image;
-		unsigned long ramdisk_size  = boot_params.hdr.ramdisk_size;
-		unsigned long ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-		memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
-	}
-#endif
-
 	reserve_ebda_region();
 
 	/*
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 6d29d1f..ee6d267 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -364,6 +364,19 @@ static u64 __init get_mem_size(unsigned long limit_pfn)
 
 	return mapped_pages << PAGE_SHIFT;
 }
+static void __init early_reserve_initrd(void)
+{
+	/* Assume only end is not page aligned */
+	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
+	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+	u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
+
+	if (!boot_params.hdr.type_of_loader ||
+	    !ramdisk_image || !ramdisk_size)
+		return;		/* No initrd provided by bootloader */
+
+	memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
+}
 static void __init reserve_initrd(void)
 {
 	/* Assume only end is not page aligned */
@@ -390,10 +403,6 @@ static void __init reserve_initrd(void)
 	if (pfn_range_is_mapped(PFN_DOWN(ramdisk_image),
 				PFN_DOWN(ramdisk_end))) {
 		/* All are mapped, easy case */
-		/*
-		 * don't need to reserve again, already reserved early
-		 * in i386_start_kernel
-		 */
 		initrd_start = ramdisk_image + PAGE_OFFSET;
 		initrd_end = initrd_start + ramdisk_size;
 		return;
@@ -404,6 +413,9 @@ static void __init reserve_initrd(void)
 	memblock_free(ramdisk_image, ramdisk_end - ramdisk_image);
 }
 #else
+static void __init early_reserve_initrd(void)
+{
+}
 static void __init reserve_initrd(void)
 {
 }
@@ -665,6 +677,8 @@ early_param("reservelow", parse_reservelow);
 
 void __init setup_arch(char **cmdline_p)
 {
+	early_reserve_initrd();
+
 #ifdef CONFIG_X86_32
 	memcpy(&boot_cpu_data, &new_cpu_data, sizeof(new_cpu_data));
 	visws_early_detect();
-- 
1.7.7



* [PATCH v3 06/12] x86: add get_ramdisk_image/size
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (4 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 05/12] x86: Merge early_reserve_initrd for 32bit and 64bit Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 07/12] x86, boot: add get_cmd_line_ptr() Yinghai Lu
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

There are several places that look up ramdisk information early, for
reserving and relocating.

Use functions to make the code more readable and consistent.

A later patch will add ext_ramdisk_image/size to those functions to support
loading the ramdisk above 4G.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/setup.c |   29 +++++++++++++++++++++--------
 1 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ee6d267..194e151 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -298,12 +298,25 @@ static void __init reserve_brk(void)
 
 #ifdef CONFIG_BLK_DEV_INITRD
 
+static u64 __init get_ramdisk_image(void)
+{
+	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
+
+	return ramdisk_image;
+}
+static u64 __init get_ramdisk_size(void)
+{
+	u64 ramdisk_size = boot_params.hdr.ramdisk_size;
+
+	return ramdisk_size;
+}
+
 #define MAX_MAP_CHUNK	(NR_FIX_BTMAPS << PAGE_SHIFT)
 static void __init relocate_initrd(void)
 {
 	/* Assume only end is not page aligned */
-	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+	u64 ramdisk_image = get_ramdisk_image();
+	u64 ramdisk_size  = get_ramdisk_size();
 	u64 area_size     = PAGE_ALIGN(ramdisk_size);
 	u64 ramdisk_here;
 	unsigned long slop, clen, mapaddr;
@@ -342,8 +355,8 @@ static void __init relocate_initrd(void)
 		ramdisk_size  -= clen;
 	}
 
-	ramdisk_image = boot_params.hdr.ramdisk_image;
-	ramdisk_size  = boot_params.hdr.ramdisk_size;
+	ramdisk_image = get_ramdisk_image();
+	ramdisk_size  = get_ramdisk_size();
 	printk(KERN_INFO "Move RAMDISK from [mem %#010llx-%#010llx] to"
 		" [mem %#010llx-%#010llx]\n",
 		ramdisk_image, ramdisk_image + ramdisk_size - 1,
@@ -367,8 +380,8 @@ static u64 __init get_mem_size(unsigned long limit_pfn)
 static void __init early_reserve_initrd(void)
 {
 	/* Assume only end is not page aligned */
-	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+	u64 ramdisk_image = get_ramdisk_image();
+	u64 ramdisk_size  = get_ramdisk_size();
 	u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 
 	if (!boot_params.hdr.type_of_loader ||
@@ -380,8 +393,8 @@ static void __init early_reserve_initrd(void)
 static void __init reserve_initrd(void)
 {
 	/* Assume only end is not page aligned */
-	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+	u64 ramdisk_image = get_ramdisk_image();
+	u64 ramdisk_size  = get_ramdisk_size();
 	u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 	u64 mapped_size;
 
-- 
1.7.7



* [PATCH v3 07/12] x86, boot: add get_cmd_line_ptr()
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (5 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 06/12] x86: add get_ramdisk_image/size Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 08/12] x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor() Yinghai Lu
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

A later patch will check ext_cmd_line_ptr at the same time.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/compressed/cmdline.c |   10 ++++++++--
 arch/x86/kernel/head64.c           |   13 +++++++++++--
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
index 10f6b11..b4c913c 100644
--- a/arch/x86/boot/compressed/cmdline.c
+++ b/arch/x86/boot/compressed/cmdline.c
@@ -13,13 +13,19 @@ static inline char rdfs8(addr_t addr)
 	return *((char *)(fs + addr));
 }
 #include "../cmdline.c"
+static unsigned long get_cmd_line_ptr(void)
+{
+	unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
+
+	return cmd_line_ptr;
+}
 int cmdline_find_option(const char *option, char *buffer, int bufsize)
 {
-	return __cmdline_find_option(real_mode->hdr.cmd_line_ptr, option, buffer, bufsize);
+	return __cmdline_find_option(get_cmd_line_ptr(), option, buffer, bufsize);
 }
 int cmdline_find_option_bool(const char *option)
 {
-	return __cmdline_find_option_bool(real_mode->hdr.cmd_line_ptr, option);
+	return __cmdline_find_option_bool(get_cmd_line_ptr(), option);
 }
 
 #endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 00e612a..3ac6cad 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -41,13 +41,22 @@ static void __init clear_bss(void)
 	       (unsigned long) __bss_stop - (unsigned long) __bss_start);
 }
 
+static unsigned long get_cmd_line_ptr(void)
+{
+	unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+
+	return cmd_line_ptr;
+}
+
 static void __init copy_bootdata(char *real_mode_data)
 {
 	char * command_line;
+	unsigned long cmd_line_ptr;
 
 	memcpy(&boot_params, real_mode_data, sizeof boot_params);
-	if (boot_params.hdr.cmd_line_ptr) {
-		command_line = __va(boot_params.hdr.cmd_line_ptr);
+	cmd_line_ptr = get_cmd_line_ptr();
+	if (cmd_line_ptr) {
+		command_line = __va(cmd_line_ptr);
 		memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
 	}
 }
-- 
1.7.7



* [PATCH v3 08/12] x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor()
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (6 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 07/12] x86, boot: add get_cmd_line_ptr() Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21 17:21   ` H. Peter Anvin
  2012-11-21  7:16 ` [PATCH v3 09/12] x86, boot: update cmd_line_ptr to unsigned long Yinghai Lu
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

At that stage, we are already in 32bit protected mode or 64bit mode,
so we do not need to check whether the pointer is below 1M.

This matters when we come from another boot loader (kexec) instead of
going through the boot/ code path.

Move the accessibility check out of __cmdline_find_option....

This way misc.c can still parse the command line and print debug output.
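
For context, the 1M limit comes from how the real-mode boot/ code addresses
the command line: cmdline.c splits the pointer into a segment (ptr >> 4) and
an offset (ptr & 0xf). The sketch below shows that split in plain C; struct
seg_off and both helpers are hypothetical names, not part of the patch.

```c
#include <stdint.h>

/* Hypothetical type for this sketch: a real-mode segment:offset pair. */
struct seg_off {
	uint16_t seg;
	uint16_t off;
};

/*
 * Split a linear address the way boot/cmdline.c does (seg = addr >> 4,
 * off = addr & 0xf). A segment register holds only 16 bits, so anything
 * at or above 1 MiB cannot be expressed this way.
 * Returns 0 on success, -1 if the address is not reachable.
 */
static int linear_to_seg_off(uint32_t linear, struct seg_off *out)
{
	if (linear >= 0x100000)
		return -1;	/* inaccessible from real mode */

	out->seg = (uint16_t)(linear >> 4);
	out->off = (uint16_t)(linear & 0xf);
	return 0;
}

/* Inverse mapping: segment * 16 + offset. */
static uint32_t seg_off_to_linear(struct seg_off a)
{
	return ((uint32_t)a.seg << 4) + a.off;
}
```

The decompressor runs with flat addressing, so only the boot.h wrappers used
by the real-mode path keep the check.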

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/boot.h    |   14 ++++++++++++--
 arch/x86/boot/cmdline.c |    8 ++++----
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 18997e5..7fadf80 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -289,12 +289,22 @@ int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int
 int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option);
 static inline int cmdline_find_option(const char *option, char *buffer, int bufsize)
 {
-	return __cmdline_find_option(boot_params.hdr.cmd_line_ptr, option, buffer, bufsize);
+	u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+
+	if (cmd_line_ptr >= 0x100000)
+		return -1;      /* inaccessible */
+
+	return __cmdline_find_option(cmd_line_ptr, option, buffer, bufsize);
 }
 
 static inline int cmdline_find_option_bool(const char *option)
 {
-	return __cmdline_find_option_bool(boot_params.hdr.cmd_line_ptr, option);
+	u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+
+	if (cmd_line_ptr >= 0x100000)
+		return -1;      /* inaccessible */
+
+	return __cmdline_find_option_bool(cmd_line_ptr, option);
 }
 
 
diff --git a/arch/x86/boot/cmdline.c b/arch/x86/boot/cmdline.c
index 6b3b6f7..768f00f 100644
--- a/arch/x86/boot/cmdline.c
+++ b/arch/x86/boot/cmdline.c
@@ -41,8 +41,8 @@ int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int
 		st_bufcpy	/* Copying this to buffer */
 	} state = st_wordstart;
 
-	if (!cmdline_ptr || cmdline_ptr >= 0x100000)
-		return -1;	/* No command line, or inaccessible */
+	if (!cmdline_ptr)
+		return -1;      /* No command line */
 
 	cptr = cmdline_ptr & 0xf;
 	set_fs(cmdline_ptr >> 4);
@@ -111,8 +111,8 @@ int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option)
 		st_wordskip,	/* Miscompare, skip */
 	} state = st_wordstart;
 
-	if (!cmdline_ptr || cmdline_ptr >= 0x100000)
-		return -1;	/* No command line, or inaccessible */
+	if (!cmdline_ptr)
+		return -1;      /* No command line */
 
 	cptr = cmdline_ptr & 0xf;
 	set_fs(cmdline_ptr >> 4);
-- 
1.7.7



* [PATCH v3 09/12] x86, boot: update cmd_line_ptr to unsigned long
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (7 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 08/12] x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor() Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 10/12] x86: use io_remap to access real_mode_data Yinghai Lu
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

boot/compressed/misc.c can be built as 64bit, and cmd_line_ptr can be
above 4G.

So change it to unsigned long, which is 64bit on 64bit and 32bit on
32bit.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/boot/boot.h    |    8 ++++----
 arch/x86/boot/cmdline.c |    4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 7fadf80..5b75319 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -285,11 +285,11 @@ struct biosregs {
 void intcall(u8 int_no, const struct biosregs *ireg, struct biosregs *oreg);
 
 /* cmdline.c */
-int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int bufsize);
-int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option);
+int __cmdline_find_option(unsigned long cmdline_ptr, const char *option, char *buffer, int bufsize);
+int __cmdline_find_option_bool(unsigned long cmdline_ptr, const char *option);
 static inline int cmdline_find_option(const char *option, char *buffer, int bufsize)
 {
-	u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+	unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
 
 	if (cmd_line_ptr >= 0x100000)
 		return -1;      /* inaccessible */
@@ -299,7 +299,7 @@ static inline int cmdline_find_option(const char *option, char *buffer, int bufs
 
 static inline int cmdline_find_option_bool(const char *option)
 {
-	u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+	unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
 
 	if (cmd_line_ptr >= 0x100000)
 		return -1;      /* inaccessible */
diff --git a/arch/x86/boot/cmdline.c b/arch/x86/boot/cmdline.c
index 768f00f..625d21b 100644
--- a/arch/x86/boot/cmdline.c
+++ b/arch/x86/boot/cmdline.c
@@ -27,7 +27,7 @@ static inline int myisspace(u8 c)
  * Returns the length of the argument (regardless of if it was
  * truncated to fit in the buffer), or -1 on not found.
  */
-int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int bufsize)
+int __cmdline_find_option(unsigned long cmdline_ptr, const char *option, char *buffer, int bufsize)
 {
 	addr_t cptr;
 	char c;
@@ -99,7 +99,7 @@ int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int
  * Returns the position of that option (starts counting with 1)
  * or 0 on not found
  */
-int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option)
+int __cmdline_find_option_bool(unsigned long cmdline_ptr, const char *option)
 {
 	addr_t cptr;
 	char c;
-- 
1.7.7



* [PATCH v3 10/12] x86: use io_remap to access real_mode_data
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (8 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 09/12] x86, boot: update cmd_line_ptr to unsigned long Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high Yinghai Lu
  2012-11-21  7:16 ` [PATCH v3 12/12] x86: remove 1024g limitation for kexec buffer on 64bit Yinghai Lu
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

When a 64bit bootloader puts the real mode data above 4G, we cannot
access the real mode data directly, because arch/x86/kernel/head_64.S
only sets up ident mapping for 0-1G and for the kernel code/data/bss.

So we need to move the early_ioremap_init() call from setup_arch
to x86_64_start_reservations.

Also use rsi/rdi instead of esi/edi.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/head64.c  |   17 ++++++++++++++---
 arch/x86/kernel/head_64.S |    4 ++--
 arch/x86/kernel/setup.c   |    2 ++
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 3ac6cad..735cd47 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -52,12 +52,21 @@ static void __init copy_bootdata(char *real_mode_data)
 {
 	char * command_line;
 	unsigned long cmd_line_ptr;
+	char *p;
 
-	memcpy(&boot_params, real_mode_data, sizeof boot_params);
+	/*
+	 * For the 64bit bootloader path, that data could be above 4G,
+	 * and we do not set ident mapping for it in head_64.S,
+	 * so we need early_memremap() to access it.
+	 */
+	p = early_memremap((unsigned long)real_mode_data, sizeof(boot_params));
+	memcpy(&boot_params, p, sizeof(boot_params));
+	early_iounmap(p, sizeof(boot_params));
 	cmd_line_ptr = get_cmd_line_ptr();
 	if (cmd_line_ptr) {
-		command_line = __va(cmd_line_ptr);
+		command_line = early_memremap(cmd_line_ptr, COMMAND_LINE_SIZE);
 		memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
+		early_iounmap(command_line, COMMAND_LINE_SIZE);
 	}
 }
 
@@ -104,7 +113,9 @@ void __init x86_64_start_kernel(char * real_mode_data)
 
 void __init x86_64_start_reservations(char *real_mode_data)
 {
-	copy_bootdata(__va(real_mode_data));
+	early_ioremap_init();
+
+	copy_bootdata(real_mode_data);
 
 	memblock_reserve(__pa_symbol(&_text),
 			 __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 32fa9d0..14c5de2 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -262,9 +262,9 @@ ENTRY(secondary_startup_64)
 	movl	initial_gs+4(%rip),%edx
 	wrmsr	
 
-	/* esi is pointer to real mode structure with interesting info.
+	/* rsi is pointer to real mode structure with interesting info.
 	   pass it to C */
-	movl	%esi, %edi
+	movq	%rsi, %rdi
 	
 	/* Finally jump to run C code and to be on real kernel address
 	 * Since we are running on identity-mapped space we have to jump
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 194e151..573fa7d7 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -718,7 +718,9 @@ void __init setup_arch(char **cmdline_p)
 
 	early_trap_init();
 	early_cpu_init();
+#ifdef CONFIG_X86_32
 	early_ioremap_init();
+#endif
 
 	setup_olpc_ofw_pgd();
 
-- 
1.7.7
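
The esi/edi to rsi/rdi change matters because a 32-bit move on x86-64
zero-extends into the full register, so the upper half of the pointer
is dropped. A minimal user-space sketch of that effect, not kernel code;
the function names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Models `movl %esi, %edi`: only the low 32 bits survive and the
 * upper half of the destination register is cleared. */
static inline uint64_t pass_with_movl(uint64_t rsi)
{
	return (uint32_t)rsi;
}

/* Models `movq %rsi, %rdi`: the full 64-bit value is preserved. */
static inline uint64_t pass_with_movq(uint64_t rsi)
{
	return rsi;
}

/* Documents the expected behaviour for a pointer below and above 4G. */
static inline void movl_vs_movq_selfcheck(void)
{
	/* Below 4G both moves give the same pointer. */
	assert(pass_with_movl(0x90000ULL) == 0x90000ULL);
	/* At 4G + 0x90000 the 32-bit move silently aliases to 0x90000. */
	assert(pass_with_movl(0x100090000ULL) == 0x90000ULL);
	assert(pass_with_movq(0x100090000ULL) == 0x100090000ULL);
}
```

With boot_params placed above 4G, the old movl would hand the C code a
pointer to an unrelated low address.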


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (9 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 10/12] x86: use io_remap to access real_mode_data Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  2012-11-21 17:17   ` H. Peter Anvin
  2012-11-21  7:16 ` [PATCH v3 12/12] x86: remove 1024g limitation for kexec buffer on 64bit Yinghai Lu
  11 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu, Rob Landley, Matt Fleming

ext_ramdisk_image/size record the high 32 bits of the ramdisk address
and size.

xloadflags bit 0 is set if the kernel is 64-bit and relocatable.

Make get_ramdisk_image/size use ext_ramdisk_image/size to get the
right position of the ramdisk.

The bootloader fills in ext_ramdisk_image/size when it loads the
ramdisk high.

The bootloader also checks whether xloadflags bit 0 is set to decide
whether it can load the ramdisk above 4G.

Update the header version to 2.12.

-v2: add ext_cmd_line_ptr for above-4G support.
-v3: switch to xloadflags, as suggested by HPA.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Rob Landley <rob@landley.net>
Cc: Matt Fleming <matt.fleming@intel.com>
---
 Documentation/x86/boot.txt         |   40 +++++++++++++++++++++++++++++++++++-
 arch/x86/boot/compressed/cmdline.c |    3 ++
 arch/x86/boot/header.S             |   16 ++++++++++++-
 arch/x86/include/asm/bootparam.h   |    6 ++++-
 arch/x86/kernel/head64.c           |    3 ++
 arch/x86/kernel/setup.c            |    6 +++++
 6 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index 9efceff..a8263f7 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -57,6 +57,9 @@ Protocol 2.10:	(Kernel 2.6.31) Added a protocol for relaxed alignment
 Protocol 2.11:	(Kernel 3.6) Added a field for offset of EFI handover
 		protocol entry point.
 
+Protocol 2.12:	(Kernel 3.9) Added three fields for loading bzImage and
+		 ramdisk above 4G with 64bit.
+
 **** MEMORY LAYOUT
 
 The traditional memory map for the kernel loader, used for Image or
@@ -182,7 +185,7 @@ Offset	Proto	Name		Meaning
 0230/4	2.05+	kernel_alignment Physical addr alignment required for kernel
 0234/1	2.05+	relocatable_kernel Whether kernel is relocatable or not
 0235/1	2.10+	min_alignment	Minimum alignment, as a power of two
-0236/2	N/A	pad3		Unused
+0236/2	2.12+	xloadflags	Boot protocal option flags
 0238/4	2.06+	cmdline_size	Maximum size of the kernel command line
 023C/4	2.07+	hardware_subarch Hardware subarchitecture
 0240/8	2.07+	hardware_subarch_data Subarchitecture-specific data
@@ -193,6 +196,9 @@ Offset	Proto	Name		Meaning
 0258/8	2.10+	pref_address	Preferred loading address
 0260/4	2.10+	init_size	Linear memory required during initialization
 0264/4	2.11+	handover_offset	Offset of handover entry point
+0268/4	2.12+	ext_ramdisk_image ramdisk_image 32 bits
+026C/4	2.12+	ext_ramdisk_size ramdisk_size high 32 bits
+0270/4	2.12+   ext_cmd_line_ptr cmd_line_ptr high 32 bits
 
 (1) For backwards compatibility, if the setup_sects field contains 0, the
     real value is 4.
@@ -581,6 +587,16 @@ Protocol:	2.10+
   misaligned kernel.  Therefore, a loader should typically try each
   power-of-two alignment from kernel_alignment down to this alignment.
 
+Field name:     xloadflags
+Type:           modify (obligatory)
+Offset/size:    0x236/2
+Protocol:       2.12+
+
+  This field is a bitmask.
+
+  Bit 0 (read): LOADED_ABOVE_4G
+        - If 1, kernel/boot_params/cmdline/ramdisk could be above 4g
+
 Field name:	cmdline_size
 Type:		read
 Offset/size:	0x238/4
@@ -707,6 +723,28 @@ Offset/size:	0x264/4
 
   See EFI HANDOVER PROTOCOL below for more details.
 
+Field name:	ext_ramdisk_image
+Type:		write
+Offset/size:	0x268/4
+Protocol:	2.12+
+
+  The high 32-bit linear address of the initial ramdisk or ramfs.  Leave at
+  zero if there is no initial ramdisk/ramfs, or under 4G.
+
+Field name:	ext_ramdisk_size
+Type:		write
+Offset/size:	0x26c/4
+Protocol:	2.12+
+
+  High 32-bit size of the initial ramdisk or ramfs.  Leave at zero if there
+  is no initial ramdisk/ramfs.
+
+Field name:	ext_cmd_line_ptr
+Type:		write
+Offset/size:	0x270/4
+Protocol:	2.12+
+
+  cmd_line_ptr high 32 bits. Leave at zero if under 4G.
 
 **** THE IMAGE CHECKSUM
 
diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
index b4c913c..00678d3 100644
--- a/arch/x86/boot/compressed/cmdline.c
+++ b/arch/x86/boot/compressed/cmdline.c
@@ -17,6 +17,9 @@ static unsigned long get_cmd_line_ptr(void)
 {
 	unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
 
+	if (real_mode->hdr.version >= 0x020c)
+		cmd_line_ptr |= (u64)real_mode->hdr.ext_cmd_line_ptr << 32;
+
 	return cmd_line_ptr;
 }
 int cmdline_find_option(const char *option, char *buffer, int bufsize)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 2a01744..598cba5 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -279,7 +279,7 @@ _start:
 	# Part 2 of the header, from the old setup.S
 
 		.ascii	"HdrS"		# header signature
-		.word	0x020b		# header version number (>= 0x0105)
+		.word	0x020c		# header version number (>= 0x0105)
 					# or else old loadlin-1.5 will fail)
 		.globl realmode_swtch
 realmode_swtch:	.word	0, 0		# default_switch, SETUPSEG
@@ -369,7 +369,15 @@ relocatable_kernel:    .byte 1
 relocatable_kernel:    .byte 0
 #endif
 min_alignment:		.byte MIN_KERNEL_ALIGN_LG2	# minimum alignment
-pad3:			.word 0
+
+xloadflags:
+LOADED_ABOVE_4G	= 1			# If set, the kernel/boot_param/
+					# ramdisk could be loaded above 4g
+#if defined(CONFIG_X86_64) && defined(CONFIG_RELOCATABLE)
+			.word LOADED_ABOVE_4G
+#else
+			.word 0
+#endif
 
 cmdline_size:   .long   COMMAND_LINE_SIZE-1     #length of the command line,
                                                 #added with boot protocol
@@ -400,6 +408,10 @@ init_size:		.long INIT_SIZE		# kernel initialization size
 handover_offset:	.long 0x30		# offset to the handover
 						# protocol entry point
 
+ext_ramdisk_image:	.long	0	# ramdisk_image high 32 bits
+ext_ramdisk_size:	.long	0	# ramdisk_size high 32 bits
+ext_cmd_line_ptr:	.long	0	# cmd_line_ptr high 32 bits.
+
 # End of setup header #####################################################
 
 	.section ".entrytext", "ax"
diff --git a/arch/x86/include/asm/bootparam.h b/arch/x86/include/asm/bootparam.h
index 2ad874c..036a278 100644
--- a/arch/x86/include/asm/bootparam.h
+++ b/arch/x86/include/asm/bootparam.h
@@ -57,7 +57,8 @@ struct setup_header {
 	__u32	initrd_addr_max;
 	__u32	kernel_alignment;
 	__u8	relocatable_kernel;
-	__u8	_pad2[3];
+	__u8	min_alignment;
+	__u16	xloadflags;
 	__u32	cmdline_size;
 	__u32	hardware_subarch;
 	__u64	hardware_subarch_data;
@@ -67,6 +68,9 @@ struct setup_header {
 	__u64	pref_address;
 	__u32	init_size;
 	__u32	handover_offset;
+	__u32	ext_ramdisk_image;
+	__u32	ext_ramdisk_size;
+	__u32	ext_cmd_line_ptr;
 } __attribute__((packed));
 
 struct sys_desc_table {
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 735cd47..7a969a7 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -45,6 +45,9 @@ static unsigned long get_cmd_line_ptr(void)
 {
 	unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
 
+	if (boot_params.hdr.version >= 0x020c)
+		cmd_line_ptr |= (u64)boot_params.hdr.ext_cmd_line_ptr << 32;
+
 	return cmd_line_ptr;
 }
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 573fa7d7..6a0ffa3 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -302,12 +302,18 @@ static u64 __init get_ramdisk_image(void)
 {
 	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 
+	if (boot_params.hdr.version >= 0x020c)
+		ramdisk_image |= (u64)boot_params.hdr.ext_ramdisk_image << 32;
+
 	return ramdisk_image;
 }
 static u64 __init get_ramdisk_size(void)
 {
 	u64 ramdisk_size = boot_params.hdr.ramdisk_size;
 
+	if (boot_params.hdr.version >= 0x020c)
+		ramdisk_size |= (u64)boot_params.hdr.ext_ramdisk_size << 32;
+
 	return ramdisk_size;
 }
 
-- 
1.7.7
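
The way the ext_* fields extend the existing 32-bit fields can be
sketched in plain C. This is only an illustration under assumptions:
the struct below is a hypothetical, cut-down stand-in for the relevant
setup_header fields, and the helper names are invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the header fields touched by this patch. */
struct ramdisk_fields {
	uint32_t ramdisk_image;      /* low 32 bits of the address */
	uint32_t ramdisk_size;       /* low 32 bits of the size */
	uint32_t ext_ramdisk_image;  /* high 32 bits of the address */
	uint32_t ext_ramdisk_size;   /* high 32 bits of the size */
};

/* Widen before shifting: shifting a 32-bit value by 32 is undefined. */
static inline uint64_t join_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

static inline uint64_t ramdisk_image_sketch(const struct ramdisk_fields *f)
{
	return join_hi_lo(f->ext_ramdisk_image, f->ramdisk_image);
}

static inline uint64_t ramdisk_size_sketch(const struct ramdisk_fields *f)
{
	return join_hi_lo(f->ext_ramdisk_size, f->ramdisk_size);
}

/* A 32M ramdisk placed at 6G: the address needs the ext field, the size does not. */
static inline void ramdisk_fields_selfcheck(void)
{
	struct ramdisk_fields f = {
		.ramdisk_image = 0x80000000u, .ext_ramdisk_image = 0x1u,
		.ramdisk_size  = 0x02000000u, .ext_ramdisk_size  = 0x0u,
	};

	assert(ramdisk_image_sketch(&f) == 0x180000000ULL);
	assert(ramdisk_size_sketch(&f) == 0x02000000ULL);
}
```

A loader that leaves the ext_* fields at zero gets exactly the old
below-4G behaviour, which is why zero is the documented default.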


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 12/12] x86: remove 1024g limitation for kexec buffer on 64bit
  2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
                   ` (10 preceding siblings ...)
  2012-11-21  7:16 ` [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high Yinghai Lu
@ 2012-11-21  7:16 ` Yinghai Lu
  11 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Eric W. Biederman, linux-kernel, Yinghai Lu

The 64-bit kernel now supports more than 1T of RAM, and kexec tools
can find buffers above 1T, so remove the obsolete limitation
and use MAXMEM instead.

Tested on a system with more than 1024G of RAM.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/x86/include/asm/kexec.h |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 317ff17..11bfdc5 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -48,11 +48,11 @@
 # define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
 #else
 /* Maximum physical address we can use pages from */
-# define KEXEC_SOURCE_MEMORY_LIMIT      (0xFFFFFFFFFFUL)
+# define KEXEC_SOURCE_MEMORY_LIMIT      (MAXMEM-1)
 /* Maximum address we can reach in physical address mode */
-# define KEXEC_DESTINATION_MEMORY_LIMIT (0xFFFFFFFFFFUL)
+# define KEXEC_DESTINATION_MEMORY_LIMIT (MAXMEM-1)
 /* Maximum address we can use for the control pages */
-# define KEXEC_CONTROL_MEMORY_LIMIT     (0xFFFFFFFFFFUL)
+# define KEXEC_CONTROL_MEMORY_LIMIT     (MAXMEM-1)
 
 /* Allocate one page for the pdp and the second for the code */
 # define KEXEC_CONTROL_PAGE_SIZE  (4096UL + 4096UL)
-- 
1.7.7
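
For reference, the removed constant 0xFFFFFFFFFF is a 40-bit mask, which
is where the 1024G figure in the subject comes from. A small sketch of
the arithmetic follows. It assumes MAXMEM is 1 << 46 on x86_64, which
should be checked against the tree this is applied to; the macro and
function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define OLD_KEXEC_LIMIT       0xFFFFFFFFFFULL  /* value removed by the patch */
#define ASSUMED_PHYSMEM_BITS  46               /* assumption, not from the patch */
#define ASSUMED_MAXMEM        (1ULL << ASSUMED_PHYSMEM_BITS)

/* Number of address bits needed to represent max_addr. */
static inline unsigned int address_bits(uint64_t max_addr)
{
	unsigned int bits = 0;

	while (max_addr) {
		bits++;
		max_addr >>= 1;
	}
	return bits;
}

static inline void kexec_limit_selfcheck(void)
{
	/* Old limit: 40 address bits, i.e. 1T (1024G). */
	assert(address_bits(OLD_KEXEC_LIMIT) == 40);
	assert((OLD_KEXEC_LIMIT + 1) == (1ULL << 40));
	/* New limit under the stated assumption: 46 bits, 64 times larger. */
	assert(address_bits(ASSUMED_MAXMEM - 1) == 46);
	assert(ASSUMED_MAXMEM / (OLD_KEXEC_LIMIT + 1) == 64);
}
```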


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 05/12] x86: Merge early_reserve_initrd for 32bit and 64bit
  2012-11-21  7:16 ` [PATCH v3 05/12] x86: Merge early_reserve_initrd for 32bit and 64bit Yinghai Lu
@ 2012-11-21  7:40   ` Pekka Enberg
  0 siblings, 0 replies; 57+ messages in thread
From: Pekka Enberg @ 2012-11-21  7:40 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Eric W. Biederman,
	linux-kernel

On Wed, Nov 21, 2012 at 9:16 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> They are the same, could move them out from head32/64.c to setup.c.
>
> We are using memblock, and it could handle overlapping properly, so
> we don't need to reserve some at first to hold the location, and just
> need to make sure we reserve them before we are using memblock to find
> free mem to use.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>

Reviewed-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-21  7:16 ` [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high Yinghai Lu
@ 2012-11-21 17:17   ` H. Peter Anvin
  2012-11-21 18:59     ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-21 17:17 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On 11/20/2012 11:16 PM, Yinghai Lu wrote:
>
> diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
> index 9efceff..a8263f7 100644
> --- a/Documentation/x86/boot.txt
> +++ b/Documentation/x86/boot.txt
> @@ -57,6 +57,9 @@ Protocol 2.10:	(Kernel 2.6.31) Added a protocol for relaxed alignment
>   Protocol 2.11:	(Kernel 3.6) Added a field for offset of EFI handover
>   		protocol entry point.
>
> +Protocol 2.12:	(Kernel 3.9) Added three fields for loading bzImage and
> +		 ramdisk above 4G with 64bit.
> +
>   **** MEMORY LAYOUT
>
>   The traditional memory map for the kernel loader, used for Image or
> @@ -182,7 +185,7 @@ Offset	Proto	Name		Meaning
>   0230/4	2.05+	kernel_alignment Physical addr alignment required for kernel
>   0234/1	2.05+	relocatable_kernel Whether kernel is relocatable or not
>   0235/1	2.10+	min_alignment	Minimum alignment, as a power of two
> -0236/2	N/A	pad3		Unused
> +0236/2	2.12+	xloadflags	Boot protocal option flags
                                              ^^^^^^^^
>   0238/4	2.06+	cmdline_size	Maximum size of the kernel command line
>   023C/4	2.07+	hardware_subarch Hardware subarchitecture
>   0240/8	2.07+	hardware_subarch_data Subarchitecture-specific data
> @@ -193,6 +196,9 @@ Offset	Proto	Name		Meaning
>   0258/8	2.10+	pref_address	Preferred loading address
>   0260/4	2.10+	init_size	Linear memory required during initialization
>   0264/4	2.11+	handover_offset	Offset of handover entry point
> +0268/4	2.12+	ext_ramdisk_image ramdisk_image 32 bits

"high 32 bits" presumably...

> +026C/4	2.12+	ext_ramdisk_size ramdisk_size high 32 bits
> +0270/4	2.12+   ext_cmd_line_ptr cmd_line_ptr high 32 bits

I'm looking at these three fields and I'm getting worried about space -- 
there are only two more word-sized fields possible in this structure. 
Since these fields are not initialized (default to zero) and almost 
certainly aren't useful for people entering via the 16-bit entry point I 
think we should move them out of struct setup_header and into the 
remainder of struct boot_param.
> diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
> index b4c913c..00678d3 100644
> --- a/arch/x86/boot/compressed/cmdline.c
> +++ b/arch/x86/boot/compressed/cmdline.c
> @@ -17,6 +17,9 @@ static unsigned long get_cmd_line_ptr(void)
>   {
>   	unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
>
> +	if (real_mode->hdr.version >= 0x020c)
> +		cmd_line_ptr |= (u64)real_mode->hdr.ext_cmd_line_ptr << 32;
> +
>   	return cmd_line_ptr;
>   }

No.  hdr.version is information from the kernel to the bootloader; it is 
meaningless to look at it inside the kernel.

Same in a bunch of other places.
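
To illustrate the intended direction of that information, here is a
rough sketch of the check as a bootloader might do it, reading the
values the kernel image advertises. The function name is made up;
LOADED_ABOVE_4G follows the name used in header.S in this patch.

```c
#include <assert.h>
#include <stdint.h>

#define LOADED_ABOVE_4G  (1u << 0)  /* xloadflags bit 0, as in this patch */

/*
 * Loader-side decision: version and xloadflags are read from the
 * kernel image's setup header, never the other way around.
 */
static inline int loader_may_load_above_4g(uint16_t version, uint16_t xloadflags)
{
	/* Before 2.12 the two bytes at 0x236 were padding, not flags. */
	if (version < 0x020c)
		return 0;

	return (xloadflags & LOADED_ABOVE_4G) != 0;
}

static inline void loader_check_selfcheck(void)
{
	assert(loader_may_load_above_4g(0x020b, LOADED_ABOVE_4G) == 0);
	assert(loader_may_load_above_4g(0x020c, 0) == 0);
	assert(loader_may_load_above_4g(0x020c, LOADED_ABOVE_4G) == 1);
}
```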

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 08/12] x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor()
  2012-11-21  7:16 ` [PATCH v3 08/12] x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor() Yinghai Lu
@ 2012-11-21 17:21   ` H. Peter Anvin
  2012-11-21 19:18     ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-21 17:21 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel

On 11/20/2012 11:16 PM, Yinghai Lu wrote:
> At that stage, it is already in 32bit protected mode or 64bit mode.
> so we do not need to check if ptr less 1M.
>
> When go from other boot loader (kexec) instead of boot/ code path.
>
> Move out accessible checking out __cmdline_find_option....
>
> So misc.c will parse cmdline and have debug print out.

Your description doesn't seem to match the code, and is incredibly 
confusing to the reader.

The reason why is because you leave out an essential piece of 
information: cmdline.c is included both in 16-bit code and in the 
decompressor (32/64-bit code), so you want to move the test out of the 
shared code.

	-hpa
-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
  2012-11-21  7:15 ` [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200 Yinghai Lu
@ 2012-11-21 17:23   ` H. Peter Anvin
  2012-11-21 19:45     ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-21 17:23 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Matt Fleming

On 11/20/2012 11:15 PM, Yinghai Lu wrote:
> We are short of space before 0x200 that is entry for startup_64.
>
> And we can not change startup_64 to other value --- ABI ?

Here you are saying "I don't understand how this works."  It is YOUR 
responsibility to find out and write a definite statement rather than 
leaving that to the reader, or expect the maintainer to edit this.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-21 17:17   ` H. Peter Anvin
@ 2012-11-21 18:59     ` Yinghai Lu
  2012-11-21 19:18       ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21 18:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Wed, Nov 21, 2012 at 9:17 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/20/2012 11:16 PM, Yinghai Lu wrote:
>>
>>
>> diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
>> index 9efceff..a8263f7 100644
>> --- a/Documentation/x86/boot.txt
>> +++ b/Documentation/x86/boot.txt
>> @@ -57,6 +57,9 @@ Protocol 2.10:        (Kernel 2.6.31) Added a protocol
>> for relaxed alignment
>>   Protocol 2.11:        (Kernel 3.6) Added a field for offset of EFI
>> handover
>>                 protocol entry point.
>>
>> +Protocol 2.12: (Kernel 3.9) Added three fields for loading bzImage and
>> +                ramdisk above 4G with 64bit.
>> +
>>   **** MEMORY LAYOUT
>>
>>   The traditional memory map for the kernel loader, used for Image or
>> @@ -182,7 +185,7 @@ Offset      Proto   Name            Meaning
>>   0230/4        2.05+   kernel_alignment Physical addr alignment required
>> for kernel
>>   0234/1        2.05+   relocatable_kernel Whether kernel is relocatable
>> or not
>>   0235/1        2.10+   min_alignment   Minimum alignment, as a power of
>> two
>> -0236/2 N/A     pad3            Unused
>> +0236/2 2.12+   xloadflags      Boot protocal option flags
>
>                                              ^^^^^^^^
sorry.
>
>>   0238/4        2.06+   cmdline_size    Maximum size of the kernel command
>> line
>>   023C/4        2.07+   hardware_subarch Hardware subarchitecture
>>   0240/8        2.07+   hardware_subarch_data Subarchitecture-specific
>> data
>> @@ -193,6 +196,9 @@ Offset      Proto   Name            Meaning
>>   0258/8        2.10+   pref_address    Preferred loading address
>>   0260/4        2.10+   init_size       Linear memory required during
>> initialization
>>   0264/4        2.11+   handover_offset Offset of handover entry point
>> +0268/4 2.12+   ext_ramdisk_image ramdisk_image 32 bits
>
>
> "high 32 bits" presumably...

ok

>
>
>> +026C/4 2.12+   ext_ramdisk_size ramdisk_size high 32 bits
>> +0270/4 2.12+   ext_cmd_line_ptr cmd_line_ptr high 32 bits
>
>
> I'm looking at these three fields and I'm getting worried about space --
> there are only two more word-sized fields possible in this structure. Since
> these fields are not initialized (default to zero) and almost certainly
> aren't useful for people entering via the 16-bit entry point I think we
> should move them out of struct setup_header and into the remainder of struct
> boot_param.

in boot_param:

        struct setup_header hdr;    /* setup header */  /* 0x1f1 */
        __u8  _pad7[0x290-0x1f1-sizeof(struct setup_header)];
        __u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX];      /* 0x290 */
        struct e820entry e820_map[E820MAX];             /* 0x2d0 */
        __u8  _pad8[48];                                /* 0xcd0 */
        struct edd_info eddbuf[EDDMAXNR];               /* 0xd00 */
        __u8  _pad9[276];                               /* 0xeec */

So we can use the space up to 0x290.

After those three dwords, there will still be 7 left.

>
>> diff --git a/arch/x86/boot/compressed/cmdline.c
>> b/arch/x86/boot/compressed/cmdline.c
>> index b4c913c..00678d3 100644
>> --- a/arch/x86/boot/compressed/cmdline.c
>> +++ b/arch/x86/boot/compressed/cmdline.c
>> @@ -17,6 +17,9 @@ static unsigned long get_cmd_line_ptr(void)
>>   {
>>         unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
>>
>> +       if (real_mode->hdr.version >= 0x020c)
>> +               cmd_line_ptr |= (u64)real_mode->hdr.ext_cmd_line_ptr <<
>> 32;
>> +
>>         return cmd_line_ptr;
>>   }
>
>
> No.  hdr.version is information from the kernel to the bootloader; it is
> meaningless to look at it inside the kernel.
>
We could remove them, but what about the vmlinux ELF?

When kexec loads a vmlinux ELF, it fakes a header and fills in the
version there.

> Same in a bunch of other places.
>

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 08/12] x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor()
  2012-11-21 17:21   ` H. Peter Anvin
@ 2012-11-21 19:18     ` Yinghai Lu
  0 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21 19:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel

On Wed, Nov 21, 2012 at 9:21 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/20/2012 11:16 PM, Yinghai Lu wrote:
>>
>> At that stage, it is already in 32bit protected mode or 64bit mode.
>> so we do not need to check if ptr less 1M.
>>
>> When go from other boot loader (kexec) instead of boot/ code path.
>>
>> Move out accessible checking out __cmdline_find_option....
>>
>> So misc.c will parse cmdline and have debug print out.
>
>
> Your description doesn't seem to match the code, and is incredibly confusing
> to the reader.
>
> The reason why is because you leave out an essential piece of information:
> cmdline.c is included both in 16-bit code and in the decompressor (32/64-bit
> code), so you want to move the test out of the shared code.

Updated the changelog to:

Subject: [PATCH] x86, boot: move checking of cmd_line_ptr out of common path

cmdline.c::__cmdline_find_option... is shared between the
16-bit setup code and the 32/64-bit decompressor code.

For the 32/64-bit-only path via kexec, we should not check whether the
pointer is below 1M, as the command line could be placed above 1M or
even above 4G.

Move the accessibility check out of __cmdline_find_option...
so that the decompressor in misc.c can parse the command line correctly.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-21 18:59     ` Yinghai Lu
@ 2012-11-21 19:18       ` H. Peter Anvin
  2012-11-22  5:56         ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-21 19:18 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On 11/21/2012 10:59 AM, Yinghai Lu wrote:
> 
> in boot_param:
> 
>         struct setup_header hdr;    /* setup header */  /* 0x1f1 */
>         __u8  _pad7[0x290-0x1f1-sizeof(struct setup_header)];
>         __u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX];      /* 0x290 */
>         struct e820entry e820_map[E820MAX];             /* 0x2d0 */
>         __u8  _pad8[48];                                /* 0xcd0 */
>         struct edd_info eddbuf[EDDMAXNR];               /* 0xd00 */
>         __u8  _pad9[276];                               /* 0xeec */
> 
> so we can use till 0x290.
> 
> and after those three dword, will still have 7 left.
> 

Not quite... the length of the initialized header is given by the byte
at 0x201, which can be at most 0x7f unfortunately.  This means 0x280 is
the endpoint, not 0x290.  Some bootloaders rely on this.
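
To make the arithmetic concrete: boot.txt gives the end of the setup
header as 0x202 plus the byte at 0x201, and that byte is the operand of
a short forward jump, so it cannot exceed 0x7f. A small sketch with
made-up helper names, using 0x274 as the end of ext_cmd_line_ptr
(0x270 + 4) from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* End of the initialized header, per the boot protocol rule quoted above. */
static inline unsigned int setup_header_end(uint8_t jmp_rel8)
{
	assert(jmp_rel8 <= 0x7f);  /* short forward jump operand */
	return 0x202u + jmp_rel8;
}

/* Whole 32-bit fields that still fit between used_end and limit. */
static inline unsigned int dwords_left(unsigned int used_end, unsigned int limit)
{
	return (limit - used_end) / 4;
}

static inline void header_space_selfcheck(void)
{
	const unsigned int used_end = 0x270u + 4;  /* after ext_cmd_line_ptr */

	/* Counting up to edd_mbr_sig_buffer at 0x290 gives the 7 dwords claimed. */
	assert(dwords_left(used_end, 0x290u) == 7);
	/* The jump operand caps the header at 0x281, so only 3 dwords fit. */
	assert(setup_header_end(0x7f) == 0x281u);
	assert(dwords_left(used_end, setup_header_end(0x7f)) == 3);
	assert(dwords_left(used_end, 0x280u) == 3);
}
```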

However, from the point of view of the 32- and 64-bit entry points, this
is effectively a .data segment, but these can go into the corresponding
.bss segment, which is the rest of struct boot_params.

>>
>>> diff --git a/arch/x86/boot/compressed/cmdline.c
>>> b/arch/x86/boot/compressed/cmdline.c
>>> index b4c913c..00678d3 100644
>>> --- a/arch/x86/boot/compressed/cmdline.c
>>> +++ b/arch/x86/boot/compressed/cmdline.c
>>> @@ -17,6 +17,9 @@ static unsigned long get_cmd_line_ptr(void)
>>>   {
>>>         unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
>>>
>>> +       if (real_mode->hdr.version >= 0x020c)
>>> +               cmd_line_ptr |= (u64)real_mode->hdr.ext_cmd_line_ptr <<
>>> 32;
>>> +
>>>         return cmd_line_ptr;
>>>   }
>>
>>
>> No.  hdr.version is information from the kernel to the bootloader; it is
>> meaningless to look at it inside the kernel.
>>
> could remove them, but how about vmlinux elf.
> 
> when kexec vmlinux elf, it will fake one hdr, and fill version there.
> 
>> Same in a bunch of other places.

Then whatever loads vmlinux.elf is responsible for initializing those
fields to zero anyway.  It is still an atrocious abuse.  What we
probably need to do is to include the initialized header in a section in
vmlinux.elf containing the default struct boot_params.  This is the kind
of things that happen when people do things without thinking through all
the consequences.

	-hpa



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
  2012-11-21 17:23   ` H. Peter Anvin
@ 2012-11-21 19:45     ` Yinghai Lu
  2012-11-21 19:50       ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21 19:45 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Matt Fleming

[-- Attachment #1: Type: text/plain, Size: 1662 bytes --]

On Wed, Nov 21, 2012 at 9:23 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/20/2012 11:15 PM, Yinghai Lu wrote:
>>
>> We are short of space before 0x200 that is entry for startup_64.
>>
>> And we can not change startup_64 to other value --- ABI ?
>
>
> Here you are saying "I don't understand how this works."  It is YOUR
> responsibility to find out and write a definite statement rather than
> leaving that to the reader, or expect the maintainer to edit this.

Actually, I cannot find that out.
The code in arch/x86/boot/compressed/head_64.S says:

        /*
         * Be careful here startup_64 needs to be at a predictable
         * address so I can export it in an ELF header.  Bootloaders
         * should look at the ELF header to find this address, as
         * it may change in the future.
         */
        .code64
        .org 0x200
ENTRY(startup_64)
        /*
         * We come here either from startup_32 or directly from a
         * 64bit bootloader.  If we come here from a bootloader we depend on
         * an identity mapped page table being provied that maps our
         * entire text+data+bss and hopefully all of memory.
         */
#ifdef CONFIG_EFI_STUB
        /*
         * The entry point for the PE/COFF executable is 0x210, so only
         * legacy boot loaders will execute this jmp.
         */
        jmp     preferred_addr

        .org 0x210
        mov     %rcx, %rdi

and it says that 0x200 may change in the future.

You said it has to stay at 0x200; do you mean that the 0x210 entry
point for PE/COFF forces that?

I wonder if you would consider the attached patch to move startup_64
down; we could drop one jmp.

Thanks

Yinghai

[-- Attachment #2: new_startup_64_0x400.patch --]
[-- Type: application/octet-stream, Size: 1901 bytes --]

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 2c4b171..4cb40d7 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -184,28 +184,9 @@ no_longmode:
 
 #include "../../kernel/verify_cpu.S"
 
-	/*
-	 * Be careful here startup_64 needs to be at a predictable
-	 * address so I can export it in an ELF header.  Bootloaders
-	 * should look at the ELF header to find this address, as
-	 * it may change in the future.
-	 */
 	.code64
-	.org 0x200
-ENTRY(startup_64)
-	/*
-	 * We come here either from startup_32 or directly from a
-	 * 64bit bootloader.  If we come here from a bootloader we depend on
-	 * an identity mapped page table being provied that maps our
-	 * entire text+data+bss and hopefully all of memory.
-	 */
 #ifdef CONFIG_EFI_STUB
-	/*
-	 * The entry point for the PE/COFF executable is 0x210, so only
-	 * legacy boot loaders will execute this jmp.
-	 */
-	jmp	preferred_addr
-
+	/* The entry point for the PE/COFF executable is 0x210 */
 	.org 0x210
 	mov	%rcx, %rdi
 	mov	%rdx, %rsi
@@ -234,12 +215,26 @@ ENTRY(startup_64)
 	subq	$3b, %rax
 	subq	BP_pref_address(%rsi), %rax
 	add	BP_code32_start(%esi), %eax
-	leaq	preferred_addr(%rax), %rax
+	leaq	startup_64(%rax), %rax
 	jmp	*%rax
 
-preferred_addr:
 #endif
 
+	/*
+	 * Be careful here startup_64 needs to be at a predictable
+	 * address so I can export it in an ELF header.  Bootloaders
+	 * should look at the ELF header to find this address, as
+	 * it may change in the future.
+	 */
+	.org 0x400
+ENTRY(startup_64)
+	/*
+	 * We come here either from startup_32 or directly from a
+	 * 64bit bootloader.  If we come here from a bootloader we depend on
+	 * an identity mapped page table being provied that maps our
+	 * entire text+data+bss and hopefully all of memory.
+	 */
+
 	/* Setup data segments. */
 	xorl	%eax, %eax
 	movl	%eax, %ds

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
  2012-11-21 19:45     ` Yinghai Lu
@ 2012-11-21 19:50       ` H. Peter Anvin
  2012-11-21 20:15         ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-21 19:50 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Matt Fleming

On 11/21/2012 11:45 AM, Yinghai Lu wrote:
> On Wed, Nov 21, 2012 at 9:23 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 11/20/2012 11:15 PM, Yinghai Lu wrote:
>>>
>>> We are short of space before 0x200 that is entry for startup_64.
>>>
>>> And we can not change startup_64 to other value --- ABI ?
>>
>>
>> Here you are saying "I don't understand how this works."  It is YOUR
>> responsibility to find out and write a definite statement rather than
>> leaving that to the reader, or expect the maintainer to edit this.
> 
> actually, i can not find that out.
> in the code of arch/x86/boot/compressed/head_64.S
> 
>         /*
>          * Be careful here startup_64 needs to be at a predictable
>          * address so I can export it in an ELF header.  Bootloaders
>          * should look at the ELF header to find this address, as
>          * it may change in the future.
>          */
>         .code64
>         .org 0x200
> ENTRY(startup_64)
>         /*
>          * We come here either from startup_32 or directly from a
>          * 64bit bootloader.  If we come here from a bootloader we depend on
>          * an identity mapped page table being provied that maps our
>          * entire text+data+bss and hopefully all of memory.
>          */
> #ifdef CONFIG_EFI_STUB
>         /*
>          * The entry point for the PE/COFF executable is 0x210, so only
>          * legacy boot loaders will execute this jmp.
>          */
>         jmp     preferred_addr
> 
>         .org 0x210
>         mov     %rcx, %rdi
> 
> and it says that 0x200 will be changed later..
> 
> so you said it has to stay with 0x200, do you mean 0x210 from PE/COFF
> force that?
> 
> wonder if you are considering attatched patch to move startup_64 down...
> we could kill one jmp.
> 

The comment is just plain wrong.  It assumes you're loading an ELF file,
whereas in practice that is rarely true.

This does explain why the poor ABI, though.  A jump table at the
beginning would have been a lot cleaner.

	-hpa



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
  2012-11-21 19:50       ` H. Peter Anvin
@ 2012-11-21 20:15         ` Yinghai Lu
  2012-11-22  5:48           ` Eric W. Biederman
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-21 20:15 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Matt Fleming

On Wed, Nov 21, 2012 at 11:50 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> The comment is just plain wrong.  It assumes you're loading an ELF file,
> whereas in practice that is rarely true.
>
> This does explain why the poor ABI, though.  A jump table at the
> beginning would have been a lot cleaner.

Could you please send a patch that updates the comment and points to the API there?

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
  2012-11-21 20:15         ` Yinghai Lu
@ 2012-11-22  5:48           ` Eric W. Biederman
       [not found]             ` <3178cb29-0e9e-44d2-b21f-45c53f38980a@email.android.com>
  0 siblings, 1 reply; 57+ messages in thread
From: Eric W. Biederman @ 2012-11-22  5:48 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel, Matt Fleming

Yinghai Lu <yinghai@kernel.org> writes:

> On Wed, Nov 21, 2012 at 11:50 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> The comment is just plain wrong.  It assumes you're loading an ELF file,
>> whereas in practice that is rarely true.
>>
>> This does explain why the poor ABI, though.  A jump table at the
>> beginning would have been a lot cleaner.
>
> Can you please have patch to update the comments and point to the API there ?

Long ago and far away.  I wrote the 64bit entry code for a bzImage by
putting an ELF header in the boot sector.  That is what that comment
referred to when it was written.  Andrew had a problem on one of his
test machines and so the patch to bootsector was dropped.

Booting with a vmlinux file does not (or at least should not) need this.
ELF loaders will pick the entry point out from the ELF header.

I don't know what has happened in the intervening time, or who if anyone
has depended on that offset.  The original intent was a hard-coded value
for internal use.

Eric




^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-21 19:18       ` H. Peter Anvin
@ 2012-11-22  5:56         ` Yinghai Lu
       [not found]           ` <a1ca794a-09d4-4d36-8c8c-67100cb3696e@email.android.com>
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22  5:56 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Wed, Nov 21, 2012 at 11:18 AM, H. Peter Anvin <hpa@zytor.com> wrote:

> Then whatever loads vmlinux.elf is responsible for initializing those
> fields to zero anyway.  It is still an atrocious abuse.  What we
> probably need to do is to include the initialized header in a section in
> vmlinux.elf containing the default struct boot_params.  This is the kind
> of things that happen when people do things without thinking through all
> the consequences.

OK, I will remove the version check in the kernel.

Also, do you still think we need to move ext_ramdisk... ext_cmd_line_ptr
from setup_header to boot_params?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
       [not found]           ` <a1ca794a-09d4-4d36-8c8c-67100cb3696e@email.android.com>
@ 2012-11-22  6:47             ` Yinghai Lu
  2012-11-22  6:58               ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22  6:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Wed, Nov 21, 2012 at 9:58 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Yes, lets...
>
>
>> also do you still think need to move ext_ramdisk... ext_cmd_line_ptr
>> from setup_header to boot_param ?
>
But that looks weird:
ramdisk_image, ramdisk_size and cmd_line_ptr are in setup_header,
but the ext_... fields are in boot_params.
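
For illustration, here is a minimal sketch of how the split fields could be
consumed, assuming each ext_ field simply carries the upper 32 bits of the
matching setup_header value. The struct and the sketch_* helpers are
hypothetical and are not the real struct layout or the kernel's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, cut-down view of the fields under discussion.
 * The real fields live in setup_header and boot_params.
 */
struct ramdisk_fields {
	uint32_t ramdisk_image;      /* low 32 bits (setup_header) */
	uint32_t ramdisk_size;       /* low 32 bits (setup_header) */
	uint32_t ext_ramdisk_image;  /* assumed: high 32 bits */
	uint32_t ext_ramdisk_size;   /* assumed: high 32 bits */
};

/* Join a high and a low 32-bit half into one 64-bit value. */
static uint64_t combine_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* 64-bit ramdisk address assembled from both halves. */
static uint64_t sketch_ramdisk_image(const struct ramdisk_fields *f)
{
	assert(f != NULL);
	return combine_hi_lo(f->ext_ramdisk_image, f->ramdisk_image);
}

/* 64-bit ramdisk size assembled from both halves. */
static uint64_t sketch_ramdisk_size(const struct ramdisk_fields *f)
{
	assert(f != NULL);
	return combine_hi_lo(f->ext_ramdisk_size, f->ramdisk_size);
}
```

A loader that leaves the ext_ fields at zero gets the old 32-bit behaviour
unchanged, which is why stale non-zero bytes in those slots matter.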

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22  6:47             ` Yinghai Lu
@ 2012-11-22  6:58               ` Yinghai Lu
  2012-11-22 15:59                 ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22  6:58 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Wed, Nov 21, 2012 at 10:47 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Wed, Nov 21, 2012 at 9:58 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> Yes, lets...
>>
>>
>>> also do you still think need to move ext_ramdisk... ext_cmd_line_ptr
>>> from setup_header to boot_param ?
>>

how about:

diff --git a/arch/x86/include/asm/bootparam.h b/arch/x86/include/asm/bootparam.h
index 2ad874c..81b619e 100644
--- a/arch/x86/include/asm/bootparam.h
+++ b/arch/x86/include/asm/bootparam.h
@@ -100,7 +100,10 @@ struct boot_params {
        __u8  _pad2[4];                                 /* 0x054 */
        __u64  tboot_addr;                              /* 0x058 */
        struct ist_info ist_info;                       /* 0x060 */
-       __u8  _pad3[16];                                /* 0x070 */
+       __u32 ext_ramdisk_image;                        /* 0x070 */
+       __u32 ext_ramdisk_size;                         /* 0x074 */
+       __u32 ext_cmd_line_ptr;                         /* 0x078 */
+       __u8  _pad3[4];                                 /* 0x07C */
        __u8  hd0_info[16];     /* obsolete! */         /* 0x080 */
        __u8  hd1_info[16];     /* obsolete! */         /* 0x090 */
        struct sys_desc_table sys_desc_table;           /* 0x0a0 */

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
       [not found]             ` <3178cb29-0e9e-44d2-b21f-45c53f38980a@email.android.com>
@ 2012-11-22 11:27               ` Eric W. Biederman
  2012-11-24  7:00                 ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: Eric W. Biederman @ 2012-11-22 11:27 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, linux-kernel, Matt Fleming

"H. Peter Anvin" <hpa@zytor.com> writes:

> Quite certain something depends on it.

It would not surprise me at all that there is a dependency, if we have
not had a better way to report the 64bit entry point.  I just wanted to
make the context clear as that was confused in the discussion.

Note that having a 32bit entry point at offset 0 is as much of an ABI.

I am surprised that there are legitimate reasons to bulk up the 32bit
entry point code before the 0x200.  Everything that we are doing at that
point is architectural.

Eric

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22  6:58               ` Yinghai Lu
@ 2012-11-22 15:59                 ` H. Peter Anvin
  2012-11-22 18:28                   ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-22 15:59 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

Looks good to me.

Yinghai Lu <yinghai@kernel.org> wrote:

>On Wed, Nov 21, 2012 at 10:47 PM, Yinghai Lu <yinghai@kernel.org>
>wrote:
>> On Wed, Nov 21, 2012 at 9:58 PM, H. Peter Anvin <hpa@zytor.com>
>wrote:
>>> Yes, lets...
>>>
>>>
>>>> also do you still think need to move ext_ramdisk...
>ext_cmd_line_ptr
>>>> from setup_header to boot_param ?
>>>
>
>how about:
>
>diff --git a/arch/x86/include/asm/bootparam.h
>b/arch/x86/include/asm/bootparam.h
>index 2ad874c..81b619e 100644
>--- a/arch/x86/include/asm/bootparam.h
>+++ b/arch/x86/include/asm/bootparam.h
>@@ -100,7 +100,10 @@ struct boot_params {
>        __u8  _pad2[4];                                 /* 0x054 */
>        __u64  tboot_addr;                              /* 0x058 */
>        struct ist_info ist_info;                       /* 0x060 */
>-       __u8  _pad3[16];                                /* 0x070 */
>+       __u32 ext_ramdisk_image;                        /* 0x070 */
>+       __u32 ext_ramdisk_size;                         /* 0x074 */
>+       __u32 ext_cmd_line_ptr;                         /* 0x078 */
>+       __u8  _pad3[4];                                 /* 0x07C */
>        __u8  hd0_info[16];     /* obsolete! */         /* 0x080 */
>        __u8  hd1_info[16];     /* obsolete! */         /* 0x090 */
>        struct sys_desc_table sys_desc_table;           /* 0x0a0 */

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 15:59                 ` H. Peter Anvin
@ 2012-11-22 18:28                   ` Yinghai Lu
  2012-11-22 18:37                     ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22 18:28 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Thu, Nov 22, 2012 at 7:59 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> Looks good to me.
>

This has a problem with old kexec: it simply copies the header from the bzImage,
including setup_header, and uses that as boot_params.

00000000  ea 05 00 c0 07 8c c8 8e  d8 8e c0 8e d0 31 e4 fb  |.............1..|
00000010  fc be 2d 00 ac 20 c0 74  09 b4 0e bb 07 00 cd 10  |..-.. .t........|
00000020  eb f2 31 c0 cd 16 cd 19  ea f0 ff 00 f0 44 69 72  |..1..........Dir|
00000030  65 63 74 20 66 6c 6f 70  70 79 20 62 6f 6f 74 20  |ect floppy boot |
00000040  69 73 20 6e 6f 74 20 73  75 70 70 6f 72 74 65 64  |is not supported|
00000050  2e 20 55 73 65 20 61 20  62 6f 6f 74 20 6c 6f 61  |. Use a boot loa|
00000060  64 65 72 20 70 72 6f 67  72 61 6d 20 69 6e 73 74  |der program inst|
00000070  65 61 64 2e 0d 0a 0a 52  65 6d 6f 76 65 20 64 69  |ead....Remove di|
00000080  73 6b 20 61 6e 64 20 70  72 65 73 73 20 61 6e 79  |sk and press any|
00000090  20 6b 65 79 20 74 6f 20  72 65 62 6f 6f 74 20 2e  | key to reboot .|
000000a0  2e 2e 0d 0a 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 21 01 00 33 15 09 00  00 00 ff ff 00 00 55 aa  |.!..3.........U.|


So there will already be data at 0x70.

Then I changed it to 0xC0, but when CONFIG_EFI_STUB is enabled there is a value
there too.


00000000  4d 5a ea 07 00 c0 07 8c  c8 8e d8 8e c0 8e d0 31  |MZ.............1|
00000010  e4 fb fc be 40 00 ac 20  c0 74 09 b4 0e bb 07 00  |....@.. .t......|
00000020  cd 10 eb f2 31 c0 cd 16  cd 19 ea f0 ff 00 f0 00  |....1...........|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 b8 00 00 00  |................|
00000040  44 69 72 65 63 74 20 66  6c 6f 70 70 79 20 62 6f  |Direct floppy bo|
00000050  6f 74 20 69 73 20 6e 6f  74 20 73 75 70 70 6f 72  |ot is not suppor|
00000060  74 65 64 2e 20 55 73 65  20 61 20 62 6f 6f 74 20  |ted. Use a boot |
00000070  6c 6f 61 64 65 72 20 70  72 6f 67 72 61 6d 20 69  |loader program i|
00000080  6e 73 74 65 61 64 2e 0d  0a 0a 52 65 6d 6f 76 65  |nstead....Remove|
00000090  20 64 69 73 6b 20 61 6e  64 20 70 72 65 73 73 20  | disk and press |
000000a0  61 6e 79 20 6b 65 79 20  74 6f 20 72 65 62 6f 6f  |any key to reboo|
000000b0  74 20 2e 2e 2e 0d 0a 00  50 45 00 00 64 86 03 00  |t ......PE..d...|
000000c0  00 00 00 00 00 00 00 00  01 00 00 00 a0 00 06 02  |................|
000000d0  0b 02 02 14 20 be 91 00  00 00 00 00 00 00 00 00  |.... ...........|
000000e0  10 46 00 00 00 02 00 00  00 00 00 00 00 00 00 00  |.F..............|
000000f0  20 00 00 00 20 00 00 00  00 00 00 00 00 00 00 00  | ... ...........|
00000100  00 00 00 00 00 00 00 00  20 c0 91 00 00 02 00 00  |........ .......|
00000110  00 00 00 00 0a 00 00 00  00 00 00 00 00 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 06 00 00 00  |................|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000170  2e 73 65 74 75 70 00 00  e0 41 00 00 00 02 00 00  |.setup...A......|
00000180  e0 41 00 00 00 02 00 00  00 00 00 00 00 00 00 00  |.A..............|
00000190  00 00 00 00 20 00 50 60  2e 72 65 6c 6f 63 00 00  |.... .P`.reloc..|
000001a0  20 00 00 00 e0 43 00 00  20 00 00 00 e0 43 00 00  | ....C.. ....C..|
000001b0  00 00 00 00 00 00 00 00  00 00 00 00 40 00 10 42  |............@..B|
000001c0  2e 74 65 78 74 00 00 00  20 7c 91 00 00 44 00 00  |.text... |...D..|
000001d0  20 7c 91 00 00 44 00 00  00 00 00 00 00 00 00 00  | |...D..........|
000001e0  00 00 00 00 20 00 50 60  00 00 00 00 00 00 00 00  |.... .P`........|
000001f0  00 21 01 00 c2 17 09 00  00 00 ff ff 00 00 55 aa  |.!............U.|


It looks like we can only use [0x30,0x3c) and [0x1e8,0x1f0), but in boot_params
those are apm_bios_info and alt_mem_k...

So it looks like we still have to use setup_header instead.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 18:28                   ` Yinghai Lu
@ 2012-11-22 18:37                     ` H. Peter Anvin
  2012-11-22 18:50                       ` Yinghai Lu
  2012-11-24 12:37                       ` Eric W. Biederman
  0 siblings, 2 replies; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-22 18:37 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On 11/22/2012 10:28 AM, Yinghai Lu wrote:
>
> has problem with old kexec, it only copy header from bzImage include
> setup_header as boot_param.
>

How old are we talking here? This is a clear and blatant bug, and it 
would affect a whole bunch of things, not just this.  In fact, one 
really has to wonder how it can work at all.

One option I guess would be to have a sentinel field which, if it is not 
zero, causes the kernel to zero all of struct setup_info outside of 
setup_header... however, I have a nasty suspicion that this kexec botch 
might be initializing some fields and leaving others unmodified, which 
basically means "there is no hope for sanity and it is just working by 
pure accident."

Eric, do you have any insight here?

> looks we only can use [0x30,0x3c), [0x1e8, 0x1f0), but in boot_params, they
> are apm_bios_info, and alt_mem_k...

... which I suspect get set by said kexec botch.

> so looks we still have to use setup_header instead.

We need to dig into this and either say "this is unsupportable" or put 
in some kind of hack.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 18:37                     ` H. Peter Anvin
@ 2012-11-22 18:50                       ` Yinghai Lu
  2012-11-22 18:51                         ` H. Peter Anvin
  2012-11-24 12:37                       ` Eric W. Biederman
  1 sibling, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22 18:50 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Thu, Nov 22, 2012 at 10:37 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> looks we only can use [0x30,0x3c), [0x1e8, 0x1f0), but in boot_params,
>> they
>> are apm_bios_info, and alt_mem_k...
>
>
> ... which I suspect get set by said kexec botch.
>
>
>> so looks we still have to use setup_header instead.
>
>
> We need to dig into this and either say "this is unsupportable" or put in
> some kind of hack.

OK, I will use 0xc0 instead, and at the same time try to fix this in kexec.

Then users will still be able to use old kexec tools as long as
CONFIG_EFI_STUB is not enabled in the kernel.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 18:50                       ` Yinghai Lu
@ 2012-11-22 18:51                         ` H. Peter Anvin
  2012-11-22 20:18                           ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-22 18:51 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On 11/22/2012 10:50 AM, Yinghai Lu wrote:
>
> ok, I will use 0xc0 instead, and at the same time try to fix that from kexec.
>
> then user will still have chance to use old kexec tools without enable
> CONFIG_EFI_STUB in kernel.
>

If we can get the sentinel hack to work that would probably be useful, 
but we need to understand the exact pathology.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 18:51                         ` H. Peter Anvin
@ 2012-11-22 20:18                           ` Yinghai Lu
  2012-11-22 20:20                             ` H. Peter Anvin
  2012-11-22 20:50                             ` H. Peter Anvin
  0 siblings, 2 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22 20:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Thu, Nov 22, 2012 at 10:51 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/22/2012 10:50 AM, Yinghai Lu wrote:
>>
>>
>> ok, I will use 0xc0 instead, and at the same time try to fix that from
>> kexec.
>>
>> then user will still have chance to use old kexec tools without enable
>> CONFIG_EFI_STUB in kernel.
>>
>
> If we can get the sentinel hack to work that would probably be useful, but
> we need to understand the exact pathology.

With kexec bzImage --real-mode-entry, the code after setup_header will be executed.

So we could clear the values before setup_header after copying the 16-bit
section from the bzImage...

Index: kexec-tools/kexec/arch/i386/kexec-bzImage.c
===================================================================
--- kexec-tools.orig/kexec/arch/i386/kexec-bzImage.c
+++ kexec-tools/kexec/arch/i386/kexec-bzImage.c
@@ -212,6 +212,16 @@ int do_bzImage_load(struct kexec_info *i
 	setup_size = kern16_size + command_line_len + PURGATORY_CMDLINE_SIZE;
 	real_mode = xmalloc(setup_size);
 	memcpy(real_mode, kernel, kern16_size);
+	/*
+	 * clear value before header
+	 * do not clear value after header, --real-mode-entry
+	 * need code after header.
+	 */
+	memset(real_mode, 0, 0x1f1);
+	if (!real_mode_entry) {
+		/* clear value after setup_header  */
+		memset((unsigned char *)real_mode + 0x290, 0, kern16_size - 0x290);
+	}

 	if (info->kexec_flags & (KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT)) {
 		/* If using bzImage for capture kernel, then we will not be

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 20:18                           ` Yinghai Lu
@ 2012-11-22 20:20                             ` H. Peter Anvin
  2012-11-22 20:29                               ` Yinghai Lu
  2012-11-22 20:50                             ` H. Peter Anvin
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-22 20:20 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On 11/22/2012 12:18 PM, Yinghai Lu wrote:
 >
> for kexec bzImage --real-mode-entry, code after setup_header will be executed.
>

For real mode entry we go through the real mode path which takes care of 
this.  What matters is the 32/64-bit entry point.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 20:20                             ` H. Peter Anvin
@ 2012-11-22 20:29                               ` Yinghai Lu
  0 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22 20:29 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Thu, Nov 22, 2012 at 12:20 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/22/2012 12:18 PM, Yinghai Lu wrote:
>>
>>
>> for kexec bzImage --real-mode-entry, code after setup_header will be
>> executed.
>>
>
> For real mode entry we go through the real mode path which takes care of
> this.  What matters is the 32/64-bit entry point.
>

yes.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 20:18                           ` Yinghai Lu
  2012-11-22 20:20                             ` H. Peter Anvin
@ 2012-11-22 20:50                             ` H. Peter Anvin
  2012-11-22 21:02                               ` H. Peter Anvin
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-22 20:50 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On 11/22/2012 12:18 PM, Yinghai Lu wrote:
>>
>> If we can get the sentinel hack to work that would probably be useful, but
>> we need to understand the exact pathology.
>
> for kexec bzImage --real-mode-entry, code after setup_header will be executed.
>
> so we could clear value before setup_header after copy 16bit section
> from bzImage...
>
> Index: kexec-tools/kexec/arch/i386/kexec-bzImage.c
> ===================================================================
> --- kexec-tools.orig/kexec/arch/i386/kexec-bzImage.c
> +++ kexec-tools/kexec/arch/i386/kexec-bzImage.c
> @@ -212,6 +212,16 @@ int do_bzImage_load(struct kexec_info *i
>   	setup_size = kern16_size + command_line_len + PURGATORY_CMDLINE_SIZE;
>   	real_mode = xmalloc(setup_size);
>   	memcpy(real_mode, kernel, kern16_size);
> +	/*
> +	 * clear value before header
> +	 * not not clear value after header, --real-mode-entry
> +	 * need code after header.
> +	 */
> +	memset(real_mode, 0, 0x1f1);
> +	if (!real_mode_entry) {
> +		/* clear value after setup_header  */
> +		memset((unsigned char *)real_mode + 0x290, 0, kern16_size - 0x290);
> +	}
>

You really should move the memset() into the if() clause as well... 
doesn't matter at the moment, but that is the protocol.

The limit is 0x280, not 0x290, or -- better -- you can use the byte at 
0x201 to get the size.
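
A rough sketch of that suggestion, for illustration only: it assumes the
buffer is a copy of the 16-bit part of the bzImage and that real mode entry
is not in use. The function name and the bounds checks are hypothetical;
the offsets 0x1f1, 0x201 and 0x202 are the ones discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SETUP_HDR_START    0x1f1  /* first byte of setup_header */
#define SETUP_HDR_LEN_BYTE 0x201  /* byte that encodes the header end */
#define SETUP_HDR_END_BASE 0x202  /* header end = 0x202 + that byte */

/*
 * Zero everything in the copied 16-bit area except setup_header.
 * Returns 0 on success, -1 if the buffer is too small to hold the header.
 */
static int sketch_clear_outside_header(unsigned char *real_mode,
				       size_t kern16_size)
{
	size_t end;

	assert(real_mode != NULL);
	if (kern16_size < SETUP_HDR_END_BASE)
		return -1;

	end = SETUP_HDR_END_BASE + real_mode[SETUP_HDR_LEN_BYTE];
	if (end > kern16_size)
		return -1;

	/* clear values before the header */
	memset(real_mode, 0, SETUP_HDR_START);
	/* clear values after the header */
	memset(real_mode + end, 0, kern16_size - end);
	return 0;
}
```

Reading the end from the byte at 0x201 means the cleared range follows the
header as it grows, instead of relying on a fixed limit such as 0x280.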

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 20:50                             ` H. Peter Anvin
@ 2012-11-22 21:02                               ` H. Peter Anvin
  2012-11-22 22:13                                 ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-22 21:02 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On 11/22/2012 12:50 PM, H. Peter Anvin wrote:
> On 11/22/2012 12:18 PM, Yinghai Lu wrote:
>>>
>>> If we can get the sentinel hack to work that would probably be
>>> useful, but
>>> we need to understand the exact pathology.
>>
>> for kexec bzImage --real-mode-entry, code after setup_header will be
>> executed.
>>
>> so we could clear value before setup_header after copy 16bit section
>> from bzImage...
>>
>> Index: kexec-tools/kexec/arch/i386/kexec-bzImage.c
>> ===================================================================
>> --- kexec-tools.orig/kexec/arch/i386/kexec-bzImage.c
>> +++ kexec-tools/kexec/arch/i386/kexec-bzImage.c
>> @@ -212,6 +212,16 @@ int do_bzImage_load(struct kexec_info *i
>>       setup_size = kern16_size + command_line_len +
>> PURGATORY_CMDLINE_SIZE;
>>       real_mode = xmalloc(setup_size);
>>       memcpy(real_mode, kernel, kern16_size);
>> +    /*
>> +     * clear value before header
>> +     * not not clear value after header, --real-mode-entry
>> +     * need code after header.
>> +     */
>> +    memset(real_mode, 0, 0x1f1);
>> +    if (!real_mode_entry) {
>> +        /* clear value after setup_header  */
>> +        memset((unsigned char *)real_mode + 0x290, 0, kern16_size -
>> 0x290);
>> +    }
>>
>
> You really should move the memset() into the if() clause as well...
> doesn't matter at the moment, but that is the protocol.
>
> The limit is 0x280, not 0x290, or -- better -- you can use the byte at
> 0x201 to get the size.
>

Not doing so would be wrong, in fact.


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 21:02                               ` H. Peter Anvin
@ 2012-11-22 22:13                                 ` Yinghai Lu
  0 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-22 22:13 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Eric W. Biederman, linux-kernel,
	Rob Landley, Matt Fleming

On Thu, Nov 22, 2012 at 1:02 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/22/2012 12:50 PM, H. Peter Anvin wrote:
>>
>> The limit is 0x280, not 0x290, or -- better -- you can use the byte at
>> 0x201 to get the size.
>>
>
> Not doing so would be wrong, in fact.

+	if (!real_mode_entry) {
+		unsigned long end;
+		/* clear value before header */
+		memset(real_mode, 0, 0x1f1);
+		/* clear value after setup_header  */
+		end = *((unsigned char *)real_mode + 0x201);
+		end += 0x202;
+		memset((unsigned char *)real_mode + end, 0, kern16_size - end);
+	}

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200
  2012-11-22 11:27               ` Eric W. Biederman
@ 2012-11-24  7:00                 ` Yinghai Lu
  0 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-24  7:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel, Matt Fleming

On Thu, Nov 22, 2012 at 3:27 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
>
>> Quite certain something depends on it.
>
> It would not surprise me at all that there is a dependency, if we have
> not had a better way to report the 64bit entry point.  I just wanted to
> make the context clear as that was confused in the discussion.
>
> Note that having a 32bit entry point at offset 0 is as much of an ABI.
>
> I am surprised that there are legitimate reasons to bulk up the 32bit
> entry point code before the 0x200.  Everything that we are doing at that
> point is architectural.

arch/x86/boot/header.S has the bzImage 16-bit entry, and it is at 0x200.

arch/x86/boot/compressed/head_64.S has the bzImage 32-bit and 64-bit entries.
The loader first needs to find the size of the setup code, aka kern16_size.
The 32-bit/64-bit code follows from kern16_size on, and will be aligned
to kernel_align.
From there, offset 0 is the 32-bit entry and 0x200 is the 64-bit entry.
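
The layout above can be sketched roughly as follows. This assumes the
setup_sects byte at 0x1f1, with 0 meaning 4 as in the boot protocol, and
that load_addr is wherever the loader placed the 32-bit/64-bit code. The
sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SETUP_SECTS_OFFSET  0x1f1  /* setup_sects byte in the header */
#define DEFAULT_SETUP_SECTS 4      /* used when setup_sects reads as 0 */
#define STARTUP_64_OFFSET   0x200  /* 64-bit entry, relative to load address */

/*
 * Size of the 16-bit setup code: (setup_sects + 1) * 512 bytes.
 * Returns 0 if the image is too short to contain setup_sects.
 */
static size_t sketch_kern16_size(const unsigned char *bzimage, size_t len)
{
	unsigned int setup_sects;

	assert(bzimage != NULL);
	if (len <= SETUP_SECTS_OFFSET)
		return 0;

	setup_sects = bzimage[SETUP_SECTS_OFFSET];
	if (setup_sects == 0)
		setup_sects = DEFAULT_SETUP_SECTS;
	return ((size_t)setup_sects + 1) * 512;
}

/* 32-bit entry: offset 0 of the loaded protected-mode code. */
static uint64_t sketch_entry_32(uint64_t load_addr)
{
	return load_addr;
}

/* 64-bit entry: offset 0x200 of the loaded protected-mode code. */
static uint64_t sketch_entry_64(uint64_t load_addr)
{
	return load_addr + STARTUP_64_OFFSET;
}
```

With the header bytes from the dumps earlier in the thread (0x21 at 0x1f1),
this gives a kern16_size of 34 * 512 = 0x4400.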

Actually, kexec did not support booting a bzImage directly in 64-bit mode
before this patch set.

So are there any 64-bit boot loaders that load a bzImage and boot
directly from that second 0x200 64-bit entry?

If there is no such bootloader, we may change that position and
add a field in setup_header for it.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-22 18:37                     ` H. Peter Anvin
  2012-11-22 18:50                       ` Yinghai Lu
@ 2012-11-24 12:37                       ` Eric W. Biederman
  2012-11-24 17:32                         ` H. Peter Anvin
  1 sibling, 1 reply; 57+ messages in thread
From: Eric W. Biederman @ 2012-11-24 12:37 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 11/22/2012 10:28 AM, Yinghai Lu wrote:
>>
>> has problem with old kexec, it only copy header from bzImage include
>> setup_header as boot_param.
>>
>
> How old are we talking here? This is a clear and blatant bug, and it would
> affect a whole bunch of things, not just this.  In fact, one really has to
> wonder how it can work at all.
>
> One option I guess would be to have a sentinel field which, if it is not zero,
> causes the kernel to zero all of struct setup_info outside of
> setup_header... however, I have a nasty suspicion that this kexec botch might be
> initializing some fields and leaving others unmodified, which basically means
> "there is no hope for sanity and it is just working by pure accident."
>
> Eric, do you have any insight here?

I seem to be missing something.

With respect to boot parameters when we are booting a bzImage
/sbin/kexec initializes the boot parameters with all of the 16bit real
mode code.  aka (setup_sects + 1) * 512 bytes.

I remember adding that as soon as we started having to deal with
pre-initialized fields in boot_params.

I don't have a clue what you folks are referring to as a bug.  

Looking, I see this verbiage in boot.txt:

> For machine with some new BIOS other than legacy BIOS, such as EFI,
> LinuxBIOS, etc, and kexec, the 16-bit real mode setup code in kernel
> based on legacy BIOS can not be used, so a 32-bit boot protocol needs
> to be defined.
> 
> In 32-bit boot protocol, the first step in loading a Linux kernel
> should be to setup the boot parameters (struct boot_params,
> traditionally known as "zero page"). The memory for struct boot_params
> should be allocated and initialized to all zero. Then the setup header
> from offset 0x01f1 of kernel image on should be loaded into struct
> boot_params and examined. The end of setup header can be calculated as
> follow:
> 
> 	0x0202 + byte value at offset 0x0201
> 
> In addition to read/modify/write the setup header of the struct
> boot_params as that of 16-bit boot protocol, the boot loader should
> also fill the additional fields of the struct boot_params as that
> described in zero-page.txt.

Certainly /sbin/kexec isn't bothering to calculate the end of the setup
header; it is just being far more conservative and using all of the 16bit
real mode code as its initializer.
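
To make the documented procedure concrete, here is a minimal C sketch of
what that boot.txt paragraph asks a loader to do. It is not the kexec-tools
code; the function name, the raw byte-buffer view of struct boot_params, and
the 4096-byte size are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOT_PARAMS_SIZE 4096u   /* assumed size of the "zero page" */
#define SETUP_HDR_START  0x01f1u /* first byte of the setup header */

/*
 * Hypothetical helper: fill boot_params (bp) from a bzImage as boot.txt
 * describes. Returns 0 on success, -1 if the image is too short.
 */
int init_boot_params(uint8_t *bp, const uint8_t *image, size_t image_len)
{
	size_t hdr_end;

	/* Need at least the 2-byte jump at 0x0200..0x0201. */
	if (image_len < 0x0202u)
		return -1;

	/* End of setup header = 0x0202 + byte value at offset 0x0201. */
	hdr_end = 0x0202u + image[0x0201];
	if (hdr_end > image_len || hdr_end > BOOT_PARAMS_SIZE)
		return -1;

	/* Everything outside the setup header must read as zero. */
	memset(bp, 0, BOOT_PARAMS_SIZE);
	memcpy(bp + SETUP_HDR_START, image + SETUP_HDR_START,
	       hdr_end - SETUP_HDR_START);
	return 0;
}
```

Copying (setup_sects + 1) * 512 bytes instead puts whatever bytes the image
holds outside that range into fields the text above wants to be zero.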

Eric

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 12:37                       ` Eric W. Biederman
@ 2012-11-24 17:32                         ` H. Peter Anvin
       [not found]                           ` <CAE9FiQV0Q0fi7TrNjihdsUt0ueT4LLON4o+JEmX6ry9S6AU-ug@mail.gmail.com>
  2012-11-24 19:50                           ` H. Peter Anvin
  0 siblings, 2 replies; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-24 17:32 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On 11/24/2012 04:37 AM, Eric W. Biederman wrote:
>
> Certainly /sbin/kexec isn't bothering to calculate the end of the setup
> header and just being far more conservative and using all of the 16bit
> real mode code as it's initializer.
>

That's not conservative... that's just plain wrong.  It means you're 
initializing the fields in struct boot_params with garbage instead of a 
predictable value (zero).

We could work around it with a sentinel hack... except you *also* 
probably modify *some* fields and now we have a horrid mix of 
initialized and uninitialized fields to sort out... and there really 
isn't any sane way for the kernel to sort that out.

We have a huge problem on our hands now because of it.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
       [not found]                           ` <CAE9FiQV0Q0fi7TrNjihdsUt0ueT4LLON4o+JEmX6ry9S6AU-ug@mail.gmail.com>
@ 2012-11-24 18:24                             ` H. Peter Anvin
  0 siblings, 0 replies; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-24 18:24 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On 11/24/2012 10:12 AM, Yinghai Lu wrote:
>
> Now I have a fix ready, also found fix for kexec real mode path working
> with recently kernel by settin heap end ptr correctly.
>
> Please decide if we need to add 64 bit entry offset in setup header,
> Or just stick to 0x200.
>
> I check grub2 and gujin and qemu , looks like they are all using bzimage
> 16 bit entry.
>
> Do you have pointer for any boot loader that is using 64 bit entry in
> bzimage?
>

I'm fairly certain Grub2 does *not* use the 16-bit entry point by 
default even on BIOS platforms, needing the "linux16" directive to 
behave sanely (this is one of many complete facepalms in Grub2).

efilinux or elilo compiled for a 64-bit EFI platform would be a good 
example, but even if we can't find a 64-bit boot loader example I don't 
think we can rule one out, so let's just define 0x200 as an ABI constant 
and be done with it.  The cost is minimal and the consequences of 
changing it are potentially severe.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 17:32                         ` H. Peter Anvin
       [not found]                           ` <CAE9FiQV0Q0fi7TrNjihdsUt0ueT4LLON4o+JEmX6ry9S6AU-ug@mail.gmail.com>
@ 2012-11-24 19:50                           ` H. Peter Anvin
  2012-11-24 21:30                             ` Yinghai Lu
  2012-11-24 23:50                             ` Eric W. Biederman
  1 sibling, 2 replies; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-24 19:50 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On 11/24/2012 09:32 AM, H. Peter Anvin wrote:
> On 11/24/2012 04:37 AM, Eric W. Biederman wrote:
>>
>> Certainly /sbin/kexec isn't bothering to calculate the end of the setup
>> header and just being far more conservative and using all of the 16bit
>> real mode code as it's initializer.
>>
>
> That's not conservative... that's just plain wrong.  It means you're
> initializing the fields in struct boot_params with garbage instead of a
> predictable value (zero).
>
> We could work around it with a sentinel hack... except you *also*
> probably modify *some* fields and now we have a horrid mix of
> initialized and uninitialized fields to sort out... and there really
> isn't any sane way for the kernel to sort that out.
>
> We have a huge problem on our hands now because of it.
>

So, given the mess we now have on our hands... any suggestions how to 
best solve it?  There is the option of simply declaring old kexec 
binaries broken; they will then not work reliably with newer kernels, if 
they even work reliably now -- it is hard to know for certain.

Another option is the sentinel hack I mentioned... permanently reserve a 
field that if it is nonzero we will have the kernel erase the remainder 
of struct boot_params... except for *some fields* to be defined.  This 
is a total hack workaround and will not work if we have the same class 
of problems in another bootloader which initializes different fields, 
but, well, it might provide some value and might solve problems with 
other bootloaders which have similar enough misbehavior.
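
A rough C sketch of that sentinel idea, to show what "erase the remainder
except for some fields" could look like. Everything here is hypothetical:
the function name, the sentinel offset passed in by the caller, and the list
of ranges to keep are placeholders, since none of them have been defined yet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BP_SIZE 4096u /* assumed size of struct boot_params */

/* Hypothetical: a byte range of boot_params that survives sanitizing. */
struct keep_range {
	size_t start;
	size_t len;
};

/*
 * If the sentinel byte is nonzero, assume the loader copied raw image
 * bytes over boot_params and zero everything except the listed ranges.
 * Ranges must be sorted by start and must not overlap.
 * Returns 1 if the buffer was sanitized, 0 if it was left alone.
 */
int sanitize_if_sentinel(uint8_t *bp, size_t sentinel_off,
			 const struct keep_range *keep, size_t nkeep)
{
	size_t pos = 0;
	size_t i;

	if (bp[sentinel_off] == 0)
		return 0;

	for (i = 0; i < nkeep; i++) {
		/* Zero the gap before this kept range, then skip over it. */
		memset(bp + pos, 0, keep[i].start - pos);
		pos = keep[i].start + keep[i].len;
	}
	/* Zero the tail after the last kept range. */
	memset(bp + pos, 0, BP_SIZE - pos);
	return 1;
}
```

A caller would pass the range of the setup header plus whatever else ends
up on the keep list; the weak point remains that the list has to match what
each misbehaving bootloader actually initializes.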

The final idea would be to declare the current struct boot_params frozen 
indefinitely, and instead create a whole new set of data structures 
going forward, perhaps inserting them into the linked list.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 19:50                           ` H. Peter Anvin
@ 2012-11-24 21:30                             ` Yinghai Lu
  2012-11-24 21:38                               ` H. Peter Anvin
  2012-11-24 23:50                             ` Eric W. Biederman
  1 sibling, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-24 21:30 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On Sat, Nov 24, 2012 at 11:50 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/24/2012 09:32 AM, H. Peter Anvin wrote:
>>
>> On 11/24/2012 04:37 AM, Eric W. Biederman wrote:
>>>
>>>
>>> Certainly /sbin/kexec isn't bothering to calculate the end of the setup
>>> header and just being far more conservative and using all of the 16bit
>>> real mode code as it's initializer.
>>>
>>
>> That's not conservative... that's just plain wrong.  It means you're
>> initializing the fields in struct boot_params with garbage instead of a
>> predictable value (zero).
>>
>> We could work around it with a sentinel hack... except you *also*
>> probably modify *some* fields and now we have a horrid mix of
>> initialized and uninitialized fields to sort out... and there really
>> isn't any sane way for the kernel to sort that out.
>>
>> We have a huge problem on our hands now because of it.
>>
>
> So, given the mess we now have on our hands... any suggestions how to best
> solve it?  There is the option of simply declaring old kexec binaries
> broken; they will then not work reliably with newer kernels, if they even
> work reliably now -- it is hard to know for certain.

Yes, if the user updates the kernel to be kexeced, then it would be
reasonable to ask them to update kexec-tools as well.

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 21:30                             ` Yinghai Lu
@ 2012-11-24 21:38                               ` H. Peter Anvin
  2012-11-24 22:18                                 ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-24 21:38 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On 11/24/2012 01:30 PM, Yinghai Lu wrote:
>>
>> So, given the mess we now have on our hands... any suggestions how to best
>> solve it?  There is the option of simply declaring old kexec binaries
>> broken; they will then not work reliably with newer kernels, if they even
>> work reliably now -- it is hard to know for certain.
>
> yes, if the user updates kernel to be kexeced, then would be
> reasonable to ask them to
> update kexec-tools.
>

Careful... consider the people who use a kexec-based solution as 
bootloaders.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 21:38                               ` H. Peter Anvin
@ 2012-11-24 22:18                                 ` Yinghai Lu
  2012-11-24 22:32                                   ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-24 22:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On Sat, Nov 24, 2012 at 1:38 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/24/2012 01:30 PM, Yinghai Lu wrote:
>>>
>>>
>>> So, given the mess we now have on our hands... any suggestions how to
>>> best
>>> solve it?  There is the option of simply declaring old kexec binaries
>>> broken; they will then not work reliably with newer kernels, if they even
>>> work reliably now -- it is hard to know for certain.
>>
>>
>> yes, if the user updates kernel to be kexeced, then would be
>> reasonable to ask them to
>> update kexec-tools.
>>
>
> Careful... consider the people who use a kexec-based solution as
> bootloaders.

Yes, those may not update the kexec in flash...

Then we may need to use another bit in xloadflags to tell the new kernel
whether it needs to check ext_...

Field name:     xloadflags
Type:           modify (obligatory)
Offset/size:    0x236/2
Protocol:       2.12+

  This field is a bitmask.

  Bit 0 (read): CAN_BE_LOADED_ABOVE_4G
        - If 1, kernel/boot_params/cmdline/ramdisk can be above 4g,
                set by kernel.

  Bit 1 (write): LOADED_ABOVE_4G
        - If 1, kernel/boot_params/cmdline/ramdisk is loaded above 4g,
                set by bootloader, and kernel will check ext_ramdisk_image,
                ext_ramdisk_size and ext_cmd_line_ptr.
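
On the bootloader side, the two bits could be used roughly as in this C
sketch. The struct is a trimmed, hypothetical view rather than the real
struct boot_params layout, and the XLF_ names and set_ramdisk_addr() are
made up for the example.

```c
#include <stdint.h>

#define XLF_CAN_BE_LOADED_ABOVE_4G (1u << 0) /* bit 0, set by kernel */
#define XLF_LOADED_ABOVE_4G        (1u << 1) /* bit 1, set by bootloader */

/* Hypothetical, trimmed view of the fields the loader touches. */
struct loader_view {
	uint16_t xloadflags;
	uint32_t ramdisk_image;     /* low 32 bits */
	uint32_t ext_ramdisk_image; /* high 32 bits */
};

/*
 * Record where the ramdisk was placed. Returns 0 on success, -1 if the
 * address is above 4G but the kernel did not advertise support for it.
 * Assumes ext_ramdisk_image was zeroed by the loader beforehand.
 */
int set_ramdisk_addr(struct loader_view *v, uint64_t addr)
{
	if ((addr >> 32) != 0) {
		if (!(v->xloadflags & XLF_CAN_BE_LOADED_ABOVE_4G))
			return -1;
		v->ext_ramdisk_image = (uint32_t)(addr >> 32);
		v->xloadflags |= XLF_LOADED_ABOVE_4G;
	}
	v->ramdisk_image = (uint32_t)addr;
	return 0;
}
```

An old loader never sets bit 1, so the kernel would ignore the ext_ fields
no matter what bytes happen to be in them.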

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 22:18                                 ` Yinghai Lu
@ 2012-11-24 22:32                                   ` H. Peter Anvin
  2012-11-24 23:24                                     ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-24 22:32 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On 11/24/2012 02:18 PM, Yinghai Lu wrote:
>>
>> Careful... consider the people who use a kexec-based solution as
>> bootloaders.
>
> yes, those may not update kexec in the flash...
>
> then, may need to use another bit in xloadflags to tell new kernel if
> need to check ext_...
>
> Field name:     xloadflags
> Type:           modify (obligatory)
> Offset/size:    0x236/2
> Protocol:       2.12+
>
>    This field is a bitmask.
>
>    Bit 0 (read): CAN_BE_LOADED_ABOVE_4G
>          - If 1, kernel/boot_params/cmdline/ramdisk can be above 4g,
>                  set by kernel.
>
>    Bit 1 (write): LOADED_ABOVE_4G
>          - If 1, kernel/boot_params/cmdline/ramdisk is loaded above 4g,
>                  set by bootloader, and kernel will check ext_ramdisk_image,
>                  ext_ramdisk_size and ext_cmd_line_ptr.
>

Well, that solves the problem for *this specific instance* but I fear 
therein lies madness in the general case.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 22:32                                   ` H. Peter Anvin
@ 2012-11-24 23:24                                     ` Yinghai Lu
  0 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-24 23:24 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On Sat, Nov 24, 2012 at 2:32 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/24/2012 02:18 PM, Yinghai Lu wrote:

> Well, that solves the problem for *this specific instance* but I fear
> therein lies madness in the general case.
>

Then use:

   Bit 0 (read): CAN_BE_LOADED_ABOVE_4G
         - If 1, kernel/boot_params/cmdline/ramdisk can be above 4g,
                 set by kernel.

   Bit 1 (write): USE_EXT_BOOT_PARAMS
         - If 1, set by bootloader, and kernel could safely check the
                 new fields in boot_params that are added from 2.12.

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 19:50                           ` H. Peter Anvin
  2012-11-24 21:30                             ` Yinghai Lu
@ 2012-11-24 23:50                             ` Eric W. Biederman
  2012-11-25  0:04                               ` H. Peter Anvin
  2012-11-25  0:04                               ` Yinghai Lu
  1 sibling, 2 replies; 57+ messages in thread
From: Eric W. Biederman @ 2012-11-24 23:50 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 11/24/2012 09:32 AM, H. Peter Anvin wrote:
>> On 11/24/2012 04:37 AM, Eric W. Biederman wrote:
>>>
>>> Certainly /sbin/kexec isn't bothering to calculate the end of the setup
>>> header and just being far more conservative and using all of the 16bit
>>> real mode code as it's initializer.
>>>
>>
>> That's not conservative... that's just plain wrong.  It means you're
>> initializing the fields in struct boot_params with garbage instead of a
>> predictable value (zero).

It was conservative at the time the code was introduced and it most
definitely is not wrong.  The code predates the verbiage in boot.txt.
Apparently no one bothered to see what /sbin/kexec was actually doing
when they documented the 32bit boot loader interface.  I was under the
impression that it was actual practice that was documented, but in this
particular case something else was documented instead.  Since /sbin/kexec
did not need any of the more recent features we simply have not noticed
it until now.

>> We could work around it with a sentinel hack... except you *also*
>> probably modify *some* fields and now we have a horrid mix of
>> initialized and uninitialized fields to sort out... and there really
>> isn't any sane way for the kernel to sort that out.
>>
>> We have a huge problem on our hands now because of it.
>>
>
> So, given the mess we now have on our hands... any suggestions how to best solve
> it?  There is the option of simply declaring old kexec binaries broken; they
> will then not work reliably with newer kernels, if they even work reliably now
> -- it is hard to know for certain.

I believe all added variables between the last version of the boot
protocol /sbin/kexec knows about and the current time were added in the
initialized data section.  Certainly we can check and that will tell us
how likely changes in arch/x86/boot/ have been regressions in the 32bit
entry point support.

As for solving this there is a simple solution.  Add a second jump
right after the first jump.   The variables after the second jump can
all be zero initialized.

And if we really care about breaking other boot loaders we can take a
survey and actually look and see what they do.  There really aren't that
many x86 boot loaders.

Eric

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 23:50                             ` Eric W. Biederman
@ 2012-11-25  0:04                               ` H. Peter Anvin
  2012-11-25  0:11                                 ` Yinghai Lu
  2012-11-25  0:04                               ` Yinghai Lu
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-25  0:04 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On 11/24/2012 03:50 PM, Eric W. Biederman wrote:
>
> It was conservative at the time the code was introduced and it most
> definitely is not wrong.  The code predates the verbage in boot.txt.
> Apparently no one bothered to see what /sbin/kexec was actually doing
> when they documented the 32bit boot loader interface.  I was under the
> impression that it was actual practice that was documented but in this
> particular something else was documented instead.  Since /sbin/kexec did
> not need any of the more recent features we simply have not noticed it
> until now.
>

The problem is that kexec and others didn't follow any protocol at all, 
but rather did something that happened to work... but could trivially be 
shown to have no way of being forward compatible.

>>> We could work around it with a sentinel hack... except you *also*
>>> probably modify *some* fields and now we have a horrid mix of
>>> initialized and uninitialized fields to sort out... and there really
>>> isn't any sane way for the kernel to sort that out.
>>>
>>> We have a huge problem on our hands now because of it.
>>
>> So, given the mess we now have on our hands... any suggestions how to best solve
>> it?  There is the option of simply declaring old kexec binaries broken; they
>> will then not work reliably with newer kernels, if they even work reliably now
>> -- it is hard to know for certain.
>
> I believe all added variables between the last version of the boot
> protocol /sbin/kexec knows about and the current time were added in the
> initialized data section.  Certainly we can check and that will tell us
> how likely changes in arch/x86/boot/ have been regressions in the 32bit
> entry point support.
>
> As for solving this there is a simple solution.  Add a second jump
> right after the first jump.   The variables after the second jump can
> all be zero initialized.

It doesn't work for the variables *before* the initialized section, and 
that is actually where we have most problems... there really are only 
very few bytes left after the initialized section.  The reason we can't 
do anything about the area before it is because that has to have stuff 
in it, like the EFI header, to work.

> And if we really care about breaking other boot loaders we can take a
> survey and actually look and see what they do.  There really aren't that
> many x86 boot loaders.

There are more than you think... a lot of them are hiding in grotty 
corners.  However, they are minority users.

It sounds like we are leaning toward some form of the sentinel hack, 
which means we need an enumerated list of things that should *not* be 
zeroed if the sentinel is present.

The option of declaring the list frozen makes me a bit nervous, because 
it isn't clear that we don't already have fields that will be 
misinterpreted by the kernel if filled in from the file.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-24 23:50                             ` Eric W. Biederman
  2012-11-25  0:04                               ` H. Peter Anvin
@ 2012-11-25  0:04                               ` Yinghai Lu
  2012-11-25  0:06                                 ` H. Peter Anvin
  1 sibling, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-25  0:04 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On Sat, Nov 24, 2012 at 3:50 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> I believe all added variables between the last version of the boot
> protocol /sbin/kexec knows about and the current time were added in the
> initialized data section.  Certainly we can check and that will tell us
> how likely changes in arch/x86/boot/ have been regressions in the 32bit
> entry point support.
>
> As for solving this there is a simple solution.  Add a second jump
> right after the first jump.   The variables after the second jump can
> all be zero initialized.

We could use .org to force start_of_setup to start from 0x1000.

But how about the area before setup_header? Now it is full of EFI_STUB
stuff there.

Yinghai

* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-25  0:04                               ` Yinghai Lu
@ 2012-11-25  0:06                                 ` H. Peter Anvin
  0 siblings, 0 replies; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-25  0:06 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On 11/24/2012 04:04 PM, Yinghai Lu wrote:
> On Sat, Nov 24, 2012 at 3:50 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>>
>> I believe all added variables between the last version of the boot
>> protocol /sbin/kexec knows about and the current time were added in the
>> initialized data section.  Certainly we can check and that will tell us
>> how likely changes in arch/x86/boot/ have been regressions in the 32bit
>> entry point support.
>>
>> As for solving this there is a simple solution.  Add a second jump
>> right after the first jump.   The variables after the second jump can
>> all be zero initialized.
>
> could use .org to force start_of_setup start from 0x1000
>
> but how about area before setup_header ? how it is full of EFI_STUB suff there.
>

Yes, it doesn't really solve the problem I fear.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-25  0:04                               ` H. Peter Anvin
@ 2012-11-25  0:11                                 ` Yinghai Lu
  2012-11-25  5:50                                   ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-25  0:11 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

[-- Attachment #1: Type: text/plain, Size: 504 bytes --]

On Sat, Nov 24, 2012 at 4:04 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> It sounds like we are leaning toward some form of the sentinel hack, which
> means we need an enumerated list of things that should *not* be zeroed if
> the sentinel is present.
>
> The option of declaring the list frozen makes me a bit nervous, because it
> isn't clear that we don't already have fields that will be misinterpreted by
> the kernel if filled in from the file.

USE_EXT_BOOT_PARAMS bit in xloadflags should work.

[-- Attachment #2: ext_ramdisk_image.patch --]
[-- Type: application/octet-stream, Size: 7641 bytes --]

Subject: [PATCH] x86, boot: add fields to support load bzImage and ramdisk above 4G

ext_ramdisk_image/size will record high 32bits for ramdisk info.

xloadflags bit0 will be set if relocatable with 64bit.

Let get_ramdisk_image/size use ext_ramdisk_image/size to get the
right position for the ramdisk.

The bootloader will fill in ext_ramdisk_image/size when it loads the
ramdisk above 4G.

The bootloader will also check whether xloadflags bit0 is set to decide
if it could load the ramdisk above 4G.

Update header version to 2.12.

-v2: add ext_cmd_line_ptr for above 4G support.
-v3: update to xloadflags from HPA.
-v4: use fields from bootparam instead of setup_header, according to HPA.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Rob Landley <rob@landley.net>
Cc: Matt Fleming <matt.fleming@intel.com>

---
 Documentation/x86/boot.txt         |   19 ++++++++++++++++++-
 Documentation/x86/zero-page.txt    |    3 +++
 arch/x86/boot/compressed/cmdline.c |    3 +++
 arch/x86/boot/header.S             |   12 ++++++++++--
 arch/x86/include/asm/bootparam.h   |   10 ++++++++--
 arch/x86/kernel/head64.c           |    3 +++
 arch/x86/kernel/setup.c            |    6 ++++++
 7 files changed, 51 insertions(+), 5 deletions(-)

Index: linux-2.6/Documentation/x86/boot.txt
===================================================================
--- linux-2.6.orig/Documentation/x86/boot.txt
+++ linux-2.6/Documentation/x86/boot.txt
@@ -57,6 +57,9 @@ Protocol 2.10:	(Kernel 2.6.31) Added a p
 Protocol 2.11:	(Kernel 3.6) Added a field for offset of EFI handover
 		protocol entry point.
 
+Protocol 2.12:	(Kernel 3.9) Added three fields for loading bzImage and
+		 ramdisk above 4G with 64bit in bootparam.
+
 **** MEMORY LAYOUT
 
 The traditional memory map for the kernel loader, used for Image or
@@ -182,7 +185,7 @@ Offset	Proto	Name		Meaning
 0230/4	2.05+	kernel_alignment Physical addr alignment required for kernel
 0234/1	2.05+	relocatable_kernel Whether kernel is relocatable or not
 0235/1	2.10+	min_alignment	Minimum alignment, as a power of two
-0236/2	N/A	pad3		Unused
+0236/2	2.12+	xloadflags	Boot protocol option flags
 0238/4	2.06+	cmdline_size	Maximum size of the kernel command line
 023C/4	2.07+	hardware_subarch Hardware subarchitecture
 0240/8	2.07+	hardware_subarch_data Subarchitecture-specific data
@@ -581,6 +584,20 @@ Protocol:	2.10+
   misaligned kernel.  Therefore, a loader should typically try each
   power-of-two alignment from kernel_alignment down to this alignment.
 
+Field name:     xloadflags
+Type:           modify (obligatory)
+Offset/size:    0x236/2
+Protocol:       2.12+
+
+  This field is a bitmask.
+
+  Bit 0 (read): CAN_BE_LOADED_ABOVE_4G
+        - If 1, kernel/boot_params/cmdline/ramdisk can be above 4g,
+
+  Bit 15 (write): USE_EXT_BOOT_PARAMS
+	- If 1, set by bootloader, and kernel could check new fields
+		in boot_params that are added from 2.12 safely.
+
 Field name:	cmdline_size
 Type:		read
 Offset/size:	0x238/4
Index: linux-2.6/arch/x86/boot/header.S
===================================================================
--- linux-2.6.orig/arch/x86/boot/header.S
+++ linux-2.6/arch/x86/boot/header.S
@@ -279,7 +279,7 @@ _start:
 	# Part 2 of the header, from the old setup.S
 
 		.ascii	"HdrS"		# header signature
-		.word	0x020b		# header version number (>= 0x0105)
+		.word	0x020c		# header version number (>= 0x0105)
 					# or else old loadlin-1.5 will fail)
 		.globl realmode_swtch
 realmode_swtch:	.word	0, 0		# default_switch, SETUPSEG
@@ -369,7 +369,15 @@ relocatable_kernel:    .byte 1
 relocatable_kernel:    .byte 0
 #endif
 min_alignment:		.byte MIN_KERNEL_ALIGN_LG2	# minimum alignment
-pad3:			.word 0
+
+xloadflags:
+CAN_BE_LOADED_ABOVE_4G	= 1		# If set, the kernel/boot_param/
+					# ramdisk could be loaded above 4g
+#if defined(CONFIG_X86_64) && defined(CONFIG_RELOCATABLE)
+			.word CAN_BE_LOADED_ABOVE_4G
+#else
+			.word 0
+#endif
 
 cmdline_size:   .long   COMMAND_LINE_SIZE-1     #length of the command line,
                                                 #added with boot protocol
Index: linux-2.6/arch/x86/include/asm/bootparam.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/bootparam.h
+++ linux-2.6/arch/x86/include/asm/bootparam.h
@@ -57,7 +57,10 @@ struct setup_header {
 	__u32	initrd_addr_max;
 	__u32	kernel_alignment;
 	__u8	relocatable_kernel;
-	__u8	_pad2[3];
+	__u8	min_alignment;
+	__u16	xloadflags;
+#define CAN_BE_LOADED_ABOVE_4G	(1<<0)
+#define USE_EXT_BOOT_PARAMS		(1<<15)
 	__u32	cmdline_size;
 	__u32	hardware_subarch;
 	__u64	hardware_subarch_data;
@@ -105,7 +108,10 @@ struct boot_params {
 	__u8  hd1_info[16];	/* obsolete! */		/* 0x090 */
 	struct sys_desc_table sys_desc_table;		/* 0x0a0 */
 	struct olpc_ofw_header olpc_ofw_header;		/* 0x0b0 */
-	__u8  _pad4[128];				/* 0x0c0 */
+	__u32 ext_ramdisk_image;			/* 0x0c0 */
+	__u32 ext_ramdisk_size;				/* 0x0c4 */
+	__u32 ext_cmd_line_ptr;				/* 0x0c8 */
+	__u8  _pad4[116];				/* 0x0cc */
 	struct edid_info edid_info;			/* 0x140 */
 	struct efi_info efi_info;			/* 0x1c0 */
 	__u32 alt_mem_k;				/* 0x1e0 */
Index: linux-2.6/arch/x86/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup.c
+++ linux-2.6/arch/x86/kernel/setup.c
@@ -302,12 +302,18 @@ static u64 __init get_ramdisk_image(void
 {
 	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 
+	if (boot_params.hdr.xloadflags & USE_EXT_BOOT_PARAMS)
+		ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
+
 	return ramdisk_image;
 }
 static u64 __init get_ramdisk_size(void)
 {
 	u64 ramdisk_size = boot_params.hdr.ramdisk_size;
 
+	if (boot_params.hdr.xloadflags & USE_EXT_BOOT_PARAMS)
+		ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
+
 	return ramdisk_size;
 }
 
Index: linux-2.6/arch/x86/boot/compressed/cmdline.c
===================================================================
--- linux-2.6.orig/arch/x86/boot/compressed/cmdline.c
+++ linux-2.6/arch/x86/boot/compressed/cmdline.c
@@ -17,6 +17,9 @@ static unsigned long get_cmd_line_ptr(vo
 {
 	unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
 
+	if (real_mode->hdr.xloadflags & USE_EXT_BOOT_PARAMS)
+		cmd_line_ptr |= (u64)real_mode->ext_cmd_line_ptr << 32;
+
 	return cmd_line_ptr;
 }
 int cmdline_find_option(const char *option, char *buffer, int bufsize)
Index: linux-2.6/arch/x86/kernel/head64.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/head64.c
+++ linux-2.6/arch/x86/kernel/head64.c
@@ -45,6 +45,9 @@ static unsigned long get_cmd_line_ptr(vo
 {
 	unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
 
+	if (boot_params.hdr.xloadflags & USE_EXT_BOOT_PARAMS)
+		cmd_line_ptr |= (u64)boot_params.ext_cmd_line_ptr << 32;
+
 	return cmd_line_ptr;
 }
 
Index: linux-2.6/Documentation/x86/zero-page.txt
===================================================================
--- linux-2.6.orig/Documentation/x86/zero-page.txt
+++ linux-2.6/Documentation/x86/zero-page.txt
@@ -19,6 +19,9 @@ Offset	Proto	Name		Meaning
 090/010	ALL	hd1_info	hd1 disk parameter, OBSOLETE!!
 0A0/010	ALL	sys_desc_table	System description table (struct sys_desc_table)
 0B0/010	ALL	olpc_ofw_header	OLPC's OpenFirmware CIF and friends
+0C0/004 ALL	ext_ramdisk_image ramdisk_image high 32bits
+0C4/004 ALL	ext_ramdisk_size  ramdisk_size high 32bits
+0C8/004 ALL	ext_cmd_line_ptr  cmd_line_ptr high 32bits
 140/080	ALL	edid_info	Video mode setup (struct edid_info)
 1C0/020	ALL	efi_info	EFI 32 information (struct efi_info)
 1E0/004	ALL	alk_mem_k	Alternative mem check, in KB
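
The three new zero-page entries above sit directly after olpc_ofw_header
(0B0 + 010 = 0C0) and before edid_info at 140. As a rough sketch only, the
layout can be checked at compile time. The struct name and the padding
members below are hypothetical stand-ins and are not the kernel's real
struct boot_params; only the three ext_* names and their offsets are taken
from the table.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the part of the zero page touched by the
 * patch. Everything before 0x0c0 and the gap up to edid_info (0x140)
 * is collapsed into opaque byte arrays.
 */
struct zero_page_sketch {
	uint8_t  before[0x0c0];       /* fields at 0x000..0x0bf, elided */
	uint32_t ext_ramdisk_image;   /* 0x0c0: ramdisk_image high 32 bits */
	uint32_t ext_ramdisk_size;    /* 0x0c4: ramdisk_size high 32 bits */
	uint32_t ext_cmd_line_ptr;    /* 0x0c8: cmd_line_ptr high 32 bits */
	uint8_t  gap[0x140 - 0x0cc];  /* unused space up to edid_info */
};

/* Fail the build if the sketch drifts from the documented offsets. */
_Static_assert(offsetof(struct zero_page_sketch, ext_ramdisk_image) == 0x0c0,
	       "ext_ramdisk_image must be at 0x0c0");
_Static_assert(offsetof(struct zero_page_sketch, ext_ramdisk_size) == 0x0c4,
	       "ext_ramdisk_size must be at 0x0c4");
_Static_assert(offsetof(struct zero_page_sketch, ext_cmd_line_ptr) == 0x0c8,
	       "ext_cmd_line_ptr must be at 0x0c8");
_Static_assert(sizeof(struct zero_page_sketch) == 0x140,
	       "edid_info would start at 0x140");
```

The three 4-byte fields use 0C0..0CB, so they fit in the gap without moving
any existing entry.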


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-25  0:11                                 ` Yinghai Lu
@ 2012-11-25  5:50                                   ` Yinghai Lu
  2012-11-25  5:52                                     ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Yinghai Lu @ 2012-11-25  5:50 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On Sat, Nov 24, 2012 at 4:11 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Sat, Nov 24, 2012 at 4:04 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> It sounds like we are leaning toward some form of the sentinel hack, which
>> means we need an enumerated list of things that should *not* be zeroed if
>> the sentinel is present.
>>
>> The option of declaring the list frozen makes me a bit nervous, because it
>> isn't clear that we don't already have fields that will be misinterpreted by
>> the kernel if filled in from the file.
>
> USE_EXT_BOOT_PARAMS bit in xloadflags should work.

A new kexec will clear the area around the setup header and set that bit
when it does not use the real_mode entry.

For the 32bit and 64bit entries:
an old kernel has no idea of this bit and still uses the old ramdisk_image
and cmd_line_ptr in the setup header.
A new kernel will check that bit before it uses ext_ramdisk_image and
ext_cmd_line_ptr.

An old kexec with a new kernel is safe too: because that bit is not set, the
new kernel will not use the ext_... fields.

Later, every new kernel needs to check the USE_EXT_BOOT_PARAMS bit for all
newly added fields in boot_params.
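
The compatibility cases above could be sketched roughly as follows. This is
only an illustration, not the patch itself: the struct, the function names
and the bit value chosen for the flag are hypothetical, and only the field
names ramdisk_image, ext_ramdisk_image and xloadflags come from the thread.

```c
#include <stdint.h>

/* Hypothetical bit value; the thread does not say which xloadflags bit is used. */
#define USE_EXT_BOOT_PARAMS_SKETCH (1u << 1)

/* Simplified stand-in for the few boot_params fields discussed here. */
struct bp_sketch {
	uint16_t xloadflags;
	uint32_t ramdisk_image;      /* low 32 bits, in the setup header */
	uint32_t ext_ramdisk_image;  /* high 32 bits, newly added field */
};

/* Loader side (new kexec): split the address and set the flag. */
static void loader_set_ramdisk(struct bp_sketch *bp, uint64_t addr)
{
	bp->ramdisk_image = (uint32_t)addr;
	bp->ext_ramdisk_image = (uint32_t)(addr >> 32);
	bp->xloadflags |= USE_EXT_BOOT_PARAMS_SKETCH;
}

/*
 * Kernel side (new kernel): trust the high half only when the flag is
 * set, so stale bytes left by an old loader are never interpreted.
 */
static uint64_t kernel_get_ramdisk(const struct bp_sketch *bp)
{
	uint64_t addr = bp->ramdisk_image;

	if (bp->xloadflags & USE_EXT_BOOT_PARAMS_SKETCH)
		addr |= (uint64_t)bp->ext_ramdisk_image << 32;

	return addr;
}
```

With the flag clear, the old kexec case, kernel_get_ramdisk() returns only
the low 32 bits whatever ext_ramdisk_image happens to contain. An old
kernel never reads the new field at all.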


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-25  5:50                                   ` Yinghai Lu
@ 2012-11-25  5:52                                     ` H. Peter Anvin
  2012-11-25  6:09                                       ` Yinghai Lu
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2012-11-25  5:52 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

But it doesn't solve the bigger problem, and it is just begging to be gotten wrong.

Yinghai Lu <yinghai@kernel.org> wrote:

>On Sat, Nov 24, 2012 at 4:11 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> On Sat, Nov 24, 2012 at 4:04 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>>
>>> It sounds like we are leaning toward some form of the sentinel hack, which
>>> means we need an enumerated list of things that should *not* be zeroed if
>>> the sentinel is present.
>>>
>>> The option of declaring the list frozen makes me a bit nervous, because it
>>> isn't clear that we don't already have fields that will be misinterpreted by
>>> the kernel if filled in from the file.
>>
>> USE_EXT_BOOT_PARAMS bit in xloadflags should work.
>
>A new kexec will clear the area around the setup header and set that bit
>when it does not use the real_mode entry.
>
>For the 32bit and 64bit entries:
>an old kernel has no idea of this bit and still uses the old ramdisk_image
>and cmd_line_ptr in the setup header.
>A new kernel will check that bit before it uses ext_ramdisk_image and
>ext_cmd_line_ptr.
>
>An old kexec with a new kernel is safe too: because that bit is not set, the
>new kernel will not use the ext_... fields.
>
>Later, every new kernel needs to check the USE_EXT_BOOT_PARAMS bit for all
>newly added fields in boot_params.

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


* Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high
  2012-11-25  5:52                                     ` H. Peter Anvin
@ 2012-11-25  6:09                                       ` Yinghai Lu
  0 siblings, 0 replies; 57+ messages in thread
From: Yinghai Lu @ 2012-11-25  6:09 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Rob Landley, Matt Fleming

On Sat, Nov 24, 2012 at 9:52 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> But it doesn't solve the bigger problem, and it is just begging to be gotten wrong.
>>
>>Later, every new kernel needs to check the USE_EXT_BOOT_PARAMS bit for all
>>newly added fields in boot_params.

Do you mean that someone might later forget to check USE_EXT_BOOT_PARAMS
when accessing newly added fields in boot_params?


Thread overview: 57+ messages
2012-11-21  7:15 [PATCH v3 00/12] x86, boot, 64bit: Add support for loading ramdisk and bzImage high Yinghai Lu
2012-11-21  7:15 ` [PATCH v3 01/12] x86, boot: move verify_cpu.S after 0x200 Yinghai Lu
2012-11-21 17:23   ` H. Peter Anvin
2012-11-21 19:45     ` Yinghai Lu
2012-11-21 19:50       ` H. Peter Anvin
2012-11-21 20:15         ` Yinghai Lu
2012-11-22  5:48           ` Eric W. Biederman
     [not found]             ` <3178cb29-0e9e-44d2-b21f-45c53f38980a@email.android.com>
2012-11-22 11:27               ` Eric W. Biederman
2012-11-24  7:00                 ` Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 02/12] x86, boot: Move lldt/ltr out of 64bit code section Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 03/12] x86, 64bit: set extra ident page table for whole kernel range Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 04/12] x86, 64bit: add support for loading kernel above 512G Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 05/12] x86: Merge early_reserve_initrd for 32bit and 64bit Yinghai Lu
2012-11-21  7:40   ` Pekka Enberg
2012-11-21  7:16 ` [PATCH v3 06/12] x86: add get_ramdisk_image/size Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 07/12] x86, boot: add get_cmd_line_ptr() Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 08/12] x86, boot: Don't check if cmd_line_ptr is accessible in misc/decompressor() Yinghai Lu
2012-11-21 17:21   ` H. Peter Anvin
2012-11-21 19:18     ` Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 09/12] x86, boot: update cmd_line_ptr to unsigned long Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 10/12] x86: use io_remap to access real_mode_data Yinghai Lu
2012-11-21  7:16 ` [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high Yinghai Lu
2012-11-21 17:17   ` H. Peter Anvin
2012-11-21 18:59     ` Yinghai Lu
2012-11-21 19:18       ` H. Peter Anvin
2012-11-22  5:56         ` Yinghai Lu
     [not found]           ` <a1ca794a-09d4-4d36-8c8c-67100cb3696e@email.android.com>
2012-11-22  6:47             ` Yinghai Lu
2012-11-22  6:58               ` Yinghai Lu
2012-11-22 15:59                 ` H. Peter Anvin
2012-11-22 18:28                   ` Yinghai Lu
2012-11-22 18:37                     ` H. Peter Anvin
2012-11-22 18:50                       ` Yinghai Lu
2012-11-22 18:51                         ` H. Peter Anvin
2012-11-22 20:18                           ` Yinghai Lu
2012-11-22 20:20                             ` H. Peter Anvin
2012-11-22 20:29                               ` Yinghai Lu
2012-11-22 20:50                             ` H. Peter Anvin
2012-11-22 21:02                               ` H. Peter Anvin
2012-11-22 22:13                                 ` Yinghai Lu
2012-11-24 12:37                       ` Eric W. Biederman
2012-11-24 17:32                         ` H. Peter Anvin
     [not found]                           ` <CAE9FiQV0Q0fi7TrNjihdsUt0ueT4LLON4o+JEmX6ry9S6AU-ug@mail.gmail.com>
2012-11-24 18:24                             ` H. Peter Anvin
2012-11-24 19:50                           ` H. Peter Anvin
2012-11-24 21:30                             ` Yinghai Lu
2012-11-24 21:38                               ` H. Peter Anvin
2012-11-24 22:18                                 ` Yinghai Lu
2012-11-24 22:32                                   ` H. Peter Anvin
2012-11-24 23:24                                     ` Yinghai Lu
2012-11-24 23:50                             ` Eric W. Biederman
2012-11-25  0:04                               ` H. Peter Anvin
2012-11-25  0:11                                 ` Yinghai Lu
2012-11-25  5:50                                   ` Yinghai Lu
2012-11-25  5:52                                     ` H. Peter Anvin
2012-11-25  6:09                                       ` Yinghai Lu
2012-11-25  0:04                               ` Yinghai Lu
2012-11-25  0:06                                 ` H. Peter Anvin
2012-11-21  7:16 ` [PATCH v3 12/12] x86: remove 1024g limitation for kexec buffer on 64bit Yinghai Lu
