* [PATCH v3 0/4] kexec: put bzImage and ramdisk above 4G for x86 64bit
@ 2012-11-21  7:31 Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 1/4] kexec, x86: add boot header member for version 2.12 Yinghai Lu
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:31 UTC (permalink / raw)
  To: Simon Horman, H. Peter Anvin, Vivek Goyal, Haren Myneni,
	Eric W. Biederman
  Cc: Yinghai Lu, kexec

Currently the kdump reserved region has to stay under 896M because of a
limitation in kexec, and the bzImage also has to stay under 4G.

The kernel-side changes can be found at:
        git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-boot

The patches here are for kexec-tools, to load the bzImage and ramdisk high
according to the newly added boot header fields.
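
As a rough illustration of how the new ext_* header fields carry an address
above 4G, here is a minimal sketch. The struct and helper names below are
hypothetical stand-ins (the real fields live in x86_linux_param_header); the
only point is the split into a low and a high 32-bit half.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one low/high field pair of the boot header. */
struct boot_fields {
	uint32_t ramdisk_image;      /* low 32 bits of the ramdisk address */
	uint32_t ext_ramdisk_image;  /* high 32 bits, protocol 2.12 and later */
};

/* Store a 64-bit physical address as a low/high pair of 32-bit fields. */
static void set_ramdisk_addr(struct boot_fields *f, uint64_t addr)
{
	assert(f);
	f->ramdisk_image     = (uint32_t)(addr & 0xffffffffULL);
	f->ext_ramdisk_image = (uint32_t)(addr >> 32);
}

/* Recombine the pair the way a consumer of the fields would. */
static uint64_t get_ramdisk_addr(const struct boot_fields *f)
{
	assert(f);
	return ((uint64_t)f->ext_ramdisk_image << 32) | f->ramdisk_image;
}
```

The size and command line pointer fields would be handled the same way under
this assumption.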

-v3: address Eric's review: use locate_hole first.
     use xloadflags instead.

Yinghai Lu (4):
  kexec, x86: add boot header member for version 2.12
  kexec, x86: put ramdisk high for 64bit bzImage
  kexec, x86: set ext_cmd_line_ptr when boot_param is above 4g
  kexec, x86_64: Load bzImage64 above 4G

 include/x86/x86-linux.h             |   26 +++-
 kexec/arch/i386/x86-linux-setup.c   |   25 +++-
 kexec/arch/x86_64/Makefile          |    1 +
 kexec/arch/x86_64/kexec-bzImage64.c |  327 +++++++++++++++++++++++++++++++++++
 kexec/arch/x86_64/kexec-x86_64.c    |    1 +
 kexec/arch/x86_64/kexec-x86_64.h    |    5 +
 6 files changed, 378 insertions(+), 7 deletions(-)
 create mode 100644 kexec/arch/x86_64/kexec-bzImage64.c

-- 
1.7.7


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* [PATCH v3 1/4] kexec, x86: add boot header member for version 2.12
  2012-11-21  7:31 [PATCH v3 0/4] kexec: put bzImage and ramdisk above 4G for x86 64bit Yinghai Lu
@ 2012-11-21  7:31 ` Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 2/4] kexec, x86: put ramdisk high for 64bit bzImage Yinghai Lu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:31 UTC (permalink / raw)
  To: Simon Horman, H. Peter Anvin, Vivek Goyal, Haren Myneni,
	Eric W. Biederman
  Cc: Yinghai Lu, kexec

The following patches will use ext_ramdisk_image/size and xloadflags to
put the ramdisk and bzImage high for 64bit.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/x86/x86-linux.h |   26 +++++++++++++++++++++++---
 1 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/include/x86/x86-linux.h b/include/x86/x86-linux.h
index 27af02b..6d6c5e0 100644
--- a/include/x86/x86-linux.h
+++ b/include/x86/x86-linux.h
@@ -174,11 +174,21 @@ struct x86_linux_param_header {
 	/* 2.04+ */
 	uint32_t kernel_alignment;		/* 0x230 */
 	uint8_t  relocatable_kernel;		/* 0x234 */
-	uint8_t  reserved15[3];			/* 0x235 */
+	uint8_t  min_alignment;			/* 0x235 */
+	uint16_t xloadflags;			/* 0x236 */
 	uint32_t cmdline_size;			/* 0x238 */
 	uint32_t hardware_subarch;		/* 0x23C */
 	uint64_t hardware_subarch_data;		/* 0x240 */
-	uint8_t  reserved16[0x290 - 0x248];	/* 0x248 */
+	uint32_t payload_offset;		/* 0x248 */
+	uint32_t payload_length;		/* 0x24C */
+	uint64_t setup_data;			/* 0x250 */
+	uint64_t pref_address;			/* 0x258 */
+	uint32_t init_size;			/* 0x260 */
+	uint32_t handover_offset;		/* 0x264 */
+	uint32_t ext_ramdisk_image;		/* 0x268 */
+	uint32_t ext_ramdisk_size;		/* 0x26C */
+	uint32_t ext_cmd_line_ptr;		/* 0x270 */
+	uint8_t  reserved16[0x290 - 0x274];	/* 0x274 */
 	uint32_t edd_mbr_sig_buffer[EDD_MBR_SIG_MAX];	/* 0x290 */
 #endif
 	struct 	e820entry e820_map[E820MAX];	/* 0x2d0 */
@@ -241,10 +251,20 @@ struct x86_linux_header {
 #else
 	uint32_t kernel_alignment;		/* 0x230 */
 	uint8_t  relocatable_kernel;		/* 0x234 */
-	uint8_t  reserved6[3];			/* 0x235 */
+	uint8_t  min_alignment;			/* 0x235 */
+	uint16_t xloadflags;			/* 0x236 */
 	uint32_t cmdline_size;                  /* 0x238 */
 	uint32_t hardware_subarch;              /* 0x23C */
 	uint64_t hardware_subarch_data;         /* 0x240 */
+	uint32_t payload_offset;		/* 0x248 */
+	uint32_t payload_length;		/* 0x24C */
+	uint64_t setup_data;			/* 0x250 */
+	uint64_t pref_address;			/* 0x258 */
+	uint32_t init_size;			/* 0x260 */
+	uint32_t handover_offset;		/* 0x264 */
+	uint32_t ext_ramdisk_image;		/* 0x268 */
+	uint32_t ext_ramdisk_size;		/* 0x26C */
+	uint32_t ext_cmd_line_ptr;		/* 0x270 */
 #endif
 } PACKED;
 
-- 
1.7.7



* [PATCH v3 2/4] kexec, x86: put ramdisk high for 64bit bzImage
  2012-11-21  7:31 [PATCH v3 0/4] kexec: put bzImage and ramdisk above 4G for x86 64bit Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 1/4] kexec, x86: add boot header member for version 2.12 Yinghai Lu
@ 2012-11-21  7:31 ` Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 3/4] kexec, x86: set ext_cmd_line_ptr when boot_param is above 4g Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G Yinghai Lu
  3 siblings, 0 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:31 UTC (permalink / raw)
  To: Simon Horman, H. Peter Anvin, Vivek Goyal, Haren Myneni,
	Eric W. Biederman
  Cc: Yinghai Lu, kexec

With boot protocol 2.12, the ramdisk for a 64bit bzImage can be put high.

-v2: change ext_... handling to the way that Eric likes.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 kexec/arch/i386/x86-linux-setup.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/kexec/arch/i386/x86-linux-setup.c b/kexec/arch/i386/x86-linux-setup.c
index b7ab8ea..3c31f64 100644
--- a/kexec/arch/i386/x86-linux-setup.c
+++ b/kexec/arch/i386/x86-linux-setup.c
@@ -64,7 +64,11 @@ void setup_linux_bootloader_parameters(
 	/* Find the maximum initial ramdisk address */
 	initrd_addr_max = DEFAULT_INITRD_ADDR_MAX;
 	if (real_mode->protocol_version >= 0x0203) {
-		initrd_addr_max = real_mode->initrd_addr_max;
+		if (real_mode->protocol_version >= 0x020c &&
+		    real_mode->xloadflags & 1)
+			initrd_addr_max = ULONG_MAX;
+		else
+			initrd_addr_max = real_mode->initrd_addr_max;
 		dbgprintf("initrd_addr_max is 0x%lx\n", initrd_addr_max);
 	}
 
@@ -81,8 +85,16 @@ void setup_linux_bootloader_parameters(
 	}
 
 	/* Ramdisk address and size */
-	real_mode->initrd_start = initrd_base;
-	real_mode->initrd_size  = initrd_size;
+	real_mode->initrd_start = initrd_base & 0xffffffffUL;
+	real_mode->initrd_size  = initrd_size & 0xffffffffUL;
+
+	if (real_mode->protocol_version >= 0x020c &&
+	    (initrd_base & 0xffffffffUL) != initrd_base)
+		real_mode->ext_ramdisk_image = initrd_base >> 32;
+
+	if (real_mode->protocol_version >= 0x020c &&
+	    (initrd_size & 0xffffffffUL) != initrd_size)
+		real_mode->ext_ramdisk_size = initrd_size >> 32;
 
 	/* The location of the command line */
 	/* if (real_mode_base == 0x90000) { */
-- 
1.7.7



* [PATCH v3 3/4] kexec, x86: set ext_cmd_line_ptr when boot_param is above 4g
  2012-11-21  7:31 [PATCH v3 0/4] kexec: put bzImage and ramdisk above 4G for x86 64bit Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 1/4] kexec, x86: add boot header member for version 2.12 Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 2/4] kexec, x86: put ramdisk high for 64bit bzImage Yinghai Lu
@ 2012-11-21  7:31 ` Yinghai Lu
  2012-11-21  7:31 ` [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G Yinghai Lu
  3 siblings, 0 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:31 UTC (permalink / raw)
  To: Simon Horman, H. Peter Anvin, Vivek Goyal, Haren Myneni,
	Eric W. Biederman
  Cc: Yinghai Lu, kexec

Update ext_cmd_line_ptr for a bzImage with protocol 2.12 or later,
which could have its command line above 4G.

-v2: update ext_... handling to the way that Eric likes.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 kexec/arch/i386/x86-linux-setup.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/kexec/arch/i386/x86-linux-setup.c b/kexec/arch/i386/x86-linux-setup.c
index 3c31f64..d12dab1 100644
--- a/kexec/arch/i386/x86-linux-setup.c
+++ b/kexec/arch/i386/x86-linux-setup.c
@@ -103,7 +103,12 @@ void setup_linux_bootloader_parameters(
 		/* setup_move_size */
 	/* } */
 	if (real_mode->protocol_version >= 0x0202) {
-		real_mode->cmd_line_ptr = real_mode_base + cmdline_offset;
+		unsigned long cmd_line_ptr = real_mode_base + cmdline_offset;
+
+		real_mode->cmd_line_ptr = cmd_line_ptr & 0xffffffffUL;
+		if ((real_mode->protocol_version >= 0x020c) &&
+		    ((cmd_line_ptr & 0xffffffffUL) != cmd_line_ptr))
+			real_mode->ext_cmd_line_ptr = cmd_line_ptr >> 32;
 	}
 
 	/* Fill in the command line */
-- 
1.7.7



* [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21  7:31 [PATCH v3 0/4] kexec: put bzImage and ramdisk above 4G for x86 64bit Yinghai Lu
                   ` (2 preceding siblings ...)
  2012-11-21  7:31 ` [PATCH v3 3/4] kexec, x86: set ext_cmd_line_ptr when boot_param is above 4g Yinghai Lu
@ 2012-11-21  7:31 ` Yinghai Lu
  2012-11-21 14:37   ` Vivek Goyal
  2012-11-21 14:50   ` Vivek Goyal
  3 siblings, 2 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21  7:31 UTC (permalink / raw)
  To: Simon Horman, H. Peter Anvin, Vivek Goyal, Haren Myneni,
	Eric W. Biederman
  Cc: Yinghai Lu, kexec

Check xloadflags to see whether the bzImage is a 64bit relocatable one.

-v2: add kexec-bzImage64.c as suggested by Eric.
-v3: no need to keep purgatory under 2G after Eric's change to the purgatory code.
-v4: use locate_hole to find the position first, then add_buffer, as suggested by Eric.
     add the buffer for the kernel image last to make kexec load faster.
     use xloadflags in setup_header to tell whether the image is a bzImage64.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 kexec/arch/x86_64/Makefile          |    1 +
 kexec/arch/x86_64/kexec-bzImage64.c |  327 +++++++++++++++++++++++++++++++++++
 kexec/arch/x86_64/kexec-x86_64.c    |    1 +
 kexec/arch/x86_64/kexec-x86_64.h    |    5 +
 4 files changed, 334 insertions(+), 0 deletions(-)
 create mode 100644 kexec/arch/x86_64/kexec-bzImage64.c

diff --git a/kexec/arch/x86_64/Makefile b/kexec/arch/x86_64/Makefile
index 405bdf5..1cf10f9 100644
--- a/kexec/arch/x86_64/Makefile
+++ b/kexec/arch/x86_64/Makefile
@@ -13,6 +13,7 @@ x86_64_KEXEC_SRCS += kexec/arch/i386/crashdump-x86.c
 x86_64_KEXEC_SRCS_native =  kexec/arch/x86_64/kexec-x86_64.c
 x86_64_KEXEC_SRCS_native += kexec/arch/x86_64/kexec-elf-x86_64.c
 x86_64_KEXEC_SRCS_native += kexec/arch/x86_64/kexec-elf-rel-x86_64.c
+x86_64_KEXEC_SRCS_native += kexec/arch/x86_64/kexec-bzImage64.c
 
 x86_64_KEXEC_SRCS += $(x86_64_KEXEC_SRCS_native)
 
diff --git a/kexec/arch/x86_64/kexec-bzImage64.c b/kexec/arch/x86_64/kexec-bzImage64.c
new file mode 100644
index 0000000..28f1ace
--- /dev/null
+++ b/kexec/arch/x86_64/kexec-bzImage64.c
@@ -0,0 +1,327 @@
+/*
+ * kexec: Linux boots Linux
+ *
+ * Copyright (C) 2003-2010  Eric Biederman (ebiederm@xmission.com)
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation (version 2 of the License).
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#define _GNU_SOURCE
+#include <stddef.h>
+#include <stdio.h>
+#include <string.h>
+#include <limits.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <getopt.h>
+#include <elf.h>
+#include <boot/elf_boot.h>
+#include <ip_checksum.h>
+#include <x86/x86-linux.h>
+#include "../../kexec.h"
+#include "../../kexec-elf.h"
+#include "../../kexec-syscall.h"
+#include "kexec-x86_64.h"
+#include "../i386/x86-linux-setup.h"
+#include "../i386/crashdump-x86.h"
+#include <arch/options.h>
+
+static const int probe_debug = 0;
+
+int bzImage64_probe(const char *buf, off_t len)
+{
+	const struct x86_linux_header *header;
+	if ((uintmax_t)len < (uintmax_t)(2 * 512)) {
+		if (probe_debug) {
+			fprintf(stderr, "File is too short to be a bzImage!\n");
+		}
+		return -1;
+	}
+	header = (const struct x86_linux_header *)buf;
+	if (memcmp(header->header_magic, "HdrS", 4) != 0) {
+		if (probe_debug) {
+			fprintf(stderr, "Not a bzImage\n");
+		}
+		return -1;
+	}
+	if (header->boot_sector_magic != 0xAA55) {
+		if (probe_debug) {
+			fprintf(stderr, "No x86 boot sector present\n");
+		}
+		/* No x86 boot sector present */
+		return -1;
+	}
+	if (header->protocol_version < 0x020C) {
+		if (probe_debug) {
+			fprintf(stderr, "Must be at least protocol version 2.12\n");
+		}
+		/* Must be at least protocol version 2.12 */
+		return -1;
+	}
+	if ((header->loadflags & 1) == 0) {
+		if (probe_debug) {
+			fprintf(stderr, "zImage not a bzImage\n");
+		}
+		/* Not a bzImage */
+		return -1;
+	}
+	if (!(header->xloadflags & 1)) {
+		if (probe_debug) {
+			fprintf(stderr, "Not a bzImage64\n");
+		}
+		/* Must be LOADED_ABOVE_4G */
+		return -1;
+	}
+	/* I've got a bzImage64 */
+	if (probe_debug) {
+		fprintf(stderr, "It's a bzImage64\n");
+	}
+	return 0;
+}
+
+void bzImage64_usage(void)
+{
+	printf(	"    --command-line=STRING Set the kernel command line to STRING.\n"
+		"    --append=STRING       Set the kernel command line to STRING.\n"
+		"    --reuse-cmdline       Use kernel command line from running system.\n"
+		"    --initrd=FILE         Use FILE as the kernel's initial ramdisk.\n"
+		"    --ramdisk=FILE        Use FILE as the kernel's initial ramdisk.\n"
+		);
+}
+
+/* round_up() is from Linux kernel include/linux/kernel.h */
+/*
+ * This looks more complex than it should be. But we need to
+ * get the type for the ~ right in round_down (it needs to be
+ * as wide as the result!), and we want to evaluate the macro
+ * arguments just once each.
+ */
+#define __round_mask(x, y) ((__typeof__(x))((y)-1))
+#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)
+#define round_down(x, y) ((x) & ~__round_mask(x, y))
+
+static int do_bzImage64_load(struct kexec_info *info,
+	const char *kernel, off_t kernel_len,
+	const char *command_line, off_t command_line_len,
+	const char *initrd, off_t initrd_len)
+{
+	struct x86_linux_header setup_header;
+	struct x86_linux_param_header *real_mode;
+	int setup_sects;
+	size_t size;
+	int kern16_size;
+	unsigned long setup_base, setup_size;
+	struct entry64_regs regs64;
+	char *modified_cmdline;
+	unsigned long cmdline_end;
+	unsigned long align, addr, k_size;
+
+	/*
+	 * Find out about the file I am about to load.
+	 */
+	if ((uintmax_t)kernel_len < (uintmax_t)(2 * 512))
+		return -1;
+
+	memcpy(&setup_header, kernel, sizeof(setup_header));
+	setup_sects = setup_header.setup_sects;
+	if (setup_sects == 0)
+		setup_sects = 4;
+
+	kern16_size = (setup_sects + 1) * 512;
+	if (kernel_len < kern16_size) {
+		fprintf(stderr, "BzImage truncated?\n");
+		return -1;
+	}
+
+	if ((uintmax_t)command_line_len > (uintmax_t)setup_header.cmdline_size) {
+		dbgprintf("Kernel command line too long for kernel!\n");
+		return -1;
+	}
+
+	/* Need to append some command line parameters internally in case of
+	 * taking crash dumps.
+	 */
+	if (info->kexec_flags & (KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT)) {
+		modified_cmdline = xmalloc(COMMAND_LINE_SIZE);
+		memset((void *)modified_cmdline, 0, COMMAND_LINE_SIZE);
+		if (command_line) {
+			strncpy(modified_cmdline, command_line,
+					COMMAND_LINE_SIZE);
+			modified_cmdline[COMMAND_LINE_SIZE - 1] = '\0';
+		}
+
+		/* If panic kernel is being loaded, additional segments need
+		 * to be created. load_crashdump_segments will take care of
+		 * loading the segments as high in memory as possible, hence
+		 * in turn as away as possible from kernel to avoid being
+		 * stomped by the kernel.
+		 */
+		if (load_crashdump_segments(info, modified_cmdline, -1, 0) < 0)
+			return -1;
+
+		/* Use new command line buffer */
+		command_line = modified_cmdline;
+		command_line_len = strlen(command_line) +1;
+	}
+
+	/* x86_64 purgatory could be anywhere */
+	elf_rel_build_load(info, &info->rhdr, purgatory, purgatory_size,
+				0x3000, -1, -1, 0);
+	dbgprintf("Loaded purgatory at addr 0x%lx\n", info->rhdr.rel_addr);
+	/* The argument/parameter segment */
+	setup_size = kern16_size + command_line_len + PURGATORY_CMDLINE_SIZE;
+	real_mode = xmalloc(setup_size);
+	memcpy(real_mode, kernel, kern16_size);
+
+	/* No real mode code will be executing. setup segment can be loaded
+	 * anywhere as we will be just reading command line.
+	 */
+	setup_base = add_buffer(info, real_mode, setup_size, setup_size,
+				16, 0x3000, -1, -1);
+
+	dbgprintf("Loaded real_mode_data and command line at 0x%lx\n",
+			setup_base);
+
+	/* Tell the kernel what is going on */
+	setup_linux_bootloader_parameters(info, real_mode, setup_base,
+			kern16_size, command_line, command_line_len,
+			initrd, initrd_len);
+
+	/*
+	 * Add the kernel last, to make kexec load a big kernel faster.
+	 * We search for a buffer of the run-time size, but only add a
+	 * buffer of the image size, which is smaller than the run-time
+	 * size, so kexec_load later takes less time on the smaller range.
+	 * Otherwise kexec_load would allocate the big range but only
+	 * copy the small buffer, wasting time allocating the unneeded
+	 * range.
+	 */
+
+	/* The main kernel segment */
+	k_size = kernel_len - kern16_size;
+
+	/* need to use run-time size for buffer searching */
+	dbgprintf("kernel init_size 0x%x\n", real_mode->init_size);
+	size = round_up(real_mode->init_size, 4096);
+
+	/* need to sort segments before locate_hole */
+	if (sort_segments(info) < 0)
+		die("sort_segments failed\n");
+
+	/* avoid cross GB boundary */
+	align = real_mode->kernel_alignment;
+	addr = locate_hole(info, size, align, 0x100000, -1, -1);
+	if (addr == ULONG_MAX)
+		die("can not load bzImage64");
+	/* same GB ? */
+	while ((addr >> 30) != ((addr + size - 1) >> 30)) {
+		addr = locate_hole(info, size, align, 0x100000,
+				 round_down(addr + size - 1, (1UL<<30)), -1);
+		if (addr == ULONG_MAX)
+			die("can not load bzImage64");
+	}
+	dbgprintf("Found kernel buffer at %lx size %lx\n", addr, size);
+
+	/* put compressed image at start of buffer */
+	addr = add_buffer(info, kernel + kern16_size, k_size, k_size, align,
+				addr, addr + size, 1);
+	if (addr == ULONG_MAX)
+		die("can not load bzImage64");
+	dbgprintf("Loaded 64bit kernel at 0x%lx\n", addr);
+
+	elf_rel_get_symbol(&info->rhdr, "entry64_regs", &regs64, sizeof(regs64));
+	regs64.rbx = 0;           /* Bootstrap processor */
+	regs64.rsi = setup_base;  /* Pointer to the parameters */
+	regs64.rip = addr + 0x200; /* the entry point for startup_64 */
+	regs64.rsp = elf_rel_get_addr(&info->rhdr, "stack_end"); /* Stack, unused */
+	elf_rel_set_symbol(&info->rhdr, "entry64_regs", &regs64, sizeof(regs64));
+
+	cmdline_end = setup_base + kern16_size + command_line_len - 1;
+	elf_rel_set_symbol(&info->rhdr, "cmdline_end", &cmdline_end,
+			   sizeof(unsigned long));
+
+	/* Fill in the information BIOS calls would normally provide. */
+	setup_linux_system_parameters(real_mode, info->kexec_flags);
+
+	return 0;
+}
+
+int bzImage64_load(int argc, char **argv, const char *buf, off_t len,
+	struct kexec_info *info)
+{
+	char *command_line = NULL;
+	const char *ramdisk, *append = NULL;
+	char *ramdisk_buf;
+	off_t ramdisk_length;
+	int command_line_len;
+	int opt;
+	int result;
+
+	/* See options.h -- add any more there, too. */
+	static const struct option options[] = {
+		KEXEC_ARCH_OPTIONS
+		{ "command-line",	1, 0, OPT_APPEND },
+		{ "append",		1, 0, OPT_APPEND },
+		{ "reuse-cmdline",	0, 0, OPT_REUSE_CMDLINE },
+		{ "initrd",		1, 0, OPT_RAMDISK },
+		{ "ramdisk",		1, 0, OPT_RAMDISK },
+		{ 0,			0, 0, 0 },
+	};
+	static const char short_options[] = KEXEC_ARCH_OPT_STR "d";
+
+	ramdisk = 0;
+	ramdisk_length = 0;
+	while((opt = getopt_long(argc, argv, short_options, options, 0)) != -1) {
+		switch(opt) {
+		default:
+			/* Ignore core options */
+			if (opt < OPT_ARCH_MAX) {
+				break;
+			}
+		case '?':
+			usage();
+			return -1;
+			break;
+		case OPT_APPEND:
+			append = optarg;
+			break;
+		case OPT_REUSE_CMDLINE:
+			command_line = get_command_line();
+			break;
+		case OPT_RAMDISK:
+			ramdisk = optarg;
+			break;
+		}
+	}
+	command_line = concat_cmdline(command_line, append);
+	command_line_len = 0;
+	if (command_line) {
+		command_line_len = strlen(command_line) +1;
+	}
+	ramdisk_buf = 0;
+	if (ramdisk) {
+		ramdisk_buf = slurp_file(ramdisk, &ramdisk_length);
+	}
+	result = do_bzImage64_load(info,
+		buf, len,
+		command_line, command_line_len,
+		ramdisk_buf, ramdisk_length);
+
+	free(command_line);
+	return result;
+}
diff --git a/kexec/arch/x86_64/kexec-x86_64.c b/kexec/arch/x86_64/kexec-x86_64.c
index 6c42c32..5c23e01 100644
--- a/kexec/arch/x86_64/kexec-x86_64.c
+++ b/kexec/arch/x86_64/kexec-x86_64.c
@@ -37,6 +37,7 @@ struct file_type file_type[] = {
 	{ "multiboot-x86", multiboot_x86_probe, multiboot_x86_load,
 	  multiboot_x86_usage },
 	{ "elf-x86", elf_x86_probe, elf_x86_load, elf_x86_usage },
+	{ "bzImage64", bzImage64_probe, bzImage64_load, bzImage64_usage },
 	{ "bzImage", bzImage_probe, bzImage_load, bzImage_usage },
 	{ "beoboot-x86", beoboot_probe, beoboot_load, beoboot_usage },
 	{ "nbi-x86", nbi_probe, nbi_load, nbi_usage },
diff --git a/kexec/arch/x86_64/kexec-x86_64.h b/kexec/arch/x86_64/kexec-x86_64.h
index a97cd71..b820ae8 100644
--- a/kexec/arch/x86_64/kexec-x86_64.h
+++ b/kexec/arch/x86_64/kexec-x86_64.h
@@ -28,4 +28,9 @@ int elf_x86_64_load(int argc, char **argv, const char *buf, off_t len,
 	struct kexec_info *info);
 void elf_x86_64_usage(void);
 
+int bzImage64_probe(const char *buf, off_t len);
+int bzImage64_load(int argc, char **argv, const char *buf, off_t len,
+        struct kexec_info *info);
+void bzImage64_usage(void);
+
 #endif /* KEXEC_X86_64_H */
-- 
1.7.7



* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21  7:31 ` [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G Yinghai Lu
@ 2012-11-21 14:37   ` Vivek Goyal
  2012-11-21 17:24     ` H. Peter Anvin
  2012-11-21 19:54     ` Yinghai Lu
  2012-11-21 14:50   ` Vivek Goyal
  1 sibling, 2 replies; 25+ messages in thread
From: Vivek Goyal @ 2012-11-21 14:37 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Tue, Nov 20, 2012 at 11:31:38PM -0800, Yinghai Lu wrote:

[..]
> +	/* avoid cross GB boundary */
> +	align = real_mode->kernel_alignment;
> +	addr = locate_hole(info, size, align, 0x100000, -1, -1);
> +	if (addr == ULONG_MAX)
> +		die("can not load bzImage64");
> +	/* same GB ? */
> +	while ((addr >> 30) != ((addr + size - 1) >> 30)) {
> +		addr = locate_hole(info, size, align, 0x100000,
> +				 round_down(addr + size - 1, (1UL<<30)), -1);
> +		if (addr == ULONG_MAX)
> +			die("can not load bzImage64");
> +	}
> +	dbgprintf("Found kernel buffer at %lx size %lx\n", addr, size);

Where does this limitation of not loading kernel across GB boundary come
from?

Vivek



* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21  7:31 ` [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G Yinghai Lu
  2012-11-21 14:37   ` Vivek Goyal
@ 2012-11-21 14:50   ` Vivek Goyal
  2012-11-21 19:50     ` Yinghai Lu
  1 sibling, 1 reply; 25+ messages in thread
From: Vivek Goyal @ 2012-11-21 14:50 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Tue, Nov 20, 2012 at 11:31:38PM -0800, Yinghai Lu wrote:

[..]
> +int bzImage64_probe(const char *buf, off_t len)
> +{
> +	const struct x86_linux_header *header;
> +	if ((uintmax_t)len < (uintmax_t)(2 * 512)) {
> +		if (probe_debug) {
> +			fprintf(stderr, "File is too short to be a bzImage!\n");
> +		}
> +		return -1;
> +	}
> +	header = (const struct x86_linux_header *)buf;
> +	if (memcmp(header->header_magic, "HdrS", 4) != 0) {
> +		if (probe_debug) {
> +			fprintf(stderr, "Not a bzImage\n");
> +		}
> +		return -1;
> +	}
> +	if (header->boot_sector_magic != 0xAA55) {
> +		if (probe_debug) {
> +			fprintf(stderr, "No x86 boot sector present\n");
> +		}
> +		/* No x86 boot sector present */
> +		return -1;
> +	}
> +	if (header->protocol_version < 0x020C) {
> +		if (probe_debug) {
> +			fprintf(stderr, "Must be at least protocol version 2.12\n");
> +		}
> +		/* Must be at least protocol version 2.12 */
> +		return -1;
> +	}
> +	if ((header->loadflags & 1) == 0) {
> +		if (probe_debug) {
> +			fprintf(stderr, "zImage not a bzImage\n");
> +		}
> +		/* Not a bzImage */
> +		return -1;
> +	}
> +	if (!(header->xloadflags & 1)) {
> +		if (probe_debug) {
> +			fprintf(stderr, "Not a bzImage64\n");
> +		}
> +		/* Must be LOADED_ABOVE_4G */
> +		return -1;
> +	}

So how do I force a 16bit or 32bit entry using a bzImage64?

Thanks
Vivek


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 14:37   ` Vivek Goyal
@ 2012-11-21 17:24     ` H. Peter Anvin
  2012-11-21 19:54     ` Yinghai Lu
  1 sibling, 0 replies; 25+ messages in thread
From: H. Peter Anvin @ 2012-11-21 17:24 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Haren Myneni, Simon Horman, Yinghai Lu, Eric W. Biederman, kexec

On 11/21/2012 06:37 AM, Vivek Goyal wrote:
> On Tue, Nov 20, 2012 at 11:31:38PM -0800, Yinghai Lu wrote:
>
> [..]
>> +	/* avoid cross GB boundary */
>> +	align = real_mode->kernel_alignment;
>> +	addr = locate_hole(info, size, align, 0x100000, -1, -1);
>> +	if (addr == ULONG_MAX)
>> +		die("can not load bzImage64");
>> +	/* same GB ? */
>> +	while ((addr >> 30) != ((addr + size - 1) >> 30)) {
>> +		addr = locate_hole(info, size, align, 0x100000,
>> +				 round_down(addr + size - 1, (1UL<<30)), -1);
>> +		if (addr == ULONG_MAX)
>> +			die("can not load bzImage64");
>> +	}
>> +	dbgprintf("Found kernel buffer at %lx size %lx\n", addr, size);
>
> Where does this limitation of not loading kernel across GB boundary come
> from?
>

Seriously... that is bizarre.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 14:50   ` Vivek Goyal
@ 2012-11-21 19:50     ` Yinghai Lu
  2012-11-21 19:52       ` H. Peter Anvin
                         ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21 19:50 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Wed, Nov 21, 2012 at 6:50 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, Nov 20, 2012 at 11:31:38PM -0800, Yinghai Lu wrote:
>
> [..]
>> +int bzImage64_probe(const char *buf, off_t len)
>> +{
>> +     const struct x86_linux_header *header;
>> +     if ((uintmax_t)len < (uintmax_t)(2 * 512)) {
>> +             if (probe_debug) {
>> +                     fprintf(stderr, "File is too short to be a bzImage!\n");
>> +             }
>> +             return -1;
>> +     }
>> +     header = (const struct x86_linux_header *)buf;
>> +     if (memcmp(header->header_magic, "HdrS", 4) != 0) {
>> +             if (probe_debug) {
>> +                     fprintf(stderr, "Not a bzImage\n");
>> +             }
>> +             return -1;
>> +     }
>> +     if (header->boot_sector_magic != 0xAA55) {
>> +             if (probe_debug) {
>> +                     fprintf(stderr, "No x86 boot sector present\n");
>> +             }
>> +             /* No x86 boot sector present */
>> +             return -1;
>> +     }
>> +     if (header->protocol_version < 0x020C) {
>> +             if (probe_debug) {
>> +                     fprintf(stderr, "Must be at least protocol version 2.12\n");
>> +             }
>> +             /* Must be at least protocol version 2.12 */
>> +             return -1;
>> +     }
>> +     if ((header->loadflags & 1) == 0) {
>> +             if (probe_debug) {
>> +                     fprintf(stderr, "zImage not a bzImage\n");
>> +             }
>> +             /* Not a bzImage */
>> +             return -1;
>> +     }
>> +     if (!(header->xloadflags & 1)) {
>> +             if (probe_debug) {
>> +                     fprintf(stderr, "Not a bzImage64\n");
>> +             }
>> +             /* Must be LOADED_ABOVE_4G */
>> +             return -1;
>> +     }
>
> So how do I force a 16bit or 32bit entry using a bzImage64?

kexec -t bzImage -l ....
will load low and use 32bit entry.

kexec -t bzImage64 -l ...
kexec -l ...
will try to load high and use 64bit entry.
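
A rough sketch of why plain "kexec -l" ends up on the 64bit path: bzImage64
is listed before bzImage in file_type[], and the assumption here is that the
first type whose probe succeeds is used unless -t forces a name. The struct
image and the probe bodies are simplified, hypothetical stand-ins, not the
real kexec code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified image description; real probes parse the header. */
struct image {
	int is_bzimage;         /* header looks like a bzImage */
	int loadable_above_4g;  /* xloadflags bit checked by the 64bit probe */
};

struct file_type {
	const char *name;
	int (*probe)(const struct image *img);  /* 0 means "can load this" */
};

static int probe_bzImage64(const struct image *img)
{
	return (img->is_bzimage && img->loadable_above_4g) ? 0 : -1;
}

static int probe_bzImage(const struct image *img)
{
	return img->is_bzimage ? 0 : -1;
}

/* bzImage64 comes first, so it wins when no type is forced. */
static const struct file_type file_types[] = {
	{ "bzImage64", probe_bzImage64 },
	{ "bzImage",   probe_bzImage },
};

/* Pick a loader: honor a forced type name if given, else first match. */
static const char *pick_type(const struct image *img, const char *forced)
{
	size_t i;

	assert(img);
	for (i = 0; i < sizeof(file_types) / sizeof(file_types[0]); i++) {
		if (forced && strcmp(forced, file_types[i].name) != 0)
			continue;
		if (file_types[i].probe(img) == 0)
			return file_types[i].name;
	}
	return NULL;
}
```

Under these assumptions, forcing "bzImage" on a 2.12 image still selects the
older loader, which loads low and uses the 32bit entry.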


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 19:50     ` Yinghai Lu
@ 2012-11-21 19:52       ` H. Peter Anvin
  2012-11-21 19:57         ` Yinghai Lu
  2012-11-21 20:00       ` Vivek Goyal
  2012-11-21 20:07       ` Vivek Goyal
  2 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2012-11-21 19:52 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

On 11/21/2012 11:50 AM, Yinghai Lu wrote:
>>
>> So how do I force a 16bit or 32bit entry using a bzImage64?
> 
> kexec -t bzImage -l ....
> will load low and use 32bit entry.
> 
> kexec -t bzImage64 -l ...
> kexec -l ...
> will try to load high and use 64bit entry.
> 

I don't see any difference...

	-hpa



* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 14:37   ` Vivek Goyal
  2012-11-21 17:24     ` H. Peter Anvin
@ 2012-11-21 19:54     ` Yinghai Lu
  2012-11-21 19:56       ` H. Peter Anvin
  1 sibling, 1 reply; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21 19:54 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Wed, Nov 21, 2012 at 6:37 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, Nov 20, 2012 at 11:31:38PM -0800, Yinghai Lu wrote:
>
> [..]
>> +     /* avoid cross GB boundary */
>> +     align = real_mode->kernel_alignment;
>> +     addr = locate_hole(info, size, align, 0x100000, -1, -1);
>> +     if (addr == ULONG_MAX)
>> +             die("can not load bzImage64");
>> +     /* same GB ? */
>> +     while ((addr >> 30) != ((addr + size - 1) >> 30)) {
>> +             addr = locate_hole(info, size, align, 0x100000,
>> +                              round_down(addr + size - 1, (1UL<<30)), -1);
>> +             if (addr == ULONG_MAX)
>> +                     die("can not load bzImage64");
>> +     }
>> +     dbgprintf("Found kernel buffer at %lx size %lx\n", addr, size);
>
> Where does this limitation of not loading kernel across GB boundary come
> from?

It comes from the kernel, in arch/x86/kernel/head_64.S.

That code only sets up an identity mapping for the first 1G. If it finds
that the kernel is loaded above 1G, it sets up an extra identity mapping
for the new _text.._end range.
To keep the check and the extra mapping simple, and to save two extra
pages of page tables, _text.._end is limited to a single 1G range.
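
As a rough illustration of the test used in the quoted loop (a standalone
sketch, not code from the patch; the name same_gb is made up here), the
"same GB" condition compares the 1G index of the first and last byte of
the buffer:

```c
#include <stdint.h>

/* 1 GiB, the span covered by one level-3 page-table entry on x86_64. */
#define GB_SHIFT 30

/*
 * Return 1 if the byte range [addr, addr + size) lies inside a single
 * 1 GiB-aligned window, 0 otherwise. size must be non-zero.
 * same_gb is a hypothetical helper name used only for this sketch.
 */
int same_gb(uint64_t addr, uint64_t size)
{
	return (addr >> GB_SHIFT) == ((addr + size - 1) >> GB_SHIFT);
}
```

The quoted loop keeps calling locate_hole() with a lower upper limit until
a candidate address satisfies this condition.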

Thanks

Yinghai


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 19:54     ` Yinghai Lu
@ 2012-11-21 19:56       ` H. Peter Anvin
  2012-11-21 20:01         ` Yinghai Lu
  0 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2012-11-21 19:56 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

On 11/21/2012 11:54 AM, Yinghai Lu wrote:
> On Wed, Nov 21, 2012 at 6:37 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> On Tue, Nov 20, 2012 at 11:31:38PM -0800, Yinghai Lu wrote:
>>
>> [..]
>>> +     /* avoid cross GB boundary */
>>> +     align = real_mode->kernel_alignment;
>>> +     addr = locate_hole(info, size, align, 0x100000, -1, -1);
>>> +     if (addr == ULONG_MAX)
>>> +             die("can not load bzImage64");
>>> +     /* same GB ? */
>>> +     while ((addr >> 30) != ((addr + size - 1) >> 30)) {
>>> +             addr = locate_hole(info, size, align, 0x100000,
>>> +                              round_down(addr + size - 1, (1UL<<30)), -1);
>>> +             if (addr == ULONG_MAX)
>>> +                     die("can not load bzImage64");
>>> +     }
>>> +     dbgprintf("Found kernel buffer at %lx size %lx\n", addr, size);
>>
>> Where does this limitation of not loading kernel across GB boundary come
>> from?
> 
> in kernel arch/x86/kernel/head_64.S
> 
> it only set first 1G ident mapping. and if it find that code is above
> 1G, it will set extra ident mapping
> for new _text.._end.
> To make checking and add extra mapping simple and also save two extra
> pages for mapping.
> Limit that _text.._end in them same GB range.
> 

No, this is backwards.

We should fix that limitation instead.

	-hpa




* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 19:52       ` H. Peter Anvin
@ 2012-11-21 19:57         ` Yinghai Lu
  0 siblings, 0 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21 19:57 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

On Wed, Nov 21, 2012 at 11:52 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/21/2012 11:50 AM, Yinghai Lu wrote:
>>>
>>> So how do I force a 16bit or 32bit entry using a bzImage64?
>>
>> kexec -t bzImage -l ....
>> will load low and use 32bit entry.
>>
>> kexec -t bzImage64 -l ...
>> kexec -l ...
>> will try to load high and use 64bit entry.
>>

>
> I don't see any difference...
>


--- a/kexec/arch/x86_64/kexec-x86_64.c
+++ b/kexec/arch/x86_64/kexec-x86_64.c
@@ -37,6 +37,7 @@ struct file_type file_type[] = {
        { "multiboot-x86", multiboot_x86_probe, multiboot_x86_load,
          multiboot_x86_usage },
        { "elf-x86", elf_x86_probe, elf_x86_load, elf_x86_usage },
+       { "bzImage64", bzImage64_probe, bzImage64_load, bzImage64_usage },
        { "bzImage", bzImage_probe, bzImage_load, bzImage_usage },
        { "beoboot-x86", beoboot_probe, beoboot_load, beoboot_usage },
        { "nbi-x86", nbi_probe, nbi_load, nbi_usage },

bzImage64_probe runs before bzImage_probe, and if it finds that the image
is 64bit, bzImage_probe is never executed.
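
The first-match behaviour can be sketched with a toy table (everything
below is a simplified stand-in: the struct, the toy probes and detect_type
are assumptions for illustration, not the kexec-tools code). The forced
argument models what "-t" does in this toy model, namely restricting the
search to one named type:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the file_type table entries. */
struct probe_entry {
	const char *name;
	int (*probe)(const char *buf, size_t len); /* 0 = match, -1 = no match */
};

/* Toy probes: byte 0 is 'Z' for any bzImage, byte 1 is '6' if 64bit capable. */
int toy_bzImage64_probe(const char *buf, size_t len)
{
	return (len >= 2 && buf[0] == 'Z' && buf[1] == '6') ? 0 : -1;
}

int toy_bzImage_probe(const char *buf, size_t len)
{
	return (len >= 1 && buf[0] == 'Z') ? 0 : -1;
}

/* Order matters: the more specific probe is listed first. */
static const struct probe_entry probe_table[] = {
	{ "bzImage64", toy_bzImage64_probe },
	{ "bzImage",   toy_bzImage_probe },
};

/*
 * Return the name of the first type whose probe accepts the buffer, or
 * NULL if none does. If forced is non-NULL, only that type is tried.
 */
const char *detect_type(const char *buf, size_t len, const char *forced)
{
	size_t i;

	for (i = 0; i < sizeof(probe_table) / sizeof(probe_table[0]); i++) {
		if (forced && strcmp(forced, probe_table[i].name) != 0)
			continue;
		if (probe_table[i].probe(buf, len) == 0)
			return probe_table[i].name;
	}
	return NULL;
}
```

With this ordering a 64bit-capable image is claimed by the bzImage64 entry
unless the type is forced to bzImage.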


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 19:50     ` Yinghai Lu
  2012-11-21 19:52       ` H. Peter Anvin
@ 2012-11-21 20:00       ` Vivek Goyal
  2012-11-21 20:09         ` Yinghai Lu
  2012-11-21 20:07       ` Vivek Goyal
  2 siblings, 1 reply; 25+ messages in thread
From: Vivek Goyal @ 2012-11-21 20:00 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Wed, Nov 21, 2012 at 11:50:56AM -0800, Yinghai Lu wrote:

[..]
> > So how do I force a 16bit or 32bit entry using a bzImage64?
> 
> kexec -t bzImage -l ....
> will load low and use 32bit entry.
> 

OK, so the user needs to force the image type to bzImage (using the -t
option) to be able to use the 32bit entry on a bzImage that supports the
64bit entry point.

I think a better option is for the bzImage64 loader to parse the
user-specified entry options and call into the 32bit bzImage loader if the
user is asking for 32bit or 16bit entry. For 16bit, we already have the
--real-mode option. Maybe we need to introduce another one for forcing
32bit entry, say --entry-32bit or --protected-mode (whatever makes sense).

Eric will know better but I do think that bzImage64 loader should be able
to parse user specified entry point and honor that.

> kexec -t bzImage64 -l ...
> kexec -l ...
> will try to load high and use 64bit entry.

This one violates the existing syntax where --real-mode is supposed to
jump to real mode entry point of bzImage.

Thanks
Vivek


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 19:56       ` H. Peter Anvin
@ 2012-11-21 20:01         ` Yinghai Lu
  2012-11-21 20:16           ` H. Peter Anvin
  0 siblings, 1 reply; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21 20:01 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

On Wed, Nov 21, 2012 at 11:56 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/21/2012 11:54 AM, Yinghai Lu wrote:
>>
>> in kernel arch/x86/kernel/head_64.S
>>
>> it only set first 1G ident mapping. and if it find that code is above
>> 1G, it will set extra ident mapping
>> for new _text.._end.
>> To make checking and add extra mapping simple and also save two extra
>> pages for mapping.
>> Limit that _text.._end in them same GB range.
>>
>
> No, this is backwards.

The old code limited the bzImage to [0,1G), i.e. the first 1G.

Now we can put it in any aligned 1G range.

So how can that be called backwards?

>
> We should fix that limitation instead.

Sure, but that would make arch/x86/boot/compressed/head_64.S needlessly
complicated.


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 19:50     ` Yinghai Lu
  2012-11-21 19:52       ` H. Peter Anvin
  2012-11-21 20:00       ` Vivek Goyal
@ 2012-11-21 20:07       ` Vivek Goyal
  2012-11-22 11:39         ` Eric W. Biederman
  2 siblings, 1 reply; 25+ messages in thread
From: Vivek Goyal @ 2012-11-21 20:07 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Wed, Nov 21, 2012 at 11:50:56AM -0800, Yinghai Lu wrote:
> On Wed, Nov 21, 2012 at 6:50 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Tue, Nov 20, 2012 at 11:31:38PM -0800, Yinghai Lu wrote:
> >
> > [..]
> >> +int bzImage64_probe(const char *buf, off_t len)
> >> +{
> >> +     const struct x86_linux_header *header;
> >> +     if ((uintmax_t)len < (uintmax_t)(2 * 512)) {
> >> +             if (probe_debug) {
> >> +                     fprintf(stderr, "File is too short to be a bzImage!\n");
> >> +             }
> >> +             return -1;
> >> +     }
> >> +     header = (const struct x86_linux_header *)buf;
> >> +     if (memcmp(header->header_magic, "HdrS", 4) != 0) {
> >> +             if (probe_debug) {
> >> +                     fprintf(stderr, "Not a bzImage\n");
> >> +             }
> >> +             return -1;
> >> +     }
> >> +     if (header->boot_sector_magic != 0xAA55) {
> >> +             if (probe_debug) {
> >> +                     fprintf(stderr, "No x86 boot sector present\n");
> >> +             }
> >> +             /* No x86 boot sector present */
> >> +             return -1;
> >> +     }
> >> +     if (header->protocol_version < 0x020C) {
> >> +             if (probe_debug) {
> >> +                     fprintf(stderr, "Must be at least protocol version 2.12\n");
> >> +             }
> >> +             /* Must be at least protocol version 2.12 */
> >> +             return -1;
> >> +     }
> >> +     if ((header->loadflags & 1) == 0) {
> >> +             if (probe_debug) {
> >> +                     fprintf(stderr, "zImage not a bzImage\n");
> >> +             }
> >> +             /* Not a bzImage */
> >> +             return -1;
> >> +     }
> >> +     if (!(header->xloadflags & 1)) {
> >> +             if (probe_debug) {
> >> +                     fprintf(stderr, "Not a bzImage64\n");
> >> +             }
> >> +             /* Must be LOADED_ABOVE_4G */
> >> +             return -1;
> >> +     }
> >
> > So how do I force a 16bit or 32bit entry using a bzImage64?
> 
> kexec -t bzImage -l ....
> will load low and use 32bit entry.
> 
> kexec -t bzImage64 -l ...
> kexec -l ...
> will try to load high and use 64bit entry.

Also, bzImage64 is not really a new image format. It is just an enhancement
of the bzImage format. We keep extending bzImage and don't call each
extension a new image format. I am not sure how good an idea it is to
expose the notion of a new image type, bzImage64, to the user.

Thanks
Vivek


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:00       ` Vivek Goyal
@ 2012-11-21 20:09         ` Yinghai Lu
  2012-11-21 20:12           ` Vivek Goyal
  0 siblings, 1 reply; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21 20:09 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Wed, Nov 21, 2012 at 12:00 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Nov 21, 2012 at 11:50:56AM -0800, Yinghai Lu wrote:
>
> [..]
>> > So how do I force a 16bit or 32bit entry using a bzImage64?
>>
>> kexec -t bzImage -l ....
>> will load low and use 32bit entry.
>>
>
> Ok, so user needs to enforce image type (using -t option) to bzImage
> (for a bzImage which supports 64bit entry point) to be able to use
> 32bit entry.
>
> I think a better option is that bzImage64 loader parses the user specified
> entry options and call into 32bit bzImage loader if user is asking for 32bit
> or 16bit entry. For 16bit, we already have option --real-mode option. May
> be we need to introduce another one for forcing 32bit entry, say --entry-32bit
> or --protected-mode (whatever makes sense).

That is doable.

Add checks for entry-32bit and real-mode in kexec-bzImage64.c
and bail out early, leaving kexec-bzImage.c to take over.
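
A minimal sketch of such an early check, assuming the loader can see the
raw argument vector (wants_legacy_entry is a made-up name, and
"--entry-32bit" is only the option name floated in this thread, not an
existing kexec option):

```c
#include <string.h>

/*
 * Return 1 if the command line asks for a 16bit or 32bit entry, in which
 * case a 64bit loader would decline and let the 32bit one take over.
 * "--entry-32bit" is a hypothetical option taken from the discussion.
 */
int wants_legacy_entry(int argc, char *const argv[])
{
	int i;

	for (i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--real-mode") == 0 ||
		    strcmp(argv[i], "--entry-32bit") == 0)
			return 1;
	}
	return 0;
}
```

A real implementation would more likely go through the existing option
parsing than compare strings by hand; the point is only that the decision
can be made before any 64bit-specific loading work starts.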


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:09         ` Yinghai Lu
@ 2012-11-21 20:12           ` Vivek Goyal
  2012-11-21 20:17             ` Yinghai Lu
  0 siblings, 1 reply; 25+ messages in thread
From: Vivek Goyal @ 2012-11-21 20:12 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Wed, Nov 21, 2012 at 12:09:04PM -0800, Yinghai Lu wrote:
> On Wed, Nov 21, 2012 at 12:00 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Wed, Nov 21, 2012 at 11:50:56AM -0800, Yinghai Lu wrote:
> >
> > [..]
> >> > So how do I force a 16bit or 32bit entry using a bzImage64?
> >>
> >> kexec -t bzImage -l ....
> >> will load low and use 32bit entry.
> >>
> >
> > Ok, so user needs to enforce image type (using -t option) to bzImage
> > (for a bzImage which supports 64bit entry point) to be able to use
> > 32bit entry.
> >
> > I think a better option is that bzImage64 loader parses the user specified
> > entry options and call into 32bit bzImage loader if user is asking for 32bit
> > or 16bit entry. For 16bit, we already have option --real-mode option. May
> > be we need to introduce another one for forcing 32bit entry, say --entry-32bit
> > or --protected-mode (whatever makes sense).
> 
> that is doable.
> 
> add checking entry-32bit and real-mode checking in kexec-bzImage64.c
> and bail early to leave kexec-bzImage.c to take over.

Or actually, can we do the reverse? Do not introduce a new image format,
bzImage64. In the existing bzImage loader, if the bzImage protocol version
is at least 0x020c (or whatever version has the 64bit entry extension),
just call into load_bzImage64().

Thanks
Vivek


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:01         ` Yinghai Lu
@ 2012-11-21 20:16           ` H. Peter Anvin
  2012-11-21 20:47             ` Yinghai Lu
  0 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2012-11-21 20:16 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

On 11/21/2012 12:01 PM, Yinghai Lu wrote:
> On Wed, Nov 21, 2012 at 11:56 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 11/21/2012 11:54 AM, Yinghai Lu wrote:
>>>
>>> in kernel arch/x86/kernel/head_64.S
>>>
>>> it only set first 1G ident mapping. and if it find that code is above
>>> 1G, it will set extra ident mapping
>>> for new _text.._end.
>>> To make checking and add extra mapping simple and also save two extra
>>> pages for mapping.
>>> Limit that _text.._end in them same GB range.
>>>
>>
>> No, this is backwards.
> 
> old one: it limited bzImage in [0,1G) aka the first 1G.
> 
> Now we can put it in any aligned 1G range.
> 
> So how could it be called backwards?
> 

Because you're adding a more complicated hack.

>>
>> We should fix that limitation instead.
> 
> sure, but that will make arch/x86/boot/compressed/head_64.S not need
> complicated.
> 

But it makes the bootloaders more complicated, and the bootloaders are
harder to fix.

	-hpa



* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:12           ` Vivek Goyal
@ 2012-11-21 20:17             ` Yinghai Lu
  0 siblings, 0 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21 20:17 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, H. Peter Anvin

On Wed, Nov 21, 2012 at 12:12 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>
> Or actually can we do reverse. Do not introduce new image format
> bzImage64. In existing bzImage loader if bzImage version is greater
> than 0x020c (or whatever version has 64bit entry extension), just
> call into load_bzImage64().

Eric wanted me to separate that out,
and now you want me to put it back?


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:16           ` H. Peter Anvin
@ 2012-11-21 20:47             ` Yinghai Lu
  2012-11-21 20:56               ` H. Peter Anvin
  2012-11-21 23:34               ` H. Peter Anvin
  0 siblings, 2 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-21 20:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

On Wed, Nov 21, 2012 at 12:16 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/21/2012 12:01 PM, Yinghai Lu wrote:
>> On Wed, Nov 21, 2012 at 11:56 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>>> On 11/21/2012 11:54 AM, Yinghai Lu wrote:
>>>>
>>>> in kernel arch/x86/kernel/head_64.S
>>>>
>>>> it only set first 1G ident mapping. and if it find that code is above
>>>> 1G, it will set extra ident mapping
>>>> for new _text.._end.
>>>> To make checking and add extra mapping simple and also save two extra
>>>> pages for mapping.
>>>> Limit that _text.._end in them same GB range.
>>>>
>>>
>>> No, this is backwards.
>>
>> old one: it limited bzImage in [0,1G) aka the first 1G.
>>
>> Now we can put it in any aligned 1G range.
>>
>> So how could it be called backwards?
>>
>
> Because you're adding a more complicated hack.

It is not that complicated, and it only adds 7 lines:

        /* same GB ? */
        while ((addr >> 30) != ((addr + size - 1) >> 30)) {
                addr = locate_hole(info, size, align, 0x100000,
                                 round_down(addr + size - 1, (1UL<<30)), -1);
                if (addr == ULONG_MAX)
                        die("can not load bzImage64");
        }

>
>>>
>>> We should fix that limitation instead.
>>
>> sure, but that will make arch/x86/boot/compressed/head_64.S not need
>> complicated.

Handling the cross-GB-boundary case in head_64.S may need another
100 lines of code for the checks and so on.

It will also need another two spare pages for the cross-512G-boundary case.
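
For reference, the index arithmetic behind these boundaries can be
sketched in C, assuming standard x86_64 4-level paging with 2M pages (the
function names are chosen for this sketch; the shifts mirror PMD_SHIFT,
PUD_SHIFT and PGDIR_SHIFT). A range that crosses a 512G boundary touches
two level-4 entries, which presumably is why one more level-3 table and
one more level-2 table are wanted:

```c
#include <stdint.h>

#define PMD_SHIFT    21  /* one level-2 entry maps 2 MiB */
#define PUD_SHIFT    30  /* one level-3 entry maps 1 GiB */
#define PGDIR_SHIFT  39  /* one level-4 entry maps 512 GiB */
#define PTRS_PER_TBL 512 /* entries per page-table page */

/* Index of addr within the level-4, level-3 and level-2 tables. */
unsigned pgd_index(uint64_t addr)
{
	return (unsigned)((addr >> PGDIR_SHIFT) & (PTRS_PER_TBL - 1));
}

unsigned pud_index(uint64_t addr)
{
	return (unsigned)((addr >> PUD_SHIFT) & (PTRS_PER_TBL - 1));
}

unsigned pmd_index(uint64_t addr)
{
	return (unsigned)((addr >> PMD_SHIFT) & (PTRS_PER_TBL - 1));
}

/* Return 1 if the inclusive range [start, end] spans two 1 GiB windows. */
int crosses_1g(uint64_t start, uint64_t end)
{
	return (start >> PUD_SHIFT) != (end >> PUD_SHIFT);
}

/* Return 1 if the inclusive range [start, end] spans two 512 GiB windows. */
int crosses_512g(uint64_t start, uint64_t end)
{
	return (start >> PGDIR_SHIFT) != (end >> PGDIR_SHIFT);
}
```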

>
> But it makes the bootloaders more complicated, and the bootloaders are
> harder to fix.

Current bootloaders do not have this feature yet and would have to
add it, and it looks like kexec is the only one at this point.

BTW, is there any 64bit boot loader?


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:47             ` Yinghai Lu
@ 2012-11-21 20:56               ` H. Peter Anvin
  2012-11-21 23:34               ` H. Peter Anvin
  1 sibling, 0 replies; 25+ messages in thread
From: H. Peter Anvin @ 2012-11-21 20:56 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

Yes.

Yinghai Lu <yinghai@kernel.org> wrote:

>On Wed, Nov 21, 2012 at 12:16 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 11/21/2012 12:01 PM, Yinghai Lu wrote:
>>> On Wed, Nov 21, 2012 at 11:56 AM, H. Peter Anvin <hpa@zytor.com>
>wrote:
>>>> On 11/21/2012 11:54 AM, Yinghai Lu wrote:
>>>>>
>>>>> in kernel arch/x86/kernel/head_64.S
>>>>>
>>>>> it only set first 1G ident mapping. and if it find that code is
>above
>>>>> 1G, it will set extra ident mapping
>>>>> for new _text.._end.
>>>>> To make checking and add extra mapping simple and also save two
>extra
>>>>> pages for mapping.
>>>>> Limit that _text.._end in them same GB range.
>>>>>
>>>>
>>>> No, this is backwards.
>>>
>>> old one: it limited bzImage in [0,1G) aka the first 1G.
>>>
>>> Now we can put it in any aligned 1G range.
>>>
>>> So how could it be called backwards?
>>>
>>
>> Because you're adding a more complicated hack.
>
>not that complicated, and it only add 7 lines
>
>        /* same GB ? */
>        while ((addr >> 30) != ((addr + size - 1) >> 30)) {
>                addr = locate_hole(info, size, align, 0x100000,
>                           round_down(addr + size - 1, (1UL<<30)), -1);
>                if (addr == ULONG_MAX)
>                        die("can not load bzImage64");
>        }
>
>>
>>>>
>>>> We should fix that limitation instead.
>>>
>>> sure, but that will make arch/x86/boot/compressed/head_64.S not need
>>> complicated.
>
>If that add cover cross GB boundary handling in head_64.S, may need
>another 100 line code
>for checking and etc.
>
>Also will need another two spare pages for cross 512G boundary.
>
>>
>> But it makes the bootloaders more complicated, and the bootloaders
>are
>> harder to fix.
>
>current bootloader does not have this feature and will
>add the feature. and looks like kexec is only one at this point.
>
>BTW,  is there any 64bit boot loader?

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:47             ` Yinghai Lu
  2012-11-21 20:56               ` H. Peter Anvin
@ 2012-11-21 23:34               ` H. Peter Anvin
  2012-11-22  5:52                 ` Yinghai Lu
  1 sibling, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2012-11-21 23:34 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

On 11/21/2012 12:47 PM, Yinghai Lu wrote:
>>>>
>>>> We should fix that limitation instead.
>>>
>>> sure, but that will make arch/x86/boot/compressed/head_64.S not need
>>> complicated.
>
> If that add cover cross GB boundary handling in head_64.S, may need
> another 100 line code
> for checking and etc.
>

That seems unlikely.

> Also will need another two spare pages for cross 512G boundary.

Doesn't seem like a problem.

Let me be blunt: either we do it right or we don't do it at all.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 23:34               ` H. Peter Anvin
@ 2012-11-22  5:52                 ` Yinghai Lu
  0 siblings, 0 replies; 25+ messages in thread
From: Yinghai Lu @ 2012-11-22  5:52 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Haren Myneni, Simon Horman, kexec, Eric W. Biederman, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 426 bytes --]

On Wed, Nov 21, 2012 at 3:34 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
>> Also will need another two spare pages for cross 512G boundary.
>
>
> Doesn't seem like a problem.
>
> Let me be blunt: either we do it right or we don't do it at all.

OK, please check the attached version, which covers the GB boundary case.

It is about 100 lines longer than the version that does not handle crossing.

Tested across the 1G, 5G, 512G and 513G boundaries.

Thanks

Yinghai

[-- Attachment #2: kernel_pgt_level3_spare.patch --]
[-- Type: application/octet-stream, Size: 5831 bytes --]

Subject: [PATCH] x86, 64bit: add support for loading kernel above 512G

Current kernel is not allowed to be loaded above 512g, it thinks
that address is too big.

We only need to add one extra spare page for needed level3 to
point another 512g range.

Need to check _text range and set level4 pg to point to that spare
level3 page, and set level3 to point to level2 page to cover
[_text, _end] with extra mapping.

We need this to put relocatable bzImage high above 512g.

-v2: handling cross GB boundary that hpa insists on.
    test on cross 1G, 5G, 512g, 513g.
    should double check on cross 1024g, but it should work.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>

---
 arch/x86/kernel/head_64.S |  142 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 131 insertions(+), 11 deletions(-)

Index: linux-2.6/arch/x86/kernel/head_64.S
===================================================================
--- linux-2.6.orig/arch/x86/kernel/head_64.S
+++ linux-2.6/arch/x86/kernel/head_64.S
@@ -78,12 +78,6 @@ startup_64:
 	testl	%eax, %eax
 	jnz	bad_address
 
-	/* Is the address too large? */
-	leaq	_text(%rip), %rdx
-	movq	$PGDIR_SIZE, %rax
-	cmpq	%rax, %rdx
-	jae	bad_address
-
 	/* Fixup the physical addresses in the page table
 	 */
 	addq	%rbp, init_level4_pgt + 0(%rip)
@@ -97,28 +91,147 @@ startup_64:
 
 	addq	%rbp, level2_fixmap_pgt + (506*8)(%rip)
 
-	/* Add an Identity mapping if I am above 1G */
+	/* Add an Identity mapping if _end is above 1G */
+	leaq	_end(%rip), %r9
+	decq	%r9
+	cmp	$PUD_SIZE, %r9
+	jl	ident_complete
+
+	/* get end */
+	andq	$PMD_PAGE_MASK, %r9
+	/* round start to 1G if it is below 1G */
 	leaq	_text(%rip), %rdi
 	andq	$PMD_PAGE_MASK, %rdi
+	cmp	$PUD_SIZE, %rdi
+	jg	1f
+	movq	$PUD_SIZE, %rdi
+1:
+	/* get 512G index */
+	movq	%r9, %r10
+	shrq	$PGDIR_SHIFT, %r10
+	andq	$(PTRS_PER_PGD - 1), %r10
+	movq	%rdi, %rax
+	shrq	$PGDIR_SHIFT, %rax
+	andq	$(PTRS_PER_PGD - 1), %rax
+
+	/* cross two 512G ? */
+	cmp	%r10, %rax
+	jne	set_level3_other_512g
+
+	/* all in first 512G ? */
+	cmp	$0, %rax
+	je	skip_level3_spare
+
+	/* same 512G other than first 512g */
+	leaq    (level3_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	leaq    init_level4_pgt(%rip), %rbx
+	movq    %rdx, 0(%rbx, %rax, 8)
+	addq    $L4_PAGE_OFFSET, %rax
+	movq    %rdx, 0(%rbx, %rax, 8)
+
+	/* get 1G index */
+	movq    %r9, %r10
+	shrq    $PUD_SHIFT, %r10
+	andq    $(PTRS_PER_PUD - 1), %r10
+        movq    %rdi, %rax
+        shrq    $PUD_SHIFT, %rax
+        andq    $(PTRS_PER_PUD - 1), %rax
+
+	/* same 1G ? */
+	cmp     %r10, %rax
+	je	set_level2_start_only_not_first_512g
+
+	/* set level2 for end */
+	leaq    level3_spare_pgt(%rip), %rbx
+	leaq    (level2_spare2_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	movq    %rdx, 0(%rbx, %r10, 8)
+
+set_level2_start_only_not_first_512g:
+	leaq    level3_spare_pgt(%rip), %rbx
+	leaq    (level2_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	movq    %rdx, 0(%rbx, %rax, 8)
+
+	jmp	set_level2_spare
+
+set_level3_other_512g:
+	/* start is in first 512G ? */
+	cmp	$0, %rax
+	/* for level2 last on first 512g */
+	leaq	level3_ident_pgt(%rip), %rcx
+	je	set_level2_start_other_512g
+
+	/* Set level3 for _text */
+	leaq	(level3_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	leaq	init_level4_pgt(%rip), %rbx
+	movq	%rdx, 0(%rbx, %rax, 8)
+	addq	$L4_PAGE_OFFSET, %rax
+	movq	%rdx, 0(%rbx, %rax, 8)
 
+	/* for level2 last not on first 512G */
+	leaq	level3_spare_pgt(%rip), %rcx
+
+set_level2_start_other_512g:
+	/* always need to set level2 */
 	movq	%rdi, %rax
 	shrq	$PUD_SHIFT, %rax
 	andq	$(PTRS_PER_PUD - 1), %rax
-	jz	ident_complete
-
+	movq	%rcx, %rbx    /* %rcx has level3_spare_pgt or level3_ident_pgt */
 	leaq	(level2_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	movq	%rdx, 0(%rbx, %rax, 8)
+
+set_level3_end_other_512g:
+	leaq	(level3_spare2_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	leaq	init_level4_pgt(%rip), %rbx
+	movq	%rdx, 0(%rbx, %r10, 8)
+	addq	$L4_PAGE_OFFSET, %r10
+	movq	%rdx, 0(%rbx, %r10, 8)
+
+	/* always need to set level2 */
+	movq	%r9, %r10
+	shrq	$PUD_SHIFT, %r10
+	andq	$(PTRS_PER_PUD - 1), %r10
+	leaq	level3_spare2_pgt(%rip), %rbx
+	leaq	(level2_spare2_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	movq	%rdx, 0(%rbx, %r10, 8)
+
+	jmp	set_level2_spare
+
+skip_level3_spare:
+	/* get 1G index */
+	movq	%r9, %r10
+	shrq	$PUD_SHIFT, %r10
+	andq	$(PTRS_PER_PUD - 1), %r10
+	movq	%rdi, %rax
+	shrq	$PUD_SHIFT, %rax
+	andq	$(PTRS_PER_PUD - 1), %rax
+
+	/* same 1G ? */
+	cmp	%r10, %rax
+	je	set_level2_start_only_first_512g
+
+	/* set level2 without level3 spare */
 	leaq	level3_ident_pgt(%rip), %rbx
+	leaq	(level2_spare2_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
+	movq	%rdx, 0(%rbx, %r10, 8)
+
+set_level2_start_only_first_512g:
+	/*  set level2 without level3 spare */
+	leaq	level3_ident_pgt(%rip), %rbx
+	leaq	(level2_spare_pgt - __START_KERNEL_map + _KERNPG_TABLE)(%rbp), %rdx
 	movq	%rdx, 0(%rbx, %rax, 8)
 
+set_level2_spare:
 	movq	%rdi, %rax
 	shrq	$PMD_SHIFT, %rax
 	andq	$(PTRS_PER_PMD - 1), %rax
 	leaq	__PAGE_KERNEL_IDENT_LARGE_EXEC(%rdi), %rdx
 	leaq	level2_spare_pgt(%rip), %rbx
-	leaq	_end(%rip), %r8
-	decq	%r8
+	movq	%r9, %r8
 	shrq	$PMD_SHIFT, %r8
 	andq	$(PTRS_PER_PMD - 1), %r8
+	cmp	%r8, %rax
+	jl	1f
+	addq	$PTRS_PER_PMD, %r8
 1:	movq	%rdx, 0(%rbx, %rax, 8)
 	addq	$PMD_SIZE, %rdx
 	incq	%rax
@@ -435,8 +548,15 @@ NEXT_PAGE(level2_kernel_pgt)
 	PMDS(0, __PAGE_KERNEL_LARGE_EXEC,
 		KERNEL_IMAGE_SIZE/PMD_SIZE)
 
+NEXT_PAGE(level3_spare_pgt)
+	.fill   512, 8, 0
+NEXT_PAGE(level3_spare2_pgt)
+	.fill   512, 8, 0
+
 NEXT_PAGE(level2_spare_pgt)
 	.fill   512, 8, 0
+NEXT_PAGE(level2_spare2_pgt)
+	.fill   512, 8, 0
 
 #undef PMDS
 #undef NEXT_PAGE


* Re: [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G
  2012-11-21 20:07       ` Vivek Goyal
@ 2012-11-22 11:39         ` Eric W. Biederman
  0 siblings, 0 replies; 25+ messages in thread
From: Eric W. Biederman @ 2012-11-22 11:39 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Haren Myneni, Simon Horman, Yinghai Lu, kexec, H. Peter Anvin

Vivek Goyal <vgoyal@redhat.com> writes:

> On Wed, Nov 21, 2012 at 11:50:56AM -0800, Yinghai Lu wrote:
>> On Wed, Nov 21, 2012 at 6:50 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> >
>> > So how do I force a 16bit or 32bit entry using a bzImage64?
>> 
>> kexec -t bzImage -l ....
>> will load low and use 32bit entry.
>> 
>> kexec -t bzImage64 -l ...
>> kexec -l ...
>> will try to load high and use 64bit entry.
>
> Also bzImage64 is not really a new image format. It is just enhancement
> of bzImage format. We keep on doing extention of bzImage and don't call it
> a new image format. I am not sure how good an idea it is to export the
> notion of new image type bzImage64 to user.

For what the loader has to do bzImage64 is effectively a new format,
and one way or another needs to be handled by separate functions so that
the code remains readable.

I asked YH to add the code that way because it means only a 64bit kexec
has to carry that code and we don't have 64bit dependencies in the
32bit build that will break.  The code to prepare boot_params is still
shared.

Chaining to the 32bit loader if someone asks for --real-mode or
perhaps --entry32 seems quite reasonable and is only a couple of lines
of code, so I have no objections to that.

While development is on-going forcing the image type seems very
reasonable, but when the smoke clears I would like to see
the bzImage64 format chaining to bzImage for the cases it does not
handle.

But with respect to autodetection, only having bzImage64 kick in when
loading a 64bit kernel is possible looks like the right way to go.
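
The autodetection condition boils down to the three header tests in the
quoted bzImage64_probe. A condensed sketch, using a made-up struct that
holds only the already-parsed fields (it is not the real
x86_linux_header layout):

```c
#include <stdint.h>

/* Hypothetical subset of the boot header fields; not the real layout. */
struct hdr_fields {
	uint16_t protocol_version; /* 0x020C means boot protocol 2.12 */
	uint8_t  loadflags;        /* bit 0 set: bzImage rather than zImage */
	uint16_t xloadflags;       /* bit 0: the flag tested by the quoted probe */
};

/* Return 1 if the 64bit loader should claim the image, else 0. */
int is_bzimage64(const struct hdr_fields *h)
{
	if (h->protocol_version < 0x020C)
		return 0;	/* header too old to carry xloadflags */
	if ((h->loadflags & 1) == 0)
		return 0;	/* zImage, not a bzImage */
	return (h->xloadflags & 1) != 0;
}
```

Images that fail any of these tests would fall through to the existing
bzImage probe, as before.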

Eric





Thread overview: 25+ messages
2012-11-21  7:31 [PATCH v3 0/4] kexec: put bzImage and ramdisk above 4G for x86 64bit Yinghai Lu
2012-11-21  7:31 ` [PATCH v3 1/4] kexec, x86: add boot header member for version 2.12 Yinghai Lu
2012-11-21  7:31 ` [PATCH v3 2/4] kexec, x86: put ramdisk high for 64bit bzImage Yinghai Lu
2012-11-21  7:31 ` [PATCH v3 3/4] kexec, x86: set ext_cmd_line_ptr when boot_param is above 4g Yinghai Lu
2012-11-21  7:31 ` [PATCH v3 4/4] kexec, x86_64: Load bzImage64 above 4G Yinghai Lu
2012-11-21 14:37   ` Vivek Goyal
2012-11-21 17:24     ` H. Peter Anvin
2012-11-21 19:54     ` Yinghai Lu
2012-11-21 19:56       ` H. Peter Anvin
2012-11-21 20:01         ` Yinghai Lu
2012-11-21 20:16           ` H. Peter Anvin
2012-11-21 20:47             ` Yinghai Lu
2012-11-21 20:56               ` H. Peter Anvin
2012-11-21 23:34               ` H. Peter Anvin
2012-11-22  5:52                 ` Yinghai Lu
2012-11-21 14:50   ` Vivek Goyal
2012-11-21 19:50     ` Yinghai Lu
2012-11-21 19:52       ` H. Peter Anvin
2012-11-21 19:57         ` Yinghai Lu
2012-11-21 20:00       ` Vivek Goyal
2012-11-21 20:09         ` Yinghai Lu
2012-11-21 20:12           ` Vivek Goyal
2012-11-21 20:17             ` Yinghai Lu
2012-11-21 20:07       ` Vivek Goyal
2012-11-22 11:39         ` Eric W. Biederman
