linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/10] paravirt/subarchitecture boot protocol
@ 2007-06-15  0:48 Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 01/10] update boot spec to 2.07 Jeremy Fitzhardinge
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

This series updates the boot protocol to 2.07 and uses it to implement
paravirtual booting.  This allows the bootloader to tell the kernel
what kind of hardware/pseudo-hardware environment it's coming up under,
and the kernel can use the appropriate boot sequence code.

Specifically:

- Update the boot protocol to 2.07, which adds fields to specify the
  hardware subarchitecture and some subarchitecture-specific data.
  It also specifies a flag to tell the boot code to avoid reloading
  segment registers and playing with interrupt state, since it may not
  have a visible gdt and/or may not be running in ring 0.
  (Note: the segment reload and interrupt flags are conflated into one
   flag, but they are not really related.  We could have two flags, but
   the "cli" is probably completely redundant anyway, since the bootloader
   would have to be completely mad to start the kernel with interrupts
   enabled.)

- Change the format of bzImage to contain an ELF file.  The initial part of
  the bzImage is still the boot_params header followed by the 16-bit
  setup code needed for booting from BIOS.  But rather than having
  the self-decompressing kernel follow as a naked blob of code+data,
  it's actually wrapped in a page-aligned ELF file.  This allows the
  bootloader to extract it and parse it, and use that to know what
  memory the booting kernel will need initially.  Xen and lguest need
  this because they start the kernel with paging enabled, and so need
  to know what initial mappings to create.

- Modify the kernel boot sequence to:
  1. avoid reloading the segment state (gdt and segment registers) if the
     bootloader asks it to
  2. jump to a subarchitecture-specific entrypoint in kernel/head.S.

- Add Xen-specific startup code, which mainly remaps the kernel from
  its P=V identity mapping to the normal PAGE_OFFSET mapping.

  One open issue is that I haven't made the normal head.S initial
  pagetable construction code generally reusable.  The default
  boot-on-normal x86 hardware still uses it of course, but other
  subarchitectures like Voyager and lguest could probably use it as-is,
  while still needing to do other specialized things.

  The obvious fix is to make it a callable function, but we don't
  generally assume there's a stack available at this early stage.
  It looks like it would be easy to set one up though.
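
As a rough illustration of the boot-sequence changes listed above (a
sketch only, not code from this series): the real dispatch happens in
assembly in kernel/head.S, and the entry-point names below are
hypothetical.  The subarchitecture values and the KEEP_SEGMENTS bit are
the ones proposed for protocol 2.07.  Returning NULL for an unknown
subarchitecture is an assumption of the sketch.

```c
#include <stdint.h>
#include <stddef.h>

/* Subarchitecture values proposed for boot protocol 2.07. */
enum { SUBARCH_PC = 0, SUBARCH_LGUEST = 1, SUBARCH_XEN = 2 };

#define KEEP_SEGMENTS (1u << 6)	/* loadflags bit 6 */

/* Hypothetical names standing in for the per-subarchitecture entry points. */
static const char *const subarch_entry_names[] = {
	[SUBARCH_PC]     = "default_entry",
	[SUBARCH_LGUEST] = "lguest_entry",
	[SUBARCH_XEN]    = "xen_entry",
};

/* Nonzero if the 32-bit entry code should reload the gdt and segment registers. */
int should_reload_segments(uint8_t loadflags)
{
	return (loadflags & KEEP_SEGMENTS) == 0;
}

/* Pick an entry point by subarchitecture; NULL means "unknown" (assumed policy). */
const char *select_subarch_entry(uint32_t hardware_subarch)
{
	size_t n = sizeof(subarch_entry_names) / sizeof(subarch_entry_names[0]);

	if (hardware_subarch >= n)
		return NULL;
	return subarch_entry_names[hardware_subarch];
}
```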

As a pre-requisite for all the above, I've also cleaned up the process
to generate the bzImage.  I've eliminated the need for the tools/build
program, and instead use the linker to do more heavy lifting.  I've also
removed some somewhat obscure uses of ld and objcopy to wrap binary files
in ELF .o wrappers, and replaced them with .S files containing .incbin.

The downside is that it's making a bit more complex use of linker scripts,
which always opens scope for finding more binutils bugs.  Only one way
to find out...

Tested to check the generated kernels boot under qemu's internal bootloader
and grub, as well as booting under Xen (with an appropriate update to
the Xen domain builder).

TODO:
 - poke Rusty into implementing the lguest bits
 - clean up all the ELF headers to make them easier to reuse in the
   boot-code compile environment, in order to
 - remove all the .S files specifying ELF structures, and use .c instead
   (which would be a general cleanup; I don't think we need to specify
    ELF notes in .S files at all.)

	J
-- 



* [PATCH 01/10] update boot spec to 2.07
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 02/10] allow linux/elf.h to be included in assembler Jeremy Fitzhardinge
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: update-boot-spec.patch --]
[-- Type: text/plain, Size: 4840 bytes --]

Proposed updates for version 2.07 of the boot protocol.  This includes:

loadflags.KEEP_SEGMENTS	- flag to request/inhibit segment reloads
hardware_subarch	- what subarchitecture we're booting under
hardware_subarch_data	- per-subarchitecture data

These changes are intended to make booting a paravirtualized kernel
work via the normal Linux boot protocol.  The bzImage payload can then
be a properly formed ELF file, so that the bootloader can use its ELF
notes and Phdrs to get more metadata about the kernel and its
requirements.

The ELF file could be the uncompressed kernel vmlinux itself; it would
only take small buildsystem changes to implement this.
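
For illustration, a bootloader-side sketch of how the new fields might
be filled in.  This is not code from the patch: the function name
prepare_paravirt_boot is hypothetical, and always setting KEEP_SEGMENTS
is an assumption that suits a paravirtual loader.  The offsets are the
ones listed in Documentation/i386/boot.txt; a real loader would also
check the "HdrS" signature first, which is omitted here.

```c
#include <stdint.h>
#include <stddef.h>

/* Header offsets from Documentation/i386/boot.txt. */
#define HDR_VERSION		0x206	/* 2 bytes, e.g. 0x0207 */
#define HDR_LOADFLAGS		0x211	/* 1 byte */
#define HDR_HARDWARE_SUBARCH	0x23c	/* 4 bytes, 2.07+ */
#define HDR_SUBARCH_DATA	0x240	/* 8 bytes, 2.07+ */

#define KEEP_SEGMENTS		(1u << 6)

/* All header fields are little-endian. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static void put_le(uint8_t *p, uint64_t v, size_t nbytes)
{
	size_t i;

	for (i = 0; i < nbytes; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Fill in the 2.07 fields of a setup header held in buf.
 * Returns 0 on success, -1 if the buffer is too short or the kernel
 * advertises a protocol older than 2.07.
 */
int prepare_paravirt_boot(uint8_t *buf, size_t len,
			  uint32_t subarch, uint64_t subarch_data)
{
	if (len < HDR_SUBARCH_DATA + 8)
		return -1;
	if (get_le16(buf + HDR_VERSION) < 0x0207)
		return -1;

	buf[HDR_LOADFLAGS] |= KEEP_SEGMENTS;
	put_le(buf + HDR_HARDWARE_SUBARCH, subarch, 4);
	put_le(buf + HDR_SUBARCH_DATA, subarch_data, 8);
	return 0;
}
```

The version check matters because older kernels do not reserve these
offsets for the new fields.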

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>

---
 Documentation/i386/boot.txt    |   34 +++++++++++++++++++++++++++++++++-
 arch/i386/kernel/asm-offsets.c |    7 +++++++
 include/asm-i386/bootparam.h   |    9 +++++++--
 3 files changed, 47 insertions(+), 3 deletions(-)

===================================================================
--- a/Documentation/i386/boot.txt
+++ b/Documentation/i386/boot.txt
@@ -168,6 +168,8 @@ 0234/1	2.05+	relocatable_kernel Whether 
 0234/1	2.05+	relocatable_kernel Whether kernel is relocatable or not
 0235/3	N/A	pad2		Unused
 0238/4	2.06+	cmdline_size	Maximum size of the kernel command line
+023C/4	2.07+	hardware_subarch Hardware subarchitecture
+0240/8	2.07+	hardware_subarch_data Subarchitecture-specific data
 
 (1) For backwards compatibility, if the setup_sects field contains 0, the
     real value is 4.
@@ -204,7 +206,7 @@ boot loaders can ignore those fields.
 
 The byte order of all fields is littleendian (this is x86, after all.)
 
-Field name:	setup_secs
+Field name:	setup_sects
 Type:		read
 Offset/size:	0x1f1/1
 Protocol:	ALL
@@ -356,6 +358,13 @@ Protocol:	2.00+
 	- If 0, the protected-mode code is loaded at 0x10000.
 	- If 1, the protected-mode code is loaded at 0x100000.
 
+  Bit 6 (write): KEEP_SEGMENTS
+	Protocol: 2.07+
+	- if 0, reload the segment registers in the 32bit entry point.
+	- if 1, do not reload the segment registers in the 32bit entry point.
+		Assume that %cs %ds %ss %es are all set to flat segments with
+		a base of 0 (or the equivalent for their environment).
+
   Bit 7 (write): CAN_USE_HEAP
 	Set this bit to 1 to indicate that the value entered in the
 	heap_end_ptr is valid.  If this field is clear, some setup code
@@ -479,6 +488,29 @@ Protocol:	2.06+
   zero. This means that the command line can contain at most
   cmdline_size characters. With protocol version 2.05 and earlier, the
   maximum size was 255.
+
+Field name:	hardware_subarch
+Type:		write
+Offset/size:	0x23c/4
+Protocol:	2.07+
+
+  In a paravirtualized environment the hardware low level architectural
+  pieces such as interrupt handling, page table handling, and
+  accessing process control registers needs to be done differently.
+
+  This field allows the bootloader to inform the kernel we are in
+  one of those environments.
+
+  0x00000000	The default x86/PC environment
+  0x00000001	lguest
+  0x00000002	Xen
+
+Field name:	hardware_subarch_data
+Type:		write
+Offset/size:	0x240/8
+Protocol:	2.07+
+
+  A pointer to data that is specific to hardware subarch
 
 
 **** THE KERNEL COMMAND LINE
===================================================================
--- a/arch/i386/kernel/asm-offsets.c
+++ b/arch/i386/kernel/asm-offsets.c
@@ -15,6 +15,7 @@
 #include <asm/fixmap.h>
 #include <asm/processor.h>
 #include <asm/thread_info.h>
+#include <asm/bootparam.h>
 #include <asm/elf.h>
 #include <xen/interface/xen.h>
 
@@ -143,4 +144,10 @@ void foo(void)
 	OFFSET(LGUEST_PAGES_regs_errcode, lguest_pages, regs.errcode);
 	OFFSET(LGUEST_PAGES_regs, lguest_pages, regs);
 #endif
+
+	BLANK();
+	OFFSET(BP_scratch, boot_params, scratch);
+	OFFSET(BP_loadflags, boot_params, hdr.loadflags);
+	OFFSET(BP_hardware_subarch, boot_params, hdr.hardware_subarch);
+	OFFSET(BP_version, boot_params, hdr.version);
 }
===================================================================
--- a/include/asm-i386/bootparam.h
+++ b/include/asm-i386/bootparam.h
@@ -24,8 +24,9 @@ struct setup_header {
 	u16	kernel_version;
 	u8	type_of_loader;
 	u8	loadflags;
-#define LOADED_HIGH	0x01
-#define CAN_USE_HEAP	0x80
+#define LOADED_HIGH	(1<<0)
+#define KEEP_SEGMENTS	(1<<6)
+#define CAN_USE_HEAP	(1<<7)
 	u16	setup_move_size;
 	u32	code32_start;
 	u32	ramdisk_image;
@@ -37,6 +38,10 @@ struct setup_header {
 	u32	initrd_addr_max;
 	u32	kernel_alignment;
 	u8	relocatable_kernel;
+	u8	_pad2[3];
+	u32	cmdline_size;
+	u32	hardware_subarch;
+	u64	hardware_subarch_data;
 } __attribute__((packed));
 
 struct sys_desc_table {

-- 



* [PATCH 02/10] allow linux/elf.h to be included in assembler
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 01/10] update boot spec to 2.07 Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 03/10] define ELF notes for adding to a boot image Jeremy Fitzhardinge
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: asm-elf_h.patch --]
[-- Type: text/plain, Size: 3727 bytes --]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>

---
 include/linux/elf.h |   24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

===================================================================
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -1,9 +1,10 @@
 #ifndef _LINUX_ELF_H
 #define _LINUX_ELF_H
 
+#include <linux/elf-em.h>
+#ifndef __ASSEMBLY__
 #include <linux/types.h>
 #include <linux/auxvec.h>
-#include <linux/elf-em.h>
 #include <asm/elf.h>
 
 struct file;
@@ -31,6 +32,7 @@ typedef __u32	Elf64_Word;
 typedef __u32	Elf64_Word;
 typedef __u64	Elf64_Xword;
 typedef __s64	Elf64_Sxword;
+#endif	/* __ASSEMBLY__ */
 
 /* These constants are for the segment types stored in the image headers */
 #define PT_NULL    0
@@ -123,6 +125,7 @@ typedef __s64	Elf64_Sxword;
 #define ELF64_ST_BIND(x)	ELF_ST_BIND(x)
 #define ELF64_ST_TYPE(x)	ELF_ST_TYPE(x)
 
+#ifndef __ASSEMBLY__
 typedef struct dynamic{
   Elf32_Sword d_tag;
   union{
@@ -138,6 +141,7 @@ typedef struct {
     Elf64_Addr d_ptr;
   } d_un;
 } Elf64_Dyn;
+#endif	/* __ASSEMBLY__ */
 
 /* The following are used with relocations */
 #define ELF32_R_SYM(x) ((x) >> 8)
@@ -146,6 +150,7 @@ typedef struct {
 #define ELF64_R_SYM(i)			((i) >> 32)
 #define ELF64_R_TYPE(i)			((i) & 0xffffffff)
 
+#ifndef __ASSEMBLY__
 typedef struct elf32_rel {
   Elf32_Addr	r_offset;
   Elf32_Word	r_info;
@@ -185,11 +190,12 @@ typedef struct elf64_sym {
   Elf64_Addr st_value;		/* Value of the symbol */
   Elf64_Xword st_size;		/* Associated symbol size */
 } Elf64_Sym;
-
+#endif	/* __ASSEMBLY__ */
 
 #define EI_NIDENT	16
 
-typedef struct elf32_hdr{
+#ifndef __ASSEMBLY__
+typedef struct elf32_hdr {
   unsigned char	e_ident[EI_NIDENT];
   Elf32_Half	e_type;
   Elf32_Half	e_machine;
@@ -222,6 +228,7 @@ typedef struct elf64_hdr {
   Elf64_Half e_shnum;
   Elf64_Half e_shstrndx;
 } Elf64_Ehdr;
+#endif	/* __ASSEMBLY__ */
 
 /* These constants define the permissions on sections in the program
    header, p_flags. */
@@ -229,7 +236,8 @@ typedef struct elf64_hdr {
 #define PF_W		0x2
 #define PF_X		0x1
 
-typedef struct elf32_phdr{
+#ifndef __ASSEMBLY__
+typedef struct elf32_phdr {
   Elf32_Word	p_type;
   Elf32_Off	p_offset;
   Elf32_Addr	p_vaddr;
@@ -250,6 +258,7 @@ typedef struct elf64_phdr {
   Elf64_Xword p_memsz;		/* Segment size in memory */
   Elf64_Xword p_align;		/* Segment alignment, file & memory */
 } Elf64_Phdr;
+#endif	/* __ASSEMBLY__ */
 
 /* sh_type */
 #define SHT_NULL	0
@@ -284,7 +293,8 @@ typedef struct elf64_phdr {
 #define SHN_ABS		0xfff1
 #define SHN_COMMON	0xfff2
 #define SHN_HIRESERVE	0xffff
- 
+
+#ifndef __ASSEMBLY__
 typedef struct {
   Elf32_Word	sh_name;
   Elf32_Word	sh_type;
@@ -310,6 +320,7 @@ typedef struct elf64_shdr {
   Elf64_Xword sh_addralign;	/* Section alignment */
   Elf64_Xword sh_entsize;	/* Entry size if section holds table */
 } Elf64_Shdr;
+#endif	/* __ASSEMBLY__ */
 
 #define	EI_MAG0		0		/* e_ident[] indexes */
 #define	EI_MAG1		1
@@ -343,6 +354,7 @@ typedef struct elf64_shdr {
 
 #define ELFOSABI_NONE	0
 #define ELFOSABI_LINUX	3
+#define ELFOSABI_STANDALONE	255
 
 #ifndef ELF_OSABI
 #define ELF_OSABI ELFOSABI_NONE
@@ -357,6 +369,7 @@ typedef struct elf64_shdr {
 #define NT_PRXFPREG     0x46e62b7f      /* copied from gdb5.1/include/elf/common.h */
 
 
+#ifndef __ASSEMBLY__
 /* Note header in a PT_NOTE section */
 typedef struct elf32_note {
   Elf32_Word	n_namesz;	/* Name size */
@@ -396,5 +409,6 @@ static inline void arch_write_notes(stru
 #define ELF_CORE_EXTRA_NOTES_SIZE arch_notes_size()
 #define ELF_CORE_WRITE_EXTRA_NOTES arch_write_notes(file)
 #endif /* ARCH_HAVE_EXTRA_ELF_NOTES */
+#endif	/* __ASSEMBLY__ */
 
 #endif /* _LINUX_ELF_H */

-- 



* [PATCH 03/10] define ELF notes for adding to a boot image
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 01/10] update boot spec to 2.07 Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 02/10] allow linux/elf.h to be included in assembler Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15 16:20   ` H. Peter Anvin
  2007-06-15  0:48 ` [PATCH 04/10] i386: clean up bzImage generation Jeremy Fitzhardinge
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: elf-bootnotes.patch --]
[-- Type: text/plain, Size: 890 bytes --]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>

---
 include/linux/elf_boot.h |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

===================================================================
--- /dev/null
+++ b/include/linux/elf_boot.h
@@ -0,0 +1,15 @@
+#ifndef ELF_BOOT_H
+#define ELF_BOOT_H
+
+/* Elf notes to help bootloaders identify what program they are booting.
+ */
+
+/* Standardized Elf image notes for booting... The name for all of these is ELFBoot */
+#define ELF_NOTE_BOOT		ELFBoot
+
+#define EIN_PROGRAM_NAME	1 /* The program in this ELF file */
+#define EIN_PROGRAM_VERSION	2 /* The version of the program in this ELF file */
+#define EIN_PROGRAM_CHECKSUM	3 /* ip style checksum of the memory image. */
+#define EIN_ARGUMENT_STYLE	4 /* String identifying argument passing style */
+
+#endif /* ELF_BOOT_H */

-- 



* [PATCH 04/10] i386: clean up bzImage generation
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
                   ` (2 preceding siblings ...)
  2007-06-15  0:48 ` [PATCH 03/10] define ELF notes for adding to a boot image Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15 16:20   ` H. Peter Anvin
  2007-06-15  0:48 ` [PATCH 05/10] i386: make the bzImage payload an ELF file Jeremy Fitzhardinge
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: i386-cleanup-image-generation.patch --]
[-- Type: text/plain, Size: 13399 bytes --]

This patch cleans up image generation in several ways:
 - Firstly, it removes tools/build, and uses binutils to do all the
   final construction of the bzImage.  This removes a chunk of code
   and makes the image generation more flexible, since we can compute
   various numbers rather than be forced to use fixed constants.

 - Rename compressed/vmlinux to compressed/blob, to make it a
   bit clearer that it's the compressed kernel image + decompressor
   (now all the files named "vmlinux*" are directly derived from
   the kernel vmlinux).

 - Rather than using objcopy to wrap the compressed kernel into an
   object file, simply use the assembler: payload.S does a .incbin
   of the blob.bin file, which allows us to easily place
   it into a section, and it makes the Makefile dependency a little
   clearer.

 - Similarly, use the same technique to create compressed/piggy.o,
   which cleans things up even more, since the .S file can also
   set the input_len and output_len symbols without further linker
   script hackery; it also removes a complete linker script.

 - Also, remove stray "sti"
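
The "various numbers" that the linker now computes are plain size
arithmetic.  A small C rendering of the expressions used in setup.ld
in this patch (sector alignment, sector count, and size in 16-byte
paragraphs) might look like the following; the function names are
hypothetical and exist only for this sketch.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Round a byte count up to a whole number of sectors, like ALIGN(512). */
uint32_t align_up_sector(uint32_t size)
{
	return (size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE - 1);
}

/* Number of sectors after alignment, like _setup_size / 512. */
uint32_t size_in_sectors(uint32_t size)
{
	return align_up_sector(size) / SECTOR_SIZE;
}

/* Size in 16-byte paragraphs, rounded up, like (kernel_size + 15) / 16. */
uint32_t size_in_paragraphs(uint32_t size)
{
	return (size + 15) / 16;
}
```

The paragraph rounding is the same computation tools/build performed
for sys_size before this patch removed it.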

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>

---
 arch/i386/boot/Makefile               |   31 +-----
 arch/i386/boot/compressed/Makefile    |   13 --
 arch/i386/boot/compressed/piggy.S     |   10 +
 arch/i386/boot/compressed/vmlinux.scr |   10 -
 arch/i386/boot/header.S               |    7 -
 arch/i386/boot/payload.S              |    3 
 arch/i386/boot/setup.ld               |   37 ++++---
 arch/i386/boot/tools/.gitignore       |    1 
 arch/i386/boot/tools/build.c          |  168 ---------------------------------
 9 files changed, 55 insertions(+), 225 deletions(-)

===================================================================
--- a/arch/i386/boot/Makefile
+++ b/arch/i386/boot/Makefile
@@ -25,12 +25,13 @@ SVGA_MODE := -DSVGA_MODE=NORMAL_VGA
 
 #RAMDISK := -DRAMDISK=512
 
-targets		:= vmlinux.bin setup.bin setup.elf zImage bzImage
+targets		:= blob.bin setup.elf zImage bzImage
 subdir- 	:= compressed
 
 setup-y		+= a20.o apm.o cmdline.o copy.o cpu.o cpucheck.o edd.o
-setup-y		+= header.o main.o mca.o memory.o pm.o pmjump.o
-setup-y		+= printf.o string.o tty.o video.o version.o voyager.o
+setup-y		+= header.o main.o mca.o memory.o payload.o pm.o
+setup-y		+= pmjump.o printf.o string.o tty.o video.o version.o
+setup-y		+= voyager.o
 
 # The link order of the video-*.o modules can matter.  In particular,
 # video-vga.o *must* be listed first, followed by video-vesa.o.
@@ -39,10 +40,6 @@ setup-y		+= video-vga.o
 setup-y		+= video-vga.o
 setup-y		+= video-vesa.o
 setup-y		+= video-bios.o
-
-hostprogs-y	:= tools/build
-
-HOSTCFLAGS_build.o := $(LINUXINCLUDE)
 
 # ---------------------------------------------------------------------------
 
@@ -65,18 +62,12 @@ AFLAGS		:= $(CFLAGS) -D__ASSEMBLY__
 $(obj)/bzImage: IMAGE_OFFSET := 0x100000
 $(obj)/bzImage: EXTRA_CFLAGS := -D__BIG_KERNEL__
 $(obj)/bzImage: EXTRA_AFLAGS := $(SVGA_MODE) $(RAMDISK) -D__BIG_KERNEL__
-$(obj)/bzImage: BUILDFLAGS   := -b
 
-quiet_cmd_image = BUILD   $@
-cmd_image = $(obj)/tools/build $(BUILDFLAGS) $(obj)/setup.bin \
-	    $(obj)/vmlinux.bin $(ROOT_DEV) > $@
-
-$(obj)/zImage $(obj)/bzImage: $(obj)/setup.bin \
-			      $(obj)/vmlinux.bin $(obj)/tools/build FORCE
-	$(call if_changed,image)
+$(obj)/zImage $(obj)/bzImage: $(obj)/setup.elf FORCE
+	$(call if_changed,objcopy)
 	@echo 'Kernel: $@ is ready' ' (#'`cat .version`')'
 
-$(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
+$(obj)/blob.bin: $(obj)/compressed/blob FORCE
 	$(call if_changed,objcopy)
 
 SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
@@ -85,12 +76,10 @@ LDFLAGS_setup.elf	:= -T
 $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
 	$(call if_changed,ld)
 
-OBJCOPYFLAGS_setup.bin	:= -O binary
+$(obj)/payload.o:	EXTRA_AFLAGS := -Wa,-I$(obj)
+$(obj)/payload.o: $(src)/payload.S $(obj)/blob.bin
 
-$(obj)/setup.bin: $(obj)/setup.elf FORCE
-	$(call if_changed,objcopy)
-
-$(obj)/compressed/vmlinux: FORCE
+$(obj)/compressed/blob: FORCE
 	$(Q)$(MAKE) $(build)=$(obj)/compressed IMAGE_OFFSET=$(IMAGE_OFFSET) $@
 
 # Set this if you want to pass append arguments to the zdisk/fdimage/isoimage kernel
===================================================================
--- a/arch/i386/boot/compressed/Makefile
+++ b/arch/i386/boot/compressed/Makefile
@@ -4,11 +4,10 @@
 # create a compressed vmlinux image from the original vmlinux
 #
 
-targets		:= vmlinux vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \
+targets		:= blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \
 			vmlinux.bin.all vmlinux.relocs
-EXTRA_AFLAGS	:= -traditional
 
-LDFLAGS_vmlinux := -T
+LDFLAGS_blob	:= -T
 hostprogs-y	:= relocs
 
 CFLAGS  := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -O2 \
@@ -17,7 +16,7 @@ CFLAGS  := -m32 -D__KERNEL__ $(LINUX_INC
 	   $(call cc-option,-fno-stack-protector)
 LDFLAGS := -m elf_i386
 
-$(obj)/vmlinux: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE
+$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE
 	$(call if_changed,ld)
 	@:
 
@@ -44,7 +43,5 @@ else
 	$(call if_changed,gzip)
 endif
 
-LDFLAGS_piggy.o := -r --format binary --oformat elf32-i386 -T
-
-$(obj)/piggy.o: $(src)/vmlinux.scr $(obj)/vmlinux.bin.gz FORCE
-	$(call if_changed,ld)
+$(obj)/piggy.o:	EXTRA_AFLAGS := -Wa,-I$(obj)
+$(obj)/piggy.o: $(src)/piggy.S $(obj)/vmlinux.bin.gz
===================================================================
--- /dev/null
+++ b/arch/i386/boot/compressed/piggy.S
@@ -0,0 +1,10 @@
+.section .data.compressed,"a",@progbits
+
+.globl input_data, input_len, output_len
+
+input_len:	.long input_data_end - input_data
+
+input_data:
+.incbin "vmlinux.bin.gz"
+output_len = .-4
+input_data_end:
===================================================================
--- a/arch/i386/boot/compressed/vmlinux.scr
+++ /dev/null
@@ -1,10 +0,0 @@
-SECTIONS
-{
-  .data.compressed : {
-	input_len = .;
-	LONG(input_data_end - input_data) input_data = .; 
-	*(.data) 
-	output_len = . - 4;
-	input_data_end = .; 
-	}
-}
===================================================================
--- a/arch/i386/boot/header.S
+++ b/arch/i386/boot/header.S
@@ -97,9 +97,9 @@ bugger_off_msg:
 	.section ".header", "a"
 	.globl	hdr
 hdr:
-setup_sects:	.byte SETUPSECTS
+setup_sects:	.byte _setup_sects
 root_flags:	.word ROOT_RDONLY
-syssize:	.long SYSSIZE
+syssize:	.long kernel_size_para
 ram_size:	.word RAMDISK
 vid_mode:	.word SVGA_MODE
 root_dev:	.word ROOT_DEV
@@ -148,7 +148,7 @@ CAN_USE_HEAP	= 0x80			# If set, the load
 		.byte	LOADED_HIGH
 #endif
 
-setup_move_size: .word  0x8000		# size to move, when setup is not
+setup_move_size: .word  _setup_size	# size to move, when setup is not
 					# loaded at 0x90000. We will move setup
 					# to 0x90000 then just before jumping
 					# into the kernel. However, only the
@@ -246,7 +246,6 @@ setup2:
 	jnz	1f
 	movw	$0xfffc, %sp	# Make sure we're not zero
 1:	movzwl	%sp, %esp	# Clear upper half of %esp
-	sti
 
 # Check signature at end of setup
 	cmpl	$0x5a5aaa55, setup_sig
===================================================================
--- /dev/null
+++ b/arch/i386/boot/payload.S
@@ -0,0 +1,3 @@
+.section .kernel,"a",@progbits
+
+.incbin "blob.bin"
===================================================================
--- a/arch/i386/boot/setup.ld
+++ b/arch/i386/boot/setup.ld
@@ -3,18 +3,16 @@
  *
  * Linker script for the i386 setup code
  */
-OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
+OUTPUT_FORMAT("elf32-i386")
 OUTPUT_ARCH(i386)
 ENTRY(_start)
 
 SECTIONS
 {
-	. = 0;
-	.bstext		: { *(.bstext) }
+	.bstext 0	: { *(.bstext) }
 	.bsdata		: { *(.bsdata) }
 
-	. = 497;
-	.header		: { *(.header) }
+	.header 497	: { *(.header) }
 	.inittext	: { *(.inittext) }
 	.initdata	: { *(.initdata) }
 	.text		: { *(.text*) }
@@ -38,16 +36,29 @@ SECTIONS
 
 
 	. = ALIGN(16);
-	__bss_start = .;
-	.bss 		:
-	{
-		*(.bss)
-	}
-	. = ALIGN(16);
+	.bss ALIGN(16)	: {
+		__bss_start = .;
+		 *(.bss)
+		. = ALIGN(16);
+	 }
 	_end = .;
 
 	/DISCARD/ : { *(.note*) }
 
-	. = ASSERT(_end <= 0x8000, "Setup too big!");
-	. = ASSERT(hdr == 0x1f1, "The setup header has the wrong offset!");
+	. = ALIGN(512);		/* align to sector size */
+	_setup_size = . - _start;
+	_setup_sects = _setup_size / 512;
+
+	/* compressed kernel data */
+	.kernel	: {
+		kernel = .;
+		*(.kernel)
+		kernel_end = .;
+
+	}
+	kernel_size = kernel_end - kernel;
+	kernel_size_para = (kernel_size + 15) / 16;
 }
+
+ASSERT(_end <= 0x8000, "Setup too big!");
+ASSERT(hdr == 0x1f1, "The setup header has the wrong offset!");
===================================================================
--- a/arch/i386/boot/tools/.gitignore
+++ /dev/null
@@ -1,1 +0,0 @@
-build
===================================================================
--- a/arch/i386/boot/tools/build.c
+++ /dev/null
@@ -1,168 +0,0 @@
-/*
- *  Copyright (C) 1991, 1992  Linus Torvalds
- *  Copyright (C) 1997 Martin Mares
- *  Copyright (C) 2007 H. Peter Anvin
- */
-
-/*
- * This file builds a disk-image from three different files:
- *
- * - setup: 8086 machine code, sets up system parm
- * - system: 80386 code for actual system
- *
- * It does some checking that all files are of the correct type, and
- * just writes the result to stdout, removing headers and padding to
- * the right amount. It also writes some system data to stderr.
- */
-
-/*
- * Changes by tytso to allow root device specification
- * High loaded stuff by Hans Lermen & Werner Almesberger, Feb. 1996
- * Cross compiling fixes by Gertjan van Wingerde, July 1996
- * Rewritten by Martin Mares, April 1997
- * Substantially overhauled by H. Peter Anvin, April 2007
- */
-
-#include <stdio.h>
-#include <string.h>
-#include <stdlib.h>
-#include <stdarg.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <sys/sysmacros.h>
-#include <unistd.h>
-#include <fcntl.h>
-#include <sys/mman.h>
-#include <asm/boot.h>
-
-typedef unsigned char  u8;
-typedef unsigned short u16;
-typedef unsigned long  u32;
-
-#define DEFAULT_MAJOR_ROOT 0
-#define DEFAULT_MINOR_ROOT 0
-
-/* Minimal number of setup sectors */
-#define SETUP_SECT_MIN 5
-#define SETUP_SECT_MAX 64
-
-/* This must be large enough to hold the entire setup */
-u8 buf[SETUP_SECT_MAX*512];
-int is_big_kernel;
-
-static void die(const char * str, ...)
-{
-	va_list args;
-	va_start(args, str);
-	vfprintf(stderr, str, args);
-	fputc('\n', stderr);
-	exit(1);
-}
-
-static void usage(void)
-{
-	die("Usage: build [-b] setup system [rootdev] [> image]");
-}
-
-int main(int argc, char ** argv)
-{
-	unsigned int i, sz, setup_sectors;
-	int c;
-	u32 sys_size;
-	u8 major_root, minor_root;
-	struct stat sb;
-	FILE *file;
-	int fd;
-	void *kernel;
-
-	if (argc > 2 && !strcmp(argv[1], "-b"))
-	  {
-	    is_big_kernel = 1;
-	    argc--, argv++;
-	  }
-	if ((argc < 3) || (argc > 4))
-		usage();
-	if (argc > 3) {
-		if (!strcmp(argv[3], "CURRENT")) {
-			if (stat("/", &sb)) {
-				perror("/");
-				die("Couldn't stat /");
-			}
-			major_root = major(sb.st_dev);
-			minor_root = minor(sb.st_dev);
-		} else if (strcmp(argv[3], "FLOPPY")) {
-			if (stat(argv[3], &sb)) {
-				perror(argv[3]);
-				die("Couldn't stat root device.");
-			}
-			major_root = major(sb.st_rdev);
-			minor_root = minor(sb.st_rdev);
-		} else {
-			major_root = 0;
-			minor_root = 0;
-		}
-	} else {
-		major_root = DEFAULT_MAJOR_ROOT;
-		minor_root = DEFAULT_MINOR_ROOT;
-	}
-	fprintf(stderr, "Root device is (%d, %d)\n", major_root, minor_root);
-
-	/* Copy the setup code */
-	file = fopen(argv[1], "r");
-	if (!file)
-		die("Unable to open `%s': %m", argv[1]);
-	c = fread(buf, 1, sizeof(buf), file);
-	if (ferror(file))
-		die("read-error on `setup'");
-	if (c < 1024)
-		die("The setup must be at least 1024 bytes");
-	if (buf[510] != 0x55 || buf[511] != 0xaa)
-		die("Boot block hasn't got boot flag (0xAA55)");
-	fclose(file);
-
-	/* Pad unused space with zeros */
-	setup_sectors = (c + 511) / 512;
-	if (setup_sectors < SETUP_SECT_MIN)
-		setup_sectors = SETUP_SECT_MIN;
-	i = setup_sectors*512;
-	memset(buf+c, 0, i-c);
-
-	/* Set the default root device */
-	buf[508] = minor_root;
-	buf[509] = major_root;
-
-	fprintf(stderr, "Setup is %d bytes (padded to %d bytes).\n", c, i);
-
-	/* Open and stat the kernel file */
-	fd = open(argv[2], O_RDONLY);
-	if (fd < 0)
-		die("Unable to open `%s': %m", argv[2]);
-	if (fstat(fd, &sb))
-		die("Unable to stat `%s': %m", argv[2]);
-	sz = sb.st_size;
-	fprintf (stderr, "System is %d kB\n", (sz+1023)/1024);
-	kernel = mmap(NULL, sz, PROT_READ, MAP_SHARED, fd, 0);
-	if (kernel == MAP_FAILED)
-		die("Unable to mmap '%s': %m", argv[2]);
-	sys_size = (sz + 15) / 16;
-	if (!is_big_kernel && sys_size > DEF_SYSSIZE)
-		die("System is too big. Try using bzImage or modules.");
-
-	/* Patch the setup code with the appropriate size parameters */
-	buf[0x1f1] = setup_sectors-1;
-	buf[0x1f4] = sys_size;
-	buf[0x1f5] = sys_size >> 8;
-	buf[0x1f6] = sys_size >> 16;
-	buf[0x1f7] = sys_size >> 24;
-
-	if (fwrite(buf, 1, i, stdout) != i)
-		die("Writing setup failed");
-
-	/* Copy the kernel code */
-	if (fwrite(kernel, 1, sz, stdout) != sz)
-		die("Writing kernel failed");
-	close(fd);
-
-	/* Everything is OK */
-	return 0;
-}

-- 



* [PATCH 05/10] i386: make the bzImage payload an ELF file
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
                   ` (3 preceding siblings ...)
  2007-06-15  0:48 ` [PATCH 04/10] i386: clean up bzImage generation Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 06/10] add WEAK() for creating weak asm labels Jeremy Fitzhardinge
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel, James Bottomley

[-- Attachment #1: i386-bzimage-elf-payload.patch --]
[-- Type: text/plain, Size: 13318 bytes --]

This patch makes the payload of the bzImage file an ELF file.  In
other words, the bzImage is structured as follows:
 - boot sector
 - 16bit setup code
 - ELF header
  - decompressor
  - compressed kernel

A bootloader may find the start of the ELF file by looking at the
setup_size entry in the boot params, and using that to find the offset
of the ELF header.  The ELF Phdrs contain all the mapped memory
required to decompress and start booting the kernel.

One slightly complex part of this is that the bzImage boot_params need
to know about the internal structure of the ELF file, at least to the
extent of being able to point the code32_start entry at the ELF file's
entrypoint, so that loaders which use this field will still work.

Similarly, the ELF header needs to know how big the kernel vmlinux's
bss segment is, in order to make sure it is mapped properly.

To handle these two cases, we generate abstracted versions of the
object files which only contain the symbols we care about (generated
with objcopy --strip-all --keep-symbol=X), and then include those
symbol tables with ld -R.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 arch/i386/boot/Makefile               |   11 ++++--
 arch/i386/boot/compressed/Makefile    |   29 +++++++++++++--
 arch/i386/boot/compressed/elfhdr.S    |   60 +++++++++++++++++++++++++++++++++
 arch/i386/boot/compressed/head.S      |    9 ++--
 arch/i386/boot/compressed/notes.S     |    7 +++
 arch/i386/boot/compressed/piggy.S     |    5 +-
 arch/i386/boot/compressed/vmlinux.lds |   33 +++++++++++++++---
 arch/i386/boot/header.S               |    7 ---
 arch/i386/boot/setup.ld               |    5 ++
 arch/i386/kernel/asm-offsets.c        |    2 +
 arch/i386/kernel/head.S               |   26 +++++++++++---
 arch/i386/kernel/vmlinux.lds.S        |    4 +-
 12 files changed, 167 insertions(+), 31 deletions(-)

===================================================================
--- a/arch/i386/boot/Makefile
+++ b/arch/i386/boot/Makefile
@@ -72,14 +72,19 @@ AFLAGS		:= $(CFLAGS) -D__ASSEMBLY__
 
 SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
 
-LDFLAGS_setup.elf	:= -T
-$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
+$(obj)/zImage $(obj)/bzImage: 				\
+	LDFLAGS	:=					\
+		-R $(obj)/compressed/blob-syms		\
+		--defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T
+
+$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) 	\
+		$(obj)/compressed/blob-syms FORCE
 	$(call if_changed,ld)
 
 $(obj)/payload.o:	EXTRA_AFLAGS := -Wa,-I$(obj)
 $(obj)/payload.o: $(src)/payload.S $(obj)/blob.bin
 
-$(obj)/compressed/blob: FORCE
+$(obj)/compressed/blob $(obj)/compressed/blob-syms: FORCE
 	$(Q)$(MAKE) $(build)=$(obj)/compressed IMAGE_OFFSET=$(IMAGE_OFFSET) $@
 
 # Set this if you want to pass append arguments to the zdisk/fdimage/isoimage kernel
===================================================================
--- a/arch/i386/boot/compressed/Makefile
+++ b/arch/i386/boot/compressed/Makefile
@@ -4,21 +4,42 @@
 # create a compressed vmlinux image from the original vmlinux
 #
 
-targets		:= blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \
+targets		:= blob vmlinux.bin vmlinux.bin.gz \
+			elfhdr.o head.o misc.o notes.o piggy.o \
 			vmlinux.bin.all vmlinux.relocs
 
-LDFLAGS_blob	:= -T
 hostprogs-y	:= relocs
 
 CFLAGS  := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -O2 \
 	   -fno-strict-aliasing -fPIC \
 	   $(call cc-option,-ffreestanding) \
 	   $(call cc-option,-fno-stack-protector)
-LDFLAGS := -m elf_i386
+LDFLAGS := -R $(obj)/vmlinux-syms --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T
 
-$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE
+OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o piggy.o)
+
+$(obj)/blob: $(src)/vmlinux.lds $(obj)/vmlinux-syms $(OBJS) FORCE
 	$(call if_changed,ld)
 	@:
+
+# Generate a stripped-down object including only the symbols needed
+# so that we can get them with ld -R. Direct stderr to /dev/null to
+# shut useless warning up.
+quiet_cmd_symextract = SYMEXT $@
+      cmd_symextract = objcopy -S \
+			$(addprefix -j,$(EXTRACTSECTS)) \
+			$(addprefix -K,$(EXTRACTSYMS)) \
+			$< $@ 2>/dev/null
+
+$(obj)/blob-syms: EXTRACTSYMS := blob_entry blob_payload
+$(obj)/blob-syms: EXTRACTSECTS := .text.head .data.compressed
+$(obj)/blob-syms: $(obj)/blob FORCE
+	$(call if_changed,symextract)
+
+$(obj)/vmlinux-syms: EXTRACTSYMS := __kernel_end __kernel_data_size
+$(obj)/vmlinux-syms: EXTRACTSECTS :=
+$(obj)/vmlinux-syms: vmlinux FORCE
+	$(call if_changed,symextract)
 
 $(obj)/vmlinux.bin: vmlinux FORCE
 	$(call if_changed,objcopy)
===================================================================
--- /dev/null
+++ b/arch/i386/boot/compressed/elfhdr.S
@@ -0,0 +1,60 @@
+/* DIY ELF header */
+
+#include <linux/elf.h>
+#include <asm/boot.h>
+
+.section .elfhdr,"a",@progbits
+ehdr:
+	# e_ident
+	.byte ELFMAG0, ELFMAG1, ELFMAG2, ELFMAG3
+	.byte ELFCLASS32, ELFDATA2LSB, EV_CURRENT, ELFOSABI_STANDALONE
+	.org ehdr + EI_NIDENT
+#ifndef CONFIG_RELOCATABLE
+	.word ET_EXEC				# e_type
+#else
+	.word ET_DYN				# e_type
+#endif
+	.word EM_386				# e_machine
+	.int  1					# e_version
+	.int  LOAD_PHYSICAL_ADDR + blob_startup_32 - ehdr	# e_entry
+	.int  phdr - ehdr			# e_phoff
+	.int  0					# e_shoff
+	.int  0					# e_flags
+	.word ehdr_end - ehdr			# e_ehsize
+	.word phdr_size				# e_phentsize
+	.word phnum				# e_phnum
+	.word 40				# e_shentsize
+	.word 0					# e_shnum
+	.word 0					# e_shstrndx
+ehdr_end:
+
+phdr:
+	.int PT_LOAD					# p_type
+	.int 0						# p_offset
+	.int LOAD_PHYSICAL_ADDR				# p_vaddr
+	.int LOAD_PHYSICAL_ADDR				# p_paddr
+	.int blob_filesz				# p_filesz
+	.int blob_memsz					# p_memsz
+	.int PF_R | PF_W | PF_X				# p_flags
+	.int 4096					# p_align
+phdr_size = . - phdr
+
+	.int PT_NOTE					# p_type
+	.int _notes - ehdr				# p_offset
+	.int 0						# p_vaddr
+	.int 0						# p_paddr
+	.int blob_notesz				# p_filesz
+	.int 0						# p_memsz
+	.int 0						# p_flags
+	.int 0						# p_align
+
+	.int PT_PHDR					# p_type
+	.int phdr - ehdr				# p_offset
+	.int LOAD_PHYSICAL_ADDR + phdr - ehdr		# p_vaddr
+	.int LOAD_PHYSICAL_ADDR + phdr - ehdr		# p_paddr
+	.int phdr_end - phdr				# p_filesz
+	.int phdr_end - phdr				# p_memsz
+	.int PF_R | PF_W | PF_X				# p_flags
+	.int 0						# p_align
+phdr_end:
+phnum = (phdr_end - phdr) / phdr_size
===================================================================
--- a/arch/i386/boot/compressed/head.S
+++ b/arch/i386/boot/compressed/head.S
@@ -27,11 +27,12 @@
 #include <asm/segment.h>
 #include <asm/page.h>
 #include <asm/boot.h>
+#include <asm/asm-offsets.h>
 
 .section ".text.head","ax",@progbits
-	.globl startup_32
+	.globl blob_startup_32
 
-startup_32:
+blob_startup_32:
 	cld
 	cli
 	movl $(__BOOT_DS),%eax
@@ -48,7 +49,7 @@ startup_32:
  * data at 0x1e4 (defined as a scratch field) are used as the stack
  * for this calculation. Only 4 bytes are needed.
  */
-	leal (0x1e4+4)(%esi), %esp
+	leal (BP_scratch+4)(%esi), %esp
 	call 1f
 1:	popl %ebp
 	subl $1b, %ebp
@@ -85,7 +86,7 @@ 1:	popl %ebp
 	pushl %esi
 	leal _end(%ebp), %esi
 	leal _end(%ebx), %edi
-	movl $(_end - startup_32), %ecx
+	movl $(_end - blob_startup_32), %ecx
 	std
 	rep
 	movsb
===================================================================
--- /dev/null
+++ b/arch/i386/boot/compressed/notes.S
@@ -0,0 +1,7 @@
+#include <linux/elfnote.h>
+#include <linux/elf_boot.h>
+#include <linux/utsrelease.h>
+
+ELFNOTE(ELF_NOTE_BOOT, EIN_PROGRAM_NAME, .asciz "Linux")
+ELFNOTE(ELF_NOTE_BOOT, EIN_PROGRAM_VERSION, .asciz UTS_RELEASE)
+ELFNOTE(ELF_NOTE_BOOT, EIN_ARGUMENT_STYLE, .asciz "Linux")
===================================================================
--- a/arch/i386/boot/compressed/piggy.S
+++ b/arch/i386/boot/compressed/piggy.S
@@ -1,8 +1,9 @@
 .section .data.compressed,"a",@progbits
 
-.globl input_data, input_len, output_len
+.globl input_data, input_len, input_size, output_len
 
-input_len:	.long input_data_end - input_data
+input_size = input_data_end - input_data
+input_len:	.long input_size
 
 input_data:
 .incbin "vmlinux.bin.gz"
===================================================================
--- a/arch/i386/boot/compressed/vmlinux.lds
+++ b/arch/i386/boot/compressed/vmlinux.lds
@@ -1,18 +1,21 @@ OUTPUT_FORMAT("elf32-i386", "elf32-i386"
-OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
+OUTPUT_FORMAT("elf32-i386")
 OUTPUT_ARCH(i386)
-ENTRY(startup_32)
+
 SECTIONS
 {
-        /* Be careful parts of head.S assume startup_32 is at
-         * address 0.
-	 */
+	/* make sure we don't get anything from vmlinux-syms */
+	/DISCARD/ : { */vmlinux-syms(*) }
+
 	. =  0 	;
 	.text.head : {
+		*(.elfhdr)
 		_head = . ;
+	blob_entry = blob_startup_32 + IMAGE_OFFSET;
 		*(.text.head)
 		_ehead = . ;
 	}
 	.data.compressed : {
+	blob_payload = input_data + IMAGE_OFFSET;
 		*(.data.compressed)
 	}
 	.text :	{
@@ -33,6 +36,14 @@ SECTIONS
 		*(.data.*)
 		_edata = . ;
 	}
+	.notes : {
+		_notes = . ;
+		*(.note*)
+		_notes_end = .;
+	}
+	blob_notesz = _notes_end - _notes;
+
+	blob_filesz = . ;
 	.bss : {
 		_bss = . ;
 		*(.bss)
@@ -40,4 +51,16 @@ SECTIONS
 		*(COMMON)
 		_end = . ;
 	}
+
+	blob_notesz = _notes_end - _notes;
+
+	/* How much memory we need for decompression: */
+	blob_needs = . - IMAGE_OFFSET +	/* compressed data + decompressor */
+		__kernel_data_size +	/* uncompressed data */
+		(__kernel_data_size / 0x8000 * 8) + 0x8000 + 18; /* overhead */
+
+	/* Memory we need to reserve in PHDR:
+	   max of our needs and kernel's needs */
+	blob_memsz = blob_needs > __kernel_end ? blob_needs : __kernel_end;
+
 }
===================================================================
--- a/arch/i386/boot/header.S
+++ b/arch/i386/boot/header.S
@@ -155,13 +155,8 @@ setup_move_size: .word  _setup_size	# si
 					# loader knows how much data behind
 					# us also needs to be loaded.
 
-code32_start:				# here loaders can put a different
+code32_start:	.long blob_entry	# here loaders can put a different
 					# start address for 32-bit code.
-#ifndef __BIG_KERNEL__
-		.long	0x1000		#   0x1000 = default for zImage
-#else
-		.long	0x100000	# 0x100000 = default for big kernel
-#endif
 
 ramdisk_image:	.long	0		# address of loaded ramdisk image
 					# Here the loader puts the 32-bit
===================================================================
--- a/arch/i386/boot/setup.ld
+++ b/arch/i386/boot/setup.ld
@@ -9,6 +9,9 @@ ENTRY(_start)
 
 SECTIONS
 {
+	/* make sure we don't get anything from blob-syms */
+	/DISCARD/	: { */blob-syms(*) }
+
 	.bstext 0	: { *(.bstext) }
 	.bsdata		: { *(.bsdata) }
 
@@ -45,7 +48,7 @@ SECTIONS
 
 	/DISCARD/ : { *(.note*) }
 
-	. = ALIGN(512);		/* align to sector size */
+	. = ALIGN(4096);		/* align to page size */
 	_setup_size = . - _start;
 	_setup_sects = _setup_size / 512;
 
===================================================================
--- a/arch/i386/kernel/asm-offsets.c
+++ b/arch/i386/kernel/asm-offsets.c
@@ -109,6 +109,8 @@ void foo(void)
 	DEFINE(PTRS_PER_PTE, PTRS_PER_PTE);
 	DEFINE(PTRS_PER_PMD, PTRS_PER_PMD);
 	DEFINE(PTRS_PER_PGD, PTRS_PER_PGD);
+	DEFINE(PMD_SIZE, PMD_SIZE);
+	DEFINE(PGDIR_SIZE, PGDIR_SIZE);
 
 	DEFINE(VDSO_PRELINK_asm, VDSO_PRELINK);
 
===================================================================
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -49,18 +49,34 @@
  *
  * This should be a multiple of a page.
  */
-LOW_PAGES = 1<<(32-PAGE_SHIFT_asm)
-
+LOW_PAGES = ((-__PAGE_OFFSET) >> PAGE_SHIFT_asm)
+
+
+#define ROUNDUP(x,y)	(((x) + (y) - 1) & ~((y)-1))
+
+/* number of pages needed to map x bytes */
 #if PTRS_PER_PMD > 1
-PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PMD) + PTRS_PER_PGD
+#define MAPPING_SIZE(x)	(ROUNDUP(x, PMD_SIZE) / PTRS_PER_PMD + PTRS_PER_PGD)
 #else
-PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PGD)
-#endif
+#define MAPPING_SIZE(x)	(ROUNDUP(x, PGDIR_SIZE) / PTRS_PER_PGD)
+#endif
+
+PAGE_TABLE_SIZE = MAPPING_SIZE(LOW_PAGES)
 BOOTBITMAP_SIZE = LOW_PAGES / 8
 ALLOCATOR_SLOP = 4
 
 INIT_MAP_BEYOND_END = BOOTBITMAP_SIZE + (PAGE_TABLE_SIZE + ALLOCATOR_SLOP)*PAGE_SIZE_asm
 
+/*
+ * Where the kernel's initial load-time mapping must end for this code
+ * to get started.  This includes the kernel text+data+bss+enough
+ * pg0 space to map INIT_MAP_BEYOND_END.  This is only really needed for
+ * bootloaders which start the kernel with paging enabled, which may
+ * not use this pagetable setup anyway, but it's good to be consistent.
+ */
+.globl pg0_init_size
+pg0_init_size = ROUNDUP(INIT_MAP_BEYOND_END / 1024, PAGE_SIZE_asm)
+
 /*
  * 32-bit kernel entrypoint; only used by the boot CPU.  On entry,
  * %esi points to the real-mode code as a 32-bit pointer.
===================================================================
--- a/arch/i386/kernel/vmlinux.lds.S
+++ b/arch/i386/kernel/vmlinux.lds.S
@@ -199,7 +199,9 @@ SECTIONS
 	/* This is where the kernel creates the early boot page tables */
 	. = ALIGN(4096);
 	pg0 = . ;
-  }
+	__kernel_end = pg0 + pg0_init_size - LOAD_OFFSET - LOAD_PHYSICAL_ADDR ;
+  }
+  __kernel_data_size = __init_end - _text	;
 
   /* Sections to be discarded */
   /DISCARD/ : {

-- 



* [PATCH 06/10] add WEAK() for creating weak asm labels
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
                   ` (4 preceding siblings ...)
  2007-06-15  0:48 ` [PATCH 05/10] i386: make the bzImage payload an ELF file Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 07/10] always allocate space for notes Jeremy Fitzhardinge
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: weak-linkage.patch --]
[-- Type: text/plain, Size: 468 bytes --]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>

---
 include/linux/linkage.h |    6 ++++++
 1 file changed, 6 insertions(+)

===================================================================
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -34,6 +34,12 @@
   name:
 #endif
 
+#ifndef WEAK
+#define WEAK(name)	   \
+	.weak name;	   \
+	name:
+#endif
+
 #define KPROBE_ENTRY(name) \
   .pushsection .kprobes.text, "ax"; \
   ENTRY(name)

-- 



* [PATCH 07/10] always allocate space for notes
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
                   ` (5 preceding siblings ...)
  2007-06-15  0:48 ` [PATCH 06/10] add WEAK() for creating weak asm labels Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 08/10] i386: paravirt boot sequence Jeremy Fitzhardinge
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: elfnote-alloc.patch --]
[-- Type: text/plain, Size: 4006 bytes --]

This patch makes .note segments always allocated; that is, they are
loaded as part of the binary and appear in the :data segment.  This is
not always necessary, but certain users - such as vsyscalls and notes
in boot images - require the notes to be allocated.  Rather than
having two ways of creating notes, just have one which suits
everyone.  The only downside is that the notes will actually consume
space at runtime.  This isn't a big deal, since a typical kernel
doesn't have very many, if any.

Also, make the ELFNOTE() macro do the right thing in 32/64 bit
environments.
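
For readers unfamiliar with the on-disk note format, the following C
sketch shows how a consumer (a bootloader, say) might walk allocated
notes once they are loaded with the image.  It assumes the common
layout of three 32-bit header words followed by the name and the
descriptor, each padded to 4 bytes.  struct note_hdr and find_note()
are illustrative names, not kernel interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Note header: three 32-bit words (assumed layout). */
struct note_hdr {
	uint32_t n_namesz;	/* length of name, including the NUL */
	uint32_t n_descsz;	/* length of descriptor */
	uint32_t n_type;	/* owner-defined type */
};

static size_t align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Scan a buffer of packed notes and return a pointer to the descriptor
 * of the first note matching (name, type), or NULL if there is none.
 */
const void *find_note(const uint8_t *buf, size_t len,
		      const char *name, uint32_t type, uint32_t *descsz)
{
	size_t pos = 0;

	while (len - pos >= sizeof(struct note_hdr)) {
		struct note_hdr h;
		size_t name_off, desc_off, next;

		memcpy(&h, buf + pos, sizeof(h));
		name_off = pos + sizeof(h);
		desc_off = name_off + align4(h.n_namesz);
		next = desc_off + align4(h.n_descsz);

		/* reject truncated or wrapped-around entries */
		if (desc_off < name_off || next < desc_off || next > len)
			return NULL;

		if (h.n_type == type &&
		    h.n_namesz == strlen(name) + 1 &&
		    memcmp(buf + name_off, name, h.n_namesz) == 0) {
			if (descsz)
				*descsz = h.n_descsz;
			return buf + desc_off;
		}
		pos = next;
	}
	return NULL;
}
```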

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>

---
 arch/i386/kernel/vmlinux.lds.S    |    9 ++++-----
 arch/i386/kernel/vsyscall-note.S  |    4 +---
 include/asm-generic/vmlinux.lds.h |    7 ++++++-
 include/linux/elfnote.h           |   21 ++++++++++++++-------
 4 files changed, 25 insertions(+), 16 deletions(-)

===================================================================
--- a/arch/i386/kernel/vmlinux.lds.S
+++ b/arch/i386/kernel/vmlinux.lds.S
@@ -28,10 +28,9 @@ jiffies = jiffies_64;
 jiffies = jiffies_64;
 
 PHDRS {
-	text PT_LOAD FLAGS(5);	/* R_E */
-	data PT_LOAD FLAGS(7);	/* RWE */
-	note PT_NOTE FLAGS(0);	/* ___ */
+	STD_PHDRS
 }
+
 SECTIONS
 {
   . = LOAD_OFFSET + LOAD_PHYSICAL_ADDR;
@@ -72,6 +71,8 @@ SECTIONS
   _sdata = .;			/* End of text section */
 
   RODATA
+
+  NOTES
 
   /* writeable */
   . = ALIGN(4096);
@@ -210,6 +211,4 @@ SECTIONS
   STABS_DEBUG
 
   DWARF_DEBUG
-
-  NOTES
 }
===================================================================
--- a/arch/i386/kernel/vsyscall-note.S
+++ b/arch/i386/kernel/vsyscall-note.S
@@ -9,9 +9,7 @@
 /* Ideally this would use UTS_NAME, but using a quoted string here
    doesn't work. Remember to change this when changing the
    kernel's name. */
-ELFNOTE_START(Linux, 0, "a")
-	.long LINUX_VERSION_CODE
-ELFNOTE_END
+ELFNOTE(Linux, 0, .long LINUX_VERSION_CODE)
 
 #ifdef CONFIG_XEN
 
===================================================================
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -245,8 +245,13 @@
 		__stop___bug_table = .;					\
 	}
 
+#define STD_PHDRS							\
+	text PT_LOAD FILEHDR PHDRS FLAGS(5);	/* R_E */		\
+	data PT_LOAD FLAGS(7);			/* RWE */		\
+	note PT_NOTE FLAGS(0);			/* ___ */
+
 #define NOTES								\
-	.notes : { *(.note.*) } :note
+	.notes : { *(.note.*) } :data :note
 
 #define INITCALLS							\
   	*(.initcall0.init)						\
===================================================================
--- a/include/linux/elfnote.h
+++ b/include/linux/elfnote.h
@@ -53,7 +53,7 @@ 4484:.balign 4				;	\
 .popsection				;
 
 #define ELFNOTE(name, type, desc)		\
-	ELFNOTE_START(name, type, "")		\
+	ELFNOTE_START(name, type, "a")		\
 		desc			;	\
 	ELFNOTE_END
 
@@ -67,8 +67,8 @@ 4484:.balign 4				;	\
  * only define one note per line, since __LINE__ is used to generate
  * unique symbols.
  */
-#define _ELFNOTE_PASTE(a,b)	a##b
-#define _ELFNOTE(size, name, unique, type, desc)			\
+#define __ELFNOTE_PASTE(a,b)	a##b
+#define __ELFNOTE(size, name, unique, type, desc)			\
 	static const struct {						\
 		struct elf##size##_note _nhdr;				\
 		unsigned char _name[sizeof(name)]			\
@@ -88,11 +88,18 @@ 4484:.balign 4				;	\
 		name,							\
 		desc							\
 	}
-#define ELFNOTE(size, name, type, desc)		\
-	_ELFNOTE(size, name, __LINE__, type, desc)
 
-#define ELFNOTE32(name, type, desc) ELFNOTE(32, name, type, desc)
-#define ELFNOTE64(name, type, desc) ELFNOTE(64, name, type, desc)
+#define ELFNOTE32(name, type, desc) __ELFNOTE(32, name, __LINE__, type, desc)
+#define ELFNOTE64(name, type, desc) __ELFNOTE(64, name, __LINE__, type, desc)
+
+#if BITS_PER_LONG == 32
+#define ELFNOTE(name, type, desc)	ELFNOTE32(name, type, desc)
+#elif BITS_PER_LONG == 64
+#define ELFNOTE(name, type, desc)	ELFNOTE64(name, type, desc)
+#else
+#error Define ELFNOTE for this BITS_PER_LONG
+#endif
+
 #endif	/* __ASSEMBLER__ */
 
 #endif /* _LINUX_ELFNOTE_H */

-- 



* [PATCH 08/10] i386: paravirt boot sequence
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
                   ` (6 preceding siblings ...)
  2007-06-15  0:48 ` [PATCH 07/10] always allocate space for notes Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 09/10] ask the hypervisor how much space it needs reserved Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 10/10] xen: use boot protocol to boot xen kernel Jeremy Fitzhardinge
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel, James Bottomley

[-- Attachment #1: i386-paravirt-bootup.patch --]
[-- Type: text/plain, Size: 5429 bytes --]

This patch uses the updated boot protocol to do paravirtualized boot.
If the boot version is >= 2.07, then it will do two things:

 1. Check the bootparams loadflags to see if we should reload the
    segment registers and clear interrupts.  This is appropriate
    for normal native boot and some paravirtualized environments, but
    inappropriate for others.

 2. Check the hardware architecture, and dispatch to the appropriate
    kernel entrypoint.  If the bootloader doesn't set this, then we
    simply do the normal boot sequence.
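
The two checks above can be modelled in C roughly as follows.  This is
only a sketch of the decision logic in the head.S hunks: the struct,
enum and function names are made up for illustration, and the real
code does all of this in assembly before paging is set up.

```c
#include <stddef.h>
#include <stdint.h>

#define KEEP_SEGMENTS	(1 << 6)	/* loadflags bit tested in head.S */

enum boot_path { BOOT_DEFAULT, BOOT_LGUEST, BOOT_XEN, BOOT_BAD_SUBARCH };

/* Only the fields the dispatch logic looks at (hypothetical struct). */
struct boot_hdr {
	uint16_t version;
	uint8_t  loadflags;
	uint32_t hardware_subarch;
};

/* Step 1: nonzero means reload segment registers and clear interrupts. */
int must_reload_segments(const struct boot_hdr *h)
{
	if (h->version < 0x207)
		return 1;	/* flag is not meaningful before 2.07 */
	return !(h->loadflags & KEEP_SEGMENTS);
}

/* Step 2: pick the entry point from the subarchitecture number. */
enum boot_path select_entry(const struct boot_hdr *h)
{
	/* same order as subarch_entries in head.S */
	static const enum boot_path subarch_entries[] = {
		BOOT_DEFAULT,	/* normal x86/PC */
		BOOT_LGUEST,	/* lguest hypervisor */
		BOOT_XEN,	/* Xen hypervisor */
	};
	size_t n = sizeof(subarch_entries) / sizeof(subarch_entries[0]);

	if (h->version < 0x207)
		return BOOT_DEFAULT;
	if (h->hardware_subarch >= n)
		return BOOT_BAD_SUBARCH;
	return subarch_entries[h->hardware_subarch];
}
```

In the patch itself the bad-subarch case ends in ud2a, and the weak
lguest_entry and xen_entry labels fall through to the same place when
the corresponding support is not built in.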

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 arch/i386/boot/compressed/head.S |   14 +++++++++--
 arch/i386/boot/compressed/misc.c |    4 +++
 arch/i386/boot/header.S          |    7 ++++-
 arch/i386/kernel/head.S          |   47 ++++++++++++++++++++++++++++++++++----
 4 files changed, 65 insertions(+), 7 deletions(-)

===================================================================
--- a/arch/i386/boot/compressed/head.S
+++ b/arch/i386/boot/compressed/head.S
@@ -33,14 +33,24 @@
 	.globl blob_startup_32
 
 blob_startup_32:
-	cld
-	cli
+	/* check to see if KEEP_SEGMENTS flag is meaningful */
+	cmpw $0x207, BP_version(%esi)
+	jb 1f
+
+	/* test KEEP_SEGMENTS flag to see if the bootloader is asking
+		us to not reload segments */
+	testb $(1<<6), BP_loadflags(%esi)
+	jnz 2f
+
+1:	cli
 	movl $(__BOOT_DS),%eax
 	movl %eax,%ds
 	movl %eax,%es
 	movl %eax,%fs
 	movl %eax,%gs
 	movl %eax,%ss
+
+2:	cld
 
 /* Calculate the delta between where we were compiled to run
  * at and where we were actually loaded at.  This can only be done
===================================================================
--- a/arch/i386/boot/compressed/misc.c
+++ b/arch/i386/boot/compressed/misc.c
@@ -245,6 +245,10 @@ static void putstr(const char *s)
 {
 	int x,y,pos;
 	char c;
+
+	if (RM_SCREEN_INFO.orig_video_mode == 0 &&
+	    lines == 0 && cols == 0)
+		return;
 
 	x = RM_SCREEN_INFO.orig_x;
 	y = RM_SCREEN_INFO.orig_y;
===================================================================
--- a/arch/i386/boot/header.S
+++ b/arch/i386/boot/header.S
@@ -119,7 +119,7 @@ 1:
 	# Part 2 of the header, from the old setup.S
 
 		.ascii	"HdrS"		# header signature
-		.word	0x0206		# header version number (>= 0x0105)
+		.word	0x0207		# header version number (>= 0x0105)
 					# or else old loadlin-1.5 will fail)
 		.globl realmode_swtch
 realmode_swtch:	.word	0, 0		# default_switch, SETUPSEG
@@ -209,6 +209,11 @@ cmdline_size:   .long   COMMAND_LINE_SIZ
                                                 #added with boot protocol
                                                 #version 2.06
 
+hardware_subarch:	.long 0			# subarchitecture, added with 2.07
+						# default to 0 for normal x86 PC
+
+hardware_subarch_data:	.quad 0
+
 # End of setup header #####################################################
 
 	.section ".inittext", "ax"
===================================================================
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -86,28 +86,37 @@ pg0_init_size = ROUNDUP(INIT_MAP_BEYOND_
  */
 .section .text.head,"ax",@progbits
 ENTRY(startup_32)
+	/* check to see if KEEP_SEGMENTS flag is meaningful */
+	cmpw $0x207, BP_version(%esi)
+	jb 1f
+
+	/* test KEEP_SEGMENTS flag to see if the bootloader is asking
+		us to not reload segments */
+	testb $(1<<6), BP_loadflags(%esi)
+	jnz 2f
 
 /*
  * Set segments to known values.
  */
-	cld
-	lgdt boot_gdt_descr - __PAGE_OFFSET
+1:	lgdt boot_gdt_descr - __PAGE_OFFSET
 	movl $(__BOOT_DS),%eax
 	movl %eax,%ds
 	movl %eax,%es
 	movl %eax,%fs
 	movl %eax,%gs
+2:
 
 /*
  * Clear BSS first so that there are no surprises...
- * No need to cld as DF is already clear from cld above...
- */
+ */
+	cld
 	xorl %eax,%eax
 	movl $__bss_start - __PAGE_OFFSET,%edi
 	movl $__bss_stop - __PAGE_OFFSET,%ecx
 	subl %edi,%ecx
 	shrl $2,%ecx
 	rep ; stosl
+
 /*
  * Copy bootup parameters out of the way.
  * Note: %esi still has the pointer to the real-mode data.
@@ -135,6 +144,35 @@ 2:
 	movsl
 1:
 
+#ifdef CONFIG_PARAVIRT
+	cmpw $0x207, (boot_params + BP_version - __PAGE_OFFSET)
+	jb default_entry
+
+	/* Paravirt-compatible boot parameters.  Look to see what architecture
+		we're booting under. */
+	movl (boot_params + BP_hardware_subarch - __PAGE_OFFSET), %eax
+	cmpl $num_subarch_entries, %eax
+	jae bad_subarch
+
+	movl subarch_entries - __PAGE_OFFSET(,%eax,4), %eax
+	subl $__PAGE_OFFSET, %eax
+	jmp *%eax
+
+bad_subarch:
+WEAK(lguest_entry)
+WEAK(xen_entry)
+	/* Unknown implementation; there's really
+	   nothing we can do at this point. */
+	ud2a
+.data
+subarch_entries:
+	.long default_entry		/* normal x86/PC */
+	.long lguest_entry		/* lguest hypervisor */
+	.long xen_entry			/* Xen hypervisor */
+num_subarch_entries = (. - subarch_entries) / 4
+.previous
+#endif /* CONFIG_PARAVIRT */
+
 /*
  * Initialize page tables.  This creates a PDE and a set of page
  * tables, which are located immediately beyond _end.  The variable
@@ -147,6 +185,7 @@ 1:
  */
 page_pde_offset = (__PAGE_OFFSET >> 20);
 
+default_entry:
 	movl $(pg0 - __PAGE_OFFSET), %edi
 	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx
 	movl $0x007, %eax			/* 0x007 = PRESENT+RW+USER */

-- 



* [PATCH 09/10] ask the hypervisor how much space it needs reserved
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
                   ` (7 preceding siblings ...)
  2007-06-15  0:48 ` [PATCH 08/10] i386: paravirt boot sequence Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  2007-06-15  0:48 ` [PATCH 10/10] xen: use boot protocol to boot xen kernel Jeremy Fitzhardinge
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: xen-dynamic-topaddr.patch --]
[-- Type: text/plain, Size: 1213 bytes --]

Ask the hypervisor how much space it needs reserved, since 32-on-64
doesn't need any space, and it may change in future.
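
To make the arithmetic concrete, here is a small C sketch of the value
passed to reserve_top_address(): the distance from the hypervisor's
start address to the top of the 32-bit address space, plus two pages.
The helper name and the sample addresses used below are illustrative
assumptions, not values taken from this patch.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/*
 * Bytes to reserve at the top of a 32-bit address space when the
 * hypervisor starts at virt_start.  Unsigned wrap-around makes
 * (0 - virt_start) equal to 2^32 - virt_start.  (Hypothetical helper.)
 */
uint32_t reserve_top_size(uint32_t virt_start)
{
	return (uint32_t)(0u - virt_start + 2 * PAGE_SIZE);
}
```

With a start address of 0xF5800000 this gives 0x0A802000, i.e. 168MB
plus two pages; a hypervisor reporting a higher virt_start would leave
correspondingly more address space to the kernel.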

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>

---
 arch/i386/xen/enlighten.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

===================================================================
--- a/arch/i386/xen/enlighten.c
+++ b/arch/i386/xen/enlighten.c
@@ -1113,6 +1113,17 @@ static void setup_hypercall_page(void)
 	native_write_msr(ebx + 0, ((u64)hypercall_mfn) << PAGE_SHIFT);
 }
 
+static void __init xen_reserve_top(void)
+{
+	unsigned long top = HYPERVISOR_VIRT_START;
+	struct xen_platform_parameters pp;
+
+	if (HYPERVISOR_xen_version(XENVER_platform_parameters, &pp) == 0)
+		top = pp.virt_start;
+
+	reserve_top_address(-top + 2 * PAGE_SIZE);
+}
+
 /* First C function to be called on Xen boot */
 asmlinkage void __init xen_start_kernel(void)
 {
@@ -1163,7 +1174,7 @@ asmlinkage void __init xen_start_kernel(
 		paravirt_ops.kernel_rpl = 0;
 
 	/* set the limit of our address space */
-	reserve_top_address(-HYPERVISOR_VIRT_START + 2 * PAGE_SIZE);
+	xen_reserve_top();
 
 	/* set up basic CPUID stuff */
 	cpu_detect(&new_cpu_data);

-- 



* [PATCH 10/10] xen: use boot protocol to boot xen kernel
  2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
                   ` (8 preceding siblings ...)
  2007-06-15  0:48 ` [PATCH 09/10] ask the hypervisor how much space it needs reserved Jeremy Fitzhardinge
@ 2007-06-15  0:48 ` Jeremy Fitzhardinge
  9 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15  0:48 UTC (permalink / raw)
  To: Eric W. Biederman, H. Peter Anvin
  Cc: Vivek Goyal, Rusty Russell, Andi Kleen, v12n, lkml,
	Andrew Morton, Xen-Devel

[-- Attachment #1: xen-paravirt-boot.patch --]
[-- Type: text/plain, Size: 14345 bytes --]

Boot a Xen kernel using the boot protocol.  There are two parts to this:

1. Add Xen-specific notes to the bzImage's internal ELF file, so that
   the Xen domain builder knows what to do with it.  This is simply a
   matter of adding a new notes-xen.S to the image.  The notes depend
   on the config options, but they contain no addresses, so there's no
   concern about relocation, or references into the kernel image itself.

2. Do the early setup after booting, mainly to remap the kernel to
   the proper virtual address.  The kernel initially comes up with
   a P=V 1:1 mapping.  We need to copy the appropriate internal pagetable
   pointers to get the kernel also mapped at __PAGE_OFFSET.  In order
   to simplify this process, we just keep the same pte pages, and only
   update the pgd/pmd entries (depending on whether it's PAE or not, and
   whether the kernel and Xen want to share the same pgd slot).

   A pre-requisite for updating the pagetables is setting up the
   hypercall page in order to do hypercalls.  Rather than using the
   Xen Notes to set this up (which would require a relocatable
   reference from the bzImage notes into the kernel), we use the Xen
   reserved MSR to set the page address.

   Once the kernel has been relocated, we update some of the pointers
   in the start_info to kernel virtual addresses, and then jump to
   xen_start_kernel() to do the rest of the setup before calling
   start_kernel() proper.
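
The remapping idea in part 2 can be sketched in C for the simplest
(non-PAE) case, where each pgd entry covers 4MB.  This is a toy model
operating on a plain array: it assumes __PAGE_OFFSET is 0xC0000000 and
writes the entries directly, whereas under Xen the updates have to go
through hypercalls.  alias_low_mapping() is a hypothetical name, not
a function from early.c.

```c
#include <stddef.h>
#include <stdint.h>

#define PGDIR_SHIFT	22		/* non-PAE: one pgd entry maps 4MB */
#define PTRS_PER_PGD	1024
#define PAGE_OFFSET	0xC0000000u	/* assumed kernel virtual base */

/*
 * Make the first `bytes` of the 1:1 mapping also visible at
 * PAGE_OFFSET by copying pgd entries, so both ranges share the same
 * pte pages.  Returns the number of entries copied.
 */
size_t alias_low_mapping(uint32_t *pgd, uint32_t bytes)
{
	size_t first = PAGE_OFFSET >> PGDIR_SHIFT;
	size_t n = (size_t)(((uint64_t)bytes + (1u << PGDIR_SHIFT) - 1)
			    >> PGDIR_SHIFT);
	size_t i;

	if (n > PTRS_PER_PGD - first)
		n = PTRS_PER_PGD - first;

	for (i = 0; i < n; i++)
		pgd[first + i] = pgd[i];	/* reuse the same pte page */

	return n;
}
```

Because only the top-level entries are duplicated, no new pte pages
need to be allocated, which is the simplification the description
above is after.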

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>

---
 arch/i386/boot/compressed/Makefile    |    4 
 arch/i386/boot/compressed/notes-xen.S |   17 ++
 arch/i386/kernel/head.S               |    2 
 arch/i386/xen/Makefile                |    2 
 arch/i386/xen/early.c                 |  215 +++++++++++++++++++++++++++++++++
 arch/i386/xen/enlighten.c             |   29 +---
 arch/i386/xen/xen-head.S              |   38 -----
 arch/i386/xen/xen-ops.h               |    1 
 include/asm-i386/xen/hypercall.h      |    4 
 9 files changed, 251 insertions(+), 61 deletions(-)

===================================================================
--- a/arch/i386/boot/compressed/Makefile
+++ b/arch/i386/boot/compressed/Makefile
@@ -5,7 +5,7 @@
 #
 
 targets		:= blob vmlinux.bin vmlinux.bin.gz \
-			elfhdr.o head.o misc.o notes.o piggy.o \
+			elfhdr.o head.o misc.o notes.o notes-xen.o piggy.o \
 			vmlinux.bin.all vmlinux.relocs
 
 hostprogs-y	:= relocs
@@ -16,7 +16,7 @@ CFLAGS  := -m32 -D__KERNEL__ $(LINUX_INC
 	   $(call cc-option,-fno-stack-protector)
 LDFLAGS := -R $(obj)/vmlinux-syms --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T
 
-OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o piggy.o)
+OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o notes-xen.o piggy.o)
 
 $(obj)/blob: $(src)/vmlinux.lds $(obj)/vmlinux-syms $(OBJS) FORCE
 	$(call if_changed,ld)
===================================================================
--- /dev/null
+++ b/arch/i386/boot/compressed/notes-xen.S
@@ -0,0 +1,17 @@
+
+#ifdef CONFIG_XEN
+#include <linux/elfnote.h>
+#include <xen/interface/elfnote.h>
+
+	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
+	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz "2.6")
+	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz "xen-3.0")
+	ELFNOTE(Xen, XEN_ELFNOTE_FEATURES,
+			.asciz "!writable_page_tables|pae_pgdir_above_4gb")
+#ifdef CONFIG_X86_PAE
+	ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE,       .asciz "yes")
+#else
+	ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE,       .asciz "no")
+#endif
+	ELFNOTE(Xen, XEN_ELFNOTE_LOADER,         .asciz "generic")
+#endif
===================================================================
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -594,8 +594,6 @@ fault_msg:
 	.ascii "Int %d: CR2 %p  err %p  EIP %p  CS %p  flags %p\n"
 	.asciz "Stack: %p %p %p %p %p %p %p %p\n"
 
-#include "../xen/xen-head.S"
-
 /*
  * The IDT and GDT 'descriptors' are a strange 48-bit object
  * only used by the lidt and lgdt instructions. They are not
===================================================================
--- a/arch/i386/xen/Makefile
+++ b/arch/i386/xen/Makefile
@@ -1,4 +1,4 @@ obj-y		:= enlighten.o setup.o features.o
-obj-y		:= enlighten.o setup.o features.o multicalls.o mmu.o \
+obj-y		:= early.o enlighten.o setup.o features.o multicalls.o mmu.o \
 			events.o time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)	+= smp.o
===================================================================
--- /dev/null
+++ b/arch/i386/xen/early.c
@@ -0,0 +1,215 @@
+/*
+ * Very earliest code, run before we're in the kernel virtual address
+ * space.  As a result, we need to be careful about touching static
+ * symbols or any absolute address.
+ */
+
+#include <linux/types.h>
+#include <linux/bug.h>
+#include <linux/sched.h>
+
+#include <asm/bootparam.h>
+#include <asm/setup.h>
+#include <asm/paravirt.h>
+#include <asm/pgtable.h>
+
+#include <xen/interface/xen.h>
+#include <xen/page.h>
+#include <asm/xen/interface.h>
+#include <asm/xen/hypercall.h>
+
+#include "xen-ops.h"
+
+#define PA(ptr)	((typeof(ptr)) __pa_symbol((ptr)))
+
+extern char _end[];
+
+static inline void early_cpuid(unsigned int *eax, unsigned int *ebx,
+			       unsigned int *ecx, unsigned int *edx)
+{
+	asm(XEN_EMULATE_PREFIX "cpuid"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (*eax), "2" (*ecx));
+}
+
+static __init u64 early_p2m(unsigned long *mfn_list,
+			    unsigned long phys)
+{
+	unsigned offset = phys & ~PAGE_MASK;
+	return PFN_PHYS((u64)mfn_list[PFN_DOWN(phys)]) + offset;
+}
+
+static __init unsigned long early_m2p(unsigned long mfn)
+{
+	unsigned long ret = machine_to_phys_mapping[mfn];
+	if (ret == ~0)
+		ret = 0;
+	return ret;
+}
+
+static __init void setup_hypercall_page(struct start_info *info)
+{
+	unsigned long *mfn_list = (unsigned long *)info->mfn_list;
+	unsigned eax, ebx, ecx, edx;
+	unsigned long hypercall_mfn;
+
+	/* leaf 0x40000000 is a virtual machine leaf */
+	eax = 0x40000000;
+	ecx = 0;
+	early_cpuid(&eax, &ebx, &ecx, &edx);
+
+	/* No way we should be able to get here without being under Xen */
+	if (ebx != 0x566e6558 || /* Signature 1: "XenV" */
+	    ecx != 0x65584d4d || /* Signature 2: "MMXe" */
+	    edx != 0x4d4d566e || /* Signature 3: "nVMM" */
+	    eax < 0x40000002)
+		BUG();
+
+	/* Get the number of hypercall pages (we only need 1) and the
+	   Xen MSR base */
+	eax = 0x40000002;
+	early_cpuid(&eax, &ebx, &ecx, &edx);
+
+	/* Use magic msr to set the address of the hypercall page */
+	hypercall_mfn = PFN_DOWN(__pa_symbol(hypercall_page));
+	if (mfn_list)
+		hypercall_mfn = mfn_list[hypercall_mfn];
+
+	native_write_msr(ebx + 0, ((u64)hypercall_mfn) << PAGE_SHIFT);
+}
+
+static __init pmd_t *get_pmd(pgd_t *pgd, unsigned long addr)
+{
+	unsigned idx = pgd_index(addr);
+	pmd_t *ret;
+
+#ifdef CONFIG_X86_PAE
+	{
+		unsigned long pfn;
+		pfn = PFN_DOWN(pgd_val_ma(pgd[idx]));
+		pfn = machine_to_phys_mapping[pfn];
+
+		ret = ((pmd_t *)PFN_PHYS(pfn)) + pmd_index(addr);
+	}
+#else
+	ret = (pmd_t *)&pgd[idx];
+#endif
+
+	return ret;
+}
+
+static __init void copy_mapping(struct start_info *info, void *src, void *dst)
+{
+	unsigned long *mfn_list = (unsigned long *)info->mfn_list;
+	struct mmu_update u;
+
+	u.ptr = early_p2m(mfn_list, (unsigned long)dst);
+	u.val = pte_val_ma(*(pte_t *)src);
+
+	if (HYPERVISOR_mmu_update(&u, 1, NULL, DOMID_SELF) != 0)
+		BUG();
+}
+
+static __init void remap_addr_pmd(struct start_info *info, unsigned long addr)
+{
+	pgd_t *pgd = (pgd_t *)info->pt_base;
+	pmd_t *src = get_pmd(pgd, addr);
+	pmd_t *dst = get_pmd(pgd, addr + __PAGE_OFFSET);
+
+	copy_mapping(info, src, dst);
+}
+
+static __init void remap_kernel_pmd(struct start_info *info,
+				    unsigned long addr, unsigned long max)
+{
+	while (addr < max) {
+		remap_addr_pmd(info, addr);
+		addr += PMD_SIZE;
+	}
+}
+
+static __init void remap_addr_pgd(struct start_info *info, unsigned long addr)
+{
+	pgd_t *pgd = (pgd_t *)info->pt_base;
+	pgd_t *src = &pgd[pgd_index(addr)];
+	pgd_t *dst = &pgd[pgd_index(addr + __PAGE_OFFSET)];
+
+	copy_mapping(info, src, dst);
+}
+
+static __init void remap_kernel_pgd(struct start_info *info,
+				    unsigned long addr, unsigned long max)
+{
+	while (addr < max) {
+		remap_addr_pgd(info, addr);
+		addr += PGDIR_SIZE;
+	}
+}
+
+static __init void remap_kernel(struct start_info *info,
+				unsigned long start, unsigned long max)
+{
+	pgd_t *pgd = (pgd_t *)info->pt_base;
+
+#ifdef CONFIG_X86_PAE
+	/*
+	 * If we're running PAE, the kernel will probably want to be
+	 * mapped into the same pgd slot as Xen.  If so, we need to
+	 * copy the pmd entries rather than the pgd entries.  If not,
+	 * either the kernel doesn't want to be at the top of the
+	 * address space, or Xen has decided not to reserve any space;
+	 * in that case, we can just clone the pgd entries.
+	 */
+	if (pgd_val_ma(pgd[pgd_index(__PAGE_OFFSET)]) != 0) {
+		remap_kernel_pmd(info, start, max);
+		return;
+	}
+#endif
+
+	remap_kernel_pgd(info, start, max);
+}
+
+void __init xen_entry(void)
+{
+	struct start_info *info;
+	unsigned long limit;
+
+	info = (struct start_info *)(unsigned long)
+		(PA(&boot_params)->hdr.hardware_subarch_data);
+
+	BUG_ON(memcmp(info->magic, PA(&"xen-3.0"), 7) != 0);
+
+	/* establish a hypercall page */
+	setup_hypercall_page(info);
+
+	/* work out how far we need to remap */
+	limit = __pa(_end);
+	limit = max(limit, info->pt_base + (info->nr_pt_frames * PAGE_SIZE));
+	limit = max(limit, info->mfn_list +
+		    (info->nr_pages * sizeof(unsigned long)));
+	limit = max(limit, info->mod_start + info->mod_len);
+	limit = max(limit, early_m2p(info->console.domU.mfn) << PAGE_SHIFT);
+	limit = max(limit, early_m2p(info->store_mfn) << PAGE_SHIFT);
+
+	limit += PAGE_SIZE;
+
+	/* remap the kernel to its virtual address */
+	remap_kernel(info, 0, limit + PAGE_SIZE);
+
+	/* repoint things to their new virtual addresses */
+	info->pt_base = (unsigned long)__va(info->pt_base);
+	info->mfn_list = (unsigned long)__va(info->mfn_list);
+
+	init_pg_tables_end = limit;
+
+	asm volatile("mov %0,%%esp;"
+		     "push $0;"
+		     "jmp *%1"
+		     :
+		     : "i" (&init_thread_union.stack[THREAD_SIZE/sizeof(long)]),
+		       "r" (xen_start_kernel)
+		     : "memory");
+}
===================================================================
--- a/arch/i386/xen/enlighten.c
+++ b/arch/i386/xen/enlighten.c
@@ -50,6 +50,8 @@
 #include "mmu.h"
 #include "multicalls.h"
 
+struct hypercall_entry hypercall_page[PAGE_SIZE/sizeof(struct hypercall_entry)]
+	__attribute__((aligned(PAGE_SIZE), section(".bss.page_aligned")));
 EXPORT_SYMBOL_GPL(hypercall_page);
 
 DEFINE_PER_CPU(enum paravirt_lazy_mode, xen_lazy_mode);
@@ -1096,15 +1098,19 @@ static void __init xen_reserve_top(void)
 	reserve_top_address(-top + 2 * PAGE_SIZE);
 }
 
-/* First C function to be called on Xen boot */
-asmlinkage void __init xen_start_kernel(void)
+/*
+ * This is jumped to by early.c, once we're running in the proper
+ * kernel virtual address space.
+ */
+void __init xen_start_kernel(void)
 {
 	pgd_t *pgd;
 
-	if (!xen_start_info)
-		return;
-
-	BUG_ON(memcmp(xen_start_info->magic, "xen-3.0", 7) != 0);
+	xen_start_info = (struct start_info *)
+		__va(boot_params.hdr.hardware_subarch_data);
+
+	/* Get mfn list */
+	phys_to_machine_mapping = (unsigned long *)xen_start_info->mfn_list;
 
 	/* Install Xen paravirt ops */
 	paravirt_ops = xen_paravirt_ops;
@@ -1116,13 +1122,7 @@ asmlinkage void __init xen_start_kernel(
 
 	xen_setup_features();
 
-	/* Get mfn list */
-	if (!xen_feature(XENFEAT_auto_translated_physmap))
-		phys_to_machine_mapping = (unsigned long *)xen_start_info->mfn_list;
-
 	pgd = (pgd_t *)xen_start_info->pt_base;
-
-	init_pg_tables_end = __pa(pgd) + xen_start_info->nr_pt_frames*PAGE_SIZE;
 
 	init_mm.pgd = pgd; /* use the Xen pagetables to start */
 
@@ -1152,11 +1152,6 @@ asmlinkage void __init xen_start_kernel(
 	new_cpu_data.hard_math = 1;
 	new_cpu_data.x86_capability[0] = cpuid_edx(1);
 
-	/* Poke various useful things into boot_params */
-	LOADER_TYPE = (9 << 4) | 0;
-	INITRD_START = xen_start_info->mod_start ? __pa(xen_start_info->mod_start) : 0;
-	INITRD_SIZE = xen_start_info->mod_len;
-
 	/* Start the world */
 	start_kernel();
 }
===================================================================
--- a/arch/i386/xen/xen-head.S
+++ /dev/null
@@ -1,38 +0,0 @@
-/* Xen-specific pieces of head.S, intended to be included in the right
-	place in head.S */
-
-#ifdef CONFIG_XEN
-
-#include <linux/elfnote.h>
-#include <asm/boot.h>
-#include <xen/interface/elfnote.h>
-
-.pushsection .init.text,"ax",@progbits
-ENTRY(startup_xen)
-	movl %esi,xen_start_info
-	cld
- 	movl $(init_thread_union+THREAD_SIZE),%esp
-	jmp xen_start_kernel
-.popsection
-
-.pushsection ".bss.page_aligned"
-	.align PAGE_SIZE_asm
-ENTRY(hypercall_page)
-	.skip 0x1000
-.popsection
-
-	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
-	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz "2.6")
-	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz "xen-3.0")
-	ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE,      .long  __PAGE_OFFSET)
-	ELFNOTE(Xen, XEN_ELFNOTE_ENTRY,          .long  startup_xen)
-	ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .long  hypercall_page)
-	ELFNOTE(Xen, XEN_ELFNOTE_FEATURES,       .asciz "!writable_page_tables|pae_pgdir_above_4gb")
-#ifdef CONFIG_X86_PAE
-	ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE,       .asciz "yes")
-#else
-	ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE,       .asciz "no")
-#endif
-	ELFNOTE(Xen, XEN_ELFNOTE_LOADER,         .asciz "generic")
-
-#endif /*CONFIG_XEN */
===================================================================
--- a/arch/i386/xen/xen-ops.h
+++ b/arch/i386/xen/xen-ops.h
@@ -19,6 +19,7 @@ void __init xen_arch_setup(void);
 void __init xen_arch_setup(void);
 void __init xen_init_IRQ(void);
 
+void xen_start_kernel(void);
 void xen_setup_timer(int cpu);
 void xen_setup_cpu_clockevents(void);
 unsigned long xen_cpu_khz(void);
===================================================================
--- a/include/asm-i386/xen/hypercall.h
+++ b/include/asm-i386/xen/hypercall.h
@@ -40,7 +40,9 @@
 #include <xen/interface/sched.h>
 #include <xen/interface/physdev.h>
 
-extern struct { char _entry[32]; } hypercall_page[];
+extern struct hypercall_entry {
+	char _entry[32];
+} hypercall_page[];
 
 #define _hypercall0(type, name)						\
 ({									\

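For readers checking the signature constants in setup_hypercall_page()
above: the three registers returned by CPUID leaf 0x40000000 hold the
ASCII string "XenVMMXenVMM", four bytes per register, lowest byte first.
The following is a minimal standalone sketch of that decoding, not part
of the patch; the helper names cpuid_signature() and is_xen_signature()
are hypothetical, and the register values are passed in rather than read
with a real cpuid instruction.

```c
#include <stdint.h>
#include <string.h>

/*
 * Unpack the EBX/ECX/EDX values into a 12-character string.
 * Each register contributes four characters, low byte first.
 * (Hypothetical helper, for illustration only.)
 */
void cpuid_signature(uint32_t ebx, uint32_t ecx, uint32_t edx, char out[13])
{
	uint32_t regs[3] = { ebx, ecx, edx };
	int i, j;

	for (i = 0; i < 3; i++)
		for (j = 0; j < 4; j++)
			out[i * 4 + j] = (char)((regs[i] >> (8 * j)) & 0xff);
	out[12] = '\0';
}

/* Returns nonzero if the registers spell the Xen signature. */
int is_xen_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	char sig[13];

	cpuid_signature(ebx, ecx, edx, sig);
	return strcmp(sig, "XenVMMXenVMM") == 0;
}
```

Calling is_xen_signature(0x566e6558, 0x65584d4d, 0x4d4d566e) with the
constants from the patch returns nonzero, which is the same check as the
three comparisons in the BUG() test.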
-- 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 04/10] i386: clean up bzImage generation
  2007-06-15  0:48 ` [PATCH 04/10] i386: clean up bzImage generation Jeremy Fitzhardinge
@ 2007-06-15 16:20   ` H. Peter Anvin
  2007-06-15 16:34     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 19+ messages in thread
From: H. Peter Anvin @ 2007-06-15 16:20 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Eric W. Biederman, Vivek Goyal, Rusty Russell, Andi Kleen, v12n,
	lkml, Andrew Morton, Xen-Devel

Jeremy Fitzhardinge wrote:
> -setup_move_size: .word  0x8000		# size to move, when setup is not
> +setup_move_size: .word  _setup_size	# size to move, when setup is not
>  					# loaded at 0x90000. We will move setup
>  					# to 0x90000 then just before jumping
>  					# into the kernel. However, only the

This is WRONG and will break 2.00 protocol bootloaders, if any still
exist, and quite possibly some 2.01 protocol bootloaders.  There are
definitely bootloaders in the field that rely on this implicit value.

> @@ -246,7 +246,6 @@ setup2:
>  	jnz	1f
>  	movw	$0xfffc, %sp	# Make sure we're not zero
>  1:	movzwl	%sp, %esp	# Clear upper half of %esp
> -	sti

Motivation, please?

	-hpa


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 03/10] define ELF notes for adding to a boot image
  2007-06-15  0:48 ` [PATCH 03/10] define ELF notes for adding to a boot image Jeremy Fitzhardinge
@ 2007-06-15 16:20   ` H. Peter Anvin
  2007-06-15 16:40     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 19+ messages in thread
From: H. Peter Anvin @ 2007-06-15 16:20 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Eric W. Biederman, Vivek Goyal, Rusty Russell, Andi Kleen, v12n,
	lkml, Andrew Morton, Xen-Devel

Jeremy Fitzhardinge wrote:
> +#define EIN_PROGRAM_CHECKSUM	3 /* ip style checksum of the memory image. */

Why on earth use one of the weakest verificants in common use?

	-hpa
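
To illustrate the weakness in question: assuming "ip style checksum"
means the one's-complement sum of 16-bit words described in RFC 1071,
the result is independent of word order, so swapped or reordered words
in the image go undetected.  A minimal sketch (the function name
ip_checksum() is illustrative and is not taken from the patch):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071-style checksum: one's-complement sum of big-endian 16-bit
 * words, with the carries folded back in, then inverted.
 */
uint16_t ip_checksum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)			/* pad a trailing odd byte with zero */
		sum += (uint32_t)data[len - 1] << 8;
	while (sum >> 16)		/* fold carries into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

The buffers { 0x12, 0x34, 0x56, 0x78 } and { 0x56, 0x78, 0x12, 0x34 }
produce the same checksum, because addition is commutative.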

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 04/10] i386: clean up bzImage generation
  2007-06-15 16:20   ` H. Peter Anvin
@ 2007-06-15 16:34     ` Jeremy Fitzhardinge
  2007-06-15 16:51       ` H. Peter Anvin
  0 siblings, 1 reply; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15 16:34 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Vivek Goyal, Rusty Russell, Andi Kleen, v12n,
	lkml, Andrew Morton, Xen-Devel

H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>   
>> -setup_move_size: .word  0x8000		# size to move, when setup is not
>> +setup_move_size: .word  _setup_size	# size to move, when setup is not
>>  					# loaded at 0x90000. We will move setup
>>  					# to 0x90000 then just before jumping
>>  					# into the kernel. However, only the
>>     
>
> This is WRONG and will break 2.00 protocol bootloaders, if any still
> exist, and quite possibly some 2.01 protocol bootloaders.  There are
> definitely bootloaders in the field that rely on this implicit value.
>   

Ah, I see.  I didn't see any documentation saying that this must be
0x8000.  Or does _setup_size just have to be <= 0x8000?

>> @@ -246,7 +246,6 @@ setup2:
>>  	jnz	1f
>>  	movw	$0xfffc, %sp	# Make sure we're not zero
>>  1:	movzwl	%sp, %esp	# Clear upper half of %esp
>> -	sti
>>     
>
> Motivation, please?
>   

We talked about this, and you said it was a mistake.  It needn't be in
this patch; it could be separate, or just dropped as far as I'm concerned.

    J

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 03/10] define ELF notes for adding to a boot image
  2007-06-15 16:20   ` H. Peter Anvin
@ 2007-06-15 16:40     ` Jeremy Fitzhardinge
  2007-06-26 20:18       ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15 16:40 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Vivek Goyal, Rusty Russell, Andi Kleen, v12n,
	lkml, Andrew Morton, Xen-Devel

H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>   
>> +#define EIN_PROGRAM_CHECKSUM	3 /* ip style checksum of the memory image. */
>>     
>
> Why on earth use one of the weakest verificants in common use?

I don't know.  I copied this stuff from the original relocatable kernel
patches; I think this is from Vivek.  Eric, Vivek: is there a specific
use for these notes?

    J

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 04/10] i386: clean up bzImage generation
  2007-06-15 16:34     ` Jeremy Fitzhardinge
@ 2007-06-15 16:51       ` H. Peter Anvin
  2007-06-15 17:03         ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 19+ messages in thread
From: H. Peter Anvin @ 2007-06-15 16:51 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Eric W. Biederman, Vivek Goyal, Rusty Russell, Andi Kleen, v12n,
	lkml, Andrew Morton, Xen-Devel

Jeremy Fitzhardinge wrote:
> H. Peter Anvin wrote:
>> Jeremy Fitzhardinge wrote:
>>   
>>> -setup_move_size: .word  0x8000		# size to move, when setup is not
>>> +setup_move_size: .word  _setup_size	# size to move, when setup is not
>>>  					# loaded at 0x90000. We will move setup
>>>  					# to 0x90000 then just before jumping
>>>  					# into the kernel. However, only the
>>>     
>> This is WRONG and will break 2.00 protocol bootloaders, if any still
>> exist, and quite possibly some 2.01 protocol bootloaders.  There are
>> definitely bootloaders in the field that rely on this implicit value.   
> 
> Ah, I see.  I didn't see any documentation saying that this must be
> 0x8000.  Or does _setup_size just have to be <= 0x8000?
> 

The default for unaware bootloaders has been 0x8000 since the boot
protocol was created, and bootloaders are known to (improperly) rely on
it.  _setup_size does have to be <= 0x8000, but that's another issue.
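
The two constraints above can be written as one check.  This is only a
sketch under the assumptions stated in this message: the header field
keeps the value 0x8000 that unaware bootloaders assume, and the setup
code itself fits within that size.  The names LEGACY_SETUP_MOVE_SIZE
and setup_header_is_legacy_safe() are hypothetical and exist neither in
the kernel nor in the patch.

```c
#include <stdint.h>

/* Value that old bootloaders implicitly assume for setup_move_size. */
#define LEGACY_SETUP_MOVE_SIZE 0x8000u

/*
 * Returns 1 if a header with these values is safe for a loader that
 * hard-codes 0x8000: the field still reads 0x8000, and the setup code
 * is no larger than that.
 */
int setup_header_is_legacy_safe(uint16_t setup_move_size, uint32_t setup_size)
{
	return setup_move_size == LEGACY_SETUP_MOVE_SIZE &&
	       setup_size <= LEGACY_SETUP_MOVE_SIZE;
}
```

Under this check, changing the field to a smaller _setup_size fails the
first condition even though the second still holds.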

>>> @@ -246,7 +246,6 @@ setup2:
>>>  	jnz	1f
>>>  	movw	$0xfffc, %sp	# Make sure we're not zero
>>>  1:	movzwl	%sp, %esp	# Clear upper half of %esp
>>> -	sti
>>>     
>> Motivation, please?
>>   
> 
> We talked about this, and you said it was a mistake.  It needn't be in
> this patch; it could be separate, or just dropped as far as I'm concerned.
> 

I said it probably wouldn't hurt to drop it.  I don't believe you ever
actually explained why you wanted it dropped.

	-hpa

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 04/10] i386: clean up bzImage generation
  2007-06-15 16:51       ` H. Peter Anvin
@ 2007-06-15 17:03         ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-15 17:03 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Vivek Goyal, Rusty Russell, Andi Kleen, v12n,
	lkml, Andrew Morton, Xen-Devel

H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>   
>> H. Peter Anvin wrote:
>>     
>>> Jeremy Fitzhardinge wrote:
>>>   
>>>       
>>>> -setup_move_size: .word  0x8000		# size to move, when setup is not
>>>> +setup_move_size: .word  _setup_size	# size to move, when setup is not
>>>>  					# loaded at 0x90000. We will move setup
>>>>  					# to 0x90000 then just before jumping
>>>>  					# into the kernel. However, only the
>>>>     
>>>>         
>>> This is WRONG and will break 2.00 protocol bootloaders, if any still
>>> exist, and quite possibly some 2.01 protocol bootloaders.  There are
>>> definitely bootloaders in the field that rely on this implicit value.   
>>>       
>> Ah, I see.  I didn't see any documentation saying that this must be
>> 0x8000.  Or does _setup_size just have to be <= 0x8000?
>>
>>     
>
> The default for unaware bootloaders has been 0x8000 since the boot
> protocol was created, and bootloaders are known to (improperly) rely on
> it.  _setup_size does have to be <= 0x8000, but that's another issue.
>   

Hm, so the worst that could happen is that an old bootloader will
over-copy 0x8000 bytes rather than the specified amount?  How would that
break anything?

> I said it probably wouldn't hurt to drop it.  I don't believe you ever
> actually explained why you wanted it dropped.

Well, I don't specifically care for Xen; I don't really mind either way
in general.  I'll break it into a separate patch and we can handle it
that way.

    J

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 03/10] define ELF notes for adding to a boot image
  2007-06-15 16:40     ` Jeremy Fitzhardinge
@ 2007-06-26 20:18       ` Andrew Morton
  2007-06-26 20:21         ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2007-06-26 20:18 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: H. Peter Anvin, Eric W. Biederman, Vivek Goyal, Rusty Russell,
	Andi Kleen, v12n, lkml, Xen-Devel

On Fri, 15 Jun 2007 09:40:31 -0700
Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> H. Peter Anvin wrote:
> > Jeremy Fitzhardinge wrote:
> >   
> >> +#define EIN_PROGRAM_CHECKSUM	3 /* ip style checksum of the memory image. */
> >>     
> >
> > Why on earth use one of the weakest verificants in common use?
> 
> I don't know.  I copied this stuff from the original relocatable kernel
> patches; I think this is from Vivek.  Eric, Vivek: is there a specific
> use for these notes?
> 

ping?


Jeremy, I'll duck these patches for now, sorry.  Mainly on
i-already-have-enough-x86-stuff grounds.

There's a bit of overlap between this work and git-newsetup, but nothing
particularly serious-looking.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 03/10] define ELF notes for adding to a boot image
  2007-06-26 20:18       ` Andrew Morton
@ 2007-06-26 20:21         ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 19+ messages in thread
From: Jeremy Fitzhardinge @ 2007-06-26 20:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: H. Peter Anvin, Eric W. Biederman, Vivek Goyal, Rusty Russell,
	Andi Kleen, v12n, lkml, Xen-Devel

Andrew Morton wrote:
> Jeremy, I'll duck these patches for now, sorry.  Mainly on
> i-already-have-enough-x86-stuff grounds.
>   

That's fine.  I think it breaks x86-64 at the moment anyway.  And I 
haven't got much feedback about it.

> There's a bit of overlap between this work and git-newsetup, but nothing
> particularly serious-looking.
>   

Yes, I based it on the newsetup stuff in -mm, so there shouldn't be too 
much difference.

    J

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2007-06-26 20:22 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
2007-06-15  0:48 [PATCH 00/10] paravirt/subarchitecture boot protocol Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 01/10] update boot spec to 2.07 Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 02/10] allow linux/elf.h to be included in assembler Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 03/10] define ELF notes for adding to a boot image Jeremy Fitzhardinge
2007-06-15 16:20   ` H. Peter Anvin
2007-06-15 16:40     ` Jeremy Fitzhardinge
2007-06-26 20:18       ` Andrew Morton
2007-06-26 20:21         ` Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 04/10] i386: clean up bzImage generation Jeremy Fitzhardinge
2007-06-15 16:20   ` H. Peter Anvin
2007-06-15 16:34     ` Jeremy Fitzhardinge
2007-06-15 16:51       ` H. Peter Anvin
2007-06-15 17:03         ` Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 05/10] i386: make the bzImage payload an ELF file Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 06/10] add WEAK() for creating weak asm labels Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 07/10] always allocate space for notes Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 08/10] i386: paravirt boot sequence Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 09/10] ask the hypervisor how much space it needs reserved Jeremy Fitzhardinge
2007-06-15  0:48 ` [PATCH 10/10] xen: use boot protocol to boot xen kernel Jeremy Fitzhardinge
