linux-kernel.vger.kernel.org archive mirror
* Re : Re : Re : [PATCH] Compressed ia32 ELF file generation for loading by Gujin 1/3
@ 2007-02-06 20:30 Etienne Lorrain
  2007-02-06 20:55 ` Eric W. Biederman
  0 siblings, 1 reply; 5+ messages in thread
From: Etienne Lorrain @ 2007-02-06 20:30 UTC
  To: Eric W. Biederman; +Cc: H. Peter Anvin, vgoyal, linux-kernel

Eric W. Biederman wrote:
> Etienne Lorrain <etienne_lorrain@yahoo.fr> writes:
>>  Well, if the function called at offset 0 in the real-mode section returns
>>  nonzero, that is probably an error - that is the Gujin interface, but do not
>>  ask me to change other bootloaders to handle that.
>>  This function is called with a few parameters; one is a string pointer,
>>  for use if the function does not want to print the problem report itself.
>>  The function is called with an offset of zero in its own Intel
>>  segment.
>
> So I think there are some backwards compatibility issues with your
> real mode code interface.  Setting a new flag that says to return instead
> of printing an error message and halting would be one way to handle
> this.

 I am not sure I understand: Gujin calls a function inside the ELF real
mode program header of the Linux kernel. The whole system is still
in real mode at that point. Nothing prevents this kernel function from
deciding to print some message and not return at all, and it can
do so selectively by checking which bootloader ID called it (another
parameter). It is not done mostly because "printf" is difficult to do
in assembler; Gujin has no such problem.

> Is your real mode C code section relative such that it can be loaded
> at different addresses and still work?

  The code is relative - but not the data (first 4 Kbytes at %ds:0 and stack
 available) - but the whole lot can be loaded at any segment boundary, i.e.
 every 16 bytes. Otherwise Gujin would not work under DOS.
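
The 16-byte granularity follows from real-mode address arithmetic. A minimal
sketch, not taken from Gujin (the helper names linear_address and segment_for
are made up for illustration): code linked at offset 0 keeps the same offsets
wherever it is placed, and only the segment register value changes.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode linear address: segment * 16 + offset. */
static uint32_t linear_address(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Hypothetical helper: the segment value that makes offset 0 point at
 * `linear`.  Only paragraph-aligned (16-byte) addresses below 1 MB
 * can be expressed this way. */
static uint16_t segment_for(uint32_t linear)
{
    assert(linear < 0x100000u && (linear & 0xFu) == 0);
    return (uint16_t)(linear >> 4);
}
```

Under these assumptions a loader that copies the real-mode part to linear
address 0x12340 would far-call 0x1234:0, and every offset inside the code
stays exactly as the linker emitted it.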

> The program header is for executable loaders the section header is for
> linkers and the section header is optional in PT_LOAD and PT_DYN
> executables.  Use of the section header by a loader is a bug.

  Unless there is a problem, Gujin uses only the program header;
 it has a look at the section header just in case - that can be removed
 easily. I was wondering about a relocatable image there - but I do
 not know enough about that.

> There have been limitations but mostly with respect to page table size
> and the like but they were not limitations a bootloader had to deal with.

I was talking about this comment, but it is old:
/* 0x28000*16 = 2.5 MB, conservative estimate for the current maximum */
http://lxr.linux.no/source/arch/i386/boot/tools/build.c?v=2.4.28

> >   Well, you can generate with the proposed patch and boot with SYSLINUX,
> >  Grub and LILO, but trying to clean Linux real-mode code is where I began.
>
> I haven't looked yet.  But I believe this is something that we can do,
> so long as cleaning and feature enhancement are not mixed in a bad way.
> Phrasing this another way: what we can do with the interface is
> determined by our interface to existing bootloaders and by what bootloaders
> exist.  Basically it is the rule that we need to preserve backwards
> compatibility and changes generally need to be evolutionary ones.

  Well, if you want to preserve compatibility with other bootloaders,
 it is probably possible to put some source around the ELF kernel file,
 mostly taken from Gujin if you want (GPL), but I wonder why you would
 like to be compatible with LILO.
 As for modifying the Linux real-mode assembler, nobody would even want to.

 Etienne.

* Re: Re : [PATCH] Compressed ia32 ELF file generation for loading by Gujin 1/3
@ 2007-02-09 19:42 Eric W. Biederman
  2007-02-11 13:23 ` RE : " Etienne Lorrain
  0 siblings, 1 reply; 5+ messages in thread
From: Eric W. Biederman @ 2007-02-09 19:42 UTC
  To: Etienne Lorrain; +Cc: vgoyal, H. Peter Anvin, linux-kernel

Etienne Lorrain <etienne_lorrain@yahoo.fr> writes:

>   Well, a self relocating image cannot be an ELF file because the code
>  to relocate the ELF cannot be executed at the wrong place.
>  If relocation is needed, I would rather not link vmlinux at a
>  fixed address first. In fact I wonder if we are talking of the same
>  kind of relocation: you seem to talk about "ld --pic-executable" while
>  I am thinking of "ld -r" to "locate" it at the bootloader loading time.
>   The main problem I see is that I do not have the code for that, and
>  I am going deeper/earlier into the generation of vmlinux, while comments
>  are "already you are too early, loading an ELF file is too complex for
>  a bootloader". The solution I have already is working.

To be very clear: ld --pic-executable or ld -shared is essentially
what we are talking about when we discuss building a relocatable
kernel.  Something with the properties of an ELF ET_DYN executable
that does not use an interpreter.  ld.so is the only common executable
of this type in Linux.

Loading an ELF executable is very much:
- Walk through the program headers and for each PT_LOAD segment load
  it at the address it requests.
- Jump to e_entry from the ELF header.

If you are working with a relocatable ELF object the rules become:
- Walk through the program header once to find the size and alignment
  of the chunk of memory that the linux kernel needs.
- Find a hole in the memory map that meets those requirements.
- Compute the offset between the address the image requests and the
  chosen location.
- Walk through the program headers and for each PT_LOAD segment
  add offset to the addresses and load the segment like normal.
- Jump to offset + e_entry.
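
A rough sketch of the relocatable case in C follows.  It is not the etherboot
or /sbin/kexec code; the struct and function names (phdr32, hole, image_span,
pick_offset) are hypothetical, and it assumes the segments are placed by
p_paddr and that p_align is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PT_LOAD 1u /* segment type value from the ELF specification */

/* Same field order as Elf32_Phdr, declared here to stay self-contained. */
struct phdr32 {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
};

/* Hypothetical description of one free region [start, end) in the memory map. */
struct hole { uint64_t start, end; };

/* First walk: lowest start, highest end and largest alignment of all
 * PT_LOAD segments.  Returns 0 if there is no PT_LOAD segment at all. */
static int image_span(const struct phdr32 *ph, size_t n,
                      uint64_t *lo, uint64_t *hi, uint64_t *align)
{
    int found = 0;
    *lo = UINT64_MAX; *hi = 0; *align = 1;
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type != PT_LOAD)
            continue;
        uint64_t start = ph[i].p_paddr;
        uint64_t end = start + ph[i].p_memsz;
        if (start < *lo) *lo = start;
        if (end > *hi) *hi = end;
        if (ph[i].p_align > *align) *align = ph[i].p_align;
        found = 1;
    }
    return found;
}

/* Find the first hole that fits and compute the offset to add to every
 * segment address and to e_entry.  The offset is kept a multiple of
 * `align` so each segment keeps its alignment. */
static int pick_offset(const struct hole *holes, size_t nholes,
                       uint64_t lo, uint64_t hi, uint64_t align,
                       int64_t *offset)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uint64_t lo_aligned = lo & ~(align - 1);
    uint64_t size = hi - lo_aligned;
    for (size_t i = 0; i < nholes; i++) {
        uint64_t base = (holes[i].start + align - 1) & ~(align - 1);
        if (base <= holes[i].end && holes[i].end - base >= size) {
            *offset = (int64_t)base - (int64_t)lo_aligned;
            return 1;
        }
    }
    return 0;
}
```

The second walk then copies each PT_LOAD segment to p_paddr + offset, and
control is transferred to e_entry + offset.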

This is within the scope of what a bootloader can reasonably do, and
I have implemented it in etherboot as well as /sbin/kexec. 

>> > If you cannot get a PT_LOAD
>> > section, maybe we can put a simple system in NOTE, or just create a
>> > PT_LOAD16 if the linker accepts other values.
>> 
>> My guess is that PT_LOAD16 is not an acceptable value. Putting information
>> in PT_NOTE seems interesting (As Eric already mentioned).
>
>  In fact, thinking more about that, I am going back to my implementation
>  of it, because on ia32 the interrupt vectors are at address zero and it is
>  obviously an invalid address to load an ELF for this architecture.

No special games, no special rules with the well-defined ELF components.
Either add a note, for which you can define all of the semantics yourself,
or don't do it.  That is what the notes are there for.

>  But for the linker, it is the right address to link it (being an offset
>  into a non-null segment in real mode), and because the entry point has
>  to be zero (I cannot use the ELF entry value) the program header base
>  address has to be zero.

Agreed.  Linking the object file at offset 0 and letting the real mode
segments have different bases to do your relocation is fine.

>  Anyway, your loader is (probably) written in C, so a test against zero
>  is a simple thing to do, and should be done anyway to check for an
>  incorrect ELF program header. I wonder if this NOTE program header is
>  not simply designed as an "end" marker, it does not seem to contain
>  anything, so me defining the realmode after that program header may
>  be a good idea.

We have been very sparse on the usage of ELF notes but yes they exist
and yes people do look at them.  Please dig up a copy of the ELF spec
and read up on them or look at etherboot for an example.
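
For reference, a note segment is a sequence of records, each made of three
32-bit words (namesz, descsz, type) followed by the name and the descriptor,
both padded to 4 bytes.  A minimal sketch of a lookup, assuming little-endian
ia32 data and 4-byte padding; find_note and its helpers are hypothetical
names, not etherboot code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian word (ia32 byte order). */
static uint32_t rd32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static size_t pad4(size_t n) { return (n + 3) & ~(size_t)3; }

/* Search the contents of a PT_NOTE segment for a note with the given
 * owner name and type.  Returns a pointer to its descriptor and stores
 * the descriptor size, or returns NULL if absent or malformed. */
static const unsigned char *find_note(const unsigned char *seg, size_t len,
                                      const char *name, uint32_t type,
                                      uint32_t *descsz_out)
{
    size_t pos = 0;
    while (len - pos >= 12) {
        uint32_t namesz = rd32le(seg + pos);
        uint32_t descsz = rd32le(seg + pos + 4);
        uint32_t ntype = rd32le(seg + pos + 8);
        size_t name_off = pos + 12;
        if (namesz > len - name_off)
            break;                        /* truncated name */
        size_t desc_off = name_off + pad4(namesz);
        if (desc_off > len || descsz > len - desc_off)
            break;                        /* truncated descriptor */
        if (ntype == type && namesz == strlen(name) + 1 &&
            memcmp(seg + name_off, name, namesz) == 0) {
            *descsz_out = descsz;
            return seg + desc_off;
        }
        size_t next = desc_off + pad4(descsz);
        pos = next > len ? len : next;    /* tolerate a missing final pad */
    }
    return NULL;
}
```

What goes into the descriptor is entirely up to whoever defines the note,
which is the point made above: a bootloader can look for its own owner name
and type and ignore every note it does not recognise.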

>  If you really are trying to catch an erroneous DMA into the kernel,
>  it is better to keep an exact copy of the kernel you are using somewhere
>  else to do a bit-for-bit comparison after the crash, and so not relocate.
>  Anyway if the DMA crash has crashed the exception handling area the system
>  is dead anyway.

No.  We are not trying to catch an erroneous DMA in that sense.  We
do not shut down any drivers when switching to the new kernel from
panic(), because we don't know what is broken.   Any single bit
of kernel code could cause problems, not just the exception table.
Running in an area that we have never used for DMA and is completely
reserved gives us freedom to not worry about it.  I.e. This is not
error detection but future error prevention and it works.

Before starting the new kernel we do a sha256 checksum test on the
new kernel and on our code that is running the checksum.  All of which
comes from /sbin/kexec.  Not compiled into the running kernel.  The
policy is in user space.

>> Interesting question, How does a boot loader/user decide where to load
>> the relocatable image? I think it depends on the new interesting usages
>> of the relocatable kernel. As of today, kexec knows where is reserved
>> memory region (Read from /proc/iomem) and it loads the image at the
>> start of that reserved region (Meeting alignment restrictions, if any). So
>> in this case boot loader takes the decision. May be a user option also
>> can be created, something like --load-address=0xXYZ and then people
>> can have fun loading same image at various addresses.
>
>  I think that you are asking too much for the bootloader user, and that
> is a decision he has to take *before* the crash; even me, I would select
> one address like 16 Mbytes and stick with it.

Yes.  Unfortunately there is no one value that works on all machines,
which is why we are moving to a relocatable kernel.

Currently we specify crashkernel=size@location on the kernel command
line to reserve the memory.  Hopefully we can reduce this to just
crashkernel=size and have the kernel find a reasonable hole in
the memory map to reserve.  Where that hole is, is exported to
userspace so /sbin/kexec just puts the kernel in that hole.
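
As an illustration of the userspace side, the reserved region appears as a
line in /proc/iomem.  The sketch below is not the /sbin/kexec source; it
assumes lines of the form "start-end : label" with hexadecimal addresses
and assumes the label "Crash kernel".  parse_region is a hypothetical
helper name.

```c
#include <stdio.h>
#include <string.h>

/* Parse one /proc/iomem-style line such as
 *   "  01000000-04ffffff : Crash kernel"
 * Returns 1 and fills start/end (inclusive) when the label matches,
 * 0 otherwise. */
static int parse_region(const char *line, const char *label,
                        unsigned long long *start, unsigned long long *end)
{
    int consumed = 0;
    /* %n records how many characters were consumed up to the label;
     * it stays 0 if the " : " separator is missing. */
    if (sscanf(line, " %llx-%llx : %n", start, end, &consumed) != 2 ||
        consumed == 0)
        return 0;
    const char *name = line + consumed;
    size_t n = strlen(label);
    return strncmp(name, label, n) == 0 &&
           (name[n] == '\0' || name[n] == '\n');
}
```

A caller would read /proc/iomem line by line with fgets(), call
parse_region(line, "Crash kernel", &start, &end) on each line, and treat
end - start + 1 as the size available for the new kernel.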

>  If the running Linux kernel does not erase Gujin from memory, it could
> also go back to real mode and do a "longjmp()" to return to the
> Gujin interface - but most of the times the system had a reason to
> crash (for instance a ventilator stopped working) and you can plan
> whatever you want in software...

True, and no solution is perfect.   The target is to catch the maximum
number of situations that can be caught.

Going back to real mode generally doesn't work because the BIOS gets
confused by the changes in hardware state from Linux running.  On some
systems it does work, though.  Going back to real mode especially
doesn't work well when the kernel doesn't do a clean shutdown.

In a normal context there are two practical advantages to a bootloader
speaking ELF.
1) The load address is no longer fixed at 1MB.  So if (for example) we
   want to get the performance advantages of 4MB pages all we have to
   do is tweak the  alignment and the load address to be at 4MB.
2) Being able to choose somewhere else in the memory map that works.
   If we have a big (32MB uncompressed) kernel with all of the modules
   compiled in and there is a memory hole at 15MB-16MB, the normal
   load address won't work, so the bootloader can pick another address.
   Similar problems appear when people place ACPI tables at lower
   addresses.

For a 64bit kernel I have thought that placing the kernel above 4GB has
several interesting advantages for making more memory available for
DMA accesses without needing an IOMMU.

Then we get into the cases like Xen, which have no real mode to go
through so need completely different bootloaders.

Eric


end of thread, other threads:[~2007-02-11 20:50 UTC | newest]

Thread overview: 5+ messages
2007-02-06 20:30 Re : Re : Re : [PATCH] Compressed ia32 ELF file generation for loading by Gujin 1/3 Etienne Lorrain
2007-02-06 20:55 ` Eric W. Biederman
2007-02-06 21:00   ` H. Peter Anvin
2007-02-09 19:42 Eric W. Biederman
2007-02-11 13:23 ` RE : " Etienne Lorrain
2007-02-11 20:49   ` Eric W. Biederman
