All of lore.kernel.org
 help / color / mirror / Atom feed
* Multiboot ELF Loader - Alignment
@ 2021-10-21 19:46 Chris Plant
  2021-10-26 12:34 ` Daniel Kiper
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Plant @ 2021-10-21 19:46 UTC (permalink / raw)
  To: grub-devel

Hello,

I'm continuing to play with Multiboot2 on ARM64 on a Raspberry Pi, and
I've run into something which I'm trying to understand.

I have an ELF file for my kernel which has two segments (I have no idea
why rust is giving me two segments, but it is).  If I boot directly
into the ELF file from the Pi firmware it boots fine, but if I boot via
GRUB I have issues with data corruption in .rodata which is in the
second segment.  The first segment (.text) appears to load correctly.

Digging into this, the ELF headers specify a load base address of
0x801060 for the second segment, but GRUB allocates and loads this to
0x802000.  If I change my linker to align the second segment onto
0x1000 everything works fine.

Is this alignment to 0x1000 a defined behaviour of the GRUB allocator
or Multiboot2?  I'm assuming it's not a problem with the ELF file as
it's generated by usual means and runs fine otherwise.

Thanks,

Chris




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Multiboot ELF Loader - Alignment
  2021-10-21 19:46 Multiboot ELF Loader - Alignment Chris Plant
@ 2021-10-26 12:34 ` Daniel Kiper
  2021-10-26 13:51   ` Chris Plant
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Kiper @ 2021-10-26 12:34 UTC (permalink / raw)
  To: Chris Plant; +Cc: grub-devel

Hey,

On Thu, Oct 21, 2021 at 08:46:06PM +0100, Chris Plant wrote:
> Hello,
>
> I'm continuing to play with Multiboot2 on ARM64 on a Raspberry Pi, and
> I've run into something which I'm trying to understand.

Why do you use Multiboot2 on ARM64?

> I have an ELF file for my kernel which has two segments (I have no idea
> why rust is giving me two segments, but it is).  If I boot directly
> into the ELF file from the Pi firmware it boots fine, but if I boot via
> GRUB I have issues with data corruption in .rodata which is in the
> second segment.  The first segment (.text) appears to load correctly.

Could you share the output from "readelf -Wa <kernel>"? Additionally,
how is your Multiboot2 header defined in the kernel?

> Digging into this, the ELF headers specify a load base address of
> 0x801060 for the second segment, but GRUB allocates and loads this to
> 0x802000.  If I change my linker to align the second segment onto
> 0x1000 everything works fine.

The Multiboot2 protocol does not tolerate "holes" in the image. You can
find good explanation what I mean by that here [1].

> Is this alignment to 0x1000 a defined behaviour of the GRUB allocator
> or Multiboot2?  I'm assuming it's not a problem with the ELF file as
> it's generated by usual means and runs fine otherwise.

I think you should take closer look at grub-core/loader/multiboot_elfxx.c
and especially lines starting from 258. It seems to me something around
here overwrites part of the image in the memory.

Daniel

[1] https://lists.xenproject.org/archives/html/xen-devel/2021-05/msg01003.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Multiboot ELF Loader - Alignment
  2021-10-26 12:34 ` Daniel Kiper
@ 2021-10-26 13:51   ` Chris Plant
  2021-10-29 19:16     ` Chris Plant
  2021-11-03 15:42     ` Daniel Kiper
  0 siblings, 2 replies; 7+ messages in thread
From: Chris Plant @ 2021-10-26 13:51 UTC (permalink / raw)
  To: Daniel Kiper; +Cc: grub-devel

Thanks for the detailed reply, I think I’ve got to the bottom of this now, I realised what I missed just after I sent this email.

> On 26 Oct 2021, at 13:35, Daniel Kiper <dkiper@net-space.pl> wrote:
> 
> Hey,
> 
>> On Thu, Oct 21, 2021 at 08:46:06PM +0100, Chris Plant wrote:
>> Hello,
>> 
>> I'm continuing to play with Multiboot2 on ARM64 on a Raspberry Pi, and
>> I've run into something which I'm trying to understand.
> 
> Why do you use Multiboot2 on ARM64?

I guess the question is why wouldn’t I?  Is the assumption that I should use ARM32 and switch after booting?

> 
>> I have an ELF file for my kernel which has two segments (I have no idea
>> why rust is giving me two segments, but it is).  If I boot directly
>> into the ELF file from the Pi firmware it boots fine, but if I boot via
>> GRUB I have issues with data corruption in .rodata which is in the
>> second segment.  The first segment (.text) appears to load correctly.
> 
> Could you share the output from "readelf -Wa <kernel>"? Additionally,
> how is your Multiboot2 header defined in the kernel?

I don’t have this to hand but I will share when I get back.  The multiboot2 header is defined in ASM at the very head of my code.

> 
>> Digging into this, the ELF headers specify a load base address of
>> 0x801060 for the second segment, but GRUB allocates and loads this to
>> 0x802000.  If I change my linker to align the second segment onto
>> 0x1000 everything works fine.
> 
> The Multiboot2 protocol does not tolerate "holes" in the image. You can
> find good explanation what I mean by that here [1].

I’ll look into this.

> 
>> Is this alignment to 0x1000 a defined behaviour of the GRUB allocator
>> or Multiboot2?  I'm assuming it's not a problem with the ELF file as
>> it's generated by usual means and runs fine otherwise.
> 
> I think you should take closer look at grub-core/loader/multiboot_elfxx.c
> and especially lines starting from 258. It seems to me something around
> here overwrites part of the image in the memory.
> 

I did this and I now understand what is causing the issue.  The memory allocator in grub allocates pages and hence aligns the memory region to 0x1000.  When the allocator returns it gives the start of the region allocated and the segment is loaded to that address, not the address specified by the elf file.

The workaround I’ve used is to align your second segment to 0x1000 or to only use one segment.

Should the defined behaviour be to require segments to be aligned on pages, or should grub load as per the elf file?

Chris


> Daniel
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2021-05/msg01003.html



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Multiboot ELF Loader - Alignment
  2021-10-26 13:51   ` Chris Plant
@ 2021-10-29 19:16     ` Chris Plant
  2021-11-03 16:38       ` Daniel Kiper
  2021-11-03 15:42     ` Daniel Kiper
  1 sibling, 1 reply; 7+ messages in thread
From: Chris Plant @ 2021-10-29 19:16 UTC (permalink / raw)
  To: The development of GNU GRUB, Daniel Kiper

Daniel,

I'm back home now and I have access to my PC again, so I can answer
more completely.

On Tue, 2021-10-26 at 14:51 +0100, Chris Plant wrote:
> Thanks for the detailed reply, I think I’ve got to the bottom of this
> now, I realised what I missed just after I sent this email.
> 
> > On 26 Oct 2021, at 13:35, Daniel Kiper <dkiper@net-space.pl> wrote:
> > 
> > Hey,
> > 
> > > On Thu, Oct 21, 2021 at 08:46:06PM +0100, Chris Plant wrote:
> > > Hello,
> > > 
> > > I'm continuing to play with Multiboot2 on ARM64 on a Raspberry
> > > Pi, and
> > > I've run into something which I'm trying to understand.
> > 
> > Why do you use Multiboot2 on ARM64?
> 
> I guess the question is why wouldn’t I?  Is the assumption that I
> should use ARM32 and switch after booting?
> 
> > 
> > > I have an ELF file for my kernel which has two segments (I have
> > > no idea
> > > why rust is giving me two segments, but it is).  If I boot
> > > directly
> > > into the ELF file from the Pi firmware it boots fine, but if I
> > > boot via
> > > GRUB I have issues with data corruption in .rodata which is in
> > > the
> > > second segment.  The first segment (.text) appears to load
> > > correctly.
> > 
> > Could you share the output from "readelf -Wa <kernel>"?
> > Additionally,
> > how is your Multiboot2 header defined in the kernel?
> 
> I don’t have this to hand but I will share when I get back.  The
> multiboot2 header is defined in ASM at the very head of my code.

I've used readelf -la rather than -Wa to avoid including the debugging
symbols which I don't believe add any value here.

First, the original, which fails to boot correctly:

Elf file type is EXEC (Executable file)
Entry point 0x800000
There are 3 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000200 0x0000000000800000
0x0000000000800000
                 0x0000000000000fd4 0x0000000000000fd4  R E    0x100
  LOAD           0x00000000000011e0 0x0000000000800fe0
0x0000000000800fe0
                 0x00000000000000e0 0x00000000000000e0  R      0x10
  GNU_STACK      0x0000000000000000 0x0000000000000000
0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x0

 Section to Segment mapping:
  Segment Sections...
   00     .text
   01     .rodata
   02

And the revised, working code, with ld script used to force alignment
of the second segment to 0x1000 (i.e. page aligned):

Elf file type is EXEC (Executable file)
Entry point 0x800000
There are 3 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000200 0x0000000000800000
0x0000000000800000
                 0x0000000000000fd4 0x0000000000000fd4  R E    0x100
  LOAD           0x00000000000011e0 0x0000000000801000
0x0000000000801000
                 0x0000000000001000 0x0000000000001000  R      0x10
  GNU_STACK      0x0000000000000000 0x0000000000000000
0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x0

 Section to Segment mapping:
  Segment Sections...
   00     .text
   01     .rodata
   02
> 
> > 
> > > Digging into this, the ELF headers specify a load base address of
> > > 0x801060 for the second segment, but GRUB allocates and loads
> > > this to
> > > 0x802000.  If I change my linker to align the second segment onto
> > > 0x1000 everything works fine.
> > 
> > The Multiboot2 protocol does not tolerate "holes" in the image. You
> > can
> > find good explanation what I mean by that here [1].
> 
> I’ll look into this.

I've read the article and I think my "fixed" linking *is* an image with
a hole as you describe, since we now ask for loading at 0x800000 - less
than 0x801000, and then from 0x801000 to whatever.

> 
> > 
> > > Is this alignment to 0x1000 a defined behaviour of the GRUB
> > > allocator
> > > or Multiboot2?  I'm assuming it's not a problem with the ELF file
> > > as
> > > it's generated by usual means and runs fine otherwise.
> > 
> > I think you should take closer look at grub-
> > core/loader/multiboot_elfxx.c
> > and especially lines starting from 258. It seems to me something
> > around
> > here overwrites part of the image in the memory.
> > 
> 
> I did this and I now understand what is causing the issue.  The
> memory allocator in grub allocates pages and hence aligns the memory
> region to 0x1000.  When the allocator returns it gives the start of
> the region allocated and the segment is loaded to that address, not
> the address specified by the elf file.
> 
> The workaround I’ve used is to align your second segment to 0x1000 or
> to only use one segment.
> 
> Should the defined behaviour be to require segments to be aligned on
> pages, or should grub load as per the elf file?

The issue is in grub-core/loader/multiboot_elfxx.c, line 134:

for (i = 0; i < ehdr->e_phnum; i++)

...

If the image is not relocatable, grub tries to allocate each segment
individually (line 149) (if it is relocatable, grub allocates the
entire requested, continuous block in one allocation, line 112)
  
err = grub_relocator_alloc_chunk_addr (GRUB_MULTIBOOT (relocator), &ch,
                            phdr(i)->p_paddr, phdr(i)->p_memsz);

Although we ask for the base address to be the segment load address,
the relocator allocates pages, and hence the result is page aligned.

We then convert this to a virtual address at line 158:

	      source = get_virtual_current_address (ch);

We then load the segment to the allocated region from line 149/158 at
line 167:

      if (grub_file_read (mld->file, (grub_uint8_t *) source +
load_offset, phdr(i)->p_filesz)
		  != (grub_ssize_t) phdr(i)->p_filesz)

So there is no check that the allocated region is the same as the load
address in the linked file, and because we allocate by page, if there
is > 1 segment, your segments must be page aligned for this code to
work.

I'm happy to fix this, mostly by copying the approach used for
relocatable code, which fits with your comments about there being no
holes in the image.

Thanks again,

Chris

> 
> Chris
> 
> 
> > Daniel
> > 
> > [1]  
> > https://lists.xenproject.org/archives/html/xen-devel/2021-05/msg01003.html
> 
> 
> _______________________________________________
> Grub-devel mailing list
> Grub-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Multiboot ELF Loader - Alignment
  2021-10-26 13:51   ` Chris Plant
  2021-10-29 19:16     ` Chris Plant
@ 2021-11-03 15:42     ` Daniel Kiper
  1 sibling, 0 replies; 7+ messages in thread
From: Daniel Kiper @ 2021-11-03 15:42 UTC (permalink / raw)
  To: Chris Plant; +Cc: grub-devel

On Tue, Oct 26, 2021 at 02:51:20PM +0100, Chris Plant wrote:
> Thanks for the detailed reply, I think I’ve got to the bottom of this now, I realised what I missed just after I sent this email.
> > On 26 Oct 2021, at 13:35, Daniel Kiper <dkiper@net-space.pl> wrote:
> >
> > Hey,
> >
> >> On Thu, Oct 21, 2021 at 08:46:06PM +0100, Chris Plant wrote:
> >> Hello,
> >>
> >> I'm continuing to play with Multiboot2 on ARM64 on a Raspberry Pi, and
> >> I've run into something which I'm trying to understand.
> >
> > Why do you use Multiboot2 on ARM64?
>
> I guess the question is why wouldn’t I?  Is the assumption that I should use ARM32 and switch after booting?

It is difficult to say anything if I do not know the details about your
use case. That is why I am asking. And I am in particular interested in
what kind of "firmware" you are using for your project. UEFI? Anything
else?

Daniel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Multiboot ELF Loader - Alignment
  2021-10-29 19:16     ` Chris Plant
@ 2021-11-03 16:38       ` Daniel Kiper
  2021-11-04  8:25         ` Chris Plant
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Kiper @ 2021-11-03 16:38 UTC (permalink / raw)
  To: Chris Plant; +Cc: The development of GNU GRUB

On Fri, Oct 29, 2021 at 08:16:36PM +0100, Chris Plant wrote:
> Daniel,
>
> I'm back home now and I have access to my PC again, so I can answer
> more completely.
>
> On Tue, 2021-10-26 at 14:51 +0100, Chris Plant wrote:
> > Thanks for the detailed reply, I think I’ve got to the bottom of this
> > now, I realised what I missed just after I sent this email.
> >
> > > On 26 Oct 2021, at 13:35, Daniel Kiper <dkiper@net-space.pl> wrote:
> > >
> > > Hey,
> > >
> > > > On Thu, Oct 21, 2021 at 08:46:06PM +0100, Chris Plant wrote:
> > > > Hello,
> > > >
> > > > I'm continuing to play with Multiboot2 on ARM64 on a Raspberry
> > > > Pi, and
> > > > I've run into something which I'm trying to understand.
> > >
> > > Why do you use Multiboot2 on ARM64?
> >
> > I guess the question is why wouldn’t I?  Is the assumption that I
> > should use ARM32 and switch after booting?
> >
> > >
> > > > I have an ELF file for my kernel which has two segments (I have
> > > > no idea
> > > > why rust is giving me two segments, but it is).  If I boot
> > > > directly
> > > > into the ELF file from the Pi firmware it boots fine, but if I
> > > > boot via
> > > > GRUB I have issues with data corruption in .rodata which is in
> > > > the
> > > > second segment.  The first segment (.text) appears to load
> > > > correctly.
> > >
> > > Could you share the output from "readelf -Wa <kernel>"?
> > > Additionally,
> > > how is your Multiboot2 header defined in the kernel?
> >
> > I don’t have this to hand but I will share when I get back.  The
> > multiboot2 header is defined in ASM at the very head of my code.
>
> I've used readelf -la rather than -Wa to avoid including the debugging
> symbols which I don't believe add any value here.
>
> First, the original, which fails to boot correctly:
>
> Elf file type is EXEC (Executable file)
> Entry point 0x800000
> There are 3 program headers, starting at offset 64
>
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000000200 0x0000000000800000
> 0x0000000000800000
>                  0x0000000000000fd4 0x0000000000000fd4  R E    0x100
>   LOAD           0x00000000000011e0 0x0000000000800fe0
> 0x0000000000800fe0
>                  0x00000000000000e0 0x00000000000000e0  R      0x10
>   GNU_STACK      0x0000000000000000 0x0000000000000000
> 0x0000000000000000
>                  0x0000000000000000 0x0000000000000000  RW     0x0
>
>  Section to Segment mapping:
>   Segment Sections...
>    00     .text
>    01     .rodata
>    02
>
> And the revised, working code, with ld script used to force alignment
> of the second segment to 0x1000 (i.e. page aligned):
>
> Elf file type is EXEC (Executable file)
> Entry point 0x800000
> There are 3 program headers, starting at offset 64
>
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000000200 0x0000000000800000
> 0x0000000000800000
>                  0x0000000000000fd4 0x0000000000000fd4  R E    0x100
>   LOAD           0x00000000000011e0 0x0000000000801000
> 0x0000000000801000
>                  0x0000000000001000 0x0000000000001000  R      0x10
>   GNU_STACK      0x0000000000000000 0x0000000000000000
> 0x0000000000000000
>                  0x0000000000000000 0x0000000000000000  RW     0x0
>
>  Section to Segment mapping:
>   Segment Sections...
>    00     .text
>    01     .rodata
>    02
> >
> > >
> > > > Digging into this, the ELF headers specify a load base address of
> > > > 0x801060 for the second segment, but GRUB allocates and loads
> > > > this to
> > > > 0x802000.  If I change my linker to align the second segment onto
> > > > 0x1000 everything works fine.
> > >
> > > The Multiboot2 protocol does not tolerate "holes" in the image. You
> > > can
> > > find good explanation what I mean by that here [1].
> >
> > I’ll look into this.
>
> I've read the article and I think my "fixed" linking *is* an image with
> a hole as you describe, since we now ask for loading at 0x800000 - less
> than 0x801000, and then from 0x801000 to whatever.

Yep!

> > > > Is this alignment to 0x1000 a defined behaviour of the GRUB
> > > > allocator
> > > > or Multiboot2?  I'm assuming it's not a problem with the ELF file
> > > > as
> > > > it's generated by usual means and runs fine otherwise.
> > >
> > > I think you should take closer look at grub-
> > > core/loader/multiboot_elfxx.c
> > > and especially lines starting from 258. It seems to me something
> > > around
> > > here overwrites part of the image in the memory.
> > >
> >
> > I did this and I now understand what is causing the issue.  The
> > memory allocator in grub allocates pages and hence aligns the memory
> > region to 0x1000.  When the allocator returns it gives the start of
> > the region allocated and the segment is loaded to that address, not
> > the address specified by the elf file.
> >
> > The workaround I’ve used is to align your second segment to 0x1000 or
> > to only use one segment.
> >
> > Should the defined behaviour be to require segments to be aligned on
> > pages, or should grub load as per the elf file?
>
> The issue is in grub-core/loader/multiboot_elfxx.c, line 134:
>
> for (i = 0; i < ehdr->e_phnum; i++)
>
> ...
>
> If the image is not relocatable, grub tries to allocate each segment
> individually (line 149) (if it is relocatable, grub allocates the
> entire requested, continuous block in one allocation, line 112)
>   
> err = grub_relocator_alloc_chunk_addr (GRUB_MULTIBOOT (relocator), &ch,
>                             phdr(i)->p_paddr, phdr(i)->p_memsz);
>
> Although we ask for the base address to be the segment load address,
> the relocator allocates pages, and hence the result is page aligned.

I am not fully convinced. Could you check where exactly this happens in
the GRUB memory allocator? Additionally, could you check what happens
if you use grub_relocator_alloc_chunk_align() instead of
grub_relocator_alloc_chunk_addr() here? Of course the arguments for the
grub_relocator_alloc_chunk_align() should in theory lead to the same
allocation which should be done by the grub_relocator_alloc_chunk_addr().

> We then convert this to a virtual address at line 158:
>
> 	      source = get_virtual_current_address (ch);
>
> We then load the segment to the allocated region from line 149/158 at
> line 167:
>
>       if (grub_file_read (mld->file, (grub_uint8_t *) source +
> load_offset, phdr(i)->p_filesz)
> 		  != (grub_ssize_t) phdr(i)->p_filesz)
>
> So there is no check that the allocated region is the same as the load
> address in the linked file, and because we allocate by page, if there
> is > 1 segment, your segments must be page aligned for this code to
> work.

It seems to me we should introduce checks here which compare allocated
address with the requested one. Hmmm...I think the
grub_relocator_alloc_chunk_addr() should fail if it is not able to
fulfill the request. It should not introduce any alignments because caller
asked for exact address. If this does not work in that way it is a bug.

> I'm happy to fix this, mostly by copying the approach used for
> relocatable code, which fits with your comments about there being no
> holes in the image.

I think you should check [1] first. The most interesting thing is the
note at the bottom of this section: This information does not need to be
provided if the kernel image is in ELF format, but it must be provided
if the image is in a.out format or in some other format. When the
address tag is present it must be used in order to load the image,
regardless of whether an ELF header is also present. Compliant boot
loaders must be able to load images that are either in ELF format or
contain the address tag embedded in the Multiboot2 header.

So, my understating is that the ELF header has to be ignored if the address
tag of Multiboot2 header is present. If this is not the case this is the
bug. Though if both are present they values should be aligned properly
to not make any confusion.

I think it would help if you share your Multiboot2 header with us.

Daniel

[1] https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html#Address-header-tag


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Multiboot ELF Loader - Alignment
  2021-11-03 16:38       ` Daniel Kiper
@ 2021-11-04  8:25         ` Chris Plant
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Plant @ 2021-11-04  8:25 UTC (permalink / raw)
  To: The development of GNU GRUB

On Wed, 2021-11-03 at 17:38 +0100, Daniel Kiper wrote:
> On Fri, Oct 29, 2021 at 08:16:36PM +0100, Chris Plant wrote:
> > Daniel,
> > 
> > I'm back home now and I have access to my PC again, so I can answer
> > more completely.
> > 
> > On Tue, 2021-10-26 at 14:51 +0100, Chris Plant wrote:
> > > Thanks for the detailed reply, I think I’ve got to the bottom of
> > > this
> > > now, I realised what I missed just after I sent this email.
> > > 
> > > > On 26 Oct 2021, at 13:35, Daniel Kiper <dkiper@net-space.pl>
> > > > wrote:
> > > > 
> > > > Hey,
> > > > 
> > > > > On Thu, Oct 21, 2021 at 08:46:06PM +0100, Chris Plant wrote:
> > > > > Hello,
> > > > > 
> > > > > I'm continuing to play with Multiboot2 on ARM64 on a
> > > > > Raspberry
> > > > > Pi, and
> > > > > I've run into something which I'm trying to understand.
> > > > 
> > > > Why do you use Multiboot2 on ARM64?
> > > 
> > > I guess the question is why wouldn’t I?  Is the assumption that I
> > > should use ARM32 and switch after booting?

To tie the two email chains back together (apologies, I was on holiday
so replied piecemeal), I want to use multiboot (hence grub), and then
EFI gives me access to the FDT for hardware info across multiple
hardware platforms, is the idea.  Mostly I'm just playing at this, so
there is no grand plan, but it might be useful eventually.

Hence ARM64 and EFI.  It seems to work and I've patched a few things
within grub to enable this and fix a couple of bugs (e.g. ELF section
tag alignment on ARM64 caused by mismatch between GRUB headers and
multiboot spec).

> > > 
> > > > 
> > > > > I have an ELF file for my kernel which has two segments (I
> > > > > have
> > > > > no idea
> > > > > why rust is giving me two segments, but it is).  If I boot
> > > > > directly
> > > > > into the ELF file from the Pi firmware it boots fine, but if
> > > > > I
> > > > > boot via
> > > > > GRUB I have issues with data corruption in .rodata which is
> > > > > in
> > > > > the
> > > > > second segment.  The first segment (.text) appears to load
> > > > > correctly.
> > > > 
> > > > Could you share the output from "readelf -Wa <kernel>"?
> > > > Additionally,
> > > > how is your Multiboot2 header defined in the kernel?
> > > 
> > > I don’t have this to hand but I will share when I get back.  The
> > > multiboot2 header is defined in ASM at the very head of my code.
> > 
> > I've used readelf -la rather than -Wa to avoid including the
> > debugging
> > symbols which I don't believe add any value here.
> > 
> > First, the original, which fails to boot correctly:
> > 
> > Elf file type is EXEC (Executable file)
> > Entry point 0x800000
> > There are 3 program headers, starting at offset 64
> > 
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags 
> > Align
> >   LOAD           0x0000000000000200 0x0000000000800000
> > 0x0000000000800000
> >                  0x0000000000000fd4 0x0000000000000fd4  R E   
> > 0x100
> >   LOAD           0x00000000000011e0 0x0000000000800fe0
> > 0x0000000000800fe0
> >                  0x00000000000000e0 0x00000000000000e0  R      0x10
> >   GNU_STACK      0x0000000000000000 0x0000000000000000
> > 0x0000000000000000
> >                  0x0000000000000000 0x0000000000000000  RW     0x0
> > 
> >  Section to Segment mapping:
> >   Segment Sections...
> >    00     .text
> >    01     .rodata
> >    02
> > 
> > And the revised, working code, with ld script used to force
> > alignment
> > of the second segment to 0x1000 (i.e. page aligned):
> > 
> > Elf file type is EXEC (Executable file)
> > Entry point 0x800000
> > There are 3 program headers, starting at offset 64
> > 
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags 
> > Align
> >   LOAD           0x0000000000000200 0x0000000000800000
> > 0x0000000000800000
> >                  0x0000000000000fd4 0x0000000000000fd4  R E   
> > 0x100
> >   LOAD           0x00000000000011e0 0x0000000000801000
> > 0x0000000000801000
> >                  0x0000000000001000 0x0000000000001000  R      0x10
> >   GNU_STACK      0x0000000000000000 0x0000000000000000
> > 0x0000000000000000
> >                  0x0000000000000000 0x0000000000000000  RW     0x0
> > 
> >  Section to Segment mapping:
> >   Segment Sections...
> >    00     .text
> >    01     .rodata
> >    02
> > > 
> > > > 
> > > > > Digging into this, the ELF headers specify a load base
> > > > > address of
> > > > > 0x801060 for the second segment, but GRUB allocates and loads
> > > > > this to
> > > > > 0x802000.  If I change my linker to align the second segment
> > > > > onto
> > > > > 0x1000 everything works fine.
> > > > 
> > > > The Multiboot2 protocol does not tolerate "holes" in the image.
> > > > You
> > > > can
> > > > find good explanation what I mean by that here [1].
> > > 
> > > I’ll look into this.
> > 
> > I've read the article and I think my "fixed" linking *is* an image
> > with
> > a hole as you describe, since we now ask for loading at 0x800000 -
> > less
> > than 0x801000, and then from 0x801000 to whatever.
> 
> Yep!



> 
> > > > > Is this alignment to 0x1000 a defined behaviour of the GRUB
> > > > > allocator
> > > > > or Multiboot2?  I'm assuming it's not a problem with the ELF
> > > > > file
> > > > > as
> > > > > it's generated by usual means and runs fine otherwise.
> > > > 
> > > > I think you should take closer look at grub-
> > > > core/loader/multiboot_elfxx.c
> > > > and especially lines starting from 258. It seems to me
> > > > something
> > > > around
> > > > here overwrites part of the image in the memory.
> > > > 
> > > 
> > > I did this and I now understand what is causing the issue.  The
> > > memory allocator in grub allocates pages and hence aligns the
> > > memory
> > > region to 0x1000.  When the allocator returns it gives the start
> > > of
> > > the region allocated and the segment is loaded to that address,
> > > not
> > > the address specified by the elf file.
> > > 
> > > The workaround I’ve used is to align your second segment to
> > > 0x1000 or
> > > to only use one segment.
> > > 
> > > Should the defined behaviour be to require segments to be aligned
> > > on
> > > pages, or should grub load as per the elf file?
> > 
> > The issue is in grub-core/loader/multiboot_elfxx.c, line 134:
> > 
> > for (i = 0; i < ehdr->e_phnum; i++)
> > 
> > ...
> > 
> > If the image is not relocatable, grub tries to allocate each
> > segment
> > individually (line 149) (if it is relocatable, grub allocates the
> > entire requested, continuous block in one allocation, line 112)
> >   
> > err = grub_relocator_alloc_chunk_addr (GRUB_MULTIBOOT (relocator),
> > &ch,
> >                             phdr(i)->p_paddr, phdr(i)->p_memsz);
> > 
> > Although we ask for the base address to be the segment load
> > address,
> > the relocator allocates pages, and hence the result is page
> > aligned.
> 
> I am not fully convinced. Could you check where exactly this happens
> in
> the GRUB memory allocator? Additionally, could you check what happens
> if you use grub_relocator_alloc_chunk_align() instead of
> grub_relocator_alloc_chunk_addr() here? Of course the arguments for
> the
> grub_relocator_alloc_chunk_align() should in theory lead to the same
> allocation which should be done by the
> grub_relocator_alloc_chunk_addr().
> 

Completely agree, I'll look into this a bit more and confirm the
behaviour.  

As I say, every ELF image I've built from C has one segment and hence
I've not seen this issue to date, but for some reason rust makes two
segments!

> > We then convert this to a virtual address at line 158:
> > 
> >               source = get_virtual_current_address (ch);
> > 
> > We then load the segment to the allocated region from line 149/158
> > at
> > line 167:
> > 
> >       if (grub_file_read (mld->file, (grub_uint8_t *) source +
> > load_offset, phdr(i)->p_filesz)
> >                   != (grub_ssize_t) phdr(i)->p_filesz)
> > 
> > So there is no check that the allocated region is the same as the
> > load
> > address in the linked file, and because we allocate by page, if
> > there
> > is > 1 segment, your segments must be page aligned for this code to
> > work.
> 
> It seems to me we should introduce checks here which compare
> allocated
> address with the requested one. Hmmm...I think the
> grub_relocator_alloc_chunk_addr() should fail if it is not able to
> fulfill the request. It should not introduce any alignments because
> caller
> asked for exact address. If this does not work in that way it is a
> bug.

Agreed, I'll include in any patch when I get there

> 
> > I'm happy to fix this, mostly by copying the approach used for
> > relocatable code, which fits with your comments about there being
> > no
> > holes in the image.
> 
> I think you should check [1] first. The most interesting thing is the
> note at the bottom of this section: This information does not need to
> be
> provided if the kernel image is in ELF format, but it must be
> provided
> if the image is in a.out format or in some other format. When the
> address tag is present it must be used in order to load the image,
> regardless of whether an ELF header is also present. Compliant boot
> loaders must be able to load images that are either in ELF format or
> contain the address tag embedded in the Multiboot2 header.
> 
> So, my understating is that the ELF header has to be ignored if the
> address
> tag of Multiboot2 header is present. If this is not the case this is
> the
> bug. Though if both are present they values should be aligned
> properly
> to not make any confusion.
> 
> I think it would help if you share your Multiboot2 header with us.

See below, not very complex at all and no load address.  The same boot
header is used for the C kernels and the rust kernels, defined in
assembly in the initial boot code.  It's lifted from the multiboot2
specification, and the EFI_BS is commented out.

multiboot2_header:
	/* Multiboot2 magic for GRUB */
    .long	MULTIBOOT2_HEADER_MAGIC
	/* ISA: aarch64 */
    .long  MULTIBOOT2_ARCHITECTURE_AARCH64 
	/* Header length */
    .long  multiboot2_header_end - multiboot2_header
	/* Checksum */
	.long	-(MULTIBOOT2_HEADER_MAGIC +
MULTIBOOT2_ARCHITECTURE_AARCH64 + (multiboot2_header_end -
multiboot2_header))
framebuffer_tag_start:
	.short	MULTIBOOT_HEADER_TAG_FRAMEBUFFER
	.short	MULTIBOOT_HEADER_TAG_OPTIONAL
	.long	framebuffer_tag_end - framebuffer_tag_start
	.long	1024
	.long	768
	.long 	32
framebuffer_tag_end:
	/* Padding to 8 byte boundary */
	.long 	0
end_tag_start:
	.short	MULTIBOOT_HEADER_TAG_END
	.short	0
	.long	8
multiboot2_header_end:

Thanks,

Chris

> 
> Daniel
> 
> [1]     
> https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html#Address-header-tag
> 
> _______________________________________________
> Grub-devel mailing list
> Grub-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-11-04  8:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-21 19:46 Multiboot ELF Loader - Alignment Chris Plant
2021-10-26 12:34 ` Daniel Kiper
2021-10-26 13:51   ` Chris Plant
2021-10-29 19:16     ` Chris Plant
2021-11-03 16:38       ` Daniel Kiper
2021-11-04  8:25         ` Chris Plant
2021-11-03 15:42     ` Daniel Kiper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.