* Arm64's xen.efi vs GNU binutils (and alike)
@ 2022-07-11 14:32 Jan Beulich
  2022-07-18  7:43 ` Wei Chen
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Beulich @ 2022-07-11 14:32 UTC
  To: xen-devel
  Cc: Julien Grall, Stefano Stabellini, Volodymyr Babchuk, Bertrand Marquis

Hello,

the other day I wanted to look at the basic structure of xen.efi. First
I used my own dumping tool, which didn't work. Then I used objdump,
which appeared to work. I decided that I should look into what they do
differently, and whether I could make mine work as well, or whether
instead objdump is broken and shouldn't work on this sort of binary.
While I'm not fully certain yet, I'm leaning towards the latter. This is
supported by GNU objcopy corrupting the binary (I assume this is known
and considered okay-ish).

Many problems boil down to the (ab)use of the DOS executable header
fields, yielding an invalid header. The first 8 bytes are instructions,
with the first carefully chosen to produce 'MZ' in its low half.
(Oddly enough Xen and Linux use different insns there.) This doesn't
play well with the meaning of the respective fields in the DOS header.
Subsequently there are a number of .quad-s, some of which again yield
an invalid DOS header. I'm therefore inclined to submit a patch to
make objdump properly fail on this binary. But of course with both
Xen and Linux (and who knows who else) using this hairy approach, it
may end up necessary to continue to "support" this special case,
which is why I'm seeking your input here first.
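
To illustrate the clash described above, here is a minimal Python sketch of
how a DOS header parser would see the first 8 bytes. The two instruction
encodings are assumptions chosen for illustration (0x91005a4d is
`add x13, x18, #0x16`, 0x14000000 is a placeholder branch); the thread does
not state which instructions Xen or Linux actually use, and `dos_view` is a
hypothetical helper name.

```python
import struct

# Assumed encodings, for illustration only: the first insn carries 'MZ' in
# its low half, the second is a placeholder branch (b .).
HEAD = struct.pack("<II", 0x91005A4D, 0x14000000)

def dos_view(first8: bytes) -> dict:
    """Interpret the first 8 bytes the way a DOS header parser would."""
    e_magic, e_cblp, e_cp, e_crlc = struct.unpack_from("<2sHHH", first8, 0)
    return {
        "e_magic": e_magic,  # signature, expected b"MZ"
        "e_cblp": e_cblp,    # bytes used in the last 512-byte page
        "e_cp": e_cp,        # number of 512-byte pages in the file
        "e_crlc": e_crlc,    # number of DOS relocation entries
    }

if __name__ == "__main__":
    for field, value in dos_view(HEAD).items():
        print(field, value if isinstance(value, bytes) else hex(value))
```

With these assumed encodings the signature comes out right, but the
remaining three fields are just leftovers of the instruction encodings: a
"bytes in last page" count far above 512, a page count of zero, and a
large relocation count.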

Furthermore the fake .reloc section overlaps the file header. The
section is zero size (i.e. empty), but a reasonable PE loader might
still object to its RVA being zero.

As to objcopy: It shrinks the binary significantly in size, removes
the dummy .reloc section, re-writes fair parts of the DOS header,
and extends the NT header resulting in the file position of .text
changing. The size reduction and possibly the movement of .text may
be okay as long as the resulting binary is to only be used with UEFI
(as it's due to zapping of the embedded DTB and the unnecessary zero-
filling of .bss, afaict), but my understanding is that the other
adjustments are all fatal to the usability of the binary even on
UEFI.

I may easily have forgotten further anomalies.

Jan



* Re: Arm64's xen.efi vs GNU binutils (and alike)
  2022-07-11 14:32 Arm64's xen.efi vs GNU binutils (and alike) Jan Beulich
@ 2022-07-18  7:43 ` Wei Chen
  2022-07-18  8:37   ` Jan Beulich
  0 siblings, 1 reply; 3+ messages in thread
From: Wei Chen @ 2022-07-18  7:43 UTC
  To: Jan Beulich, xen-devel
  Cc: Julien Grall, Stefano Stabellini, Volodymyr Babchuk, Bertrand Marquis

Hi Jan,

On 2022/7/11 22:32, Jan Beulich wrote:
> Hello,
> 
> the other day I wanted to look at the basic structure of xen.efi. First
> I used my own dumping tool, which didn't work. Then I used objdump,
> which appeared to work. I decided that I should look into what they do
> differently, and whether I could make mine work as well, or whether
> instead objdump is broken and shouldn't work on this sort of binary.
> While I'm not fully certain yet, I'm leaning towards the latter. This is
> supported by GNU objcopy corrupting the binary (I assume this is known
> and considered okay-ish).
> 

Did you use x86's objcopy? AArch64 GNU objcopy does not support any
PE format file. So I'm curious about the version of objcopy you are using.

> Many problems boil down to the (ab)use of the DOS executable header
> fields, yielding an invalid header. The first 8 bytes are instructions,
> with the first carefully chosen to produce 'MZ' in its low half.
> (Oddly enough Xen and Linux use different insns there.) This doesn't
> play well with the meaning of the respective fields in the DOS header.

UEFI executables are regular PE32/PE32+ images; Arm64 EFI applications
use machine type 0xAA64. PE32/PE32+ requires images to have a DOS header
either (option #1) for backwards compatibility or (option #2) to prevent
images from being run under DOS. I think Arm64 EFI applications select
option #2. In that case I don't understand why we need a valid DOS header.
As I understand it, we just need 'MZ' to identify the file type, plus the
"offset to the PE header" field.
The other fields have been re-used for other purposes when loading the
Xen image as a plain binary, and lots of bootloaders follow this header
format to load Xen (or Linux, or other Arm64 OS/VMM) images. Therefore,
it is currently not possible to construct a valid DOS header.

> Subsequently there are a number of .quad-s, some of which again yield
> an invalid DOS header. I'm therefore inclined to submit a patch to
> make objdump properly fail on this binary. But of course with both

I have not managed to dump the Xen image with objdump successfully; I
always use xen-syms with objdump. Sorry, maybe I didn't understand your
question clearly.

> Xen and Linux (and who knows who else) using this hairy approach, it
> may end up necessary to continue to "support" this special case,
> which is why I'm seeking your input here first.
> 

Yes, like I said above, most OSs, VMMs and bootloaders currently follow 
this format and boot protocol. Therefore, it is difficult for us to 
completely remove it all at once.



> Furthermore the fake .reloc section overlaps the file header. The
> section is zero size (i.e. empty), but a reasonable PE loader might
> still object to its RVA being zero.
> 

I am not very clear about "overlaps". Is it because we are setting
PointerToRelocations to zero?

Cheers,
Wei Chen

> As to objcopy: It shrinks the binary significantly in size, removes
> the dummy .reloc section, re-writes fair parts of the DOS header,
> and extends the NT header resulting in the file position of .text
> changing. The size reduction and possibly the movement of .text may
> be okay as long as the resulting binary is to only be used with UEFI
> (as it's due to zapping of the embedded DTB and the unnecessary zero-
> filling of .bss, afaict), but my understanding is that the other
> adjustments are all fatal to the usability of the binary even on
> UEFI.
> 
> I may easily have forgotten further anomalies.
> 
> Jan
> 



* Re: Arm64's xen.efi vs GNU binutils (and alike)
  2022-07-18  7:43 ` Wei Chen
@ 2022-07-18  8:37   ` Jan Beulich
  0 siblings, 0 replies; 3+ messages in thread
From: Jan Beulich @ 2022-07-18  8:37 UTC
  To: Wei Chen
  Cc: Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Bertrand Marquis, xen-devel

On 18.07.2022 09:43, Wei Chen wrote:
> On 2022/7/11 22:32, Jan Beulich wrote:
>> the other day I wanted to look at the basic structure of xen.efi. First
>> I used my own dumping tool, which didn't work. Then I used objdump,
>> which appeared to work. I decided that I should look into what they do
>> differently, and whether I could make mine work as well, or whether
>> instead objdump is broken and shouldn't work on this sort of binary.
>> While I'm not fully certain yet, I'm leaning towards the latter. This is
>> supported by GNU objcopy corrupting the binary (I assume this is known
>> and considered okay-ish).
>>
> 
> Did you use x86's objcopy? AArch64 GNU objcopy does not support any
> PE format file. So I'm curious about the version of objcopy you are using.

I did use an aarch64 one, yes. Are you sure (partial) support wasn't
added? I've had no error messages, just a corrupt output binary. Btw,
this is 2.38's entry in bfd/config.bfd:

  aarch64-*-linux* | aarch64-*-netbsd*)
    targ_defvec=aarch64_elf64_le_vec
    targ_selvecs="aarch64_elf64_be_vec aarch64_elf32_le_vec aarch64_elf32_be_vec arm_elf32_le_vec arm_elf32_be_vec aarch64_pei_vec"
    want64=true
    ;;

Clearly PEI (the name used in GNU binutils) is included by default.

>> Many problems boil down to the (ab)use of the DOS executable header
>> fields, yielding an invalid header. The first 8 bytes are instructions,
>> with the first carefully chosen to produce 'MZ' in its low half.
>> (Oddly enough Xen and Linux use different insns there.) This doesn't
>> play well with the meaning of the respective fields in the DOS header.
> 
> UEFI executables are regular PE32/PE32+ images; Arm64 EFI applications
> use machine type 0xAA64. PE32/PE32+ requires images to have a DOS header
> either (option #1) for backwards compatibility or (option #2) to prevent
> images from being run under DOS. I think Arm64 EFI applications select
> option #2. In that case I don't understand why we need a valid DOS header.
> As I understand it, we just need 'MZ' to identify the file type, plus the
> "offset to the PE header" field.

This last step requires reading a field from the DOS header which hasn't
always been part of it. Therefore one first needs to establish whether the
field is actually inside the header. Yet the fields used to determine
header size have been re-used (abused).
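
A minimal Python sketch of such a strict lookup, under the assumption that
header size is taken from e_cparhdr (the 16-bit "header size in 16-byte
paragraphs" field at offset 8); the mail does not name the fields it has in
mind, and `pe_offset_strict` is a hypothetical helper, not code from any
existing tool.

```python
import struct

E_LFANEW_OFF = 0x3C  # offset of the 32-bit "offset to PE header" field

def pe_offset_strict(image: bytes) -> int:
    """Return e_lfanew, but only if the DOS header claims to contain it."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ image")
    # Assumption: e_cparhdr (offset 8) is what determines the header size.
    (e_cparhdr,) = struct.unpack_from("<H", image, 8)
    header_bytes = e_cparhdr * 16
    if header_bytes < E_LFANEW_OFF + 4:
        raise ValueError(
            f"DOS header of {header_bytes} bytes cannot hold e_lfanew"
        )
    (e_lfanew,) = struct.unpack_from("<I", image, E_LFANEW_OFF)
    return e_lfanew
```

A header whose bytes at offset 8 were re-used for something else (and
happen to be zero, say) fails this check even though offset 0x3C holds a
perfectly usable value, which is the conflict described above.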

> The other fields have been re-used for other purposes when loading the
> Xen image as a plain binary, and lots of bootloaders follow this header
> format to load Xen (or Linux, or other Arm64 OS/VMM) images. Therefore,
> it is currently not possible to construct a valid DOS header.

Which would carry the implication that well-behaved PE processing tools
should refuse to work with these binaries.

>> Subsequently there are a number of .quad-s, some of which again yield
>> an invalid DOS header. I'm therefore inclined to submit a patch to
>> make objdump properly fail on this binary. But of course with both
> 
> I have not managed to dump the Xen image with objdump successfully; I
> always use xen-syms with objdump. Sorry, maybe I didn't understand your
> question clearly.

xen-syms is an ELF binary. That's of course easily dumpable. The
question is very specifically about xen.efi, which ought to be a valid
binary (and hence possible to process by tools understanding the
format), but isn't really. As a result the question is: Should GNU
binutils be able to deal with this half-broken format? Imo the
answer can only be yes (requiring all tools to properly handle it)
or no (suggesting that all tools properly refuse to work with it).

>> Xen and Linux (and who knows who else) using this hairy approach, it
>> may end up necessary to continue to "support" this special case,
>> which is why I'm seeking your input here first.
>>
> 
> Yes, like I said above, most OSs, VMMs and bootloaders currently follow 
> this format and boot protocol. Therefore, it is difficult for us to 
> completely remove it all at once.
> 
> 
> 
>> Furthermore the fake .reloc section overlaps the file header. The
>> section is zero size (i.e. empty), but a reasonable PE loader might
>> still object to its RVA being zero.
>>
> 
> I am not very clear about "overlaps". Is it because we are setting
> PointerToRelocations to zero?

What is PointerToRelocations? There's an NT header field (entry 5
of the Data Directory) which is supposed to hold the same address as
the .reloc section's RVA. And it is the .reloc section's RVA being
zero which makes that section live at the same address as the image
header (both at RVA 0). The section being zero size, it can
effectively be put anywhere, and hence I cannot see why it isn't put
at a valid address (outside of the header). As long as it comes
ahead of .text in the section table, it would e.g. be fine to live
at the same RVA as .text. (Note how on x86 we had to adjust the RVAs
of .debug_* to match the general expectation of RVAs of successive
sections to never go backwards. Without that linker script
adjustment GNU ld would have produced a broken and unusable binary.)
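
The two layout expectations mentioned above (no section RVA inside the
headers, and RVAs never going backwards) can be sketched as a small Python
check. This is an illustration only: `check_section_rvas` is a hypothetical
helper, the field offsets follow the PE/COFF layout, and using
SizeOfHeaders as the boundary of "the header" is an assumption rather than
something any particular loader is known to enforce.

```python
import struct

def check_section_rvas(image: bytes) -> list[str]:
    """Return complaints about section RVAs in a PE image."""
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)   # e_lfanew
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("no PE signature at e_lfanew")
    coff = pe_off + 4                                   # 20-byte COFF header
    (nsect,) = struct.unpack_from("<H", image, coff + 2)
    (opt_size,) = struct.unpack_from("<H", image, coff + 16)
    opt = coff + 20                                     # optional (NT) header
    # SizeOfHeaders sits at offset 60 for both PE32 and PE32+.
    (size_of_headers,) = struct.unpack_from("<I", image, opt + 60)
    table = opt + opt_size                              # section table

    problems = []
    prev_rva = 0
    for i in range(nsect):
        hdr = table + 40 * i                            # 40 bytes per entry
        name = image[hdr:hdr + 8].rstrip(b"\0").decode("ascii", "replace")
        # VirtualSize at +8 is not needed here; VirtualAddress (RVA) at +12.
        (rva,) = struct.unpack_from("<I", image, hdr + 12)
        if rva < size_of_headers:
            problems.append(f"{name}: RVA {rva:#x} lies inside the headers")
        if rva < prev_rva:
            problems.append(f"{name}: RVA {rva:#x} goes backwards")
        prev_rva = rva
    return problems
```

Under this check a zero-size .reloc at RVA 0 is flagged, while the same
section placed at the RVA of a following .text passes, matching the
suggestion made above.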

Jan

