* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
       [not found]   ` <1428653687.11559.5.camel@nilsson.home.kraxel.org>
@ 2015-04-10 10:06     ` Laszlo Ersek
  2015-04-10 11:04       ` [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF] Laszlo Ersek
  2015-05-26 14:36       ` [Qemu-devel] [edk2] syslinux vs. OVMF Michael Tokarev
  0 siblings, 2 replies; 12+ messages in thread
From: Laszlo Ersek @ 2015-04-10 10:06 UTC
  To: Gerd Hoffmann
  Cc: edk2-devel, Michael Tokarev, qemu devel list, BALATON Zoltan

On 04/10/15 10:14, Gerd Hoffmann wrote:
>   Hi,
> 
>> In summary, please ask Gerd to rebuild the ipxe binaries that are
>> bundled with upstream qemu such that they include those two iPXE patches
>> of ours (see the last reference).
> 
> https://www.kraxel.org/cgit/qemu/log/?h=rebase/roms-next
> 
> Can you give this a try?

Thank you for this update, I tested it.

(1) I reproduced the issue, so that I could be sure that the fix wasn't
meaningless. Indeed the bug reproduces with the iPXE binaries bundled
with upstream qemu.

I then checked out, built and installed your branch, and tried again,
with virtio-net and then e1000.

(2) Virtio-net results:
- OVMF        loads shim.efi    via network
- shim.efi    loads grubx64.efi via network
- grubx64.efi loads grub.cfg    via network
- grubx64.efi loads vmlinuz     via network

However, while grubx64.efi loads initrd.img via the network, qemu
crashes the guest, with the following message:

qemu-system-x86_64: Guest moved used index from 46499 to 65534

This is a virtio protocol bug in the guest (efi-virtio.rom), *or* in
QEMU. I don't know.
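
For context, the wording of the message suggests that the ring index
published by the guest ran further ahead of the index qemu had last
consumed than the queue could possibly hold. The sketch below is not
qemu's code; it only redoes the 16-bit arithmetic on the two numbers from
the log. QUEUE_SIZE (256) and both function names are assumptions made
for illustration.

```python
# Sketch of a 16-bit ring-index sanity check. Not taken from qemu.
QUEUE_SIZE = 256  # assumed ring size; the thread does not state it


def pending_heads(last_seen_idx, guest_idx):
    """New ring entries, computed modulo 2**16 like a uint16_t subtraction."""
    return (guest_idx - last_seen_idx) & 0xFFFF


def index_jump_is_plausible(last_seen_idx, guest_idx, queue_size=QUEUE_SIZE):
    """A guest cannot have queued more entries than the ring has slots."""
    return pending_heads(last_seen_idx, guest_idx) <= queue_size


print(pending_heads(46499, 65534))            # 19035
print(index_jump_is_plausible(46499, 65534))  # False
print(index_jump_is_plausible(65535, 1))      # True: wraparound by 2 entries
```

A jump of 19035 entries is far beyond any plausible ring size, which is
consistent with garbage having been written over the ring rather than
with an off-by-one in index handling.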

* e1000 results:
- OVMF        loads shim.efi    via network
- shim.efi    loads grubx64.efi via network
- grubx64.efi loads grub.cfg    via network
- grubx64.efi loads vmlinuz     via network
- grubx64.efi loads initrd.img  via network
- guest kernel boots

So, I think the update is fine in general; but maybe there's a new
virtio-related bug in either "efi-virtio.rom" or in QEMU.

(When I originally wrote the (earlier versions of the) patches, I tested
them with virtio-net using RHEL-7 qemu, so I guess this could be an
upstream QEMU regression. The machine type I used for testing was
pc-i440fx-2.3.)

(3) ... Confirmed, this is a qemu regression. Namely, I checked your new
efi-virtio.rom with RHEL-7 qemu, and it works fine. CC'ing qemu-devel.

(4) Independently -- can you please update the commit message on
"roms/ipxe-patches/0002-efi-make-load-file-protocol-optional.patch"?

The code is indeed all yours in that patch, but I did the initial
analysis and the first fix. Can you please add a Suggested-by, and
reference (or even: include) the analysis in
<http://lists.ipxe.org/pipermail/ipxe-devel/2015-February/003979.html>?

Thanks!
Laszlo

* [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF]
  2015-04-10 10:06     ` [Qemu-devel] [edk2] syslinux vs. OVMF Laszlo Ersek
@ 2015-04-10 11:04       ` Laszlo Ersek
  2015-04-10 14:36         ` Laszlo Ersek
  2015-05-26 14:36       ` [Qemu-devel] [edk2] syslinux vs. OVMF Michael Tokarev
  1 sibling, 1 reply; 12+ messages in thread
From: Laszlo Ersek @ 2015-04-10 11:04 UTC
  To: qemu devel list; +Cc: Gerd Hoffmann

On 04/10/15 12:06, Laszlo Ersek wrote:
> (3) ... Confirmed, this is a qemu regression. Namely, I checked your new
> efi-virtio.rom with RHEL-7 qemu, and it works fine. CC'ing qemu-devel.

Small update, before I start bisecting it: the bug does not reproduce
with "-netdev bridge".

It seems to be specific to "-netdev tap". Further, "vhost=on" seems to
play no role, "-netdev tap" reproduces the error both with and without
vhost=on.
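
The combinations tried so far can be written down as a small matrix. This
is a sketch that only assembles qemu command-line fragments; the id
"net0" and the helper names are assumptions, and the rest of the command
line (firmware, disks, host-side tap/bridge setup) is omitted.

```python
from itertools import product


def netdev_args(backend, vhost=False):
    """Return the -netdev/-device fragment for one test configuration."""
    opts = f"{backend},id=net0"          # "net0" is an arbitrary, assumed id
    if backend == "tap" and vhost:
        opts += ",vhost=on"              # vhost is an option of the tap backend
    return ["-netdev", opts, "-device", "virtio-net-pci,netdev=net0"]


def build_matrix():
    """Enumerate the distinct backend/vhost combinations mentioned above."""
    matrix = []
    for backend, vhost in product(("tap", "bridge"), (False, True)):
        args = netdev_args(backend, vhost)
        if args not in matrix:           # bridge ignores vhost: skip duplicate
            matrix.append(args)
    return matrix


for args in build_matrix():
    print(" ".join(args))
```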

Thanks
Laszlo

* Re: [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF]
  2015-04-10 11:04       ` [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF] Laszlo Ersek
@ 2015-04-10 14:36         ` Laszlo Ersek
  2015-04-10 19:56           ` Laszlo Ersek
  0 siblings, 1 reply; 12+ messages in thread
From: Laszlo Ersek @ 2015-04-10 14:36 UTC
  To: qemu devel list; +Cc: Gerd Hoffmann

On 04/10/15 13:04, Laszlo Ersek wrote:
> Small update, before I start bisecting it: the bug does not reproduce
> with "-netdev bridge".
> 
> It seems to be specific to "-netdev tap". Further, "vhost=on" seems to
> play no role, "-netdev tap" reproduces the error both with and without
> vhost=on.

This is creepy.

It was not easy to bisect, because machine type "pc-i440fx-2.3" is obviously not available in e.g. v2.2.0.

Ultimately I realized that machine type pc-i440fx-2.0 does not reproduce the error, even with current master.

So I picked machine type pc-i440fx-2.1, and bisected the interval between the introduction of "pc-i440fx-2.1" (commit 3458b2b0) and current master (commit 6a460ed1). Log attached.
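
The bisection described here is a binary search between a known-good and
a known-bad commit. Below is a simplified sketch over a linear list (real
git history is a graph, and git bisect copes with merges); first_bad and
is_bad are hypothetical names, and the integers merely stand in for
commits.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (culprit, number_of_tests).
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid        # the regression is at mid or earlier
        else:
            good = mid       # the regression is after mid
    return commits[bad], steps


# Toy history: 1000 "commits", with the regression introduced at number 700.
culprit, steps = first_bad(list(range(1000)), lambda c: c >= 700)
print(culprit, steps)
```

Each test halves the remaining interval, which is why the attached log
needs only about a dozen good/bad verdicts for a large commit range.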

The result makes me question my sanity, or at least that I issued the correct "git bisect bad" and "git bisect good" commands. This is the culprit:

commit 18045fb9f457a0f0cba2bd113c748a2dcb4ed39e
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Mon Jul 28 17:34:16 2014 +0200

    pc: future-proof migration-compatibility of ACPI tables
    
    This patch avoids that similar changes break QEMU again in the future.
    QEMU will now hard-code 64k as the maximum ACPI table size, which
    (despite being an order of magnitude smaller than 640k) should be enough
    for everyone.
    
    Reviewed-by: Laszlo Ersek <lersek@redhat.com>
    Tested-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

How?!

Anyway, then I patched qemu, on top of current master, still sticking with machine type "pc-i440fx-2.1", as follows:

-----------
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1fe7bfb..6cb00a2 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -344,6 +344,7 @@ static void pc_compat_2_1(MachineState *machine)
     x86_cpu_compat_set_features("core2duo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
     x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
     pcms->enforce_aligned_dimm = false;
+    legacy_acpi_table_size = 6652;
 }
 
 static void pc_compat_2_0(MachineState *machine)
-----------

Incredibly, this made the crash go away.

Without this patch (i.e. when it crashes), the fw_cfg file called "etc/acpi/tables" has size 0x20000. With the patch (which happens to suppress the crash for some reason), the same fw_cfg file has size 0x2000 (1/16th). This is consistent with the branches in acpi_build(). (Note that the warning block visible there, in the second branch, is never printed.)
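
The two sizes can be related with a little arithmetic. This is a minimal sketch, assuming (it is not stated above) that the legacy branch pads the blob up to a 4 KiB boundary; PAGE and align_up are names introduced here for illustration only.

```python
PAGE = 0x1000  # assumed 4 KiB padding granularity (hypothetical)


def align_up(size, alignment=PAGE):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


legacy_size = align_up(6652)   # legacy_acpi_table_size from the patch above
default_size = 0x20000         # fw_cfg file size observed without the patch

print(hex(legacy_size))                            # 0x2000
print(default_size // legacy_size)                 # 16
print(default_size // PAGE, legacy_size // PAGE)   # 32 2
```

Under that assumption the patched value lands exactly on the 0x2000 reported above, and the ratio is the 1/16th mentioned.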

It seems very unlikely that qemu is doing anything wrong. The difference in the fw_cfg file size causes a differently sized memory allocation in OVMF, which displaces further allocations by 1 page (4KB). For example, "1af41000.efi" (the iPXE virtio-net driver) is also loaded 4KB higher than before. But that doesn't directly explain why grub places garbage in the virtio-net ring while it downloads "initrd.img".

Anyway I think we can rule out any qemu regression at this point. It's a bug in some other component that the different memory map (due to the larger, 0x20000 allocation) exposes.

Thanks,
Laszlo

[-- Attachment #2: bisect.log --]
[-- Type: text/x-log, Size: 2471 bytes --]

git bisect start
# bad: [6a460ed18a3fda0eb2d9c96b8b01817b4dcbded4] configure: disable Archipelago by default and warn about libxseg GPLv3 license
git bisect bad 6a460ed18a3fda0eb2d9c96b8b01817b4dcbded4
# good: [3458b2b075f92f163ccb9a1f24733eb5705947f0] pc: add 2.1 machine type
git bisect good 3458b2b075f92f163ccb9a1f24733eb5705947f0
# bad: [ed173cb704f01a62143a3ef0dcf8b493bc795c23] .travis.yml: remove "make check" from main matrix
git bisect bad ed173cb704f01a62143a3ef0dcf8b493bc795c23
# good: [089a39486f2c47994c6c0d34ac7abf34baf40d9d] Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
git bisect good 089a39486f2c47994c6c0d34ac7abf34baf40d9d
# bad: [39ba3bf69c4ef4d8a8b683ee7282efd25b3f01ff] qcow2: fix new_blocks double-free in alloc_refcount_block()
git bisect bad 39ba3bf69c4ef4d8a8b683ee7282efd25b3f01ff
# good: [4bce526ec4b88362a684fd858e0e14c83ddf0db4] target-ppc: KVMPPC_H_CAS fix cpu-version endianess
git bisect good 4bce526ec4b88362a684fd858e0e14c83ddf0db4
# bad: [a9047ec3f6ab56295cba5b07e0d46cded9e2a7ff] hw/arm/boot: Set PC correctly when loading AArch64 ELF files
git bisect bad a9047ec3f6ab56295cba5b07e0d46cded9e2a7ff
# good: [82172b751929314a81337aa91deea82e8297af1f] tests/Makefile: Only run vhost-user-test on Linux
git bisect good 82172b751929314a81337aa91deea82e8297af1f
# good: [3a18d449836d21dee60439b154056cca9a3b6aee] Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
git bisect good 3a18d449836d21dee60439b154056cca9a3b6aee
# bad: [18045fb9f457a0f0cba2bd113c748a2dcb4ed39e] pc: future-proof migration-compatibility of ACPI tables
git bisect bad 18045fb9f457a0f0cba2bd113c748a2dcb4ed39e
# good: [3b257486639cf6c25e1f3a744d1f19e6b4efdc7a] Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
git bisect good 3b257486639cf6c25e1f3a744d1f19e6b4efdc7a
# good: [c60a57ff497667780132a3fcdc1500c83af5d5c0] Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
git bisect good c60a57ff497667780132a3fcdc1500c83af5d5c0
# good: [cb348985abd3673b40c8af069c3e3b84f547b6f7] bios-tables-test: fix ASL normalization false positive
git bisect good cb348985abd3673b40c8af069c3e3b84f547b6f7
# good: [093a35e5fc0c60508e8c754ae81572090365723d] acpi-build: minor code cleanup
git bisect good 093a35e5fc0c60508e8c754ae81572090365723d
# first bad commit: [18045fb9f457a0f0cba2bd113c748a2dcb4ed39e] pc: future-proof migration-compatibility of ACPI tables

* Re: [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF]
  2015-04-10 14:36         ` Laszlo Ersek
@ 2015-04-10 19:56           ` Laszlo Ersek
  0 siblings, 0 replies; 12+ messages in thread
From: Laszlo Ersek @ 2015-04-10 19:56 UTC
  To: qemu devel list; +Cc: Gerd Hoffmann

On 04/10/15 16:36, Laszlo Ersek wrote:

> Anyway I think we can rule out any qemu regression at this point.
> It's a bug in some other component that the different memory map (due
> to the larger, 0x20000 allocation) exposes.

It was an iPXE issue:

http://thread.gmane.org/gmane.network.ipxe.devel/3918

Thanks & sorry about the noise.
Laszlo

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-04-10 10:06     ` [Qemu-devel] [edk2] syslinux vs. OVMF Laszlo Ersek
  2015-04-10 11:04       ` [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF] Laszlo Ersek
@ 2015-05-26 14:36       ` Michael Tokarev
  2015-05-26 16:49         ` Laszlo Ersek
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2015-05-26 14:36 UTC
  To: Laszlo Ersek, Gerd Hoffmann; +Cc: edk2-devel, qemu devel list, BALATON Zoltan

10.04.2015 13:06, Laszlo Ersek wrote:
> On 04/10/15 10:14, Gerd Hoffmann wrote:
>>   Hi,
>>
>>> In summary, please ask Gerd to rebuild the ipxe binaries that are
>>> bundled with upstream qemu such that they include those two iPXE patches
>>> of ours (see the last reference).
>>
>> https://www.kraxel.org/cgit/qemu/log/?h=rebase/roms-next
>>
>> Can you give this a try?
> 
> Thank you for this update, I tested it.
> 
> (1) I reproduced the issue, so that I could be sure that the fix wasn't
> meaningless. Indeed the bug reproduces with the iPXE binaries bundled
> with upstream qemu.
> 
[]
> * e1000 results:
> - OVMF        loads shim.efi    via network
> - shim.efi    loads grubx64.efi via network
> - grubx64.efi loads grub.cfg    via network
> - grubx64.efi loads vmlinuz     via network
> - grubx64.efi loads initrd.img  via network
> - guest kernel boots
> 
> So, I think the update is fine in general...

However, after the update of efi roms in qemu, the original problem
of booting syslinux in OVMF still persists.  I received several
private messages asking whether I had succeeded in resolving the
original problem outlined at

 http://www.syslinux.org/archives/2014-November/022804.html

and I always referred to this thread, until someone told me that
the update does not fix the issue.  Now I verified it locally,
and no, I still can't use syslinux with OVMF with qemu efi roms,
getting exactly the same output as I've seen on Nov-2014.
As you checked, grub loads, but apparently syslinux still doesn't.

Is it a different issue perhaps?

Thanks,

/mjt

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-05-26 14:36       ` [Qemu-devel] [edk2] syslinux vs. OVMF Michael Tokarev
@ 2015-05-26 16:49         ` Laszlo Ersek
  2015-05-26 17:04           ` Michael Tokarev
  0 siblings, 1 reply; 12+ messages in thread
From: Laszlo Ersek @ 2015-05-26 16:49 UTC
  To: Michael Tokarev, Gerd Hoffmann
  Cc: edk2-devel, qemu devel list, BALATON Zoltan

On 05/26/15 16:36, Michael Tokarev wrote:
> However, after the update of efi roms in qemu, the original problem
> of booting syslinux in OVMF still persists.  I received several
> private messages asking whether I had succeeded in resolving the
> original problem outlined at
> 
>  http://www.syslinux.org/archives/2014-November/022804.html
> 
> and I always referred to this thread, until someone told me that
> the update does not fix the issue.  Now I verified it locally,
> and no, I still can't use syslinux with OVMF with qemu efi roms,
> getting exactly the same output as I've seen on Nov-2014.

If you are getting *exactly* the same output as in the message
referenced above, complete with the iPXE banner, then you're not using
the right (updated) iPXE binaries. (I think Gerd's patches implementing
the update have not been merged into upstream qemu yet? The most recent
patch from Gerd, under pc-bios/, is
c246cee4eedb17ae3932d699e009a8b63240235f. Unrelated, and too old.)

I'm saying this because, if you had everything in place, then the iPXE
banner would *not* be printed. iPXE would not hijack the boot flow "as
usual", it would only provide an SNP (Simple Network Protocol)
implementation for edk2's network stack (including the PXE base code
driver). And the iPXE banner would be absent.

To summarize, I've found three bugs in iPXE thus far:

- the EFI_SIMPLE_NETWORK_PROTOCOL.Transmit() and .GetStatus() interfaces
are not correctly implemented. This trips up at least grub. Fixed by
"efi_snp: improve compliance with the EFI_SIMPLE_NETWORK_PROTOCOL spec"
patch; not taken by upstream.

- iPXE's own EFI_LOAD_FILE_PROTOCOL implementation causes edk2's PXE
base code driver to become inactive / useless. See the discussion in
<http://lists.ipxe.org/pipermail/ipxe-devel/2015-February/003979.html>.
Fixed by "make load file protocol optional", and "ipxe: disable load
file protocol". Not taken by upstream. This is the bug that you are
still running into, most likely.

(The iPXE banner is printed in ipxe(), "src/usr/autoboot.c", via the
macro PRODUCT_TAG_LINE and its friends. The ipxe() function is not
called after these patches, because its caller, efi_snp_load_file(), is
never reached either.)

- NIC driver not torn down at ExitBootServices(). Fixed by (one month
old) upstream iPXE commit 755d2b8f. This bug becomes a problem only when
you actually start a runtime OS, and even then it is very sensitive to
memory layout.

Earlier I received reports about syslinux 6.03-pre20 working nicely with
OVMF's builtin virtio-net driver:

http://lukas.zapletalovi.com/2014/09/efi-in-qemu-kvm-on-fedora-20.html

Can you please verify that on your end? (Disable iPXE oprom loading with
"-device virtio-net-pci,romfile=".) That would at least narrow down the
troubles.
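
The empty romfile= property is what keeps qemu from loading the iPXE
option ROM for that NIC, so that OVMF's builtin virtio-net driver gets
used instead. A sketch of the -device fragment only; the netdev id
"net0" and the helper name are assumptions.

```python
def virtio_net_device(netdev_id="net0", load_oprom=True):
    """Build the -device fragment, optionally suppressing the option ROM."""
    dev = f"virtio-net-pci,netdev={netdev_id}"
    if not load_oprom:
        dev += ",romfile="   # empty value: no option ROM is loaded for this NIC
    return ["-device", dev]


print(" ".join(virtio_net_device(load_oprom=False)))
# -device virtio-net-pci,netdev=net0,romfile=
```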

> As you checked, grub loads, but apparently syslinux still doesn't.

I guess I'll have to set up syslinux too, and see it for myself. ;)

> Is it a different issue perhaps?

We'll see.

Thanks
Laszlo

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-05-26 16:49         ` Laszlo Ersek
@ 2015-05-26 17:04           ` Michael Tokarev
  2015-05-26 18:38             ` Laszlo Ersek
  2015-05-26 21:31             ` Michael Tokarev
  0 siblings, 2 replies; 12+ messages in thread
From: Michael Tokarev @ 2015-05-26 17:04 UTC
  To: Laszlo Ersek, Gerd Hoffmann; +Cc: edk2-devel, qemu devel list, BALATON Zoltan

26.05.2015 19:49, Laszlo Ersek wrote:
[]
>> However, after the update of efi roms in qemu, the original problem
>> of booting syslinux in OVMF still persists.  I received several
>> private messages asking whether I had succeeded in resolving the
>> original problem outlined at
>>
>>  http://www.syslinux.org/archives/2014-November/022804.html
>>
>> and I always referred to this thread, until someone told me that
>> the update does not fix the issue.  Now I verified it locally,
>> and no, I still can't use syslinux with OVMF with qemu efi roms,
>> getting exactly the same output as I've seen on Nov-2014.
> 
> If you are getting *exactly* the same output as in the message
> referenced above, complete with the iPXE banner, then you're not using

No, I mean I see the same error message "Failed to read blocks: 0xC"
after syslinux.efi load.  The banner is new, with a few changed details.

> the right (updated) iPXE binaries. (I think Gerd's patches implementing
> the update have not been merged into upstream qemu yet? The most recent
> patch from Gerd, under pc-bios/, is
> c246cee4eedb17ae3932d699e009a8b63240235f. Unrelated, and too old.)

Oh sh*t.  You're right.  Indeed, that's the last patch, and indeed
it is too old.  I guess we need http://lists.ipxe.org/pipermail/ipxe-devel/2015-March/004007.html
or some other bits from https://www.kraxel.org/cgit/qemu/log/?h=rebase/roms-next.

Somehow, since the talk was about updating binaries before the next
(2.3 at that time) release, I thought current qemu has all the necessary
bits.

> I'm saying this because, if you had everything in place, then the iPXE
> banner would *not* be printed. iPXE would not hijack the boot flow "as
> usual", it would only provide an SNP (Simple Network Protocol)
> implementation for edk2's network stack (including the PXE base code
> driver). And the iPXE banner would be absent.

Ok.  I do see a banner here, so things don't work as they should,
and that's because I don't have the latest patches, which still aren't
in qemu.

> To summarize, I've found three bugs in iPXE thus far:
> 
> - the EFI_SIMPLE_NETWORK_PROTOCOL.Transmit() and .GetStatus() interfaces
> are not correctly implemented. This trips up at least grub. Fixed by
> "efi_snp: improve compliance with the EFI_SIMPLE_NETWORK_PROTOCOL spec"
> patch; not taken by upstream.

Not taken?  Why?  Just time issues or some problem?

> - iPXE's own EFI_LOAD_FILE_PROTOCOL implementation causes edk2's PXE
> base code driver to become inactive / useless. See the discussion in
> <http://lists.ipxe.org/pipermail/ipxe-devel/2015-February/003979.html>.
> Fixed by "make load file protocol optional", and "ipxe: disable load
> file protocol". Not taken by upstream. This is the bug that you are
> still running into, most likely.

Again, why hasn't it been taken?

> (The iPXE banner is printed in ipxe(), "src/usr/autoboot.c", via the
> macro PRODUCT_TAG_LINE and its friends. The ipxe() function is not
> called after these patches, because its caller, efi_snp_load_file(), is
> never reached either.)
> 
> - NIC driver not torn down at ExitBootServices(). Fixed by (one month
> old) upstream iPXE commit 755d2b8f. This bug becomes a problem only when
> you actually start a runtime OS, and even then it is very sensitive to
> memory layout.
> 
> Earlier I received reports about syslinux 6.03-pre20 working nicely with
> OVMF's builtin virtio-net driver:
> 
> http://lukas.zapletalovi.com/2014/09/efi-in-qemu-kvm-on-fedora-20.html
> 
> Can you please verify that on your end? (Disable iPXE oprom loading with
> "-device virtio-net-pci,romfile=".) That would at least narrow down the
> troubles.

Yes, that one works.  It needs an updated OVMF, but those are details,
and you know this already as well.

>> As you checked, grub loads, but apparently syslinux still doesn't.
> 
> I guess I'll have to set up syslinux too, and see it for myself. ;)

Nah, there's no need to. Let me actually apply the patches and see.

Thanks,

/mjt

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-05-26 17:04           ` Michael Tokarev
@ 2015-05-26 18:38             ` Laszlo Ersek
  2015-05-26 20:17               ` BALATON Zoltan
  2015-05-26 21:31             ` Michael Tokarev
  1 sibling, 1 reply; 12+ messages in thread
From: Laszlo Ersek @ 2015-05-26 18:38 UTC
  To: Michael Tokarev, Gerd Hoffmann
  Cc: edk2-devel, qemu devel list, BALATON Zoltan

On 05/26/15 19:04, Michael Tokarev wrote:
> 26.05.2015 19:49, Laszlo Ersek wrote:
> []
>>> However, after the update of efi roms in qemu, the original problem
>>> of booting syslinux in OVMF still persists.  I received several
>>> private messages asking whether I had succeeded in resolving the
>>> original problem outlined at
>>>
>>>  http://www.syslinux.org/archives/2014-November/022804.html
>>>
>>> and I always referred to this thread, until someone told me that
>>> the update does not fix the issue.  Now I verified it locally,
>>> and no, I still can't use syslinux with OVMF with qemu efi roms,
>>> getting exactly the same output as I've seen on Nov-2014.
>>
>> If you are getting *exactly* the same output as in the message
>> referenced above, complete with the iPXE banner, then you're not using
> 
> No, I mean I see the same error message "Failed to read blocks: 0xC"
> after syslinux.efi load.  The banner is new, with a few changed details.

Interesting -- no clue where "Failed to read blocks" comes from. Not
syslinux, not iPXE, not shim, not grub, not edk2, not the kernel...

>> the right (updated) iPXE binaries. (I think Gerd's patches implementing
>> the update have not been merged into upstream qemu yet? The most recent
>> patch from Gerd, under pc-bios/, is
>> c246cee4eedb17ae3932d699e009a8b63240235f. Unrelated, and too old.)
> 
> Oh sh*t.  You're right.  Indeed, that's the last patch, and indeed
> it is too old.  I guess we need http://lists.ipxe.org/pipermail/ipxe-devel/2015-March/004007.html
> or some other bits from https://www.kraxel.org/cgit/qemu/log/?h=rebase/roms-next.

Gerd's "rebase/roms-next" branch should be all right, except it should
incorporate an update to iPXE 755d2b8f too, not just the earlier dc795b9:

$ git log --oneline --reverse dc795b9..755d2b8f | cat -n
 1  b12b1b6 [virtio] Downgrade per-iobuf debug messages to DBGC2
 2  a9da129 [test] Simplify digest algorithm self-tests
 3  4dbc443 [crypto] Add SHA-224 algorithm
 4  6f713c2 [crypto] Add SHA-512 algorithm
 5  0287929 [crypto] Add SHA-384 algorithm
 6  e5e91ab [crypto] Add SHA-512/256 algorithm
 7  ea3d587 [crypto] Add SHA-512/224 algorithm
 8  755d2b8 [efi] Ensure drivers are disconnected when
            ExitBootServices() is called

755d2b8f is the relevant commit. (Mentioned below.)

> Somehow, since the talk was about updating binaries before the next
> (2.3 at that time) release, I thought current qemu has all the necessary
> bits.

There was a little discussion in
<http://lists.nongnu.org/archive/html/qemu-devel/2015-04/msg01216.html>,
but the series was not merged.

> 
>> I'm saying this because, if you had everything in place, then the iPXE
>> banner would *not* be printed. iPXE would not hijack the boot flow "as
>> usual", it would only provide an SNP (Simple Network Protocol)
>> implementation for edk2's network stack (including the PXE base code
>> driver). And the iPXE banner would be absent.
> 
> Ok.  I do see a banner here, so things don't work as they should,
> and that's because I don't have the latest patches, which still aren't
> in qemu.
> 
>> To summarize, I've found three bugs in iPXE thus far:
>>
>> - the EFI_SIMPLE_NETWORK_PROTOCOL.Transmit() and .GetStatus() interfaces
>> are not correctly implemented. This trips up at least grub. Fixed by
>> "efi_snp: improve compliance with the EFI_SIMPLE_NETWORK_PROTOCOL spec"
>> patch; not taken by upstream.
> 
> Not taken?  Why?  Just time issues or some problem?

I don't know. No feedback on ipxe-devel. On IRC, the maintainer seemed
to lean towards accepting this patch, but ultimately it didn't go anywhere.

Gerd has repeatedly posted this patch to ipxe-devel (I really couldn't
handle the stress any longer, caused by the lack of maintainer
responsiveness, so I gave up -- thanks Gerd for picking it up!)

So, short answer: "no clue".

> 
>> - iPXE's own EFI_LOAD_FILE_PROTOCOL implementation causes edk2's PXE
>> base code driver to become inactive / useless. See the discussion in
>> <http://lists.ipxe.org/pipermail/ipxe-devel/2015-February/003979.html>.
>> Fixed by "make load file protocol optional", and "ipxe: disable load
>> file protocol". Not taken by upstream. This is the bug that you are
>> still running into, most likely.
> 
> Again, why hasn't it been taken?

Two reasons, in my perception:

(a) It implements the polar opposite of how the iPXE maintainer sees /
positions iPXE. "We" (well, "I", definitely) want iPXE to provide a low
level network driver for the edk2 network stack, for QEMU NICs different
from virtio-net, and then get out of the way.

Whereas the iPXE maintainer wants to provide users with the traditional
iPXE user experience, and in his opinion the edk2 network stack comes up
short in that regard -- refer to iPXE commit c7c3d839. To that end,
upstream iPXE basically hijacks the boot flow when it gets control from
the UEFI firmware, and does (in practice) what it does on legacy BIOS
systems.

(b) When I found out about the above trick in iPXE, I got quite angry,
and used somewhat inflammatory (and inexact) language in the public BZ,
and in the patch that I submitted to ipxe-devel. (In fact I think my
first patch just reverted c7c3d839.) That wasn't smart, probably.

Nonetheless, Gerd has since addressed these issues -- he made the code
in question build-time tweakable, matching the prior configuration
infrastructure of iPXE. No more angry commit messages, just some
refactoring for iPXE upstream, and flipping a macro for downstreams
(that want iPXE to provide a low level network driver and nothing else).
Plus, I provided a full analysis of the issue on ipxe-devel (link above
in the quote).

Unfortunately, even this approach got stuck on ipxe-devel.

>> (The iPXE banner is printed in ipxe(), "src/usr/autoboot.c", via the
>> macro PRODUCT_TAG_LINE and its friends. The ipxe() function is not
>> called after these patches, because its caller, efi_snp_load_file(), is
>> never reached either.)
>>
>> - NIC driver not torn down at ExitBootServices(). Fixed by (one month
>> old) upstream iPXE commit 755d2b8f. This bug becomes a problem only when
>> you actually start a runtime OS, and even then it is very sensitive to
>> memory layout.
>>
>> Earlier I received reports about syslinux 6.03-pre20 working nicely with
>> OVMF's builtin virtio-net driver:
>>
>> http://lukas.zapletalovi.com/2014/09/efi-in-qemu-kvm-on-fedora-20.html
>>
>> Can you please verify that on your end? (Disable iPXE oprom loading with
>> "-device virtio-net-pci,romfile=".) That would at least narrow down the
>> troubles.
> 
> Yes, that one works.  It needs updated OVMF, but that's details,
> and you know this already as well.
> 
>>> As you checked, grub loads, but apparently syslinux still doesn't.
>>
>> I guess I'll have to set up syslinux too, and see it for myself. ;)
> 
> Nah, there's no need to. Let me actually apply the patches and see.

Thanks, I'm curious. In fact, for a quick test, you can simply grab the
iPXE binaries from Gerd's rebase/roms-next branch. They won't include
iPXE commit 755d2b8, but that bug has the potential to cause problems only
much later than where you're stuck now.

Cheers
Laszlo

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-05-26 18:38             ` Laszlo Ersek
@ 2015-05-26 20:17               ` BALATON Zoltan
  2015-05-26 20:27                 ` Michael Tokarev
  0 siblings, 1 reply; 12+ messages in thread
From: BALATON Zoltan @ 2015-05-26 20:17 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: edk2-devel, Michael Tokarev, Gerd Hoffmann, qemu devel list

On Tue, 26 May 2015, Laszlo Ersek wrote:
> On 05/26/15 19:04, Michael Tokarev wrote:
>> No, I mean I see the same error message "Failed to read blocks: 0xC"
>> after syslinux.efi load.  The banner is new, with a few changed details.
>
> Interesting -- no clue where "Failed to read blocks" comes from. Not
> syslinux, not iPXE, not shim, not grub, not edk2, not the kernel...

I think it comes from syslinux/efi/diskio.c:41. I remember seeing this 
message before but I don't remember the context and what led to it so I
can't add much more to this.

Regards,
BALATON Zoltan

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-05-26 20:17               ` BALATON Zoltan
@ 2015-05-26 20:27                 ` Michael Tokarev
  2015-05-26 20:42                   ` BALATON Zoltan
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2015-05-26 20:27 UTC (permalink / raw)
  To: BALATON Zoltan, Laszlo Ersek; +Cc: edk2-devel, Gerd Hoffmann, qemu devel list

26.05.2015 23:17, BALATON Zoltan wrote:
> On Tue, 26 May 2015, Laszlo Ersek wrote:
>> On 05/26/15 19:04, Michael Tokarev wrote:
>>> No, I mean I see the same error message "Failed to read blocks: 0xC"
>>> after syslinux.efi load.  The banner is new, with a few changed details.
>>
>> Interesting -- no clue where "Failed to read blocks" comes from. Not
>> syslinux, not iPXE, not shim, not grub, not edk2, not the kernel...
> 
> I think it comes from syslinux/efi/diskio.c:41....

Indeed. Thank you very much for finding this.

FWIW, 0xC means RETURN_NO_MEDIA.  Which is kind of strange.
But it is the first load which is done by syslinux.efi, not
by ipxe rom.
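
To illustrate how the 0xC in the error message maps to a status name, here
is a minimal sketch. It assumes the printed value is the low part of an
edk2 RETURN_STATUS error code, whose high "error" bit may or may not have
survived formatting. The table covers only the first fifteen error codes,
and the function name decode_status is made up for this example.

```python
# Low-order values of the first edk2 RETURN_* error codes.
RETURN_ERROR_NAMES = {
    1: "RETURN_LOAD_ERROR",
    2: "RETURN_INVALID_PARAMETER",
    3: "RETURN_UNSUPPORTED",
    4: "RETURN_BAD_BUFFER_SIZE",
    5: "RETURN_BUFFER_TOO_SMALL",
    6: "RETURN_NOT_READY",
    7: "RETURN_DEVICE_ERROR",
    8: "RETURN_WRITE_PROTECTED",
    9: "RETURN_OUT_OF_RESOURCES",
    10: "RETURN_VOLUME_CORRUPTED",
    11: "RETURN_VOLUME_FULL",
    12: "RETURN_NO_MEDIA",
    13: "RETURN_MEDIA_CHANGED",
    14: "RETURN_NOT_FOUND",
    15: "RETURN_ACCESS_DENIED",
}

# Assumption: the error bit is bit 63 (64-bit build) or bit 31 (32-bit build).
ERROR_BITS = (1 << 63) | (1 << 31)


def decode_status(value):
    """Return the RETURN_* name for a raw or truncated error status."""
    code = value & ~ERROR_BITS  # drop the error bit if it is present
    return RETURN_ERROR_NAMES.get(code, "unknown (%#x)" % code)


if __name__ == "__main__":
    print(decode_status(0xC))
    print(decode_status(0x800000000000000C))
```

Both calls resolve to the same name, because only the error bit differs
between the truncated and the full 64-bit form.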

Thanks,

/mjt

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-05-26 20:27                 ` Michael Tokarev
@ 2015-05-26 20:42                   ` BALATON Zoltan
  0 siblings, 0 replies; 12+ messages in thread
From: BALATON Zoltan @ 2015-05-26 20:42 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: edk2-devel, Laszlo Ersek, Gerd Hoffmann, qemu devel list

On Tue, 26 May 2015, Michael Tokarev wrote:
> FWIW, 0xC means RETURN_NO_MEDIA.  Which is kind of strange.
> But it is the first load which is done by syslinux.efi, not
> by ipxe rom.

I vaguely remember it happens when syslinux.efi thinks it is reading a 
disk and not booting from the network so it's most likely caused by one of 
the bugs Laszlo has mentioned. I think I got this first with e1000 without 
appropriate rom but this went away with virtio-net and disabling rom 
loading by specifying empty value as suggested by Laszlo. But I could be 
wrong, I don't remember it clearly.
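
For reference, the "empty romfile" workaround can be sketched as a small
helper that assembles a QEMU command line. Only the
"-device virtio-net-pci,romfile=" part comes from this thread. The
firmware path, the netdev id "n0", the user-mode backend and the function
name netboot_args are assumptions for illustration.

```python
def netboot_args(firmware, nic_model="virtio-net-pci", disable_oprom=True):
    """Build a QEMU argument list for a network boot test."""
    device = nic_model + ",netdev=n0"
    if disable_oprom:
        # An empty romfile keeps QEMU from loading the iPXE option ROM,
        # so the firmware's own NIC driver (if it has one) is used.
        device += ",romfile="
    return [
        "qemu-system-x86_64",
        "-bios", firmware,        # hypothetical path to an OVMF image
        "-netdev", "user,id=n0",  # assumed backend; adjust to the real setup
        "-device", device,
    ]


if __name__ == "__main__":
    print(" ".join(netboot_args("OVMF.fd")))
```

Passing disable_oprom=False leaves the bundled ROM in place, which is the
configuration where the iPXE bugs discussed here would come into play.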

*After some digging*

Here's the original thread:
http://sourceforge.net/p/edk2/mailman/message/33735116/

It starts with a message explaining how it ends up on the wrong code path
that prints this error, followed by replies from Laszlo about the cause.

Regards,
BALATON Zoltan

* Re: [Qemu-devel] [edk2] syslinux vs. OVMF
  2015-05-26 17:04           ` Michael Tokarev
  2015-05-26 18:38             ` Laszlo Ersek
@ 2015-05-26 21:31             ` Michael Tokarev
  1 sibling, 0 replies; 12+ messages in thread
From: Michael Tokarev @ 2015-05-26 21:31 UTC (permalink / raw)
  To: Laszlo Ersek, Gerd Hoffmann; +Cc: edk2-devel, qemu devel list, BALATON Zoltan

26.05.2015 20:04, Michael Tokarev wrote:
> 26.05.2015 19:49, Laszlo Ersek wrote:
[]
>> the right (updated) iPXE binaries. (I think Gerd's patches implementing
>> the update have not been merged into upstream qemu yet? The most recent
>> patch from Gerd, under pc-bios/, is
>> c246cee4eedb17ae3932d699e009a8b63240235f. Unrelated, and too old.)
> 
> Oh sh*t.  You're right.  Indeed, that's the last patch, and indeed
> it is too old.  I guess we need http://lists.ipxe.org/pipermail/ipxe-devel/2015-March/004007.html
> or some other bits from https://www.kraxel.org/cgit/qemu/log/?h=rebase/roms-next.
> 
> Somehow, since the talk was about updating binaries before the next
> (2.3 at that time) release, I thought current qemu have all necessary
> bits.

After applying 2 patches from Gerd:

 efi_snp-improve-compliance-with-the-EFI_SIMPLE_NETWORK_PROTOCOL_spec.patch
 efi-make-load-file-protocol-optional.patch

syslinux works fine with the resulting efirom for qemu too,
it successfully loads kernel+initrd and boots the system.

So we're waiting for the missing ipxe bits... ;)

Thanks,

/mjt

end of thread, other threads:[~2015-05-26 21:31 UTC | newest]

Thread overview: 12+ messages
     [not found] <alpine.GSO.2.01.1504062122470.1832@mono>
     [not found] ` <5523E12E.8010103@redhat.com>
     [not found]   ` <1428653687.11559.5.camel@nilsson.home.kraxel.org>
2015-04-10 10:06     ` [Qemu-devel] [edk2] syslinux vs. OVMF Laszlo Ersek
2015-04-10 11:04       ` [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF] Laszlo Ersek
2015-04-10 14:36         ` Laszlo Ersek
2015-04-10 19:56           ` Laszlo Ersek
2015-05-26 14:36       ` [Qemu-devel] [edk2] syslinux vs. OVMF Michael Tokarev
2015-05-26 16:49         ` Laszlo Ersek
2015-05-26 17:04           ` Michael Tokarev
2015-05-26 18:38             ` Laszlo Ersek
2015-05-26 20:17               ` BALATON Zoltan
2015-05-26 20:27                 ` Michael Tokarev
2015-05-26 20:42                   ` BALATON Zoltan
2015-05-26 21:31             ` Michael Tokarev
