* VW ELF loader
@ 2020-02-01 13:39 Alexey Kardashevskiy
  2020-02-01 19:04 ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-01 13:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Stefano Garzarella, David Gibson

Hi!

In my effort to "kill SLOF" (the PPC pseries guest firmware), I have reached the stage where QEMU needs to load GRUB from 
the disk. The current workaround is to read it from qcow2, save it to a file and then call load_elf(). Not nice.

There are two problems with that.

1. When load_elf calls address_space_write(), I need to know where and how much RAM was used so I can mark this memory 
"used" for the OF client interface (/memory@0/available FDT property). So I'll need a "preload()" hook.

2. (bigger) GRUB comes from a PReP partition, which is 8MB. load_elf{32|64} takes a filename, not a memory pointer or a 
"read_fn" callback, so I thought I needed a "read_fn" callback.
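
To illustrate problem 1, here is a minimal sketch of the bookkeeping behind an "available"-style list of
(address, size) pairs: claiming a loaded range removes it from the free list. MemRange and mem_claim() are
hypothetical names for illustration, not existing QEMU code, and the sketch assumes addr + size does not overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* One (address, size) pair, as listed in an "available"-style property. */
typedef struct {
    uint64_t addr;
    uint64_t size;
} MemRange;

/*
 * Remove [addr, addr + size) from the free list r[] holding *n entries
 * (capacity cap). Returns 0 on success, -1 if a split needs more room.
 * Hypothetical helper, not an existing QEMU function.
 */
static int mem_claim(MemRange *r, size_t *n, size_t cap,
                     uint64_t addr, uint64_t size)
{
    uint64_t end = addr + size;
    size_t i = 0;

    while (i < *n) {
        uint64_t rs = r[i].addr;
        uint64_t re = r[i].addr + r[i].size;

        if (end <= rs || addr >= re) {
            i++;                            /* no overlap with this entry */
            continue;
        }
        if (addr <= rs && end >= re) {      /* fully covered: drop entry */
            r[i] = r[*n - 1];
            (*n)--;
            continue;                       /* re-check the moved entry */
        }
        if (addr <= rs) {                   /* claim covers the front */
            r[i].addr = end;
            r[i].size = re - end;
        } else if (end >= re) {             /* claim covers the tail */
            r[i].size = addr - rs;
        } else {                            /* claim is in the middle: split */
            if (*n == cap) {
                return -1;
            }
            r[i].size = addr - rs;
            r[*n].addr = end;
            r[*n].size = re - end;
            (*n)++;
        }
        i++;
    }
    return 0;
}
```

A "preload()" hook would call something like mem_claim() once per loaded segment before the property is written
into the device tree.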

And then I discovered that load_elf actually maps the passed file. And here I got lost.

Why doesn't load_elf just map the entire file and parse the bits? It still reads chunks with seek+read and then it maps 
the file in a loop, potentially multiple times - is this even correct? Passing "fd" around is weird.

Why are ROMs different from "-kernel"?

If I want to solve problems 1 and 2, should I just cut-n-paste load_elf and tweak bits rather than add more 
parameters to the already 15-parameter-long prototypes?
Or could I read GRUB from qcow2 into memory and change the rest to parse ELF from memory (mapped from an ELF file or 
read from qcow2)?
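
To make the "parse ELF from memory" option concrete, here is a rough sketch, not a proposal for the actual
load_elf() prototype. It walks the PT_LOAD program headers of an image already in a buffer and reports each
segment to a callback, which also covers the "preload()" hook from problem 1. All names (Elf64Hdr, SegFn,
elf64_for_each_load, record_extent) are made up for illustration, and the sketch assumes an ELF64 image in host
byte order; a real loader has to handle both ELF classes and both byte orders.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ELF_PT_LOAD 1   /* program header type of a loadable segment */

/* Minimal ELF64 layouts, fields in the order the ELF specification gives. */
typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Elf64Hdr;

typedef struct {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align;
} Elf64Phdr;

/* Called once per PT_LOAD segment; data points into the image buffer. */
typedef void (*SegFn)(uint64_t paddr, const uint8_t *data,
                      uint64_t filesz, uint64_t memsz, void *opaque);

/* Returns 0 and stores the entry point, or -1 on a malformed image. */
static int elf64_for_each_load(const uint8_t *img, size_t len,
                               SegFn fn, void *opaque, uint64_t *entry)
{
    Elf64Hdr eh;

    if (len < sizeof(eh)) {
        return -1;
    }
    memcpy(&eh, img, sizeof(eh));
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0 || eh.e_ident[4] != 2) {
        return -1;                          /* not ELF, or not ELFCLASS64 */
    }
    if (eh.e_phentsize != sizeof(Elf64Phdr) || eh.e_phoff > len) {
        return -1;
    }
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf64Phdr ph;
        uint64_t off = eh.e_phoff + (uint64_t)i * sizeof(ph);

        if (off > len || len - off < sizeof(ph)) {
            return -1;                      /* header table runs past image */
        }
        memcpy(&ph, img + off, sizeof(ph));
        if (ph.p_type != ELF_PT_LOAD) {
            continue;
        }
        if (ph.p_offset > len || len - ph.p_offset < ph.p_filesz) {
            return -1;                      /* segment data runs past image */
        }
        fn(ph.p_paddr, img + ph.p_offset, ph.p_filesz, ph.p_memsz, opaque);
    }
    *entry = eh.e_entry;
    return 0;
}

/* Example callback: track the lowest and highest address touched. */
typedef struct {
    uint64_t lo, hi;
} LoadExtent;

static void record_extent(uint64_t paddr, const uint8_t *data,
                          uint64_t filesz, uint64_t memsz, void *opaque)
{
    LoadExtent *e = opaque;

    (void)data;
    (void)filesz;
    if (paddr < e->lo) {
        e->lo = paddr;
    }
    if (paddr + memsz > e->hi) {
        e->hi = paddr + memsz;
    }
}
```

With this shape the caller decides where the bytes come from (a mapped file or a buffer read from qcow2), and the
callback decides what to do with each segment (write it to guest RAM, record the range, or both).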


Thanks,

ps. VW == very weird, indeed :)

-- 
Alexey



* Re: VW ELF loader
  2020-02-01 13:39 VW ELF loader Alexey Kardashevskiy
@ 2020-02-01 19:04 ` Paolo Bonzini
  2020-02-02 11:51   ` Alexey Kardashevskiy
                     ` (2 more replies)
  0 siblings, 3 replies; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-01 19:04 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel
  Cc: Christian Borntraeger, Thomas Huth, Stefano Garzarella,
	Cornelia Huck, David Gibson

On 01/02/20 14:39, Alexey Kardashevskiy wrote:
> QEMU needs to load GRUB from the disk. The current workaround is to read
> it from qcow2, save in a file and then call load_elf(). Not nice.
> 
> 2 problems with that.
> 
> 1. when load_elf calls address_space_write() - I need to know where and
> how much RAM was used to mark this memory "used" for the OF client
> interface (/memory@0/available FDT property). So I'll need "preload()"
> hook.
> 
> 2. (bigger) GRUB comes from PReP partition which is 8MB. load_elf{32|64}
> consumes filename, not a memory pointer nor a "read_fn" callback - so I
> thought I need a "read_fn" callback.
> 
> And then I discovered that load_elf actually maps the passed file. And
> here I got lost.
> 
> Why does not load_elf just map the entire file and parse the bits? It
> still reads chunks with seek+read and then it maps the file in a loop
> potentially multiple times - is this even correct? Passing "fd" around
> is weird.

QEMU must not load GRUB from disk, that's the firmware's task.  If you
want to kill SLOF, you can rewrite it, but loading the kernel GRUB from
disk within QEMU is a bad idea: the next feature you'll be requested to
implement will be network boot, and there's no way to do that in QEMU.

You should be able to reuse quite a lot of code from both
pc-bios/s390-ccw (for virtio drivers) and kvm-unit-tests (for device
tree parsing).  You'd have to write the glue code for PCI hypercalls,
and adapt virtio.c for virtio-pci instead of virtio-ccw.

Paolo




* Re: VW ELF loader
  2020-02-01 19:04 ` Paolo Bonzini
@ 2020-02-02 11:51   ` Alexey Kardashevskiy
  2020-02-02 17:38     ` Paolo Bonzini
  2020-02-03  1:28   ` David Gibson
  2020-02-04  9:40   ` Christian Borntraeger
  2 siblings, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-02 11:51 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: Christian Borntraeger, Thomas Huth, Stefano Garzarella,
	Cornelia Huck, David Gibson



On 02/02/2020 06:04, Paolo Bonzini wrote:
> On 01/02/20 14:39, Alexey Kardashevskiy wrote:
>> QEMU needs to load GRUB from the disk. The current workaround is to read
>> it from qcow2, save in a file and then call load_elf(). Not nice.
>>
>> 2 problems with that.
>>
>> 1. when load_elf calls address_space_write() - I need to know where and
>> how much RAM was used to mark this memory "used" for the OF client
>> interface (/memory@0/available FDT property). So I'll need "preload()"
>> hook.
>>
>> 2. (bigger) GRUB comes from PReP partition which is 8MB. load_elf{32|64}
>> consumes filename, not a memory pointer nor a "read_fn" callback - so I
>> thought I need a "read_fn" callback.
>>
>> And then I discovered that load_elf actually maps the passed file. And
>> here I got lost.
>>
>> Why does not load_elf just map the entire file and parse the bits? It
>> still reads chunks with seek+read and then it maps the file in a loop
>> potentially multiple times - is this even correct? Passing "fd" around
>> is weird.
> 
> QEMU must not load GRUB from disk, that's the firmware's task.  If you
> want to kill SLOF, you can rewrite it, but loading the kernel GRUB from
> disk within QEMU is a bad idea: the next feature you'll be requested to
> implement will be network boot, and there's no way to do that in QEMU.

What exactly is the problem with netboot? I can hook up the OF's "net" to a backend (as I do for the serial console and 
blockdev, in boot order) and GRUB will do the rest, which is tftp/dhcp/ip (SLOF does just this and nothing more). If GRUB 
does not do this on POWER, I can fix that.

Or alternatively, with my patchset it is possible to load petitboot (kernel + initramdisk, the default way of booting 
POWER8/9 baremetal systems), and that thing can do a whole lot of things; we can consider it a replacement for ROMs from 
devices (or I misunderstood what kind of netboot you meant).

> You should be able to reuse quite a lot of code from both
> pc-bios/s390-ccw (for virtio drivers) and kvm-unit-tests (for device
> tree parsing).  You'd have to write the glue code for PCI hypercalls,
> and adapt virtio.c for virtio-pci instead of virtio-ccw.

The reason for killing SLOF is to keep one device tree for the entire boot process, including 
ibm,client-architecture-support with its possible (and annoying) configuration reboots. Having another firmware won't help 
with that.

Also, the OF1275 client interface is the way for the client to get a net/block device without needing drivers. I'd 
like to do just this and skip the middle man (QEMU device and guest driver in firmware/bootloader).

I'll post another RFC tomorrow to give a better idea.


-- 
Alexey



* Re: VW ELF loader
  2020-02-02 11:51   ` Alexey Kardashevskiy
@ 2020-02-02 17:38     ` Paolo Bonzini
  2020-02-03  1:31       ` David Gibson
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-02 17:38 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Thomas Huth, qemu-devel, Cornelia Huck, Christian Borntraeger,
	David Gibson, Stefano Garzarella


On Sun, 2 Feb 2020, 12:51 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> > QEMU must not load GRUB from disk, that's the firmware's task.  If you
> > want to kill SLOF, you can rewrite it, but loading the kernel GRUB from
> > disk within QEMU is a bad idea: the next feature you'll be requested to
> > implement will be network boot, and there's no way to do that in QEMU.
>
> What is exactly the problem with netboot? I can hook up the OF's "net" to
> a backend (as I do for serial console and
> blockdev, in boot order)


Who provides the OpenFirmware entry point when you remove SLOF and boot
directly into grub?

> Or alternatively it is possible with my patchset to load petitboot (kernel
> + initramdisk, the default way of booting
> POWER8/9 baremetal systems) and that thing can do whole lot of things, we
> can consider it as a replacement for ROMs from
> devices (or I misunderstood what kind of netboot you meant).
>

Why wouldn't that have the same issue as SLOF that you describe below?  (I
honestly don't understand any of it, but that's not your fault :-)).

Paolo


> > You should be able to reuse quite a lot of code from both
> > pc-bios/s390-ccw (for virtio drivers) and kvm-unit-tests (for device
> > tree parsing).  You'd have to write the glue code for PCI hypercalls,
> > and adapt virtio.c for virtio-pci instead of virtio-ccw.
>
> The reason for killing SLOF is to keep one device tree for the entire boot
> process including
> ibm,client-architecture-support with possible (and annoying) configuration
> reboots. Having another firmware won't help
> with that.
>
> Also the OF1275 client interface is the way for the client to get
> net/block device without need to have drivers, I'd
> like to do just this and skip the middle man (QEMU device and guest driver
> in firmware/bootloader).
>
> I'll post another RFC tomorrow to give a better idea.
>
>
> --
> Alexey
>
>


* Re: VW ELF loader
  2020-02-01 19:04 ` Paolo Bonzini
  2020-02-02 11:51   ` Alexey Kardashevskiy
@ 2020-02-03  1:28   ` David Gibson
  2020-02-03  9:12     ` Paolo Bonzini
  2020-02-04  9:40   ` Christian Borntraeger
  2 siblings, 1 reply; 48+ messages in thread
From: David Gibson @ 2020-02-03  1:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella


On Sat, Feb 01, 2020 at 08:04:25PM +0100, Paolo Bonzini wrote:
> On 01/02/20 14:39, Alexey Kardashevskiy wrote:
> > QEMU needs to load GRUB from the disk. The current workaround is to read
> > it from qcow2, save in a file and then call load_elf(). Not nice.
> > 
> > 2 problems with that.
> > 
> > 1. when load_elf calls address_space_write() - I need to know where and
> > how much RAM was used to mark this memory "used" for the OF client
> > interface (/memory@0/available FDT property). So I'll need "preload()"
> > hook.
> > 
> > 2. (bigger) GRUB comes from PReP partition which is 8MB. load_elf{32|64}
> > consumes filename, not a memory pointer nor a "read_fn" callback - so I
> > thought I need a "read_fn" callback.
> > 
> > And then I discovered that load_elf actually maps the passed file. And
> > here I got lost.
> > 
> > Why does not load_elf just map the entire file and parse the bits? It
> > still reads chunks with seek+read and then it maps the file in a loop
> > potentially multiple times - is this even correct? Passing "fd" around
> > is weird.
> 
> QEMU must not load GRUB from disk, that's the firmware's task.  If you
> want to kill SLOF, you can rewrite it, but loading the kernel GRUB from
> disk within QEMU is a bad idea: the next feature you'll be requested to
> implement will be network boot, and there's no way to do that in QEMU.

So.. I'm going to dispute this.  Or at least dispute that writing "the
firmware" as part of qemu isn't a feasible strategy.  At least in the
case of the "pseries" machine type, and possibly some other explicitly
paravirt machine types.

I do agree that we should leave firmware things to firmware when we're
implementing a real hardware platform and can therefore (at least in
theory) run the same firmware binary under qemu as for the real
hardware.

But "pseries" is different.  We're implementing the PAPR platform,
which describes an OS environment that's presented by a combination of
a hypervisor and firmware.  The features it specifies *require*
collaboration between the firmware and the hypervisor.

In PowerVM the environment is implemented with a substantial firmware
as well as hypervisor.  How those two communicate is in closed code,
it's not documented anywhere public, and I suspect it's not even
documented anywhere internal to IBM.

So, for qemu we've taken a different approach.  Since the beginning,
the runtime component of the firmware (RTAS) has been implemented as a
20 byte shim which simply forwards it to a hypercall implemented in
qemu.  The boottime firmware component is SLOF - but a build that's
specific to qemu, and has always needed to be updated in sync with
it.  Even though we've managed to limit the amount of runtime
communication we need between qemu and SLOF, there's some, and it's
become increasingly awkward to handle as we've implemented new features.

So really, the question isn't whether we implement things in firmware
or in qemu.  It's whether we implement the firmware functionality as
guest cpu code, which needs to be coded to work with a limited
environment, built with a special toolchain, then emulated with TCG.
Or, do we just implement it in normal C code, with a full C library,
and existing device and backend abstractions inside qemu.

That's what killing slof is about.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: VW ELF loader
  2020-02-02 17:38     ` Paolo Bonzini
@ 2020-02-03  1:31       ` David Gibson
  0 siblings, 0 replies; 48+ messages in thread
From: David Gibson @ 2020-02-03  1:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella


On Sun, Feb 02, 2020 at 06:38:59PM +0100, Paolo Bonzini wrote:
> On Sun, 2 Feb 2020, 12:51 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> > > QEMU must not load GRUB from disk, that's the firmware's task.  If you
> > > want to kill SLOF, you can rewrite it, but loading the kernel GRUB from
> > > disk within QEMU is a bad idea: the next feature you'll be requested to
> > > implement will be network boot, and there's no way to do that in QEMU.
> >
> > What is exactly the problem with netboot? I can hook up the OF's "net" to
> > a backend (as I do for serial console and
> > blockdev, in boot order)
> 
> Who provides the OpenFirmware entry point when you remove SLOF and boot
> directly into grub?

We do the same thing as we do for RTAS.  We have a tiny (20 byte) stub
for the client interface entry point which forwards client interface
calls to a hypercall which we implement in qemu.
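
As a very rough sketch of what the host side of that could look like: the client passes an argument array
(service name, number of inputs, number of outputs, then the cells), and the host looks the service up in a
table. Everything named here (CiArgs, ci_dispatch, the handlers, the fixed clock value and the root phandle
value) is made up for illustration. A real implementation would read the array and the service string from
guest memory rather than take a host pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CI_MAX_CELLS 12
#define CI_ROOT_PHANDLE 1u      /* hypothetical phandle of the root node */

/* Host-side copy of a client interface argument array. */
typedef struct {
    const char *service;            /* service name, e.g. "peer" */
    uint32_t nargs;                 /* number of input cells */
    uint32_t nrets;                 /* number of output cells */
    uint32_t cells[CI_MAX_CELLS];   /* nargs inputs, then nrets outputs */
} CiArgs;

typedef void (*CiHandler)(CiArgs *a);

/* "milliseconds": no inputs, one output. The clock value is faked here. */
static void ci_milliseconds(CiArgs *a)
{
    a->cells[a->nargs] = 1234;
}

/* "peer": peer(0) yields the root node; siblings are not modelled here. */
static void ci_peer(CiArgs *a)
{
    a->cells[a->nargs] = (a->cells[0] == 0) ? CI_ROOT_PHANDLE : 0;
}

static const struct {
    const char *name;
    uint32_t nargs, nrets;
    CiHandler fn;
} ci_services[] = {
    { "milliseconds", 0, 1, ci_milliseconds },
    { "peer",         1, 1, ci_peer },
};

/* Returns 0 if the service was handled, -1 if it is unknown or malformed. */
static int ci_dispatch(CiArgs *a)
{
    size_t count = sizeof(ci_services) / sizeof(ci_services[0]);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(a->service, ci_services[i].name) != 0) {
            continue;
        }
        if (a->nargs != ci_services[i].nargs ||
            a->nrets != ci_services[i].nrets) {
            return -1;              /* wrong cell counts for this service */
        }
        ci_services[i].fn(a);
        return 0;
    }
    return -1;                      /* service not implemented */
}
```

A service that is left out, such as "interpret", simply falls through to the failure return.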

> > Or alternatively it is possible with my patchset to load petitboot (kernel
> > + initramdisk, the default way of booting
> > POWER8/9 baremetal systems) and that thing can do whole lot of things, we
> > can consider it as a replacement for ROMs from
> > devices (or I misunderstood what kind of netboot you meant).
> >
> 
> Why wouldn't that have the same issue as SLOF that you describe below (I
> honestly don't understand anything of it, but that's not your fault :-)).

Because, having its own full understanding of the hardware (via its
Linux kernel), petitboot doesn't have to share data with the
hypervisor to the extent that SLOF needs to.

> 
> Paolo
> 
> 
> > > You should be able to reuse quite a lot of code from both
> > > pc-bios/s390-ccw (for virtio drivers) and kvm-unit-tests (for device
> > > tree parsing).  You'd have to write the glue code for PCI hypercalls,
> > > and adapt virtio.c for virtio-pci instead of virtio-ccw.
> >
> > The reason for killing SLOF is to keep one device tree for the entire boot
> > process including
> > ibm,client-architecture-support with possible (and annoying) configuration
> > reboots. Having another firmware won't help
> > with that.
> >
> > Also the OF1275 client interface is the way for the client to get
> > net/block device without need to have drivers, I'd
> > like to do just this and skip the middle man (QEMU device and guest driver
> > in firmware/bootloader).
> >
> > I'll post another RFC tomorrow to give a better idea.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: VW ELF loader
  2020-02-03  1:28   ` David Gibson
@ 2020-02-03  9:12     ` Paolo Bonzini
  2020-02-03  9:50       ` David Gibson
  2020-02-03 10:58       ` Alexey Kardashevskiy
  0 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-03  9:12 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On 03/02/20 02:28, David Gibson wrote:
> But "pseries" is different.  We're implementing the PAPR platform,
> which describes an OS environment that's presented by a combination of
> a hypervisor and firmware.  The features it specifies *require*
> collaboration between the firmware and the hypervisor.

Which features are these?

> So really, the question isn't whether we implement things in firmware
> or in qemu.  It's whether we implement the firmware functionality as
> guest cpu code, which needs to be coded to work with a limited
> environment, built with a special toolchain, then emulated with TCG.
> Or, do we just implement it in normal C code, with a full C library,
> and existing device and backend abstractions inside qemu.

... which is adding almost 2000 lines of new code to the host despite
the following limitations:

> 4. no networking in OF CI at all;
> 5. no vga;
> 6. no disk partitions in CI, i.e. no commas to select a partition -
> this relies on a bootloader accessing the disk as a whole;

and of course:

> 7. "interpret" (executes passed forth expression) does nothing as in this
> environment grub only uses it for switching cursor off and similar tasks.

In other words you're not dropping SLOF, you're really dropping
OpenFirmware completely.  It's little more than what ARM does.

Paolo




* Re: VW ELF loader
  2020-02-03  9:12     ` Paolo Bonzini
@ 2020-02-03  9:50       ` David Gibson
  2020-02-03 10:58       ` Alexey Kardashevskiy
  1 sibling, 0 replies; 48+ messages in thread
From: David Gibson @ 2020-02-03  9:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella


On Mon, Feb 03, 2020 at 10:12:02AM +0100, Paolo Bonzini wrote:
> On 03/02/20 02:28, David Gibson wrote:
> > But "pseries" is different.  We're implementing the PAPR platform,
> > which describes an OS environment that's presented by a combination of
> > a hypervisor and firmware.  The features it specifies *require*
> > collaboration between the firmware and the hypervisor.
> 
> Which features are these?

Too many to list really.  In the whole of PAPR, there are probably
dozens of RTAS calls that require hypervisor privilege at some point
along the way.  We probably don't implement that many of them -
there's a bunch of stuff we've never bothered with because Linux
doesn't care.

> > So really, the question isn't whether we implement things in firmware
> > or in qemu.  It's whether we implement the firmware functionality as
> > guest cpu code, which needs to be coded to work with a limited
> > environment, built with a special toolchain, then emulated with TCG.
> > Or, do we just implement it in normal C code, with a full C library,
> > and existing device and backend abstractions inside qemu.
> 
> ... which is adding almost 2000 lines of new code to the host despite
> the following limitations:

Well.. yeah.. it is kinda larger than I hoped.

But in comparison *just* the qemu specific parts of SLOF are >4000
lines of Forth.  Overall there's about 20k lines of Forth and 33k
lines of C.  And the number of people who both understand Forth and
have the slightest interest in SLOF is, like.. 2 people?  Maybe 3 if
you count Segher's occasional drive-by rants.

> > 4. no networking in OF CI at all;
> > 5. no vga;
> > 6. no disk partitions in CI, i.e. no commas to select a partition -
> > this relies on a bootloader accessing the disk as a whole;
> 
> and of course:
> 
> > 7. "interpret" (executes passed forth expression) does nothing as in this
> > environment grub only uses it for switching cursor off and similar tasks.
> 
> In other words you're not dropping SLOF, you're really dropping
> OpenFirmware completely.

No argument there.  That is absolutely true, and absolutely
intentional.  The idea is to maintain just what we need of the
OS-facing OF interface.

Frankly, while it has some good ideas, I don't think Open Firmware
was that great a concept overall in the 90s[0] - and it has not really
improved with age.

[0] Incidentally I also think EFI's a pretty crappy concept for almost
    exactly the same reasons, but it has the huge advantage of a much
    more actively developed codebase.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: VW ELF loader
  2020-02-03  9:12     ` Paolo Bonzini
  2020-02-03  9:50       ` David Gibson
@ 2020-02-03 10:58       ` Alexey Kardashevskiy
  2020-02-03 15:08         ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-03 10:58 UTC (permalink / raw)
  To: Paolo Bonzini, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella



On 3/2/20 8:12 pm, Paolo Bonzini wrote:
> On 03/02/20 02:28, David Gibson wrote:
>> But "pseries" is different.  We're implementing the PAPR platform,
>> which describes an OS environment that's presented by a combination of
>> a hypervisor and firmware.  The features it specifies *require*
>> collaboration between the firmware and the hypervisor.
> 
> Which features are these?

RTAS: PCI handling - MSI allocations, config space, interrupts
(XICS/XIVE) - we do it in QEMU right now so this went unnoticed, but
ideally there should have been an RTAS binary a lot bigger than those 20
bytes (I never even had a chance to look at what IBM pHyp does).

OF CI: ibm,client-architecture-support and all these spapr-vlan/vty/scsi
paradevices - we do not really need any driver between GRUB and QEMU -
the OF interface defines enough.

Resource allocation - we allocate some in QEMU (PCI bus numbers
assignment and PHB windows) but assign BARs and bridge windows in SLOF
(boottime) or Linux (hotplug). We could just let Linux do this or do
this in QEMU.

Interrupt map - QEMU does this for PHB (as a host interrupt controller
is a parent) and SLOF does it for PCI bridges (they have PHB or other
bridges as parents so they do it themselves), except of course PCI hot
plug after the guest started but Linux has not fetched the device tree.

All this is manageable but quite hard to maintain, while the benefits of
separating this from the hypervisor code are not clear.


>> So really, the question isn't whether we implement things in firmware
>> or in qemu.  It's whether we implement the firmware functionality as
>> guest cpu code, which needs to be coded to work with a limited
>> environment, built with a special toolchain, then emulated with TCG.
>> Or, do we just implement it in normal C code, with a full C library,
>> and existing device and backend abstractions inside qemu.
> 
> ... which is adding almost 2000 lines of new code to the host despite
> the following limitations:

Kind of. But it replaces a lot bigger chunk of SLOF, is easy to read, and
works faster. Just the virtio-scsi/net drivers are about 1700 lines and we
do not need them at all with the proposed patches (or did I miss the bigger
picture again and we do need them?).

Also Linux needs only roughly a half of this. One idea was to hack GRUB
to run in the userspace from initrd with petitboot-alike kernel, and
carry this small kernel with a franken-GRUB with QEMU, then extra code
goes to GRUB and then those folks become unhappy :)


>> 4. no networking in OF CI at all;
>> 5. no vga;
>> 6. no disk partitions in CI, i.e. no commas to select a partition -
>> this relies on a bootloader accessing the disk as a whole;

This is not going to be a lot really, especially supporting partitions -
the code is practically there already as I needed it to find GRUB, and
GRUB does the rest asking very little from the firmware to work.

btw what is the common way of netbooting in x86? NIC ROM or GRUB (but
this would be a disk anyway)? Can we consider having a precompiled GRUB
image somewhere in pc-bios/ to use for netboot? Or Uboot would do (it is
already in pc-bios/, no?), I suppose?


> and of course:
> 
>> 7. "interpret" (executes passed forth expression) does nothing as in this
>> environment grub only uses it for switching cursor off and similar tasks.
> 
> In other words you're not dropping SLOF, you're really dropping
> OpenFirmware completely.

What is the exact benefit of having OpenFirmware's "interpret"? The rest
is there as far as known clients are concerned. FreeBSD is somewhere
between GRUB and Linux, and we never truly supported AIX as it has (or
at least used to have) fixes for pHyp firmware bugs which we never cared to
simulate in SLOF.

I totally get why people want a firmware; it makes perfect sense when
emulating a bare metal machine (such as the powernv machine type in QEMU),
but spapr is not such a case.



>  It's little more than what ARM does.
> 
> Paolo
> 

-- 
Alexey



* Re: VW ELF loader
  2020-02-03 10:58       ` Alexey Kardashevskiy
@ 2020-02-03 15:08         ` Paolo Bonzini
  2020-02-03 22:36           ` Alexey Kardashevskiy
  2020-02-05  5:58           ` David Gibson
  0 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-03 15:08 UTC (permalink / raw)
  To: Alexey Kardashevskiy, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella

On 03/02/20 11:58, Alexey Kardashevskiy wrote:
>>> So really, the question isn't whether we implement things in firmware
>>> or in qemu.  It's whether we implement the firmware functionality as
>>> guest cpu code, which needs to be coded to work with a limited
>>> environment, built with a special toolchain, then emulated with TCG.
>>> Or, do we just implement it in normal C code, with a full C library,
>>> and existing device and backend abstractions inside qemu.
>>
>> ... which is adding almost 2000 lines of new code to the host despite
>> the following limitations:
>>
>>> 4. no networking in OF CI at all;
>>> 5. no vga;
>>> 6. no disk partitions in CI, i.e. no commas to select a partition -
>>> this relies on a bootloader accessing the disk as a whole;
> 
> This is not going to be a lot really, especially supporting partitions -
> the code is practically there already as I needed it to find GRUB, and
> GRUB does the rest asking very little from the firmware to work.

What partition formats would have to be supported?  But honestly I'm
more worried about the networking part.

> btw what is the common way of netbooting in x86? NIC ROM or GRUB (but
> this would be a disk anyway)? Can we consider having a precompiled GRUB
> image somewhere in pc-bios/ to use for netboot? Or Uboot would do (it is
> already in pc-bios/, no?), I suppose?

GRUB netboot support is almost never used.  There are three cases:

- QEMU BIOS: the NIC ROM contains iPXE, which is both the driver code and
the boot loader (which chains into GRUB).

- Bare metal BIOS: same, but the boot loader is minimal so most of the
time iPXE is loaded via TFTP and reuses the NIC ROM's driver code.

- UEFI: the NIC ROM contains driver code only and the firmware does the
rest.

>> In other words you're not dropping SLOF, you're really dropping
>> OpenFirmware completely.
> 
> What is the exact benefit of having OpenFirmware's "interpret"?

None, besides being able to play space invaders written in Forth.  I'm
not against dropping most OpenFirmware capabilities, I'm against adding
a limited (or broken depending on what you're trying to do) version that
runs in the host.

Yes, SLOF is big and slow.  petitboot is not petit at all either, and
has the disadvantage that you have to find a way to run GRUB afterwards.
 But would a similarly minimal OF implementation (no network, almost no
interpret so no Forth, device tree built entirely in the host, etc.) be
just as big and slow?

Paolo




* Re: VW ELF loader
  2020-02-03 15:08         ` Paolo Bonzini
@ 2020-02-03 22:36           ` Alexey Kardashevskiy
  2020-02-03 22:56             ` Paolo Bonzini
  2020-02-05  5:58           ` David Gibson
  1 sibling, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-03 22:36 UTC (permalink / raw)
  To: Paolo Bonzini, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella



On 04/02/2020 02:08, Paolo Bonzini wrote:
> On 03/02/20 11:58, Alexey Kardashevskiy wrote:
>>>> So really, the question isn't whether we implement things in firmware
>>>> or in qemu.  It's whether we implement the firmware functionality as
>>>> guest cpu code, which needs to be coded to work with a limited
>>>> environment, built with a special toolchain, then emulated with TCG.
>>>> Or, do we just implement it in normal C code, with a full C library,
>>>> and existing device and backend abstractions inside qemu.
>>>
>>> ... which is adding almost 2000 lines of new code to the host despite
>>> the following limitations:
>>>
>>>> 4. no networking in OF CI at all;
>>>> 5. no vga;
>>>> 6. no disk partitions in CI, i.e. no commas to select a partition -
>>>> this relies on a bootloader accessing the disk as a whole;
>>
>> This is not going to be a lot really, especially supporting partitions -
>> the code is practically there already as I needed it to find GRUB, and
>> GRUB does the rest asking very little from the firmware to work.
> 
> What partition formats would have to be supported? 

MBR, GPT, is there anything else? "Support" is limited to converting the
number after the comma to a [start, size] pair. I am not going for file
systems.
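
For the MBR case, a minimal sketch of that conversion could look like the code below. It is an illustration under
stated assumptions, not code from the patchset: the helper names are made up, it assumes 512-byte sectors, and it
handles only the four primary entries (no extended partitions, no GPT).

```c
#include <stdint.h>

#define MBR_SECTOR_SIZE  512
#define MBR_TABLE_OFFSET 446    /* first of four 16-byte partition entries */

/* Read a little-endian 32-bit value from an unaligned buffer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Convert a 1-based primary partition number into a byte [start, size]
 * pair using the first sector of the disk. Returns 0 on success, -1 if
 * the sector has no MBR signature or the entry is out of range or unused.
 */
static int mbr_partition(const uint8_t *sector, unsigned partnum,
                         uint64_t *start, uint64_t *size)
{
    const uint8_t *entry;

    if (sector[510] != 0x55 || sector[511] != 0xAA) {
        return -1;                          /* missing boot signature */
    }
    if (partnum < 1 || partnum > 4) {
        return -1;
    }
    entry = sector + MBR_TABLE_OFFSET + 16 * (partnum - 1);
    if (entry[4] == 0) {
        return -1;                          /* partition type 0: unused */
    }
    *start = (uint64_t)le32(entry + 8) * MBR_SECTOR_SIZE;   /* first LBA */
    *size = (uint64_t)le32(entry + 12) * MBR_SECTOR_SIZE;   /* sector count */
    return 0;
}
```

A path like "disk,2" would then map to mbr_partition(sector, 2, &start, &size), with the block backend reads
offset by start and bounded by size.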

> But honestly I'm
> more worried about the networking part.

Fair enough.

>> btw what is the common way of netbooting in x86? NIC ROM or GRUB (but
>> this would be a disk anyway)? Can we consider having a precompiled GRUB
>> image somewhere in pc-bios/ to use for netboot? Or Uboot would do (it is
>> already in pc-bios/, no?), I suppose?
> 
> GRUB netboot support is almost never used. 

Huh. We use yaboot here in Ozlabs for netbooting quite a lot.

> There are three cases:
> 
> - QEMU BIOS: the NIC ROM contain iPXE, which is both the driver code and
> the boot loader (which chains into GRUB).
> 
> - Bare metal BIOS: same, but the boot loader is minimal so most of the
> time iPXE is loaded via TFTP and reuses the NIC ROM's driver code.
> 
> - UEFI: the NIC ROM contains driver code only and the firmware does the
> rest.

Well, we never really had this luxury of NIC ROM, there were a couple of
NICs with fcode which never really worked in SLOF.

Oh well, this is probably the time to look into netbooting then.


>>> In other words you're not dropping SLOF, you're really dropping
>>> OpenFirmware completely.
>>
>> What is the exact benefit of having OpenFirmware's "interpret"?
> 
> None, besides being able to play space invaders written in Forth.  I'm
> not against dropping most OpenFirmware capabilities, I'm against adding
> a limited (or broken depending on what you're trying to do) version that
> runs in the host.
> 
> Yes, SLOF is big and slow.  petitboot is not petit at all either, and
> has the disadvantage that you have to find a way to run GRUB afterwards.
>  But would a similarly minimal OF implementation (no network, almost no
> interpret so no Forth, device tree built entirely in the host, etc.)

The device tree is almost completely built in QEMU these days anyway,
twice during normal boot.

> be just as big and slow?

I doubt it. We will be getting rid of unnecessary drivers, bus scanning
code (SCSI, PCI), device tree synchronization.


-- 
Alexey


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-03 22:36           ` Alexey Kardashevskiy
@ 2020-02-03 22:56             ` Paolo Bonzini
  2020-02-03 23:19               ` Alexey Kardashevskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-03 22:56 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Thomas Huth, qemu-devel, Cornelia Huck, Christian Borntraeger,
	David Gibson, Stefano Garzarella

[-- Attachment #1: Type: text/plain, Size: 1084 bytes --]

On Mon, 3 Feb 2020, 23:36 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

>
> > What partition formats would have to be supported?
>
> MBR, GPT, is there anything else? "Support" is limited to converting a
> number after the comma to a [start, size] pair. I am not going for file
> systems.
>
> > But honestly I'm
> > more worried about the networking part.
>
> Fair enough.
>
> > Yes, SLOF is big and slow.  petitboot is not petit at all either, and
> > has the disadvantage that you have to find a way to run GRUB afterwards.
> >  But would a similarly minimal OF implementation (no network, almost no
> > interpret so no Forth, device tree built entirely in the host, etc.)
> > be just as big and slow?
>
> I doubt. We will be getting rid of unnecessary drivers, bus scanning
> code (SCSI, PCI), device tree synchronization.
>

What I mean is, if you write a firmware that exposes a minimal OF device
interface but runs it in the guest, and does a hypercall for everything
else, would it be as big and slow as SLOF?

Paolo

>
>
> --
> Alexey
>
>

[-- Attachment #2: Type: text/html, Size: 1695 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-03 22:56             ` Paolo Bonzini
@ 2020-02-03 23:19               ` Alexey Kardashevskiy
  2020-02-03 23:26                 ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-03 23:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, qemu-devel, Cornelia Huck, Christian Borntraeger,
	David Gibson, Stefano Garzarella



On 04/02/2020 09:56, Paolo Bonzini wrote:
> 
> 
> On Mon, 3 Feb 2020, 23:36 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> 
>     > What partition formats would have to be supported?
> 
>     MBR, GPT, is there anything else? "Support" is limited to converting a
>     number after the comma to a [start, size] pair. I am not going for file
>     systems.
> 
>     > But honestly I'm
>     > more worried about the networking part.
> 
>     Fair enough.
> 
>     > Yes, SLOF is big and slow.  petitboot is not petit at all either, and
>     > has the disadvantage that you have to find a way to run GRUB
>     afterwards.
>     >  But would a similarly minimal OF implementation (no network,
>     almost no
>     > interpret so no Forth, device tree built entirely in the host, etc.)
>     > be just as big and slow?
> 
>     I doubt. We will be getting rid of unnecessary drivers, bus scanning
>     code (SCSI, PCI), device tree synchronization.
> 
> 
> What I mean is, if you write a firmware that exposes a minimal OF device
> interface but runs it in the guest, and does a hypercall for everything
> else, would it be as big and slow as SLOF?

I just did almost that - 20 bytes, fast as a bullet, runs in the guest ;)

Speaking seriously, what would I put into the guest?

The device tree? This is the core problem of the current design - we
need to keep it in sync with QEMU.

Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
smaller but ad hoc with only a couple of people knowing it. Other
packages - disk-label, deblocker - I do not see any user for these
except SLOF itself.


-- 
Alexey


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-03 23:19               ` Alexey Kardashevskiy
@ 2020-02-03 23:26                 ` Paolo Bonzini
  2020-02-04  6:16                   ` Thomas Huth
                                     ` (2 more replies)
  0 siblings, 3 replies; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-03 23:26 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Thomas Huth, qemu-devel, Cornelia Huck, Christian Borntraeger,
	David Gibson, Stefano Garzarella

[-- Attachment #1: Type: text/plain, Size: 558 bytes --]

On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

>
>
> Speaking seriously, what would I put into the guest?
>

Only things that would be considered drivers. Ignore the partitions issue
for now so that you can just pass the device tree services to QEMU with
hypercalls.

> Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> smaller but adhoc with only a couple of people knowing it.
>

You can generalize and reuse the s390 code. All you have to write is the
PCI scan and virtio-pci setup.

Paolo

[-- Attachment #2: Type: text/html, Size: 1155 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-03 23:26                 ` Paolo Bonzini
@ 2020-02-04  6:16                   ` Thomas Huth
  2020-02-04  8:54                     ` Cornelia Huck
  2020-02-04 23:18                   ` VW ELF loader Alexey Kardashevskiy
  2020-02-05  6:06                   ` David Gibson
  2 siblings, 1 reply; 48+ messages in thread
From: Thomas Huth @ 2020-02-04  6:16 UTC (permalink / raw)
  To: Paolo Bonzini, Alexey Kardashevskiy
  Cc: Stefano Garzarella, Christian Borntraeger, David Gibson,
	qemu-devel, Cornelia Huck

On 04/02/2020 00.26, Paolo Bonzini wrote:
> 
> 
> On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>     Speaking seriously, what would I put into the guest?
> 
> Only things that would be considered drivers. Ignore the partitions
> issue for now so that you can just pass the device tree services to QEMU
> with hypercalls.
> 
>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
>     smaller but adhoc with only a couple of people knowing it.
> 
> 
> You can generalize and reuse the s390 code. All you have to write is the
> PCI scan and virtio-pci setup.

Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
so re-using this for a slim netboot client on ppc64 would certainly be
feasible (especially since there are also already virtio drivers in SLOF
that are written in C), but I think it is not very future proof. The
libnet from SLOF only supports UDP, and no TCP. So for advanced boot
scenarios like booting from HTTP or even HTTPS, you need something else
(i.e. maybe grub is the better option, indeed).
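
For context on the UDP-only limitation: classic netboot relies on TFTP,
which runs over UDP, so a stack without TCP can still fetch a boot image,
while HTTP(S) cannot be layered on top of it. Below is a minimal sketch in
C of building a TFTP read request as described in RFC 1350.
tftp_build_rrq() is a hypothetical helper for illustration only and is not
libnet's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TFTP_OP_RRQ 1   /* read request opcode */

/*
 * Fill buf with a TFTP RRQ packet:
 *   2-byte opcode (big endian) | filename | 0 | "octet" | 0
 * Returns the packet length, or 0 if the filename is empty or the
 * buffer is too small. The result is the payload of one UDP datagram
 * sent to port 69 of the TFTP server.
 */
size_t tftp_build_rrq(uint8_t *buf, size_t buflen, const char *filename)
{
    static const char mode[] = "octet";   /* binary transfer mode */

    assert(buf && filename);

    size_t flen = strlen(filename);
    size_t need = 2 + flen + 1 + sizeof(mode);  /* sizeof includes the NUL */

    if (flen == 0 || need > buflen) {
        return 0;
    }

    buf[0] = 0;
    buf[1] = TFTP_OP_RRQ;
    memcpy(buf + 2, filename, flen + 1);            /* name plus NUL */
    memcpy(buf + 2 + flen + 1, mode, sizeof(mode)); /* mode plus NUL */
    return need;
}
```

Everything after this request (DATA and ACK packets) is also plain UDP,
which is why a UDP-only libnet is sufficient for TFTP boot but not for the
more advanced scenarios mentioned above.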

 Thomas



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-04  6:16                   ` Thomas Huth
@ 2020-02-04  8:54                     ` Cornelia Huck
  2020-02-04  9:20                       ` Restrictions of libnet (was: Re: VW ELF loader) Thomas Huth
  0 siblings, 1 reply; 48+ messages in thread
From: Cornelia Huck @ 2020-02-04  8:54 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Alexey Kardashevskiy, qemu-devel, Christian Borntraeger,
	Paolo Bonzini, Stefano Garzarella, David Gibson

On Tue, 4 Feb 2020 07:16:46 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 04/02/2020 00.26, Paolo Bonzini wrote:
> > 
> > 
> > On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> > 
> >     Speaking seriously, what would I put into the guest?
> > 
> > Only things that would be considered drivers. Ignore the partitions
> > issue for now so that you can just pass the device tree services to QEMU
> > with hypercalls.
> > 
> >     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> >     smaller but adhoc with only a couple of people knowing it.
> > 
> > 
> > You can generalize and reuse the s390 code. All you have to write is the
> > PCI scan and virtio-pci setup.  
> 
> Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
> so re-using this for a slim netboot client on ppc64 would certainly be
> feasible (especially since there are also already virtio drivers in SLOF
> that are written in C), but I think it is not very future proof. The
> libnet from SLOF only supports UDP, and no TCP. So for advanced boot
> scenarios like booting from HTTP or even HTTPS, you need something else
> (i.e. maybe grub is the better option, indeed).

That makes me wonder what that means for s390: We're inheriting
libnet's limitations, but we don't have grub -- do we need to come up
with something different? Or improve libnet?


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-04  8:54                     ` Cornelia Huck
@ 2020-02-04  9:20                       ` Thomas Huth
  2020-02-04  9:32                         ` Thomas Huth
                                           ` (2 more replies)
  0 siblings, 3 replies; 48+ messages in thread
From: Thomas Huth @ 2020-02-04  9:20 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alexey Kardashevskiy, qemu-devel, Christian Borntraeger,
	qemu-s390x, Paolo Bonzini, Stefano Garzarella, David Gibson

On 04/02/2020 09.54, Cornelia Huck wrote:
> On Tue, 4 Feb 2020 07:16:46 +0100
> Thomas Huth <thuth@redhat.com> wrote:
> 
>> On 04/02/2020 00.26, Paolo Bonzini wrote:
>>>
>>>
>>> On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>
>>>     Speaking seriously, what would I put into the guest?
>>>
>>> Only things that would be considered drivers. Ignore the partitions
>>> issue for now so that you can just pass the device tree services to QEMU
>>> with hypercalls.
>>>
>>>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
>>>     smaller but adhoc with only a couple of people knowing it.
>>>
>>>
>>> You can generalize and reuse the s390 code. All you have to write is the
>>> PCI scan and virtio-pci setup.  
>>
>> Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
>> so re-using this for a slim netboot client on ppc64 would certainly be
>> feasible (especially since there are also already virtio drivers in SLOF
>> that are written in C), but I think it is not very future proof. The
>> libnet from SLOF only supports UDP, and no TCP. So for advanced boot
>> scenarios like booting from HTTP or even HTTPS, you need something else
>> (i.e. maybe grub is the better option, indeed).
> 
> That makes me wonder what that means for s390: We're inheriting
> libnet's limitations, but we don't have grub -- do we need to come up
> with something different? Or improve libnet?

I don't think that it makes sense to re-invent the wheel yet another
time and write yet another TCP implementation (which is likely quite a
bit of work, too, especially if you also want to do secure HTTPS in the
end). So yes, in the long run (as soon as somebody seriously asks for
HTTP booting on s390x) we need something different here.

Now looking at our standard s390x bootloader zipl - this has been giving
us a headache a couple of times in the past, too (from a distro point of
view since s390x is the only major platform left that does not use grub,
but also from a s390-ccw bios point of view, see e.g.
https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03046.html and
related discussions).

So IMHO the s390x world should move towards grub2, too. We could e.g.
link it initially into the s390-ccw bios ... and if that works out
well, later also use it as normal bootloader instead of zipl (not sure
if that works in all cases, though, IIRC there were some size
constraints and stuff like that).

Just my 0.02 € of course, though.

 Thomas



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-04  9:20                       ` Restrictions of libnet (was: Re: VW ELF loader) Thomas Huth
@ 2020-02-04  9:32                         ` Thomas Huth
  2020-02-04  9:33                         ` Michal Suchánek
  2020-02-05  5:30                         ` David Gibson
  2 siblings, 0 replies; 48+ messages in thread
From: Thomas Huth @ 2020-02-04  9:32 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alexey Kardashevskiy, qemu-devel, Christian Borntraeger,
	qemu-s390x, Paolo Bonzini, David Gibson, Stefano Garzarella

On 04/02/2020 10.20, Thomas Huth wrote:
[...]
> So IMHO the s390x world should move towards grub2, too. We could e.g.
> link it initially into the s390-ccw bios ... and if that works out
> well, later also use it as normal bootloader instead of zipl

I meant to say "use it as normal bootloader instead of zipl on LPARs and
z/VM, too".

 Thomas



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-04  9:20                       ` Restrictions of libnet (was: Re: VW ELF loader) Thomas Huth
  2020-02-04  9:32                         ` Thomas Huth
@ 2020-02-04  9:33                         ` Michal Suchánek
  2020-02-05  5:30                         ` David Gibson
  2 siblings, 0 replies; 48+ messages in thread
From: Michal Suchánek @ 2020-02-04  9:33 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Alexey Kardashevskiy, Cornelia Huck, qemu-devel,
	Christian Borntraeger, qemu-s390x, Paolo Bonzini, David Gibson,
	Stefano Garzarella

Hello,

On Tue, Feb 04, 2020 at 10:20:14AM +0100, Thomas Huth wrote:

> 
> So IMHO the s390x world should move towards grub2, too. We could e.g.
> link it initially into the s390-ccw bios ... and if that works out
> well, later also use it as normal bootloader instead of zipl (not sure
> if that works in all cases, though, IIRC there were some size
> constraints and stuff like that).

AFAIK the main reason why it does not work is that grub does not have
ccw drivers.

That aside it would be welcome to get it working.

Thanks

Michal


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-01 19:04 ` Paolo Bonzini
  2020-02-02 11:51   ` Alexey Kardashevskiy
  2020-02-03  1:28   ` David Gibson
@ 2020-02-04  9:40   ` Christian Borntraeger
  2 siblings, 0 replies; 48+ messages in thread
From: Christian Borntraeger @ 2020-02-04  9:40 UTC (permalink / raw)
  To: Paolo Bonzini, Alexey Kardashevskiy, qemu-devel
  Cc: Thomas Huth, Stefano Garzarella, Cornelia Huck, David Gibson



On 01.02.20 20:04, Paolo Bonzini wrote:
> On 01/02/20 14:39, Alexey Kardashevskiy wrote:
>> QEMU needs to load GRUB from the disk. The current workaround is to read
>> it from qcow2, save in a file and then call load_elf(). Not nice.
>>
>> 2 problems with that.
>>
>> 1. when load_elf calls address_space_write() - I need to know where and
>> how much RAM was used to mark this memory "used" for the OF client
>> interface (/memory@0/available FDT property). So I'll need "preload()"
>> hook.
>>
>> 2. (bigger) GRUB comes from PReP partition which is 8MB. load_elf{32|64}
>> consumes filename, not a memory pointer nor a "read_fn" callback - so I
>> thought I need a "read_fn" callback.
>>
>> And then I discovered that load_elf actually maps the passed file. And
>> here I got lost.
>>
>> Why does not load_elf just map the entire file and parse the bits? It
>> still reads chunks with seek+read and then it maps the file in a loop
>> potentially multiple times - is this even correct? Passing "fd" around
>> is weird.
> 
> QEMU must not load GRUB from disk, that's the firmware's task.  If you
> want to kill SLOF, you can rewrite it, but loading the kernel GRUB from
> disk within QEMU is a bad idea: the next feature you'll be requested to
> implement will be network boot, and there's no way to do that in QEMU.
> 
> You should be able to reuse quite a lot of code from both
> pc-bios/s390-ccw (for virtio drivers) and kvm-unit-tests (for device
> tree parsing).  You'd have to write the glue code for PCI hypercalls,
> and adapt virtio.c for virtio-pci instead of virtio-ccw.

Yes, we had disk format parsing in QEMU at the beginning, and Alex Graf insisted
on using BIOS code instead, even though s390 has no BIOS inside the guest context.
So we put it somewhere at the end of the guest, and it seems that it does not
collide with Linux guests.
In the end we can boot from all kinds of disks and from the network.

Thomas can maybe tell better whether this works out well or badly.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-03 23:26                 ` Paolo Bonzini
  2020-02-04  6:16                   ` Thomas Huth
@ 2020-02-04 23:18                   ` Alexey Kardashevskiy
  2020-02-05  6:06                   ` David Gibson
  2 siblings, 0 replies; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-04 23:18 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, qemu-devel, Cornelia Huck, Christian Borntraeger,
	David Gibson, Stefano Garzarella



On 04/02/2020 10:26, Paolo Bonzini wrote:
> 
> 
> On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> 
> 
>     Speaking seriously, what would I put into the guest?
> 
> 
> Only things that would be considered drivers. Ignore the partitions
> issue for now so that you can just pass the device tree services to QEMU
> with hypercalls.
> 
>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
>     smaller but adhoc with only a couple of people knowing it.
> 
> 
> You can generalize and reuse the s390 code. All you have to write is the
> PCI scan and virtio-pci setup.

Along with the device tree syncing, these are the things I really want
to get rid of, especially the drivers: today they do not support IOMMU, so
I would have to implement that as well.

I guess I could write a small firmware which would read MBR/GPT, find
PReP, load the GRUB ELF and jump into it (although this seems unnecessarily
complicated for the task and definitely duplicates code), but having
drivers in what is defined as a driverless environment is just weird imho.

However, I am struggling with hooking the network from CI up to the network
backend; it is not as easy as blockdev and needs another temporary netclient :)


-- 
Alexey


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-04  9:20                       ` Restrictions of libnet (was: Re: VW ELF loader) Thomas Huth
  2020-02-04  9:32                         ` Thomas Huth
  2020-02-04  9:33                         ` Michal Suchánek
@ 2020-02-05  5:30                         ` David Gibson
  2020-02-05  6:24                           ` Thomas Huth
  2 siblings, 1 reply; 48+ messages in thread
From: David Gibson @ 2020-02-05  5:30 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Alexey Kardashevskiy, Cornelia Huck, qemu-devel,
	Christian Borntraeger, qemu-s390x, Paolo Bonzini,
	Stefano Garzarella

[-- Attachment #1: Type: text/plain, Size: 3210 bytes --]

On Tue, Feb 04, 2020 at 10:20:14AM +0100, Thomas Huth wrote:
> On 04/02/2020 09.54, Cornelia Huck wrote:
> > On Tue, 4 Feb 2020 07:16:46 +0100
> > Thomas Huth <thuth@redhat.com> wrote:
> > 
> >> On 04/02/2020 00.26, Paolo Bonzini wrote:
> >>>
> >>>
> >>> On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >>>
> >>>     Speaking seriously, what would I put into the guest?
> >>>
> >>> Only things that would be considered drivers. Ignore the partitions
> >>> issue for now so that you can just pass the device tree services to QEMU
> >>> with hypercalls.
> >>>
> >>>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> >>>     smaller but adhoc with only a couple of people knowing it.
> >>>
> >>>
> >>> You can generalize and reuse the s390 code. All you have to write is the
> >>> PCI scan and virtio-pci setup.  
> >>
> >> Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
> >> so re-using this for a slim netboot client on ppc64 would certainly be
> >> feasible (especially since there are also already virtio drivers in SLOF
> >> that are written in C), but I think it is not very future proof. The
> >> libnet from SLOF only supports UDP, and no TCP. So for advanced boot
> >> scenarios like booting from HTTP or even HTTPS, you need something else
> >> (i.e. maybe grub is the better option, indeed).
> > 
> > That makes me wonder what that means for s390: We're inheriting
> > libnet's limitations, but we don't have grub -- do we need to come up
> > with something different? Or improve libnet?
> 
> I don't think that it makes sense to re-invent the wheel yet another
> time and write yet another TCP implementation (which is likely quite a
> bit of work, too, especially if you also want to do secure HTTPS in the
> end). So yes, in the long run (as soon as somebody seriously asks for
> HTTP booting on s390x) we need something different here.
> 
> Now looking at our standard s390x bootloader zipl - this has been giving
> us a headache a couple of times in the past, too (from a distro point of
> view since s390x is the only major platform left that does not use grub,
> but also from a s390-ccw bios point of view, see e.g.
> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03046.html and
> related discussions).
> 
> So IMHO the s390x world should move towards grub2, too. We could e.g.
> link it initially into the s390-ccw bios ... and if that works out
> well, later also use it as normal bootloader instead of zipl (not sure
> if that works in all cases, though, IIRC there were some size
> constraints and stuff like that).

petitboot would be another reasonable thing to consider here.  Since
it's Linux based, you have all the drivers you have there.  It's not
quite grub, but it does at least parse the same configuration files.

You do need kexec() of course, I don't know if you have that already
for s390 or not.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-03 15:08         ` Paolo Bonzini
  2020-02-03 22:36           ` Alexey Kardashevskiy
@ 2020-02-05  5:58           ` David Gibson
  2020-02-06  8:29             ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: David Gibson @ 2020-02-05  5:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

[-- Attachment #1: Type: text/plain, Size: 3775 bytes --]

On Mon, Feb 03, 2020 at 04:08:54PM +0100, Paolo Bonzini wrote:
> On 03/02/20 11:58, Alexey Kardashevskiy wrote:
> >>> So really, the question isn't whether we implement things in firmware
> >>> or in qemu.  It's whether we implement the firmware functionality as
> >>> guest cpu code, which needs to be coded to work with a limited
> >>> environment, built with a special toolchain, then emulated with TCG.
> >>> Or, do we just implement it in normal C code, with a full C library,
> >>> and existing device and backend abstractions inside qemu.
> >>
> >> ... which is adding almost 2000 lines of new code to the host despite
> >> the following limitations:
> >>
> >>> 4. no networking in OF CI at all;
> >>> 5. no vga;
> >>> 6. no disk partitions in CI, i.e. no commas to select a partition -
> >>> this relies on a bootloader accessing the disk as a whole;
> > 
> > This is not going to be a lot really, especially supporting partitions -
> > the code is practically there already as I needed it to find GRUB, and
> > GRUB does the rest asking very little from the firmware to work.
> 
> What partition formats would have to be supported?  But honestly I'm
> more worried about the networking part.
> 
> > btw what is the common way of netbooting in x86? NIC ROM or GRUB (but
> > this would be a disk anyway)? Can we consider having a precompiled GRUB
> > image somewhere in pc-bios/ to use for netboot? Or Uboot would do (it is
> > already in pc-bios/, no?), I suppose?
> 
> GRUB netboot support is almost never used.  There are three cases:
> 
> - QEMU BIOS: the NIC ROM contains iPXE, which is both the driver code and
> the boot loader (which chains into GRUB).
> 
> - Bare metal BIOS: same, but the boot loader is minimal so most of the
> time iPXE is loaded via TFTP and reuses the NIC ROM's driver code.
> 
> - UEFI: the NIC ROM contains driver code only and the firmware does the
> rest.
> 
> >> In other words you're not dropping SLOF, you're really dropping
> >> OpenFirmware completely.
> > 
> > What is the exact benefit of having OpenFirmware's "interpret"?
> 
> None, besides being able to play space invaders written in Forth.  I'm
> not against dropping most OpenFirmware capabilities, I'm against adding
> a limited (or broken depending on what you're trying to do) version that
> runs in the host.
> 
> Yes, SLOF is big and slow.  petitboot is not petit at all either, and
> has the disadvantage that you have to find a way to run GRUB afterwards.

Well, not usually.  Petitboot parses grub configuration itself, which
means that generally from the OS / installer point of view it looks
like grub, even though it's not from the actual bootstrapping point of
view.

>  But would a similarly minimal OF implementation (no network, almost no
> interpret so no Forth, device tree built entirely in the host, etc.) be
> just as big and slow?

So, as actual OF implementations go, SLOF is already pretty minimal
(hence "Slim Line Open Firmware").  If there's no Forth, it's really
not OF any more, just something mimicking some of OF's interfaces.

But the difficulty of SLOF isn't really its bigness or slowness in any
case (the slowness is just an additional irritation).  The two big
issues are 1) that it's written in an obscure language and 2)
synchronizing its state with things that require host side
involvement.

Rewriting a minimal guest side not-OF would partly address (1) (but
there's still the logistical pain of having to build and insert it),
and wouldn't address (2) at all.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-03 23:26                 ` Paolo Bonzini
  2020-02-04  6:16                   ` Thomas Huth
  2020-02-04 23:18                   ` VW ELF loader Alexey Kardashevskiy
@ 2020-02-05  6:06                   ` David Gibson
  2020-02-05  9:28                     ` Cornelia Huck
  2020-02-06  8:27                     ` Paolo Bonzini
  2 siblings, 2 replies; 48+ messages in thread
From: David Gibson @ 2020-02-05  6:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

[-- Attachment #1: Type: text/plain, Size: 1638 bytes --]

On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:
> On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> >
> >
> > Speaking seriously, what would I put into the guest?
> 
> Only things that would be considered drivers. Ignore the partitions issue
> for now so that you can just pass the device tree services to QEMU with
> hypercalls.

Urgh... first, I don't really see how you'd do that.  OF's whole
device model is based around the device tree.  So implementing OF
driver interactions would require the firmware to do a bunch of
internal hypercalls to do all the DT stuff, which brings us back to a
much more complex and active interface between firmware and hypervisor
than we really want.

Second, drivers are kind of where we'd get the most benefit by putting
them in qemu: from qemu we can just talk to the device backends
directly so we don't need to re-abstract the differences between
different device models of the same type.

> > Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> > smaller but adhoc with only a couple of people knowing it.

Netboot I will grant is a pretty thorny problem, whichever way we
tackle it.

> You can generalize and reuse the s390 code. All you have to write is the
> PCI scan and virtio-pci setup.

If we assume virtio only.  In any case it sounds like the s390 code is
actually based on the SLOF code anyway.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-05  5:30                         ` David Gibson
@ 2020-02-05  6:24                           ` Thomas Huth
  2020-02-10  7:55                             ` David Gibson
  0 siblings, 1 reply; 48+ messages in thread
From: Thomas Huth @ 2020-02-05  6:24 UTC (permalink / raw)
  To: David Gibson
  Cc: Alexey Kardashevskiy, Cornelia Huck, qemu-devel,
	Christian Borntraeger, qemu-s390x, Paolo Bonzini,
	Stefano Garzarella

On 05/02/2020 06.30, David Gibson wrote:
> On Tue, Feb 04, 2020 at 10:20:14AM +0100, Thomas Huth wrote:
>> On 04/02/2020 09.54, Cornelia Huck wrote:
>>> On Tue, 4 Feb 2020 07:16:46 +0100
>>> Thomas Huth <thuth@redhat.com> wrote:
>>>
>>>> On 04/02/2020 00.26, Paolo Bonzini wrote:
>>>>>
>>>>>
>>>>> On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>>>
>>>>>     Speaking seriously, what would I put into the guest?
>>>>>
>>>>> Only things that would be considered drivers. Ignore the partitions
>>>>> issue for now so that you can just pass the device tree services to QEMU
>>>>> with hypercalls.
>>>>>
>>>>>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
>>>>>     smaller but adhoc with only a couple of people knowing it.
>>>>>
>>>>>
>>>>> You can generalize and reuse the s390 code. All you have to write is the
>>>>> PCI scan and virtio-pci setup.  
>>>>
>>>> Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
>>>> so re-using this for a slim netboot client on ppc64 would certainly be
>>>> feasible (especially since there are also already virtio drivers in SLOF
>>>> that are written in C), but I think it is not very future proof. The
>>>> libnet from SLOF only supports UDP, and no TCP. So for advanced boot
>>>> scenarios like booting from HTTP or even HTTPS, you need something else
>>>> (i.e. maybe grub is the better option, indeed).
>>>
>>> That makes me wonder what that means for s390: We're inheriting
>>> libnet's limitations, but we don't have grub -- do we need to come up
>>> with something different? Or improve libnet?
>>
>> I don't think that it makes sense to re-invent the wheel yet another
>> time and write yet another TCP implementation (which is likely quite a
>> bit of work, too, especially if you also want to do secure HTTPS in the
>> end). So yes, in the long run (as soon as somebody seriously asks for
>> HTTP booting on s390x) we need something different here.
>>
>> Now looking at our standard s390x bootloader zipl - this has been giving
>> us a headache a couple of times in the past, too (from a distro point of
>> view since s390x is the only major platform left that does not use grub,
>> but also from a s390-ccw bios point of view, see e.g.
>> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03046.html and
>> related discussions).
>>
>> So IMHO the s390x world should move towards grub2, too. We could e.g.
>> link it initially into the s390-ccw bios ... and if that works out
>> well, later also use it as normal bootloader instead of zipl (not sure
>> if that works in all cases, though, IIRC there were some size
>> constraints and stuff like that).
> 
> petitboot would be another reasonable thing to consider here.  Since
> it's Linux based, you have all the drivers you have there.  It's not
> quite grub, but it does at least parse the same configuration files.
> 
> You do need kexec() of course, I don't know if you have that already
> for s390 or not.

AFAIK we have kexec on s390. So yes, petitboot would be another option
for replacing the s390-ccw bios. But when it comes to LPARs and z/VMs, I
don't think it's really feasible to replace the zipl bootloader there
with petitboot, so in that case grub2 still sounds like the better
option to me.

 Thomas



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-05  6:06                   ` David Gibson
@ 2020-02-05  9:28                     ` Cornelia Huck
  2020-02-06  4:47                       ` David Gibson
  2020-02-06  8:27                     ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: Cornelia Huck @ 2020-02-05  9:28 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel,
	Christian Borntraeger, Paolo Bonzini, Stefano Garzarella

On Wed, 5 Feb 2020 17:06:34 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:

> > You can generalize and reuse the s390 code. All you have to write is the
> > PCI scan and virtio-pci setup.  
> 
> If we assume virtio only.  In any case it sounds like the s390 code is
> actually based on the SLOF code anyway.

Only the netboot part. Device discovery/setup etc. was written
from scratch, but I'm not sure how much reusable infrastructure remains
once you strip all the s390x-specific stuff.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-05  9:28                     ` Cornelia Huck
@ 2020-02-06  4:47                       ` David Gibson
  0 siblings, 0 replies; 48+ messages in thread
From: David Gibson @ 2020-02-06  4:47 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel,
	Christian Borntraeger, Paolo Bonzini, Stefano Garzarella

On Wed, Feb 05, 2020 at 10:28:30AM +0100, Cornelia Huck wrote:
> On Wed, 5 Feb 2020 17:06:34 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:
> 
> > > You can generalize and reuse the s390 code. All you have to write is the
> > > PCI scan and virtio-pci setup.  
> > 
> > If we assume virtio only.  In any case it sounds like the s390 code is
> > actually based on the SLOF code anyway.
> 
> Only the netboot part. Device discovery/setup etc. had been written
> from scratch, but I'm not sure how much reusable infrastructure remains
> once you strip all the s390x-specific stuff.

The netboot's the bit we'd be interested in, anyway.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-05  6:06                   ` David Gibson
  2020-02-05  9:28                     ` Cornelia Huck
@ 2020-02-06  8:27                     ` Paolo Bonzini
  2020-02-06 23:17                       ` Alexey Kardashevskiy
  2020-02-10  7:28                       ` David Gibson
  1 sibling, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-06  8:27 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On 05/02/20 07:06, David Gibson wrote:
> On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:
>> Il mar 4 feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> ha scritto:
>>> Speaking seriously, what would I put into the guest?
>>
>> Only things that would be considered drivers. Ignore the partitions issue
>> for now so that you can just pass the device tree services to QEMU with
>> hypercalls.
> 
> Urgh... first, I don't really see how you'd do that.  OF's whole
> device model is based around the device tree.  So implementing OF
> driver interactions would require the firmware to do a bunch of
> internal hypercalls to do all the DT stuff, which brings us back to a
> much more complex and active interface between firmware and hypervisor
> than we really want.

I'm really sorry if what I am saying is stupid; but I was thinking of a
firmware entrypoint like

	if (op == "read" || op == "write")
		do_driver_stuff(op);
	else
		hypercall();

This is not even close to pseudocode, but hopefully enough to give the
idea.  Perhaps what I don't understand is why you can't start the
firmware with r3 pointing to the device tree, and stash it for when you
leave control to GRUB.  Or to put it another way, what petitboot does
that you cannot do in your own firmware.
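
For illustration only, the dispatch idea above could be sketched in C
roughly as follows.  Everything here is hypothetical: fw_entry(),
do_driver_stuff() and hypercall() are placeholder names with stubbed
bodies, not existing SLOF or QEMU interfaces.  Note that C needs
strcmp() rather than == to compare service names.

```c
#include <assert.h>
#include <string.h>

/* Where a request ended up; only used to make the sketch observable. */
enum handled_by { HANDLED_BY_DRIVER, HANDLED_BY_HOST };

/* Hypothetical stub: a real firmware would talk to the device here. */
static enum handled_by do_driver_stuff(const char *op)
{
    (void)op;
    return HANDLED_BY_DRIVER;
}

/* Hypothetical stub: a real firmware would trap to the hypervisor here. */
static enum handled_by hypercall(const char *op)
{
    (void)op;
    return HANDLED_BY_HOST;
}

/* Entry point: device I/O stays in the guest, the rest goes to the host. */
static enum handled_by fw_entry(const char *op)
{
    assert(op != NULL);
    if (strcmp(op, "read") == 0 || strcmp(op, "write") == 0) {
        return do_driver_stuff(op);
    }
    return hypercall(op);
}
```

With these stubs, fw_entry("read") and fw_entry("write") land in the
guest-side driver path, and any other service name is forwarded to the
host.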

> Second, drivers are kind of where we'd get the most benefit by putting
> them in qemu: from qemu we can just talk to the device backends
> directly so we don't need to re-abstract the differences between
> different device models of the same type.

Of course, but drivers are easy to write.  Not as easy as s390 probably
because you'd have to link in libfdt and so on, but between
kvm-unit-tests and s390-ccw there's quite a bit of code that can be reused.

>> You can generalize and reuse the s390 code. All you have to write is the
>> PCI scan and virtio-pci setup.
> 
> If we assume virtio only.

Do you actually need something else?  The TTY can use the simple
getchar/putchar hypercalls, and sPAPR-vSCSI clients can keep using SLOF.

Paolo



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-05  5:58           ` David Gibson
@ 2020-02-06  8:29             ` Paolo Bonzini
  2020-02-06 23:23               ` Alexey Kardashevskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-06  8:29 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On 05/02/20 06:58, David Gibson wrote:
>> Yes, SLOF is big and slow.  petitboot is not petit at all either, and
>> has the disadvantage that you have to find a way to run GRUB afterwards.
> Well, not usually.  Petitboot parses grub configuration itself, which
> means that generally from the OS / installer point of view it looks
> like grub, even though it's not from the actual bootstrapping point of
> view.

Ok, sorry about that.  I need to learn a bit more.

>>  But would a similarly minimal OF implementation (no network, almost no
>> interpret so no Forth, device tree built entirely in the host, etc.) be
>> just as big and slow?
> 
> So, as actual OF implementations go, SLOF is already pretty minimal
> (hence "Slim Line Open Firmware").  If there's no Forth, it's really
> not OF any more, just something mimicking some of OF's interfaces.

Right, not unlike what you get with vof=on. :)  I'm not at all against
that idea.  I just don't understand what you refer to below as (2).
Does petitboot not have the problem because it kexecs the new kernel?

Paolo

> But the difficulty of SLOF isn't really its bigness or slowness in any
> case (the slowness is just an additional irritation).  The two big
> issues are 1) that it's written in an obscure language and 2)
> synchronizing its state with things that require host side
> involvement.
> 
> Rewriting a minimal guest side not-OF would partly address (1) (but
> there's still the logistical pain of having to build and insert it),
> and wouldn't address (2) at all.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-06  8:27                     ` Paolo Bonzini
@ 2020-02-06 23:17                       ` Alexey Kardashevskiy
  2020-02-06 23:45                         ` Paolo Bonzini
  2020-02-10  7:28                       ` David Gibson
  1 sibling, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-06 23:17 UTC (permalink / raw)
  To: Paolo Bonzini, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella



On 06/02/2020 19:27, Paolo Bonzini wrote:
> On 05/02/20 07:06, David Gibson wrote:
>> On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:
>>> Il mar 4 feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> ha scritto:
>>>> Speaking seriously, what would I put into the guest?
>>>
>>> Only things that would be considered drivers. Ignore the partitions issue
>>> for now so that you can just pass the device tree services to QEMU with
>>> hypercalls.
>>
>> Urgh... first, I don't really see how you'd do that.  OF's whole
>> device model is based around the device tree.  So implementing OF
>> driver interactions would require the firmware to do a bunch of
>> internal hypercalls to do all the DT stuff, which brings us back to a
>> much more complex and active interface between firmware and hypervisor
>> than we really want.
> 
> I'm really sorry if what I am saying is stupid; but I was thinking of a
> firmware entrypoint like
> 
> 	if (op == "read" || op == "write")
> 		do_driver_stuff(op);


do_driver_stuff() will require assigned PCI BARs, PCI bridge windows,
IOMMU. So QEMU or this new not-SLOF firmware will have to do all of this.
This is a lot, and what exactly is the benefit? My alternative does not
need drivers at all.


> 	else
> 		hypercall();
> 
> This is not even close to pseudocode, but hopefully enough to give the
> idea.  Perhaps what I don't understand is why you can't start the
> firmware with r3 pointing to the device tree, and stash it for when you
> leave control to GRUB.  Or to put it another way, what petitboot does
> that you cannot do in your own firmware.

Petitboot has all the PCI code and drivers ready; it can easily boot
even from passed-through PCI devices, which neither SLOF nor QEMU will
have drivers for.


>> Second, drivers are kind of where we'd get the most benefit by putting
>> them in qemu: from qemu we can just talk to the device backends
>> directly so we don't need to re-abstract the differences between
>> different device models of the same type.
> 
> Of course, but drivers are easy to write.  Not as easy as s390 probably
> because you'd have to link in libfdt and so on, but between
> kvm-unit-tests and s390-ccw there's quite a bit of code can be reused.
> 
>>> You can generalize and reuse the s390 code. All you have to write is the
>>> PCI scan and virtio-pci setup.
>>
>> If we assume virtio only.
> 
> Do you actually need something else?  

spapr-vscsi and usb-storage, probably.


> The TTY can use the simple
> getchar/putchar hypercalls, and sPAPR-vSCSI clients can keep using SLOF.

If we are open to the idea of using SLOF for one thing and new small
firmware for another thing, then it would make more sense to use
petitboot instead of SLOF.


-- 
Alexey


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-06  8:29             ` Paolo Bonzini
@ 2020-02-06 23:23               ` Alexey Kardashevskiy
  2020-02-06 23:46                 ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-06 23:23 UTC (permalink / raw)
  To: Paolo Bonzini, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella



On 06/02/2020 19:29, Paolo Bonzini wrote:
> On 05/02/20 06:58, David Gibson wrote:
>>> Yes, SLOF is big and slow.  petitboot is not petit at all either, and
>>> has the disadvantage that you have to find a way to run GRUB afterwards.
>> Well, not usually.  Petitboot parses grub configuration itself, which
>> means that generally from the OS / installer point of view it looks
>> like grub, even though it's not from the actual bootstrapping point of
>> view.
> 
> Ok, sorry about that.  I need to learn a bit more.
> 
>>>  But would a similarly minimal OF implementation (no network, almost no
>>> interpret so no Forth, device tree built entirely in the host, etc.) be
>>> just as big and slow?
>>
>> So, as actual OF implementations go, SLOF is already pretty minimal
>> (hence "Slim Line Open Firmware").  If there's no Forth, it's really
>> not OF any more, just something mimicking some of OF's interfaces.
> 
> Right, not unlike what you get with vof=on. :)  I'm not against at all
> that idea.  I just don't understand what you refer to below as (2).
> Does petitboot not have the problem because it kexecs the new kernel?


Petitboot does not have this problem *if* it runs without SLOF, i.e.
directly via -kernel and -initrd, and uses the OF CI (a cut-down version,
about v3-v4 of my patchset, without block devices and grub lookup). In
this case there is one device tree instance, fully synchronized with the
machine state.

If there is still SLOF and (2) is happening, then petitboot is screwed
like any other kernel.


> Paolo
> 
>> But the difficulty of SLOF isn't really its bigness or slowness in any
>> case (the slowness is just an additional irritation).  The two big
>> issues are 1) that it's written in an obscure language and 2)
>> synchronizing its state with things that require host side
>> involvement.
>>
>> Rewriting a minimal guest side not-OF would partly address (1) (but
>> there's still the logistical pain of having to build and insert it),
>> and wouldn't address (2) at all.
> 

-- 
Alexey


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-06 23:17                       ` Alexey Kardashevskiy
@ 2020-02-06 23:45                         ` Paolo Bonzini
  2020-02-10  7:30                           ` David Gibson
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-06 23:45 UTC (permalink / raw)
  To: Alexey Kardashevskiy, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella

On 07/02/20 00:17, Alexey Kardashevskiy wrote:
> This is a lot and what is exactly the benefit? My alternative does not
> need drivers at all.

Anything you put in the host is potential attack surface.  Plus, you're
not doing a different thing than anyone else and as you've found out it
may be easy for block device but not for everything else.

Every platform that QEMU supports is just using a firmware to do
firmware things; it can be U-Boot, EDK-2, SLOF, SeaBIOS, qboot, with
varying levels of complexity.  Some are doing -kernel in QEMU rather than
firmware, but that's where things end.

Paolo



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-06 23:23               ` Alexey Kardashevskiy
@ 2020-02-06 23:46                 ` Paolo Bonzini
  2020-02-10  0:31                   ` Alexey Kardashevskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-06 23:46 UTC (permalink / raw)
  To: Alexey Kardashevskiy, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella

On 07/02/20 00:23, Alexey Kardashevskiy wrote:
>> Right, not unlike what you get with vof=on. :)  I'm not against at all
>> that idea.  I just don't understand what you refer to below as (2).
>> Does petitboot not have the problem because it kexecs the new kernel?
> 
> Petitboot does not have this problem *if* it runs without SLOF, i.e.
> directly via -kernel and -initrd and uses OF CI (cut down version, about
> v3-v4 of my patchset, without block devices and grub lookup). In this
> case there is one device tree instance, fully synchronized with the
> machine state.
> 
> If there is still SLOF and (2) is happening, then petitboot is screwed
> as any other kernel.

Ok, so "minimal pseudo-OpenFirmware in QEMU" is doable and can get
everything right; it's just work to set up PCI and do all that other
do_driver_stuff(), so you can either do it yourself or use
Linux+petitboot.  Is this correct?

Also, can a normal distro kernel run via -kernel/-initrd + the minimal
firmware in QEMU?

Paolo



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-06 23:46                 ` Paolo Bonzini
@ 2020-02-10  0:31                   ` Alexey Kardashevskiy
  2020-02-13  1:43                     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-10  0:31 UTC (permalink / raw)
  To: Paolo Bonzini, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella



On 07/02/2020 10:46, Paolo Bonzini wrote:
> On 07/02/20 00:23, Alexey Kardashevskiy wrote:
>>> Right, not unlike what you get with vof=on. :)  I'm not against at all
>>> that idea.  I just don't understand what you refer to below as (2).
>>> Does petitboot not have the problem because it kexecs the new kernel?
>>
>> Petitboot does not have this problem *if* it runs without SLOF, i.e.
>> directly via -kernel and -initrd and uses OF CI (cut down version, about
>> v3-v4 of my patchset, without block devices and grub lookup). In this
>> case there is one device tree instance, fully synchronized with the
>> machine state.
>>
>> If there is still SLOF and (2) is happening, then petitboot is screwed
>> as any other kernel.
> 
> Ok, so "minimal pseudo-OpenFirmware in QEMU" is doable and can get
> everything right;

I am not convinced that ditching drivers is wrong; I am moving the ELF +
MBR + GPT + GRUB loading to the guest, though, so the 20-byte blob becomes
an FDT-less firmware, a few kilobytes big.

> it's just work to set up PCI and do all that other
> do_driver_stuff(), so you can either do it yourself or use
> Linux+petitboot.  Is this correct?

Except for the use of the word "just", yes, correct ;)

> Also, can a normal distro kernel run via -kernel/-initrd + the minimal
> firmware in QEMU?

Yes.


-- 
Alexey


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-06  8:27                     ` Paolo Bonzini
  2020-02-06 23:17                       ` Alexey Kardashevskiy
@ 2020-02-10  7:28                       ` David Gibson
  2020-02-10 11:26                         ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: David Gibson @ 2020-02-10  7:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On Thu, Feb 06, 2020 at 09:27:01AM +0100, Paolo Bonzini wrote:
> On 05/02/20 07:06, David Gibson wrote:
> > On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:
> >> Il mar 4 feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru> ha scritto:
> >>> Speaking seriously, what would I put into the guest?
> >>
> >> Only things that would be considered drivers. Ignore the partitions issue
> >> for now so that you can just pass the device tree services to QEMU with
> >> hypercalls.
> > 
> > Urgh... first, I don't really see how you'd do that.  OF's whole
> > device model is based around the device tree.  So implementing OF
> > driver interactions would require the firmware to do a bunch of
> > internal hypercalls to do all the DT stuff, which brings us back to a
> > much more complex and active interface between firmware and hypervisor
> > than we really want.
> 
> I'm really sorry if what I am saying is stupid; but I was thinking of a
> firmware entrypoint like
> 
> 	if (op == "read" || op == "write")
> 		do_driver_stuff(op);
> 	else
> 		hypercall();

Um... I'm not really clear on where you're imagining this going.  In
the OF model, device operations are done by "opening" a device tree
node then executing methods on it, so you can't really even get to
this point without a bunch of DT stuff.
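
As a rough illustration of that "open, then call methods" flow, here is
a minimal hosted C sketch.  The request layout (service name, argument
count, return count, then the argument and return cells) follows the
general shape of the OF client interface, but everything else is an
assumption made for the example: struct ci_args, mock_ci_entry(),
ci_open(), ci_read(), FAKE_IHANDLE and the device path used below are
hypothetical, and the mock simply zero-fills the buffer instead of
touching a device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One client-interface request: service name, counts, then the cells. */
struct ci_args {
    const char *service;
    int nargs;
    int nret;
    uintptr_t cells[8];   /* nargs inputs followed by nret outputs */
};

#define FAKE_IHANDLE 0x1234u   /* hypothetical instance handle */

/* Mock firmware entry point; returns 0 on success, -1 if unsupported. */
static int mock_ci_entry(struct ci_args *a)
{
    if (strcmp(a->service, "open") == 0 && a->nargs == 1 && a->nret == 1) {
        /* cells[0] is the device path; hand back an instance handle. */
        a->cells[1] = FAKE_IHANDLE;
        return 0;
    }
    if (strcmp(a->service, "read") == 0 && a->nargs == 3 && a->nret == 1) {
        /* cells[0..2] are ihandle, buffer, length; unopened handles fail. */
        if (a->cells[0] != FAKE_IHANDLE) {
            a->cells[3] = (uintptr_t)-1;
            return 0;
        }
        memset((void *)a->cells[1], 0, (size_t)a->cells[2]);
        a->cells[3] = a->cells[2];   /* pretend every byte was read */
        return 0;
    }
    return -1;
}

/* Open a device by path; returns the instance handle, or 0 on failure. */
static uintptr_t ci_open(const char *path)
{
    struct ci_args a = { "open", 1, 1, { (uintptr_t)path } };
    return mock_ci_entry(&a) == 0 ? a.cells[1] : 0;
}

/* Read from an opened instance; returns the byte count, or -1 on failure. */
static long ci_read(uintptr_t ihandle, void *buf, size_t len)
{
    struct ci_args a = { "read", 3, 1,
                         { ihandle, (uintptr_t)buf, (uintptr_t)len } };
    assert(buf != NULL);
    if (mock_ci_entry(&a) != 0) {
        return -1;
    }
    return (long)(intptr_t)a.cells[3];
}
```

The point of the sketch is only the ordering: ci_read() is useless
without the handle that ci_open() returned for a device tree path, so
the "read"/"write" branch cannot be reached without going through the
tree first.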

> This is not even close to pseudocode, but hopefully enough to give the
> idea.  Perhaps what I don't understand is why you can't start the
> firmware with r3 pointing to the device tree, and stash it for when you
> leave control to GRUB.

Again, I'm not even really sure what you mean by this.  We already
enter SLOF with r3 pointing to a device tree.  I'm not sure what
stashing it would accomplish.  GRUB as it stands expects an OF style
entry point though, not a flat tree style entry point.

Are you suggesting rewriting it to run in that environment?  That
might be an option, but it's certainly not easy.  We'd have to write
"native" grub drivers for all the devices we care about, rather than
calling into OF for them.  Maybe there's some x86 code we could
already use here?  I don't know how much GRUB relies on the BIOS or
UEFI for device access on PC.

> Or to put it another way, what petitboot does
> that you cannot do in your own firmware.

So, part of the confusion is that there are two things we're
considering here and it's not really clear yet how much they overlap.

1) Is using petitboot as our bootloader.  That gives us basically
every driver, network protocol and tool we could ever want.  However,
it gets to the next stage via kexec(), so it can only support OSes
which are kexec()able - i.e. Linux.  This is mostly speculation at
this point.

2) Having a way of booting existing clients - i.e. those that expect
OF-style entry conditions - but without having to maintain a blob of
Forth.  The "kill slof" patches are a concrete, if limited, attempt at
this.

> > Second, drivers are kind of where we'd get the most benefit by putting
> > them in qemu: from qemu we can just talk to the device backends
> > directly so we don't need to re-abstract the differences between
> > different device models of the same type.
> 
> Of course, but drivers are easy to write.

I'm not really sure I agree with you there.

> Not as easy as s390 probably
> because you'd have to link in libfdt and so on, but between
> kvm-unit-tests and s390-ccw there's quite a bit of code can be reused.

Maybe, but I'm not really sure where you're picturing it fitting in.

> >> You can generalize and reuse the s390 code. All you have to write is the
> >> PCI scan and virtio-pci setup.
> > 
> > If we assume virtio only.
> 
> Do you actually need something else?

Well.. that's an interesting question.

> The TTY can use the simple
> getchar/putchar hypercalls,

Yes... though if people attach a graphical console they might be
pretty surprised that they don't get anything on there.  Supporting
that means adding vga, which adds substantial complexity (especially
since text mode isn't really a thing for a vga on a POWER machine).

> and sPAPR-vSCSI clients can keep using SLOF.

We can possibly ignore the spapr virtual devices.  They seemed like
they'd be important for people transitioning from guests under
PowerVM, but honestly I'm not sure they've ever been used much.

We do support emulated (or passthrough) PCI devices.  I don't know if
they're common enough that we need boot support for them.  Netboot
from a vfio network adaptor might be something people want.

USB storage is also a fairly likely candidate, and that would add a
*lot* of extra complexity, since we'd need both the HCD and storage
drivers.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: VW ELF loader
  2020-02-06 23:45                         ` Paolo Bonzini
@ 2020-02-10  7:30                           ` David Gibson
  2020-02-10 10:37                             ` Peter Maydell
  2020-02-10 11:25                             ` Paolo Bonzini
  0 siblings, 2 replies; 48+ messages in thread
From: David Gibson @ 2020-02-10  7:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On Fri, Feb 07, 2020 at 12:45:20AM +0100, Paolo Bonzini wrote:
> On 07/02/20 00:17, Alexey Kardashevskiy wrote:
> > This is a lot and what is exactly the benefit? My alternative does not
> > need drivers at all.
> 
> Anything you put in the host is potential attack surface.

Ok, it is attack surface you're concerned about.  That wasn't totally
clear before this point.

> Plus, you're
> not doing a different thing than anyone else and as you've found out it
> may be easy for block device but not for everything else.

Uh.. was that supposed to be "we *are* doing a different thing than
anyone else"?

> Every platform that QEMU supports is just using a firmware to do
> firmware things; it can be U-Boot, EDK-2, SLOF, SeaBIOS, qboot, with
> varying level of complexity.  Some are doing -kernel in QEMU rather than
> firmware, but that's where things end.

Well, yeah, but AIUI those platforms actually have a defined hardware
environment on which the firmware is running.  For PAPR we don't, we
*only* have a specification for the "hardware"+"firmware" environment
as seen by the OS together.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-05  6:24                           ` Thomas Huth
@ 2020-02-10  7:55                             ` David Gibson
  2020-02-10  9:39                               ` Michal Suchánek
  0 siblings, 1 reply; 48+ messages in thread
From: David Gibson @ 2020-02-10  7:55 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Alexey Kardashevskiy, Cornelia Huck, qemu-devel,
	Christian Borntraeger, qemu-s390x, Paolo Bonzini,
	Stefano Garzarella

On Wed, Feb 05, 2020 at 07:24:04AM +0100, Thomas Huth wrote:
> On 05/02/2020 06.30, David Gibson wrote:
> > On Tue, Feb 04, 2020 at 10:20:14AM +0100, Thomas Huth wrote:
> >> On 04/02/2020 09.54, Cornelia Huck wrote:
> >>> On Tue, 4 Feb 2020 07:16:46 +0100
> >>> Thomas Huth <thuth@redhat.com> wrote:
> >>>
> >>>> On 04/02/2020 00.26, Paolo Bonzini wrote:
> >>>>>
> >>>>>
> >>>>> Il mar 4 feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru
> >>>>> <mailto:aik@ozlabs.ru>> ha scritto:
> >>>>>
> >>>>>     Speaking seriously, what would I put into the guest?
> >>>>>
> >>>>> Only things that would be considered drivers. Ignore the partitions
> >>>>> issue for now so that you can just pass the device tree services to QEMU
> >>>>> with hypercalls.
> >>>>>
> >>>>>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> >>>>>     smaller but adhoc with only a couple of people knowing it.
> >>>>>
> >>>>>
> >>>>> You can generalize and reuse the s390 code. All you have to write is the
> >>>>> PCI scan and virtio-pci setup.  
> >>>>
> >>>> Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
> >>>> so re-using this for a slim netboot client on ppc64 would certainly be
> >>>> feasible (especially since there are also already virtio drivers in SLOF
> >>>> that are written in C), but I think it is not very future proof. The
> >>>> libnet from SLOF only supports UDP, and no TCP. So for advanced boot
> >>>> scenarios like booting from HTTP or even HTTPS, you need something else
> >>>> (i.e. maybe grub is the better option, indeed).
> >>>
> >>> That makes me wonder what that means for s390: We're inheriting
> >>> libnet's limitations, but we don't have grub -- do we need to come up
> >>> with something different? Or improve libnet?
> >>
> >> I don't think that it makes sense to re-invent the wheel yet another
> >> time and write yet another TCP implementation (which is likely quite a
> >> bit of work, too, especially if you also want to do secure HTTPS in the
> >> end). So yes, in the long run (as soon as somebody seriously asks for
> >> HTTP booting on s390x) we need something different here.
> >>
> >> Now looking at our standard s390x bootloader zipl - this has been giving
> >> us a headache a couple of times in the past, too (from a distro point of
> >> view since s390x is the only major platform left that does not use grub,
> >> but also from a s390-ccw bios point of view, see e.g.
> >> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03046.html and
> >> related discussions).
> >>
> >> So IMHO the s390x world should move towards grub2, too. We could e.g.
> >> link it initially into the s390-ccw bios ... and if that works out
> >> well, later also use it as normal bootloader instead of zipl (not sure
> >> if that works in all cases, though, IIRC there were some size
> >> constraints and stuff like that).
> > 
> > petitboot would be another reasonable thing to consider here.  Since
> > it's Linux based, you have all the drivers you have there.  It's not
> > quite grub, but it does at least parse the same configuration files.
> > 
> > You do need kexec() of course, I don't know if you have that already
> > for s390 or not.
> 
> AFAIK we have kexec on s390. So yes, petitboot would be another option
> for replacing the s390-ccw bios. But when it comes to LPARs and z/VMs, I
> don't think it's really feasible to replace the zipl bootloader there
> with petitboot, so in that case grub2 still sounds like the better
> option to me.

Actually, between that and Paolo's suggestions, I thought of another
idea that could be helpful for both s390 and power.  Could we load
non-kexec() things (legacy kernels, non-Linux OSes) from Petitboot by
having it kexec() into a shim mini-kernel that just sets up the boot
environment for the other thing?

What I'm imagining is that petitboot loads everything that will be
needed for the other OS into RAM - probably as (or part of) the
"initrd" image.  That means the shim doesn't need to have drivers or
a network stack to load that in.  It just needs to construct the
environment and jump into the real kernel.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-10  7:55                             ` David Gibson
@ 2020-02-10  9:39                               ` Michal Suchánek
  2020-02-13  3:16                                 ` David Gibson
  0 siblings, 1 reply; 48+ messages in thread
From: Michal Suchánek @ 2020-02-10  9:39 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, Cornelia Huck, qemu-devel,
	Christian Borntraeger, qemu-s390x, Paolo Bonzini,
	Stefano Garzarella

On Mon, Feb 10, 2020 at 06:55:16PM +1100, David Gibson wrote:
> On Wed, Feb 05, 2020 at 07:24:04AM +0100, Thomas Huth wrote:
> > On 05/02/2020 06.30, David Gibson wrote:
> > > On Tue, Feb 04, 2020 at 10:20:14AM +0100, Thomas Huth wrote:
> > >> On 04/02/2020 09.54, Cornelia Huck wrote:
> > >>> On Tue, 4 Feb 2020 07:16:46 +0100
> > >>> Thomas Huth <thuth@redhat.com> wrote:
> > >>>
> > >>>> On 04/02/2020 00.26, Paolo Bonzini wrote:
> > >>>>>
> > >>>>>
> > >>>>> Il mar 4 feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru
> > >>>>> <mailto:aik@ozlabs.ru>> ha scritto:
> > >>>>>
> > >>>>>     Speaking seriously, what would I put into the guest?
> > >>>>>
> > >>>>> Only things that would be considered drivers. Ignore the partitions
> > >>>>> issue for now so that you can just pass the device tree services to QEMU
> > >>>>> with hypercalls.
> > >>>>>
> > >>>>>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> > >>>>>     smaller but adhoc with only a couple of people knowing it.
> > >>>>>
> > >>>>>
> > >>>>> You can generalize and reuse the s390 code. All you have to write is the
> > >>>>> PCI scan and virtio-pci setup.  
> > >>>>
> > >>>> Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
> > >>>> so re-using this for a slim netboot client on ppc64 would certainly be
> > >>>> feasible (especially since there are also already virtio drivers in SLOF
> > >>>> that are written in C), but I think it is not very future proof. The
> > >>>> libnet from SLOF only supports UDP, and no TCP. So for advanced boot
> > >>>> scenarios like booting from HTTP or even HTTPS, you need something else
> > >>>> (i.e. maybe grub is the better option, indeed).
> > >>>
> > >>> That makes me wonder what that means for s390: We're inheriting
> > >>> libnet's limitations, but we don't have grub -- do we need to come up
> > >>> with something different? Or improve libnet?
> > >>
> > >> I don't think that it makes sense to re-invent the wheel yet another
> > >> time and write yet another TCP implementation (which is likely quite a
> > >> bit of work, too, especially if you also want to do secure HTTPS in the
> > >> end). So yes, in the long run (as soon as somebody seriously asks for
> > >> HTTP booting on s390x) we need something different here.
> > >>
> > >> Now looking at our standard s390x bootloader zipl - this has been giving
> > >> us a headache a couple of times in the past, too (from a distro point of
> > >> view since s390x is the only major platform left that does not use grub,
> > >> but also from a s390-ccw bios point of view, see e.g.
> > >> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03046.html and
> > >> related discussions).
> > >>
> > >> So IMHO the s390x world should move towards grub2, too. We could e.g.
> > >> link it initially into the s390-ccw bios ... and if that works out
> > >> well, later also use it as normal bootloader instead of zipl (not sure
> > >> if that works in all cases, though, IIRC there were some size
> > >> constraints and stuff like that).
> > > 
> > > petitboot would be another reasonable thing to consider here.  Since
> > > it's Linux based, you have all the drivers you have there.  It's not
> > > quite grub, but it does at least parse the same configuration files.
> > > 
> > > You do need kexec() of course, I don't know if you have that already
> > > for s390 or not.
> > 
> > AFAIK we have kexec on s390. So yes, petitboot would be another option
> > for replacing the s390-ccw bios. But when it comes to LPARs and z/VMs, I
> > don't think it's really feasible to replace the zipl bootloader there
> > with petitboot, so in that case grub2 still sounds like the better
> > option to me.
> 
> Actually, between that and Paolo's suggestions, I thought of another
> idea that could be helpful for both s390 and power.  Could we load
> non-kexec() things (legacy kernels, non-Linux OSes) from Petitboot by
> having it kexec() into a shim mini-kernel that just sets up the boot
> environment for the other thing.
> 
> What I'm imagining is that petitboot loads everything that will be
> needed for the other OS into RAM - probably as (or part of) the
> "initrd" image.  That means the shim doesn't need to have drivers or
> a network stack to load that in.  It just needs to construct
> environment and jump into the real kernel.

How does that differ from what kexec normally does?

It does support the kernel format that is usually booted on the
architecture.

Thanks

Michal



* Re: VW ELF loader
  2020-02-10  7:30                           ` David Gibson
@ 2020-02-10 10:37                             ` Peter Maydell
  2020-02-10 11:25                             ` Paolo Bonzini
  1 sibling, 0 replies; 48+ messages in thread
From: Peter Maydell @ 2020-02-10 10:37 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Paolo Bonzini, Stefano Garzarella

On Mon, 10 Feb 2020 at 07:56, David Gibson <david@gibson.dropbear.id.au> wrote:
> On Fri, Feb 07, 2020 at 12:45:20AM +0100, Paolo Bonzini wrote:
> > Every platform that QEMU supports is just using a firmware to do
> > firmware things; it can be U-Boot, EDK-2, SLOF, SeaBIOS, qboot, with
> > varying level of complexity.  Some are doing -kernel in QEMU rather than
> > firmware, but that's where things end.
>
> Well, yeah, but AIUI those platforms actually have a defined hardware
> environment on which the firmware is running.  For PAPR we don't, we
> *only* have a specification for the "hardware"+"firmware" environment
> as seen by the OS together.

(The below is not intended to be a prescription for what PPC should
do, just some background info about what we're doing with Arm currently.)

For Arm our 'virt' board is drifting a bit towards doing some 'firmware'
ABIs in QEMU -- currently this mostly means PSCI (for CPU power on/off,
system reset, etc), but there have been proposals for other firmware
ABIs that are hard to implement in guest firmware. I tend to agree with
Paolo in principle that where possible keeping QEMU to "we implement
some hardware emulation" and having firmware running in the guest
is a nicer separation of concerns, though, so for Arm I'd like
to avoid ending up with a lot of firmware-equivalent code in QEMU.

FWIW for the Arm 'virt' board there is no defined hardware spec
in the "handed down from elsewhere" sense -- so we defined our
own (and some mechanisms for passing the device tree description
of it into the guest firmware).

thanks
-- PMM



* Re: VW ELF loader
  2020-02-10  7:30                           ` David Gibson
  2020-02-10 10:37                             ` Peter Maydell
@ 2020-02-10 11:25                             ` Paolo Bonzini
  2020-02-14  3:23                               ` David Gibson
  1 sibling, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-10 11:25 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On 10/02/20 08:30, David Gibson wrote:
>> Anything you put in the host is potential attack surface.
> Ok, it is attack surface you're concerned about.  That wasn't totally
> clear before this point.

Part that, part having to add backend hooks that weren't needed so far.

>> Plus, you're not doing a different thing than anyone else and as
>> you've found out it may be easy for block device but not for
>> everything else.
>
> Uh.. was that supposed to be "we *are* doing a different thing than
> anyone else"?

Alexey's question was "what is exactly the benefit", so "not doing a
different thing" is the answer (one of them).

>> Every platform that QEMU supports is just using a firmware to do
>> firmware things; it can be U-Boot, EDK-2, SLOF, SeaBIOS, qboot, with
>> varying level of complexity.  Some are doing -kernel in QEMU rather than
>> firmware, but that's where things end.
>
> Well, yeah, but AIUI those platforms actually have a defined hardware
> environment on which the firmware is running.  For PAPR we don't, we
> *only* have a specification for the "hardware"+"firmware" environment
> as seen by the OS together.

PAPR is a specification for the environment as seen by the OS.  But "-M
pseries" is already a defined hardware environment on which SLOF is
running.  There's nothing that prevents you from defining more of that
environment in order to run Linux (for petitboot) or your own
pseudo-OpenFirmware driver provider inside it.

Paolo




* Re: VW ELF loader
  2020-02-10  7:28                       ` David Gibson
@ 2020-02-10 11:26                         ` Paolo Bonzini
  2020-02-14  4:02                           ` David Gibson
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-10 11:26 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On 10/02/20 08:28, David Gibson wrote:
> On Thu, Feb 06, 2020 at 09:27:01AM +0100, Paolo Bonzini wrote:
>> On 05/02/20 07:06, David Gibson wrote:
>>> On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:
>> I'm really sorry if what I am saying is stupid; but I was thinking of a
>> firmware entrypoint like
>>
>> 	if (op == "read" || op == "write")
>> 		do_driver_stuff(op);
>> 	else
>> 		hypercall();
> 
> Um... I'm not really clear on where you're imagining this going.  In
> the OF model, device operations are done by "opening" a device tree
> node then executing methods on it, so you can't really even get to
> this point without a bunch of DT stuff.

Could you delegate that part to QEMU, as in the v6 patches?  The
firmware would record the path<->ihandle association on open and close,
and then you can use that when GRUB does "read" and "write" to invoke
the appropriate driver.

>> This is not even close to pseudocode, but hopefully enough to give the
>> idea.  Perhaps what I don't understand is why you can't start the
>> firmware with r3 pointing to the device tree, and stash it for when you
>> leave control to GRUB.
> 
> Again, I'm not even really sure what you mean by this.  We already
> enter SLOF with r3 pointing to a device tree.  I'm not sure what
> stashing it would accomplish.  GRUB as it stands expects an OF style
> entry point though, not a flat tree style entry point.

Again, sorry if what I'm saying makes little sense.  The terminology is
certainly off.  What I mean is:

- read the device tree, instantiate all PCI and virtio drivers

- keep the device tree around for use while GRUB is running

- find and invoke GRUB

- on the OF entry point, wrap open and close + handle the disk and
network entry points, and pass everything else to QEMU.
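The split in the list above can be sketched as a routing decision on the service name of each client interface call. This is only an illustration, not code from the v6 patches or from SLOF: the enum, the function name ci_route_for() and the three-way split are assumptions made up for the example. Note that it compares names with strcmp(); comparing C strings with == would compare pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical routing targets for an OF client interface call. */
enum ci_route {
    CI_ROUTE_FIRMWARE, /* handled by in-guest drivers (disk, network) */
    CI_ROUTE_WRAPPED,  /* firmware records state, then forwards to QEMU */
    CI_ROUTE_QEMU      /* passed straight through to QEMU */
};

/*
 * Decide who services a call, given its service name.
 * "read"/"write" stay in the firmware, "open"/"close" are wrapped so the
 * firmware can track the path<->ihandle association, and everything else
 * (e.g. "finddevice", "getprop") goes to QEMU.
 */
static enum ci_route ci_route_for(const char *service)
{
    assert(service != NULL);

    if (strcmp(service, "read") == 0 || strcmp(service, "write") == 0)
        return CI_ROUTE_FIRMWARE;
    if (strcmp(service, "open") == 0 || strcmp(service, "close") == 0)
        return CI_ROUTE_WRAPPED;
    return CI_ROUTE_QEMU;
}
```

In a real entry point the service name would come from the argument array the client passes in; here it is a plain string to keep the sketch self-contained.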

>> The TTY can use the simple
>> getchar/putchar hypercalls,
> 
> Yes... though if people attach a graphical console they might be
> pretty surprised that they don't get anything on there.

They wouldn't with Alexey's code either, would they?  And it would be
yet another QEMU backend to hook into, while with firmware it would be
lots of code to write but super-boring and something that has been done
countless times.

> We can possibly ignore the spapr virtual devices.  They seemed like
> they'd be important for people transitioning from guests under
> PowerVM, but honestly I'm not sure they've ever been used much.
> 
> We do support emulated (or passthrough) PCI devices.  I don't know if
> they're common enough that we need boot support for them.  Netboot
> from a vfio network adaptor might be something people want.

Can you get that with SLOF?

> USB storage is also a fairly likely candidate, and that would add a
> *lot* of extra complexity, since we'd need both the HCD and storage
> drivers.

Any reason to make it USB and not a virtio-blk device?  (On x86 these
days you only add USB storage disks to a VM in order to get drivers to
Windows).

Paolo




* Re: VW ELF loader
  2020-02-10  0:31                   ` Alexey Kardashevskiy
@ 2020-02-13  1:43                     ` Alexey Kardashevskiy
  2020-02-13 10:17                       ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-13  1:43 UTC (permalink / raw)
  To: Paolo Bonzini, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella



On 10/02/2020 11:31, Alexey Kardashevskiy wrote:
> 
> 
> On 07/02/2020 10:46, Paolo Bonzini wrote:
>> On 07/02/20 00:23, Alexey Kardashevskiy wrote:
>>>> Right, not unlike what you get with vof=on. :)  I'm not against at all
>>>> that idea.  I just don't understand what you refer to below as (2).
>>>> Does petitboot not have the problem because it kexecs the new kernel?
>>>
>>> Petitboot does not have this problem *if* it runs without SLOF, i.e.
>>> directly via -kernel and -initrd and uses OF CI (cut down version, about
>>> v3-v4 of my patchset, without block devices and grub lookup). In this
>>> case there is one device tree instance, fully synchronized with the
>>> machine state.
>>>
>>> If there is still SLOF and (2) is happening, then petitboot is screwed
>>> as any other kernel.
>>
>> Ok, so "minimal pseudo-OpenFirmware in QEMU" is doable and can get
>> everything right;
> 
> I am not convinced that ditching drivers is wrong; I am moving the ELF +
> MBR + GPT + GRUB loading to the guest though, so the 20-byte blob becomes
> FDT-less firmware, a few kbytes big.


Ok. So, I have made a small firmware which does OF CI, loads GRUB and
instantiates RTAS:
https://github.com/aik/of1275
Quite raw but gives the idea.

It does not contain drivers and still relies on QEMU to hook an OF path
to a backend. Is this a showstopper, i.e. is it a no-go without drivers? Thanks,



>> it's just work to set up PCI and do all that other
>> do_driver_stuff(), so you can either do it yourself or use
>> Linux+petitboot.  Is this correct?
> 
> Except using the "just" word, yes, correct ;)
> 
>> Also, can a normal distro kernel run via -kernel/-initrd + the minimal
>> firmware in QEMU?
> 
> Yes.


-- 
Alexey



* Re: Restrictions of libnet (was: Re: VW ELF loader)
  2020-02-10  9:39                               ` Michal Suchánek
@ 2020-02-13  3:16                                 ` David Gibson
  0 siblings, 0 replies; 48+ messages in thread
From: David Gibson @ 2020-02-13  3:16 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Thomas Huth, Alexey Kardashevskiy, Cornelia Huck, qemu-devel,
	Christian Borntraeger, qemu-s390x, Paolo Bonzini,
	Stefano Garzarella

On Mon, Feb 10, 2020 at 10:39:52AM +0100, Michal Suchánek wrote:
> On Mon, Feb 10, 2020 at 06:55:16PM +1100, David Gibson wrote:
> > On Wed, Feb 05, 2020 at 07:24:04AM +0100, Thomas Huth wrote:
> > > On 05/02/2020 06.30, David Gibson wrote:
> > > > On Tue, Feb 04, 2020 at 10:20:14AM +0100, Thomas Huth wrote:
> > > >> On 04/02/2020 09.54, Cornelia Huck wrote:
> > > >>> On Tue, 4 Feb 2020 07:16:46 +0100
> > > >>> Thomas Huth <thuth@redhat.com> wrote:
> > > >>>
> > > >>>> On 04/02/2020 00.26, Paolo Bonzini wrote:
> > > >>>>>
> > > >>>>>
> > > >>>>> Il mar 4 feb 2020, 00:20 Alexey Kardashevskiy <aik@ozlabs.ru
> > > >>>>> <mailto:aik@ozlabs.ru>> ha scritto:
> > > >>>>>
> > > >>>>>     Speaking seriously, what would I put into the guest?
> > > >>>>>
> > > >>>>> Only things that would be considered drivers. Ignore the partitions
> > > >>>>> issue for now so that you can just pass the device tree services to QEMU
> > > >>>>> with hypercalls.
> > > >>>>>
> > > >>>>>     Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> > > >>>>>     smaller but adhoc with only a couple of people knowing it.
> > > >>>>>
> > > >>>>>
> > > >>>>> You can generalize and reuse the s390 code. All you have to write is the
> > > >>>>> PCI scan and virtio-pci setup.  
> > > >>>>
> > > >>>> Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
> > > >>>> so re-using this for a slim netboot client on ppc64 would certainly be
> > > >>>> feasible (especially since there are also already virtio drivers in SLOF
> > > >>>> that are written in C), but I think it is not very future proof. The
> > > >>>> libnet from SLOF only supports UDP, and no TCP. So for advanced boot
> > > >>>> scenarios like booting from HTTP or even HTTPS, you need something else
> > > >>>> (i.e. maybe grub is the better option, indeed).
> > > >>>
> > > >>> That makes me wonder what that means for s390: We're inheriting
> > > >>> libnet's limitations, but we don't have grub -- do we need to come up
> > > >>> with something different? Or improve libnet?
> > > >>
> > > >> I don't think that it makes sense to re-invent the wheel yet another
> > > >> time and write yet another TCP implementation (which is likely quite a
> > > >> bit of work, too, especially if you also want to do secure HTTPS in the
> > > >> end). So yes, in the long run (as soon as somebody seriously asks for
> > > >> HTTP booting on s390x) we need something different here.
> > > >>
> > > >> Now looking at our standard s390x bootloader zipl - this has been giving
> > > >> us a headache a couple of times in the past, too (from a distro point of
> > > >> view since s390x is the only major platform left that does not use grub,
> > > >> but also from a s390-ccw bios point of view, see e.g.
> > > >> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03046.html and
> > > >> related discussions).
> > > >>
> > > >> So IMHO the s390x world should move towards grub2, too. We could e.g.
> > > >> link it initially into the s390-ccw bios ... and if that works out
> > > >> well, later also use it as normal bootloader instead of zipl (not sure
> > > >> if that works in all cases, though, IIRC there were some size
> > > >> constraints and stuff like that).
> > > > 
> > > > petitboot would be another reasonable thing to consider here.  Since
> > > > it's Linux based, you have all the drivers you have there.  It's not
> > > > quite grub, but it does at least parse the same configuration files.
> > > > 
> > > > You do need kexec() of course, I don't know if you have that already
> > > > for s390 or not.
> > > 
> > > AFAIK we have kexec on s390. So yes, petitboot would be another option
> > > for replacing the s390-ccw bios. But when it comes to LPARs and z/VMs, I
> > > don't think it's really feasible to replace the zipl bootloader there
> > > with petitboot, so in that case grub2 still sounds like the better
> > > option to me.
> > 
> > Actually, between that and Paolo's suggestions, I thought of another
> > idea that could be helpful for both s390 and power.  Could we load
> > non-kexec() things (legacy kernels, non-Linux OSes) from Petitboot by
> > having it kexec() into a shim mini-kernel that just sets up the boot
> > environment for the other thing.
> > 
> > What I'm imagining is that petitboot loads everything that will be
> > needed for the other OS into RAM - probably as (or part of) the
> > "initrd" image.  That means the shim doesn't need to have drivers or
> > a network stack to load that in.  It just needs to construct
> > environment and jump into the real kernel.
> 
> How does that differ from what kexec normally does?
> 
> It does support the kernel format that is usually booted on the
> architecture.

It's not a question of format, but of environment.

By the time a kexec() occurs there won't be any OF client interface
available, whether or not there ever was one.  So, kexec() won't be
able to exec things which expect that to be present.  That includes
grub, AIX, and (probably) FreeBSD.

Note that while it's the same image, Linux kernels for POWER for a
long time have been able to boot in one of two ways.  One expects the
flattened tree, and an already instantiated RTAS - this method is used
directly by RTAS.  The other expects an OF client interface - it
slurps the device tree out of that and produces the flattened tree,
instantiates RTAS and then jumps to the flat tree entry point.
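As a rough model of the two entry styles described above (not kernel code; every name here is a hypothetical stand-in): the OF client path first builds the flattened tree and instantiates RTAS, then falls into the same flat tree entry point that the other path uses directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the flat tree entry point expects to find already set up. */
struct boot_state {
    const void *fdt;        /* flattened device tree, NULL if absent */
    bool        rtas_ready; /* true once RTAS has been instantiated */
};

enum entry_kind { ENTRY_FLAT_TREE, ENTRY_OF_CLIENT };

/* Stand-ins for work done through the OF client interface. */
static const char fake_fdt[] = "fdt";
static const void *of_flatten_tree(void) { return fake_fdt; }
static bool of_instantiate_rtas(void) { return true; }

/* Flat tree entry: only succeeds if tree and RTAS are already there. */
static bool flat_tree_entry(const struct boot_state *st)
{
    return st->fdt != NULL && st->rtas_ready;
}

/*
 * Common entry. The OF client path slurps the tree and instantiates
 * RTAS first, then continues exactly like the flat tree path.
 */
static bool kernel_entry(enum entry_kind kind, struct boot_state *st)
{
    assert(st != NULL);

    if (kind == ENTRY_OF_CLIENT) {
        st->fdt = of_flatten_tree();
        st->rtas_ready = of_instantiate_rtas();
    }
    return flat_tree_entry(st);
}
```

The point of the model is that after a kexec() only the flat tree path is reachable, so a payload that needs the client interface has nothing to call into.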

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: VW ELF loader
  2020-02-13  1:43                     ` Alexey Kardashevskiy
@ 2020-02-13 10:17                       ` Paolo Bonzini
  2020-02-14  0:01                         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2020-02-13 10:17 UTC (permalink / raw)
  To: Alexey Kardashevskiy, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella

On 13/02/20 02:43, Alexey Kardashevskiy wrote:
> 
> Ok. So, I have made a small firmware which does OF CI, loads GRUB and
> instantiates RTAS:
> https://github.com/aik/of1275
> Quite raw but gives the idea.
> 
> It does not contain drivers and still relies on QEMU to hook an OF path
> to a backend. Is this a showstopper and without drivers it is no go? Thanks,

Yes, it's really the drivers.  Something like netboot wouldn't work for
example.

I don't have a problem with relying on QEMU for opening and closing OF
paths, but I really believe that read/write on ihandles should be done
within the firmware and not QEMU.

Paolo




* Re: VW ELF loader
  2020-02-13 10:17                       ` Paolo Bonzini
@ 2020-02-14  0:01                         ` Alexey Kardashevskiy
  2020-02-14  2:30                           ` David Gibson
  0 siblings, 1 reply; 48+ messages in thread
From: Alexey Kardashevskiy @ 2020-02-14  0:01 UTC (permalink / raw)
  To: Paolo Bonzini, David Gibson
  Cc: Christian Borntraeger, Thomas Huth, qemu-devel, Cornelia Huck,
	Stefano Garzarella



On 13/02/2020 21:17, Paolo Bonzini wrote:
> On 13/02/20 02:43, Alexey Kardashevskiy wrote:
>>
>> Ok. So, I have made a small firmware which does OF CI, loads GRUB and
>> instantiates RTAS:
>> https://github.com/aik/of1275
>> Quite raw but gives the idea.
>>
>> It does not contain drivers and still relies on QEMU to hook an OF path
>> to a backend. Is this a showstopper and without drivers it is no go? Thanks,
> 
> Yes, it's really the drivers.  Something like netboot wouldn't work for
> example.
> 
> I don't have a problem with relying on QEMU for opening and closing OF
> paths, but I really believe that read/write on ihandles should be done
> within the firmware and not QEMU.

Moving read/write to the firmware is not a problem but there is a little
mix up here :)

An ihandle is opened from a path and nothing there suggests drivers; it is
up to the ihandle's "read" method what happens next.

If we do PCI drivers in the firmware, then the entire ihandle (==
"opened instance of a phandle") business goes to the firmware and we are
slowly bringing the existing mess back again.
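To make the ihandle/phandle distinction concrete, here is a minimal sketch. It is not taken from SLOF or from the of1275 repository; the struct layout, zero_read() and ci_read() are invented for illustration. An instance carries the phandle it was opened from plus its own "read" method and private state, so two instances of the same node are independent, and the caller never sees whether a driver or a QEMU backend sits behind "read".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t phandle_t; /* identifies a device tree node */

struct instance;
typedef long (*read_method)(struct instance *inst, void *buf, size_t len);

/* An ihandle refers to one of these: an opened instance of a node. */
struct instance {
    phandle_t   node; /* the phandle this instance was opened from */
    read_method read; /* what "read" means for this instance */
    size_t      pos;  /* per-instance state, here a read offset */
};

/* Hypothetical backend: returns zeroes and advances the offset. */
static long zero_read(struct instance *inst, void *buf, size_t len)
{
    memset(buf, 0, len);
    inst->pos += len;
    return (long)len;
}

/* The client interface "read" just calls the instance's method. */
static long ci_read(struct instance *inst, void *buf, size_t len)
{
    assert(inst != NULL && inst->read != NULL);
    return inst->read(inst, buf, len);
}
```

Swapping zero_read() for a method that issues a hypercall, or for one that drives a virtio queue, changes nothing for the caller, which is the mix-up being pointed out here.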


-- 
Alexey



* Re: VW ELF loader
  2020-02-14  0:01                         ` Alexey Kardashevskiy
@ 2020-02-14  2:30                           ` David Gibson
  0 siblings, 0 replies; 48+ messages in thread
From: David Gibson @ 2020-02-14  2:30 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Thomas Huth, qemu-devel, Cornelia Huck, Christian Borntraeger,
	Paolo Bonzini, Stefano Garzarella

On Fri, Feb 14, 2020 at 11:01:26AM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 13/02/2020 21:17, Paolo Bonzini wrote:
> > On 13/02/20 02:43, Alexey Kardashevskiy wrote:
> >>
> >> Ok. So, I have made a small firmware which does OF CI, loads GRUB and
> >> instantiates RTAS:
> >> https://github.com/aik/of1275
> >> Quite raw but gives the idea.
> >>
> >> It does not contain drivers and still relies on QEMU to hook an OF path
> >> to a backend. Is this a showstopper and without drivers it is no go? Thanks,
> > 
> > Yes, it's really the drivers.  Something like netboot wouldn't work for
> > example.
> > 
> > I don't have a problem with relying on QEMU for opening and closing OF
> > paths, but I really believe that read/write on ihandles should be done
> > within the firmware and not QEMU.
> 
> Moving read/write to the firmware is not a problem but there is a little
> mix up here :)
> 
> An ihandle is open from a path and nothing there suggests drivers, it is
> up to the ihandle's "read" method what happens next.
> 
> If we do PCI drivers in the firmware, then the entire ihandle (==
> "opened instance of a phandle") business goes to the firmware and we are
> slowly bringing the existing mess back again.

Right, even setting aside the specifics of how ihandles are managed,
having the device tree in qemu but device handling in firmware seems
like an even more complex interaction between qemu and firmware pieces
of the environment, which is what we've been trying to avoid here.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: VW ELF loader
  2020-02-10 11:25                             ` Paolo Bonzini
@ 2020-02-14  3:23                               ` David Gibson
  0 siblings, 0 replies; 48+ messages in thread
From: David Gibson @ 2020-02-14  3:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On Mon, Feb 10, 2020 at 12:25:39PM +0100, Paolo Bonzini wrote:
> On 10/02/20 08:30, David Gibson wrote:
> >> Anything you put in the host is potential attack surface.
> > Ok, it is attack surface you're concerned about.  That wasn't totally
> > clear before this point.
> 
> Part that, part having to add backend hooks that weren't needed so far.
> 
> >> Plus, you're not doing a different thing than anyone else and as
> >> you've found out it may be easy for block device but not for
> >> everything else.
> >
> > Uh.. was that supposed to be "we *are* doing a different thing than
> > anyone else"?
> 
> Alexey's question was "what is exactly the benefit", so "not doing a
> different thing" is the answer (one of them).

Ah, right, I see.

> >> Every platform that QEMU supports is just using a firmware to do
> >> firmware things; it can be U-Boot, EDK-2, SLOF, SeaBIOS, qboot, with
> >> varying level of complexity.  Some are doing -kernel in QEMU rather than
> >> firmware, but that's where things end.
> >
> > Well, yeah, but AIUI those platforms actually have a defined hardware
> > environment on which the firmware is running.  For PAPR we don't, we
> > *only* have a specification for the "hardware"+"firmware" environment
> > as seen by the OS together.
> 
> PAPR is a specification for the environment as seen by the OS.  But "-M
> pseries" is already a defined hardware environment on which SLOF is
> running.

"defined" might be a bit strong.  We have on multiple occasions
required synchronized SLOF and qemu updates in order to keep
presenting the same guest environment.

> There's nothing that prevents you from defining more of that
> environment in order to run Linux (for petitboot) or your own
> pseudo-OpenFirmware driver provider inside it.

Well, sure, but we don't want that definition to introduce lots of
complexity we have to maintain on top of the existing HV and firmware
interfaces that are defined.

I realized what I said about the firmware interfaces requiring HV
privilege was a bit misleading.  For the boot time firmware
components, such as the OF client interface, that's mostly not true
(with the big and hairy exception of the
ibm,client-architecture-support feature negotiation mechanism).

It is, however, true of the runtime RTAS interfaces.  In fact it's
true to the point that, at least for most of the RTAS interfaces we
care about, it's hard to imagine an in-guest implementation doing
anything much other than repackaging the information it gets and
forwarding it to the hypervisor.  For most purposes RTAS is pretty
much an alternative hypercall mechanism.  So, I think implementing the
RTAS entirely in qemu was the right choice.

Where it gets complicated is that a number of RTAS calls need to match
IDs with stuff from the boot time firmware.  Particularly phandles of
device nodes, and some other IDs.

Which would be fine if the boot time firmware took the device tree
from qemu and used it unmodified.  But.. it doesn't, quite.  In
particular it assigns its own phandles, because it uses them as an
internal index.  And worse, clients can alter the device tree, and
this is used to a small extent - grub pokes a few values in there for
use later.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: VW ELF loader
  2020-02-10 11:26                         ` Paolo Bonzini
@ 2020-02-14  4:02                           ` David Gibson
  0 siblings, 0 replies; 48+ messages in thread
From: David Gibson @ 2020-02-14  4:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Alexey Kardashevskiy, qemu-devel, Cornelia Huck,
	Christian Borntraeger, Stefano Garzarella

On Mon, Feb 10, 2020 at 12:26:07PM +0100, Paolo Bonzini wrote:
> On 10/02/20 08:28, David Gibson wrote:
> > On Thu, Feb 06, 2020 at 09:27:01AM +0100, Paolo Bonzini wrote:
> >> On 05/02/20 07:06, David Gibson wrote:
> >>> On Tue, Feb 04, 2020 at 12:26:32AM +0100, Paolo Bonzini wrote:
> >> I'm really sorry if what I am saying is stupid; but I was thinking of a
> >> firmware entrypoint like
> >>
> >> 	if (op == "read" || op == "write")
> >> 		do_driver_stuff(op);
> >> 	else
> >> 		hypercall();
> > 
> > Um... I'm not really clear on where you're imagining this going.  In
> > the OF model, device operations are done by "opening" a device tree
> > node then executing methods on it, so you can't really even get to
> > this point without a bunch of DT stuff.
> 
> Could you delegate that part to QEMU, as in the v6 patches?  The
> firmware would record the path<->ihandle association on open and close,
> and then you can use that when GRUB does "read" and "write" to invoke
> the appropriate driver.
> 
> >> This is not even close to pseudocode, but hopefully enough to give the
> >> idea.  Perhaps what I don't understand is why you can't start the
> >> firmware with r3 pointing to the device tree, and stash it for when you
> >> leave control to GRUB.
> > 
> > Again, I'm not even really sure what you mean by this.  We already
> > enter SLOF with r3 pointing to a device tree.  I'm not sure what
> > stashing it would accomplish.  GRUB as it stands expects an OF style
> > entry point though, not a flat tree style entry point.
> 
> Again, sorry if what I'm saying makes little sense.  The terminology is
> certainly off.  What I mean is:
> 
> - read the device tree, instantiate all PCI and virtio drivers
> 
> - keep the device tree around for use while GRUB is running
> 
> - find and invoke GRUB
> 
> - on the OF entry point, wrap open and close + handle the disk and
> network entry points, and pass everything else to QEMU.
> 
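
A minimal sketch of the dispatch described in the list above, in C, just
to make the flow concrete. Every name in it (fw_open, fw_call,
hypercall_stub, has_local_driver, the substring test on the path) is
hypothetical and exists only for illustration; none of it is SLOF, QEMU
or GRUB code, and the "hypercall" is a stub that merely hands out
ihandles on "open".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPEN 8

enum handler { HANDLED_BY_FIRMWARE, HANDLED_BY_HYPERVISOR };

/* One recorded path<->ihandle association; ihandle 0 marks a free slot. */
struct open_node {
    int ihandle;
    char path[64];
};

static struct open_node open_nodes[MAX_OPEN];
static int next_ihandle = 1;
static int hypercall_count;

/* Stand-in for the real hypercall: it only hands out ihandles on "open". */
static int hypercall_stub(const char *op, const char *path)
{
    (void)path;
    hypercall_count++;
    return strcmp(op, "open") == 0 ? next_ihandle++ : 0;
}

/* Assumption: the firmware only has drivers for disk and network nodes. */
static int has_local_driver(const char *path)
{
    return strstr(path, "disk") != NULL || strstr(path, "ethernet") != NULL;
}

static struct open_node *find_node(int ihandle)
{
    if (ihandle == 0)
        return NULL;
    for (size_t i = 0; i < MAX_OPEN; i++)
        if (open_nodes[i].ihandle == ihandle)
            return &open_nodes[i];
    return NULL;
}

/* "open": the hypervisor resolves the path, the firmware records the pair. */
int fw_open(const char *path)
{
    assert(path != NULL);
    for (size_t i = 0; i < MAX_OPEN; i++) {
        if (open_nodes[i].ihandle == 0) {
            open_nodes[i].ihandle = hypercall_stub("open", path);
            strncpy(open_nodes[i].path, path, sizeof(open_nodes[i].path) - 1);
            open_nodes[i].path[sizeof(open_nodes[i].path) - 1] = '\0';
            return open_nodes[i].ihandle;
        }
    }
    return -1; /* no free slot */
}

/* "close": forward the call and drop the recorded association. */
void fw_close(int ihandle)
{
    struct open_node *node = find_node(ihandle);
    if (node != NULL) {
        hypercall_stub("close", node->path);
        node->ihandle = 0;
    }
}

/* Any other call: read/write on a node with a local driver stay in firmware. */
enum handler fw_call(const char *op, int ihandle)
{
    assert(op != NULL);
    int is_io = strcmp(op, "read") == 0 || strcmp(op, "write") == 0;
    struct open_node *node = find_node(ihandle);

    if (is_io && node != NULL && has_local_driver(node->path))
        return HANDLED_BY_FIRMWARE; /* a real firmware would run its driver here */

    hypercall_stub(op, node != NULL ? node->path : "");
    return HANDLED_BY_HYPERVISOR;
}
```

With this shape, a "read" on an opened disk node stays in the firmware,
while a "write" to the TTY or any non-I/O call goes to the hypervisor.
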
> >> The TTY can use the simple
> >> getchar/putchar hypercalls,
> > 
> > Yes... though if people attach a graphical console they might be
> > pretty surprised that they don't get anything on there.
> 
> They wouldn't with Alexey's code either, would they?  And it would be
> yet another QEMU backend to hook into, while with firmware it would be
> lots of code to write but super-boring and something that has been done
> countless times.

That's a fair point.

> > We can possibly ignore the spapr virtual devices.  They seemed like
> > they'd be important for people transitioning from guests under
> > PowerVM, but honestly I'm not sure they've ever been used much.
> > 
> > We do support emulated (or passthrough) PCI devices.  I don't know if
> > they're common enough that we need boot support for them.  Netboot
> > from a vfio network adaptor might be something people want.
> 
> Can you get that with SLOF?

I think yes, if your passthrough device is one of the small number
supported by SLOF.

> > USB storage is also a fairly likely candidate, and that would add a
> > *lot* of extra complexity, since we'd need both the HCD and storage
> > drivers.
> 
> Any reason to make it USB and not a virtio-blk device?  (On x86 these
> days you only add USB storage disks to a VM in order to get drivers to
> Windows).

Hm, yeah, maybe.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

end of thread, other threads:[~2020-02-14  4:03 UTC | newest]

Thread overview: 48+ messages
2020-02-01 13:39 VW ELF loader Alexey Kardashevskiy
2020-02-01 19:04 ` Paolo Bonzini
2020-02-02 11:51   ` Alexey Kardashevskiy
2020-02-02 17:38     ` Paolo Bonzini
2020-02-03  1:31       ` David Gibson
2020-02-03  1:28   ` David Gibson
2020-02-03  9:12     ` Paolo Bonzini
2020-02-03  9:50       ` David Gibson
2020-02-03 10:58       ` Alexey Kardashevskiy
2020-02-03 15:08         ` Paolo Bonzini
2020-02-03 22:36           ` Alexey Kardashevskiy
2020-02-03 22:56             ` Paolo Bonzini
2020-02-03 23:19               ` Alexey Kardashevskiy
2020-02-03 23:26                 ` Paolo Bonzini
2020-02-04  6:16                   ` Thomas Huth
2020-02-04  8:54                     ` Cornelia Huck
2020-02-04  9:20                       ` Restrictions of libnet (was: Re: VW ELF loader) Thomas Huth
2020-02-04  9:32                         ` Thomas Huth
2020-02-04  9:33                         ` Michal Suchánek
2020-02-05  5:30                         ` David Gibson
2020-02-05  6:24                           ` Thomas Huth
2020-02-10  7:55                             ` David Gibson
2020-02-10  9:39                               ` Michal Suchánek
2020-02-13  3:16                                 ` David Gibson
2020-02-04 23:18                   ` VW ELF loader Alexey Kardashevskiy
2020-02-05  6:06                   ` David Gibson
2020-02-05  9:28                     ` Cornelia Huck
2020-02-06  4:47                       ` David Gibson
2020-02-06  8:27                     ` Paolo Bonzini
2020-02-06 23:17                       ` Alexey Kardashevskiy
2020-02-06 23:45                         ` Paolo Bonzini
2020-02-10  7:30                           ` David Gibson
2020-02-10 10:37                             ` Peter Maydell
2020-02-10 11:25                             ` Paolo Bonzini
2020-02-14  3:23                               ` David Gibson
2020-02-10  7:28                       ` David Gibson
2020-02-10 11:26                         ` Paolo Bonzini
2020-02-14  4:02                           ` David Gibson
2020-02-05  5:58           ` David Gibson
2020-02-06  8:29             ` Paolo Bonzini
2020-02-06 23:23               ` Alexey Kardashevskiy
2020-02-06 23:46                 ` Paolo Bonzini
2020-02-10  0:31                   ` Alexey Kardashevskiy
2020-02-13  1:43                     ` Alexey Kardashevskiy
2020-02-13 10:17                       ` Paolo Bonzini
2020-02-14  0:01                         ` Alexey Kardashevskiy
2020-02-14  2:30                           ` David Gibson
2020-02-04  9:40   ` Christian Borntraeger
