All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] Configuring pflash devices for OVMF firmware
@ 2019-01-25 15:03 Markus Armbruster
  2019-01-28  7:58 ` Laszlo Ersek
                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Markus Armbruster @ 2019-01-25 15:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, libvir-list, László Érsek,
	Daniel P. Berrangé,
	Peter Krempa

We configure OVMF firmware for PC machine types with -drive if=pflash.
This is pretty much the last remaining use of -drive in libvirt we can't
yet replace by -blockdev.  Such a replacement is desirable, because
-blockdev + -device is more flexible than -drive if=pflash.  Also, once
we don't need -drive with new QEMU anymore, the path for deleting all
-drive code in libvirt some day is open.  As with all desirables, the
benefit needs to exceed the cost.

I'm going to describe the status quo, how we got there (briefly and much
simplified), then sketch how to replace -drive if=pflash.  I'm afraid
this is fairly long; sorry.  Please correct misunderstandings.  Beware,
my libvirt and OVMF fu is much weaker than my QEMU fu.

In the beginning, board code read the BIOS from a fixed file and mapped
it into the guest's address space.  Life was simple.

On physical hardware, the BIOS can persist a bit of state across (cold)
reboots by storing it in (non-volatile) CMOS RAM.  We didn't bother.
Simple.

Fast forward several years, and The Law of OS Envy (every program wants
to grow into a full-blown operating system) has asserted itself: PC
Firmware has grown from an 8KiB ROM using a few bytes of volatile and
non-volatile RAM into a multi-megabyte beast with much more complex
storage needs.

On today's physical PC hardware, firmware is stored in flash memory.
There's code, and there's persistent data.  For obvious reasons, the
code should be write-protected except when doing an upgrade.  "Secure
boot" additionally needs to restrict data writes to system management
mode (SMM).

Here's our first iteration of OVMF support, at QEMU level:

    -drive if=pflash,format=raw,file=/where/ever/OVMF.fd

Generic code creates a block backend for it.  Magic board code picks up
the backend, creates a frontend (a cfi.pflash01 device), and maps it
into the guest's address space.

At libvirt level:

  <loader type="pflash">/where/ever/OVMF.fd</loader>

Problem: while the flash device model provides read-only capability,
it's all-or-nothing.  You can't tell it to write-protect just the part
holding code.  The examples above don't write-protect anything.
/where/ever/OVMF.fd better be writable exclusively.

The flash device model could be enhanced, but we went down a different
path: we split the single OVMF image OVMF.fd ("unified build") into a
code image OVMF_CODE.fd and a data image OVMF_VARS.fd ("split build").
At QEMU level:

    -drive if=pflash,format=raw,readonly,file=/usr/share/OVMF/OVMF_CODE.fd
    -drive if=pflash,format=raw,file=/where/ever/OVMF_VARS.fd

OVMF_CODE.fd must be unit 0, and OVMF_VARS.fd must be unit 1.

Generic code creates two block backends.  Magic board code picks them
up, creates a frontend (a cfi.pflash01 device) for each, and maps them
into the guest's address space.

Note there are *two* virtual flash devices now, whereas physical
hardware commonly has just one.

At libvirt level:

  <loader type="pflash" readonly="yes">/usr/share/OVMF/OVMF_CODE.fd</loader>
  <nvram template="/usr/share/OVMF/OVMF_VARS.fd">/var/libvirt/nvram/${guest}_VARS.fd</nvram>

This treats OVMF_VARS.fd as a read-only template, and gives each guest
its own writable copy, which is nice.

The flash device model supports restricting writes to SMM (remember,
that's required for secure boot).  It's controlled by cfi.pflash01
property secure, off by default.  If we created the device model with
-device, we'd simply pass secure=on.  But since we create it with -drive
if=pflash, we can't.  Instead we have to use

    -global driver=cfi.pflash01,property=secure,value=on

This flips the global default value.  Awkward, but works out okay,
because (1) the flash device holding OVMF_VARS.fd wants this value, and
(2) the flash device holding OVMF_CODE.fd doesn't care (it's read-only),
and (3) there is no way to create additional flash devices.

At the libvirt level, we add secure='yes' to the loader element.

We also have to enable SMM emulation.  At QEMU level:

    -machine smm=on

At libvirt level:

    <features>
      <smm state='on'/>
    </features>

Note that the above configuration examples involve selecting OVMF
images.  A bit of an inconvenience compared to BIOS, where the default
"use the BIOS shipped with QEMU" pretty much just works.

To add annoyance to inconvenience, different distributions have
different ideas on where to install OVMF images.  And because that's not
complicated enough, we also have to pair code with data images.  And
because that's still not complicated enough, any specific machine type
may work only with a subset of the available firmwares.

The proposed way to deal with all that works as follows.

Each set of firmware images comes with a descriptor file.  These are
JSON and conform to the QAPI schema docs/interop/firmware.json.

Among the descriptors that declare support for the kind of machine we
want, we pick (really: the management application picks) the one with
the highest priority.  The distribution provides default priorities,
which system administrator and user can override.  firmware.json
documents this in much more detail.

I wrote "proposed", because as far as I can tell, neither distributions
nor libvirt are there, yet.

After all this text, I'm finally ready to curve towards -blockdev.
Going from -drive if=T, T!=none to -blockdev involves two steps.  The
first step replaces if=T with if=none and -device.  The second step
replaces -drive if=none with -blockdev.  That step is "obvious" (it took
us a few years to get to obvious, but I digress).  The difficulty is in
the first step.  Two issues:

(1) cfi.pflash01 isn't available with -device.

(2) "Magic board code picks up the backend [created for -drive
    if=pflash], creates a frontend (a cfi.pflash01 device), and maps it
    into the guest's address space."  When we replace if=pflash by
    if=none, we get to replicate that magic on top of -device.

Issue (1) isn't too hard: we add the device to the dynamic sysbus device
white-list, move a sysbus_mmio_map() from pflash_cfi01_realize() into
pflash_cfi01_realize().  The latter requires a new device property to
configure the base address.  I got a working prototype.  Since this
makes the device model's name and properties ABI, review would be
advisable.

To solve (2), we first have to understand the magic.  Device
cfi.pflash01 has the following properties:

    num-blocks                  Size of the device in blocks
    sector-length               Size of a block
                                (admire the choice of names)
    width                       Bank width
    big-endian                  Endianess (d'oh)
    id0, id1, id2, id3          Some kind of device ID, guest-visible,
                                default to zero, few boards change it
    name                        Memory region name
                                (why is this even configurable?)
    phys-addr                   Physical base address
                                (this is the new device property
                                mentioned above)
    secure                      For restricting access to firmware,
                                default off
    device-width                you don't want to know,
                                there is a default, but it's documented
                                as "bad, do not use", yet pretty much
                                all boards use it
    max-device-width            defaults to device-width
                                not actually set anywhere
    old-multiple-chip-handling  back-compat gunk for
                                machine types 2.8 and older

The magic board code in hw/i386/pc_sysfw.c configures as follows:

    num-blocks                  computed from backend size
    sector-length               4096
    width                       1
    big-endian                  0
    id0, id1, id2, id3          all 0
    name                        system.pflash<U>, where U is -drive's
                                unit number
    phys-addr                   computed so
                                unit 0 ends right below 0x100000000,
                                unit n+1 ends at right below unit n

"secure", "device-width", "max-device-width",
"old-multiple-chip-handling" are left at the default.

One additional bit of magic is actually in libvirt: it configures
"secure" by flipping its default with
-global driver=cfi.pflash01,property=secure,value=on.

Now let's consider how to replicate this magic on top of device.

Perhaps machine-type specific defaults could take care of sector-length,
width, big-endian, id0, id1, id2, id3.  Leaves num-blocks, name, and
phys-addr.

Perhaps the realize() method could default num-blocks to size of
backend.  But that doesn't really help the management application,
because it needs to mess with the size anyway to compute phys-addr.  So
scratch that idea.

Moving the magic code to compute num-blocks, phys-addr and name to the
management application is certainly possible, but ugly.

Note that the values computed are fixed when the firmware gets deployed.
If we record them in the firmware descriptor, the management application
doesn't need magic, it can simply pass on the values obtained from the
descriptor.

We'd want to include sector-length in the descriptor then, to ensure
num-block has a defined meaning.

Same technique could take care of width, big-endian, ... in case
machine-type specific defaults turn out to be inadequate for them.

Opinions?

One more problem: the magic board code does a bit more than just
configure the cfi.pflash01 device.  That additional magic needs to be
generalized to work regardless of whether the device gets configured
with -drive if=pflash or with -device.  I got a working prototype.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-25 15:03 [Qemu-devel] Configuring pflash devices for OVMF firmware Markus Armbruster
@ 2019-01-28  7:58 ` Laszlo Ersek
  2019-01-28 10:39 ` Peter Maydell
  2019-02-07  9:30 ` Markus Armbruster
  2 siblings, 0 replies; 45+ messages in thread
From: Laszlo Ersek @ 2019-01-28  7:58 UTC (permalink / raw)
  To: Markus Armbruster, qemu-devel
  Cc: qemu-block, libvir-list, Daniel P. Berrangé,
	Peter Krempa, Peter Maydell

(+Peter, keeping the context)

On 01/25/19 16:03, Markus Armbruster wrote:
> We configure OVMF firmware for PC machine types with -drive if=pflash.
> This is pretty much the last remaining use of -drive in libvirt we can't
> yet replace by -blockdev.  Such a replacement is desirable, because
> -blockdev + -device is more flexible than -drive if=pflash.  Also, once
> we don't need -drive with new QEMU anymore, the path for deleting all
> -drive code in libvirt some day is open.  As with all desirables, the
> benefit needs to exceed the cost.
> 
> I'm going to describe the status quo, how we got there (briefly and much
> simplified), then sketch how to replace -drive if=pflash.  I'm afraid
> this is fairly long; sorry.  Please correct misunderstandings.  Beware,
> my libvirt and OVMF fu is much weaker than my QEMU fu.
> 
> In the beginning, board code read the BIOS from a fixed file and mapped
> it into the guest's address space.  Life was simple.
> 
> On physical hardware, the BIOS can persist a bit of state across (cold)
> reboots by storing it in (non-volatile) CMOS RAM.  We didn't bother.
> Simple.
> 
> Fast forward several years, and The Law of OS Envy (every program wants
> to grow into a full-blown operating system) has asserted itself: PC
> Firmware has grown from an 8KiB ROM using a few bytes of volatile and
> non-volatile RAM into a multi-megabyte beast with much more complex
> storage needs.
> 
> On today's physical PC hardware, firmware is stored in flash memory.
> There's code, and there's persistent data.  For obvious reasons, the
> code should be write-protected except when doing an upgrade.  "Secure
> boot" additionally needs to restrict data writes to system management
> mode (SMM).
> 
> Here's our first iteration of OVMF support, at QEMU level:
> 
>     -drive if=pflash,format=raw,file=/where/ever/OVMF.fd
> 
> Generic code creates a block backend for it.  Magic board code picks up
> the backend, creates a frontend (a cfi.pflash01 device), and maps it
> into the guest's address space.
> 
> At libvirt level:
> 
>   <loader type="pflash">/where/ever/OVMF.fd</loader>
> 
> Problem: while the flash device model provides read-only capability,
> it's all-or-nothing.  You can't tell it to write-protect just the part
> holding code.  The examples above don't write-protect anything.
> /where/ever/OVMF.fd better be writable exclusively.
> 
> The flash device model could be enhanced, but we went down a different
> path: we split the single OVMF image OVMF.fd ("unified build") into a
> code image OVMF_CODE.fd and a data image OVMF_VARS.fd ("split build").
> At QEMU level:
> 
>     -drive if=pflash,format=raw,readonly,file=/usr/share/OVMF/OVMF_CODE.fd
>     -drive if=pflash,format=raw,file=/where/ever/OVMF_VARS.fd
> 
> OVMF_CODE.fd must be unit 0, and OVMF_VARS.fd must be unit 1.
> 
> Generic code creates two block backends.  Magic board code picks them
> up, creates a frontend (a cfi.pflash01 device) for each, and maps them
> into the guest's address space.
> 
> Note there are *two* virtual flash devices now, whereas physical
> hardware commonly has just one.
> 
> At libvirt level:
> 
>   <loader type="pflash" readonly="yes">/usr/share/OVMF/OVMF_CODE.fd</loader>
>   <nvram template="/usr/share/OVMF/OVMF_VARS.fd">/var/libvirt/nvram/${guest}_VARS.fd</nvram>
> 
> This treats OVMF_VARS.fd as a read-only template, and gives each guest
> its own writable copy, which is nice.
> 
> The flash device model supports restricting writes to SMM (remember,
> that's required for secure boot).  It's controlled by cfi.pflash01
> property secure, off by default.  If we created the device model with
> -device, we'd simply pass secure=on.  But since we create it with -drive
> if=pflash, we can't.  Instead we have to use
> 
>     -global driver=cfi.pflash01,property=secure,value=on
> 
> This flips the global default value.  Awkward, but works out okay,
> because (1) the flash device holding OVMF_VARS.fd wants this value, and
> (2) the flash device holding OVMF_CODE.fd doesn't care (it's read-only),
> and (3) there is no way to create additional flash devices.
> 
> At the libvirt level, we add secure='yes' to the loader element.
> 
> We also have to enable SMM emulation.  At QEMU level:
> 
>     -machine smm=on
> 
> At libvirt level:
> 
>     <features>
>       <smm state='on'/>
>     </features>
> 
> Note that the above configuration examples involve selecting OVMF
> images.  A bit of an inconvenience compared to BIOS, where the default
> "use the BIOS shipped with QEMU" pretty much just works.
> 
> To add annoyance to inconvenience, different distributions have
> different ideas on where to install OVMF images.  And because that's not
> complicated enough, we also have to pair code with data images.  And
> because that's still not complicated enough, any specific machine type
> may work only with a subset of the available firmwares.
> 
> The proposed way to deal with all that works as follows.
> 
> Each set of firmware images comes with a descriptor file.  These are
> JSON and conform to the QAPI schema docs/interop/firmware.json.
> 
> Among the descriptors that declare support for the kind of machine we
> want, we pick (really: the management application picks) the one with
> the highest priority.  The distribution provides default priorities,
> which system administrator and user can override.  firmware.json
> documents this in much more detail.
> 
> I wrote "proposed", because as far as I can tell, neither distributions
> nor libvirt are there, yet.
> 
> After all this text, I'm finally ready to curve towards -blockdev.
> Going from -drive if=T, T!=none to -blockdev involves two steps.  The
> first step replaces if=T with if=none and -device.  The second step
> replaces -drive if=none with -blockdev.  That step is "obvious" (it took
> us a few years to get to obvious, but I digress).  The difficulty is in
> the first step.  Two issues:
> 
> (1) cfi.pflash01 isn't available with -device.
> 
> (2) "Magic board code picks up the backend [created for -drive
>     if=pflash], creates a frontend (a cfi.pflash01 device), and maps it
>     into the guest's address space."  When we replace if=pflash by
>     if=none, we get to replicate that magic on top of -device.
> 
> Issue (1) isn't too hard: we add the device to the dynamic sysbus device
> white-list, move a sysbus_mmio_map() from pflash_cfi01_realize() into
> pflash_cfi01_realize().  The latter requires a new device property to
> configure the base address.  I got a working prototype.  Since this
> makes the device model's name and properties ABI, review would be
> advisable.
> 
> To solve (2), we first have to understand the magic.  Device
> cfi.pflash01 has the following properties:
> 
>     num-blocks                  Size of the device in blocks
>     sector-length               Size of a block
>                                 (admire the choice of names)
>     width                       Bank width
>     big-endian                  Endianess (d'oh)
>     id0, id1, id2, id3          Some kind of device ID, guest-visible,
>                                 default to zero, few boards change it
>     name                        Memory region name
>                                 (why is this even configurable?)
>     phys-addr                   Physical base address
>                                 (this is the new device property
>                                 mentioned above)
>     secure                      For restricting access to firmware,
>                                 default off
>     device-width                you don't want to know,
>                                 there is a default, but it's documented
>                                 as "bad, do not use", yet pretty much
>                                 all boards use it
>     max-device-width            defaults to device-width
>                                 not actually set anywhere
>     old-multiple-chip-handling  back-compat gunk for
>                                 machine types 2.8 and older
> 
> The magic board code in hw/i386/pc_sysfw.c configures as follows:
> 
>     num-blocks                  computed from backend size
>     sector-length               4096
>     width                       1
>     big-endian                  0
>     id0, id1, id2, id3          all 0
>     name                        system.pflash<U>, where U is -drive's
>                                 unit number
>     phys-addr                   computed so
>                                 unit 0 ends right below 0x100000000,
>                                 unit n+1 ends at right below unit n
> 
> "secure", "device-width", "max-device-width",
> "old-multiple-chip-handling" are left at the default.
> 
> One additional bit of magic is actually in libvirt: it configures
> "secure" by flipping its default with
> -global driver=cfi.pflash01,property=secure,value=on.
> 
> Now let's consider how to replicate this magic on top of device.
> 
> Perhaps machine-type specific defaults could take care of sector-length,
> width, big-endian, id0, id1, id2, id3.  Leaves num-blocks, name, and
> phys-addr.
> 
> Perhaps the realize() method could default num-blocks to size of
> backend.  But that doesn't really help the management application,
> because it needs to mess with the size anyway to compute phys-addr.  So
> scratch that idea.
> 
> Moving the magic code to compute num-blocks, phys-addr and name to the
> management application is certainly possible, but ugly.
> 
> Note that the values computed are fixed when the firmware gets deployed.
> If we record them in the firmware descriptor, the management application
> doesn't need magic, it can simply pass on the values obtained from the
> descriptor.
> 
> We'd want to include sector-length in the descriptor then, to ensure
> num-block has a defined meaning.
> 
> Same technique could take care of width, big-endian, ... in case
> machine-type specific defaults turn out to be inadequate for them.
> 
> Opinions?
> 
> One more problem: the magic board code does a bit more than just
> configure the cfi.pflash01 device.  That additional magic needs to be
> generalized to work regardless of whether the device gets configured
> with -drive if=pflash or with -device.  I got a working prototype.
> 

Great write-up.

For i440fx and q35, exposing the size info (i.e., "num-blocks" and
"sector-length"), and the base address ("phys-addr") in the firmware
descriptor json would be justifiable, and not much of a burden to fill in.

Regarding "name", I have no idea what that is used for (and so whether
it must remain "compatible"), so maybe we could auto-generate that in
generic "-device" code in QEMU. Not sure. If it must go in the
descriptor, it can, but it wouldn't provide any useful distinction
between OVMF builds at least.

"secure" is already covered in the descriptor, via @requires-smm.

Regarding the rest of the properties, from my POV, I'd prefer to
continue controlling them via machine type-specific defaults. My only
reason is to keep the json reasonably clutter-free. OTOH, if they
significantly helped this conversion and/or libvirtd, I wouldn't insist
on keeping them out.

Thanks!
Laszlo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-25 15:03 [Qemu-devel] Configuring pflash devices for OVMF firmware Markus Armbruster
  2019-01-28  7:58 ` Laszlo Ersek
@ 2019-01-28 10:39 ` Peter Maydell
  2019-01-28 12:40   ` [Qemu-devel] [libvirt] " Gerd Hoffmann
                     ` (2 more replies)
  2019-02-07  9:30 ` Markus Armbruster
  2 siblings, 3 replies; 45+ messages in thread
From: Peter Maydell @ 2019-01-28 10:39 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: QEMU Developers, Libvirt, Peter Krempa,
	László Érsek, Qemu-block

On Fri, 25 Jan 2019 at 15:11, Markus Armbruster <armbru@redhat.com> wrote:
> (1) cfi.pflash01 isn't available with -device.
>
> (2) "Magic board code picks up the backend [created for -drive
>     if=pflash], creates a frontend (a cfi.pflash01 device), and maps it
>     into the guest's address space."  When we replace if=pflash by
>     if=none, we get to replicate that magic on top of -device.
>
> Issue (1) isn't too hard: we add the device to the dynamic sysbus device
> white-list, move a sysbus_mmio_map() from pflash_cfi01_realize() into
> pflash_cfi01_realize().  The latter requires a new device property to
> configure the base address.  I got a working prototype.  Since this
> makes the device model's name and properties ABI, review would be
> advisable.

Flash devices exist on the board at specific addresses, so they
should in general be created by the board model. Creating them
by the user on the command line is a mess because then the
user has to know the right base address. And then the board
code needs to do something for "if the user didn't create this
then we need to do it", because the flash device should exist
in the model whether the user cared about its contents or not.

Dynamic sysbus is something I'd rather we did less of, not more
of. It's there because it solves an annoying problem (people
want to do device passthrough of hardware that's not on a nice
pluggable probeable bus), but it's really awkward.

> To solve (2), we first have to understand the magic.  Device
> cfi.pflash01 has the following properties:
>
>     num-blocks                  Size of the device in blocks
>     sector-length               Size of a block
>                                 (admire the choice of names)
>     width                       Bank width
>     big-endian                  Endianess (d'oh)
>     id0, id1, id2, id3          Some kind of device ID, guest-visible,
>                                 default to zero, few boards change it

Note that most of this is stuff that the hardware has.
A lot of boards set these to garbage values which happened
to be what the very old implementation of pflash hardcoded,
because most guests don't care. This is strictly speaking
wrong and they should use whatever the hardware really has,
but most of these cases are for old not-very-maintained dev
boards where probably nobody even has the relevant hardware
even if they cared enough to find out what its ID values are.

>     name                        Memory region name
>                                 (why is this even configurable?)

(a) for debug purposes, so a machine can create two flash
devices and give them names which make them easier to tell
apart in monitor "info mem" and similar command output
(b) more  importantly, the memory region name is used as
the migration vmstate ram name, so if you change it you
break migration compat.

>     phys-addr                   Physical base address
>                                 (this is the new device property
>                                 mentioned above)
>     secure                      For restricting access to firmware,
>                                 default off
>     device-width                you don't want to know,
>                                 there is a default, but it's documented
>                                 as "bad, do not use", yet pretty much
>                                 all boards use it

See above about "old not-very-maintained dev boards". A
board which does use this is one that's doing it for
back-compat because nobody's cared to fix and test.

>     max-device-width            defaults to device-width
>                                 not actually set anywhere
>     old-multiple-chip-handling  back-compat gunk for
>                                 machine types 2.8 and older
>
> The magic board code in hw/i386/pc_sysfw.c configures as follows:
>
>     num-blocks                  computed from backend size
>     sector-length               4096
>     width                       1
>     big-endian                  0
>     id0, id1, id2, id3          all 0
>     name                        system.pflash<U>, where U is -drive's
>                                 unit number
>     phys-addr                   computed so
>                                 unit 0 ends right below 0x100000000,
>                                 unit n+1 ends at right below unit n
>
> "secure", "device-width", "max-device-width",
> "old-multiple-chip-handling" are left at the default.
>
> One additional bit of magic is actually in libvirt: it configures
> "secure" by flipping its default with
> -global driver=cfi.pflash01,property=secure,value=on.
>
> Now let's consider how to replicate this magic on top of device.
>
> Perhaps machine-type specific defaults could take care of sector-length,
> width, big-endian, id0, id1, id2, id3.  Leaves num-blocks, name, and
> phys-addr.

You could at least in theory have a machine with several
different flash devices of different make/ID. This is why
they're device properties. I think they're fine the way
they are, ie set by the code that creates the device.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] [libvirt] Configuring pflash devices for OVMF firmware
  2019-01-28 10:39 ` Peter Maydell
@ 2019-01-28 12:40   ` Gerd Hoffmann
  2019-01-28 13:06     ` Peter Maydell
  2019-01-30  7:24   ` [Qemu-devel] " Markus Armbruster
  2019-01-30 14:13   ` Markus Armbruster
  2 siblings, 1 reply; 45+ messages in thread
From: Gerd Hoffmann @ 2019-01-28 12:40 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Markus Armbruster, Libvirt, Peter Krempa,
	László Érsek, QEMU Developers, Qemu-block

  Hi,

> > Issue (1) isn't too hard: we add the device to the dynamic sysbus device
> > white-list, move a sysbus_mmio_map() from pflash_cfi01_realize() into
> > pflash_cfi01_realize().  The latter requires a new device property to
> > configure the base address.  I got a working prototype.  Since this
> > makes the device model's name and properties ABI, review would be
> > advisable.
> 
> Flash devices exist on the board at specific addresses, so they
> should in general be created by the board model. Creating them
> by the user on the command line is a mess because then the
> user has to know the right base address. And then the board
> code needs to do something for "if the user didn't create this
> then we need to do it", because the flash device should exist
> in the model whether the user cared about its contents or not.

The tricky part is the access control here.  On physical hardware you
typically have one flash rom, say 16M below 4G (on x86).

Our pflash device doesn't allow to define multiple regions, so we use
multiple pflash devices instead, each with different access permissions
(code vs. vars).  Because of that they are more dynamic than they are on
phyiscal hardware, x86 sizes them according to the size of the firmware
images (arm is easier here, we have fixed size and location no matter
how big the firmware images are).

So I think the options we have are:
  (a) leave pflash as-is, which pretty much implies physaddr and size
      must be user-configurable.
  (b) add support for multiple regions to pflash, so one can attach
      multiple blockdev at different offsets to a single pflash device.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] [libvirt] Configuring pflash devices for OVMF firmware
  2019-01-28 12:40   ` [Qemu-devel] [libvirt] " Gerd Hoffmann
@ 2019-01-28 13:06     ` Peter Maydell
  2019-01-28 14:55       ` Laszlo Ersek
  2019-01-30  7:36       ` Markus Armbruster
  0 siblings, 2 replies; 45+ messages in thread
From: Peter Maydell @ 2019-01-28 13:06 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Markus Armbruster, Libvirt, Peter Krempa,
	László Érsek, QEMU Developers, Qemu-block

On Mon, 28 Jan 2019 at 12:40, Gerd Hoffmann <kraxel@redhat.com> wrote:
> The tricky part is the access control here.  On physical hardware you
> typically have one flash rom, say 16M below 4G (on x86).
>
> Our pflash device doesn't allow to define multiple regions, so we use
> multiple pflash devices instead, each with different access permissions
> (code vs. vars).  Because of that they are more dynamic than they are on
> phyiscal hardware, x86 sizes them according to the size of the firmware
> images (arm is easier here, we have fixed size and location no matter
> how big the firmware images are).
>
> So I think the options we have are:
>   (a) leave pflash as-is, which pretty much implies physaddr and size
>       must be user-configurable.
>   (b) add support for multiple regions to pflash, so one can attach
>       multiple blockdev at different offsets to a single pflash device.

The latter makes more sense to me -- we should model what the
hardware actually has, because the guest can tell the difference.
(Migration compat is left as an exercise for the reader :-))

thanks
-- PMM

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] [libvirt] Configuring pflash devices for OVMF firmware
  2019-01-28 13:06     ` Peter Maydell
@ 2019-01-28 14:55       ` Laszlo Ersek
  2019-01-28 14:58         ` Peter Maydell
  2019-01-30  7:36       ` Markus Armbruster
  1 sibling, 1 reply; 45+ messages in thread
From: Laszlo Ersek @ 2019-01-28 14:55 UTC (permalink / raw)
  To: Peter Maydell, Gerd Hoffmann
  Cc: Markus Armbruster, Libvirt, Peter Krempa, QEMU Developers,
	Qemu-block, Ard Biesheuvel

On 01/28/19 14:06, Peter Maydell wrote:
> On Mon, 28 Jan 2019 at 12:40, Gerd Hoffmann <kraxel@redhat.com> wrote:
>> The tricky part is the access control here.  On physical hardware you
>> typically have one flash rom, say 16M below 4G (on x86).
>>
>> Our pflash device doesn't allow to define multiple regions, so we use
>> multiple pflash devices instead, each with different access permissions
>> (code vs. vars).  Because of that they are more dynamic than they are on
>> phyiscal hardware, x86 sizes them according to the size of the firmware
>> images (arm is easier here, we have fixed size and location no matter
>> how big the firmware images are).
>>
>> So I think the options we have are:
>>   (a) leave pflash as-is, which pretty much implies physaddr and size
>>       must be user-configurable.
>>   (b) add support for multiple regions to pflash, so one can attach
>>       multiple blockdev at different offsets to a single pflash device.
> 
> The latter makes more sense to me -- we should model what the
> hardware actually has, because the guest can tell the difference.

Regarding OVMF, I kept the flash driver intentionally in the dark about
split vs. unified pflash, so OVMF will not care, as long as the same
GPAs behave the same as before.

Regarding ArmVirtQemu, I'm not so sure. It think it already depends on
two separate pflash chips, through the DTB that QEMU exposes. So
unifying the chips (albeit with multiple regions) might actually confuse
the firmware. I'm CC'ing Ard, he's done some work on the ARM NOR flash
driver recently, incl. updates to its usage in ArmVirtQemu:

* https://github.com/tianocore/edk2/commit/5e27deed438b

  [edk2] [PATCH v2 3/4]
  ArmVirtPkg/NorFlashQemuLib: disregard our primary FV

* https://github.com/tianocore/edk2/commit/51bb05c79595

  [edk2] [PATCH v2 4/4]
  ArmVirtPkg/QemuVirtMemInfoLib: trim the MMIO region mapping

Thanks
Laszlo

> (Migration compat is left as an exercise for the reader :-))
> 
> thanks
> -- PMM
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] [libvirt] Configuring pflash devices for OVMF firmware
  2019-01-28 14:55       ` Laszlo Ersek
@ 2019-01-28 14:58         ` Peter Maydell
  2019-01-28 15:03           ` Laszlo Ersek
  0 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2019-01-28 14:58 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Gerd Hoffmann, Markus Armbruster, Libvirt, Peter Krempa,
	QEMU Developers, Qemu-block, Ard Biesheuvel

On Mon, 28 Jan 2019 at 14:56, Laszlo Ersek <lersek@redhat.com> wrote:
> Regarding OVMF, I kept the flash driver intentionally in the dark about
> split vs. unified pflash, so OVMF will not care, as long as the same
> GPAs behave the same as before.
>
> Regarding ArmVirtQemu, I'm not so sure. It think it already depends on
> two separate pflash chips, through the DTB that QEMU exposes. So
> unifying the chips (albeit with multiple regions) might actually confuse
> the firmware.

"virt" is a funny case because there is no underlying hardware
that we're trying to match. So it is whatever we say it is
and we've said it's two separate pflash chips. We don't need
to change that I think. (IIRC this is derived partly from
OVMF usecase requirements and partly from the vexpress devboards
having 2 flash chips.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] [libvirt] Configuring pflash devices for OVMF firmware
  2019-01-28 14:58         ` Peter Maydell
@ 2019-01-28 15:03           ` Laszlo Ersek
  0 siblings, 0 replies; 45+ messages in thread
From: Laszlo Ersek @ 2019-01-28 15:03 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Gerd Hoffmann, Markus Armbruster, Libvirt, Peter Krempa,
	QEMU Developers, Qemu-block, Ard Biesheuvel

On 01/28/19 15:58, Peter Maydell wrote:
> On Mon, 28 Jan 2019 at 14:56, Laszlo Ersek <lersek@redhat.com> wrote:
>> Regarding OVMF, I kept the flash driver intentionally in the dark about
>> split vs. unified pflash, so OVMF will not care, as long as the same
>> GPAs behave the same as before.
>>
>> Regarding ArmVirtQemu, I'm not so sure. It think it already depends on
>> two separate pflash chips, through the DTB that QEMU exposes. So
>> unifying the chips (albeit with multiple regions) might actually confuse
>> the firmware.
> 
> "virt" is a funny case because there is no underlying hardware
> that we're trying to match. So it is whatever we say it is
> and we've said it's two separate pflash chips. We don't need
> to change that I think. (IIRC this is derived partly from
> OVMF usecase requirements and partly from the vexpress devboards
> having 2 flash chips.)

That sounds great, but in that case, would the work be justified in
practice, to introduce single-ship-with-multiple-regions, if only x86
put it to use (for OVMF), but arm/aarch64 "virt" would have to stick
with the current layout?

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-28 10:39 ` Peter Maydell
  2019-01-28 12:40   ` [Qemu-devel] [libvirt] " Gerd Hoffmann
@ 2019-01-30  7:24   ` Markus Armbruster
  2019-01-30 15:24     ` Peter Maydell
  2019-01-30 14:13   ` Markus Armbruster
  2 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-30  7:24 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Libvirt, Peter Krempa, László Érsek,
	QEMU Developers, Qemu-block

Let me reply to the "why is the cfi.pflash01 device so weird" part
first, because that's relatively quick, and because it could easily
distract us from the more important "how do we want to configure OVMF"
part.  I'll reply to that part later.

Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 25 Jan 2019 at 15:11, Markus Armbruster <armbru@redhat.com> wrote:
[...]
>> To solve (2), we first have to understand the magic.  Device
>> cfi.pflash01 has the following properties:
>>
>>     num-blocks                  Size of the device in blocks
>>     sector-length               Size of a block
>>                                 (admire the choice of names)
>>     width                       Bank width
>>     big-endian                  Endianess (d'oh)
>>     id0, id1, id2, id3          Some kind of device ID, guest-visible,
>>                                 default to zero, few boards change it
>
> Note that most of this is stuff that the hardware has.
> A lot of boards set these to garbage values which happened
> to be what the very old implementation of pflash hardcoded,
> because most guests don't care. This is strictly speaking
> wrong and they should use whatever the hardware really has,
> but most of these cases are for old not-very-maintained dev
> boards where probably nobody even has the relevant hardware
> even if they cared enough to find out what its ID values are.

Why are we emulating (badly) stuff nobody cares about enough to find out
what exactly we should be emulating, the world wonders...

>>     name                        Memory region name
>>                                 (why is this even configurable?)
>
> (a) for debug purposes, so a machine can create two flash
> devices and give them names which make them easier to tell
> apart in monitor "info mem" and similar command output

That's what we have qdev IDs for.

> (b) more  importantly, the memory region name is used as
> the migration vmstate ram name, so if you change it you
> break migration compat.

Sounds about as useful as a --break-me-but-subtly option.

>>     phys-addr                   Physical base address
>>                                 (this is the new device property
>>                                 mentioned above)
>>     secure                      For restricting access to firmware,
>>                                 default off
>>     device-width                you don't want to know,
>>                                 there is a default, but it's documented
>>                                 as "bad, do not use", yet pretty much
>>                                 all boards use it
>
> See above about "old not-very-maintained dev boards". A
> board which does use this is one that's doing it for
> back-compat because nobody's cared to fix and test.

Boards that seem to care:

    hw/arm/vexpress.c:    qdev_prop_set_uint8(dev, "device-width", 2);
    hw/arm/virt.c:    qdev_prop_set_uint8(dev, "device-width", 2);

Boards that seem not to care:

    hw/arm/collie.c:    pflash_cfi01_register(SA_CS0, NULL, "collie.fl1", 0x02000000,
    hw/arm/collie.c:    pflash_cfi01_register(SA_CS1, NULL, "collie.fl2", 0x02000000,
    hw/arm/gumstix.c:    if (!pflash_cfi01_register(0x00000000, NULL, "connext.rom", connex_rom,
    hw/arm/gumstix.c:    if (!pflash_cfi01_register(0x00000000, NULL, "verdex.rom", verdex_rom,
    hw/arm/mainstone.c:        if (!pflash_cfi01_register(mainstone_flash_base[i], NULL,
    hw/arm/omap_sx1.c:        if (!pflash_cfi01_register(OMAP_CS0_BASE, NULL,
    hw/arm/omap_sx1.c:        if (!pflash_cfi01_register(OMAP_CS1_BASE, NULL,
    hw/arm/versatilepb.c:    if (!pflash_cfi01_register(VERSATILE_FLASH_ADDR, NULL, "versatile.flash",
    hw/arm/vexpress.c:static pflash_t *ve_pflash_cfi01_register(hwaddr base, const char *name,
    hw/arm/vexpress.c:    pflash0 = ve_pflash_cfi01_register(map[VE_NORFLASH0], "vexpress.flash0",
    hw/arm/vexpress.c:    if (!ve_pflash_cfi01_register(map[VE_NORFLASH1], "vexpress.flash1",
    hw/arm/z2.c:    if (!pflash_cfi01_register(Z2_FLASH_BASE,
    hw/i386/pc_sysfw.c:        /* pflash_cfi01_register() creates a deep copy of the name */
    hw/i386/pc_sysfw.c:        system_flash = pflash_cfi01_register(phys_addr, NULL /* qdev */, name,
    hw/lm32/milkymist.c:    pflash_cfi01_register(flash_base, NULL, "milkymist.flash", flash_size,
    hw/microblaze/petalogix_ml605_mmu.c:    pflash_cfi01_register(FLASH_BASEADDR,
    hw/microblaze/petalogix_s3adsp1800_mmu.c:    pflash_cfi01_register(FLASH_BASEADDR,
    hw/mips/mips_malta.c:    fl = pflash_cfi01_register(FLASH_ADDRESS, NULL, "mips_malta.bios",
    hw/mips/mips_r4k.c:        if (!pflash_cfi01_register(0x1fc00000, NULL, "mips_r4k.bios", mips_rom,
    hw/ppc/sam460ex.c:    if (!pflash_cfi01_register(base, NULL, "sam460ex.flash", bios_size,
    hw/ppc/virtex_ml507.c:    pflash_cfi01_register(PFLASH_BASEADDR, NULL, "virtex.flash", FLASH_SIZE,

At least PC can't be characterized as "not-very-maintaned dev board".

>>     max-device-width            defaults to device-width
>>                                 not actually set anywhere
>>     old-multiple-chip-handling  back-compat gunk for
>>                                 machine types 2.8 and older
>>
>> The magic board code in hw/i386/pc_sysfw.c configures as follows:
>>
>>     num-blocks                  computed from backend size
>>     sector-length               4096
>>     width                       1
>>     big-endian                  0
>>     id0, id1, id2, id3          all 0
>>     name                        system.pflash<U>, where U is -drive's
>>                                 unit number
>>     phys-addr                   computed so
>>                                 unit 0 ends right below 0x100000000,
>>                                 unit n+1 ends at right below unit n
>>
>> "secure", "device-width", "max-device-width",
>> "old-multiple-chip-handling" are left at the default.
>>
>> One additional bit of magic is actually in libvirt: it configures
>> "secure" by flipping its default with
>> -global driver=cfi.pflash01,property=secure,value=on.
[...]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] [libvirt] Configuring pflash devices for OVMF firmware
  2019-01-28 13:06     ` Peter Maydell
  2019-01-28 14:55       ` Laszlo Ersek
@ 2019-01-30  7:36       ` Markus Armbruster
  2019-01-30  8:00         ` Gerd Hoffmann
  1 sibling, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-30  7:36 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Gerd Hoffmann, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, László Érsek

Peter Maydell <peter.maydell@linaro.org> writes:

> On Mon, 28 Jan 2019 at 12:40, Gerd Hoffmann <kraxel@redhat.com> wrote:
>> The tricky part is the access control here.  On physical hardware you
>> typically have one flash rom, say 16M below 4G (on x86).
>>
>> Our pflash device doesn't allow to define multiple regions, so we use
>> multiple pflash devices instead, each with different access permissions
>> (code vs. vars).  Because of that they are more dynamic than they are on
>> phyiscal hardware, x86 sizes them according to the size of the firmware
>> images (arm is easier here, we have fixed size and location no matter
>> how big the firmware images are).
>>
>> So I think the options we have are:
>>   (a) leave pflash as-is, which pretty much implies physaddr and size
>>       must be user-configurable.
>>   (b) add support for multiple regions to pflash, so one can attach
>>       multiple blockdev at different offsets to a single pflash device.
>
> The latter makes more sense to me -- we should model what the
> hardware actually has, because the guest can tell the difference.

No argument.

> (Migration compat is left as an exercise for the reader :-))

It's not just migration compatibility, it's also guest ABI: "the guest
can tell the difference".  So, old machine types continue to get two
flash devices, and new machine types get one.  In other words, we're
stuck maintaining (a) even if we decide to switch to (b).

By itself, not an argument against doing the right thing for new machine
types.  It's an argument for trying harder doing the right thing from
the start next time.

I'm not sure whether doing the right thing now is worthwhile from a
practical point of view.

I figure the need for a region-capable flash device model may well come
up again in contexts other than PC firmware.  Perhaps we should discount
the cost of creating it when weighing the tradeoff for PC.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] [libvirt] Configuring pflash devices for OVMF firmware
  2019-01-30  7:36       ` Markus Armbruster
@ 2019-01-30  8:00         ` Gerd Hoffmann
  0 siblings, 0 replies; 45+ messages in thread
From: Gerd Hoffmann @ 2019-01-30  8:00 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, László Érsek

  Hi,

> > (Migration compat is left as an exercise for the reader :-))
> 
> It's not just migration compatibility, it's also guest ABI: "the guest
> can tell the difference".

Is that actually the case on x86?  I don't think so.

Note: arm is different, because the flash is listed in the device tree
which is guest visible.

> So, old machine types continue to get two flash devices, and new
> machine types get one.  In other words, we're stuck maintaining (a)
> even if we decide to switch to (b).

I think that'll be true nevertheless for live migration compatibility.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-28 10:39 ` Peter Maydell
  2019-01-28 12:40   ` [Qemu-devel] [libvirt] " Gerd Hoffmann
  2019-01-30  7:24   ` [Qemu-devel] " Markus Armbruster
@ 2019-01-30 14:13   ` Markus Armbruster
  2019-01-30 14:33     ` Paolo Bonzini
  2 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-30 14:13 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Libvirt, Peter Krempa, László Érsek,
	QEMU Developers, Qemu-block, Paolo Bonzini

Cc: Paolo for additonal device infrastructure expertise.

Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 25 Jan 2019 at 15:11, Markus Armbruster <armbru@redhat.com> wrote:
>> (1) cfi.pflash01 isn't available with -device.
>>
>> (2) "Magic board code picks up the backend [created for -drive
>>     if=pflash], creates a frontend (a cfi.pflash01 device), and maps it
>>     into the guest's address space."  When we replace if=pflash by
>>     if=none, we get to replicate that magic on top of -device.
>>
>> Issue (1) isn't too hard: we add the device to the dynamic sysbus device
>> white-list, move a sysbus_mmio_map() from pflash_cfi01_realize() into
>> pflash_cfi01_realize().  The latter requires a new device property to
>> configure the base address.  I got a working prototype.  Since this
>> makes the device model's name and properties ABI, review would be
>> advisable.
>
> Flash devices exist on the board at specific addresses, so they
> should in general be created by the board model. Creating them
> by the user on the command line is a mess because then the
> user has to know the right base address.

I agree outsourcing configuration of onboard devices to the user is not
very nice.  However, the current situation is also not nice: onboard
device configuration is different from plugged device configuration, and
markedly inferior.

Our common mechanism to configure devices is device properties.  Some
properties serve to configure the device (e.g. e1000.mac), some serve to
connect it to backends (e.g. e1000.netdev), and some serve to wire it up
(e.g. e1000.bus and .addr).

This mechanism is pretty much a specialization of our general QOM
configuration mechanism (although it predates QOM).

This mechanism is satisfactorily flexible.  In particular, you can
capture commonalities such as "all PCI devices have these properties" or
"all NIC devices have these properties", and still have individual
devices expose their own idiosyncratic properties.

-device / device_add provide full control over properties.

Onboard device configuration is totally different, and leaves most
properties inaccessible.

Board code creates the devices and sets (some of) their properties.  The
property values come from ad hoc board configuration, magic board code,
or the device model's default.

Ad hoc configuration knobs include -drive if=T where T != none, -serial,
-net nic, and more.  They work roughly like this.  Generic code stores
the user's ad hoc board configuration in well-known places.  Board code
picks up some configuration bits, and ignores the rest.

Let's have a closer look at one of these beauties: -serial.  If you
don't need to be convinced the ad hoc configuration is problematic, you
may want to skip ahead to "Observe:".

Let's start with the default:

    $ qemu-system-aarch64 -S -display none -monitor stdio -machine virt
    QEMU 3.1.50 monitor - type 'help' for more information
    (qemu) info chardev 
    parallel0: filename=vc
    serial0: filename=vc
    compat_monitor0: filename=stdio
    (qemu) info qtree
    [...]
      dev: pl011, id ""
        gpio-out "sysbus-irq" 1
        chardev = "serial0"
        mmio ffffffffffffffff/0000000000001000
    [...]

There's a default chardev "serial0".  It's created by generic code
unless the user or the board supress it (this board doesn't).  Its exact
configuration depends on several factors too complicated to explain
here.

There's an onboard serial device pl011, connected to the default
chardev.

Say you don't like the default chardev.  You can suppress it with
-nodefaults or -serial none.  This board creates the serial device
anyway, since its a mandatory onboard device.  It just creates it with a
null backend:

    $ qemu-system-aarch64 -S -display none -monitor stdio -machine virt -nodefaults
    (qemu) info chardev 
    compat_monitor0: filename=stdio
    (qemu) info qtree
    [...]
      dev: pl011, id ""
        gpio-out "sysbus-irq" 1
        chardev = ""
        mmio ffffffffffffffff/0000000000001000
    [...]

Boards with optional onboard serial devices would suppress the entire
device instead.  If you want a null backend without suppressing the
device, use -serial null:

    $ qemu-system-aarch64 -S -display none -monitor stdio -machine virt -serial null
    QEMU 3.1.50 monitor - type 'help' for more information
    (qemu) info chardev 
    parallel0: filename=vc
    serial0: filename=null
    compat_monitor0: filename=stdio
    (qemu) info qtree
    [...]
    dev: pl011, id ""
      gpio-out "sysbus-irq" 1
      chardev = "serial0"
      mmio ffffffffffffffff/0000000000001000
    [...]

That's the chardev type "null", which is not the same as the null
backend we just saw.  They just behave the same, as far as I know.

As example for a "non-null" chardev, here's one wrapping a Unix domain
socket:

    $ qemu-system-aarch64 -S -display none -monitor stdio -machine virt -serial unix:./serial-socket,server,wait=off
    QEMU 3.1.50 monitor - type 'help' for more information
    (qemu) info chardev 
    parallel0: filename=vc
    serial0: filename=disconnected:unix:./serial-socket,server
    compat_monitor0: filename=stdio
    (qemu) info qtree
    [...]
      dev: pl011, id ""
        gpio-out "sysbus-irq" 1
        chardev = "serial0"
        mmio ffffffffffffffff/0000000000001000
    [...]

You can have any number of -serial:

    $ qemu-system-aarch64 -S -display none -monitor stdio -machine virt -serial null -serial unix:./serial-socket,server,wait=off
    (qemu) info chardev
    parallel0: filename=vc
    serial0: filename=null
    compat_monitor0: filename=stdio
    serial1: filename=disconnected:unix:./serial-socket,server

This board silently ignores all but the first.  Except with -machine
virt,secure, it ignores all but the first two:

    $ qemu-system-aarch64 -S -display none -monitor stdio -machine virt,secure -serial null -serial unix:./serial-socket,server,wait=off
    QEMU 3.1.50 monitor - type 'help' for more information
    (qemu) info chardev
    parallel0: filename=vc
    serial0: filename=null
    compat_monitor0: filename=stdio
    serial1: filename=disconnected:unix:./serial-socket,server
    (qemu) info qtree
    [...]
      dev: pl011, id ""
        gpio-out "sysbus-irq" 1
        chardev = "serial1"
        mmio ffffffffffffffff/0000000000001000
      dev: pl011, id ""
        gpio-out "sysbus-irq" 1
        chardev = "serial0"
        mmio ffffffffffffffff/0000000000001000
    [...]

Nothing stops boards from rejecting extra -serial instead.  Perhaps
there's even one that does.

For -drive, we have generic code to detect and reject drives ignored by
the board.

Again, -serial none is not to be confused with -serial null:

    $ qemu-system-aarch64 -S -display none -monitor stdio -machine virt,secure -serial none -serial unix:./serial-socket,server,wait=off
    QEMU 3.1.50 monitor - type 'help' for more information
    (qemu) info chardev
    parallel0: filename=vc
    serial0: filename=disconnected:unix:./serial-socket,server
    compat_monitor0: filename=stdio
    (qemu) info qtree
    [...]
      dev: pl011, id ""
        gpio-out "sysbus-irq" 1
        chardev = ""
        mmio ffffffffffffffff/0000000000001000
      dev: pl011, id ""
        gpio-out "sysbus-irq" 1
        chardev = "serial0"
        mmio ffffffffffffffff/0000000000001000
    [...]

Unlike -serial null, -serial none has no effect here.  Lovely, isn't it?

Note that -serial lets you control just *one* property of the pl011
device model: its backend.  I figure that's quite okay for pl011, but it
can be a problem in other cases.  One such case we saw in the memo that
started this thread: we need to control cfi.pflash01 property "secure",
but -drive if=pflash doesn't let us.  We work around that embarrassment
with

    -global driver=cfi.pflash01,property=secure,value=on

Affects *all* such devices, but fortunately we have at most two, and the
one we don't want to affect happens to ignore the property value.

For block devices, there's an additional issue: our ad hoc board
configuration mechanism (-drive) ties us to the legacy block backend
configuration mechanism (also -drive), with all its legacy baggage.
There is no way to use the modern mechanism (-blockdev) with onboard
devices.

Observe:

* Board configuration is totally different from our common device
  configuration mechanism.

* Each class of onboard device has its own ad hoc board configuration
  option (plus sugared forms in some cases).

* Each ad hoc board configuration option provides only selected
  properties common to all devices of this class.  Many device
  properties remain inaccessible.  Sometimes, we can use -global to work
  around, but not in the general case.

* We can't use -blockdev with onboard block devices, because the ad hoc
  board configuration option for block devices is -drive, with all its
  legacy baggage.  This is what made me write the memo that started this
  thread.

* Ad hoc configuration consistency checking is somewhat haphazard.
  Plenty of nonsense gets silently ignored.

>                                          And then the board
> code needs to do something for "if the user didn't create this
> then we need to do it", because the flash device should exist
> in the model whether the user cared about its contents or not.

True for mandatory onboard devices.  However, this flash device is *not*
mandatory.

If you ask for one flash device, you get OVMF firmware, and the image
you provide must be a unified build.

If you ask for two, you also get OVMF firmware, and the images you
provide must be a split build, unit 0 code, unit 1 data.

If you ask for more than two, you get an error.

If you ask for none, you get a traditional BIOS.

If you ask for both OVMF (with -drive if=pflash) and BIOS (with -bios),
you silently get OVMF.

If that makes you go "boy, this sucks", then I go "right you are."

Back to the issue that made me go down this rabbit hole: libvirt wants
to use -blockdev rather than -drive to configure their backends.

I tried to find a solution that doesn't involve redoing board
configuration from the ground up.  For better or worse, these devices
are optional.  So plug them with -device, and see what explodes.  Turns
out nor all that much, but still enough to make me post a lengthy memo.

You pointed out that using -device is "a mess because then the user has
to know the right base address".  You have a point.  But being stuck
with legacy -drive is also a mess.

For libvirt, plumbing the base address from the firmware's descriptor to
QEMU would be the lesser mess (for the firmware, providing the base
address there would be no mess at all).

For human users, it's perhaps the greater mess.  They can continue to
use -drive if=pflash.

Perhaps we *should* redo board configuration from the ground up.
Perhaps a board should be a composite object that exposes properties of
its own and its part, just like other composite devices, and so that
"create, set properties, realize" works.  That would extend our common
device configuration mechanism naturally to onboard devices.

Of course, "we should" doesn't imply "I could".

> Dynamic sysbus is something I'd rather we did less of, not more
> of. It's there because it solves an annoying problem (people
> want to do device passthrough of hardware that's not on a nice
> pluggable probeable bus), but it's really awkward.

>From 10,000ft, dynamic sysbus is no more awkward than other pluggable
devices: you (cold-)plug a device model by specifying a bunch of
properties.

[...]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 14:13   ` Markus Armbruster
@ 2019-01-30 14:33     ` Paolo Bonzini
  2019-01-30 16:38       ` Laszlo Ersek
                         ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Paolo Bonzini @ 2019-01-30 14:33 UTC (permalink / raw)
  To: Markus Armbruster, Peter Maydell
  Cc: Libvirt, Peter Krempa, László Érsek,
	QEMU Developers, Qemu-block

On 30/01/19 15:13, Markus Armbruster wrote:
>     -global driver=cfi.pflash01,property=secure,value=on
> 
> Affects *all* such devices, but fortunately we have at most two, and the
> one we don't want to affect happens to ignore the property value.

Is this true?  I think both need secure=on, at least on x86.

> For libvirt, plumbing the base address from the firmware's descriptor to
> QEMU would be the lesser mess (for the firmware, providing the base
> address there would be no mess at all).
> 
> For human users, it's perhaps the greater mess.  They can continue to
> use -drive if=pflash.
> 
> Perhaps we *should* redo board configuration from the ground up.
> Perhaps a board should be a composite object that exposes properties of
> its own and its part, just like other composite devices, and so that
> "create, set properties, realize" works.  That would extend our common
> device configuration mechanism naturally to onboard devices.
> 
> Of course, "we should" doesn't imply "I could".

Maybe we should just add pflash block properties to the machine?  And
then it can create the devices if the properties are set to a non-empty
value.

This doesn't remove the need to use -global to configure the "secure"
property, but it's not particularly an issue.

Paolo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30  7:24   ` [Qemu-devel] " Markus Armbruster
@ 2019-01-30 15:24     ` Peter Maydell
  2019-01-30 16:44       ` Laszlo Ersek
  0 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2019-01-30 15:24 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Libvirt, Peter Krempa, László Érsek,
	QEMU Developers, Qemu-block

On Wed, 30 Jan 2019 at 07:24, Markus Armbruster <armbru@redhat.com> wrote:
>
> Let me reply to the "why is the cfi.pflash01 device so weird" part
> first, because that's relatively quick, and because it could easily
> distract us from the more important "how do we want to configure OVMF"
> part.  I'll reply to that part later.
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > On Fri, 25 Jan 2019 at 15:11, Markus Armbruster <armbru@redhat.com> wrote:
> [...]
> >> To solve (2), we first have to understand the magic.  Device
> >> cfi.pflash01 has the following properties:
> >>
> >>     num-blocks                  Size of the device in blocks
> >>     sector-length               Size of a block
> >>                                 (admire the choice of names)
> >>     width                       Bank width
> >>     big-endian                  Endianess (d'oh)
> >>     id0, id1, id2, id3          Some kind of device ID, guest-visible,
> >>                                 default to zero, few boards change it
> >
> > Note that most of this is stuff that the hardware has.
> > A lot of boards set these to garbage values which happened
> > to be what the very old implementation of pflash hardcoded,
> > because most guests don't care. This is strictly speaking
> > wrong and they should use whatever the hardware really has,
> > but most of these cases are for old not-very-maintained dev
> > boards where probably nobody even has the relevant hardware
> > even if they cared enough to find out what its ID values are.
>
> Why are we emulating (badly) stuff nobody cares about enough to find out
> what exactly we should be emulating, the world wonders...

Usual reason -- we don't change stuff unless we're reasonably
sure it doesn't actually break guests that used to work.

> >>     name                        Memory region name
> >>                                 (why is this even configurable?)
> >
> > (a) for debug purposes, so a machine can create two flash
> > devices and give them names which make them easier to tell
> > apart in monitor "info mem" and similar command output
>
> That's what we have qdev IDs for.

The property setup here is basically maintaining compat
with the way the old pre-qdev implementation worked.
If there's a nicer way to do this that doesn't change
the memory region name and break migration compat, that's
great.

> > See above about "old not-very-maintained dev boards". A
> > board which does use this is one that's doing it for
> > back-compat because nobody's cared to fix and test.
>
> Boards that seem to care:
>
>     hw/arm/vexpress.c:    qdev_prop_set_uint8(dev, "device-width", 2);
>     hw/arm/virt.c:    qdev_prop_set_uint8(dev, "device-width", 2);

Yes. vexpress was the board where the broken pflash implementation
was first reported, so we fixed the pflash device and made
vexpress set things correctly. virt also gets a fixed device
because it postdates things being fixed.

> Boards that seem not to care:
>
>     hw/arm/collie.c:    pflash_cfi01_register(SA_CS0, NULL, "collie.fl1", 0x02000000,
>     hw/arm/collie.c:    pflash_cfi01_register(SA_CS1, NULL, "collie.fl2", 0x02000000,
>     hw/arm/gumstix.c:    if (!pflash_cfi01_register(0x00000000, NULL, "connext.rom", connex_rom,
>     hw/arm/gumstix.c:    if (!pflash_cfi01_register(0x00000000, NULL, "verdex.rom", verdex_rom,
>     hw/arm/mainstone.c:        if (!pflash_cfi01_register(mainstone_flash_base[i], NULL,
>     hw/arm/omap_sx1.c:        if (!pflash_cfi01_register(OMAP_CS0_BASE, NULL,
>     hw/arm/omap_sx1.c:        if (!pflash_cfi01_register(OMAP_CS1_BASE, NULL,
>     hw/arm/versatilepb.c:    if (!pflash_cfi01_register(VERSATILE_FLASH_ADDR, NULL, "versatile.flash",
>     hw/arm/vexpress.c:static pflash_t *ve_pflash_cfi01_register(hwaddr base, const char *name,
>     hw/arm/vexpress.c:    pflash0 = ve_pflash_cfi01_register(map[VE_NORFLASH0], "vexpress.flash0",
>     hw/arm/vexpress.c:    if (!ve_pflash_cfi01_register(map[VE_NORFLASH1], "vexpress.flash1",

...vexpress can't be both fixed and not...

>     hw/arm/z2.c:    if (!pflash_cfi01_register(Z2_FLASH_BASE,
>     hw/i386/pc_sysfw.c:        /* pflash_cfi01_register() creates a deep copy of the name */
>     hw/i386/pc_sysfw.c:        system_flash = pflash_cfi01_register(phys_addr, NULL /* qdev */, name,
>     hw/lm32/milkymist.c:    pflash_cfi01_register(flash_base, NULL, "milkymist.flash", flash_size,
>     hw/microblaze/petalogix_ml605_mmu.c:    pflash_cfi01_register(FLASH_BASEADDR,
>     hw/microblaze/petalogix_s3adsp1800_mmu.c:    pflash_cfi01_register(FLASH_BASEADDR,
>     hw/mips/mips_malta.c:    fl = pflash_cfi01_register(FLASH_ADDRESS, NULL, "mips_malta.bios",
>     hw/mips/mips_r4k.c:        if (!pflash_cfi01_register(0x1fc00000, NULL, "mips_r4k.bios", mips_rom,
>     hw/ppc/sam460ex.c:    if (!pflash_cfi01_register(base, NULL, "sam460ex.flash", bios_size,
>     hw/ppc/virtex_ml507.c:    pflash_cfi01_register(PFLASH_BASEADDR, NULL, "virtex.flash", FLASH_SIZE,
>
> At least PC can't be characterized as "not-very-maintaned dev board".

Well, nobody who does anything with x86 has cared enough to
make the pflash implementation actually correct. The
weird "behave like a buggy implementation" default is there
pretty much exactly because x86 uses it and I didn't
want to change the x86 behaviour.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 14:33     ` Paolo Bonzini
@ 2019-01-30 16:38       ` Laszlo Ersek
  2019-01-31  8:33         ` Markus Armbruster
  2019-01-31  8:40       ` Markus Armbruster
  2019-02-19  7:19       ` Markus Armbruster
  2 siblings, 1 reply; 45+ messages in thread
From: Laszlo Ersek @ 2019-01-30 16:38 UTC (permalink / raw)
  To: Paolo Bonzini, Markus Armbruster, Peter Maydell
  Cc: Libvirt, Peter Krempa, QEMU Developers, Qemu-block

On 01/30/19 15:33, Paolo Bonzini wrote:
> On 30/01/19 15:13, Markus Armbruster wrote:
>>     -global driver=cfi.pflash01,property=secure,value=on
>>
>> Affects *all* such devices, but fortunately we have at most two, and
>> the one we don't want to affect happens to ignore the property value.
>
> Is this true?  I think both need secure=on, at least on x86.

commit f71e42a5c98722d6faa5be84a34fbad90d27dc04
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Wed Apr 8 14:09:43 2015 +0200

    pflash_cfi01: add secure property

    When this property is set, MMIO accesses are only allowed with the
    MEMTXATTRS_SECURE attribute.  This is used for secure access to UEFI
    variables stored in flash.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

If you don't add "secure=on" to unit#0, then MMIO accesses will be
possible outside of SMM. From those, I'd hazard "MMIO reads" are
generally irrelevant. "MMIO writes" could succeed to the RAM image, but:

- they are never written back to the disk (due to readonly=on),

- the actual contents of unit#0 stops mattering as soon as the SEC phase
  decompresses the PEIFV and DXEFV firmware volumes from it, to DRAM.
  The SMM infrastructure is then constructed in SMRAM from DRAM. By the
  time a 3rd party UEFI application or driver, or an OS, is reached, the
  sensitive bits are all locked in SMRAM.

... But, I wonder if S3 resume would be under threat in this case. In
that case, SEC runs again (from pflash), and it re-decompresses
PEIFV/DXEFV from pflash to DRAM. If the in-memory image of pflash
doesn't revert to the (pristine, due to readonly=on) disk image at
platform reset, then I reckon there could be a problem; the SEC code and
the compressed FVs could have been tampered with in memory.

I guess it's best to apply secure=on to unit#0 as well, after all :)

Thanks!
Laszlo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 15:24     ` Peter Maydell
@ 2019-01-30 16:44       ` Laszlo Ersek
  2019-01-30 17:24         ` Peter Maydell
  0 siblings, 1 reply; 45+ messages in thread
From: Laszlo Ersek @ 2019-01-30 16:44 UTC (permalink / raw)
  To: Peter Maydell, Markus Armbruster
  Cc: Libvirt, Peter Krempa, QEMU Developers, Qemu-block

On 01/30/19 16:24, Peter Maydell wrote:

> Well, nobody who does anything with x86 has cared enough to
> make the pflash implementation actually correct.

I feel sort of included under this umbrella, so:

I haven't been aware of any particular pflash implementation errors. I
"didn't care" because it "worked fine" as much as I could tell.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 16:44       ` Laszlo Ersek
@ 2019-01-30 17:24         ` Peter Maydell
  2019-01-31  8:52           ` Markus Armbruster
  0 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2019-01-30 17:24 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Markus Armbruster, Libvirt, Peter Krempa, QEMU Developers, Qemu-block

On Wed, 30 Jan 2019 at 16:44, Laszlo Ersek <lersek@redhat.com> wrote:
>
> On 01/30/19 16:24, Peter Maydell wrote:
>
> > Well, nobody who does anything with x86 has cared enough to
> > make the pflash implementation actually correct.
>
> I feel sort of included under this umbrella, so:
>
> I haven't been aware of any particular pflash implementation errors. I
> "didn't care" because it "worked fine" as much as I could tell.

It depends entirely on what the guest code that's accessing
the flash devices does. If you stick to what Linux and
presumably UEFI have always done then you don't notice
anything wrong. Some things that are in spec for hardware
don't work right, though.

Commits 4b6fedcac0f51157e through a0289b8af3b05fe4 are
where we fixed these bugs.

Updating other platforms should mostly be a matter of
(a) figuring out what the actual hardware config is
and setting the device properties accordingly
(b) testing that guest software still works
(c) checking that migration compat remains working

thanks
-- PMM

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 16:38       ` Laszlo Ersek
@ 2019-01-31  8:33         ` Markus Armbruster
  2019-01-31  9:19           ` Paolo Bonzini
  0 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31  8:33 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Paolo Bonzini, Peter Maydell, Libvirt, Peter Krempa,
	QEMU Developers, Qemu-block

Laszlo Ersek <lersek@redhat.com> writes:

> On 01/30/19 15:33, Paolo Bonzini wrote:
>> On 30/01/19 15:13, Markus Armbruster wrote:
>>>     -global driver=cfi.pflash01,property=secure,value=on
>>>
>>> Affects *all* such devices, but fortunately we have at most two, and
>>> the one we don't want to affect happens to ignore the property value.
>>
>> Is this true?  I think both need secure=on, at least on x86.
>
> commit f71e42a5c98722d6faa5be84a34fbad90d27dc04
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Wed Apr 8 14:09:43 2015 +0200
>
>     pflash_cfi01: add secure property
>
>     When this property is set, MMIO accesses are only allowed with the
>     MEMTXATTRS_SECURE attribute.  This is used for secure access to UEFI
>     variables stored in flash.
>
>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
> If you don't add "secure=on" to unit#0, then MMIO accesses will be
> possible outside of SMM. From those, I'd hazard "MMIO reads" are
> generally irrelevant. "MMIO writes" could succeed to the RAM image, but:
>
> - they are never written back to the disk (due to readonly=on),
>
> - the actual contents of unit#0 stops mattering as soon as the SEC phase
>   decompresses the PEIFV and DXEFV firmware volumes from it, to DRAM.
>   The SMM infrastructure is then constructed in SMRAM from DRAM. By the
>   time a 3rd party UEFI application or driver, or an OS, is reached, the
>   sensitive bits are all locked in SMRAM.
>
> ... But, I wonder if S3 resume would be under threat in this case. In
> that case, SEC runs again (from pflash), and it re-decompresses
> PEIFV/DXEFV from pflash to DRAM. If the in-memory image of pflash
> doesn't revert to the (pristine, due to readonly=on) disk image at
> platform reset, then I reckon there could be a problem; the SEC code and
> the compressed FVs could have been tampered with in memory.
>
> I guess it's best to apply secure=on to unit#0 as well, after all :)

I thought secure=on affected only writes (and so wouldn't matter with
readonly=on), but I was wrong:

    static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value,
                                                  unsigned len, MemTxAttrs attrs)
    {
        pflash_t *pfl = opaque;
        bool be = !!(pfl->features & (1 << PFLASH_BE));

        if ((pfl->features & (1 << PFLASH_SECURE)) && !attrs.secure) {
            *value = pflash_data_read(opaque, addr, len, be);
        } else {
            *value = pflash_read(opaque, addr, len, be);
        }
        return MEMTX_OK;
    }

pflash_data_read() is what pflash_read() does when pfl->cmd is 0.

Hmm, why is it okay to treat all pfl->cmd values the same when
secure=on?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 14:33     ` Paolo Bonzini
  2019-01-30 16:38       ` Laszlo Ersek
@ 2019-01-31  8:40       ` Markus Armbruster
  2019-01-31  9:19         ` Paolo Bonzini
  2019-02-19  7:19       ` Markus Armbruster
  2 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31  8:40 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Libvirt, Peter Krempa,
	László Érsek, QEMU Developers, Qemu-block

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 30/01/19 15:13, Markus Armbruster wrote:
>>     -global driver=cfi.pflash01,property=secure,value=on
>> 
>> Affects *all* such devices, but fortunately we have at most two, and the
>> one we don't want to affect happens to ignore the property value.
>
> Is this true?  I think both need secure=on, at least on x86.
>
>> For libvirt, plumbing the base address from the firmware's descriptor to
>> QEMU would be the lesser mess (for the firmware, providing the base
>> address there would be no mess at all).
>> 
>> For human users, it's perhaps the greater mess.  They can continue to
>> use -drive if=pflash.
>> 
>> Perhaps we *should* redo board configuration from the ground up.
>> Perhaps a board should be a composite object that exposes properties of
>> its own and its part, just like other composite devices, and so that
>> "create, set properties, realize" works.  That would extend our common
>> device configuration mechanism naturally to onboard devices.
>> 
>> Of course, "we should" doesn't imply "I could".
>
> Maybe we should just add pflash block properties to the machine?  And
> then it can create the devices if the properties are set to a non-empty
> value.

What exactly do you have in mind?  Something like

    -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE

where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
i.e.

    -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
    -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...

perhaps?

[...]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 17:24         ` Peter Maydell
@ 2019-01-31  8:52           ` Markus Armbruster
  2019-01-31 10:01             ` Peter Maydell
  0 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31  8:52 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Laszlo Ersek, Libvirt, Peter Krempa, Qemu-block, QEMU Developers

Peter Maydell <peter.maydell@linaro.org> writes:

> On Wed, 30 Jan 2019 at 16:44, Laszlo Ersek <lersek@redhat.com> wrote:
>>
>> On 01/30/19 16:24, Peter Maydell wrote:
>>
>> > Well, nobody who does anything with x86 has cared enough to
>> > make the pflash implementation actually correct.
>>
>> I feel sort of included under this umbrella, so:
>>
>> I haven't been aware of any particular pflash implementation errors. I
>> "didn't care" because it "worked fine" as much as I could tell.

Lesson for the future: when we correct something, but don't dare to
touch (some) existing uses (being not "reasonably sure it doesn't
actually break guests that used to work"), we should at least have
enough sense to make "incorrect" opt-in rather than opt-out!  People
adding new uses will be blissfully unaware of the need to opt-out, and
the problem will multiply, just like it did here.

> It depends entirely on what the guest code that's accessing
> the flash devices does. If you stick to what Linux and
> presumably UEFI have always done then you don't notice
> anything wrong. Some things that are in spec for hardware
> don't work right, though.
>
> Commits 4b6fedcac0f51157e through a0289b8af3b05fe4 are
> where we fixed these bugs.
>
> Updating other platforms should mostly be a matter of
> (a) figuring out what the actual hardware config is
> and setting the device properties accordingly
> (b) testing that guest software still works
> (c) checking that migration compat remains working

Unavoidable pain for uses that predate the bug fixes.  Self-inflicted
pain for all later uses.

Can you fix the code so incorrect behavior becomes opt-in rather than
opt-out?  Pretty-please?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  8:33         ` Markus Armbruster
@ 2019-01-31  9:19           ` Paolo Bonzini
  2019-01-31  9:37             ` Markus Armbruster
  0 siblings, 1 reply; 45+ messages in thread
From: Paolo Bonzini @ 2019-01-31  9:19 UTC (permalink / raw)
  To: Markus Armbruster, Laszlo Ersek
  Cc: Peter Maydell, Libvirt, Peter Krempa, QEMU Developers, Qemu-block

On 31/01/19 09:33, Markus Armbruster wrote:
> I thought secure=on affected only writes (and so wouldn't matter with
> readonly=on), but I was wrong:
> 
>     static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value,
>                                                   unsigned len, MemTxAttrs attrs)
>     {
>         pflash_t *pfl = opaque;
>         bool be = !!(pfl->features & (1 << PFLASH_BE));
> 
>         if ((pfl->features & (1 << PFLASH_SECURE)) && !attrs.secure) {
>             *value = pflash_data_read(opaque, addr, len, be);
>         } else {
>             *value = pflash_read(opaque, addr, len, be);
>         }
>         return MEMTX_OK;
>     }
> 
> pflash_data_read() is what pflash_read() does when pfl->cmd is 0.

Reads from flash actually do not go through here; this function executes
if the flash chip is already in MMIO mode, which happens after you
*write* a command to the memory area.  With secure=on, you just cannot
do a command write unless you're in SMM, in other words the flash chip
can only ever go in MMIO mode if you're in SMM.

> Hmm, why is it okay to treat all pfl->cmd values the same when
> secure=on?

But doesn't matter.  You just don't want MMIO mode to be active outside
SMM: all that non-SMM code want to do with the flash is read and execute
it, as far as they're concerned it's just ROM and the command mode is
nonexistent.

Paolo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  8:40       ` Markus Armbruster
@ 2019-01-31  9:19         ` Paolo Bonzini
  2019-01-31  9:41           ` Markus Armbruster
  2019-01-31 11:57           ` Laszlo Ersek
  0 siblings, 2 replies; 45+ messages in thread
From: Paolo Bonzini @ 2019-01-31  9:19 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Libvirt, Peter Krempa,
	László Érsek, QEMU Developers, Qemu-block

On 31/01/19 09:40, Markus Armbruster wrote:
>> Maybe we should just add pflash block properties to the machine?  And
>> then it can create the devices if the properties are set to a non-empty
>> value.
> What exactly do you have in mind?  Something like
> 
>     -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE
> 
> where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
> i.e.
> 
>     -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
>     -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...

Yes, though I would call it pflash0 and pflash1.

Paolo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  9:19           ` Paolo Bonzini
@ 2019-01-31  9:37             ` Markus Armbruster
  2019-01-31 12:02               ` Laszlo Ersek
  2019-01-31 12:10               ` Paolo Bonzini
  0 siblings, 2 replies; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31  9:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Laszlo Ersek, Libvirt, Peter Maydell, Peter Krempa,
	QEMU Developers, Qemu-block

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 31/01/19 09:33, Markus Armbruster wrote:
>> I thought secure=on affected only writes (and so wouldn't matter with
>> readonly=on), but I was wrong:
>> 
>>     static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value,
>>                                                   unsigned len, MemTxAttrs attrs)
>>     {
>>         pflash_t *pfl = opaque;
>>         bool be = !!(pfl->features & (1 << PFLASH_BE));
>> 
>>         if ((pfl->features & (1 << PFLASH_SECURE)) && !attrs.secure) {
>>             *value = pflash_data_read(opaque, addr, len, be);
>>         } else {
>>             *value = pflash_read(opaque, addr, len, be);
>>         }
>>         return MEMTX_OK;
>>     }
>> 
>> pflash_data_read() is what pflash_read() does when pfl->cmd is 0.
>
> Reads from flash actually do not go through here; this function executes
> if the flash chip is already in MMIO mode, which happens after you
> *write* a command to the memory area.  With secure=on, you just cannot
> do a command write unless you're in SMM, in other words the flash chip
> can only ever go in MMIO mode if you're in SMM.
>
>> Hmm, why is it okay to treat all pfl->cmd values the same when
>> secure=on?
>
> But doesn't matter.  You just don't want MMIO mode to be active outside
> SMM: all that non-SMM code want to do with the flash is read and execute
> it, as far as they're concerned it's just ROM and the command mode is
> nonexistent.

Out of curiosity: what effect does secure=on have when the device is
read-only (pflash_t member ro non-zero)?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  9:19         ` Paolo Bonzini
@ 2019-01-31  9:41           ` Markus Armbruster
  2019-01-31 10:12             ` Paolo Bonzini
  2019-01-31 11:57           ` Laszlo Ersek
  1 sibling, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31  9:41 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, László Érsek

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 31/01/19 09:40, Markus Armbruster wrote:
>>> Maybe we should just add pflash block properties to the machine?  And
>>> then it can create the devices if the properties are set to a non-empty
>>> value.
>> What exactly do you have in mind?  Something like
>> 
>>     -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE
>> 
>> where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
>> i.e.
>> 
>>     -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
>>     -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...
>
> Yes, though I would call it pflash0 and pflash1.

Digression... should we put traditional BIOS in flash as well?  Only for
new machine types, obviously.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  8:52           ` Markus Armbruster
@ 2019-01-31 10:01             ` Peter Maydell
  2019-01-31 10:24               ` Markus Armbruster
  0 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2019-01-31 10:01 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Laszlo Ersek, Libvirt, Peter Krempa, Qemu-block, QEMU Developers

On Thu, 31 Jan 2019 at 08:52, Markus Armbruster <armbru@redhat.com> wrote:
> Lesson for the future: when we correct something, but don't dare to
> touch (some) existing uses (being not "reasonably sure it doesn't
> actually break guests that used to work"), we should at least have
> enough sense to make "incorrect" opt-in rather than opt-out!  People
> adding new uses will be blissfully unaware of the need to opt-out, and
> the problem will multiply, just like it did here.

The theory was that pflash_cfi01_register() is the legacy
back-compat API, and new board models use new style
"create the device and set its properties longhand", I think.
(This is part of a wider issue that the preferred QOM APIs
tend to be more longwinded than the old obsolete helper
functions.) It's definitely unfortunate that we didn't
at least comment that that function shouldn't be used in
new boards.

The problem hasn't multiplied very much -- almost all the
users of pflash_cfi01_register() predate the change.

I'm not sure what an API that defaulted to 'correct' would
look like here, though.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  9:41           ` Markus Armbruster
@ 2019-01-31 10:12             ` Paolo Bonzini
  2019-01-31 12:12               ` Markus Armbruster
  0 siblings, 1 reply; 45+ messages in thread
From: Paolo Bonzini @ 2019-01-31 10:12 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, László Érsek

On 31/01/19 10:41, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 31/01/19 09:40, Markus Armbruster wrote:
>>>> Maybe we should just add pflash block properties to the machine?  And
>>>> then it can create the devices if the properties are set to a non-empty
>>>> value.
>>> What exactly do you have in mind?  Something like
>>>
>>>     -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE
>>>
>>> where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
>>> i.e.
>>>
>>>     -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
>>>     -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...
>>
>> Yes, though I would call it pflash0 and pflash1.
> 
> Digression... should we put traditional BIOS in flash as well?  Only for
> new machine types, obviously.

The blocker was that very old KVM didn't support ROMD memory regions.
Now on one hand we don't support those old kernel versions anymore; on
the other hand we have HAX and WHPX that do not support ROMD at all.

Paolo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 10:01             ` Peter Maydell
@ 2019-01-31 10:24               ` Markus Armbruster
  2019-01-31 10:34                 ` Peter Maydell
  0 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31 10:24 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Libvirt, Peter Krempa, Laszlo Ersek, QEMU Developers, Qemu-block

Peter Maydell <peter.maydell@linaro.org> writes:

> On Thu, 31 Jan 2019 at 08:52, Markus Armbruster <armbru@redhat.com> wrote:
>> Lesson for the future: when we correct something, but don't dare to
>> touch (some) existing uses (being not "reasonably sure it doesn't
>> actually break guests that used to work"), we should at least have
>> enough sense to make "incorrect" opt-in rather than opt-out!  People
>> adding new uses will be blissfully unaware of the need to opt-out, and
>> the problem will multiply, just like it did here.
>
> The theory was that pflash_cfi01_register() is the legacy
> back-compat API, and new board models use new style
> "create the device and set its properties longhand", I think.
> (This is part of a wider issue that the preferred QOM APIs
> tend to be more longwinded than the old obsolete helper
> functions.) It's definitely unfortunate that we didn't
> at least comment that that function shouldn't be used in
> new boards.
>
> The problem hasn't multiplied very much -- almost all the
> users of pflash_cfi01_register() predate the change.
>
> I'm not sure what an API that defaulted to 'correct' would
> look like here, though.

What about this:

1. Make the device model default to some "correct" configuration, even
if that setting is kind of arbitrary.  That way, any code using new
style gets an "incorrect" configuration only if it actively selects one.

2. Won't help with pflash_cfi01_register(), which actively selects
whatever configuration it gets passed.  Not a serious problem as long as
it doesn't get used by new code, so rename it to
pflash_cfi01_create_legacy() and give it a fat comment.

3. Convert one or more of its users to the new style, so we have good
examples, not just bad ones.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 10:24               ` Markus Armbruster
@ 2019-01-31 10:34                 ` Peter Maydell
  2019-01-31 12:05                   ` Markus Armbruster
  0 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2019-01-31 10:34 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Libvirt, Peter Krempa, Laszlo Ersek, QEMU Developers, Qemu-block

On Thu, 31 Jan 2019 at 10:24, Markus Armbruster <armbru@redhat.com> wrote:
> 1. Make the device model default to some "correct" configuration, even
> if that setting is kind of arbitrary.  That way, any code using new
> style gets an "incorrect" configuration only if it actively selects one.

I don't think that's much better, because it will still
probably be wrong for the board. People writing board models
should define what their pflash device is, not plug in something
that's not what the hardware is.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  9:19         ` Paolo Bonzini
  2019-01-31  9:41           ` Markus Armbruster
@ 2019-01-31 11:57           ` Laszlo Ersek
  1 sibling, 0 replies; 45+ messages in thread
From: Laszlo Ersek @ 2019-01-31 11:57 UTC (permalink / raw)
  To: Paolo Bonzini, Markus Armbruster
  Cc: Peter Maydell, Libvirt, Peter Krempa, QEMU Developers, Qemu-block

On 01/31/19 10:19, Paolo Bonzini wrote:
> On 31/01/19 09:40, Markus Armbruster wrote:
>>> Maybe we should just add pflash block properties to the machine?  And
>>> then it can create the devices if the properties are set to a non-empty
>>> value.
>> What exactly do you have in mind?  Something like
>>
>>     -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE
>>
>> where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
>> i.e.
>>
>>     -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
>>     -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...
> 
> Yes, though I would call it pflash0 and pflash1.

(side-comment: read-only isn't correct for the varstore pflash)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  9:37             ` Markus Armbruster
@ 2019-01-31 12:02               ` Laszlo Ersek
  2019-01-31 12:10               ` Paolo Bonzini
  1 sibling, 0 replies; 45+ messages in thread
From: Laszlo Ersek @ 2019-01-31 12:02 UTC (permalink / raw)
  To: Markus Armbruster, Paolo Bonzini
  Cc: Libvirt, Peter Maydell, Peter Krempa, QEMU Developers, Qemu-block

On 01/31/19 10:37, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 31/01/19 09:33, Markus Armbruster wrote:
>>> I thought secure=on affected only writes (and so wouldn't matter with
>>> readonly=on), but I was wrong:
>>>
>>>     static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value,
>>>                                                   unsigned len, MemTxAttrs attrs)
>>>     {
>>>         pflash_t *pfl = opaque;
>>>         bool be = !!(pfl->features & (1 << PFLASH_BE));
>>>
>>>         if ((pfl->features & (1 << PFLASH_SECURE)) && !attrs.secure) {
>>>             *value = pflash_data_read(opaque, addr, len, be);
>>>         } else {
>>>             *value = pflash_read(opaque, addr, len, be);
>>>         }
>>>         return MEMTX_OK;
>>>     }
>>>
>>> pflash_data_read() is what pflash_read() does when pfl->cmd is 0.
>>
>> Reads from flash actually do not go through here; this function executes
>> if the flash chip is already in MMIO mode, which happens after you
>> *write* a command to the memory area.  With secure=on, you just cannot
>> do a command write unless you're in SMM, in other words the flash chip
>> can only ever go in MMIO mode if you're in SMM.
>>
>>> Hmm, why is it okay to treat all pfl->cmd values the same when
>>> secure=on?
>>
>> But doesn't matter.  You just don't want MMIO mode to be active outside
>> SMM: all that non-SMM code want to do with the flash is read and execute
>> it, as far as they're concerned it's just ROM and the command mode is
>> nonexistent.
> 
> Out of curiosity: what effect does secure=on have when the device is
> read-only (pflash_t member ro non-zero)?
> 

It's hard to theorize about this comprehensively. What action and which
pflash unit do you have in mind?

(Interpreting your question as "what does the firmware see if...".
Another interpretation would be possible too, "what does QEMU do if...".)

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 10:34                 ` Peter Maydell
@ 2019-01-31 12:05                   ` Markus Armbruster
  0 siblings, 0 replies; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31 12:05 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Libvirt, Peter Krempa, Laszlo Ersek, QEMU Developers, Qemu-block

Peter Maydell <peter.maydell@linaro.org> writes:

> On Thu, 31 Jan 2019 at 10:24, Markus Armbruster <armbru@redhat.com> wrote:
>> 1. Make the device model default to some "correct" configuration, even
>> if that setting is kind of arbitrary.  That way, any code using new
>> style gets an "incorrect" configuration only if it actively selects one.
>
> I don't think that's much better, because it will still
> probably be wrong for the board. People writing board models
> should define what their pflash device is, not plug in something
> that's not what the hardware is.

If you want to force people to select a sane configuration, set the
default to some suitable nonsense the realize() method rejects.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31  9:37             ` Markus Armbruster
  2019-01-31 12:02               ` Laszlo Ersek
@ 2019-01-31 12:10               ` Paolo Bonzini
  2019-01-31 12:51                 ` Markus Armbruster
  1 sibling, 1 reply; 45+ messages in thread
From: Paolo Bonzini @ 2019-01-31 12:10 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Laszlo Ersek, Libvirt, Peter Maydell, Peter Krempa,
	QEMU Developers, Qemu-block

On 31/01/19 10:37, Markus Armbruster wrote:
>>
>>> Hmm, why is it okay to treat all pfl->cmd values the same when
>>> secure=on?
>> But doesn't matter.  You just don't want MMIO mode to be active outside
>> SMM: all that non-SMM code want to do with the flash is read and execute
>> it, as far as they're concerned it's just ROM and the command mode is
>> nonexistent.
> Out of curiosity: what effect does secure=on have when the device is
> read-only (pflash_t member ro non-zero)?

Non-SMM code cannot execute commands.  This means two things:

First, in addition to writes, there are nondestructive commands such as
read device id.  Those are also inaccessible to non-SMM if secure=on.
Again, for non-SMM code it looks like your old ROM.  This is not
important but...

... CFI commands, even commands that are nondestructive or writes that
fail because of readonly-ness, consist of multiple writes to the flash
device.  If non-SMM code could issue a partial command, the SMM flash
driver would likely end up confused.  Therefore it's probably a good
idea to make all parallel flash devices have secure=on even if the
content of the flash cannot be damaged, and that's why I never
considered anything but -global to configure the property.

Paolo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 10:12             ` Paolo Bonzini
@ 2019-01-31 12:12               ` Markus Armbruster
  2019-01-31 22:57                 ` Paolo Bonzini
  0 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31 12:12 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, László Érsek

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 31/01/19 10:41, Markus Armbruster wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>>> On 31/01/19 09:40, Markus Armbruster wrote:
>>>>> Maybe we should just add pflash block properties to the machine?  And
>>>>> then it can create the devices if the properties are set to a non-empty
>>>>> value.
>>>> What exactly do you have in mind?  Something like
>>>>
>>>>     -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE
>>>>
>>>> where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
>>>> i.e.
>>>>
>>>>     -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
>>>>     -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...
>>>
>>> Yes, though I would call it pflash0 and pflash1.
>> 
>> Digression... should we put traditional BIOS in flash as well?  Only for
>> new machine types, obviously.
>
> The blocker was that very old KVM didn't support ROMD memory regions.
> Now on one hand we don't support those old kernel versions anymore; on
> the other hand we have HAX and WHPX that do not support ROMD at all.

This is all greek to me.  I take it there's something wrong with these
accelerators that makes (read-only?) flash memory not work, even though
the read-only mapping we now create for traditional BIOS works.  Weird,
but I'm of course willing to take your word for it.

I guess it's fixable, but nobody has stepped up to fix it.

Aside: accepting incomplete accelerators, then letting their
incompleteness hold back things doesn't strike me as sound policy.

Do we reject these accelerators when the user asks for firmware in
flash?  Or do we let the guest run into some more or less obscure
failure?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 12:10               ` Paolo Bonzini
@ 2019-01-31 12:51                 ` Markus Armbruster
  0 siblings, 0 replies; 45+ messages in thread
From: Markus Armbruster @ 2019-01-31 12:51 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, Laszlo Ersek

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 31/01/19 10:37, Markus Armbruster wrote:
>>>
>>>> Hmm, why is it okay to treat all pfl->cmd values the same when
>>>> secure=on?
>>> But doesn't matter.  You just don't want MMIO mode to be active outside
>>> SMM: all that non-SMM code want to do with the flash is read and execute
>>> it, as far as they're concerned it's just ROM and the command mode is
>>> nonexistent.
>> Out of curiosity: what effect does secure=on have when the device is
>> read-only (pflash_t member ro non-zero)?
>
> Non-SMM code cannot execute commands.  This means two things:
>
> First, in addition to writes, there are nondestructive commands such as
> read device id.  Those are also inaccessible to non-SMM if secure=on.
> Again, for non-SMM code it looks like your old ROM.  This is not
> important but...
>
> ... CFI commands, even commands that are nondestructive or writes that
> fail because of readonly-ness, consist of multiple writes to the flash
> device.  If non-SMM code could issue a partial command, the SMM flash
> driver would likely end up confused.  Therefore it's probably a good
> idea to make all parallel flash devices have secure=on even if the
> content of the flash cannot be damaged, and that's why I never
> considered anything but -global to configure the property.

Makes sense now, thanks!

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 12:12               ` Markus Armbruster
@ 2019-01-31 22:57                 ` Paolo Bonzini
  2019-01-31 23:28                   ` Alexandro Sanchez Bach
  2019-02-01  8:58                   ` Markus Armbruster
  0 siblings, 2 replies; 45+ messages in thread
From: Paolo Bonzini @ 2019-01-31 22:57 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, László Érsek,
	Alexandro Sanchez Bach, Justin Terry (VM)

On 31/01/19 13:12, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 31/01/19 10:41, Markus Armbruster wrote:
>>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>>
>>>> On 31/01/19 09:40, Markus Armbruster wrote:
>>>>>> Maybe we should just add pflash block properties to the machine?  And
>>>>>> then it can create the devices if the properties are set to a non-empty
>>>>>> value.
>>>>> What exactly do you have in mind?  Something like
>>>>>
>>>>>     -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE
>>>>>
>>>>> where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
>>>>> i.e.
>>>>>
>>>>>     -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
>>>>>     -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...
>>>>
>>>> Yes, though I would call it pflash0 and pflash1.
>>>
>>> Digression... should we put traditional BIOS in flash as well?  Only for
>>> new machine types, obviously.
>>
>> The blocker was that very old KVM didn't support ROMD memory regions.
>> Now on one hand we don't support those old kernel versions anymore; on
>> the other hand we have HAX and WHPX that do not support ROMD at all.
> 
> This is all greek to me.  I take it there's something wrong with these
> accelerators that makes (read-only?) flash memory not work, even though
> the read-only mapping we now create for traditional BIOS works.  Weird,
> but I'm of course willing to take your word for it.

Yes, as I wrote in the other message even read-only flash memory
supports commands, and these accelerators do not support direct reads +
MMIO writes on the same memory slot.

At least I checked HAX code and it doesn't; I don't know about WHPX.

> Aside: accepting incomplete accelerators, then letting their
> incompleteness hold back things doesn't strike me as sound policy.

Yes, there is a balance to be found between that and accepting features
from known out-of-tree forks, in order to help these out-of-tree forks
not remain forever on very old releases.

However, another important question is---if you changed the default from
-bios to -pflash, would you also flip secure from off to on?  Only TCG
and KVM supports SMM, and it's quite unlikely that the other
accelerators will support it (there are also "philosophical" debates
behind that choice...).

> Do we reject these accelerators when the user asks for firmware in
> flash?  Or do we let the guest run into some more or less obscure
> failure?

For HAX it just fails to boot I think.

Paolo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 22:57                 ` Paolo Bonzini
@ 2019-01-31 23:28                   ` Alexandro Sanchez Bach
  2019-01-31 23:54                     ` Paolo Bonzini
  2019-02-01  8:58                   ` Markus Armbruster
  1 sibling, 1 reply; 45+ messages in thread
From: Alexandro Sanchez Bach @ 2019-01-31 23:28 UTC (permalink / raw)
  To: 'Paolo Bonzini', 'Markus Armbruster'
  Cc: 'Peter Maydell', 'Peter Krempa',
	'Qemu-block', 'Libvirt',
	'QEMU Developers', 'László Érsek',
	'Justin Terry (VM)', 'Ning, Yu'

>> This is all greek to me.  I take it there's something wrong with these 
>> accelerators that makes (read-only?) flash memory not work, even 
>> though the read-only mapping we now create for traditional BIOS works.  
>> Weird, but I'm of course willing to take your word for it.

> Yes, as I wrote in the other message even read-only flash memory supports commands, and these accelerators do not support direct reads + MMIO writes on the same memory slot.
> At least I checked HAX code and it doesn't; I don't know about WHPX.

(CC'd Yu Ning @ Intel's HAXM team)

Not sure, if I'm understanding the issue correctly, but isn't `HAX_VM_IOCTL_SET_RAM2` with the `HAX_RAM_INFO_ROM` flag precisely what you are looking for?

More precisely, HAX_VM_IOCTL_SET_RAM2 maps an HVA range to a GPA range, the HAX_RAM_INFO_ROM flag should allow only guest memory reads to that range [1]. When the guest attempts to write, this should trigger a VM exit that will be handled by QEMU. Also, this seems to be handled here:
https://github.com/qemu/qemu/blob/15bede554162dda822cd762c689edb6fa32b6e3b/target/i386/hax-mem.c#L205-L207

Best,
Alexandro

[1] https://github.com/intel/haxm/blob/master/docs/api.md

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 23:28                   ` Alexandro Sanchez Bach
@ 2019-01-31 23:54                     ` Paolo Bonzini
  2019-02-01  2:49                       ` Ning, Yu
  0 siblings, 1 reply; 45+ messages in thread
From: Paolo Bonzini @ 2019-01-31 23:54 UTC (permalink / raw)
  To: Alexandro Sanchez Bach, 'Markus Armbruster'
  Cc: 'Peter Maydell', 'Peter Krempa',
	'Qemu-block', 'Libvirt',
	'QEMU Developers', 'László Érsek',
	'Justin Terry (VM)', 'Ning, Yu'

On 01/02/19 00:28, Alexandro Sanchez Bach wrote:
> (CC'd Yu Ning @ Intel's HAXM team)
> 
> Not sure, if I'm understanding the issue correctly, but isn't
> `HAX_VM_IOCTL_SET_RAM2` with the `HAX_RAM_INFO_ROM` flag precisely
> what you are looking for?
> 
> More precisely, HAX_VM_IOCTL_SET_RAM2 maps an HVA range to a GPA
> range, the HAX_RAM_INFO_ROM flag should allow only guest memory reads
> to that range [1]. When the guest attempts to write, this should
> trigger a VM exit that will be handled by QEMU.

The missing handling is in the hypervisor:

    if (ret == -EACCES) {
        /*
         * For some reason, during boot-up, Chrome OS guests make
hundreds of
         * attempts to write to GPAs close to 4GB, which are mapped into
BIOS
         * (read-only) and thus result in EPT violations.
         * TODO: Handle this case properly.
         */
        hax_warning("%s: Unexpected EPT violation cause. Skipping
instruction"
                    " (len=%u)\n", __func__, vcpu->vmx.exit_instr_length);
        advance_rip(vcpu);
        return HAX_EXIT;
}

> Also, this seems to be handled here:
> https://github.com/qemu/qemu/blob/15bede554162dda822cd762c689edb6fa32b6e3b/target/i386/hax-mem.c#L205-L207

Right, though to be precise it should be changed to

     if (memory_region_is_rom(section->mr) ||
	 memory_region_is_romd(section->mr)) {         flags |= HAX_RAM_INFO_ROM;
     }

for that to work.

Paolo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 23:54                     ` Paolo Bonzini
@ 2019-02-01  2:49                       ` Ning, Yu
  2019-02-04 10:00                         ` Paolo Bonzini
  0 siblings, 1 reply; 45+ messages in thread
From: Ning, Yu @ 2019-02-01  2:49 UTC (permalink / raw)
  To: Paolo Bonzini, Alexandro Sanchez Bach
  Cc: 'Peter Maydell', 'Peter Krempa',
	'Qemu-block', 'Libvirt',
	'QEMU Developers', 'László Érsek',
	'Justin Terry (VM)', 'Markus Armbruster'



> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> Sent: Friday, February 1, 2019 7:55
> To: Alexandro Sanchez Bach <alexandro@phi.nz>; 'Markus Armbruster'
> <armbru@redhat.com>
> Cc: 'Peter Maydell' <peter.maydell@linaro.org>; 'Peter Krempa'
> <pkrempa@redhat.com>; 'Qemu-block' <qemu-block@nongnu.org>; 'Libvirt'
> <libvir-list@redhat.com>; 'QEMU Developers' <qemu-devel@nongnu.org>;
> 'László Érsek' <lersek@redhat.com>; 'Justin Terry (VM)'
> <juterry@microsoft.com>; Ning, Yu <yu.ning@intel.com>
> Subject: Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
> 
> On 01/02/19 00:28, Alexandro Sanchez Bach wrote:
> > (CC'd Yu Ning @ Intel's HAXM team)
> >
> > Not sure, if I'm understanding the issue correctly, but isn't
> > `HAX_VM_IOCTL_SET_RAM2` with the `HAX_RAM_INFO_ROM` flag precisely
> > what you are looking for?
> >
> > More precisely, HAX_VM_IOCTL_SET_RAM2 maps an HVA range to a GPA
> > range, the HAX_RAM_INFO_ROM flag should allow only guest memory reads
> > to that range [1]. When the guest attempts to write, this should
> > trigger a VM exit that will be handled by QEMU.
> 
> The missing handling is in the hypervisor:
> 
>     if (ret == -EACCES) {
>         /*
>          * For some reason, during boot-up, Chrome OS guests make
> hundreds of
>          * attempts to write to GPAs close to 4GB, which are mapped into
> BIOS
>          * (read-only) and thus result in EPT violations.
>          * TODO: Handle this case properly.
>          */
>         hax_warning("%s: Unexpected EPT violation cause. Skipping
> instruction"
>                     " (len=%u)\n", __func__,
> vcpu->vmx.exit_instr_length);
>         advance_rip(vcpu);
>         return HAX_EXIT;
> }
> 
> > Also, this seems to be handled here:
> >
> https://github.com/qemu/qemu/blob/15bede554162dda822cd762c689edb6fa3
> 2b
> > 6e3b/target/i386/hax-mem.c#L205-L207
> 
> Right, though to be precise it should be changed to
> 
>      if (memory_region_is_rom(section->mr) ||
> 	 memory_region_is_romd(section->mr)) {         flags |=
> HAX_RAM_INFO_ROM;
>      }
> 
> for that to work.
> 

Thank you both for outlining the changes we have to make in order to support ROMD memory regions!  The only question is whether we should pass a new flag to HAX_VM_IOCTL_SET_RAM2 for ROMD, so the hypervisor could respond differently to writes to ROM and ROMD regions.  Would that be useful at all?  What would happen if HAXM asked QEMU to emulate a write to ROM?

HAXM didn't implement ROMD support at first, because the guests we tested could boot without it (including Chrome OS).  Now that this feature has become more popular (and we want to be able to boot OVMF), I think it's time to get it done.  I'd like to get to it after the Lunar New Year holidays, but if anyone can finish it sooner, I'll be happy to merge their patch into HAXM.

-Yu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-31 22:57                 ` Paolo Bonzini
  2019-01-31 23:28                   ` Alexandro Sanchez Bach
@ 2019-02-01  8:58                   ` Markus Armbruster
  1 sibling, 0 replies; 45+ messages in thread
From: Markus Armbruster @ 2019-02-01  8:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Peter Krempa, Qemu-block, Libvirt,
	QEMU Developers, Alexandro Sanchez Bach, Justin Terry (VM),
	László Érsek

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 31/01/19 13:12, Markus Armbruster wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>>> On 31/01/19 10:41, Markus Armbruster wrote:
>>>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>>>
>>>>> On 31/01/19 09:40, Markus Armbruster wrote:
>>>>>>> Maybe we should just add pflash block properties to the machine?  And
>>>>>>> then it can create the devices if the properties are set to a non-empty
>>>>>>> value.
>>>>>> What exactly do you have in mind?  Something like
>>>>>>
>>>>>>     -machine q35,ovmf-code=OVMF-CODE-NODE,ovmf-data=OVMF-DATA-NODE
>>>>>>
>>>>>> where OVMF-CODE-NODE and OVMF-DATA-NODE are block backend node names,
>>>>>> i.e.
>>>>>>
>>>>>>     -blockdev file,node-name=OVMF-CODE-NODE,read-only=on,filename=/usr/share/edk2/ovmf/OVMF_CODE.fd
>>>>>>     -blockdev file,node-name=OVMF-DATA-NODE,read-only=on,filename=...
>>>>>
>>>>> Yes, though I would call it pflash0 and pflash1.
>>>>
>>>> Digression... should we put traditional BIOS in flash as well?  Only for
>>>> new machine types, obviously.
>>>
>>> The blocker was that very old KVM didn't support ROMD memory regions.
>>> Now on one hand we don't support those old kernel versions anymore; on
>>> the other hand we have HAX and WHPX that do not support ROMD at all.
>> 
>> This is all greek to me.  I take it there's something wrong with these
>> accelerators that makes (read-only?) flash memory not work, even though
>> the read-only mapping we now create for traditional BIOS works.  Weird,
>> but I'm of course willing to take your word for it.
>
> Yes, as I wrote in the other message even read-only flash memory
> supports commands, and these accelerators do not support direct reads +
> MMIO writes on the same memory slot.

Got it, thanks.

> At least I checked HAX code and it doesn't; I don't know about WHPX.
>
>> Aside: accepting incomplete accelerators, then letting their
>> incompleteness hold back things doesn't strike me as sound policy.
>
> Yes, there is a balance to be found between that and accepting features
> from known out-of-tree forks, in order to help these out-of-tree forks
> not remain forever on very old releases.

I support inviting forks back into the fold, and I understand why
"feature parity or go away" would be impractical there.  But I'd like to
see at least *commitment* to reaching feature parity.

> However, another important question is---if you changed the default from
> -bios to -pflash, would you also flip secure from off to on?  Only TCG
> and KVM supports SMM, and it's quite unlikely that the other
> accelerators will support it (there are also "philosophical" debates
> behind that choice...).

I would not, simply because secure has always been off by default.

>> Do we reject these accelerators when the user asks for firmware in
>> flash?  Or do we let the guest run into some more or less obscure
>> failure?
>
> For HAX it just fails to boot I think.

I'd call that a user interface bug.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-02-01  2:49                       ` Ning, Yu
@ 2019-02-04 10:00                         ` Paolo Bonzini
  0 siblings, 0 replies; 45+ messages in thread
From: Paolo Bonzini @ 2019-02-04 10:00 UTC (permalink / raw)
  To: Ning, Yu, Alexandro Sanchez Bach
  Cc: 'Peter Maydell', 'Peter Krempa',
	'Qemu-block', 'Libvirt',
	'QEMU Developers', 'László Érsek',
	'Justin Terry (VM)', 'Markus Armbruster'

On 01/02/19 03:49, Ning, Yu wrote:
> Thank you both for outlining the changes we have to make in order to
> support ROMD memory regions!  The only question is whether we should
> pass a new flag to HAX_VM_IOCTL_SET_RAM2 for ROMD, so the hypervisor
> could respond differently to writes to ROM and ROMD regions.  Would
> that be useful at all?  What would happen if HAXM asked QEMU to
> emulate a write to ROM?

It's more about backwards compatibility.  As far as QEMU is concerned
it's okay if all ROM reads are handled as MMIO.  However, other
emulators may not expect that, so maybe you want to add another flag.

Paolo

> HAXM didn't implement ROMD support at first, because the guests we
> tested could boot without it (including Chrome OS).  Now that this
> feature has become more popular (and we want to be able to boot
> OVMF), I think it's time to get it done.  I'd like to get to it after
> the Lunar New Year holidays, but if anyone can finish it sooner, I'll
> be happy to merge their patch into HAXM.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-25 15:03 [Qemu-devel] Configuring pflash devices for OVMF firmware Markus Armbruster
  2019-01-28  7:58 ` Laszlo Ersek
  2019-01-28 10:39 ` Peter Maydell
@ 2019-02-07  9:30 ` Markus Armbruster
  2019-02-07 12:31   ` Laszlo Ersek
  2 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-02-07  9:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, libvir-list, László Érsek,
	Daniel P. Berrangé,
	Peter Krempa

The thread got long, let me try to summarize, and elaborate a few
points.

* The problem at hand is configuring firmware residing in flash memory
  (OVMF requires this) without legacy -drive.

* The wider problem is configuring onboard devices.  Our general device
  configuration interface doesn't cover them.  Instead, we have a zoo of
  ad hoc interfaces that are much more limited.  Some of them we'd
  rather deprecate (-drive, -net), but can't until we have a suitable
  replacements.

  I think a board should be a composite object that exposes properties
  of its own and its parts, just like other composite devices, so that
  "create, set properties, realize" just works.  That would extend our
  common device configuration mechanism naturally to onboard devices.

  A PC board's flash memory device would be just another part.  It could
  be something like /machine/q35/cfi.pflash01/ in the QOM tree.  To
  configure it, you'd set its properties, such as
  /machine/q35/cfi.pflash01/drive.

  Note that this requires a way to set an existing device's properties.
  Perhaps qom-set already works.

* While I do believe we should tackle the wider problem, I'd rather not
  sit on the narrow problem until we crack it.  So, what can we do about
  it?

  - Paolo proposed to add block backend properties to the PC machine,
    settable like -machine pflash0=BLOCK-BACKEND.

    Possible drawback: if we add /machine/q35/pflash0 to the QOM tree
    now, and later replace it by /machine/q35/cfi.pflash01/drive, we'll
    have to deal with yet another machine type variation.  We'll live.

  - I proposed to sidestep our onboard device configuration problem by
    adding the cfi.pflash01 devices with our existing general device
    configuration interface: -device.  Possible since the onboard
    cfi.pflash01 devices are optional.  Requires a small extension to
    the firmware descriptors, and a bit of extra work in libvirt to
    process that extension.  I think it's workable, but Paolo's idea is
    simpler.

  I can give Paolo's idea a try.  Objections?

* A flash device supporting multiple regions is desirable, because it's
  what physical hardware has.  We currently use multiple flash devices
  instead.  We'll be stuck with them for existing machine types due to
  guest ABI and migration compatibility.

* cfi.pflash01 currently requires users to opt out of "bad, do not use".
  It should require opt in, to guard against accidental new uses of
  "bad".


PS: Big thanks to László, whose patient guidance helped me map this part
of the jungle.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-02-07  9:30 ` Markus Armbruster
@ 2019-02-07 12:31   ` Laszlo Ersek
  2019-02-07 13:49     ` Markus Armbruster
  0 siblings, 1 reply; 45+ messages in thread
From: Laszlo Ersek @ 2019-02-07 12:31 UTC (permalink / raw)
  To: Markus Armbruster, qemu-devel
  Cc: qemu-block, libvir-list, Daniel P. Berrangé, Peter Krempa

Hi Markus,

On 02/07/19 10:30, Markus Armbruster wrote:
> The thread got long, let me try to summarize, and elaborate a few
> points.
> 
> * The problem at hand is configuring firmware residing in flash memory
>   (OVMF requires this) without legacy -drive.
> 
> * The wider problem is configuring onboard devices.  Our general device
>   configuration interface doesn't cover them.  Instead, we have a zoo of
>   ad hoc interfaces that are much more limited.  Some of them we'd
>   rather deprecate (-drive, -net), but can't until we have a suitable
>   replacements.
> 
>   I think a board should be a composite object that exposes properties
>   of its own and its parts, just like other composite devices, so that
>   "create, set properties, realize" just works.  That would extend our
>   common device configuration mechanism naturally to onboard devices.
> 
>   A PC board's flash memory device would be just another part.  It could
>   be something like /machine/q35/cfi.pflash01/ in the QOM tree.  To
>   configure it, you'd set its properties, such as
>   /machine/q35/cfi.pflash01/drive.
> 
>   Note that this requires a way to set an existing device's properties.
>   Perhaps qom-set already works.
> 
> * While I do believe we should tackle the wider problem, I'd rather not
>   sit on the narrow problem until we crack it.  So, what can we do about
>   it?
> 
>   - Paolo proposed to add block backend properties to the PC machine,
>     settable like -machine pflash0=BLOCK-BACKEND.
> 
>     Possible drawback: if we add /machine/q35/pflash0 to the QOM tree
>     now, and later replace it by /machine/q35/cfi.pflash01/drive, we'll
>     have to deal with yet another machine type variation.  We'll live.
> 
>   - I proposed to sidestep our onboard device configuration problem by
>     adding the cfi.pflash01 devices with our existing general device
>     configuration interface: -device.  Possible since the onboard
>     cfi.pflash01 devices are optional.  Requires a small extension to
>     the firmware descriptors, and a bit of extra work in libvirt to
>     process that extension.  I think it's workable, but Paolo's idea is
>     simpler.
> 
>   I can give Paolo's idea a try.  Objections?
> 
> * A flash device supporting multiple regions is desirable, because it's
>   what physical hardware has.  We currently use multiple flash devices
>   instead.  We'll be stuck with them for existing machine types due to
>   guest ABI and migration compatibility.
> 
> * cfi.pflash01 currently requires users to opt out of "bad, do not use".
>   It should require opt in, to guard against accidental new uses of
>   "bad".
> 
> 
> PS: Big thanks to László, whose patient guidance helped me map this part
> of the jungle.
> 

I've read the above carefully.

At the QEMU design level, I don't have any opinion or preference; there
I simply don't know enough -- and don't suffer from bad decisions enough
-- to make sensible comments.

Regarding the choice betwen "-machine pflash0=BLOCK-BACKEND" and
"-device pflash": I don't object to exploring the former first.

I'd just like to note that "-machine pflash0=BLOCK-BACKEND" will also
require changes to the firmware descriptor schema. Not to the types that
the schema defines -- and therefore concrete descriptor *documents* that
already conform to the schema wouldn't be affected --, but to the
documentation that the schema directs at management applications.

The schema is supposed to specify (in the documentation) QEMU command
line options for management applications. If we add "-machine
pflash0=BLOCK-BACKEND", then even if the types in the schema stay the
same, some mappings to the QEMU cmdline will have to be re-documented.

Of course, that's still easier / less intrusive than changing the types!
... Which does make me prefer "-machine pflash0=BLOCK-BACKEND", if I'm
being honest.

(I hope my followup isn't totally useless. I certainly didn't want to
ignore your summary.)

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-02-07 12:31   ` Laszlo Ersek
@ 2019-02-07 13:49     ` Markus Armbruster
  0 siblings, 0 replies; 45+ messages in thread
From: Markus Armbruster @ 2019-02-07 13:49 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: qemu-devel, libvir-list, Peter Krempa, qemu-block

Laszlo Ersek <lersek@redhat.com> writes:

> Hi Markus,
>
> On 02/07/19 10:30, Markus Armbruster wrote:
>> The thread got long, let me try to summarize, and elaborate a few
>> points.
>> 
>> * The problem at hand is configuring firmware residing in flash memory
>>   (OVMF requires this) without legacy -drive.
>> 
>> * The wider problem is configuring onboard devices.  Our general device
>>   configuration interface doesn't cover them.  Instead, we have a zoo of
>>   ad hoc interfaces that are much more limited.  Some of them we'd
>>   rather deprecate (-drive, -net), but can't until we have a suitable
>>   replacements.
>> 
>>   I think a board should be a composite object that exposes properties
>>   of its own and its parts, just like other composite devices, so that
>>   "create, set properties, realize" just works.  That would extend our
>>   common device configuration mechanism naturally to onboard devices.
>> 
>>   A PC board's flash memory device would be just another part.  It could
>>   be something like /machine/q35/cfi.pflash01/ in the QOM tree.  To
>>   configure it, you'd set its properties, such as
>>   /machine/q35/cfi.pflash01/drive.
>> 
>>   Note that this requires a way to set an existing device's properties.
>>   Perhaps qom-set already works.
>> 
>> * While I do believe we should tackle the wider problem, I'd rather not
>>   sit on the narrow problem until we crack it.  So, what can we do about
>>   it?
>> 
>>   - Paolo proposed to add block backend properties to the PC machine,
>>     settable like -machine pflash0=BLOCK-BACKEND.
>> 
>>     Possible drawback: if we add /machine/q35/pflash0 to the QOM tree
>>     now, and later replace it by /machine/q35/cfi.pflash01/drive, we'll
>>     have to deal with yet another machine type variation.  We'll live.
>> 
>>   - I proposed to sidestep our onboard device configuration problem by
>>     adding the cfi.pflash01 devices with our existing general device
>>     configuration interface: -device.  Possible since the onboard
>>     cfi.pflash01 devices are optional.  Requires a small extension to
>>     the firmware descriptors, and a bit of extra work in libvirt to
>>     process that extension.  I think it's workable, but Paolo's idea is
>>     simpler.
>> 
>>   I can give Paolo's idea a try.  Objections?
>> 
>> * A flash device supporting multiple regions is desirable, because it's
>>   what physical hardware has.  We currently use multiple flash devices
>>   instead.  We'll be stuck with them for existing machine types due to
>>   guest ABI and migration compatibility.
>> 
>> * cfi.pflash01 currently requires users to opt out of "bad, do not use".
>>   It should require opt in, to guard against accidental new uses of
>>   "bad".
>> 
>> 
>> PS: Big thanks to László, whose patient guidance helped me map this part
>> of the jungle.
>> 
>
> I've read the above carefully.
>
> At the QEMU design level, I don't have any opinion or preference; there
> I simply don't know enough -- and don't suffer from bad decisions enough
> -- to make sensible comments.
>
> Regarding the choice betwen "-machine pflash0=BLOCK-BACKEND" and
> "-device pflash": I don't object to exploring the former first.
>
> I'd just like to note that "-machine pflash0=BLOCK-BACKEND" will also
> require changes to the firmware descriptor schema. Not to the types that
> the schema defines -- and therefore concrete descriptor *documents* that
> already conform to the schema wouldn't be affected --, but to the
> documentation that the schema directs at management applications.
>
> The schema is supposed to specify (in the documentation) QEMU command
> line options for management applications. If we add "-machine
> pflash0=BLOCK-BACKEND", then even if the types in the schema stay the
> same, some mappings to the QEMU cmdline will have to be re-documented.

Good point.

> Of course, that's still easier / less intrusive than changing the types!
> ... Which does make me prefer "-machine pflash0=BLOCK-BACKEND", if I'm
> being honest.
>
> (I hope my followup isn't totally useless. I certainly didn't want to
> ignore your summary.)

Pointing out the need to update these comments is anything but useless.

Thanks!

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-01-30 14:33     ` Paolo Bonzini
  2019-01-30 16:38       ` Laszlo Ersek
  2019-01-31  8:40       ` Markus Armbruster
@ 2019-02-19  7:19       ` Markus Armbruster
  2019-02-22 13:28         ` Markus Armbruster
  2 siblings, 1 reply; 45+ messages in thread
From: Markus Armbruster @ 2019-02-19  7:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Libvirt, Peter Krempa,
	László Érsek, QEMU Developers, Qemu-block

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 30/01/19 15:13, Markus Armbruster wrote:
>>     -global driver=cfi.pflash01,property=secure,value=on
>> 
>> Affects *all* such devices, but fortunately we have at most two, and the
>> one we don't want to affect happens to ignore the property value.
>
> Is this true?  I think both need secure=on, at least on x86.
>
>> For libvirt, plumbing the base address from the firmware's descriptor to
>> QEMU would be the lesser mess (for the firmware, providing the base
>> address there would be no mess at all).
>> 
>> For human users, it's perhaps the greater mess.  They can continue to
>> use -drive if=pflash.
>> 
>> Perhaps we *should* redo board configuration from the ground up.
>> Perhaps a board should be a composite object that exposes properties of
>> its own and its part, just like other composite devices, and so that
>> "create, set properties, realize" works.  That would extend our common
>> device configuration mechanism naturally to onboard devices.
>> 
>> Of course, "we should" doesn't imply "I could".
>
> Maybe we should just add pflash block properties to the machine?  And
> then it can create the devices if the properties are set to a non-empty
> value.
>
> This doesn't remove the need to use -global to configure the "secure"
> property, but it's not particularly an issue.

I played with this idea.  Here's how it went so far.

The new machine properties name block backends, so they should be just
like any other such property.  That means DEFINE_PROP_DRIVE().

Oh, that's a *qdev* property, but a TYPE_MACHINE is not a TYPE_DEVICE.

Duplicating the machinery behind DEFINE_PROP_DRIVE() for a non-qdev
property would be wrong.

But they're not properties of the machine anyway, they're properties of
onboard pflash devices.  So let's create (but not realize) these pflash
devices, and forward the property with something like

    object_property_add_alias(machine, "pflash0",
                              pcms->flash0, "pflash0", &error_abort);

If the property gets set, we realize, else we destroy.

As we just discussed in another thread[*], the place to create child
objects is .instance_init().  Okay, let's add the qdev_create() to
pc_machine_initfn() and see what happens.

Oh, crash happens.

Turns out I can create child objects just fine there, but not
TYPE_DEVICE objects.  These have

    .instance_post_init = device_post_init,

and device_post_init() calls object_apply_compat_props(), which calls
qdev_get_machine(), which calls container_get(object_get_root(),
"/machine").  Since the machine is still being constructed, it hasn't
been made a child of the root object, yet --- main() will do that right
after we return there --- so container_get() "helpfully" creates such a
child for us:

        if (!child) {
            child = object_new("container");
            object_property_add_child(obj, parts[i], child, NULL);
            object_unref(child);
        }

When main() adds the real thing, it fails with

    attempt to add duplicate property 'machine' to object (type 'container')

There are just two users of .instance_post_init, TYPE_DEVICE and
TYPE_MEMORY_BACKEND.  Both use it to call object_apply_compat_props().
I guess that means I can't create TYPE_MEMORY_BACKEND from a
TYPE_MACHINE's .instance_init() either.

Here's object_apply_compat_props():

    void object_apply_compat_props(Object *obj)
    {
        if (object_dynamic_cast(qdev_get_machine(), TYPE_MACHINE)) {
            MachineState *m = MACHINE(qdev_get_machine());
            MachineClass *mc = MACHINE_GET_CLASS(m);

            if (m->accelerator) {
                AccelClass *ac = ACCEL_GET_CLASS(m->accelerator);

                if (ac->compat_props) {
                    object_apply_global_props(obj, ac->compat_props, &error_abort);
                }
            }
            object_apply_global_props(obj, mc->compat_props, &error_abort);
        }
    }

What it really wants is not the MachineState, but its AccelClass and
MachineClass.  Which already exist, they're just not available via
qdev_get_machine().

Opinions?  Advice?



[*] Subject: Object instantiation vs. device realization: what to do
when?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Qemu-devel] Configuring pflash devices for OVMF firmware
  2019-02-19  7:19       ` Markus Armbruster
@ 2019-02-22 13:28         ` Markus Armbruster
  0 siblings, 0 replies; 45+ messages in thread
From: Markus Armbruster @ 2019-02-22 13:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Libvirt, Peter Krempa,
	László Érsek, QEMU Developers, Qemu-block

The other day, I described how my attempt to implement Paolo's
suggestion to add block properties to the machine ran into difficulties.
To recap briefly, creating devices within a machine's .instance_init()
crashes.  Turns out device_post_init() calls qdev_get_machine(), which
calls container_get() for QOM path "/machine".  Since "/machine" doesn't
exist, yet (we're busy creating it), container_get() "helpfully" creates
it as a new "container" object.  Oww.  Fortunately, we crash soon after,
when we try to create the real "/machine".

I managed to rejigger global property application code to work without
qdev_get_machine().  Good.

Not so good: QEMU still crashes the same way.

Turns out we *also* call qdev_get_machine() on the first
sysbus_get_default().  We create the main system bus then, and
immediately store it in "/machine/unattached/sysbus".  So
container_get() "helpfully" creates first "/machine" and then
"/machine/unattached" for us.

I managed to rejigger that, too.  Now my QEMU survives creating a pflash
device in pc_machine_initfn(), and I can add a machine property
"pflash0" with object_property_add_alias().  Good.

Not so good: attempting to set it with -machine pflash0=pflash0 fails
with "Property 'cfi.pflash01.drive' can't find value 'pflash0'".

Turns out we set the machine's properties around line 4250, some 140
lines before we create block backends.  Block backend "pflash0" doesn't
exist, yet.

This is madness, but at least it's familiar madness; we've rejiggered
the order of creating stuff in main() numerous times already.

To be continued...

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2019-02-22 13:28 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-25 15:03 [Qemu-devel] Configuring pflash devices for OVMF firmware Markus Armbruster
2019-01-28  7:58 ` Laszlo Ersek
2019-01-28 10:39 ` Peter Maydell
2019-01-28 12:40   ` [Qemu-devel] [libvirt] " Gerd Hoffmann
2019-01-28 13:06     ` Peter Maydell
2019-01-28 14:55       ` Laszlo Ersek
2019-01-28 14:58         ` Peter Maydell
2019-01-28 15:03           ` Laszlo Ersek
2019-01-30  7:36       ` Markus Armbruster
2019-01-30  8:00         ` Gerd Hoffmann
2019-01-30  7:24   ` [Qemu-devel] " Markus Armbruster
2019-01-30 15:24     ` Peter Maydell
2019-01-30 16:44       ` Laszlo Ersek
2019-01-30 17:24         ` Peter Maydell
2019-01-31  8:52           ` Markus Armbruster
2019-01-31 10:01             ` Peter Maydell
2019-01-31 10:24               ` Markus Armbruster
2019-01-31 10:34                 ` Peter Maydell
2019-01-31 12:05                   ` Markus Armbruster
2019-01-30 14:13   ` Markus Armbruster
2019-01-30 14:33     ` Paolo Bonzini
2019-01-30 16:38       ` Laszlo Ersek
2019-01-31  8:33         ` Markus Armbruster
2019-01-31  9:19           ` Paolo Bonzini
2019-01-31  9:37             ` Markus Armbruster
2019-01-31 12:02               ` Laszlo Ersek
2019-01-31 12:10               ` Paolo Bonzini
2019-01-31 12:51                 ` Markus Armbruster
2019-01-31  8:40       ` Markus Armbruster
2019-01-31  9:19         ` Paolo Bonzini
2019-01-31  9:41           ` Markus Armbruster
2019-01-31 10:12             ` Paolo Bonzini
2019-01-31 12:12               ` Markus Armbruster
2019-01-31 22:57                 ` Paolo Bonzini
2019-01-31 23:28                   ` Alexandro Sanchez Bach
2019-01-31 23:54                     ` Paolo Bonzini
2019-02-01  2:49                       ` Ning, Yu
2019-02-04 10:00                         ` Paolo Bonzini
2019-02-01  8:58                   ` Markus Armbruster
2019-01-31 11:57           ` Laszlo Ersek
2019-02-19  7:19       ` Markus Armbruster
2019-02-22 13:28         ` Markus Armbruster
2019-02-07  9:30 ` Markus Armbruster
2019-02-07 12:31   ` Laszlo Ersek
2019-02-07 13:49     ` Markus Armbruster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.