* [Qemu-devel] qdev for programmers writeup
@ 2011-07-11 10:20 Paolo Bonzini
From: Paolo Bonzini @ 2011-07-11 10:20 UTC (permalink / raw)
  To: qemu-devel

Hi,

this is a partial version of a "qdev for programmers" document I've been
working on.  Comments are welcome.

Paolo

--------------------------------- 8< ---------------------------------

== qdev overview and concepts ==

qdev is the factory interface that QEMU uses to create guest devices and
connect them to each other.  It also provides a uniform way to expose host
devices (character, network and block) to the guest.  In the remainder,
unless specified explicitly, "device" will refer to _guest_ devices.

qdev exposes a device tree that alternates buses (qbuses) and devices
(qdevs).  The root of the tree is the system bus SysBus.  Devices can be
leaves, or they can expose buses and talk to the devices on those buses.
This relation does not cover the host counterparts of the devices, which
are not part of the device tree.

A device's interaction occurs by invoking services specific to the kind of
bus.  In general, if a device or bus X wants to request something from Y,
X needs to know the type of Y, and of course needs to have a pointer to
it.  In a properly "qdevified" board, these assumptions hold:

- qdev enforces what bus a device is placed on;

- buses are not user-visible;

- initialization of buses is driven exclusively by the parent device, and
initialization of devices is driven by the parent bus and a well-defined
set of properties (defined per-device);

- buses do not know what device exposes them;

- devices do not know what device exposes their bus.


With these assumptions in place, leaf devices are the simplest to understand.
They only make requests to the bus and/or to the character/block/network
subsystems; and possibly, they provide services (routines) used by
the bus and the grandparent device.

Intermediate devices also have to provide glue between their parent bus
and their child bus(es), and buses likewise glue two devices.  Depending
on the kind of bus and the relationship of a device with the bus (parent
or child), different sets of services may be defined.  For example, a SCSI
bus mediates many kinds of interaction:

- from a SCSI controller to a SCSI device (e.g., start processing this
command);

- from a SCSI bus to a child device (e.g., cancel this command due to a
bus reset);

- from a SCSI bus to its parent controller (e.g., this piece of data was
sent to you by a SCSI device);

- from a SCSI controller to its child bus (e.g., I dealt with this data,
please transfer more);

- from a SCSI device to its parent bus (e.g., please pass this data on
to the controller);


In general, the following rules and best practices are common:

- devices interact with their parent bus, and vice versa;

- buses interact with their parent devices, and vice versa;

- occasionally, devices may interact directly with their grandchild
devices, but _not_ vice versa; interaction with the grandparent
device is mediated by the parent bus;

- in addition, devices interact freely with their host counterparts
(that is, character/block/network devices).


qdev defines a set of data structures, and devices use them to expose
metainformation to the rest of QEMU and to the user.  The qdev system
is object-oriented; qdev data structures can be subclassed and used to
store additional information, including function pointers for bus-
specific services.  The remainder of this document explains how to
define and use these data structures.


== Implementation techniques ==

qdev implements an object-oriented mechanism in C through containment (the
C replacement for inheritance, so to speak) and tries to make this
as type-safe as possible by leveraging the DO_UPCAST macro.

Sample structure definitions for a superclass and subclass are as follows:

    typedef struct Superclass {
        int field1;
        int field2;
        struct Superclass *field3;
    } Superclass;

    typedef struct Subclass {
        struct Superclass sup;
        int subfield1;
        int subfield2;
    } Subclass;

In many cases, C programmers pass such objects using an opaque pointer
(void *).  These are then cast to the appropriate subtype, like this:

     void func (void *opaque)
     {
         Subclass *s = (Subclass *) opaque;
         ...
     }

QEMU prefers to always use a more type-safe approach that passes the 
pointer to the superclass.  The cast is then done using the 
aforementioned macro:

     void func (Superclass *state)
     {
         /* more type-safe version of (Subclass *) state, which also
            verifies that &state->sup == state.
            - First argument: subclass type.
            - Second argument: field being accessed.
            - Third argument: variable being cast.  */
         Subclass *sub = DO_UPCAST(Subclass, sup, state);
     }

Casts to a superclass are done with &sub->sup.  This scheme is quite
handy to use, even though it may look a bit strange at first.
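
To make the idea concrete, here is a minimal, self-contained sketch of what
such a macro has to do.  SKETCH_UPCAST is a hypothetical stand-in written
for this document, not QEMU's actual DO_UPCAST definition; it assumes only
that the superclass is embedded as the first member of the subclass.

```c
#include <stddef.h>

typedef struct Superclass {
    int field1;
    int field2;
} Superclass;

typedef struct Subclass {
    Superclass sup;     /* embedded superclass, must come first */
    int subfield1;
} Subclass;

/* Hypothetical stand-in for DO_UPCAST.  The array size is negative, so
   compilation fails, unless `field` sits at offset 0 of `type`. */
#define SKETCH_UPCAST(type, field, ptr)                               \
    ((void)sizeof(char[offsetof(type, field) == 0 ? 1 : -1]),         \
     (type *)(void *)(ptr))

/* Receives the superclass pointer and recovers the subclass. */
static int read_subfield1(Superclass *state)
{
    Subclass *sub = SKETCH_UPCAST(Subclass, sup, state);
    return sub->subfield1;
}

/* Usage: the caller passes &obj.sup, never a void pointer. */
static int upcast_demo(void)
{
    Subclass obj = { .sup = { .field1 = 1, .field2 = 2 }, .subfield1 = 42 };
    return read_subfield1(&obj.sup);
}
```

This sketch checks only the field offset; it does not check that the
pointer passed in really has the type of the named field.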


== qdev data structures ==

This part of the document explains the data structures used by qdev.
These include the class hierarchies for buses and devices, together
with the corresponding metaclass hierarchies, and a registry of
devices and corresponding metainformation.

=== Bus and device hierarchies ===

Buses and devices reside on two parallel hierarchies, BusState and 
DeviceState.  Devices that work on the same bus usually share a
superclass.  Hence, each bus defines a subclass of BusState and an
abstract subclass of DeviceState.  Each device then adds its concrete
subclass in the DeviceState hierarchy.  For example:

     BusState
         PCIBus
         ISABus
         i2c_bus
     DeviceState
         PCIDevice /* bus common superclass */
             LSIState /* device-specific class */
             ...
         ISADevice
             IB700State
             ISASerialState
             ...
         i2c_slave
             WM8750State
             ...

Here is how tasks are separated between these classes:

1) bus classes (e.g. i2c_bus) are usually the least interesting of all.
Their fields are mostly private and used at device creation time.  For
example, this is where you could store the highest IRQ allocated to
devices on the bus.  In some cases the bus class is absent altogether; for
example, SysBus reuses BusState directly.

2) bus superclasses (e.g. i2c_slave) typically include the address of 
the device and the interrupt lines that it is connected to.

3) device subclasses contain device-specific configuration information 
(e.g. the character or block devices to connect to) and registers.


=== Describing qdev data structures ===

In addition to defining the structs, each bus and device should describe 
them as "properties".  Since the description that a device exposes is 
shared between the bus superclasses and the device subclasses, a device 
is described completely by the union of "bus properties" (representing 
fields of the abstract per-bus superclass) and "device properties"
(representing fields of the device subclass).  Example:

     /* This is a bus superclass */
     struct i2c_slave
     {
         DeviceState qdev;
         I2CSlaveInfo *info;  /* explained later */
         uint8_t address;
     };

     /* This is how we explain it to QEMU */
     static struct BusInfo i2c_bus_info = {
         .name = "I2C",
         .size = sizeof(i2c_bus),
         .props = (Property[]) {
             /* This means: "address" is an uint8_t property with a
                default value of 0.  Store it in field address of
                struct i2c_slave.  */
             DEFINE_PROP_UINT8("address", struct i2c_slave, address, 0),
             DEFINE_PROP_END_OF_LIST(),
         }
     };

     /* This is a device that exposes no properties.  */
     static I2CSlaveInfo wm8750_info = {
         .qdev.name = "wm8750",
         .qdev.size = sizeof(WM8750State),
         /* For migration and save/restore; do not care yet.  */
         .qdev.vmsd = &vmstate_wm8750,
         /* These functions are exposed to the bus and possibly to
            the grandparent device.  */
         .init = wm8750_init,
         .event = wm8750_event,
         .recv = wm8750_rx,
         .send = wm8750_tx
     };

Another example:

     /* ISA defines no bus properties */
     static struct BusInfo isa_bus_info = {
         .name      = "ISA",
         .size      = sizeof(ISABus),
         /* ISA defines a couple of bus-specific callbacks.  */
         .print_dev = isabus_dev_print,
         .get_fw_dev_path = isabus_get_fw_dev_path,
     };

     /* However, a parallel port does define device properties: */
     static ISADeviceInfo parallel_isa_info = {
         .qdev.name  = "isa-parallel",
         .qdev.size  = sizeof(ISAParallelState),
         .init       = parallel_isa_initfn,
         .qdev.props = (Property[]) {
             DEFINE_PROP_UINT32("index", ISAParallelState, index,   -1),
             DEFINE_PROP_HEX32("iobase", ISAParallelState, iobase,  -1),
             DEFINE_PROP_UINT32("irq",   ISAParallelState, isairq,  7),
             DEFINE_PROP_CHR("chardev",  ISAParallelState, state.chr),
             DEFINE_PROP_END_OF_LIST(),
         },
     };

In general, a device may have both bus properties and device properties. 
Simple examples appropriate for documentation unfortunately don't. :)
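
To give an idea of how a property list can tie a user-visible name to a
struct field, here is a simplified, self-contained sketch.  Every Sketch*
and SKETCH_* name is hypothetical, only uint32_t fields are handled, and
nothing here is meant to reproduce QEMU's real Property machinery.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified property descriptor: uint32_t fields only. */
typedef struct SketchProperty {
    const char *name;       /* NULL marks the end of the list */
    size_t offset;          /* where the field lives in the state struct */
    uint32_t defval;        /* value applied before the user sets anything */
} SketchProperty;

#define SKETCH_PROP_UINT32(n, type, field, def) \
    { n, offsetof(type, field), def }
#define SKETCH_PROP_END_OF_LIST() { NULL, 0, 0 }

typedef struct SketchParallelState {
    uint32_t index;
    uint32_t iobase;
    uint32_t isairq;
} SketchParallelState;

static const SketchProperty sketch_parallel_props[] = {
    SKETCH_PROP_UINT32("index",  SketchParallelState, index,  UINT32_MAX),
    SKETCH_PROP_UINT32("iobase", SketchParallelState, iobase, UINT32_MAX),
    SKETCH_PROP_UINT32("irq",    SketchParallelState, isairq, 7),
    SKETCH_PROP_END_OF_LIST(),
};

/* Locate the field a property describes inside a state object. */
static uint32_t *sketch_prop_field(void *obj, const SketchProperty *prop)
{
    return (uint32_t *)((char *)obj + prop->offset);
}

/* Fill every described field with its default value. */
static void sketch_prop_set_defaults(void *obj, const SketchProperty *props)
{
    for (; props->name != NULL; props++) {
        *sketch_prop_field(obj, props) = props->defval;
    }
}

/* Set a field by its user-visible name; returns -1 if there is no such
   property. */
static int sketch_prop_set_uint32(void *obj, const SketchProperty *props,
                                  const char *name, uint32_t value)
{
    for (; props->name != NULL; props++) {
        if (strcmp(props->name, name) == 0) {
            *sketch_prop_field(obj, props) = value;
            return 0;
        }
    }
    return -1;
}
```

Note that the property name ("irq") and the field name (isairq) are
independent, as in the parallel port example above.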


=== Metainformation hierarchy ===

Above you may have noticed some new type names: BusInfo, I2CSlaveInfo,
DeviceInfo.  These are the types used to store information about the class:
properties of course, and also virtual functions.  In some sense
these *are* metaclass objects.  Their hierarchies mimic the BusState
and DeviceState ones.  The BusInfo/DeviceInfo hierarchy includes a
struct for each abstract class in the BusState/DeviceState hierarchy,
and an instance for each concrete class:

     BusState <=> BusInfo
         PCIBus -> struct BusInfo pci_bus_info = ...
         ISABus -> struct BusInfo isa_bus_info = ...
         i2c_bus -> struct BusInfo i2c_bus_info = ...
     DeviceState <=> DeviceInfo
         PCIDevice <=> PCIDeviceInfo
             LSIState -> static PCIDeviceInfo lsi_info = ...
             ...
         ISADevice <=> ISADeviceInfo
             IB700State -> static ISADeviceInfo wdt_ib700_info = ...
             ISASerialState -> static ISADeviceInfo serial_isa_info = ...
             ...
         i2c_slave <=> I2CSlaveInfo
             WM8750State -> static I2CSlaveInfo wm8750_info = ...
             ...

Per-bus structs such as I2CSlaveInfo are where devices declare the virtual
functions required by the bus, in addition to those already in DeviceInfo.
In many cases, these functions correspond to additional "services" that
only make sense for that bus (example: event/recv/send in the i2c bus).
Sometimes, instead, they replace the ones in the superclass because
the bus needs to pass extra information.  The init function is always
overridden in this way; there is an internal init member in
DeviceInfo:

     typedef int (*qdev_initfn)(DeviceState *dev, DeviceInfo *info);

and one per bus, for example:

     typedef int (*i2c_slave_initfn)(i2c_slave *dev);
     typedef int (*isa_qdev_initfn)(ISADevice *dev);
     typedef int (*pci_qdev_initfn)(PCIDevice *dev);

Here is the way the I2C bus defines its qdev_initfn in terms of 
i2c_slave_initfn:

     static int i2c_slave_qdev_init(DeviceState *dev, DeviceInfo *base)
     {
         I2CSlaveInfo *info = DO_UPCAST (I2CSlaveInfo, qdev, base);
         i2c_slave *s = DO_UPCAST(i2c_slave, qdev, dev);

         /* Store virtual function table for later use.  */
         s->info = info;

         return info->init(s);
     }
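
The same trampoline pattern can be tried out in isolation.  The following
is a self-contained sketch under simplifying assumptions: all Sketch* and
codec_* names are hypothetical, and plain casts stand in for DO_UPCAST,
which is valid here only because the embedded base is the first member.

```c
#include <stddef.h>
#include <stdint.h>

/* All Sketch* names are hypothetical stand-ins, not QEMU's definitions. */
typedef struct SketchDeviceState SketchDeviceState;
typedef struct SketchDeviceInfo SketchDeviceInfo;
typedef struct SketchSlave SketchSlave;
typedef struct SketchSlaveInfo SketchSlaveInfo;

/* Generic layer: knows only the generic init signature. */
struct SketchDeviceInfo {
    const char *name;
    int (*init)(SketchDeviceState *dev, SketchDeviceInfo *info);
};
struct SketchDeviceState {
    int initialized;
};

/* Bus layer: the metaclass adds bus-specific virtual functions. */
struct SketchSlaveInfo {
    SketchDeviceInfo qdev;              /* first member */
    int (*init)(SketchSlave *dev);      /* typed replacement for qdev.init */
    int (*send)(SketchSlave *dev, uint8_t data);
};
struct SketchSlave {
    SketchDeviceState qdev;             /* first member */
    SketchSlaveInfo *info;              /* virtual function table */
    uint8_t last_byte;
};

/* Trampoline installed in qdev.init: narrows both pointers, stores the
   table, then calls the bus-typed init. */
static int sketch_slave_qdev_init(SketchDeviceState *dev,
                                  SketchDeviceInfo *base)
{
    SketchSlaveInfo *info = (SketchSlaveInfo *)base;
    SketchSlave *s = (SketchSlave *)dev;

    s->info = info;
    return info->init(s);
}

/* Bus helper that dispatches through the stored table. */
static int sketch_slave_send(SketchSlave *s, uint8_t data)
{
    return s->info->send(s, data);
}

/* A concrete device supplies only the bus-typed callbacks. */
static int codec_init(SketchSlave *s)
{
    s->qdev.initialized = 1;
    return 0;
}

static int codec_send(SketchSlave *s, uint8_t data)
{
    s->last_byte = data;
    return 0;
}

static SketchSlaveInfo codec_info = {
    .qdev.name = "sketch-codec",
    .qdev.init = sketch_slave_qdev_init,
    .init = codec_init,
    .send = codec_send,
};
```

Generic code calls codec_info.qdev.init with base-class pointers; the
device itself never sees anything less specific than SketchSlave.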


=== Registering devices and making them public ===

The last part of qdev is the registry of all devices defined by the
target system.  This is a fundamental piece of metainformation, because
it allows the "-device" option to work, at least for devices that
do not rely on DEFINE_PROP_PTR or sysbus_create_varargs (those
devices can only be instantiated from QEMU's machine initialization
code).

Registering a device's name is done with the qdev_register function.
This function however is used only internally.  The actual function
to be used varies per-bus, so that the bus can first perform some checks
and do some initialization that is common to all DeviceInfo objects for
that bus.

To this end, each bus defines a wrapper function that initializes the common
part of struct DeviceInfo and passes it to qdev_register:

     void i2c_register_slave(I2CSlaveInfo *info)
     {
         assert(info->qdev.size >= sizeof(i2c_slave));
         info->qdev.init = i2c_slave_qdev_init;
         info->qdev.bus_info = &i2c_bus_info;
         qdev_register(&info->qdev);
     }

Each device then calls this function:

     static void wm8750_register_devices(void)
     {
         i2c_register_slave(&wm8750_info);
     }

In turn, wm8750_register_devices is called at startup (as if it were a
C++ global constructor; a GCC extension makes this possible in C):

     device_init(wm8750_register_devices)
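
For readers unfamiliar with the constructor trick, here is a self-contained
sketch of a name registry filled in before main() runs.  It assumes GCC or
Clang (for __attribute__((constructor))); the sketch_* names, including the
sketch_device_init macro, are hypothetical and much simpler than what QEMU
actually does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal registry entry. */
typedef struct SketchDeviceInfo {
    const char *name;
    size_t size;
    struct SketchDeviceInfo *next;
} SketchDeviceInfo;

static SketchDeviceInfo *sketch_registry;

/* Prepend an entry to the global list. */
static void sketch_register(SketchDeviceInfo *info)
{
    info->next = sketch_registry;
    sketch_registry = info;
}

/* Lookup by name, as a "-device NAME" parser would need. */
static SketchDeviceInfo *sketch_find(const char *name)
{
    SketchDeviceInfo *info;

    for (info = sketch_registry; info != NULL; info = info->next) {
        if (strcmp(info->name, name) == 0) {
            return info;
        }
    }
    return NULL;
}

/* GCC/Clang extension: the generated function runs before main(). */
#define sketch_device_init(fn) \
    static void __attribute__((constructor)) sketch_ctor_##fn(void) { fn(); }

static SketchDeviceInfo sketch_codec_info = {
    .name = "sketch-codec",
    .size = 64,
};

static void sketch_codec_register_devices(void)
{
    sketch_register(&sketch_codec_info);
}

sketch_device_init(sketch_codec_register_devices)
```

Because registration happens in a constructor, adding a device means adding
one source file; no central table has to be edited.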


== Letting buses and devices "talk" ==

In this part of the document, we will examine the mechanisms by which
buses and devices are connected.  The first section will explain how
buses convert human-readable properties into pointers to internal
data structures.  The second section will explain how devices take
care of creating buses.  Finally, we will describe SysBus, which is
the root of the qdev system and connects qdev with the rest of the
QEMU device model.


=== Using buses to connect device layers ===

As mentioned above, buses sit in a unique location, as they have access
to services from both the parent device and the child device.  As such,
they provide the "glue" between two layers of devices.

As part of this, they may simply expose some of the services of the
parent devices to the children.  For example, a USB host controller
interface exposes a bus with one or more "ports", and defines a set of
functions to operate on ports.  USB devices do not operate directly
on these functions; they always go through helpers such as this one:

    void usb_wakeup(USBDevice *dev)
    {
        if (dev->remote_wakeup && dev->port && dev->port->ops->wakeup) {
            dev->port->ops->wakeup(dev);
        }
    }

Helpers like this make changes easier, for example if a function
used to be mandatory and you want to make it optional.
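
As an illustration of the "optional function" case, here is a small
self-contained sketch with two controllers, one of which does not implement
the callback.  The Sketch* types and sketch_* functions are hypothetical
and only loosely follow the shape of the USB code quoted above.

```c
#include <stddef.h>

typedef struct SketchUSBDevice SketchUSBDevice;

/* Services the controller offers on each port; any may be NULL. */
typedef struct SketchPortOps {
    void (*wakeup)(SketchUSBDevice *dev);
} SketchPortOps;

typedef struct SketchPort {
    const SketchPortOps *ops;
    int wakeups_seen;
} SketchPort;

struct SketchUSBDevice {
    SketchPort *port;       /* NULL while detached */
    int remote_wakeup;
};

/* Helper in the style of usb_wakeup: children call this and never
   touch ops directly. */
static void sketch_usb_wakeup(SketchUSBDevice *dev)
{
    if (dev->remote_wakeup && dev->port && dev->port->ops->wakeup) {
        dev->port->ops->wakeup(dev);
    }
}

/* One controller implements the callback... */
static void sketch_hc_wakeup(SketchUSBDevice *dev)
{
    dev->port->wakeups_seen++;
}
static const SketchPortOps sketch_hc_ops = { .wakeup = sketch_hc_wakeup };

/* ...another leaves it out. */
static const SketchPortOps sketch_dumb_ops = { .wakeup = NULL };
```

Device code calls sketch_usb_wakeup unconditionally; whether the controller
supports wakeup, or the device is attached at all, is decided in one place.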

Another very important piece of glue is initialization.  When the bus's
init function is called, properties have been set already and the parent
bus is known too.  Hence the bus has the opportunity to take the values of
the properties and convert them to pointers to internal data structures
(or for example qemu_irqs).  Here is an example:

1) the bus defines a property (irq, the IRQ number):

     static struct BusInfo spapr_vio_bus_info = {
         .name       = "spapr-vio",
         .size       = sizeof(VIOsPAPRBus),
         .props = (Property[]) {
             DEFINE_PROP_UINT32("irq", VIOsPAPRDevice, vio_irq_num, 0),
             DEFINE_PROP_END_OF_LIST(),
         },
     };

2) the bus init function talks to the parent device (spapr) in order to 
get a default value and especially a qemu_irq:

     if (!dev->vio_irq_num) {
         dev->vio_irq_num = spapr_allocate_irq (spapr);
     }
     dev->qirq = xics_find_qirq(spapr->icp, dev->vio_irq_num);


So this is how qdev manages to convert human-readable configuration into 
pointers.  Since you cannot go "turtles all the way down", there are two 
fallback mechanisms to pass pointers directly to devices:

1) one is DEFINE_PROP_PTR, which you probably shouldn't use;

2) one is specific to qemu_irq and devices from sysbus; see 
sysbus_create_varargs.
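
The number-to-pointer conversion described in this section can be sketched
in a self-contained way as follows.  All Sketch* and sketch_* names are
hypothetical; the logic only mirrors the shape of the spapr-vio fragment
above (allocate a default if the property is 0, then look up the line).

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_IRQS 16

typedef struct SketchIrq {
    int level;
} SketchIrq;

/* Parent device: owns the interrupt lines and hands out numbers. */
typedef struct SketchController {
    SketchIrq lines[SKETCH_NUM_IRQS];
    uint32_t next_free;         /* next number to allocate; 0 is reserved */
} SketchController;

/* Child device as seen by the bus. */
typedef struct SketchVioDevice {
    uint32_t irq_num;           /* property; 0 means "pick one for me" */
    SketchIrq *qirq;            /* filled in by the bus at init time */
} SketchVioDevice;

static uint32_t sketch_allocate_irq(SketchController *c)
{
    return c->next_free++;
}

static SketchIrq *sketch_find_irq(SketchController *c, uint32_t num)
{
    return num < SKETCH_NUM_IRQS ? &c->lines[num] : NULL;
}

/* Bus init step: properties are already set and the parent is known. */
static int sketch_bus_init_device(SketchController *parent,
                                  SketchVioDevice *dev)
{
    if (!dev->irq_num) {
        dev->irq_num = sketch_allocate_irq(parent);
    }
    dev->qirq = sketch_find_irq(parent, dev->irq_num);
    return dev->qirq ? 0 : -1;
}
```

The device only ever stores a number chosen by the user (or a default);
the pointer is derived from it by code that has access to the parent.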


=== Defining a child bus ===

[...]


=== SysBus: the root ===

[...]


== A quick guide to qdev conversion ==

Converting devices to qdev is a three-step process:

1) ensuring that an appropriate bus type is defined to which the device
can be attached;

2) defining a device's properties (the "schema" exposed by the device);

3) converting board initialization functions to use qdev services.

The first step is very important to achieve a "quality" conversion
to qdev.  QEMU includes partial conversions to qdev that have a large
number of SysBus devices, or devices that use DEFINE_PROP_PTR.  In many
cases, this is because the authors did not introduce a board-specific
bus type to mediate access to the board resources.  Together with such
a bus type there should be a single root board-specific device that is
attached to SysBus.  An interrupt controller is usually a good candidate
for this because it takes qemu_irqs from the outside, and can make good
use of the specificities of SysBus.

A good design makes the conversion simpler (this matters, because it is
usually hard to convert only a small part of the devices); in particular,
the second step may become mostly trivial.

The third step is also very important.  If the conversion was done
well, a lot of board-specific initialization code may be removed and
replaced by command-line options.  This will also give the user the
flexibility of working with "dumbed down" versions of the board, with
some devices removed.  If necessary, standard versions of the board
may be described with configuration files.

Old code not yet converted to qdev uses a specific function for each
device type:

     goldfish_timer_and_rtc_init(0xff003000, 3);
     ...

     static struct goldfish_timer_state timer_state;

     void goldfish_timer_and_rtc_init(uint32_t timerbase, int timerirq)
     {
         timer_state.dev.base = timerbase;
         timer_state.dev.irq = timerirq;
         timer_state.timer = qemu_new_timer_ns(vm_clock,
                                goldfish_timer_tick, &timer_state);

         goldfish_device_add(&timer_state.dev, goldfish_timer_readfn,
             goldfish_timer_writefn, &timer_state);
     }

Here, the "timer_state.dev" field is a sub-structure that is common
to all devices in the board.  This is an embryonic separation between
bus-specific and device-specific data that can be exploited when
converting to qdev.  However, there are substantial differences between
this code and what will be required after qdev conversion:

- the timerbase and timerirq are set via properties before the device is
initialized; qdev takes care of filling in the structure's fields;

- creation of the timer is moved into the init virtual function for the
device;

- of all the arguments to goldfish_device_add, only "&timer_state" 
matters, because the goldfish_timer_readfn and goldfish_timer_writefn 
arguments will be stored in the GoldfishDeviceInfo;

- last but not least, everything will be allocated dynamically, so
static device objects such as "timer_state" will have to go.
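
The differences listed above amount to a create / set properties / init
life cycle.  The following self-contained sketch shows that life cycle
with hypothetical Sketch* names; it does not use the real qdev API, and
fields are set directly where real code would go through properties.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct SketchDeviceState SketchDeviceState;

typedef struct SketchDeviceInfo {
    const char *name;
    size_t size;                            /* size of the concrete state */
    int (*init)(SketchDeviceState *dev);
} SketchDeviceInfo;

struct SketchDeviceState {
    const SketchDeviceInfo *info;
};

/* Concrete device: fields that used to be function arguments. */
typedef struct SketchTimerState {
    SketchDeviceState qdev;     /* first member */
    uint32_t base;
    uint32_t irq;
    int timer_armed;
} SketchTimerState;

/* Work that needs the final property values belongs in init. */
static int sketch_timer_init(SketchDeviceState *dev)
{
    SketchTimerState *s = (SketchTimerState *)dev;

    s->timer_armed = (s->base != 0);
    return 0;
}

static const SketchDeviceInfo sketch_timer_info = {
    .name = "sketch-timer",
    .size = sizeof(SketchTimerState),
    .init = sketch_timer_init,
};

/* Step 1: allocate info->size zeroed bytes; no device code runs yet. */
static SketchDeviceState *sketch_create(const SketchDeviceInfo *info)
{
    SketchDeviceState *dev = calloc(1, info->size);

    if (dev) {
        dev->info = info;
    }
    return dev;
}

/* Step 3: run the device's init once the fields are in place. */
static int sketch_init(SketchDeviceState *dev)
{
    return dev->info->init(dev);
}
```

Step 2, between sketch_create and sketch_init, is where the caller fills in
base and irq.  The size member is what lets generic code allocate a device
without knowing its concrete type, so no static object is needed.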


qdev's metainformation structures BusInfo and DeviceInfo provide a place
for all this information, including even initializers for the static
"timer_state" object.  These for example can become bus property defaults,
or can be moved to the DeviceInfo subclass.

So, the call to goldfish_timer_and_rtc_init can be described entirely
in terms of qdev properties.  This can in turn be expressed in
different ways:

1) command-line

    -device goldfish_timer,base=0xff003000,irq=3

2) configuration files (for -readconfig):

     [device "goldfish_timer"]
         base = 0xff003000
         irq = 3

3) C code:

     /* The first argument is the bus.  See below for how to
        create a bus-specific wrapper to qdev_create.  */
     dev = qdev_create(&goldfish_bus->qbus, "goldfish_timer");
     qdev_prop_set_uint32(dev, "base", 0xff003000);
     qdev_prop_set_uint32(dev, "irq", 3);
     qdev_init_nofail(dev);


The last case will appear in the machine initialization function in
several cases: devices using DEFINE_PROP_PTR; devices that are present
in the board by default (though in the long term we would like to
move those to configuration files); code that creates devices based
on legacy command-line interfaces.  It will often be hidden behind a
helper function not unlike goldfish_timer_and_rtc_init; for example
(slightly edited from the actual QEMU code):

    static ISABus *isabus;

    ISADevice *isa_create(const char *name)
    {
        DeviceState *dev;

        dev = qdev_create(&isabus->qbus, name);
        return DO_UPCAST(ISADevice, qdev, dev);
    }

    static inline void serial_isa_init(int index, CharDriverState *chr)
    {
        ISADevice *dev;
    
        dev = isa_create("isa-serial");
        qdev_prop_set_uint32(&dev->qdev, "index", index);
        qdev_prop_set_chr(&dev->qdev, "chardev", chr);
        qdev_init_nofail(&dev->qdev);
    }

    ...

    /* Here we create ISA serial ports for each -serial option
       on the command line.  */
    for(i = 0; i < MAX_SERIAL_PORTS; i++) {
        if (serial_hds[i]) {
            serial_isa_init(i, serial_hds[i]);
        }
    }


* Re: [Qemu-devel] qdev for programmers writeup
  2011-07-11 10:20 [Qemu-devel] qdev for programmers writeup Paolo Bonzini
@ 2011-07-11 10:46 ` Peter Maydell
  2011-07-11 12:48   ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2011-07-11 10:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 11 July 2011 11:20, Paolo Bonzini <pbonzini@redhat.com> wrote:

This is cool; more qdev documentation is really useful.

One point I'd like clarification on is when you need to invent
a new bus type. Sometimes it's pretty obvious because there's
a real hardware bus type there (PCI, SCSI) that you're modelling.
It's the edge cases I find confusing -- for instance, do we need
a qbus for the connection between an SD card controller and the
SD card model (hw/sd.c) ? There's a well defined pluggable interface
between those two parts but there's only ever one SD card so a "bus"
would be a bit odd.

> The first step is very important to achieve a "quality" conversion
> to qdev.  QEMU includes partial conversions to qdev that have a large
> amount of SysBus devices, or devices that use DEFINE_PROP_PTR.  In many
> cases, this is because the authors did not introduce a board-specific
> bus type to mediate access to the board resources.  Together with such
> a bus type there should be a single root board-specific device that is
> attached to SysBus.  An interrupt controller is usually a good candidate
> for this because it takes qemu_irqs from the outside, and can make good
> use of the specificities of SysBus.

...and this bit I don't understand. Why is SysBus a bad thing? It
generally seems to me to be the right way to represent a bit of
hardware which is fundamentally providing a memory mapped interface
plus some GPIO lines. If you make your board use sysbus then it's
easy to just plug in any existing sysbus device model qemu already
has; if every board has its own bus type instead then this reuse
just becomes unnecessarily harder, it seems to me.

Also having the interrupt controller be the board specific device
which you attach to sysbus seems totally wrong to me. This doesn't
match hardware at all -- the interrupt controller deals with
interrupt lines and isn't in the path for memory transactions at
all. (The hierarchy for memory accesses may be completely different
from how interrupts are wired, for that matter.)

-- PMM


* Re: [Qemu-devel] qdev for programmers writeup
  2011-07-11 10:46 ` Peter Maydell
@ 2011-07-11 12:48   ` Paolo Bonzini
  2011-07-11 14:44     ` Peter Maydell
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2011-07-11 12:48 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On 07/11/2011 12:46 PM, Peter Maydell wrote:
> One point I'd like clarification on is when you need to invent
> a new bus type.

As rarely as possible, but as often as necessary? :P

New buses limit reusability of the models, but you need one whenever the 
existing buses do not express how two devices interact.

> Sometimes it's pretty obvious because there's
> a real hardware bus type there (PCI, SCSI) that you're modelling.
> It's the edge cases I find confusing -- for instance, do we need
> a qbus for the connection between an SD card controller and the
> SD card model (hw/sd.c) ? There's a well defined pluggable interface
> between those two parts but there's only ever one SD card so a "bus"
> would be a bit odd.

Perhaps not and you can use containment.  A very similar case is the 
16550 device, which has both an ISA (ioport-based) interface and a 
generic memory-mapped interface.  Anthony at some point argued that if 
serial_mm_init was qdev-ified, one should describe SerialState as a 
separate 16550 device, and then put a bus between {isa,mm}-serial and 
this 16550 device.  But perhaps including a SerialState struct into
both ISASerialState and the (hypothetical) MMSerialState is enough.

I think you have to look at the properties of the child device and the 
interfaces between the devices.  For SerialState, the properties would 
be chardev and baudbase, and just a qemu_irq between the parent and 
child.  For SDState, the only property of the SDState would be the 
blockdev, and again a couple GPIO pins between the two.

It probably would be feasible to separate the two.  But then without a 
clean plan about hiding such internal devices, it is likely a useless 
complication for the user to see the existence of the SerialState and 
SDState.

>> Together with such
>> a bus type there should be a single root board-specific device that is
>> attached to SysBus.  An interrupt controller is usually a good candidate
>> for this because it takes qemu_irqs from the outside, and can make good
>> use of the specificities of SysBus.
>
> ...and this bit I don't understand. Why is SysBus a bad thing?

1) because SysBus devices are in general not accessible from the 
command-line or configuration files.

2) because SysBus hardcodes in the source code some things that ought to 
be device properties, for example the MMIO base address.  The presence 
of MMIO in sysbus_create_simple/sysbus_create_varargs is totally 
unnecessary IMHO.

> It generally seems to me to be the right way to represent a bit of
> hardware which is fundamentally providing a memory mapped interface
> plus some GPIO lines. If you make your board use sysbus then it's
> easy to just plug in any existing sysbus device model qemu already
> has; if every board has its own bus type instead then this reuse
> just becomes unnecessarily harder, it seems to me.

That's true, but the only way to plug in those device models would be 
with C code.  You cannot just play with -device to reconfigure them. 
It's not like SysBus has any problem; but it is right now the only 
choice you have if you want a reusable model, and that means that 
reusability can only be done at the cost of rebuilding QEMU.

For example, one reusable device is gpio_i2c.  However, I cannot simply 
take it and add it to a new board.  I need to add glue like this:

     /* dev is my GPIO device.  */
     i2c_dev = sysbus_create_simple("gpio_i2c", -1, NULL);
     qdev_connect_gpio_out(i2c_dev, 0, qdev_get_gpio_in(dev, 29));
     qdev_connect_gpio_out(dev, 3, qdev_get_gpio_in(i2c_dev, 0));
     qdev_connect_gpio_out(dev, 4, qdev_get_gpio_in(i2c_dev, 1));

and recompile QEMU.

In fact, perhaps qdev_{connect_gpio_out,get_gpio_in} should never have 
been public.  Imagine we added to qdev GPIO properties and we used them 
like this in gpio_i2c:

     /* gpio_out=N means "connect my 0th output pin to the parent's
        N-th input pin.  */
     DEFINE_PROP_GPIO_OUT("gpio_out", 0),

     /* gpio_in=N means "connect my 0th input pin to the parent's
        N-th output pin.  */
     DEFINE_PROP_GPIO_IN("gpio_in", 0),
     DEFINE_PROP_GPIO_IN("gpio_clk", 1)

Then we define a GPIOBus that is really a bare-bones BusState, with no 
MMIO and nothing.  However, GPIO chips would expose one such bus, and a 
lot of reusable components could be moved from SysBus to GPIOBus... and 
get -device configuration at once!  With this in place you can do:

     -device gpio_i2c,gpio_out=29,gpio_in=3,gpio_clk=4

or in a configuration file:

     [device "gpio_i2c"]
         gpio_out = 29
         gpio_in = 3
         gpio_clk = 4

or if you really have to do it in C:

     dev = qdev_create(&gpiobus->bus, "gpio_i2c");
     /* hypothetical setters for the GPIO properties proposed above */
     qdev_prop_set_gpio_out(dev, "gpio_out", 29);
     qdev_prop_set_gpio_in(dev, "gpio_in", 3);
     qdev_prop_set_gpio_in(dev, "gpio_clk", 4);
     qdev_init_nofail(dev);

Even the C code would already be an improvement, because the client code 
has no idea of the pin numbers of gpio_i2c.

Note that gpio_i2c is already a well-defined device, and it uses only a 
bunch of qdev gpio pins.  In a more complex case when you use the sysbus 
IRQ mechanism, sharing is going to be even harder (and this is what I 
had in mind when proposing that the interrupt controller have its own bus).

> Also having the interrupt controller be the board specific device
> which you attach to sysbus seems totally wrong to me. This doesn't
> match hardware at all -- the interrupt controller deals with
> interrupt lines and isn't in the path for memory transactions at
> all.

Well, it is clear that buses should be modelled after the way data 
flows.  But what is data?  If data is "what is being written", buses 
should be modelled after the way memory transactions flow.  If data is
"what is being made available", buses are modelled more after the way
interrupts flow.  GPIO is a strange thing in the middle. :)

Unfortunately there is only one path you have to choose, which is 
perhaps the strongest limitation of qdev (unless you're working with PCI 
where the bus provides both MMIO and interrupts).  MMIO assignments for 
embedded boards are usually simple enough that I found an 
interrupt-driven design (SysBus->PIC->InterruptBus->GPIO->GPIOBus->...) 
clearer and easier to explain; that biased my writing.

Paolo


* Re: [Qemu-devel] qdev for programmers writeup
  2011-07-11 12:48   ` Paolo Bonzini
@ 2011-07-11 14:44     ` Peter Maydell
  2011-07-11 15:29       ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2011-07-11 14:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 11 July 2011 13:48, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 07/11/2011 12:46 PM, Peter Maydell wrote:
>>Paolo wrote:
>>> Together with such
>>> a bus type there should be a single root board-specific device that is
>>> attached to SysBus.  An interrupt controller is usually a good candidate
>>> for this because it takes qemu_irqs from the outside, and can make good
>>> use of the specificities of SysBus.
>>
>> ...and this bit I don't understand. Why is SysBus a bad thing?
>
> 1) because SysBus devices are in general not accessible from the
> command-line or configuration files.
>
> 2) because SysBus hardcodes in the source code some things that ought to be
> device properties, for example the MMIO base address.  The presence of MMIO
> in sysbus_create_simple/sysbus_create_varargs is totally unnecessary IMHO.

MMIO base address shouldn't be a device property anyway -- it is
a property of the machine (board) model, ie "how am I wiring up
these devices?", not a property of the device itself.

>> It generally seems to me to be the right way to represent a bit of
>> hardware which is fundamentally providing a memory mapped interface
>> plus some GPIO lines. If you make your board use sysbus then it's
>> easy to just plug in any existing sysbus device model qemu already
>> has; if every board has its own bus type instead then this reuse
>> just becomes unnecessarily harder, it seems to me.
>
> That's true, but the only way to plug in those device models would be with C
> code.  You cannot just play with -device to reconfigure them.

I think the C source level reuse is more important and more useful
than plugging stuff around with -device, because in practice messing
about with the memory map of an embedded board isn't something people
typically want to do, because the kernel will have more-or-less
hardcoded where the devices are anyhow. (Also if you have one bus type
per board then you're still very limited in what you can do with -device
because you can't plug in some random other sysbus device anyway.) So if
we want to allow runtime configurability of boards like that we need
to do it by providing runtime configurability of sysbus devices
(which approximately equates to runtime script-driven machine models).

> In fact, perhaps qdev_{connect_gpio_out,get_gpio_in} should never have been
> public.  Imagine we added to qdev GPIO properties and we used them like this
> in gpio_i2c:
>
>    /* gpio_out=N means "connect my 0th output pin to the parent's
>       N-th input pin.  */
>    DEFINE_PROP_GPIO_OUT("gpio_out", 0),
>
>    /* gpio_in=N means "connect my 0th input pin to the parent's
>       N-th output pin.  */
>    DEFINE_PROP_GPIO_IN("gpio_in", 0),
>    DEFINE_PROP_GPIO_IN("gpio_clk", 1)

I think it would be better to think of GPIO as a special (simple)
case of a generic desire to plug devices into each other (this is what
SystemC calls a "port": basically an interface (API) defined so that
two things can make calls to arbitrary callback functions on the other
end of a connection). How ports are wired up is a property of the
machine model, not a property of the device at either end, and I think
it is better for the syntax not to mix up ports and properties.
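
As a rough illustration of the "port" idea, the sketch below (plain C,
not QEMU's API; ToyPort, ToyLed, ToyButton and toy_wire are invented
names) keeps the connection outside both devices: the sender only owns
an unconnected output, the receiver only offers a callback, and a
separate machine-level step joins them.

```c
#include <assert.h>
#include <stddef.h>

/* A port is just a callback plus the state it should be called with. */
typedef void (*ToyPortFn)(void *opaque, int level);

typedef struct ToyPort {
    ToyPortFn fn;
    void *opaque;
} ToyPort;

/* Receiving device: latches the last level it was driven with. */
typedef struct ToyLed {
    int lit;
} ToyLed;

static void toy_led_set(void *opaque, int level)
{
    ToyLed *led = opaque;
    led->lit = (level != 0);
}

/* Sending device: owns an output port but does not know its peer. */
typedef struct ToyButton {
    ToyPort out;
} ToyButton;

static void toy_button_drive(ToyButton *button, int level)
{
    if (button->out.fn) {           /* an unconnected output is a no-op */
        button->out.fn(button->out.opaque, level);
    }
}

/* Machine-level wiring: the only place that knows both ends. */
static void toy_wire(ToyPort *out, ToyPortFn fn, void *opaque)
{
    out->fn = fn;
    out->opaque = opaque;
}
```

A GPIO line is then the special case where the callback takes a single
level; a richer port would carry a struct of several callbacks instead.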

Also, you want to be able to actually name your output pins, so
at a board level you can talk about wiring up sd_controller.wprot,
not sd_controller.gpio[3]... (Yes, this is to some extent "just syntax"
but I really don't want to end up with machine level modelling looking
like:

>    [device "gpio_i2c"]
>        gpio_out = 29
>        gpio_in = 3
>        gpio_clk = 4

...because that's just not really maintainable IMHO.)
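
One possible shape for named pins, sketched in plain C under made-up
names (toy_sd_pin_names and toy_pin_index are not QEMU functions, and
"cd" is an invented second pin): the device publishes a table of pin
names and the board resolves a name to an index once, at wiring time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pin table for an SD controller model.  "wprot" is the
 * name used in the discussion; "cd" (card detect) is invented here. */
static const char *const toy_sd_pin_names[] = { "wprot", "cd" };

#define TOY_SD_NUM_PINS \
    (sizeof(toy_sd_pin_names) / sizeof(toy_sd_pin_names[0]))

/* Resolve a symbolic pin name to its index, or -1 if it is unknown. */
static int toy_pin_index(const char *const *names, size_t n, const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

With something like this, board code or a config file can say "wprot"
and a typo shows up as a lookup failure rather than a silently wrong
pin number.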

>> Also having the interrupt controller be the board specific device
>> which you attach to sysbus seems totally wrong to me. This doesn't
>> match hardware at all -- the interrupt controller deals with
>> interrupt lines and isn't in the path for memory transactions at
>> all.
>
> Well, it is clear that buses should be modelled after the way data flows.
>  But what is data?  If data is "what is being written", buses should be
> modelled after the way memory transactions flow.  If data is "what is being
> made available", buses are modelled more after the way interrupts flow.
>  GPIO is a strange thing in the middle. :)

Actually I think we should really be modelling buses only where the
hardware actually has a bus, ie a coherent collection of signals
between multiple devices such that you could in theory plug in
different devices in different slots. The memory transaction
related connections are much more buslike than interrupt wiring.

-- PMM


* Re: [Qemu-devel] qdev for programmers writeup
  2011-07-11 14:44     ` Peter Maydell
@ 2011-07-11 15:29       ` Paolo Bonzini
  2011-07-11 16:47         ` Peter Maydell
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2011-07-11 15:29 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On 07/11/2011 04:44 PM, Peter Maydell wrote:
>> That's true, but the only way to plug in those device models would be with C
>> code.  You cannot just play with -device to reconfigure them.
>
> I think the C source level reuse is more important and more useful
> than plugging stuff around with -device, because in practice messing
> about with the memory map of an embedded board isn't something people
> typically want to do, because the kernel will have more-or-less
> hardcoded where the devices are anyhow.

That's just because ARM doesn't (yet?) do device trees... :)

> (Also if you have one bus type
> per board then you're still very limited in what you can do with -device
> because you can't plug in some random other sysbus device anyway.)

I'm not talking about one bus type per board!  I'm talking about as few 
as possible board-specific root devices, and sharing buses between 
boards as much as possible.

> So if
> we want to allow runtime configurability of boards like that we need
> to do it by providing runtime configurability of sysbus devices.

But we already have properties as a runtime configurability mechanism, 
and the additional functionality provided by SysBus is exactly to bypass 
it.  In that sense SysBus is bad (if used widely as it is now).

It is also bad because SysBus takes device metainformation (number of 
IRQ lines, size of MMIO and PIO areas) and puts it into a per-device 
structure.

> I think it would be better to think of GPIO as a special (simple)
> case of a generic desire to plug devices into each other (this is what
> SystemC calls a "port": basically an interface (API) defined so that
> two things can make calls to arbitrary callback functions on the other
> end of a connection). How ports are wired up is a property of the
> machine model, not a property of the device at either end

But in practice you tend to have tree-like relationships.  Even if it is 
just a two-level tree with a GPIO chip or interrupt controller at one 
end, and everything else at the other.

> Also, you want to be able to actually name your output pins, so
> at a board level you can talk about wiring up sd_controller.wprot,
> not sd_controller.gpio[3]... (Yes, this is to some extent "just syntax"
> but I really don't want to end up with machine level modelling looking
> like:

It _is_ really just syntax.  In my examples I used the raw values for 
GPIO pin numbers because that's what we have.  So, right now there is no 
way to use symbolic values, but nothing forbids adding one.  Or even 
automatically generating the QEMU .conf from something else.

If you are using C code, of course you have #defines/enums.

>> Well, it is clear that buses should be modelled after the way data flows.
>> But what is data?  If data is "what is being written", buses should be
>> modelled after the way memory transactions flow.  If data is "what is being
>> made available", buses are modelled more after the way interrupts flow.
>> GPIO is a strange thing in the middle. :)
>
> Actually I think we should really be modelling buses only where the
> hardware actually has a bus, ie a coherent collection of signals
> between multiple devices such that you could in theory plug in
> different devices in different slots. The memory transaction
> related connections are much more buslike than interrupt wiring.

I'm not sure, perhaps the naming is bad.  Abstracting from the name for 
a moment, unifying interrupts/GPIO with the qbus concept seemed natural 
to me in most of the cases I looked at.  And it makes the handling of 
GPIO in qdev less schizophrenic; qdev seems to treat the qbus (whatever 
it is) as the One True Mechanism to join devices, and then adds GPIO on 
the side while providing no way to configure it.  If you try to match 
the qdev tree with the tree of qemu_irq, the design becomes a lot more 
coherent.

Paolo


* Re: [Qemu-devel] qdev for programmers writeup
  2011-07-11 15:29       ` Paolo Bonzini
@ 2011-07-11 16:47         ` Peter Maydell
  2011-07-12 15:22           ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2011-07-11 16:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 11 July 2011 16:29, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 07/11/2011 04:44 PM, Peter Maydell wrote:
>> (Also if you have one bus type
>> per board then you're still very limited in what you can do with -device
>> because you can't plug in some random other sysbus device anyway.)
>
> I'm not talking about one bus type per board!  I'm talking about as few as
> possible board-specific root devices, and sharing buses between boards as
> much as possible.

Er, doesn't that just get you sysbus again? Also if you have a root device
and it's not the CPU then something's a bit odd. (The CPU lives above
the interrupt controller in the interrupt tree if you want to look at it
like that.)

>> So if
>> we want to allow runtime configurability of boards like that we need
>> to do it by providing runtime configurability of sysbus devices.
>
> But we already have properties as a runtime configurability mechanism, and
> the additional functionality provided by SysBus is exactly to bypass it.  In
> that sense SysBus is bad (if used widely as it is now).
>
> It is also bad because SysBus takes device metainformation (number of IRQ
> lines, size of MMIO and PIO areas) and puts it into a per-device structure.

I think the real reason so many devices use sysbus is that it is basically
"I'm a device and I support some gpio signals and some memory mappings",
which is just a very natural way to model a lot of things. It's also an
easy to use API for machine models to instantiate and wire up both IRQs
and memory mappings. If we want less use of sysbus then we need to have
the basic qdev device model have much better support for doing this.

>> I think it would be better to think of GPIO as a special (simple)
>> case of a generic desire to plug devices into each other (this is what
>> SystemC calls a "port": basically an interface (API) defined so that
>> two things can make calls to arbitrary callback functions on the other
>> end of a connection). How ports are wired up is a property of the
>> machine model, not a property of the device at either end
>
> But in practice you tend to have tree-like relationships.  Even if it is
> just a two-level tree with a GPIO chip or interrupt controller at one end,
> and everything else at the other.

Yes, but the tree-like relation of the memory transaction bus/interconnect
isn't necessarily the same as the tree-like relation of the interrupt
controllers; and the tree-like relation of the clocks might be different
again. And even if things tend to be tree-like, you still need to support
arbitrary inter-wiring for the corner cases (like "this MMC controller's
'card present' wire needs to connect to the board-register model's input").
You can model trees with arbitrary interconnections, but not vice-versa.
Insisting that interrupts are always a tree also rules out ever having
qemu support multiple distinct CPUs where some interrupt sources end up
going to more than one CPU; I'd rather we didn't bake that kind of
assumption into the core device model if possible.
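
The multi-CPU case can be sketched as a small fan-out node. This is a
toy in plain C with invented names (ToyIrq, ToyCpu, ToySplitter), not a
description of any existing QEMU device: one interrupt source drives
several sinks, so the wiring is a graph rather than a tree.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ToyIrqFn)(void *opaque, int level);

typedef struct ToyIrq {
    ToyIrqFn fn;
    void *opaque;
} ToyIrq;

/* Stand-in for a CPU's interrupt input. */
typedef struct ToyCpu {
    int irq_level;
} ToyCpu;

static void toy_cpu_set_irq(void *opaque, int level)
{
    ToyCpu *cpu = opaque;
    cpu->irq_level = (level != 0);
}

#define TOY_MAX_FANOUT 4

/* One input, several outputs: the source has more than one "parent". */
typedef struct ToySplitter {
    ToyIrq out[TOY_MAX_FANOUT];
    size_t count;
} ToySplitter;

/* Add a sink.  Returns 0 on success, -1 if all outputs are in use. */
static int toy_splitter_connect(ToySplitter *s, ToyIrqFn fn, void *opaque)
{
    if (s->count == TOY_MAX_FANOUT) {
        return -1;
    }
    s->out[s->count].fn = fn;
    s->out[s->count].opaque = opaque;
    s->count++;
    return 0;
}

/* Has the same signature as a sink, so a device can drive it directly. */
static void toy_splitter_set(void *opaque, int level)
{
    ToySplitter *s = opaque;
    for (size_t i = 0; i < s->count; i++) {
        s->out[i].fn(s->out[i].opaque, level);
    }
}
```

Nothing here needs a parent/child relation between the source and
either CPU, which is the property a strictly tree-shaped model would
lose.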

(This view of the world, which I accept is not really qdev's, says that
a bus is really just a conveniently named and manipulable bundle of
connections.)

-- PMM


* Re: [Qemu-devel] qdev for programmers writeup
  2011-07-11 16:47         ` Peter Maydell
@ 2011-07-12 15:22           ` Paolo Bonzini
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2011-07-12 15:22 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On 07/11/2011 06:47 PM, Peter Maydell wrote:
> On 11 July 2011 16:29, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 07/11/2011 04:44 PM, Peter Maydell wrote:
>>> (Also if you have one bus type
>>> per board then you're still very limited in what you can do with -device
>>> because you can't plug in some random other sysbus device anyway.)
>>
>> I'm not talking about one bus type per board!  I'm talking about as few as
>> possible board-specific root devices, and sharing buses between boards as
>> much as possible.
>
> Er, doesn't that just get you sysbus again?

It does get you a bus that can be reused by devices.  It doesn't get you 
a bus that is a pot-pourri of features, some of which are not even 
meaningful in the context of all boards (e.g. PIO), and some of which 
override the run-time reconfigurability mechanisms that qdev has built-in.

By the way, while it's true that run-time reconfigurability does not buy 
you much in terms of adding devices---at least without a device tree in 
the guest---it can help in terms of removing devices for debugging.

If a device only needs MMIO and no GPIO/IRQ pins, it probably can stay 
under SysBus.  However, I don't believe the magic MMIO functionality of 
SysBus is useful, and I do think it should be replaced by properties.

> Also if you have a root device
> and it's not the CPU then something's a bit odd. (The CPU lives above
> the interrupt controller in the interrupt tree if you want to look at it
> like that.)

If you consider the CPU to be hidden beyond sysbus, then yes, you do 
have CPU->SysBus->PIC.  It is interesting that in the PC the devices 
below SysBus are indeed mostly managing interrupts:

     CPU->SysBus->LAPIC(s)
                  IOAPIC
                  HPET
                  i440FX-pcihost
                  fw_cfg

I think the PC's fw_cfg device should move below the ISA bridge; and the 
HPET is there only because there is no single device for the northbridge 
chip.  It should perhaps be more like

     CPU->SysBus->LAPIC(s)
                  i440FX-nb->i440FX-pcihost
                             IOAPIC
                             HPET
                             i8259

> I think the real reason so many devices use sysbus is that it is basically
> "I'm a device and I support some gpio signals and some memory mappings",
> which is just a very natural way to model a lot of things.

I agree that sysbus is convenient sugar right now, and we need that 
sugar to be available at all levels (not just sysbus), but you don't 
need sysbus to express that.

There is actually one thing that I'd save in sysbus, and that is IRQs. 
That is because GPIO pins provided by qdev work in one direction only. 
If you want to have interrupt/GPIO sources both towards the children and 
towards the parent, it doesn't work well.  This is a nice niche that 
sysbus IRQs fit in; a GPIO chip can use gpio_in/gpio_out towards the 
children, and sysbus IRQs towards the parent, giving nice separation.
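
A minimal sketch of that separation, in plain C with invented names
(ToyGpioChip, ToyPic; this is not the qdev or sysbus API): the chip
takes per-pin inputs from its children and exposes a single summary
interrupt towards its parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*ToyIrqFn)(void *opaque, int level);

/* Child-facing side: up to 32 input pins.  Parent-facing side: one IRQ. */
typedef struct ToyGpioChip {
    uint32_t inputs;        /* bitmask of child lines currently high */
    ToyIrqFn parent_fn;     /* summary interrupt towards the parent */
    void *parent_opaque;
} ToyGpioChip;

/* Called on behalf of child n; updates the mask and the parent IRQ. */
static void toy_gpio_chip_set_input(ToyGpioChip *chip, unsigned n, int level)
{
    assert(n < 32);
    if (level) {
        chip->inputs |= UINT32_C(1) << n;
    } else {
        chip->inputs &= ~(UINT32_C(1) << n);
    }
    if (chip->parent_fn) {
        /* The parent line is high while any child line is high. */
        chip->parent_fn(chip->parent_opaque, chip->inputs != 0);
    }
}

/* Stand-in for the parent interrupt controller. */
typedef struct ToyPic {
    int line;
} ToyPic;

static void toy_pic_set_line(void *opaque, int level)
{
    ToyPic *pic = opaque;
    pic->line = (level != 0);
}
```

The OR-of-all-inputs policy is only an assumption for the example; a
real chip would typically add mask and edge/level configuration.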

> And even if things tend to be tree-like, you still need to support
> arbitrary inter-wiring for the corner cases (like "this MMC controller's
> 'card present' wire needs to connect to the board-register model's input").
> You can model trees with arbitrary interconnections, but not vice-versa.

Yes, any slot/socket mechanism for run-time reconfigurability of GPIO or 
IRQ connections needs to take into account the possibility of connecting 
siblings (or even completely disconnected devices).  Right now that is 
limited to C code.  But since a GPIO/IRQ is simply a pointer, adding 
such a mechanism would be just syntax to name the devices' GPIO/IRQ 
slots.

But in any case you will need a preferred topology defined somewhere, 
because code needs more than a bunch of qemu_irqs.  Since they know that 
the model is a tree, qdevified devices can exploit their parent-child 
relationship and you can use that to tie the parent and child in more 
specific ways with virtual functions.  It's quite fundamental.  This can 
stay even if you turn the preferred topology into a DAG, or into the 
superposition of many trees.

> (This view of the world, which I accept is not really qdev's, says that
> a bus is really just a conveniently named and manipulable bundle of
> connections.)

I see qbuses as a conveniently named and pluggable set of callbacks 
(including qemu_irq callbacks whenever that's convenient). 
Alternatively, it's the point where the children's sockets are joined to 
the children's slots (we're forced by qdev to make all sockets meet their 
slots in the same place---i.e. on the same qbus).

Paolo
