linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* virtio-pci new configuration proposal
@ 2011-11-02 18:49 Sasha Levin
  2011-11-02 19:08 ` Michael S. Tsirkin
  2011-11-03  1:58 ` Rusty Russell
  0 siblings, 2 replies; 27+ messages in thread
From: Sasha Levin @ 2011-11-02 18:49 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin; +Cc: linux-kernel, kvm, virtualization

This is a proposal for a new layout of the virtio-pci config space.

We will separate the current configuration into two: A virtio-pci common
configuration and a device specific configuration. This allows more flexibility
with adding features and makes usage easier, specifically in cases like the
ones in virtio-net where device specific configurations depend on device
specific features.

The preferred location of the configuration will be a MMIO BAR, therefore
several size optimizations such as the queue selector which were useful
on a PIO BAR were removed and linked lists were introduced to add
flexibility to existing structures in attempt to keep their size and
offsets constant and not dependent on features.

While this is not a complete specification, it describes all the main
structures and should be enough to make sure the idea is solid before
going into implementation specifics.

A new virtio-pci capability structure will be introduced. This structure
will be used to describe the basics of a virtio-pci device. The
structure is actually a PCI capability structure which wraps the new virtio-pci
capability structure:

+------------++------------+------------+------------+--------------+-----------------
| Bits       || 8          | 8          | 8          | 8            | Cap specific
+------------++------------+------------+------------+--------------+-----------------
| Read/Write || R          | R          | R          | R            | Cap specific
+------------++------------+------------+------------+--------------+-----------------
| Purpose    || Capability | Capability | virtio-pci | virtio-pci   | Cap specific
|            || ID         | Next PTR   | Cap ID     | Next Cap PTR |
+------------++------------+------------+------------+--------------+-----------------

Currently, only one virtio-pci capability is defined, and is the one that defines
the layout of the virtio-pci config space.

VIRTIO_PCI_C_LAYOUT:

+------------++---------------+---------------+-----------------+-----------------+
| Bits       || 29            | 3             | 29              | 3               |
+------------++---------------+---------------+-----------------+-----------------+
| Read/Write || R             | R             | R               | R               |
+------------++---------------+---------------+-----------------+-----------------+
| Purpose    || Common config | Common config | Device specific | Device specific |
|            || Offset        | BIR           | Offset          | BIR             |
+------------++---------------+---------------+-----------------+-----------------+

The combination of the offset and the BIR will point to the common and device
specific configuration spaces. These spaces may be pointed to by the same, or
by different BARs and may sit in the same or different MMIO regions.

The BIR is a PCI concept which describes which BAR should be used:

"Indicates which one of a function’s Base Address registers, located beginning
at 10h in Configuration space, is used to map the function’s MSI-X Table into
memory space.

BIR Value	Base Address register
0		10h
1		14h
2		18h
3		1Ch
4		20h
5		24h
6		Reserved
7		Reserved"

While the offset will provide a 32-bit aligned offset into the memory region
and will point to the appropriate config space.

First, we will look at the virtio-pci configuration space, which has been
simplified from the original.

virtio-pci header:

+------------++---------+--------+--------------+--------------+
| Bits       || 8       | 8      | 16           | 16           |
+------------++---------+--------+--------------+--------------+
| Read/Write || R+W     | R      | R            | R            |
+------------++---------+--------+--------------+--------------+
| Purpose    || Device  | ISR    | Queue        | virtio-pci   |
|            || Status  | Status | Table Ptr    | Features Ptr |
+------------++---------+--------+--------------+--------------+

Device and ISR status are the same as in the original, and two new fields
have been added. Queue table ptr is an offset from the beginning of the
config space to a table which describes all virtio queues, and is defined
below.
Virtio-pci features ptr will point to virtio-pci features, such features
will be placed in a virtio-pci feature struct which is defined as follows:

virtio-pci features:

+------------++------------+------------+-------------------
| Bits       || 8          | 16         | Feature specific
+------------++------------+------------+-------------------
| Read/Write || R          | R          | Feature specific
+------------++------------+------------+-------------------
| Purpose    || Feature    | Feature    | Feature specific
|            || ID         | Next PTR   | 
+------------++------------+------------+-------------------

For example, the virtio-pci MSI-X feature will look as follows:

VIRTIO_PCI_F_MSIX:

+------------++----------------+
| Bits       || 16             |
+------------++----------------+
| Read/Write || R+W            |
+------------++----------------+
| Purpose    || Configuration  |
|            || Vector         |
+------------++----------------+

The queue table is the table which will hold the definition of all virtio
queues. The virtio queue selector has been dropped and instead the queues
are listed in a simple list:

Queue table:

+------------++-------------------+--------+---------+--------------+--------------
| Bits       || 8      | 32       | 16     | 16      | 16           | More queues
+------------++-------------------+--------+---------+--------------+--------------
| Read/Write || R      | R+W      | R      | R+W     | R            | More queues
+------------++-------------------+--------+---------+--------------+--------------
| Purpose    || Queue  | Queue    | Queue  | Queue   | Queue        | More queues
|            || Count  | Address  | Size   | Notify  | Features Ptr |
+------------++-------------------+--------+---------+--------------+--------------

Besides the usual fields, each virtio queue will also have a feature ptr which
points to queue specific features. These features may be shared between queues
therefore it is not required to define a feature struct for each queue.

Queue features:

+------------++------------+------------+-------------------
| Bits       || 8          | 16         | Feature specific
+------------++------------+------------+-------------------
| Read/Write || R          | R          | Feature specific
+------------++------------+------------+-------------------
| Purpose    || Feature    | Feature    | Feature specific
|            || ID         | Next PTR   | 
+------------++------------+------------+-------------------

For example, the virtio queue MSI-X feature will look as follows and may be
shared between all virtio queues if we only have one vector for all queues.

VIRTIO_QUEUE_F_MSIX:

+------------++--------+
| Bits       || 16     |
+------------++--------+
| Read/Write || R+W    |
+------------++--------+
| Purpose    || Queue  |
| (MSI-X)    || Vector |
+------------++--------+



The second part is the device specific configuration space. This is
basically a linked list of features which may be activated or
deactivated by setting the active flag in each feature.

Device specific features:

+------------++------------+------------+----------+---------
| Bits       || 8          | 16         | 8        | Feature specific
+------------++------------+------------+----------+---------
| Read/Write || R          | R          | R+W      | Feature specific
+------------++------------+------------+----------+---------
| Purpose    || Feature    | Feature    | Feature  | Feature specific
|            || ID         | Next PTR   | Active   |
+------------++------------+------------+----------+---------

For example, the virtio-net MAC feature will look as follows:

VIRTIO_NET_F_MAC:

+------------++------------+
| Bits       || 48         |
+------------++------------+
| Read/Write || R          |
+------------++------------+
| Purpose    || virtio-net |
|            || MAC        | 
+------------++------------+

-- 

Sasha.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-02 19:08 ` Michael S. Tsirkin
@ 2011-11-02 19:07   ` Sasha Levin
  2011-11-02 19:14     ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Sasha Levin @ 2011-11-02 19:07 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Rusty Russell, linux-kernel, kvm, virtualization

On Wed, 2011-11-02 at 21:08 +0200, Michael S. Tsirkin wrote:
> On Wed, Nov 02, 2011 at 08:49:27PM +0200, Sasha Levin wrote:
> > This is a proposal for a new layout of the virtio-pci config space.
> > 
> > We will separate the current configuration into two: A virtio-pci common
> > configuration and a device specific configuration. This allows more flexibility
> > with adding features and makes usage easier, specifically in cases like the
> > ones in virtio-net where device specific configurations depend on device
> > specific features.
> > 
> > The preferred location of the configuration will be a MMIO BAR, therefore
> > several size optimizations such as the queue selector which were useful
> > on a PIO BAR were removed and linked lists were introduced to add
> > flexibility to existing structures in attempt to keep their size and
> > offsets constant and not dependent on features.
> > 
> > While this is not a complete specification, it describes all the main
> > structures and should be enough to make sure the idea is solid before
> > going into implementation specifics.
> > 
> > A new virtio-pci capability structure will be introduced. This structure
> > will be used to describe the basics of a virtio-pci device. The
> > structure is actually a PCI capability structure which wraps the new virtio-pci
> > capability structure:
> > 
> > +------------++------------+------------+------------+--------------+-----------------
> > | Bits       || 8          | 8          | 8          | 8            | Cap specific
> > +------------++------------+------------+------------+--------------+-----------------
> > | Read/Write || R          | R          | R          | R            | Cap specific
> > +------------++------------+------------+------------+--------------+-----------------
> > | Purpose    || Capability | Capability | virtio-pci | virtio-pci   | Cap specific
> > |            || ID         | Next PTR   | Cap ID     | Next Cap PTR |
> > +------------++------------+------------+------------+--------------+-----------------
> > 
> > Currently, only one virtio-pci capability is defined, and is the one that defines
> > the layout of the virtio-pci config space.
> > 
> > VIRTIO_PCI_C_LAYOUT:
> > 
> > +------------++---------------+---------------+-----------------+-----------------+
> > | Bits       || 29            | 3             | 29              | 3               |
> > +------------++---------------+---------------+-----------------+-----------------+
> > | Read/Write || R             | R             | R               | R               |
> > +------------++---------------+---------------+-----------------+-----------------+
> > | Purpose    || Common config | Common config | Device specific | Device specific |
> > |            || Offset        | BIR           | Offset          | BIR             |
> > +------------++---------------+---------------+-----------------+-----------------+
> 
> I'm implementing a different layout, with
> separate capabilities for common and device specific
> fields.

Which is what I have here, no?

-- 

Sasha.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-02 18:49 virtio-pci new configuration proposal Sasha Levin
@ 2011-11-02 19:08 ` Michael S. Tsirkin
  2011-11-02 19:07   ` Sasha Levin
  2011-11-03  1:58 ` Rusty Russell
  1 sibling, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-02 19:08 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Rusty Russell, linux-kernel, kvm, virtualization

On Wed, Nov 02, 2011 at 08:49:27PM +0200, Sasha Levin wrote:
> This is a proposal for a new layout of the virtio-pci config space.
> 
> We will separate the current configuration into two: A virtio-pci common
> configuration and a device specific configuration. This allows more flexibility
> with adding features and makes usage easier, specifically in cases like the
> ones in virtio-net where device specific configurations depend on device
> specific features.
> 
> The preferred location of the configuration will be a MMIO BAR, therefore
> several size optimizations such as the queue selector which were useful
> on a PIO BAR were removed and linked lists were introduced to add
> flexibility to existing structures in attempt to keep their size and
> offsets constant and not dependent on features.
> 
> While this is not a complete specification, it describes all the main
> structures and should be enough to make sure the idea is solid before
> going into implementation specifics.
> 
> A new virtio-pci capability structure will be introduced. This structure
> will be used to describe the basics of a virtio-pci device. The
> structure is actually a PCI capability structure which wraps the new virtio-pci
> capability structure:
> 
> +------------++------------+------------+------------+--------------+-----------------
> | Bits       || 8          | 8          | 8          | 8            | Cap specific
> +------------++------------+------------+------------+--------------+-----------------
> | Read/Write || R          | R          | R          | R            | Cap specific
> +------------++------------+------------+------------+--------------+-----------------
> | Purpose    || Capability | Capability | virtio-pci | virtio-pci   | Cap specific
> |            || ID         | Next PTR   | Cap ID     | Next Cap PTR |
> +------------++------------+------------+------------+--------------+-----------------
> 
> Currently, only one virtio-pci capability is defined, and is the one that defines
> the layout of the virtio-pci config space.
> 
> VIRTIO_PCI_C_LAYOUT:
> 
> +------------++---------------+---------------+-----------------+-----------------+
> | Bits       || 29            | 3             | 29              | 3               |
> +------------++---------------+---------------+-----------------+-----------------+
> | Read/Write || R             | R             | R               | R               |
> +------------++---------------+---------------+-----------------+-----------------+
> | Purpose    || Common config | Common config | Device specific | Device specific |
> |            || Offset        | BIR           | Offset          | BIR             |
> +------------++---------------+---------------+-----------------+-----------------+

I'm implementing a different layout, with
separate capabilities for common and device specific
fields.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-02 19:07   ` Sasha Levin
@ 2011-11-02 19:14     ` Michael S. Tsirkin
  0 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-02 19:14 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Rusty Russell, linux-kernel, kvm, virtualization

On Wed, Nov 02, 2011 at 09:07:12PM +0200, Sasha Levin wrote:
> On Wed, 2011-11-02 at 21:08 +0200, Michael S. Tsirkin wrote:
> > On Wed, Nov 02, 2011 at 08:49:27PM +0200, Sasha Levin wrote:
> > > This is a proposal for a new layout of the virtio-pci config space.
> > > 
> > > We will separate the current configuration into two: A virtio-pci common
> > > configuration and a device specific configuration. This allows more flexibility
> > > with adding features and makes usage easier, specifically in cases like the
> > > ones in virtio-net where device specific configurations depend on device
> > > specific features.
> > > 
> > > The preferred location of the configuration will be a MMIO BAR, therefore
> > > several size optimizations such as the queue selector which were useful
> > > on a PIO BAR were removed and linked lists were introduced to add
> > > flexibility to existing structures in attempt to keep their size and
> > > offsets constant and not dependent on features.
> > > 
> > > While this is not a complete specification, it describes all the main
> > > structures and should be enough to make sure the idea is solid before
> > > going into implementation specifics.
> > > 
> > > A new virtio-pci capability structure will be introduced. This structure
> > > will be used to describe the basics of a virtio-pci device. The
> > > structure is actually a PCI capability structure which wraps the new virtio-pci
> > > capability structure:
> > > 
> > > +------------++------------+------------+------------+--------------+-----------------
> > > | Bits       || 8          | 8          | 8          | 8            | Cap specific
> > > +------------++------------+------------+------------+--------------+-----------------
> > > | Read/Write || R          | R          | R          | R            | Cap specific
> > > +------------++------------+------------+------------+--------------+-----------------
> > > | Purpose    || Capability | Capability | virtio-pci | virtio-pci   | Cap specific
> > > |            || ID         | Next PTR   | Cap ID     | Next Cap PTR |
> > > +------------++------------+------------+------------+--------------+-----------------
> > > 
> > > Currently, only one virtio-pci capability is defined, and is the one that defines
> > > the layout of the virtio-pci config space.
> > > 
> > > VIRTIO_PCI_C_LAYOUT:
> > > 
> > > +------------++---------------+---------------+-----------------+-----------------+
> > > | Bits       || 29            | 3             | 29              | 3               |
> > > +------------++---------------+---------------+-----------------+-----------------+
> > > | Read/Write || R             | R             | R               | R               |
> > > +------------++---------------+---------------+-----------------+-----------------+
> > > | Purpose    || Common config | Common config | Device specific | Device specific |
> > > |            || Offset        | BIR           | Offset          | BIR             |
> > > +------------++---------------+---------------+-----------------+-----------------+
> > 
> > I'm implementing a different layout, with
> > separate capabilities for common and device specific
> > fields.
> 
> Which is what I have here, no?

No :) Give me a couple of hours I'll send an RFC patch so you can see
what I mean.

> -- 
> 
> Sasha.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-02 18:49 virtio-pci new configuration proposal Sasha Levin
  2011-11-02 19:08 ` Michael S. Tsirkin
@ 2011-11-03  1:58 ` Rusty Russell
  2011-11-03  8:33   ` Sasha Levin
  2011-11-03 11:03   ` Michael S. Tsirkin
  1 sibling, 2 replies; 27+ messages in thread
From: Rusty Russell @ 2011-11-03  1:58 UTC (permalink / raw)
  To: Sasha Levin, Michael S. Tsirkin
  Cc: linux-kernel, kvm, virtualization, Anthony Liguori

On Wed, 02 Nov 2011 20:49:27 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> This is a proposal for a new layout of the virtio-pci config space.
> 
> We will separate the current configuration into two: A virtio-pci common
> configuration and a device specific configuration. This allows more flexibility
> with adding features and makes usage easier, specifically in cases like the
> ones in virtio-net where device specific configurations depend on device
> specific features.

Thanks for this Sasha.  Several general comments:

1) How to we distinguish the two layouts?  In theory, we need to do this
   forever.  In practice we can deprecate the old layout in several
   years' time.

2) I don't think we want to turn the device-specific config into a
   linked list.  We haven't needed variable-length config (yet!), and
   it's (slightly) more complex.  That's also the part of the spec which
   is shared with non-PCI virtio implementations.

3) If we're changing the queue layout, it's a chance to fix a
   longstanding bug: let the guest notify the host of preferred
   queue size and alignment.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-03  1:58 ` Rusty Russell
@ 2011-11-03  8:33   ` Sasha Levin
  2011-11-03 12:46     ` Michael S. Tsirkin
  2011-11-04  9:44     ` Rusty Russell
  2011-11-03 11:03   ` Michael S. Tsirkin
  1 sibling, 2 replies; 27+ messages in thread
From: Sasha Levin @ 2011-11-03  8:33 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, linux-kernel, kvm, virtualization, Anthony Liguori

On Thu, 2011-11-03 at 12:28 +1030, Rusty Russell wrote: 
> On Wed, 02 Nov 2011 20:49:27 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> > This is a proposal for a new layout of the virtio-pci config space.
> > 
> > We will separate the current configuration into two: A virtio-pci common
> > configuration and a device specific configuration. This allows more flexibility
> > with adding features and makes usage easier, specifically in cases like the
> > ones in virtio-net where device specific configurations depend on device
> > specific features.
> 
> Thanks for this Sasha.  Several general comments:
> 
> 1) How to we distinguish the two layouts?  In theory, we need to do this
>    forever.  In practice we can deprecate the old layout in several
>    years' time.

Old layouts won't have the new virtio-pci cap structure in their PCI
config space.

> 2) I don't think we want to turn the device-specific config into a
>    linked list.  We haven't needed variable-length config (yet!), and
>    it's (slightly) more complex.  That's also the part of the spec which
>    is shared with non-PCI virtio implementations.

Variable length config wasn't used yet because space in the device
specific space was reserved for a feature even if that feature wasn't
used.

For example, the MAC feature reserved 6 bytes in the config space for
the MAC even if VIRTIO_NET_F_MAC wasn't enabled. Here we can just avoid
having it pollute the config space until it's enabled.

I don't think it'll have any impact on non-PCI implementations since the
"pointers" are simply offsets from the beginning of the config space,
and are not PCI specific in any way.

> 3) If we're changing the queue layout, it's a chance to fix a
>    longstanding bug: let the guest notify the host of preferred
>    queue size and alignment.

Yup, we can do that.

-- 

Sasha.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-03  1:58 ` Rusty Russell
  2011-11-03  8:33   ` Sasha Levin
@ 2011-11-03 11:03   ` Michael S. Tsirkin
  1 sibling, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-03 11:03 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Sasha Levin, linux-kernel, kvm, virtualization, Anthony Liguori

On Thu, Nov 03, 2011 at 12:28:46PM +1030, Rusty Russell wrote:
> 3) If we're changing the queue layout, it's a chance to fix a
>    longstanding bug: let the guest notify the host of preferred
>    queue size and alignment.

With device config split from the common one, we can
just tuck new fields at the tail of the common one.

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-03  8:33   ` Sasha Levin
@ 2011-11-03 12:46     ` Michael S. Tsirkin
  2011-11-03 13:19       ` Sasha Levin
  2011-11-04  9:44     ` Rusty Russell
  1 sibling, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-03 12:46 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Thu, Nov 03, 2011 at 10:33:23AM +0200, Sasha Levin wrote:
> On Thu, 2011-11-03 at 12:28 +1030, Rusty Russell wrote: 
> > On Wed, 02 Nov 2011 20:49:27 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> > > This is a proposal for a new layout of the virtio-pci config space.
> > > 
> > > We will separate the current configuration into two: A virtio-pci common
> > > configuration and a device specific configuration. This allows more flexibility
> > > with adding features and makes usage easier, specifically in cases like the
> > > ones in virtio-net where device specific configurations depend on device
> > > specific features.
> > 
> > Thanks for this Sasha.  Several general comments:
> > 
> > 1) How to we distinguish the two layouts?  In theory, we need to do this
> >    forever.  In practice we can deprecate the old layout in several
> >    years' time.
> 
> Old layouts won't have the new virtio-pci cap structure in their PCI
> config space.

What happens next time we want to change something?

> > 2) I don't think we want to turn the device-specific config into a
> >    linked list.  We haven't needed variable-length config (yet!), and
> >    it's (slightly) more complex.  That's also the part of the spec which
> >    is shared with non-PCI virtio implementations.
> 
> Variable length config wasn't used yet because space in the device
> specific space was reserved for a feature even if that feature wasn't
> used.

Not only that, also because it is messy to debug.  With fixed offsets
you just print the address/data and you know what it's doing.

> For example, the MAC feature reserved 6 bytes in the config space for
> the MAC even if VIRTIO_NET_F_MAC wasn't enabled. Here we can just avoid
> having it pollute the config space until it's enabled.
> 

This looks like overdesign to me. The only reason PCI spec
used capability list is
1. to save space
2. standard mechanism to discover features
You say explicitly space is not an issue, and you keep
feature bits around.

> I don't think it'll have any impact on non-PCI implementations since the
> "pointers" are simply offsets from the beginning of the config space,
> and are not PCI specific in any way.

The APIs use a single offset cookie to pass configuration.
If you want capabilities, that will have to be changed.

> > 3) If we're changing the queue layout, it's a chance to fix a
> >    longstanding bug: let the guest notify the host of preferred
> >    queue size and alignment.
> 
> Yup, we can do that.
> 
> -- 
> 
> Sasha.
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-03 12:46     ` Michael S. Tsirkin
@ 2011-11-03 13:19       ` Sasha Levin
  2011-11-03 13:48         ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Sasha Levin @ 2011-11-03 13:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Thu, 2011-11-03 at 14:46 +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 03, 2011 at 10:33:23AM +0200, Sasha Levin wrote:
> > On Thu, 2011-11-03 at 12:28 +1030, Rusty Russell wrote: 
> > > On Wed, 02 Nov 2011 20:49:27 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> > > > This is a proposal for a new layout of the virtio-pci config space.
> > > > 
> > > > We will separate the current configuration into two: A virtio-pci common
> > > > configuration and a device specific configuration. This allows more flexibility
> > > > with adding features and makes usage easier, specifically in cases like the
> > > > ones in virtio-net where device specific configurations depend on device
> > > > specific features.
> > > 
> > > Thanks for this Sasha.  Several general comments:
> > > 
> > > 1) How to we distinguish the two layouts?  In theory, we need to do this
> > >    forever.  In practice we can deprecate the old layout in several
> > >    years' time.
> > 
> > Old layouts won't have the new virtio-pci cap structure in their PCI
> > config space.
> 
> What happens next time we want to change something?

Thats why there are virtio-pci cap values. This layout cap was defined
as VIRTIO_PCI_C_LAYOUT. If for example we want to provide a whole
different layout, we can define something similar to
VIRTIO_PCI_C_LAYOUT_V2 and use it in newer drivers (we can also add a
version field to the original layout ofcourse).

This also allows an easy way to provide backwards compatibility by
specifying many layout definitions and letting the driver choose the
latest layout he can. This would also allow to make larger changes
easier as it allows you to have several different layout configs in the
same code.

> 
> > > 2) I don't think we want to turn the device-specific config into a
> > >    linked list.  We haven't needed variable-length config (yet!), and
> > >    it's (slightly) more complex.  That's also the part of the spec which
> > >    is shared with non-PCI virtio implementations.
> > 
> > Variable length config wasn't used yet because space in the device
> > specific space was reserved for a feature even if that feature wasn't
> > used.
> 
> Not only that, also because it is messy to debug.  With fixed offsets
> you just print the address/data and you know what it's doing.
> 
> > For example, the MAC feature reserved 6 bytes in the config space for
> > the MAC even if VIRTIO_NET_F_MAC wasn't enabled. Here we can just avoid
> > having it pollute the config space until it's enabled.
> > 
> 
> This looks like overdesign to me. The only reason PCI spec
> used capability list is
> 1. to save space
> 2. standard mechanism to discover features
> You say explicitly space is not an issue, and you keep
> feature bits around.

Okay, I agree with not doing linked lists for device specific features.

I do think we need them for virtio-pci itself to handle features such as
MSI-X.

> > I don't think it'll have any impact on non-PCI implementations since the
> > "pointers" are simply offsets from the beginning of the config space,
> > and are not PCI specific in any way.
> 
> The APIs use a single offset cookie to pass configuration.
> If you want capabilities, that will have to be changed.
> 
> > > 3) If we're changing the queue layout, it's a chance to fix a
> > >    longstanding bug: let the guest notify the host of preferred
> > >    queue size and alignment.
> > 
> > Yup, we can do that.
> > 
> > -- 
> > 
> > Sasha.
> > 

-- 

Sasha.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-03 13:19       ` Sasha Levin
@ 2011-11-03 13:48         ` Michael S. Tsirkin
  0 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-03 13:48 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Thu, Nov 03, 2011 at 03:19:01PM +0200, Sasha Levin wrote:
> On Thu, 2011-11-03 at 14:46 +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 03, 2011 at 10:33:23AM +0200, Sasha Levin wrote:
> > > On Thu, 2011-11-03 at 12:28 +1030, Rusty Russell wrote: 
> > > > On Wed, 02 Nov 2011 20:49:27 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> > > > > This is a proposal for a new layout of the virtio-pci config space.
> > > > > 
> > > > > We will separate the current configuration into two: A virtio-pci common
> > > > > configuration and a device specific configuration. This allows more flexibility
> > > > > with adding features and makes usage easier, specifically in cases like the
> > > > > ones in virtio-net where device specific configurations depend on device
> > > > > specific features.
> > > > 
> > > > Thanks for this Sasha.  Several general comments:
> > > > 
> > > > 1) How to we distinguish the two layouts?  In theory, we need to do this
> > > >    forever.  In practice we can deprecate the old layout in several
> > > >    years' time.
> > > 
> > > Old layouts won't have the new virtio-pci cap structure in their PCI
> > > config space.
> > 
> > What happens next time we want to change something?
> 
> Thats why there are virtio-pci cap values. This layout cap was defined
> as VIRTIO_PCI_C_LAYOUT. If for example we want to provide a whole
> different layout, we can define something similar to
> VIRTIO_PCI_C_LAYOUT_V2 and use it in newer drivers (we can also add a
> version field to the original layout ofcourse).
> 
> This also allows an easy way to provide backwards compatibility by
> specifying many layout definitions and letting the driver choose the
> latest layout he can. This would also allow to make larger changes
> easier as it allows you to have several different layout configs in the
> same code.

This plan does not make me happy. Versioning is much messier than
features to support, and much harder for downstreams to
cherry-pick.

> > 
> > > > 2) I don't think we want to turn the device-specific config into a
> > > >    linked list.  We haven't needed variable-length config (yet!), and
> > > >    it's (slightly) more complex.  That's also the part of the spec which
> > > >    is shared with non-PCI virtio implementations.
> > > 
> > > Variable length config wasn't used yet because space in the device
> > > specific space was reserved for a feature even if that feature wasn't
> > > used.
> > 
> > Not only that, also because it is messy to debug.  With fixed offsets
> > you just print the address/data and you know what it's doing.
> > 
> > > For example, the MAC feature reserved 6 bytes in the config space for
> > > the MAC even if VIRTIO_NET_F_MAC wasn't enabled. Here we can just avoid
> > > having it pollute the config space until it's enabled.
> > > 
> > 
> > This looks like overdesign to me. The only reason PCI spec
> > used capability list is
> > 1. to save space
> > 2. standard mechanism to discover features
> > You say explicitly space is not an issue, and you keep
> > feature bits around.
> 
> Okay, I agree with not doing linked lists for device specific features.
> 
> I do think we need them for virtio-pci itself to handle features such as
> MSI-X.

It's definitely easier for virtio-pci than for devices.
But let's address the motivation point.
Do you expect a need to have a huge structure there, like
megabytes in size, so space will be an issue again?

> > > I don't think it'll have any impact on non-PCI implementations since the
> > > "pointers" are simply offsets from the beginning of the config space,
> > > and are not PCI specific in any way.
> > 
> > The APIs use a single offset cookie to pass configuration.
> > If you want capabilities, that will have to be changed.
> > 
> > > > 3) If we're changing the queue layout, it's a chance to fix a
> > > >    longstanding bug: let the guest notify the host of preferred
> > > >    queue size and alignment.
> > > 
> > > Yup, we can do that.
> > > 
> > > -- 
> > > 
> > > Sasha.
> > > 
> 
> -- 
> 
> Sasha.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-03  8:33   ` Sasha Levin
  2011-11-03 12:46     ` Michael S. Tsirkin
@ 2011-11-04  9:44     ` Rusty Russell
  2011-11-04 11:40       ` Michael S. Tsirkin
  1 sibling, 1 reply; 27+ messages in thread
From: Rusty Russell @ 2011-11-04  9:44 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Michael S. Tsirkin, linux-kernel, kvm, virtualization, Anthony Liguori

On Thu, 03 Nov 2011 10:33:23 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> On Thu, 2011-11-03 at 12:28 +1030, Rusty Russell wrote: 
> > 2) I don't think we want to turn the device-specific config into a
> >    linked list.  We haven't needed variable-length config (yet!), and
> >    it's (slightly) more complex.  That's also the part of the spec which
> >    is shared with non-PCI virtio implementations.
> 
> Variable length config wasn't used yet because space in the device
> specific space was reserved for a feature even if that feature wasn't
> used.
> 
> For example, the MAC feature reserved 6 bytes in the config space for
> the MAC even if VIRTIO_NET_F_MAC wasn't enabled. Here we can just avoid
> having it pollute the config space until it's enabled.

Exactly.  But we haven't had a problem so far; but we don't put
arbitrarily large fields in there.

> I don't think it'll have any impact on non-PCI implementations since the
> "pointers" are simply offsets from the beginning of the config space,
> and are not PCI specific in any way.

But the drivers currently just use offsetof() to access it, eg:

	if (virtio_config_val_len(vdev, VIRTIO_NET_F_MAC,
				  offsetof(struct virtio_net_config, mac),
				  dev->dev_addr, dev->addr_len) < 0)

That would have to change, and that means a change for drivers and for
the non-PCI implementations.

Hence I think this is a step too far.

> > 3) If we're changing the queue layout, it's a chance to fix a
> >    longstanding bug: let the guest notify the host of preferred
> >    queue size and alignment.
> 
> Yup, we can do that.

The seabios guys will definitely thank you!

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-04  9:44     ` Rusty Russell
@ 2011-11-04 11:40       ` Michael S. Tsirkin
  2011-11-04 12:32         ` Sasha Levin
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-04 11:40 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Sasha Levin, linux-kernel, kvm, virtualization, Anthony Liguori

On Fri, Nov 04, 2011 at 08:14:43PM +1030, Rusty Russell wrote:
> > > 3) If we're changing the queue layout, it's a chance to fix a
> > >    longstanding bug: let the guest notify the host of preferred
> > >    queue size and alignment.
> > 
> > Yup, we can do that.

We don't need to change all of layout for that - just add another field
in the common config structure to supply the alignment.

> The seabios guys will definitely thank you!

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-04 11:40       ` Michael S. Tsirkin
@ 2011-11-04 12:32         ` Sasha Levin
  2011-11-04 13:51           ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Sasha Levin @ 2011-11-04 12:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Fri, 2011-11-04 at 13:40 +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 04, 2011 at 08:14:43PM +1030, Rusty Russell wrote:
> > > > 3) If we're changing the queue layout, it's a chance to fix a
> > > >    longstanding bug: let the guest notify the host of preferred
> > > >    queue size and alignment.
> > > 
> > > Yup, we can do that.
> 
> We don't need to change all of layout for that - just add another field
> in the common config structure to supply the alignment.

How would you do it without changing the layout? Add another optional
field at the end which will shift offsets based on whether the host and
guest support this new feature or not?

This leads to 3 different things which now shift config offsets around.

As you said, the PCI cap list was introduced both to save space (which
is not the motivation here), and because it's a very efficient and easy
way to manage optional features without requiring tricks which move
offsets around like we do now.

-- 

Sasha.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-04 12:32         ` Sasha Levin
@ 2011-11-04 13:51           ` Michael S. Tsirkin
  2011-11-04 13:53             ` Sasha Levin
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-04 13:51 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Fri, Nov 04, 2011 at 02:32:19PM +0200, Sasha Levin wrote:
> On Fri, 2011-11-04 at 13:40 +0200, Michael S. Tsirkin wrote:
> > On Fri, Nov 04, 2011 at 08:14:43PM +1030, Rusty Russell wrote:
> > > > > 3) If we're changing the queue layout, it's a chance to fix a
> > > > >    longstanding bug: let the guest notify the host of preferred
> > > > >    queue size and alignment.
> > > > 
> > > > Yup, we can do that.
> > 
> > We don't need to change all of layout for that - just add another field
> > in the common config structure to supply the alignment.
> 
> How would you do it without changing the layout? Add another optional
> field at the end which will shift offsets based on whether the host and
> guest support this new feature or not?
> 
> This leads to 3 different things which now shift config offsets around.

No. Just put the field at offset 24 from the offset specified
by VIRTIO_PCI_CAP_COMMON_CFG.

> As you said, the PCI cap list was introduced both to save space (which
> is not the motivation here), and because it's a very efficient

It's actually pretty inefficient - there's an overhead of 3 bytes for
each vendor specific option.

> and easy way to manage optional features without requiring tricks
> which move offsets around like we do now.

Tricks with offsets only appeared because we had datapath, device
specific and common config in the same place.
feature list isn't needed to fix that.

> -- 
> 
> Sasha.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-04 13:51           ` Michael S. Tsirkin
@ 2011-11-04 13:53             ` Sasha Levin
  2011-11-04 14:23               ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Sasha Levin @ 2011-11-04 13:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Fri, 2011-11-04 at 15:51 +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 04, 2011 at 02:32:19PM +0200, Sasha Levin wrote:
> > On Fri, 2011-11-04 at 13:40 +0200, Michael S. Tsirkin wrote:
> > > On Fri, Nov 04, 2011 at 08:14:43PM +1030, Rusty Russell wrote:
> > > > > > 3) If we're changing the queue layout, it's a chance to fix a
> > > > > >    longstanding bug: let the guest notify the host of preferred
> > > > > >    queue size and alignment.
> > > > > 
> > > > > Yup, we can do that.
> > > 
> > > We don't need to change all of layout for that - just add another field
> > > in the common config structure to supply the alignment.
> > 
> > How would you do it without changing the layout? Add another optional
> > field at the end which will shift offsets based on whether the host and
> > guest support this new feature or not?
> > 
> > This leads to 3 different things which now shift config offsets around.
> 
> No. Just put the field at offset 24 from the offset specified
> by VIRTIO_PCI_CAP_COMMON_CFG.

Two questions here:

- What about backwards compatibility? How would the config space look
when we're not using the new layout?
- How does it work with 64 bit features which are also located there?

> > As you said, the PCI cap list was introduced both to save space (which
> > is not the motivation here), and because it's a very efficient
> 
> It's actually pretty inefficient - there's an overhead of 3 bytes for
> each vendor specific option.

It's efficient because while you pay a small price for each optional
option it also means that that option is optional and won't clutter the
config space if it's not really in use.

Think of how the PCI config space would look if all those caps wouldn't
have been optional and would instead all of them would have just have
been attached to the end of the config space.

> 
> > and easy way to manage optional features without requiring tricks
> > which move offsets around like we do now.
> 
> Tricks with offsets only appeared because we had datapath, device
> specific and common config in the same place.
> feature list isn't needed to fix that.
> 
> > -- 
> > 
> > Sasha.

-- 

Sasha.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-04 13:53             ` Sasha Levin
@ 2011-11-04 14:23               ` Michael S. Tsirkin
  2011-11-04 14:53                 ` Sasha Levin
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-04 14:23 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Fri, Nov 04, 2011 at 03:53:24PM +0200, Sasha Levin wrote:
> On Fri, 2011-11-04 at 15:51 +0200, Michael S. Tsirkin wrote:
> > On Fri, Nov 04, 2011 at 02:32:19PM +0200, Sasha Levin wrote:
> > > On Fri, 2011-11-04 at 13:40 +0200, Michael S. Tsirkin wrote:
> > > > On Fri, Nov 04, 2011 at 08:14:43PM +1030, Rusty Russell wrote:
> > > > > > > 3) If we're changing the queue layout, it's a chance to fix a
> > > > > > >    longstanding bug: let the guest notify the host of preferred
> > > > > > >    queue size and alignment.
> > > > > > 
> > > > > > Yup, we can do that.
> > > > 
> > > > We don't need to change all of layout for that - just add another field
> > > > in the common config structure to supply the alignment.
> > > 
> > > How would you do it without changing the layout? Add another optional
> > > field at the end which will shift offsets based on whether the host and
> > > guest support this new feature or not?
> > > 
> > > This leads to 3 different things which now shift config offsets around.
> > 
> > No. Just put the field at offset 24 from the offset specified
> > by VIRTIO_PCI_CAP_COMMON_CFG.
> 
> Two questions here:
> 
> - What about backwards compatibility? How would the config space look
> when we're not using the new layout?

Exactly as it does now. You don't get to tweak alignment then.

> - How does it work with 64 bit features which are also located there?

Basically each field gets an offset.  E.g.
24 - features
28 - queue alignment

> > > As you said, the PCI cap list was introduced both to save space (which
> > > is not the motivation here), and because it's a very efficient
> > 
> > It's actually pretty inefficient - there's an overhead of 3 bytes for
> > each vendor specific option.
> 
> It's efficient because while you pay a small price for each optional
> option it also means that that option is optional and won't clutter the
> config space if it's not really in use.

I guess my assumption is that most options will be in use,
not discarded dead-ends.

> Think of how the PCI config space would look if all those caps wouldn't
> have been optional and would instead all of them would have just have
> been attached to the end of the config space.

It started out this way, but then they started running out
of space - it's only 256 bytes - so the capability mechanism
was invented.


> > 
> > > and easy way to manage optional features without requiring tricks
> > > which move offsets around like we do now.
> > 
> > Tricks with offsets only appeared because we had datapath, device
> > specific and common config in the same place.
> > feature list isn't needed to fix that.
> > 
> > > -- 
> > > 
> > > Sasha.
> 
> -- 
> 
> Sasha.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-04 14:23               ` Michael S. Tsirkin
@ 2011-11-04 14:53                 ` Sasha Levin
  2011-11-06  7:30                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Sasha Levin @ 2011-11-04 14:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Fri, 2011-11-04 at 16:23 +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 04, 2011 at 03:53:24PM +0200, Sasha Levin wrote:
> > On Fri, 2011-11-04 at 15:51 +0200, Michael S. Tsirkin wrote:
> > > On Fri, Nov 04, 2011 at 02:32:19PM +0200, Sasha Levin wrote:
> > > > On Fri, 2011-11-04 at 13:40 +0200, Michael S. Tsirkin wrote:
> > > > > On Fri, Nov 04, 2011 at 08:14:43PM +1030, Rusty Russell wrote:
> > > > > > > > 3) If we're changing the queue layout, it's a chance to fix a
> > > > > > > >    longstanding bug: let the guest notify the host of preferred
> > > > > > > >    queue size and alignment.
> > > > > > > 
> > > > > > > Yup, we can do that.
> > > > > 
> > > > > We don't need to change all of layout for that - just add another field
> > > > > in the common config structure to supply the alignment.
> > > > 
> > > > How would you do it without changing the layout? Add another optional
> > > > field at the end which will shift offsets based on whether the host and
> > > > guest support this new feature or not?
> > > > 
> > > > This leads to 3 different things which now shift config offsets around.
> > > 
> > > No. Just put the field at offset 24 from the offset specified
> > > by VIRTIO_PCI_CAP_COMMON_CFG.
> > 
> > Two questions here:
> > 
> > - What about backwards compatibility? How would the config space look
> > when we're not using the new layout?
> 
> Exactly as it does now. You don't get to tweak alignment then.
> 
> > - How does it work with 64 bit features which are also located there?
> 
> Basically each field gets an offset.  E.g.
> 24 - features
> 28 - queue alignment
> 
> > > > As you said, the PCI cap list was introduced both to save space (which
> > > > is not the motivation here), and because it's a very efficient
> > > 
> > > It's actually pretty inefficient - there's an overhead of 3 bytes for
> > > each vendor specific option.
> > 
> > It's efficient because while you pay a small price for each optional
> > option it also means that that option is optional and won't clutter the
> > config space if it's not really in use.
> 
> I guess my assumption is that most options will be in use,
> not discarded dead-ends.

I don't know about that. 64 bit features would be pretty rare for now -
and I don't think that setting the alignment will be also enabled by
default.

I think that we're looking at it differently because I assume that any
feature we add at this point would be optional and used only in specific
scenarios, while you think that everything added will be used most of
the time.

> > Think of how the PCI config space would look if all those caps wouldn't
> > have been optional and would instead all of them would have just have
> > been attached to the end of the config space.
> 
> It started out this way, but then they started running out
> of space - it's only 256 bytes - so the capability mechanism
> was invented.
> 
> 
> > > 
> > > > and easy way to manage optional features without requiring tricks
> > > > which move offsets around like we do now.
> > > 
> > > Tricks with offsets only appeared because we had datapath, device
> > > specific and common config in the same place.
> > > feature list isn't needed to fix that.
> > > 
> > > > -- 
> > > > 
> > > > Sasha.
> > 
> > -- 
> > 
> > Sasha.

-- 

Sasha.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-04 14:53                 ` Sasha Levin
@ 2011-11-06  7:30                   ` Michael S. Tsirkin
  2011-11-06 20:24                     ` Sasha Levin
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-06  7:30 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Fri, Nov 04, 2011 at 04:53:05PM +0200, Sasha Levin wrote:
> > > > > As you said, the PCI cap list was introduced both to save space (which
> > > > > is not the motivation here), and because it's a very efficient
> > > > 
> > > > It's actually pretty inefficient - there's an overhead of 3 bytes for
> > > > each vendor specific option.
> > > 
> > > It's efficient because while you pay a small price for each optional
> > > option it also means that that option is optional and won't clutter the
> > > config space if it's not really in use.
> > 
> > I guess my assumption is that most options will be in use,
> > not discarded dead-ends.
> 
> I don't know about that. 64 bit features would be pretty rare for now -
> and I don't think that setting the alignment will be also enabled by
> default.

Setting the alignment might not be *used* by default but
I think it must be enabled by default to allow bios access.

> I think that we're looking at it differently because I assume that any
> feature we add at this point would be optional and used only in specific
> scenarios, while you think that everything added will be used most of
> the time.

Options must often be present even if not used. For example, as device
has no way to know whether a guest will want to program alignment, it
has to make that option available.

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-06  7:30                   ` Michael S. Tsirkin
@ 2011-11-06 20:24                     ` Sasha Levin
  2011-11-06 21:38                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Sasha Levin @ 2011-11-06 20:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Sun, 2011-11-06 at 09:30 +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 04, 2011 at 04:53:05PM +0200, Sasha Levin wrote:
> > > > > > As you said, the PCI cap list was introduced both to save space (which
> > > > > > is not the motivation here), and because it's a very efficient
> > > > > 
> > > > > It's actually pretty inefficient - there's an overhead of 3 bytes for
> > > > > each vendor specific option.
> > > > 
> > > > It's efficient because while you pay a small price for each optional
> > > > option it also means that that option is optional and won't clutter the
> > > > config space if it's not really in use.
> > > 
> > > I guess my assumption is that most options will be in use,
> > > not discarded dead-ends.
> > 
> > I don't know about that. 64 bit features would be pretty rare for now -
> > and I don't think that setting the alignment will be also enabled by
> > default.
> 
> Setting the alignment might not be *used* by default but
> I think it must be enabled by default to allow bios access.
> 
> > I think that we're looking at it differently because I assume that any
> > feature we add at this point would be optional and used only in specific
> > scenarios, while you think that everything added will be used most of
> > the time.
> 
> Options must often be present even if not used. For example, as device
> has no way to know whether a guest will want to program alignment, it
> has to make that option available.

They should be enabled, but heres the difference between the two
approaches is that if it's cap it simply won't be there, while in the
other case it would just remain empty at some random offset of the
struct.

-- 

Sasha.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-06 20:24                     ` Sasha Levin
@ 2011-11-06 21:38                       ` Michael S. Tsirkin
  2011-11-07  5:16                         ` Rusty Russell
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-06 21:38 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Rusty Russell, linux-kernel, kvm, virtualization, Anthony Liguori

On Sun, Nov 06, 2011 at 10:24:57PM +0200, Sasha Levin wrote:
> On Sun, 2011-11-06 at 09:30 +0200, Michael S. Tsirkin wrote:
> > On Fri, Nov 04, 2011 at 04:53:05PM +0200, Sasha Levin wrote:
> > > > > > > As you said, the PCI cap list was introduced both to save space (which
> > > > > > > is not the motivation here), and because it's a very efficient
> > > > > > 
> > > > > > It's actually pretty inefficient - there's an overhead of 3 bytes for
> > > > > > each vendor specific option.
> > > > > 
> > > > > It's efficient because while you pay a small price for each optional
> > > > > option it also means that that option is optional and won't clutter the
> > > > > config space if it's not really in use.
> > > > 
> > > > I guess my assumption is that most options will be in use,
> > > > not discarded dead-ends.
> > > 
> > > I don't know about that. 64 bit features would be pretty rare for now -
> > > and I don't think that setting the alignment will be also enabled by
> > > default.
> > 
> > Setting the alignment might not be *used* by default but
> > I think it must be enabled by default to allow bios access.
> > 
> > > I think that we're looking at it differently because I assume that any
> > > feature we add at this point would be optional and used only in specific
> > > scenarios, while you think that everything added will be used most of
> > > the time.
> > 
> > Options must often be present even if not used. For example, as device
> > has no way to know whether a guest will want to program alignment, it
> > has to make that option available.
> 
> They should be enabled, but heres the difference between the two
> approaches is that if it's cap it simply won't be there,

How can it not be there? They layout is specified by host,
not by guest.

> while in the
> other case it would just remain empty at some random offset of the
> struct.
> 
> -- 
> 
> Sasha.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-06 21:38                       ` Michael S. Tsirkin
@ 2011-11-07  5:16                         ` Rusty Russell
  2011-11-07 21:14                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Rusty Russell @ 2011-11-07  5:16 UTC (permalink / raw)
  To: Michael S. Tsirkin, Sasha Levin
  Cc: linux-kernel, kvm, virtualization, Anthony Liguori

On Sun, 6 Nov 2011 23:38:49 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Nov 06, 2011 at 10:24:57PM +0200, Sasha Levin wrote:
> > On Sun, 2011-11-06 at 09:30 +0200, Michael S. Tsirkin wrote:
> > > On Fri, Nov 04, 2011 at 04:53:05PM +0200, Sasha Levin wrote:
> > > > > I guess my assumption is that most options will be in use,
> > > > > not discarded dead-ends.
> > > > 
> > > > I don't know about that. 64 bit features would be pretty rare for now -
> > > > and I don't think that setting the alignment will be also enabled by
> > > > default.

The truth is somewhere between.  New features tend to be used, and
importantly, they have to be offered.

Anyway, the *per-device* config part will be contiguous.  So we're
talking about pci-specific features.

So far, the only three things make sense to have in a capability list:
MSI-X, the upper 32 feature bits, and the per-device config.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-07  5:16                         ` Rusty Russell
@ 2011-11-07 21:14                           ` Michael S. Tsirkin
  2011-11-07 23:53                             ` Rusty Russell
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-07 21:14 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Sasha Levin, linux-kernel, kvm, virtualization, Anthony Liguori

On Mon, Nov 07, 2011 at 03:46:23PM +1030, Rusty Russell wrote:
> On Sun, 6 Nov 2011 23:38:49 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Nov 06, 2011 at 10:24:57PM +0200, Sasha Levin wrote:
> > > On Sun, 2011-11-06 at 09:30 +0200, Michael S. Tsirkin wrote:
> > > > On Fri, Nov 04, 2011 at 04:53:05PM +0200, Sasha Levin wrote:
> > > > > > I guess my assumption is that most options will be in use,
> > > > > > not discarded dead-ends.
> > > > > 
> > > > > I don't know about that. 64 bit features would be pretty rare for now -
> > > > > and I don't think that setting the alignment will be also enabled by
> > > > > default.
> 
> The truth is somewhere between.  New features tend to be used, and
> importantly, they have to be offered.
> 
> Anyway, the *per-device* config part will be contiguous.  So we're
> talking about pci-specific features.
> 
> So far, the only three things make sense to have in a capability list:
> MSI-X, the upper 32 feature bits, and the per-device config.
> 
> Thanks,
> Rusty.

You mean the queue # to MSI-X vector mapping?
One thing to remember is that it must be in the same type of BAR as
the queue selection, since by PCI rules MMIO writes aren't I think
ordered with PIO writes (it doesn't matter with KVM but might
with another hypervisor).

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-07 21:14                           ` Michael S. Tsirkin
@ 2011-11-07 23:53                             ` Rusty Russell
  2011-11-08  6:32                               ` Michael S. Tsirkin
  2011-11-08 14:15                               ` Sasha Levin
  0 siblings, 2 replies; 27+ messages in thread
From: Rusty Russell @ 2011-11-07 23:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sasha Levin, linux-kernel, kvm, virtualization, Anthony Liguori

On Mon, 7 Nov 2011 23:14:14 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Nov 07, 2011 at 03:46:23PM +1030, Rusty Russell wrote:
> > So far, the only three things make sense to have in a capability list:
> > MSI-X, the upper 32 feature bits, and the per-device config.
> 
> You mean the queue # to MSI-X vector mapping?

Yep.

> One thing to remember is that it must be in the same type of BAR as
> the queue selection, since by PCI rules MMIO writes aren't I think
> ordered with PIO writes (it doesn't matter with KVM but might
> with another hypervisor).

OK, I'm slowly getting up to speed.

Next dumb q: Sasha, why did you introduce the idea of a separate
virtio-pci capability list, rather than just using PCI capabilities
directly?  ie. instead of VIRTIO_PCI_C_LAYOUT, have VIRTIO_PCI_CORE,
VIRTIO_PCI_MSIX, VIRTIO_PCI_DEV_SPECIFIC?

Is it because we really want this stuff outside the PCI configuration
space?  Even so, should we just use the PCI cap list, and have each
cap entry just contain a BIR & offset?

Thanks,
Rusty.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-07 23:53                             ` Rusty Russell
@ 2011-11-08  6:32                               ` Michael S. Tsirkin
  2011-11-08 10:21                                 ` Rusty Russell
  2011-11-08 14:15                               ` Sasha Levin
  1 sibling, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-08  6:32 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Sasha Levin, linux-kernel, kvm, virtualization, Anthony Liguori

On Tue, Nov 08, 2011 at 10:23:33AM +1030, Rusty Russell wrote:
> Even so, should we just use the PCI cap list, and have each
> cap entry just contain a BIR & offset?
> 
> Thanks,
> Rusty.

And size :)
I say, Rusty, did you see my patch? That's what it's doing,
I also addressed the issue that with KVM on x86, PIO is faster
than MMIO so we need to use it for notifications/isr.

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-08  6:32                               ` Michael S. Tsirkin
@ 2011-11-08 10:21                                 ` Rusty Russell
  2011-11-08 21:31                                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Rusty Russell @ 2011-11-08 10:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sasha Levin, linux-kernel, kvm, virtualization, Anthony Liguori

On Tue, 8 Nov 2011 08:32:50 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Tue, Nov 08, 2011 at 10:23:33AM +1030, Rusty Russell wrote:
> > Even so, should we just use the PCI cap list, and have each
> > cap entry just contain a BIR & offset?
> > 
> > Thanks,
> > Rusty.
> 
> And size :)
> I say, Rusty, did you see my patch? That's what it's doing,
> I also addressed the issue that with KVM on x86, PIO is faster
> than MMIO so we need to use it for notifications/isr.

Faster?  Really?  Why?  Were the ppc guys right that's it's obsolescent?

Because PIO is going to be ugly AFAICT for every non-x86 platform.

Cheers,
Rusty.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-07 23:53                             ` Rusty Russell
  2011-11-08  6:32                               ` Michael S. Tsirkin
@ 2011-11-08 14:15                               ` Sasha Levin
  1 sibling, 0 replies; 27+ messages in thread
From: Sasha Levin @ 2011-11-08 14:15 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, linux-kernel, kvm, virtualization, Anthony Liguori

On Tue, Nov 8, 2011 at 1:53 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> On Mon, 7 Nov 2011 23:14:14 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> On Mon, Nov 07, 2011 at 03:46:23PM +1030, Rusty Russell wrote:
>> > So far, the only three things make sense to have in a capability list:
>> > MSI-X, the upper 32 feature bits, and the per-device config.
>>
>> You mean the queue # to MSI-X vector mapping?
>
> Yep.
>
>> One thing to remember is that it must be in the same type of BAR as
>> the queue selection, since by PCI rules MMIO writes aren't I think
>> ordered with PIO writes (it doesn't matter with KVM but might
>> with another hypervisor).
>
> OK, I'm slowly getting up to speed.
>
> Next dumb q: Sasha, why did you introduce the idea of a separate
> virtio-pci capability list, rather than just using PCI capabilities
> directly?  ie. instead of VIRTIO_PCI_C_LAYOUT, have VIRTIO_PCI_CORE,
> VIRTIO_PCI_MSIX, VIRTIO_PCI_DEV_SPECIFIC?
>
> Is it because we really want this stuff outside the PCI configuration
> space?  Even so, should we just use the PCI cap list, and have each
> cap entry just contain a BIR & offset?

Mostly so that we can have the configuration out of PCI space so that
we can grow it later on.

We can go with doing it in PCI space with BIR & offset, yes.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: virtio-pci new configuration proposal
  2011-11-08 10:21                                 ` Rusty Russell
@ 2011-11-08 21:31                                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-08 21:31 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Sasha Levin, linux-kernel, kvm, virtualization, Anthony Liguori

On Tue, Nov 08, 2011 at 08:51:38PM +1030, Rusty Russell wrote:
> On Tue, 8 Nov 2011 08:32:50 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Tue, Nov 08, 2011 at 10:23:33AM +1030, Rusty Russell wrote:
> > > Even so, should we just use the PCI cap list, and have each
> > > cap entry just contain a BIR & offset?
> > > 
> > > Thanks,
> > > Rusty.
> > 
> > And size :)
> > I say, Rusty, did you see my patch? That's what it's doing,
> > I also addressed the issue that with KVM on x86, PIO is faster
> > than MMIO so we need to use it for notifications/isr.
> 
> Faster?  Really?  Why?

I think, because it doesn't need to be emulated to get the address
and the value. There are so many ways on x86 to do memory
access, you need a lot of code to figure out what's going on.

> Were the ppc guys right that's it's obsolescent?

For PCI - surely not.  PCI Express spec tries to discourage its usage.
But support is there even in Express.

> Because PIO is going to be ugly AFAICT for every non-x86 platform.
> 
> Cheers,
> Rusty.
> 


So we should make the BAR MMIO on non-x86 host, PIO on x86.
Guest should just map and use what it's given.

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2011-11-08 21:30 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-02 18:49 virtio-pci new configuration proposal Sasha Levin
2011-11-02 19:08 ` Michael S. Tsirkin
2011-11-02 19:07   ` Sasha Levin
2011-11-02 19:14     ` Michael S. Tsirkin
2011-11-03  1:58 ` Rusty Russell
2011-11-03  8:33   ` Sasha Levin
2011-11-03 12:46     ` Michael S. Tsirkin
2011-11-03 13:19       ` Sasha Levin
2011-11-03 13:48         ` Michael S. Tsirkin
2011-11-04  9:44     ` Rusty Russell
2011-11-04 11:40       ` Michael S. Tsirkin
2011-11-04 12:32         ` Sasha Levin
2011-11-04 13:51           ` Michael S. Tsirkin
2011-11-04 13:53             ` Sasha Levin
2011-11-04 14:23               ` Michael S. Tsirkin
2011-11-04 14:53                 ` Sasha Levin
2011-11-06  7:30                   ` Michael S. Tsirkin
2011-11-06 20:24                     ` Sasha Levin
2011-11-06 21:38                       ` Michael S. Tsirkin
2011-11-07  5:16                         ` Rusty Russell
2011-11-07 21:14                           ` Michael S. Tsirkin
2011-11-07 23:53                             ` Rusty Russell
2011-11-08  6:32                               ` Michael S. Tsirkin
2011-11-08 10:21                                 ` Rusty Russell
2011-11-08 21:31                                   ` Michael S. Tsirkin
2011-11-08 14:15                               ` Sasha Levin
2011-11-03 11:03   ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).