qemu-devel.nongnu.org archive mirror
* making a qdev bus available from a (non-qtree?) device
@ 2021-05-11 18:17 Klaus Jensen
  2021-05-12  3:39 ` Philippe Mathieu-Daudé
  2021-05-12 12:02 ` Markus Armbruster
  0 siblings, 2 replies; 10+ messages in thread
From: Klaus Jensen @ 2021-05-11 18:17 UTC
  To: qemu-devel; +Cc: stefanha, qemu-block, mst

Hi all,

I need some help with grok'ing qdev busses. Stefan, Michael - David
suggested on IRC that I CC you, since you might have solved a
similar issue with virtio devices. I've tried to study how that works, 
but I'm not exactly sure how to apply it to the issue I'm having.

Currently, to support multiple namespaces on the emulated nvme device, 
one can do something like this:

   -device nvme,id=nvme-ctrl-0,serial=foo,...
   -device nvme-ns,id=nvme-ns-0,bus=nvme-ctrl-0,...
   -device nvme-ns,id=nvme-ns-1,bus=nvme-ctrl-0,...

The nvme device creates an 'nvme-bus' and the nvme-ns devices have
dc->bus_type = TYPE_NVME_BUS. This all works very well and provides a 
nice overview in `info qtree`:

   bus: main-system-bus
   type System
     ...
     dev: q35-pcihost, id ""
       ..
       bus: pcie.0
	type PCIE
	..
	dev: nvme, id "nvme-ctrl-0"
	  ..
	  bus: nvme-ctrl-0
	    type nvme-bus
	    dev: nvme-ns, id "nvme-ns-0"
	      ..
	    dev: nvme-ns, id "nvme-ns-1"
	      ..


Nice and qdevy.

We have since introduced support for NVM Subsystems through an 
nvme-subsys device. The nvme-subsys device is just a TYPE_DEVICE and 
does not show in `info qtree` (I wonder if this should actually just 
have been an -object?). Anyway. The nvme device has a 'subsys' link 
parameter and we use this to manage the namespaces across the subsystem 
that may contain several nvme devices (controllers). The problem is that 
this doesn't work too well with unplugging since if the nvme device is
`device_del`'ed, the nvme-ns devices on the nvme-bus are unrealized 
which is not what we want. We really want the namespaces to linger, 
preferably on an nvme-bus of the nvme-subsys device so they can be 
attached to other nvme devices that may show up (or already exist) in 
the subsystem.
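
A minimal sketch of the unplug cascade described above, written as a toy
Python model rather than QEMU code. The Bus and Device classes and their
method names are hypothetical stand-ins, and the model assumes that
unrealizing a device also unrealizes everything on the buses it provides:

```python
class Bus:
    def __init__(self, name):
        self.name = name
        self.children = []


class Device:
    def __init__(self, dev_id, parent_bus=None):
        self.id = dev_id
        self.realized = True
        self.child_buses = []
        self.parent_bus = parent_bus
        if parent_bus is not None:
            parent_bus.children.append(self)

    def add_bus(self, name):
        bus = Bus(name)
        self.child_buses.append(bus)
        return bus

    def unrealize(self):
        # Children on the buses this device provides go away first.
        for bus in self.child_buses:
            for child in list(bus.children):
                child.unrealize()
        self.realized = False
        if self.parent_bus is not None:
            self.parent_bus.children.remove(self)
            self.parent_bus = None


# Namespace plugged into the controller's bus (the current layout).
ctrl = Device("nvme-ctrl-0")
ctrl_bus = ctrl.add_bus("nvme-ctrl-0")
ns_on_ctrl = Device("nvme-ns-0", ctrl_bus)

# Namespace plugged into a bus owned by the subsystem (the desired layout).
subsys = Device("nvme-subsys-0")
subsys_bus = subsys.add_bus("nvme-subsys-0")
ns_on_subsys = Device("nvme-ns-1", subsys_bus)

ctrl.unrealize()  # models `device_del nvme-ctrl-0`
print(ns_on_ctrl.realized, ns_on_subsys.realized)
```

In this model only the namespace parented by the subsystem survives the
controller going away, which is the lingering behaviour asked for here.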

The core problem I'm having is that I can't seem to create an nvme-bus 
from the nvme-subsys device and make it available to the nvme-ns device 
on the command line:

   -device nvme-subsys,id=nvme-subsys-0,...
   -device nvme-ns,bus=nvme-subsys-0

The above results in "No 'nvme-bus' bus found for device 'nvme-ns'", even
though I do `qbus_create_inplace()` just like the nvme device. However, 
I *can* reparent the nvme-ns device in its realize() method, so if I 
instead define it like so:

   -device nvme-subsys,id=nvme-subsys-0,...
   -device nvme,id=nvme-ctrl-0,subsys=nvme-subsys-0
   -device nvme-ns,bus=nvme-ctrl-0

I can then call `qdev_set_parent_bus()` and set the parent bus to the 
bus created in the nvme-subsys device. This solves the problem since the
namespaces are not "garbage collected" when the nvme device is removed,
but it just feels wrong, you know? Also, if possible, I'd of course
really like to retain the nice entries in `info qtree`.


Thanks,
Klaus


* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-11 18:17 making a qdev bus available from a (non-qtree?) device Klaus Jensen
@ 2021-05-12  3:39 ` Philippe Mathieu-Daudé
  2021-05-12  8:00   ` Peter Maydell
  2021-05-12 12:02 ` Markus Armbruster
  1 sibling, 1 reply; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-05-12  3:39 UTC
  To: Klaus Jensen, qemu-devel, Markus Armbruster, Peter Maydell,
	Eduardo Habkost
  Cc: qemu-block, stefanha, mst

On 5/11/21 8:17 PM, Klaus Jensen wrote:
> Hi all,
> 
> I need some help with grok'ing qdev busses. Stefan, Michael - David
> suggested on IRC that I CC'ed you guys since you might have solved a
> similar issue with virtio devices. I've tried to study how that works,
> but I'm not exactly sure how to apply it to the issue I'm having.

The experts on this topic are Peter/Markus/Eduardo.

> Currently, to support multiple namespaces on the emulated nvme device,
> one can do something like this:
> 
>   -device nvme,id=nvme-ctrl-0,serial=foo,...
>   -device nvme-ns,id=nvme-ns-0,bus=nvme-ctrl-0,...
>   -device nvme-ns,id-nvme-ns-1,bus=nvme-ctrl-0,...
> 
> The nvme device creates an 'nvme-bus' and the nvme-ns devices has
> dc->bus_type = TYPE_NVME_BUS. This all works very well and provides a
> nice overview in `info qtree`:
> 
>   bus: main-system-bus
>   type System
>     ...
>     dev: q35-pcihost, id ""
>       ..
>       bus: pcie.0
>     type PCIE
>     ..
>     dev: nvme, id "nvme-ctrl-0"
>       ..
>       bus: nvme-ctrl-0
>         type nvme-bus
>         dev: nvme-ns, id "nvme-ns-0"
>           ..
>         dev: nvme-ns, id "nvme-ns-1"
>           ..
> 
> 
> Nice and qdevy.
> 
> We have since introduced support for NVM Subsystems through an
> nvme-subsys device. The nvme-subsys device is just a TYPE_DEVICE and
> does not show in `info qtree` (I wonder if this should actually just
> have been an -object?). Anyway. The nvme device has a 'subsys' link
> parameter and we use this to manage the namespaces across the subsystem
> that may contain several nvme devices (controllers). The problem is that
> this doesnt work too well with unplugging since if the nvme device is
> `device_del`'ed, the nvme-ns devices on the nvme-bus are unrealized
> which is not what we want. We really want the namespaces to linger,
> preferably on an nvme-bus of the nvme-subsys device so they can be
> attached to other nvme devices that may show up (or already exist) in
> the subsystem.

IIUC, while we can have unattached drives, we can't (by design) have
qdev unattached to qbus.

Not sure this is a good suggestion (bad design IMO) but you could add
a fake nvme qbus to hold the lingering nvme devices...




* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-12  3:39 ` Philippe Mathieu-Daudé
@ 2021-05-12  8:00   ` Peter Maydell
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Maydell @ 2021-05-12  8:00 UTC
  To: Philippe Mathieu-Daudé
  Cc: Eduardo Habkost, Qemu-block, Michael S. Tsirkin, QEMU Developers,
	Markus Armbruster, Stefan Hajnoczi, Klaus Jensen

On Wed, 12 May 2021 at 04:39, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> IIUC, while we can have unattached drives, we can't (by design) have
> qdev unattached to qbus.

You can (and we do), but it is a bit of a problem because a
device not attached to a qbus will not get automatically reset,
and so you need to arrange to reset it manually somehow.
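
To illustrate the point, here is a toy Python model (not QEMU code) that
assumes reset is propagated by walking the bus tree from the root. The
class names and the extra_handlers parameter are hypothetical; the
handler list only stands in for whatever manual arrangement is chosen:

```python
class Bus:
    def __init__(self):
        self.children = []


class Device:
    def __init__(self, name, parent_bus=None):
        self.name = name
        self.child_buses = []
        self.reset_count = 0
        if parent_bus is not None:
            parent_bus.children.append(self)

    def reset(self):
        self.reset_count += 1
        # Propagate to everything plugged into this device's buses.
        for bus in self.child_buses:
            for child in bus.children:
                child.reset()


def system_reset(root_bus, extra_handlers=()):
    # Devices reachable from the root bus are reset automatically.
    for dev in root_bus.children:
        dev.reset()
    # Bus-less devices are only reached if someone registered them by hand.
    for handler in extra_handlers:
        handler()


root = Bus()
attached = Device("attached", root)
orphan = Device("orphan")  # no parent bus

system_reset(root)
after_plain = (attached.reset_count, orphan.reset_count)

system_reset(root, extra_handlers=[orphan.reset])
after_manual = (attached.reset_count, orphan.reset_count)
print(after_plain, after_manual)
```

The first reset never reaches the bus-less device; it is only reset once
a handler has been arranged for it explicitly.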


-- PMM



* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-11 18:17 making a qdev bus available from a (non-qtree?) device Klaus Jensen
  2021-05-12  3:39 ` Philippe Mathieu-Daudé
@ 2021-05-12 12:02 ` Markus Armbruster
  2021-05-13 14:02   ` Stefan Hajnoczi
  2021-05-17  6:44   ` Klaus Jensen
  1 sibling, 2 replies; 10+ messages in thread
From: Markus Armbruster @ 2021-05-12 12:02 UTC
  To: Klaus Jensen; +Cc: qemu-block, qemu-devel, stefanha, mst

Klaus Jensen <its@irrelevant.dk> writes:

> Hi all,
>
> I need some help with grok'ing qdev busses. Stefan, Michael - David
> suggested on IRC that I CC'ed you guys since you might have solved a 
> similar issue with virtio devices. I've tried to study how that works,
> but I'm not exactly sure how to apply it to the issue I'm having.
>
> Currently, to support multiple namespaces on the emulated nvme device,
> one can do something like this:
>
>   -device nvme,id=nvme-ctrl-0,serial=foo,...
>   -device nvme-ns,id=nvme-ns-0,bus=nvme-ctrl-0,...
>   -device nvme-ns,id-nvme-ns-1,bus=nvme-ctrl-0,...
>
> The nvme device creates an 'nvme-bus' and the nvme-ns devices has
> dc->bus_type = TYPE_NVME_BUS. This all works very well and provides a 
> nice overview in `info qtree`:
>
>   bus: main-system-bus
>   type System
>     ...
>     dev: q35-pcihost, id ""
>       ..
>       bus: pcie.0
> 	type PCIE
> 	..
> 	dev: nvme, id "nvme-ctrl-0"
> 	  ..
> 	  bus: nvme-ctrl-0
> 	    type nvme-bus
> 	    dev: nvme-ns, id "nvme-ns-0"
> 	      ..
> 	    dev: nvme-ns, id "nvme-ns-1"
> 	      ..
>
>
> Nice and qdevy.
>
> We have since introduced support for NVM Subsystems through an
> nvme-subsys device. The nvme-subsys device is just a TYPE_DEVICE and 
> does not show in `info qtree`

Yes.

Most devices plug into a bus.  DeviceClass member @bus_type specifies
the type of bus they plug into, and DeviceState member @parent_bus
points to the actual BusState.  Example: PCI devices plug into a PCI
bus, and have ->bus_type = TYPE_PCI_BUS.

Some devices don't.  @bus_type and @parent_bus are NULL then.

Most buses are provided by a device.  BusState member @parent points to
the device.

The main-system-bus isn't.  Its @parent is null.

"info qtree" only shows the qtree rooted at main-system-bus.  It doesn't
show qtrees rooted at bus-less devices or device-less buses other than
main-system-bus.  I doubt such buses exist.
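
As a rough illustration of the above, here is a toy Python model of a
tree walk that starts at main-system-bus. It is not QEMU's
implementation; the classes and the qtree_lines() helper are
hypothetical, and the PCI levels are left out for brevity:

```python
class Bus:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # device providing the bus, None for the root
        self.children = []


class Device:
    def __init__(self, type_name, dev_id, parent_bus=None):
        self.type_name = type_name
        self.id = dev_id
        self.parent_bus = parent_bus
        self.child_buses = []
        if parent_bus is not None:
            parent_bus.children.append(self)

    def create_bus(self, name):
        bus = Bus(name, parent=self)
        self.child_buses.append(bus)
        return bus


def qtree_lines(bus, indent=0):
    # Depth-first walk: a bus, its devices, then the buses they provide.
    pad = " " * indent
    lines = [f"{pad}bus: {bus.name}"]
    for dev in bus.children:
        lines.append(f'{pad}  dev: {dev.type_name}, id "{dev.id}"')
        for child_bus in dev.child_buses:
            lines.extend(qtree_lines(child_bus, indent + 4))
    return lines


main_bus = Bus("main-system-bus")
ctrl = Device("nvme", "nvme-ctrl-0", main_bus)
Device("nvme-ns", "nvme-ns-0", ctrl.create_bus("nvme-ctrl-0"))

# A bus-less device: it provides a bus, but nothing links it to the root.
subsys = Device("nvme-subsys", "nvme-subsys-0")
Device("nvme-ns", "nvme-ns-1", subsys.create_bus("nvme-subsys-0"))

listing = "\n".join(qtree_lines(main_bus))
print(listing)
```

The subsystem, its bus, and the namespace on that bus never appear in
the listing because the walk cannot reach them. If the bus= lookup for
-device walks the same tree, that would plausibly explain the "No
'nvme-bus' bus found" error, but that is an assumption, not something
confirmed in this thread.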

>                               (I wonder if this should actually just
> have been an -object?).

Does nvme-subsys expose virtual hardware to the guest?  Memory, IRQs,
...

If yes, it needs to be a device.

If no, object may be more appropriate.  Tell us more about what it does.


>                         Anyway. The nvme device has a 'subsys' link 
> parameter and we use this to manage the namespaces across the
> subsystem that may contain several nvme devices (controllers). The
> problem is that this doesnt work too well with unplugging since if the
> nvme device is `device_del`'ed, the nvme-ns devices on the nvme-bus
> are unrealized which is not what we want. We really want the
> namespaces to linger, preferably on an nvme-bus of the nvme-subsys
> device so they can be attached to other nvme devices that may show up
> (or already exist) in the subsystem.
>
> The core problem I'm having is that I can't seem to create an nvme-bus
> from the nvme-subsys device and make it available to the nvme-ns
> device on the command line:
>
>   -device nvme-subsys,id=nvme-subsys-0,...
>   -device nvme-ns,bus=nvme-subsys-0
>
> The above results in 'No 'nvme-bus' bus found for device 'nvme-ns',
> even though I do `qbus_create_inplace()` just like the nvme
> device. However, I *can* reparent the nvme-ns device in its realize()
> method, so if I instead define it like so:
>
>   -device nvme-subsys,id=nvme-subsys-0,...
>   -device nvme,id=nvme-ctrl-0,subsys=nvme-subsys-0
>   -device nvme-ns,bus=nvme-ctrl-0
>
> I can then call `qdev_set_parent_bus()` and set the parent bus to the
> bus creates in the nvme-subsys device. This solves the problem since
> the namespaces are not "garbage collected" when the nvme device is
> removed, but it just feels wrong you know? Also, if possible, I'd of
> course really like to retain the nice entries in `info qtree`.

I'm afraid I'm too ignorant on NVME to give useful advice.

Can you give us a brief primer on the aspects of physical NVME devices
you'd like to model in QEMU?  What are "controllers", "namespaces", and
"subsystems", and how do they work together?

Once we understand the relevant aspects of physical devices, we can
discuss how to best model them in QEMU.




* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-12 12:02 ` Markus Armbruster
@ 2021-05-13 14:02   ` Stefan Hajnoczi
  2021-05-17  6:55     ` Klaus Jensen
  2021-05-17  6:44   ` Klaus Jensen
  1 sibling, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2021-05-13 14:02 UTC
  To: its; +Cc: mst, Markus Armbruster, qemu-block, qemu-devel

On Wed, May 12, 2021 at 02:02:50PM +0200, Markus Armbruster wrote:
> Klaus Jensen <its@irrelevant.dk> writes:
> > I can then call `qdev_set_parent_bus()` and set the parent bus to the
> > bus creates in the nvme-subsys device. This solves the problem since
> > the namespaces are not "garbage collected" when the nvme device is
> > removed, but it just feels wrong you know? Also, if possible, I'd of
> > course really like to retain the nice entries in `info qtree`.
> 
> I'm afraid I'm too ignorant on NVME to give useful advice.
> 
> Can you give us a brief primer on the aspects of physical NVME devices
> you'd like to model in QEMU?  What are "controllers", "namespaces", and
> "subsystems", and how do they work together?
> 
> Once we understand the relevant aspects of physical devices, we can
> discuss how to best model them in QEMU.

One specific question about the nature of devices vs subsystems vs
namespaces:

Does the device expose all the namespaces from one subsystem, or does it
need to be able to filter them (e.g. hide certain namespaces or present
a mix of namespaces from multiple subsystems)?

The status of the namespace as a DeviceState is a bit questionable since
the only possible parent it could have is a device, but multiple devices
want to use it. I understand why you're considering whether it should be
an --object...

Stefan


* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-12 12:02 ` Markus Armbruster
  2021-05-13 14:02   ` Stefan Hajnoczi
@ 2021-05-17  6:44   ` Klaus Jensen
  2021-05-21  7:33     ` Markus Armbruster
  1 sibling, 1 reply; 10+ messages in thread
From: Klaus Jensen @ 2021-05-17  6:44 UTC
  To: Markus Armbruster; +Cc: qemu-block, qemu-devel, stefanha, mst

On May 12 14:02, Markus Armbruster wrote:
>Klaus Jensen <its@irrelevant.dk> writes:
>
>> Hi all,
>>
>> I need some help with grok'ing qdev busses. Stefan, Michael - David
>> suggested on IRC that I CC'ed you guys since you might have solved a
>> similar issue with virtio devices. I've tried to study how that works,
>> but I'm not exactly sure how to apply it to the issue I'm having.
>>
>> Currently, to support multiple namespaces on the emulated nvme device,
>> one can do something like this:
>>
>>   -device nvme,id=nvme-ctrl-0,serial=foo,...
>>   -device nvme-ns,id=nvme-ns-0,bus=nvme-ctrl-0,...
>>   -device nvme-ns,id-nvme-ns-1,bus=nvme-ctrl-0,...
>>
>> The nvme device creates an 'nvme-bus' and the nvme-ns devices has
>> dc->bus_type = TYPE_NVME_BUS. This all works very well and provides a
>> nice overview in `info qtree`:
>>
>>   bus: main-system-bus
>>   type System
>>     ...
>>     dev: q35-pcihost, id ""
>>       ..
>>       bus: pcie.0
>> 	type PCIE
>> 	..
>> 	dev: nvme, id "nvme-ctrl-0"
>> 	  ..
>> 	  bus: nvme-ctrl-0
>> 	    type nvme-bus
>> 	    dev: nvme-ns, id "nvme-ns-0"
>> 	      ..
>> 	    dev: nvme-ns, id "nvme-ns-1"
>> 	      ..
>>
>>
>> Nice and qdevy.
>>
>> We have since introduced support for NVM Subsystems through an
>> nvme-subsys device. The nvme-subsys device is just a TYPE_DEVICE and
>> does not show in `info qtree`
>
>Yes.
>
>Most devices plug into a bus.  DeviceClass member @bus_type specifies
>the type of bus they plug into, and DeviceState member @parent_bus
>points to the actual BusState.  Example: PCI devices plug into a PCI
>bus, and have ->bus_type = TYPE_PCI_BUS.
>
>Some devices don't.  @bus_type and @parent_bus are NULL then.
>
>Most buses are provided by a device.  BusState member @parent points to
>the device.
>
>The main-system-bus isn't.  Its @parent is null.
>
>"info qtree" only shows the qtree rooted at main-system-bus.  It doesn't
>show qtrees rooted at bus-less devices or device-less buses other than
>main-system-bus.  I doubt such buses exist.
>

Makes sense.

>>                               (I wonder if this should actually just
>> have been an -object?).
>
>Does nvme-subsys expose virtual hardware to the guest?  Memory, IRQs,
>...
>
>If yes, it needs to be a device.
>
>If no, object may be more appropriate.  Tell us more about what it does.
>

It does not expose any virtual hardware. See below.

>
>>                         Anyway. The nvme device has a 'subsys' link
>> parameter and we use this to manage the namespaces across the
>> subsystem that may contain several nvme devices (controllers). The
>> problem is that this doesnt work too well with unplugging since if the
>> nvme device is `device_del`'ed, the nvme-ns devices on the nvme-bus
>> are unrealized which is not what we want. We really want the
>> namespaces to linger, preferably on an nvme-bus of the nvme-subsys
>> device so they can be attached to other nvme devices that may show up
>> (or already exist) in the subsystem.
>>
>> The core problem I'm having is that I can't seem to create an nvme-bus
>> from the nvme-subsys device and make it available to the nvme-ns
>> device on the command line:
>>
>>   -device nvme-subsys,id=nvme-subsys-0,...
>>   -device nvme-ns,bus=nvme-subsys-0
>>
>> The above results in 'No 'nvme-bus' bus found for device 'nvme-ns',
>> even though I do `qbus_create_inplace()` just like the nvme
>> device. However, I *can* reparent the nvme-ns device in its realize()
>> method, so if I instead define it like so:
>>
>>   -device nvme-subsys,id=nvme-subsys-0,...
>>   -device nvme,id=nvme-ctrl-0,subsys=nvme-subsys-0
>>   -device nvme-ns,bus=nvme-ctrl-0
>>
>> I can then call `qdev_set_parent_bus()` and set the parent bus to the
>> bus creates in the nvme-subsys device. This solves the problem since
>> the namespaces are not "garbage collected" when the nvme device is
>> removed, but it just feels wrong you know? Also, if possible, I'd of
>> course really like to retain the nice entries in `info qtree`.
>
>I'm afraid I'm too ignorant on NVME to give useful advice.
>
>Can you give us a brief primer on the aspects of physical NVME devices
>you'd like to model in QEMU?  What are "controllers", "namespaces", and
>"subsystems", and how do they work together?
>
>Once we understand the relevant aspects of physical devices, we can
>discuss how to best model them in QEMU.
>

An "NVM Subsystem" is basically just a term to talk about a collection 
of controllers and namespaces. A namespace is just a quantity of 
non-volatile memory that the controller can use to store stuff on.

Only the controller is a piece of virtual hardware. An example subsystem 
looks like this:


           +------------------+     +-----------------+
           |   controller A   |     |   controller B  |
           +------------------+     +-----------------+
           +--------++--------+     +--------++-------+
           | NSID 1 || NSID 2 |     | NSID 3 | NSID 2 |
           +--------++--------+     +--------++-------+
           +--------+    |          +--------+    |
           |  NS A  |    |          |  NS C  |    |
           +--------+    |          +--------+    |
                         |                        |
                         +------------------------+
                                      |
                                  +--------+
                                  |  NS B  |
                                  +--------+


This is the example in Figure 5 in the NVMe v1.4 specification. Here, we 
have two controllers (that we model with the 'nvme' pci-based device). 
Each controller has one "private" namespace (NS A and NS C) and shares 
one namespace (NS B). The namespace IDs are unique across the subsystem 
and are assigned by the controller when attached to a namespace.
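
The figure can be written down as a small data model. This is only a
sketch in Python with hypothetical class names; as a simplification it
records the NSID when the namespace is added to the subsystem rather
than when a controller attaches to it:

```python
class Namespace:
    def __init__(self, name):
        self.name = name


class Subsystem:
    def __init__(self):
        self.namespaces = {}  # NSID -> Namespace, unique per subsystem

    def add_namespace(self, nsid, ns):
        if nsid in self.namespaces:
            raise ValueError(f"NSID {nsid} already in use")
        self.namespaces[nsid] = ns


class Controller:
    def __init__(self, name, subsys):
        self.name = name
        self.subsys = subsys
        self.attached = set()  # NSIDs visible through this controller

    def attach(self, nsid):
        if nsid not in self.subsys.namespaces:
            raise KeyError(f"NSID {nsid} not in subsystem")
        self.attached.add(nsid)


subsys = Subsystem()
subsys.add_namespace(1, Namespace("NS A"))
subsys.add_namespace(2, Namespace("NS B"))
subsys.add_namespace(3, Namespace("NS C"))

ctrl_a = Controller("controller A", subsys)
ctrl_b = Controller("controller B", subsys)
ctrl_a.attach(1)
ctrl_a.attach(2)
ctrl_b.attach(3)
ctrl_b.attach(2)

shared = ctrl_a.attached & ctrl_b.attached
private_a = ctrl_a.attached - ctrl_b.attached
private_b = ctrl_b.attached - ctrl_a.attached
print(shared, private_a, private_b)
```

Here the namespaces belong to the subsystem and the controllers only
hold references to them, which is what makes sharing NS B possible.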

We use the 'nvme-ns' device (TYPE_DEVICE) to model the namespaces, and I
guess this could also just have been an -object, not sure if we
can change that now. The 'nvme-ns' device mostly exists to hold the block
backend configuration and related namespace-only parameters. Prior to
the introduction of subsystems, while we could have multiple controllers
on the PCI bus, they could not share namespaces. To support this we 
introduced the 'nvme-subsys' device to allow the namespaces to be 
shared. This support is considered experimental, so I think we can get 
away with changing this to be an object.

As I explained in my first mail, we attach namespaces to controllers 
through a bus. This means that even in the absence of an explicit 
"bus=..." parameter on the nvme-ns device, it will "connect" on the most 
recently defined "nvme-bus" (of the most recently defined controller). 
With subsystems we would also like to model "unattached" namespaces that 
exist solely in the subsystem (i.e. NOT attached to any controllers).
That is why I was trying to get the nvme-ns devices to attach to a bus 
created by the "non-bus-attached" subsystem device. And that is what I 
can't do. We could add a link property to the nvme-ns device instead, 
but then the bus magic in qemu would still happen and the namespace 
would end up "attached" (in qemu terms) to a controller anyway - and it 
would complain if we defined the namespace device prior to defining any 
controller devices since no usable bus exists.
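
The implicit bus selection described above can be sketched like this. It
is a toy Python model of the behaviour as described in this mail, not
QEMU's actual lookup code; pick_bus() and the dictionary layout are
hypothetical:

```python
def pick_bus(reachable_buses, bus_type, device_type, requested=None):
    """Pick a parent bus for a new device.

    reachable_buses lists buses in definition order. Without an explicit
    bus= request, the most recently defined bus of the right type wins.
    """
    candidates = [b for b in reachable_buses if b["type"] == bus_type]
    if requested is not None:
        candidates = [b for b in candidates if b["name"] == requested]
    if not candidates:
        raise LookupError(
            f"No '{bus_type}' bus found for device '{device_type}'"
        )
    return candidates[-1]


buses = []

# Namespace defined before any controller: there is nothing to plug into.
try:
    pick_bus(buses, "nvme-bus", "nvme-ns")
    early_error = None
except LookupError as err:
    early_error = str(err)

buses.append({"name": "nvme-ctrl-0", "type": "nvme-bus"})
buses.append({"name": "nvme-ctrl-1", "type": "nvme-bus"})

implicit = pick_bus(buses, "nvme-bus", "nvme-ns")
explicit = pick_bus(buses, "nvme-bus", "nvme-ns", requested="nvme-ctrl-0")
print(early_error, implicit["name"], explicit["name"])
```

Under this model a link property alone would not stop the namespace from
also landing on a controller's bus, because the bus choice is made
independently of any link.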

Thanks for helping out with this!


* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-13 14:02   ` Stefan Hajnoczi
@ 2021-05-17  6:55     ` Klaus Jensen
  2021-05-17  9:56       ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Klaus Jensen @ 2021-05-17  6:55 UTC
  To: Stefan Hajnoczi; +Cc: mst, Markus Armbruster, qemu-block, qemu-devel

On May 13 15:02, Stefan Hajnoczi wrote:
>On Wed, May 12, 2021 at 02:02:50PM +0200, Markus Armbruster wrote:
>> Klaus Jensen <its@irrelevant.dk> writes:
>> > I can then call `qdev_set_parent_bus()` and set the parent bus to the
>> > bus creates in the nvme-subsys device. This solves the problem since
>> > the namespaces are not "garbage collected" when the nvme device is
>> > removed, but it just feels wrong you know? Also, if possible, I'd of
>> > course really like to retain the nice entries in `info qtree`.
>>
>> I'm afraid I'm too ignorant on NVME to give useful advice.
>>
>> Can you give us a brief primer on the aspects of physical NVME devices
>> you'd like to model in QEMU?  What are "controllers", "namespaces", and
>> "subsystems", and how do they work together?
>>
>> Once we understand the relevant aspects of physical devices, we can
>> discuss how to best model them in QEMU.
>
>One specific question about the nature of devices vs subsystems vs
>namespaces:
>
>Does the device expose all the namespaces from one subsystem, or does it
>need to be able to filter them (e.g. hide certain namespaces or present
>a mix of namespaces from multiple subsystems)?
>

Subsystems are fully isolated. There is no interaction possible between
different subsystems. Within a subsystem, all the "resources"
(controllers and namespaces) are potentially "shared". That is, there
may exist many-to-many relationships. A controller may have multiple
namespaces attached and namespaces may be attached to multiple 
controllers.

>The status of the namespace as a DeviceState is a bit questionable since
>the only possible parent it could have is a device, but multiple devices
>want to use it. I understand why you're considering whether it should be
>an --object...
>

When you say parent, I think you mean parent in terms of bus-device 
relationship? In that case, then the parent can actually be the 
subsystem, since if the namespace is not attached to any controllers, 
then it is just an entity/object in the subsystem that the controllers 
(the actual devices) may attach to[1].

Yes, the more I think about this and understand qdev, the more I realize that it
was a mistake to define nvme-ns to be a TYPE_DEVICE, since it does not 
act as a piece of virtual hardware. It is just an entity (object). The 
biggest mistake right now seems to be the bus_type use. It just worked 
wonderfully in the absence of subsystem support, but I feel that that 
choice is coming back to haunt me now. If we'd used a 'ctrl' link 
property we could just add a 'subsys' link property now and be happy.

Is there any way that we can "overload" the implicit "bus=" parameter to 
provide backwards compatibility (while basically changing it to function 
like a "link" parameter)?

Thanks for your help!


* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-17  6:55     ` Klaus Jensen
@ 2021-05-17  9:56       ` Stefan Hajnoczi
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2021-05-17  9:56 UTC
  To: Klaus Jensen; +Cc: mst, Markus Armbruster, qemu-block, qemu-devel

On Mon, May 17, 2021 at 08:55:50AM +0200, Klaus Jensen wrote:
> On May 13 15:02, Stefan Hajnoczi wrote:
> > On Wed, May 12, 2021 at 02:02:50PM +0200, Markus Armbruster wrote:
> > > Klaus Jensen <its@irrelevant.dk> writes:
> > > > I can then call `qdev_set_parent_bus()` and set the parent bus to the
> > > > bus creates in the nvme-subsys device. This solves the problem since
> > > > the namespaces are not "garbage collected" when the nvme device is
> > > > removed, but it just feels wrong you know? Also, if possible, I'd of
> > > > course really like to retain the nice entries in `info qtree`.
> > > 
> > > I'm afraid I'm too ignorant on NVME to give useful advice.
> > > 
> > > Can you give us a brief primer on the aspects of physical NVME devices
> > > you'd like to model in QEMU?  What are "controllers", "namespaces", and
> > > "subsystems", and how do they work together?
> > > 
> > > Once we understand the relevant aspects of physical devices, we can
> > > discuss how to best model them in QEMU.
> > 
> > One specific question about the nature of devices vs subsystems vs
> > namespaces:
> > 
> > Does the device expose all the namespaces from one subsystem, or does it
> > need to be able to filter them (e.g. hide certain namespaces or present
> > a mix of namespaces from multiple subsystems)?
> > 
> 
> Subsystems are fully isolated. There are no interaction possible between
> different subsystems. Within a subsystem, all the "resources" (controllers
> and namespaces) are potentially "shared". That is, there may exists
> many-to-many relationships. A controller may have multiple namespaces
> attached and namespaces may be attached to multiple controllers.
> 
> > The status of the namespace as a DeviceState is a bit questionable since
> > the only possible parent it could have is a device, but multiple devices
> > want to use it. I understand why you're considering whether it should be
> > an --object...
> > 
> 
> When you say parent, I think you mean parent in terms of bus-device
> relationship? In that case, then the parent can actually be the subsystem,
> since if the namespace is not attached to any controllers, then it is just
> an entity/object in the subsystem that the controllers (the actual devices)
> may attach to[1].
> 
> Yes, the more I think about this and understand qdev I realize that it was a
> mistake to define nvme-ns to be a TYPE_DEVICE, since it does not act as a
> piece of virtual hardware. It is just an entity (object). The biggest
> mistake right now seems to be the bus_type use. It just worked wonderfully
> in the absence of subsystem support, but I feel that that choice is coming
> back to haunt me now. If we'd used a 'ctrl' link property we could just add
> a 'subsys' link property now and be happy.
> 
> Is there any way that we can "overload" the implicit "bus=" parameter to
> provide backwards compatibility (while basically changing it to function
> like a "link" parameter)?

I would consider adding new --object types and deprecating the devices
so they can be dropped in a future QEMU release. It may be necessary to
choose new names to avoid collisions with the existing ones.

Backwards compatibility might be tricky. One way might be to extract
most of the code from --device nvme-ns and move it into the new
--object, but leave the device to instantiate an object behind the
scenes? Then the device can still have its bus and translate that
relationship to --object somehow. I'm not sure, it depends on the
details of the code.
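
A very rough sketch of that shim idea, as a toy Python model. All class
names are hypothetical and nothing here reflects the real nvme code; it
only shows a legacy device that creates the new object behind the scenes
and turns its bus relationship into an attachment:

```python
class Subsystem:
    def __init__(self, sub_id):
        self.id = sub_id
        self.namespaces = []  # namespace objects owned by the subsystem


class NamespaceObject:
    """Stand-in for the new --object type holding the namespace state."""

    def __init__(self, ns_id, subsys):
        self.id = ns_id
        self.subsys = subsys
        subsys.namespaces.append(self)


class Controller:
    def __init__(self, ctrl_id, subsys):
        self.id = ctrl_id
        self.subsys = subsys
        self.attached = []

    def attach(self, ns):
        self.attached.append(ns)

    def unplug(self):
        # Only the attachments go away; the subsystem keeps the objects.
        self.attached.clear()


class LegacyNamespaceDevice:
    """Stand-in for the existing --device nvme-ns, kept for compatibility."""

    def __init__(self, dev_id, bus):
        # Create the object behind the scenes, owned by the subsystem...
        self.backend = NamespaceObject(dev_id, bus.subsys)
        # ...and translate the old bus relationship into an attachment.
        bus.attach(self.backend)


subsys = Subsystem("nvme-subsys-0")
ctrl = Controller("nvme-ctrl-0", subsys)
legacy = LegacyNamespaceDevice("nvme-ns-0", bus=ctrl)

ctrl.unplug()
lingering = [ns.id for ns in subsys.namespaces]
print(lingering, ctrl.attached)
```

The old command line keeps working in this model, while the namespace
state lives in an object that outlasts the controller.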

Stefan


* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-17  6:44   ` Klaus Jensen
@ 2021-05-21  7:33     ` Markus Armbruster
  2021-05-21  8:48       ` Klaus Jensen
  0 siblings, 1 reply; 10+ messages in thread
From: Markus Armbruster @ 2021-05-21  7:33 UTC
  To: Klaus Jensen; +Cc: Paolo Bonzini, stefanha, qemu-devel, qemu-block, mst

I'm about to drop off for two weeks of much-needed vacation.  I meant to
study your explanation and give design advice before I leave, but I'm
out of time.  Regrettable.  I hope Stefan can help you.  Or perhaps
Paolo.  If you still have questions when I'm back, feel free to contact
me again.

Klaus Jensen <its@irrelevant.dk> writes:

> On May 12 14:02, Markus Armbruster wrote:
>>Klaus Jensen <its@irrelevant.dk> writes:
>>
>>> Hi all,
>>>
>>> I need some help with grok'ing qdev busses. Stefan, Michael - David
>>> suggested on IRC that I CC'ed you guys since you might have solved a
>>> similar issue with virtio devices. I've tried to study how that works,
>>> but I'm not exactly sure how to apply it to the issue I'm having.
>>>
>>> Currently, to support multiple namespaces on the emulated nvme device,
>>> one can do something like this:
>>>
>>>   -device nvme,id=nvme-ctrl-0,serial=foo,...
>>>   -device nvme-ns,id=nvme-ns-0,bus=nvme-ctrl-0,...
>>>   -device nvme-ns,id-nvme-ns-1,bus=nvme-ctrl-0,...
>>>
>>> The nvme device creates an 'nvme-bus' and the nvme-ns devices has
>>> dc->bus_type = TYPE_NVME_BUS. This all works very well and provides a
>>> nice overview in `info qtree`:
>>>
>>>   bus: main-system-bus
>>>   type System
>>>     ...
>>>     dev: q35-pcihost, id ""
>>>       ..
>>>       bus: pcie.0
>>> 	type PCIE
>>> 	..
>>> 	dev: nvme, id "nvme-ctrl-0"
>>> 	  ..
>>> 	  bus: nvme-ctrl-0
>>> 	    type nvme-bus
>>> 	    dev: nvme-ns, id "nvme-ns-0"
>>> 	      ..
>>> 	    dev: nvme-ns, id "nvme-ns-1"
>>> 	      ..
>>>
>>>
>>> Nice and qdevy.
>>>
>>> We have since introduced support for NVM Subsystems through an
>>> nvme-subsys device. The nvme-subsys device is just a TYPE_DEVICE and
>>> does not show in `info qtree`
>>
>>Yes.
>>
>>Most devices plug into a bus.  DeviceClass member @bus_type specifies
>>the type of bus they plug into, and DeviceState member @parent_bus
>>points to the actual BusState.  Example: PCI devices plug into a PCI
>>bus, and have ->bus_type = TYPE_PCI_BUS.
>>
>>Some devices don't.  @bus_type and @parent_bus are NULL then.
>>
>>Most buses are provided by a device.  BusState member @parent points to
>>the device.
>>
>>The main-system-bus isn't.  Its @parent is null.
>>
>>"info qtree" only shows the qtree rooted at main-system-bus.  It doesn't
>>show qtrees rooted at bus-less devices or device-less buses other than
>>main-system-bus.  I doubt such buses exist.
>>
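The bus/device relationships described in the quoted text above can be pictured with a toy model. This is a minimal sketch in Python, not QEMU code: the Bus and Device classes and the plug(), add_bus() and walk() helpers are hypothetical names made up for illustration, and they only mirror the @bus_type, @parent_bus and @parent members mentioned in the text. Under those assumptions, it shows why a walk rooted at main-system-bus never reaches a bus provided by a bus-less device.

```python
class Bus:
    def __init__(self, name, bus_type, parent=None):
        self.name = name
        self.bus_type = bus_type
        self.parent = parent      # device providing this bus; None for the root bus
        self.children = []        # devices plugged into this bus


class Device:
    def __init__(self, name, bus_type=None):
        self.name = name
        self.bus_type = bus_type  # type of bus this device plugs into; None if bus-less
        self.parent_bus = None    # the actual bus the device is plugged into
        self.child_buses = []     # buses provided by this device

    def plug(self, bus):
        # A device may only plug into a bus of the type it declares.
        if self.bus_type != bus.bus_type:
            raise ValueError(f"{self.name} cannot plug into a {bus.bus_type} bus")
        self.parent_bus = bus
        bus.children.append(self)

    def add_bus(self, name, bus_type):
        bus = Bus(name, bus_type, parent=self)
        self.child_buses.append(bus)
        return bus


def walk(bus):
    """Return the names of all devices reachable from `bus`, depth first."""
    names = []
    for dev in bus.children:
        names.append(dev.name)
        for child in dev.child_buses:
            names.extend(walk(child))
    return names


# Tree rooted at the main system bus, as in the `info qtree` output above.
root = Bus("main-system-bus", "System")
pcihost = Device("q35-pcihost", "System")
pcihost.plug(root)
pcie = pcihost.add_bus("pcie.0", "PCIE")
ctrl = Device("nvme-ctrl-0", "PCIE")
ctrl.plug(pcie)
ctrl_bus = ctrl.add_bus("nvme-ctrl-0", "nvme-bus")
ns0 = Device("nvme-ns-0", "nvme-bus")
ns0.plug(ctrl_bus)

# A bus-less device that provides its own bus: nothing links it to the root.
subsys = Device("nvme-subsys-0")
subsys_bus = subsys.add_bus("nvme-subsys-0", "nvme-bus")
ns1 = Device("nvme-ns-1", "nvme-bus")
ns1.plug(subsys_bus)

visible = walk(root)
print(visible)
```

In this model nvme-ns-1 is plugged into a perfectly valid bus, yet it is absent from the walk because its bus hangs off a device with no parent bus.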
>
> Makes sense.
>
>>>                               (I wonder if this should actually just
>>> have been an -object?).
>>
>>Does nvme-subsys expose virtual hardware to the guest?  Memory, IRQs,
>>...
>>
>>If yes, it needs to be a device.
>>
>>If no, object may be more appropriate.  Tell us more about what it does.
>>
>
> It does not expose any virtual hardware. See below.
>
>>
>>>                         Anyway. The nvme device has a 'subsys' link
>>> parameter and we use this to manage the namespaces across the
>>> subsystem that may contain several nvme devices (controllers). The
>>> problem is that this doesn't work too well with unplugging since if the
>>> nvme device is `device_del`'ed, the nvme-ns devices on the nvme-bus
>>> are unrealized which is not what we want. We really want the
>>> namespaces to linger, preferably on an nvme-bus of the nvme-subsys
>>> device so they can be attached to other nvme devices that may show up
>>> (or already exist) in the subsystem.
>>>
>>> The core problem I'm having is that I can't seem to create an nvme-bus
>>> from the nvme-subsys device and make it available to the nvme-ns
>>> device on the command line:
>>>
>>>   -device nvme-subsys,id=nvme-subsys-0,...
>>>   -device nvme-ns,bus=nvme-subsys-0
>>>
>>> The above results in "No 'nvme-bus' bus found for device 'nvme-ns'",
>>> even though I do `qbus_create_inplace()` just like the nvme
>>> device. However, I *can* reparent the nvme-ns device in its realize()
>>> method, so if I instead define it like so:
>>>
>>>   -device nvme-subsys,id=nvme-subsys-0,...
>>>   -device nvme,id=nvme-ctrl-0,subsys=nvme-subsys-0
>>>   -device nvme-ns,bus=nvme-ctrl-0
>>>
>>> I can then call `qdev_set_parent_bus()` and set the parent bus to the
>>> bus created in the nvme-subsys device. This solves the problem since
>>> the namespaces are not "garbage collected" when the nvme device is
>>> removed, but it just feels wrong you know? Also, if possible, I'd of
>>> course really like to retain the nice entries in `info qtree`.
>>
>>I'm afraid I'm too ignorant on NVME to give useful advice.
>>
>>Can you give us a brief primer on the aspects of physical NVME devices
>>you'd like to model in QEMU?  What are "controllers", "namespaces", and
>>"subsystems", and how do they work together?
>>
>>Once we understand the relevant aspects of physical devices, we can
>>discuss how to best model them in QEMU.
>>
>
> An "NVM Subsystem" is basically just a term to talk about a collection
> of controllers and namespaces. A namespace is just a quantity of 
> non-volatile memory that the controller can use to store stuff on.
>
> Only the controller is a piece of virtual hardware. An example
> subsystem looks like this:
>
>
>           +------------------+     +-----------------+
>           |   controller A   |     |   controller B  |
>           +------------------+     +-----------------+
>           +--------++--------+     +--------++-------+
>           | NSID 1 || NSID 2 |     | NSID 3 | NSID 2 |
>           +--------++--------+     +--------++-------+
>           +--------+    |          +--------+    |
>           |  NS A  |    |          |  NS C  |    |
>           +--------+    |          +--------+    |
>                         |                        |
>                         +------------------------+
>                                      |
>                                  +--------+
>                                  |  NS B  |
>                                  +--------+
>
>
> This is the example in Figure 5 in the NVMe v1.4 specification. Here,
> we have two controllers (that we model with the 'nvme' pci-based
> device). Each controller has one "private" namespace (NS A and NS C)
> and shares one namespace (NS B). The namespace IDs are unique across
> the subsystem and are assigned by the controller when attached to a
> namespace.
>
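The layout in the figure can be written down as a small data model. This is a minimal sketch in Python under stated assumptions, not the QEMU implementation: the Subsystem class and its methods are hypothetical names, and the NSIDs are supplied by hand rather than assigned by a controller. It assumes the subsystem owns the namespaces and controllers only hold attachments, which is the behaviour asked for earlier in the thread (namespaces should linger when a controller goes away).

```python
class Subsystem:
    def __init__(self, name):
        self.name = name
        self.namespaces = {}    # NSID -> namespace label, unique across the subsystem
        self.controllers = {}   # controller name -> set of attached NSIDs

    def add_namespace(self, nsid, label):
        if nsid in self.namespaces:
            raise ValueError(f"NSID {nsid} already in use in {self.name}")
        self.namespaces[nsid] = label

    def add_controller(self, ctrl):
        self.controllers[ctrl] = set()

    def attach(self, ctrl, nsid):
        if nsid not in self.namespaces:
            raise KeyError(f"no namespace with NSID {nsid} in {self.name}")
        self.controllers[ctrl].add(nsid)

    def remove_controller(self, ctrl):
        # Only the attachments disappear; the namespaces stay in the subsystem.
        del self.controllers[ctrl]

    def shared(self):
        """NSIDs attached to more than one controller."""
        return sorted(
            nsid for nsid in self.namespaces
            if sum(nsid in attached for attached in self.controllers.values()) > 1
        )

    def unattached(self):
        """NSIDs that exist in the subsystem but are attached to no controller."""
        attached = set().union(*self.controllers.values())
        return sorted(set(self.namespaces) - attached)


subsys = Subsystem("nvme-subsys-0")
subsys.add_namespace(1, "NS A")
subsys.add_namespace(2, "NS B")
subsys.add_namespace(3, "NS C")

subsys.add_controller("A")
subsys.add_controller("B")
subsys.attach("A", 1)
subsys.attach("A", 2)
subsys.attach("B", 3)
subsys.attach("B", 2)

shared_before = subsys.shared()

# Unplugging controller A leaves NS A behind as an unattached namespace.
subsys.remove_controller("A")
unattached_after = subsys.unattached()
print(shared_before, unattached_after)
```

With ownership placed in the subsystem, removing controller A drops only its attachments: NS A remains available for another controller to attach, and NS B stays attached to controller B.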
> We use the 'nvme-ns' device (TYPE_DEVICE) to model the namespaces, and
> I guess this could also just have been an -object, not sure if
> we can change that now. The 'nvme-ns' device mostly exists to hold the
> block backend configuration and related namespace-only
> parameters. Prior to the introduction of subsystems, while we could
> have multiple controllers on the PCI bus, they could not share
> namespaces. To support this we introduced the 'nvme-subsys' device to
> allow the namespaces to be shared. This support is considered
> experimental, so I think we can get away with changing this to be an
> object.
>
> As I explained in my first mail, we attach namespaces to controllers
> through a bus. This means that even in the absence of an explicit 
> "bus=..." parameter on the nvme-ns device, it will "connect" on the
> most recently defined "nvme-bus" (of the most recently defined
> controller). With subsystems we would also like to model "unattached"
> namespaces that exist solely in the subsystem (i.e. NOT attached to
> any controllers). That is why I was trying to get the nvme-ns devices
> to attach to a bus created by the "non-bus-attached" subsystem
> device. And that is what I can't do. We could add a link property to
> the nvme-ns device instead, but then the bus magic in qemu would still
> happen and the namespace would end up "attached" (in qemu terms) to a
> controller anyway - and it would complain if we defined the namespace
> device prior to defining any controller devices since no usable bus
> exists.
>
> Thanks for helping out with this!



* Re: making a qdev bus available from a (non-qtree?) device
  2021-05-21  7:33     ` Markus Armbruster
@ 2021-05-21  8:48       ` Klaus Jensen
  0 siblings, 0 replies; 10+ messages in thread
From: Klaus Jensen @ 2021-05-21  8:48 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Paolo Bonzini, qemu-block, qemu-devel, stefanha, mst

On May 21 09:33, Markus Armbruster wrote:
>I'm about to drop off for two weeks of much-needed vacation.  I meant to
>study your explanation and give design advice before I leave, but I'm
>out of time.  Regrettable.  I hope Stefan can help you.  Or perhaps
>Paolo.  If you still have questions when I'm back, feel free to contact
>me again.
>

No worries Markus, enjoy :)

Thread overview: 10+ messages
2021-05-11 18:17 making a qdev bus available from a (non-qtree?) device Klaus Jensen
2021-05-12  3:39 ` Philippe Mathieu-Daudé
2021-05-12  8:00   ` Peter Maydell
2021-05-12 12:02 ` Markus Armbruster
2021-05-13 14:02   ` Stefan Hajnoczi
2021-05-17  6:55     ` Klaus Jensen
2021-05-17  9:56       ` Stefan Hajnoczi
2021-05-17  6:44   ` Klaus Jensen
2021-05-21  7:33     ` Markus Armbruster
2021-05-21  8:48       ` Klaus Jensen