qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* Question about scsi device hotplug (e.g scsi-hd)
@ 2020-03-31 16:16 Maxim Levitsky
  2020-04-01 15:09 ` Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Maxim Levitsky @ 2020-03-31 16:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

Hi!

I recently investigated an interesting issue related to repeated scsi-hd hotplug/hotunplug.
The bugzilla for it is here:
https://bugzilla.redhat.com/show_bug.cgi?id=1812399

In nutshell the issue that I think that I found and I would like to ask your opinion on it,
since I don't have enough experience to be 100% sure that I didn't miss something  is this:

When a new device is hotplugged via monitor, the qdev_device_add first puts the device on
the bus where the user requested it to be, and then calls the device's .realize.

However for scsi bus, each time a new request is sent from the guest, the scsi adapter drivers
(e.g virtio-scsi) call scsi_device_find to find the LUN's driver to dispatch the request to,
and scsi_device_find will return the added device as soon as it is placed on the bus.

Thus between the point when the new device is placed on the bus and until the end of the .realize,
the device can be accessed by the guest when it is not yet realized or partially realized as
happens in the bugreport.

What do you think about it?

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-03-31 16:16 Question about scsi device hotplug (e.g scsi-hd) Maxim Levitsky
@ 2020-04-01 15:09 ` Stefan Hajnoczi
  2020-04-01 16:36   ` Paolo Bonzini
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2020-04-01 15:09 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Paolo Bonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1499 bytes --]

On Tue, Mar 31, 2020 at 07:16:23PM +0300, Maxim Levitsky wrote:
> Hi!
> 
> I recently investigated an interesting issue related to repeated scsi-hd hotplug/hotunplug.
> The bugzilla for it is here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1812399
> 
> In nutshell the issue that I think that I found and I would like to ask your opinion on it,
> since I don't have enough experience to be 100% sure that I didn't miss something  is this:
> 
> When a new device is hotplugged via monitor, the qdev_device_add first puts the device on
> the bus where the user requested it to be, and then calls the device's .realize.
> 
> However for scsi bus, each time a new request is sent from the guest, the scsi adapter drivers
> (e.g virtio-scsi) call scsi_device_find to find the LUN's driver to dispatch the request to,
> and scsi_device_find will return the added device as soon as it is placed on the bus.
> 
> Thus between the point when the new device is placed on the bus and until the end of the .realize,
> the device can be accessed by the guest when it is not yet realized or partially realized as
> happens in the bugreport.
> 
> What do you think about it?

Maybe aio_disable_external() is needed to postpone device emulation
until after realize has finished?

Virtqueue kick ioeventfds are marked "external" and won't be processed
while external events are disabled.  See also
virtio_queue_aio_set_host_notifier_handler() ->
aio_set_event_notifier().

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-04-01 15:09 ` Stefan Hajnoczi
@ 2020-04-01 16:36   ` Paolo Bonzini
  2020-04-02  5:37     ` Markus Armbruster
  2020-04-02  8:46     ` Maxim Levitsky
  0 siblings, 2 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-04-01 16:36 UTC (permalink / raw)
  To: Stefan Hajnoczi, Maxim Levitsky; +Cc: qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 437 bytes --]

On 01/04/20 17:09, Stefan Hajnoczi wrote:
>> What do you think about it?
>
> Maybe aio_disable_external() is needed to postpone device emulation
> until after realize has finished?
> 
> Virtqueue kick ioeventfds are marked "external" and won't be processed
> while external events are disabled.  See also
> virtio_queue_aio_set_host_notifier_handler() ->
> aio_set_event_notifier().

Yes, I think Stefan is right.

Paolo


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-04-01 16:36   ` Paolo Bonzini
@ 2020-04-02  5:37     ` Markus Armbruster
  2020-04-02  8:49       ` Paolo Bonzini
  2020-04-02  8:46     ` Maxim Levitsky
  1 sibling, 1 reply; 9+ messages in thread
From: Markus Armbruster @ 2020-04-02  5:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Stefan Hajnoczi, qemu-devel, Maxim Levitsky

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 01/04/20 17:09, Stefan Hajnoczi wrote:
>>> What do you think about it?
>>
>> Maybe aio_disable_external() is needed to postpone device emulation
>> until after realize has finished?
>> 
>> Virtqueue kick ioeventfds are marked "external" and won't be processed
>> while external events are disabled.  See also
>> virtio_queue_aio_set_host_notifier_handler() ->
>> aio_set_event_notifier().
>
> Yes, I think Stefan is right.

Is this issue limited to SCSI devices?



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-04-01 16:36   ` Paolo Bonzini
  2020-04-02  5:37     ` Markus Armbruster
@ 2020-04-02  8:46     ` Maxim Levitsky
  2020-04-03 12:41       ` Stefan Hajnoczi
  1 sibling, 1 reply; 9+ messages in thread
From: Maxim Levitsky @ 2020-04-02  8:46 UTC (permalink / raw)
  To: Paolo Bonzini, Stefan Hajnoczi; +Cc: qemu-devel

On Wed, 2020-04-01 at 18:36 +0200, Paolo Bonzini wrote:
> On 01/04/20 17:09, Stefan Hajnoczi wrote:
> > > What do you think about it?
> > 
> > Maybe aio_disable_external() is needed to postpone device emulation
> > until after realize has finished?

Isn't that virtio specific? In theory this issue can touch any driver that
behaves similar to scsi.

Also due to racing, the request might already be in virtqueue and the virtio-scsi iothread might have already
fetched it when we place the device on the bus.

In fact I don't see any locking in bus_add_child / scsi_device_find so in fact if the timing is right
and iothread calls the scsi_device_find while main thread adds the device on the bus that will cause
the whole thing go south.

IMHO the right solution for this is first to realize the device and then place it
on the bus, and even that I am not sure that will fix the race completely.

IMHO the correct solution here is to stop querying the bus children on the guest controlled IO path 
(scsi_device_find) but rather use the hotplug handlers to
store the active luns of the scsi device in parallel (and with proper locking).


Best regards,
	Maxim Levitsky


> > 
> > Virtqueue kick ioeventfds are marked "external" and won't be processed
> > while external events are disabled.  See also
> > virtio_queue_aio_set_host_notifier_handler() ->
> > aio_set_event_notifier().
> 
> Yes, I think Stefan is right.
> 
> Paolo
> 




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-04-02  5:37     ` Markus Armbruster
@ 2020-04-02  8:49       ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-04-02  8:49 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Stefan Hajnoczi, qemu-devel, Maxim Levitsky

On 02/04/20 07:37, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 01/04/20 17:09, Stefan Hajnoczi wrote:
>>>> What do you think about it?
>>>
>>> Maybe aio_disable_external() is needed to postpone device emulation
>>> until after realize has finished?
>>>
>>> Virtqueue kick ioeventfds are marked "external" and won't be processed
>>> while external events are disabled.  See also
>>> virtio_queue_aio_set_host_notifier_handler() ->
>>> aio_set_event_notifier().
>>
>> Yes, I think Stefan is right.
> 
> Is this issue limited to SCSI devices?

Yes, because they (and more specifically virtio-scsi) are the only ones
that can be processed

1) outside the BQL

2) by a device other than themselves (in this case, by the adapter).

Paolo



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-04-02  8:46     ` Maxim Levitsky
@ 2020-04-03 12:41       ` Stefan Hajnoczi
  2020-04-03 13:45         ` Markus Armbruster
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2020-04-03 12:41 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Paolo Bonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2511 bytes --]

On Thu, Apr 02, 2020 at 11:46:57AM +0300, Maxim Levitsky wrote:
> On Wed, 2020-04-01 at 18:36 +0200, Paolo Bonzini wrote:
> > On 01/04/20 17:09, Stefan Hajnoczi wrote:
> > > > What do you think about it?
> > > 
> > > Maybe aio_disable_external() is needed to postpone device emulation
> > > until after realize has finished?
> 
> Isn't that virtio specific? In theory this issue can touch any driver that
> behaves similar to scsi.

No, aio_disable_external() is a general solution for suspending file
descriptor handlers.  In practice virtio is the main user of external
fds (ioeventfd).  Devices that do not use ioeventfd will not perform
device emulation from the AioContext event loop and are therefore safe.

> Also due to racing, the request might already be in virtqueue and the virtio-scsi iothread might have already
> fetched it when we place the device on the bus.
> 
> In fact I don't see any locking in bus_add_child / scsi_device_find so in fact if the timing is right
> and iothread calls the scsi_device_find while main thread adds the device on the bus that will cause
> the whole thing go south.

Thread-safety with IOThreads is a separate issue and also worth fixing.
It requires a different and complementary solution from the
single-threaded ioeventfd bug we've been discussing so far.

It looks like virtio_scsi_acquire/release() were intended to provide
thread-safety for virtio-scsi:

  static bool virtio_scsi_data_plane_handle_cmd(VirtIODevice *vdev,
                                                VirtQueue *vq)
  {
      bool progress;
      VirtIOSCSI *s = VIRTIO_SCSI(vdev);

      virtio_scsi_acquire(s);
      assert(s->ctx && s->dataplane_started);
      progress = virtio_scsi_handle_cmd_vq(s, vq);
      virtio_scsi_release(s);
      return progress;
  }

If the monitor code calls virtio_scsi_acquire() then further request
processing will have to wait until the monitor code is finished.

> IMHO the right solution for this is first to realize the device and then place it
> on the bus, and even that I am not sure that will fix the race completely.
>
> IMHO the correct solution here is to stop querying the bus children on the guest controlled IO path 
> (scsi_device_find) but rather use the hotplug handlers to
> store the active luns of the scsi device in parallel (and with proper locking).

It would be nice to have a general solution in qdev core that makes
thread-safe hotplug easy for all devices.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-04-03 12:41       ` Stefan Hajnoczi
@ 2020-04-03 13:45         ` Markus Armbruster
  2020-04-03 13:51           ` Paolo Bonzini
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Armbruster @ 2020-04-03 13:45 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, Maxim Levitsky

Stefan Hajnoczi <stefanha@gmail.com> writes:

[...]
> It would be nice to have a general solution in qdev core that makes
> thread-safe hotplug easy for all devices.

Excuse my ignorance / confusion...

The "realize" concept exists to enable

    create device
    configure / wire up step by step without impact on the guest
    realize, device becomes "real" atomically

How and why does this break down with threads?



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about scsi device hotplug (e.g scsi-hd)
  2020-04-03 13:45         ` Markus Armbruster
@ 2020-04-03 13:51           ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-04-03 13:51 UTC (permalink / raw)
  To: Markus Armbruster, Stefan Hajnoczi; +Cc: qemu-devel, Maxim Levitsky

On 03/04/20 15:45, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@gmail.com> writes:
> 
> [...]
>> It would be nice to have a general solution in qdev core that makes
>> thread-safe hotplug easy for all devices.
> Excuse my ignorance / confusion...
> 
> The "realize" concept exists to enable
> 
>     create device
>     configure / wire up step by step without impact on the guest
>     realize, device becomes "real" atomically
> 
> How and why does this break down with threads?

After more discussions with Maxim, it's a bug in the SCSI layer.  It
looks for devices without checking that they have been realized.

Paolo



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-04-03 13:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-31 16:16 Question about scsi device hotplug (e.g scsi-hd) Maxim Levitsky
2020-04-01 15:09 ` Stefan Hajnoczi
2020-04-01 16:36   ` Paolo Bonzini
2020-04-02  5:37     ` Markus Armbruster
2020-04-02  8:49       ` Paolo Bonzini
2020-04-02  8:46     ` Maxim Levitsky
2020-04-03 12:41       ` Stefan Hajnoczi
2020-04-03 13:45         ` Markus Armbruster
2020-04-03 13:51           ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).