All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: Michal Privoznik <mprivozn@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	libvir-list@redhat.com, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [libvirt] qapi DEVICE_DELETED event issued *before* instance_finalize?!
Date: Mon, 05 Sep 2016 11:23:59 +0200	[thread overview]
Message-ID: <87d1ki3djk.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <5abeb25c-a8a9-9144-cb6f-724647c5f35a@redhat.com> (Michal Privoznik's message of "Fri, 2 Sep 2016 06:31:33 +0200")

Adding Paolo.

Michal Privoznik <mprivozn@redhat.com> writes:

> On 02.09.2016 01:11, Alex Williamson wrote:
>> Hey,
>> 
>> I'm out of my QOM depth, so I'll just beg for help in advance.  I
>> noticed in testing vfio-pci hotunplug that the host seems to be trying
>> to reclaim the device before QEMU is actually done with it, there's a
>> very short race where libvirt has seen the DEVICE_DELETED event and
>> tries to unbind the physical device from vfio-pci, the use count is
>> clearly non-zero because the host driver tries to send a device
>> request, but that event channel has already been torn down.  Nearly
>> immediately after, QEMU finally releases the device, but we can't do a
>> proper reset due to some issues with device references in the kernel.
>> 
>> When I run gdb on QEMU with breakpoints at
>> qapi_event_send_device_deleted() and vfio_instance_finalize(),  the
>> QAPI even happens first.  Clearly this is horribly wrong, right?  I
>> can't unmap my references to the vfio device file until my
>> instance_finalize is called, so I'm always going to have that open when
>> libvirt takes the DEVICE_DELETED event as a cue to return the device to
>> host drivers.  The call chains look like this:
>> 
>> #0  qapi_event_send_device_deleted (has_device=true, 
>>     device=0x7f5ca3e36fb0 "hostdev0", 
>>     path=0x7f5c89e84fe0 "/machine/peripheral/hostdev0", 
>>     errp=0x7f5ca241f9e8 <error_abort>) at qapi-event.c:412
>> #1  0x00007f5ca1701608 in device_unparent (obj=0x7f5ca43ffc00)
>>     at hw/core/qdev.c:1115
>> #2  0x00007f5ca18b7891 in object_finalize_child_property (obj=0x7f5ca380f500, 
>>     name=0x7f5ca3f21da0 "hostdev0", opaque=0x7f5ca43ffc00) at qom/object.c:1362
>> #3  0x00007f5ca18b56b2 in object_property_del_child (obj=0x7f5ca380f500, 
>>     child=0x7f5ca43ffc00, errp=0x0) at qom/object.c:422
>> #4  0x00007f5ca18b5790 in object_unparent (obj=0x7f5ca43ffc00)
>>     at qom/object.c:441
>> #5  0x00007f5ca16c1f31 in acpi_pcihp_eject_slot (s=0x7f5ca4c41268, bsel=0, 
>>     slots=4) at hw/acpi/pcihp.c:139
>> 
>> 
>> #0  vfio_instance_finalize (obj=0x7f5ca43ffc00)
>>     at /net/gimli/home/alwillia/Work/qemu.git/hw/vfio/pci.c:2731
>> #1  0x00007f5ca18b57c0 in object_deinit (obj=0x7f5ca43ffc00, 
>>     type=0x7f5ca376f490) at qom/object.c:448
>> #2  0x00007f5ca18b5831 in object_finalize (data=0x7f5ca43ffc00)
>>     at qom/object.c:462
>> #3  0x00007f5ca18b6782 in object_unref (obj=0x7f5ca43ffc00) at qom/object.c:896
>> #4  0x00007f5ca1550cc0 in memory_region_unref (mr=0x7f5ca43fff00)
>>     at /net/gimli/home/alwillia/Work/qemu.git/memory.c:1476
>> #5  0x00007f5ca1553886 in do_address_space_destroy (as=0x7f5ca43ffe10)
>>     at /net/gimli/home/alwillia/Work/qemu.git/memory.c:2272
>> 
>> 
>> It appears that DEVICE_DELETED only means the VM is done with the
>> device but libvirt is interpreting it as QEMU is done with the device.
>> Which is correct?  Do we need a new event or do we need to fix the
>> ordering of this event?  An ordering fix would be more compatible with
>> existing libvirt.  Thanks,
>
> What an interesting race. I think the even should be sent only after
> both guest and qemu are done with the device. Having two events looks
> like too much granularity to me. I mean, even if libvirt learns that
> guest has detached device, it still can't do anything until qemu clears
> its internal state.
>
> On the other hand, it is clearly documented that the DEVICE_DELETED
> event is sent as soon as guest acknowledges completion of device
> removal. So libvirt's buggy if we'd follow documentation strictly. But
> then again, I don't see much information value in "guest has detached
> device but qemu hasn't yet" event. Libvirt would ignore such event.

Unless I'm missing something, libvirt needs an event that signals "Guest
and QEMU are done with this device".  Current DEVICE_DELETED isn't.

Can we imagine a use for current DEVICE_DELETED, i.e. "Guest is done,
but QEMU isn't"?

Would anything break if we changed semantics of DEVICE_DELETED to what
libvirt actually needs?

If the answers are "no" and "no", let's do it.

  reply	other threads:[~2016-09-05  9:24 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-01 23:11 [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?! Alex Williamson
2016-09-02  4:31 ` [Qemu-devel] [libvirt] " Michal Privoznik
2016-09-05  9:23   ` Markus Armbruster [this message]
2016-09-05  9:36     ` Paolo Bonzini
2016-09-06  1:05       ` Laine Stump
2016-09-06  2:18       ` Alex Williamson
2016-09-06 14:02         ` Laine Stump
2016-09-06 14:26           ` Paolo Bonzini
2016-09-06 14:25         ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87d1ki3djk.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=mprivozn@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.