From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60515) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bgq8N-0006ZQ-Bu for qemu-devel@nongnu.org; Mon, 05 Sep 2016 05:24:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bgq8I-0005x3-5q for qemu-devel@nongnu.org; Mon, 05 Sep 2016 05:24:06 -0400 Received: from mx1.redhat.com ([209.132.183.28]:51716) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bgq8H-0005wx-Ua for qemu-devel@nongnu.org; Mon, 05 Sep 2016 05:24:02 -0400 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 83F55C0467D0 for ; Mon, 5 Sep 2016 09:24:01 +0000 (UTC) From: Markus Armbruster References: <20160901171148.697e696b@t450s.home> <5abeb25c-a8a9-9144-cb6f-724647c5f35a@redhat.com> Date: Mon, 05 Sep 2016 11:23:59 +0200 In-Reply-To: <5abeb25c-a8a9-9144-cb6f-724647c5f35a@redhat.com> (Michal Privoznik's message of "Fri, 2 Sep 2016 06:31:33 +0200") Message-ID: <87d1ki3djk.fsf@dusky.pond.sub.org> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: [Qemu-devel] [libvirt] qapi DEVICE_DELETED event issued *before* instance_finalize?! List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Michal Privoznik Cc: Alex Williamson , qemu-devel , libvir-list@redhat.com, Paolo Bonzini Adding Paolo. Michal Privoznik writes: > On 02.09.2016 01:11, Alex Williamson wrote: >> Hey, >> >> I'm out of my QOM depth, so I'll just beg for help in advance. I >> noticed in testing vfio-pci hotunplug that the host seems to be trying >> to reclaim the device before QEMU is actually done with it, there's a >> very short race where libvirt has seen the DEVICE_DELETED event and >> tries to unbind the physical device from vfio-pci, the use count is >> clearly non-zero because the host driver tries to send a device >> request, but that event channel has already been torn down. Nearly >> immediately after, QEMU finally releases the device, but we can't do a >> proper reset due to some issues with device references in the kernel. >> >> When I run gdb on QEMU with breakpoints at >> qapi_event_send_device_deleted() and vfio_instance_finalize(), the >> QAPI even happens first. Clearly this is horribly wrong, right? I >> can't unmap my references to the vfio device file until my >> instance_finalize is called, so I'm always going to have that open when >> libvirt takes the DEVICE_DELETED event as a cue to return the device to >> host drivers. The call chains look like this: >> >> #0 qapi_event_send_device_deleted (has_device=true, >> device=0x7f5ca3e36fb0 "hostdev0", >> path=0x7f5c89e84fe0 "/machine/peripheral/hostdev0", >> errp=0x7f5ca241f9e8 ) at qapi-event.c:412 >> #1 0x00007f5ca1701608 in device_unparent (obj=0x7f5ca43ffc00) >> at hw/core/qdev.c:1115 >> #2 0x00007f5ca18b7891 in object_finalize_child_property (obj=0x7f5ca380f500, >> name=0x7f5ca3f21da0 "hostdev0", opaque=0x7f5ca43ffc00) at qom/object.c:1362 >> #3 0x00007f5ca18b56b2 in object_property_del_child (obj=0x7f5ca380f500, >> child=0x7f5ca43ffc00, errp=0x0) at qom/object.c:422 >> #4 0x00007f5ca18b5790 in object_unparent (obj=0x7f5ca43ffc00) >> at qom/object.c:441 >> #5 0x00007f5ca16c1f31 in acpi_pcihp_eject_slot (s=0x7f5ca4c41268, bsel=0, >> slots=4) at hw/acpi/pcihp.c:139 >> >> >> #0 vfio_instance_finalize (obj=0x7f5ca43ffc00) >> at /net/gimli/home/alwillia/Work/qemu.git/hw/vfio/pci.c:2731 >> #1 0x00007f5ca18b57c0 in object_deinit (obj=0x7f5ca43ffc00, >> type=0x7f5ca376f490) at qom/object.c:448 >> #2 0x00007f5ca18b5831 in object_finalize (data=0x7f5ca43ffc00) >> at qom/object.c:462 >> #3 0x00007f5ca18b6782 in object_unref (obj=0x7f5ca43ffc00) at qom/object.c:896 >> #4 0x00007f5ca1550cc0 in memory_region_unref (mr=0x7f5ca43fff00) >> at /net/gimli/home/alwillia/Work/qemu.git/memory.c:1476 >> #5 0x00007f5ca1553886 in do_address_space_destroy (as=0x7f5ca43ffe10) >> at /net/gimli/home/alwillia/Work/qemu.git/memory.c:2272 >> >> >> It appears that DEVICE_DELETED only means the VM is done with the >> device but libvirt is interpreting it as QEMU is done with the device. >> Which is correct? Do we need a new event or do we need to fix the >> ordering of this event? An ordering fix would be more compatible with >> existing libvirt. Thanks, > > What an interesting race. I think the even should be sent only after > both guest and qemu are done with the device. Having two events looks > like too much granularity to me. I mean, even if libvirt learns that > guest has detached device, it still can't do anything until qemu clears > its internal state. > > On the other hand, it is clearly documented that the DEVICE_DELETED > event is sent as soon as guest acknowledges completion of device > removal. So libvirt's buggy if we'd follow documentation strictly. But > then again, I don't see much information value in "guest has detached > device but qemu hasn't yet" event. Libvirt would ignore such event. Unless I'm missing something, libvirt needs an event that signals "Guest and QEMU are done with this device". Current DEVICE_DELETED isn't. Can we imagine a use for current DEVICE_DELETED, i.e. "Guest is done, but QEMU isn't"? Would anything break if we changed semantics of DEVICE_DELETED to what libvirt actually needs? If the answers are "no" and "no", let's do it.