Date: Tue, 28 Jun 2016 08:40:52 -0600
From: Alex Williamson
Message-ID: <20160628084052.1e85a730@t450s.home>
In-Reply-To: <7912dad0-0e37-603d-fdfe-bb4950b55f28@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume
To: Zhou Jie
Cc: izumi.taku@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com, Chen Fan, qemu-devel@nongnu.org, mst@redhat.com

On Tue, 28 Jun 2016 13:27:21 +0800
Zhou Jie wrote:

> Hi Alex,
>
> On 2016/6/28 11:58, Alex Williamson wrote:
> > On Tue, 28 Jun 2016 11:26:33 +0800
> > Zhou Jie wrote:
> >
> >> Hi Alex,
> >>
> >>> The INTx/MSI part needs further definition for the user. Are we
> >>> actually completely tearing down interrupts with the expectation that
> >>> the user will re-enable them or are we just masking them such that the
> >>> user needs to unmask? Also note that not all devices support DisINTx.
> >>
> >> After reset, the "Bus Master Enable" bit of "Command Register"
> >> should be cleared, so MSI/MSI-X interrupt Messages is still disabled.
> >> After reset, the "Interrupt Disable" bit of "Command Register"
> >> should be cleared, so INTx interrupts is enabled.
> >> If the device doesn't support INTx, "Interrupt Disable" bit will
> >> hardware to 0, it is OK here.
> >>
> >> After fatal-error occurs, the user should reset the device and
> >> reinitialize the device.
> >> So I disable the interrupt before host reset the device,
> >> and let user to do the reinitialization.
> >
> > I'm dubious here. When DisINTx is not supported by the device or it's
> > marked broken in host quirks, then we can't trust the device to stop
> > sending INTx. It's hardwired to zero, meaning that it doesn't work or
> > it's been found to be broken in other ways. So COMMAND register
> > masking is not sufficient for all devices.
> For Endpoints that generate INTx interrupts, this bit is required.
> For Endpoints that do not generate INTx interrupts this bit is
> optional. If not implemented, this bit must be hardwired to 0b.
> For Root Ports, Switch Ports, and Bridges that generate INTx
> interrupts on their own behalf, this bit is required.
>
> The above is from "7.5.1.1." of "PCI Express Base Specification 3.1a".
> So I think "Interrupt Disable" bit must be supported by the device
> which can generate INTx interrupts.

And yet we have struct pci_dev.broken_intx_masking and we test for
working DisINTx via pci_intx_mask_supported() rather than simply
looking for a PCIe device.
Some devices are broken and some simply don't follow the spec, so
you're going to need to deal with that or exclude those devices.

> > Also, any time we start
> > changing the state of the device from what the user expects, we risk
> > consistency problems. We need to consider how the user last saw the
> > device and whether we can legitimately expect them to handle the device
> > in a new state. If we expect the user to re-initialize the device then
> > would it be more correct to teardown all interrupt signaling such that
> > the device is effectively in the same state as initial handoff when the
> > vfio device fd is opened?
> Before the user re-initialize the device, host has reseted the device.

How does that happen? Aren't we notifying the user at the point the
error occurs, while the device is still in the process of being reset?
My question is: how does the user know that the host reset is complete
in order to begin their own re-initialization?

> The interrupt status will be cleared by hardware.
> So the hardware is the same as the state when the
> vfio device fd is opened.

The PCI-core in Linux will save and restore the device state around
reset. How do we know that vfio-pci itself is not racing that reset,
and whether PCI-core will restore the state including our interrupt
masking or a state without it? Do we need to restore the state to the
one we saved when we originally opened the device? Shouldn't that mean
we tear down the interrupt setup the user had prior to the error event?

> > How will the user know when the device is
> > ready to be reset? Which of the ioctls that you're blocking can they
> > poll w/o any unwanted side-effects or awkward interactions? Should
> > flag bits in the device info ioctl indicate not only support for this
> > behavior but also the current status? Thanks,
> I can block the reset ioctl and config write.
> I will not add flag for the device current status,
> because I don't depend on user to prevent awkward interactions.

Ok, so that's a reason to block rather than return -EAGAIN. Still, we
need some way to indicate to the user whether the device supports this
new interaction rather than the existing behavior. Thanks,

Alex
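
To make the Command register argument above concrete, here is a small
userspace model. It is only a sketch under stated assumptions, not
kernel or vfio-pci code. The two bit values are the standard Command
register bits (Bus Master Enable is bit 2, Interrupt Disable is bit 10).
Everything else (struct fake_dev, writable_mask, quirk_broken_intx and
the helper names) is hypothetical and exists only for illustration. The
probe toggles Interrupt Disable and reads it back, which is the general
idea behind testing for working DisINTx instead of trusting the spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI Command register bits. */
#define CMD_BUS_MASTER   0x0004u /* bit 2: Bus Master Enable */
#define CMD_INTX_DISABLE 0x0400u /* bit 10: Interrupt Disable (DisINTx) */

/* Hypothetical device model, for illustration only. */
struct fake_dev {
    uint16_t command;        /* current Command register value */
    uint16_t writable_mask;  /* bits the device actually lets us change */
    bool quirk_broken_intx;  /* stand-in for a host quirk marking DisINTx broken */
};

/* Write the Command register; hardwired bits silently keep their value. */
static void cmd_write(struct fake_dev *d, uint16_t val)
{
    assert(d != NULL);
    d->command = (uint16_t)((d->command & ~d->writable_mask) |
                            (val & d->writable_mask));
}

/* Model of a reset: the Command register returns to all zeroes. */
static void fake_reset(struct fake_dev *d)
{
    assert(d != NULL);
    d->command = 0;
}

/* MSI/MSI-X are memory writes, so they need Bus Master Enable set. */
static bool msi_can_signal(const struct fake_dev *d)
{
    return (d->command & CMD_BUS_MASTER) != 0;
}

/* INTx may be asserted whenever Interrupt Disable is clear. */
static bool intx_can_assert(const struct fake_dev *d)
{
    return (d->command & CMD_INTX_DISABLE) == 0;
}

/* Probe DisINTx: honor the quirk, then toggle the bit and read it back. */
static bool intx_mask_usable(struct fake_dev *d)
{
    assert(d != NULL);
    if (d->quirk_broken_intx)
        return false;

    uint16_t orig = d->command;
    cmd_write(d, (uint16_t)(orig ^ CMD_INTX_DISABLE));
    bool toggled = ((d->command ^ orig) & CMD_INTX_DISABLE) != 0;
    cmd_write(d, orig); /* restore whatever was there before the probe */
    return toggled;
}
```

In this model a freshly reset device cannot signal MSI but can still
assert INTx, and a device with the bit hardwired to zero or flagged by
a quirk reports that masking is not usable. Those are the cases where
masking through the COMMAND register alone is not sufficient.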