From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59878) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bHYri-00076S-8V for qemu-devel@nongnu.org; Mon, 27 Jun 2016 11:54:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bHYrc-0002Uw-Kn for qemu-devel@nongnu.org; Mon, 27 Jun 2016 11:54:25 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37992) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bHYrc-0002Us-CJ for qemu-devel@nongnu.org; Mon, 27 Jun 2016 11:54:20 -0400 Date: Mon, 27 Jun 2016 09:54:18 -0600 From: Alex Williamson Message-ID: <20160627095418.659e6e5f@t450s.home> In-Reply-To: <7746532f-2fad-1304-0df7-7cd25ba761af@cn.fujitsu.com> References: <1464315131-25834-1-git-send-email-zhoujie2011@cn.fujitsu.com> <20160527100655.60db8206@t450s.home> <30d1cd95-7f67-29cf-c55e-0565364d89ff@cn.fujitsu.com> <41b0c187-ade0-182e-46b5-afd3e99f1e36@cn.fujitsu.com> <20160620103226.0ff61b21@ul30vt.home> <20160620211306.66a6b249@t450s.home> <576935FC.1080503@easystack.cn> <20160621084443.330f932d@t450s.home> <20160621215626.71c99582@t450s.home> <113474d2-8408-db49-e7ef-8c6b736af866@cn.fujitsu.com> <468b752b-a161-902b-d4cc-489dfa18c21e@cn.fujitsu.com> <20160622094236.515549fa@t450s.home> <7746532f-2fad-1304-0df7-7cd25ba761af@cn.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Zhou Jie Cc: Chen Fan , mst@redhat.com, qemu-devel@nongnu.org, caoj.fnst@cn.fujitsu.com, izumi.taku@jp.fujitsu.com On Sat, 25 Jun 2016 09:24:19 +0800 Zhou Jie wrote: > Hi Alex, > > > We should never depend on the guest driver to behave in a certain way, > > but we need to prioritize what that actually means. vfio in the kernel > > has a responsibility first and foremost to the host kernel. User owned > > devices cannot be allowed to exploit or interfere with the host > > regardless of user behavior. The next priority is correct operation > > for the user. When the host kernel is handling the AER event between > > the error and resume notifies, it doesn't have device specific drivers, > > it's manipulating the device as a generic PCI device. That makes me > > think that vfio should not allow the user to interact (interfere) with > > the device during that process and that such interference can be > > limited to standard PCI level interactions. That means config space, > > and things that operate on config space (like interrupt ioctls and > > resets). On the QEMU side, we've sent a notification that an error > > occurred, how the user and the guest respond to that is beyond the > > concern of vfio in the kernel. If the user/guest driver continues to > > interact with resources on the device, that's fine, but I think vfio in > > the kernel does need to prevent the user from interfering with the PCI > > state of the device for that brief window when we know the host kernel > > is operating on the device. Otherwise the results are unpredictable > > and therefore unsupportable. Does that make sense? Thanks, > I understand. > > I want to alter the VFIO driver like following. > During err occurs and resume: > 1. Make config space read only. > Ignore config space writing to prevent the user from > interfering with the PCI state of the device. > User can get the error infomation by reading the config space. > 2. Disable INTx and MSI > Write "Command Register" to disable INTx and MSI. > 3. Do nothing for bar regions. > Guest driver may access bar regions. > It doesn't matter as device is going to be reset. > > The following code will be modified. > 1. vfio_pci_ioctl > add flag for aer support > 2. vfio_pci_ioctl > During err occurs and resume: > if (cmd == VFIO_DEVICE_SET_IRQS) return EAGAIN > if (cmd == VFIO_DEVICE_RESET) return EAGAIN > if (cmd == VFIO_DEVICE_GET_PCI_HOT_RESET_INFO) return EAGAIN > if (cmd == VFIO_DEVICE_PCI_HOT_RESET) return EAGAIN > 3. vfio_pci_write > During err occurs and resume: > block > 4. vfio_pci_aer_err_detected > Set aer state in "struct vfio_pci_device" > Write "Command Register" to disable INTx and MSI. > 5. vfio_pci_aer_resume > Clear aer state in "struct vfio_pci_device" > I don't need to enable INTx and MSI. > The device will be initalized by guest driver. The INTx/MSI part needs further definition for the user. Are we actually completely tearing down interrupts with the expectation that the user will re-enable them or are we just masking them such that the user needs to unmask? Also note that not all devices support DisINTx. Otherwise it seems like a reasonable approach, but I can't guarantee we won't find new issues along the way. For instance we'll need to test how -EAGAIN returns interact with existing QEMU and maybe decided whether there are cases that are better handled by doing an interruptible wait. Thanks, Alex