From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46183)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.williamson@redhat.com>) id 1bFZHB-00034y-89
	for qemu-devel@nongnu.org; Tue, 21 Jun 2016 23:56:30 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <alex.williamson@redhat.com>) id 1bFZH7-0001k8-5U
	for qemu-devel@nongnu.org; Tue, 21 Jun 2016 23:56:28 -0400
Received: from mx1.redhat.com ([209.132.183.28]:42309)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.williamson@redhat.com>) id 1bFZH6-0001jL-TZ
	for qemu-devel@nongnu.org; Tue, 21 Jun 2016 23:56:25 -0400
Date: Tue, 21 Jun 2016 21:56:26 -0600
From: Alex Williamson <alex.williamson@redhat.com>
Message-ID: <20160621215626.71c99582@t450s.home>
In-Reply-To: <be32e794-4ad7-a7b6-dbe2-e14d2c181c0b@cn.fujitsu.com>
References: <1464315131-25834-1-git-send-email-zhoujie2011@cn.fujitsu.com>
	<20160527100655.60db8206@t450s.home>
	<30d1cd95-7f67-29cf-c55e-0565364d89ff@cn.fujitsu.com>
	<41b0c187-ade0-182e-46b5-afd3e99f1e36@cn.fujitsu.com>
	<20160620103226.0ff61b21@ul30vt.home>
	<c12c77e8-e664-9b09-5380-7dd9e09ec4e2@cn.fujitsu.com>
	<20160620211306.66a6b249@t450s.home>
	<576935FC.1080503@easystack.cn>
	<20160621084443.330f932d@t450s.home>
	<be32e794-4ad7-a7b6-dbe2-e14d2c181c0b@cn.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume
 notification handler for aer resume
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Cc: Chen Fan <fan.chen@easystack.cn>, Chen Fan <chen.fan.fnst@cn.fujitsu.com>, izumi.taku@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com, qemu-devel@nongnu.org, mst@redhat.com

On Wed, 22 Jun 2016 11:28:50 +0800
Zhou Jie <zhoujie2011@cn.fujitsu.com> wrote:

> Hi Alex,
> 
> >> Hi Alex,
> >>       on kernel side, I think if we don't trust the user behaviors, we
> >> should disable the access of vfio-pci interface once vfio-pci driver
> >> got the error_detected, we should disable all access to vfio fd
> >> regardless whether the vfio-pci was assigned to a VM, we also can
> >> return a EAGAIN error if user try to access it during the reset period
> >> until the host reset finished.
> >>       on qemu side, when we got a error_detect, we pass through the
> >> aer error to guest directly, ignore all access to vfio-pci during this
> >> time, when qemu need to do a hot reset, we can retry to get the info
> >> from the get info ioctl until we got the info that vfio-pci has been
> >> reset finished, then do the hot_reset ioctl if need, the kernel should
> >> ensure the ioctl become
> >> accessible after host reset completed.
> >>  
> >
> > That sounds pretty thorough, the sticky point there is always disabling
> > the device mmaps w/o a revoke interface.  Do we invalidate the pfn
> > range and setup a fault handler that blocks on access?  I don't think
> > we have a whole lot of options, either block or sigbus, but having such
> > a mechanism might allow us to easily put a device in a "dead" state
> > where the user can't touch it, which could be useful for other purposes
> > too.  QEMU would also need to timeout after some number of reset
> > attempts and assume the device is not coming back.  Plus we'd need a
> > device flag to indicate this behavior.  Thanks,
> >
> > Alex  
> 
> In vfio I have some questions.
> 1. How can I disable the access by mmap?
>     We can disable all access to vfio fd by returning a EAGAIN error
>     if user try to access it during the reset period until the host
>     reset finished.
>     But about the bar region which is mapped by vfio_pci_mmap.
>     How can I disable it in vfio driver?
>     Even there is a way to do it,
>     how about the complexity to recover the mmap?

That's exactly the "sticky point" I refer to above; you'd need to
solve that problem.  MST would probably still argue that we don't need
to disable all those interfaces, since a userspace driver can already do
things like disable mmio space and then attempt to read from the mmio
space of the device.  So maybe the problem can be simplified to
non-device-specific interfaces, like config space access plus ioctls.  I
think we know we shouldn't be doing any of those between error and
resume notification.
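
A rough sketch of what gating only the non-device-specific paths might
look like.  Everything here is hypothetical: struct dev_gate, the
gate_*() helpers and gated_config_write() are illustration names, not
existing vfio-pci code, and a plain byte array stands in for config
space.  The only idea taken from the discussion is "return EAGAIN
between error and resume".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-device recovery state; not an existing vfio structure. */
enum recovery_state { DEV_NORMAL, DEV_IN_RECOVERY };

struct dev_gate {
    enum recovery_state state;
};

/* Would be called from the error_detected path. */
static void gate_error_detected(struct dev_gate *g)
{
    assert(g != NULL);
    g->state = DEV_IN_RECOVERY;
}

/* Would be called once the host reset has finished. */
static void gate_resume(struct dev_gate *g)
{
    assert(g != NULL);
    g->state = DEV_NORMAL;
}

/* Config space and ioctl entry points check this before touching the device. */
static int gate_check(const struct dev_gate *g)
{
    assert(g != NULL);
    return g->state == DEV_IN_RECOVERY ? -EAGAIN : 0;
}

/* Stand-in for a config space write: refused with -EAGAIN during recovery. */
static int gated_config_write(struct dev_gate *g, unsigned char *cfg,
                              size_t len, size_t off, unsigned char val)
{
    int ret = gate_check(g);

    if (ret)
        return ret;
    if (off >= len)
        return -EINVAL;
    cfg[off] = val;
    return 0;
}
```

The mmap'd BAR path is deliberately left out of the sketch, since that
is the part without an obvious answer.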

> In qemu I have following proposals.
> 1. Setup a fault handler that blocks on access of bar region.
>     So the data transmission will be blocked.
> 2. Disable vfio_pci_write_config, but keep vfio_pci_read_config
>     enabled.
>     The VM can get the error information by reading configure space.
>     But operation of writing the configure space will be ignored.
> 3. Get VFIO device information instead of receiving resume notification.
>     When I tested the non-fatal error.
>     I found that sometimes the qemu receive resume notification earlier
>     than error notification.
>     The notification receiving time between different eventfd is not
>     in the order of sending time.

I can imagine if events occur close enough together, QEMU will have
both events queued and which one we get a callback for first might
depend on the order they get tested.  Perhaps another reason not to
have a resume notifier.  Thanks,

Alex
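
P.S. The ordering effect is easy to reproduce outside QEMU.  A minimal
Linux-only sketch, assuming plain eventfd(2) and poll(2) rather than
QEMU's actual main loop: both eventfds are signaled, error first, and
the one handled first is simply whichever comes first in the scan
order.  first_ready() and demo_order() are made-up names for the
example.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Return the index of the first readable descriptor in fds, or -1. */
static int first_ready(struct pollfd *fds, int n)
{
    if (poll(fds, n, 0) <= 0)
        return -1;
    for (int i = 0; i < n; i++) {
        if (fds[i].revents & POLLIN)
            return i;
    }
    return -1;
}

/*
 * Signal the "error" eventfd, then the "resume" eventfd, and scan them.
 * Returns 0 if error is seen first, 1 if resume is seen first, -1 on failure.
 */
static int demo_order(int resume_listed_first)
{
    struct pollfd fds[2];
    uint64_t one = 1;
    int ret = -1;
    int idx;
    int err_fd = eventfd(0, EFD_NONBLOCK);
    int res_fd = eventfd(0, EFD_NONBLOCK);

    if (err_fd < 0 || res_fd < 0)
        goto out;

    /* Send order: error first, resume second. */
    if (write(err_fd, &one, sizeof(one)) != (ssize_t)sizeof(one) ||
        write(res_fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
        goto out;

    /* Scan order is chosen by the caller, independent of send order. */
    fds[0].fd = resume_listed_first ? res_fd : err_fd;
    fds[1].fd = resume_listed_first ? err_fd : res_fd;
    fds[0].events = fds[1].events = POLLIN;
    fds[0].revents = fds[1].revents = 0;

    idx = first_ready(fds, 2);
    if (idx >= 0)
        ret = (fds[idx].fd == res_fd);
out:
    if (err_fd >= 0)
        close(err_fd);
    if (res_fd >= 0)
        close(res_fd);
    return ret;
}
```

Calling demo_order(0) and demo_order(1) sends the events in the same
order both times; only the scan order differs.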