kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
To: Jacob Pan <jacob.jun.pan@linux.intel.com>,
	Auger Eric <eric.auger@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"yi.l.liu@intel.com" <yi.l.liu@intel.com>,
	Will Deacon <Will.Deacon@arm.com>,
	Robin Murphy <Robin.Murphy@arm.com>,
	"kevin.tian@intel.com" <kevin.tian@intel.com>,
	"ashok.raj@intel.com" <ashok.raj@intel.com>,
	Marc Zyngier <Marc.Zyngier@arm.com>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	Vincent Stehle <Vincent.Stehle@arm.com>
Subject: Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler
Date: Thu, 6 Jun 2019 19:54:05 +0100	[thread overview]
Message-ID: <2753d192-1c46-d78e-c425-0c828e48cde2@arm.com> (raw)
In-Reply-To: <20190605154553.0d00ad8d@jacob-builder>

On 05/06/2019 23:45, Jacob Pan wrote:
> On Tue, 4 Jun 2019 18:11:08 +0200
> Auger Eric <eric.auger@redhat.com> wrote:
> 
>> Hi Alex,
>>
>> On 6/4/19 12:31 AM, Alex Williamson wrote:
>>> On Sun, 26 May 2019 18:10:01 +0200
>>> Eric Auger <eric.auger@redhat.com> wrote:
>>>   
>>>> This patch registers a fault handler which records faults in
>>>> a circular buffer and then signals an eventfd. This buffer is
>>>> exposed within the fault region.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>
>>>> ---
>>>>
>>>> v3 -> v4:
>>>> - move iommu_unregister_device_fault_handler to vfio_pci_release
>>>> ---
>>>>  drivers/vfio/pci/vfio_pci.c         | 49
>>>> +++++++++++++++++++++++++++++ drivers/vfio/pci/vfio_pci_private.h
>>>> |  1 + 2 files changed, 50 insertions(+)
>>>>
>>>> diff --git a/drivers/vfio/pci/vfio_pci.c
>>>> b/drivers/vfio/pci/vfio_pci.c index f75f61127277..520999994ba8
>>>> 100644 --- a/drivers/vfio/pci/vfio_pci.c
>>>> +++ b/drivers/vfio/pci/vfio_pci.c
>>>> @@ -30,6 +30,7 @@
>>>>  #include <linux/vfio.h>
>>>>  #include <linux/vgaarb.h>
>>>>  #include <linux/nospec.h>
>>>> +#include <linux/circ_buf.h>
>>>>  
>>>>  #include "vfio_pci_private.h"
>>>>  
>>>> @@ -296,6 +297,46 @@ static const struct vfio_pci_regops
>>>> vfio_pci_fault_prod_regops = { .add_capability =
>>>> vfio_pci_fault_prod_add_capability, };
>>>>  
>>>> +int vfio_pci_iommu_dev_fault_handler(struct iommu_fault_event
>>>> *evt, void *data) +{
>>>> +	struct vfio_pci_device *vdev = (struct vfio_pci_device *)
>>>> data;
>>>> +	struct vfio_region_fault_prod *prod_region =
>>>> +		(struct vfio_region_fault_prod
>>>> *)vdev->fault_pages;
>>>> +	struct vfio_region_fault_cons *cons_region =
>>>> +		(struct vfio_region_fault_cons
>>>> *)(vdev->fault_pages + 2 * PAGE_SIZE);
>>>> +	struct iommu_fault *new =
>>>> +		(struct iommu_fault *)(vdev->fault_pages +
>>>> prod_region->offset +
>>>> +			prod_region->prod *
>>>> prod_region->entry_size);
>>>> +	int prod, cons, size;
>>>> +
>>>> +	mutex_lock(&vdev->fault_queue_lock);
>>>> +
>>>> +	if (!vdev->fault_abi)
>>>> +		goto unlock;
>>>> +
>>>> +	prod = prod_region->prod;
>>>> +	cons = cons_region->cons;
>>>> +	size = prod_region->nb_entries;
>>>> +
>>>> +	if (CIRC_SPACE(prod, cons, size) < 1)
>>>> +		goto unlock;
>>>> +
>>>> +	*new = evt->fault;
>>>> +	prod = (prod + 1) % size;
>>>> +	prod_region->prod = prod;
>>>> +	mutex_unlock(&vdev->fault_queue_lock);
>>>> +
>>>> +	mutex_lock(&vdev->igate);
>>>> +	if (vdev->dma_fault_trigger)
>>>> +		eventfd_signal(vdev->dma_fault_trigger, 1);
>>>> +	mutex_unlock(&vdev->igate);
>>>> +	return 0;
>>>> +
>>>> +unlock:
>>>> +	mutex_unlock(&vdev->fault_queue_lock);
>>>> +	return -EINVAL;
>>>> +}
>>>> +
>>>>  static int vfio_pci_init_fault_region(struct vfio_pci_device
>>>> *vdev) {
>>>>  	struct vfio_region_fault_prod *header;
>>>> @@ -328,6 +369,13 @@ static int vfio_pci_init_fault_region(struct
>>>> vfio_pci_device *vdev) header = (struct vfio_region_fault_prod
>>>> *)vdev->fault_pages; header->version = -1;
>>>>  	header->offset = PAGE_SIZE;
>>>> +
>>>> +	ret =
>>>> iommu_register_device_fault_handler(&vdev->pdev->dev,
>>>> +
>>>> vfio_pci_iommu_dev_fault_handler,
>>>> +					vdev);
>>>> +	if (ret)
>>>> +		goto out;
>>>> +
>>>>  	return 0;
>>>>  out:
>>>>  	kfree(vdev->fault_pages);
>>>> @@ -570,6 +618,7 @@ static void vfio_pci_release(void *device_data)
>>>>  	if (!(--vdev->refcnt)) {
>>>>  		vfio_spapr_pci_eeh_release(vdev->pdev);
>>>>  		vfio_pci_disable(vdev);
>>>> +
>>>> iommu_unregister_device_fault_handler(&vdev->pdev->dev);  
>>>
>>>
>>> But this can fail if there are pending faults which leaves a device
>>> reference and then the system is broken :(  
>> This series only features unrecoverable errors and for those the
>> unregistration cannot fail. Now unrecoverable errors were added I
>> admit this is confusing. We need to sort this out or clean the
>> dependencies.
> As Alex pointed out in 4/29, we can make
> iommu_unregister_device_fault_handler() never fail and clean up all the
> pending faults in the host IOMMU belong to that device. But the problem
> is that if a fault, such as PRQ, has already been injected into the
> guest, the page response may come back after handler is unregistered
> and registered again.

I'm trying to figure out if that would be harmful in any way. I guess it
can be a bit nasty if we handle the page response right after having
injected a new page request that uses the same PRGI. In any other case we
discard the page response, but here we forward it to the endpoint and:

* If the response status is success, endpoint retries the translation. The
guest probably hasn't had time to handle the new page request and
translation will fail, which may lead the endpoint to give up (two
unsuccessful translation requests). Or send a new request

* otherwise the endpoint won't retry the access, and could also disable
PRI if the status is failure.

> We need a way to reject such page response belong
> to the previous life of the handler. Perhaps a sync call to the guest
> with your fault queue eventfd? I am not sure.

We could simply expect the device driver not to send any page response
after unregistering the fault handler. Is there any reason VFIO would need
to unregister and re-register the fault handler on a live guest?

Thanks,
Jean

  reply	other threads:[~2019-06-06 18:54 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-26 16:09 [PATCH v8 00/29] SMMUv3 Nested Stage Setup Eric Auger
2019-05-26 16:09 ` [PATCH v8 01/29] driver core: Add per device iommu param Eric Auger
2019-05-26 16:09 ` [PATCH v8 02/29] iommu: Introduce device fault data Eric Auger
2019-05-26 16:09 ` [PATCH v8 03/29] iommu: Introduce device fault report API Eric Auger
2019-05-26 16:09 ` [PATCH v8 04/29] iommu: Add recoverable fault reporting Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-04 15:48     ` Jacob Pan
2019-05-26 16:09 ` [PATCH v8 05/29] iommu: Add a timeout parameter for PRQ response Eric Auger
2019-06-03 22:32   ` Alex Williamson
2019-06-04 10:52     ` Jean-Philippe Brucker
2019-06-04 15:50       ` Jacob Pan
2019-05-26 16:09 ` [PATCH v8 06/29] trace/iommu: Add sva trace events Eric Auger
2019-05-26 16:09 ` [PATCH v8 07/29] iommu: Use device fault trace event Eric Auger
2019-05-26 16:09 ` [PATCH v8 08/29] iommu: Introduce attach/detach_pasid_table API Eric Auger
2019-05-26 16:09 ` [PATCH v8 09/29] iommu: Introduce cache_invalidate API Eric Auger
2019-05-26 16:09 ` [PATCH v8 10/29] iommu: Introduce bind/unbind_guest_msi Eric Auger
2019-05-26 16:09 ` [PATCH v8 11/29] iommu/arm-smmu-v3: Maintain a SID->device structure Eric Auger
2019-05-26 16:09 ` [PATCH v8 12/29] iommu/smmuv3: Dynamically allocate s1_cfg and s2_cfg Eric Auger
2019-05-26 16:09 ` [PATCH v8 13/29] iommu/smmuv3: Get prepared for nested stage support Eric Auger
2019-05-26 16:09 ` [PATCH v8 14/29] iommu/smmuv3: Implement attach/detach_pasid_table Eric Auger
2019-05-26 16:09 ` [PATCH v8 15/29] iommu/smmuv3: Introduce __arm_smmu_tlb_inv_asid/s1_range_nosync Eric Auger
2019-05-26 16:09 ` [PATCH v8 16/29] iommu/smmuv3: Implement cache_invalidate Eric Auger
2019-05-26 16:09 ` [PATCH v8 17/29] dma-iommu: Implement NESTED_MSI cookie Eric Auger
2019-05-26 16:09 ` [PATCH v8 18/29] iommu/smmuv3: Nested mode single MSI doorbell per domain enforcement Eric Auger
2019-05-26 16:09 ` [PATCH v8 19/29] iommu/smmuv3: Enforce incompatibility between nested mode and HW MSI regions Eric Auger
2019-05-26 16:09 ` [PATCH v8 20/29] iommu/smmuv3: Implement bind/unbind_guest_msi Eric Auger
2019-05-26 16:09 ` [PATCH v8 21/29] iommu/smmuv3: Report non recoverable faults Eric Auger
2019-05-26 16:09 ` [PATCH v8 22/29] vfio: VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE Eric Auger
2019-06-03 22:32   ` Alex Williamson
2019-05-26 16:09 ` [PATCH v8 23/29] vfio: VFIO_IOMMU_CACHE_INVALIDATE Eric Auger
2019-06-14 12:38   ` Liu, Yi L
2019-06-14 13:17     ` Auger Eric
2019-05-26 16:09 ` [PATCH v8 24/29] vfio: VFIO_IOMMU_BIND/UNBIND_MSI Eric Auger
2019-06-03 22:32   ` Alex Williamson
2019-06-07  8:30     ` Auger Eric
2019-05-26 16:10 ` [PATCH v8 25/29] vfio-pci: Add a new VFIO_REGION_TYPE_NESTED region type Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-07  8:28     ` Auger Eric
2019-06-07 12:47       ` Jean-Philippe Brucker
2019-06-07 16:29       ` Alex Williamson
2019-05-26 16:10 ` [PATCH v8 26/29] vfio-pci: Register an iommu fault handler Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-04 16:11     ` Auger Eric
2019-06-05 22:45       ` Jacob Pan
2019-06-06 18:54         ` Jean-Philippe Brucker [this message]
2019-06-06 20:29           ` Jacob Pan
2019-06-07  7:02             ` Auger Eric
2019-06-07 10:28             ` Jean-Philippe Brucker
2019-06-07 17:43               ` Jacob Pan
2019-06-10 12:45                 ` Jean-Philippe Brucker
2019-06-10 21:31                   ` Jacob Pan
2019-06-11 13:14                     ` Jean-Philippe Brucker
2019-06-12 18:53                       ` Jacob Pan
2019-06-18 14:04                         ` Jean-Philippe Brucker
2019-06-19  0:19                           ` Jacob Pan
2019-06-19 11:44                             ` Jean-Philippe Brucker
2019-07-11 13:07                           ` Auger Eric
2019-06-07 12:48   ` Jean-Philippe Brucker
2019-06-07 14:18     ` Auger Eric
2019-05-26 16:10 ` [PATCH v8 27/29] vfio_pci: Allow to mmap the fault queue Eric Auger
2019-05-26 16:10 ` [PATCH v8 28/29] vfio-pci: Add VFIO_PCI_DMA_FAULT_IRQ_INDEX Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-04 16:11     ` Auger Eric
2019-05-26 16:10 ` [PATCH v8 29/29] vfio: Document nested stage control Eric Auger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2753d192-1c46-d78e-c425-0c828e48cde2@arm.com \
    --to=jean-philippe.brucker@arm.com \
    --cc=Marc.Zyngier@arm.com \
    --cc=Robin.Murphy@arm.com \
    --cc=Vincent.Stehle@arm.com \
    --cc=Will.Deacon@arm.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=eric.auger.pro@gmail.com \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peter.maydell@linaro.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).