KVM ARM Archive on lore.kernel.org
 help / color / Atom feed
From: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
To: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: "kevin.tian@intel.com" <kevin.tian@intel.com>,
	Vincent Stehle <Vincent.Stehle@arm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"ashok.raj@intel.com" <ashok.raj@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Marc Zyngier <Marc.Zyngier@arm.com>,
	Will Deacon <Will.Deacon@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	Robin Murphy <Robin.Murphy@arm.com>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>
Subject: Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler
Date: Mon, 10 Jun 2019 13:45:02 +0100
Message-ID: <e02b024f-6ebc-e8fa-c30c-5bf3f4b164d6@arm.com> (raw)
In-Reply-To: <20190607104301.6b1bbd74@jacob-builder>

On 07/06/2019 18:43, Jacob Pan wrote:
>>> So it seems we agree on the following:
>>> - iommu_unregister_device_fault_handler() will never fail
>>> - iommu driver cleans up all pending faults when handler is
>>> unregistered
>>> - assume device driver or guest not sending more page response
>>> _after_ handler is unregistered.
>>> - system will tolerate rare spurious response
>>>
>>> Sounds right?  
>>
>> Yes, I'll add that to the fault series
> Hold on a second please, I think we need more clarifications. Ashok
> pointed out to me that the spurious response can be harmful to other
> devices when it comes to mdev, where PRQ group id is not per PASID,
> device may reuse the group number and receiving spurious page response
> can confuse the entire PF. 

I don't understand how mdev differs from the non-mdev situation (but I
also still don't fully get how mdev+PASID will be implemented). Is the
following the case you're worried about?

  M#: mdev #

# Dev         Host        mdev drv       VFIO/QEMU        Guest
====================================================================
1                     <- reg(handler)
2 PR1 G1 P1    ->         M1 PR1 G1        inject ->     M1 PR1 G1
3                     <- unreg(handler)
4       <- PS1 G1 P1 (F)      |
5                        unreg(handler)
6                     <- reg(handler)
7 PR2 G1 P1    ->         M2 PR2 G1        inject ->     M2 PR2 G1
8                                                     <- M1 PS1 G1
9         accept ??    <- PS1 G1 P1
10                                                    <- M2 PS2 G1
11        accept       <- PS2 G1 P1


Step 2 injects PR1 for mdev#1. Step 4 auto-responds to PR1. Between
steps 5 and 6, we re-allocate PASID #1 for mdev #2. At step 7, we inject
PR2 for mdev #2. Step 8 is the spurious Page Response for PR1.

But I don't think step 9 is possible, because the mdev driver knows that
mdev #1 isn't using PASID #1 anymore. If the configuration is valid at
all (a page response channel still exists for mdev #1), then mdev #1 now
has a different PASID, e.g. #2, and step 9 would be "<- PS1 G1 P2" which
is rejected by iommu.c (no such pending page request). And step 11 will
be accepted.

If PASIDs are allocated through VCMD, then the situation seems similar:
at step 2 you inject "M1 PR1 G1 P1" into the guest, and at step 8 the
spurious response is "M1 PS1 G1 P1". If mdev #1 doesn't have PASID #1
anymore, then the mdev driver can check that the PASID is invalid and
can reject the page response.

> Having spurious page response is also not
> abiding the PCIe spec. exactly.

We are following the PCI spec though, in that we don't send page
responses for PRGIs that aren't in flight.

> We have two options here:
> 1. unregister handler will get -EBUSY if outstanding fault exists.
> 	-PROs: block offending device unbind only, eventually timeout
> 	will clear.
> 	-CONs: flooded faults can prevent clearing
> 2. unregister handle will block until all faults are clear in the host.
>    Never fails unregistration

Here the host completes the faults itself or wait for a response from
the guest? I'm slightly confused by the word "blocking". I'd rather we
don't introduce an uninterruptible sleep in the IOMMU core, since it's
unlikely to ever finish if we rely on the guest to complete things.

> 	-PROs: simple flow for VFIO, no need to worry about device
> 	holding reference.
> 	-CONs: spurious page response may come from
> 	misbehaving/malicious guest if guest does unregister and
> 	register back to back.

> It seems the only way to prevent spurious page response is to introduce
> a SW token or sequence# for each PRQ that needs a response. I still
> think option 2 is good.
> 
> Consider the following time line:
> decoding
>  PR#: page request
>  G#:  group #
>  P#:  PASID
>  S#:  sequence #
>  A#:  address
>  PS#: page response
>  (F): Fail
>  (S): Success
> 
> # Dev		Host		VFIO/QEMU	Guest
> ===========================================================	
> 1				<-reg(handler)
> 2 PR1G1S1A1	->		inject	->	PR1G1S1A1
> 3 PR2G1S2A2	->		inject	->	PR2G1S2A2
> 4.				<-unreg(handler)
> 5.	<-PR1G1S1A1(F)			| 
> 6.	<-PR2G1S2A2(F)			V
> 7.				<-unreg(handler)
> 8.				<-reg(handler)
> 9 PR3G1S3A1	->		inject	->	PR3G1S3A1
> 10.						<-PS1G1S1A1
> 11.		<reject S1>
> 11.		<accept S3>			<-PS3G1S3A1
> 12.PS3G1S3A1(S)
> 
> The spurious page response comes in at step 10 where the guest sends
> response for the request in step 1. But since the sequence # is 1, host
> IOMMU driver will reject it. At step 11, we accept page response for
> the matching sequence # then respond SUCCESS to the device.
> 
> So would it be OK to add this sequence# to iommu_fault and page
> response, or could event reuse the time stamp for that purpose.

With a PV interface we can do what we want, but it can't work with an
IOMMU emulation that only has 9 bits for the PRGI. I suppose we can add
the sequence number but we'll have to handle the case where it isn't
present in the page response (ie. accept it anyway).

Thanks,
Jean
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

  reply index

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-26 16:09 [PATCH v8 00/29] SMMUv3 Nested Stage Setup Eric Auger
2019-05-26 16:09 ` [PATCH v8 01/29] driver core: Add per device iommu param Eric Auger
2019-05-26 16:09 ` [PATCH v8 02/29] iommu: Introduce device fault data Eric Auger
2019-05-26 16:09 ` [PATCH v8 03/29] iommu: Introduce device fault report API Eric Auger
2019-05-26 16:09 ` [PATCH v8 04/29] iommu: Add recoverable fault reporting Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-04 15:48     ` Jacob Pan
2019-05-26 16:09 ` [PATCH v8 05/29] iommu: Add a timeout parameter for PRQ response Eric Auger
2019-06-03 22:32   ` Alex Williamson
2019-06-04 10:52     ` Jean-Philippe Brucker
2019-06-04 15:50       ` Jacob Pan
2019-05-26 16:09 ` [PATCH v8 06/29] trace/iommu: Add sva trace events Eric Auger
2019-05-26 16:09 ` [PATCH v8 07/29] iommu: Use device fault trace event Eric Auger
2019-05-26 16:09 ` [PATCH v8 08/29] iommu: Introduce attach/detach_pasid_table API Eric Auger
2019-05-26 16:09 ` [PATCH v8 09/29] iommu: Introduce cache_invalidate API Eric Auger
2019-05-26 16:09 ` [PATCH v8 10/29] iommu: Introduce bind/unbind_guest_msi Eric Auger
2019-05-26 16:09 ` [PATCH v8 11/29] iommu/arm-smmu-v3: Maintain a SID->device structure Eric Auger
2019-05-26 16:09 ` [PATCH v8 12/29] iommu/smmuv3: Dynamically allocate s1_cfg and s2_cfg Eric Auger
2019-05-26 16:09 ` [PATCH v8 13/29] iommu/smmuv3: Get prepared for nested stage support Eric Auger
2019-05-26 16:09 ` [PATCH v8 14/29] iommu/smmuv3: Implement attach/detach_pasid_table Eric Auger
2019-05-26 16:09 ` [PATCH v8 15/29] iommu/smmuv3: Introduce __arm_smmu_tlb_inv_asid/s1_range_nosync Eric Auger
2019-05-26 16:09 ` [PATCH v8 16/29] iommu/smmuv3: Implement cache_invalidate Eric Auger
2019-05-26 16:09 ` [PATCH v8 17/29] dma-iommu: Implement NESTED_MSI cookie Eric Auger
2019-05-26 16:09 ` [PATCH v8 18/29] iommu/smmuv3: Nested mode single MSI doorbell per domain enforcement Eric Auger
2019-05-26 16:09 ` [PATCH v8 19/29] iommu/smmuv3: Enforce incompatibility between nested mode and HW MSI regions Eric Auger
2019-05-26 16:09 ` [PATCH v8 20/29] iommu/smmuv3: Implement bind/unbind_guest_msi Eric Auger
2019-05-26 16:09 ` [PATCH v8 21/29] iommu/smmuv3: Report non recoverable faults Eric Auger
2019-05-26 16:09 ` [PATCH v8 22/29] vfio: VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE Eric Auger
2019-06-03 22:32   ` Alex Williamson
2019-05-26 16:09 ` [PATCH v8 23/29] vfio: VFIO_IOMMU_CACHE_INVALIDATE Eric Auger
2019-06-14 12:38   ` Liu, Yi L
2019-06-14 13:17     ` Auger Eric
2019-05-26 16:09 ` [PATCH v8 24/29] vfio: VFIO_IOMMU_BIND/UNBIND_MSI Eric Auger
2019-06-03 22:32   ` Alex Williamson
2019-06-07  8:30     ` Auger Eric
2019-05-26 16:10 ` [PATCH v8 25/29] vfio-pci: Add a new VFIO_REGION_TYPE_NESTED region type Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-07  8:28     ` Auger Eric
2019-06-07 12:47       ` Jean-Philippe Brucker
2019-06-07 16:29       ` Alex Williamson
2019-05-26 16:10 ` [PATCH v8 26/29] vfio-pci: Register an iommu fault handler Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-04 16:11     ` Auger Eric
2019-06-05 22:45       ` Jacob Pan
2019-06-06 18:54         ` Jean-Philippe Brucker
2019-06-06 20:29           ` Jacob Pan
2019-06-07  7:02             ` Auger Eric
2019-06-07 10:28             ` Jean-Philippe Brucker
2019-06-07 17:43               ` Jacob Pan
2019-06-10 12:45                 ` Jean-Philippe Brucker [this message]
2019-06-10 21:31                   ` Jacob Pan
2019-06-11 13:14                     ` Jean-Philippe Brucker
2019-06-12 18:53                       ` Jacob Pan
2019-06-18 14:04                         ` Jean-Philippe Brucker
2019-06-19  0:19                           ` Jacob Pan
2019-06-19 11:44                             ` Jean-Philippe Brucker
2019-07-11 13:07                           ` Auger Eric
2019-06-07 12:48   ` Jean-Philippe Brucker
2019-06-07 14:18     ` Auger Eric
2019-05-26 16:10 ` [PATCH v8 27/29] vfio_pci: Allow to mmap the fault queue Eric Auger
2019-05-26 16:10 ` [PATCH v8 28/29] vfio-pci: Add VFIO_PCI_DMA_FAULT_IRQ_INDEX Eric Auger
2019-06-03 22:31   ` Alex Williamson
2019-06-04 16:11     ` Auger Eric
2019-05-26 16:10 ` [PATCH v8 29/29] vfio: Document nested stage control Eric Auger

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e02b024f-6ebc-e8fa-c30c-5bf3f4b164d6@arm.com \
    --to=jean-philippe.brucker@arm.com \
    --cc=Marc.Zyngier@arm.com \
    --cc=Robin.Murphy@arm.com \
    --cc=Vincent.Stehle@arm.com \
    --cc=Will.Deacon@arm.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=eric.auger.pro@gmail.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

KVM ARM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvmarm/0 kvmarm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvmarm kvmarm/ https://lore.kernel.org/kvmarm \
		kvmarm@lists.cs.columbia.edu kvmarm@archiver.kernel.org
	public-inbox-index kvmarm


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/edu.columbia.cs.lists.kvmarm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox