kvm.vger.kernel.org archive mirror
* Support SVM without PASID
@ 2017-07-08 17:03 valmiki
       [not found] ` <a30962a6-2240-9263-96cc-10da1b179fcc-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: valmiki @ 2017-07-08 17:03 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA
  Cc: tianyu.lan-ral2JQCrhuEAvxtiuMwx3w,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w,
	jacob.jun.pan-ral2JQCrhuEAvxtiuMwx3w

Hi,

In the SMMUv3 architecture document I see "PASIDs are optional,
configurable, and of a size determined by the minimum
of the endpoint".

So if PASIDs are optional and not supported by the PCIe endpoint, how
can SVM be achieved?

How is the translation for a particular process virtual address
obtained?

Regards,
Valmiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
       [not found] ` <a30962a6-2240-9263-96cc-10da1b179fcc-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-07-08 20:02   ` Alex Williamson
  2017-07-09  3:15     ` valmiki
  0 siblings, 1 reply; 33+ messages in thread
From: Alex Williamson @ 2017-07-08 20:02 UTC (permalink / raw)
  To: valmiki
  Cc: tianyu.lan-ral2JQCrhuEAvxtiuMwx3w,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jacob.jun.pan-ral2JQCrhuEAvxtiuMwx3w

On Sat, 8 Jul 2017 22:33:01 +0530
valmiki <valmikibow-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> Hi,
> 
> In SMMUv3 architecture document i see "PASIDs are optional, 
> configurable, and of a size determined by the minimum
> of the endpoint".
> 
> So if PASID's are optional and not supported by PCIe end point, how SVM 
> can be achieved ?

It cannot be inferred from that statement that PASID support is not
required for SVM.  AIUI, SVM is a software feature enabled by numerous
"optional" hardware features, including PASID.  Features that are
optional per the hardware specification may be required for specific
software features.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-07-08 20:02   ` Alex Williamson
@ 2017-07-09  3:15     ` valmiki
  2017-07-09  9:29       ` Liu, Yi L
                         ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: valmiki @ 2017-07-09  3:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: iommu, kvm, linux-pci, jean-Philippe Brucker, tianyu.lan,
	kevin.tian, jacob.jun.pan

>> Hi,
>>
>> In SMMUv3 architecture document i see "PASIDs are optional,
>> configurable, and of a size determined by the minimum
>> of the endpoint".
>>
>> So if PASID's are optional and not supported by PCIe end point, how SVM
>> can be achieved ?
>
> It cannot be inferred from that statement that PASID support is not
> required for SVM.  AIUI, SVM is a software feature enabled by numerous
> "optional" hardware features, including PASID.  Features that are
> optional per the hardware specification may be required for specific
> software features.  Thanks,
>
Thanks for the information, Alex. Suppose an endpoint doesn't support
PASID: is it still possible to achieve SVM?
Are there any features in SMMUv3 with which we can achieve it?

Regards,
Valmiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: Support SVM without PASID
  2017-07-09  3:15     ` valmiki
@ 2017-07-09  9:29       ` Liu, Yi L
  2017-07-10  0:14       ` Bob Liu
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Liu, Yi L @ 2017-07-09  9:29 UTC (permalink / raw)
  To: valmiki, Alex Williamson
  Cc: Lan, Tianyu, Tian, Kevin, kvm, linux-pci, iommu, Pan, Jacob jun,
	jean-philippe.brucker

> -----Original Message-----
> From: iommu-bounces@lists.linux-foundation.org [mailto:iommu-
> bounces@lists.linux-foundation.org] On Behalf Of valmiki
> Sent: Sunday, July 9, 2017 11:16 AM
> To: Alex Williamson <alex.williamson@redhat.com>
> Cc: Lan, Tianyu <tianyu.lan@intel.com>; Tian, Kevin <kevin.tian@intel.com>;
> kvm@vger.kernel.org; linux-pci@vger.kernel.org; iommu@lists.linux-foundation.org;
> Pan, Jacob jun <jacob.jun.pan@intel.com>
> Subject: Re: Support SVM without PASID
> 
> >> Hi,
> >>
> >> In SMMUv3 architecture document i see "PASIDs are optional,
> >> configurable, and of a size determined by the minimum of the
> >> endpoint".
> >>
> >> So if PASID's are optional and not supported by PCIe end point, how
> >> SVM can be achieved ?
> >
> > It cannot be inferred from that statement that PASID support is not
> > required for SVM.  AIUI, SVM is a software feature enabled by numerous
> > "optional" hardware features, including PASID.  Features that are
> > optional per the hardware specification may be required for specific
> > software features.  Thanks,
> >
> Thanks for the information Alex. Suppose if an End point doesn't support PASID, is it
> still possible to achieve SVM ?
> Are there any such features in SMMUv3 with which we can achieve it ?

If the endpoint has no PASID support, I don't think it is SVM capable. For the
SMMU, maybe you can get more info from Jean.

Regards,
Yi L

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-07-09  3:15     ` valmiki
  2017-07-09  9:29       ` Liu, Yi L
@ 2017-07-10  0:14       ` Bob Liu
  2017-07-10 19:31       ` Jerome Glisse
       [not found]       ` <73619426-6fcc-21ce-cfd4-8c66bde63f9a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  3 siblings, 0 replies; 33+ messages in thread
From: Bob Liu @ 2017-07-10  0:14 UTC (permalink / raw)
  To: valmiki, Alex Williamson
  Cc: tianyu.lan, kevin.tian, kvm, linux-pci, iommu, jacob.jun.pan

On 2017/7/9 11:15, valmiki wrote:
>>> Hi,
>>>
>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>> configurable, and of a size determined by the minimum
>>> of the endpoint".
>>>
>>> So if PASID's are optional and not supported by PCIe end point, how SVM
>>> can be achieved ?
>>
>> It cannot be inferred from that statement that PASID support is not
>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>> "optional" hardware features, including PASID.  Features that are
>> optional per the hardware specification may be required for specific
>> software features.  Thanks,
>>
> Thanks for the information Alex. Suppose if an End point doesn't support
> PASID, is it still possible to achieve SVM ?
> Are there any such features in SMMUv3 with which we can achieve it ?
> 

I don't think so.
But one option is for your device to have an internal MMU, e.g. an Nvidia GPU.

Thanks,
Bob Liu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-07-09  3:15     ` valmiki
  2017-07-09  9:29       ` Liu, Yi L
  2017-07-10  0:14       ` Bob Liu
@ 2017-07-10 19:31       ` Jerome Glisse
       [not found]         ` <20170710193141.GA3813-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
       [not found]       ` <73619426-6fcc-21ce-cfd4-8c66bde63f9a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  3 siblings, 1 reply; 33+ messages in thread
From: Jerome Glisse @ 2017-07-10 19:31 UTC (permalink / raw)
  To: valmiki
  Cc: Alex Williamson, tianyu.lan, kevin.tian, kvm, linux-pci, iommu,
	jacob.jun.pan

On Sun, Jul 09, 2017 at 08:45:57AM +0530, valmiki wrote:
> > > Hi,
> > > 
> > > In SMMUv3 architecture document i see "PASIDs are optional,
> > > configurable, and of a size determined by the minimum
> > > of the endpoint".
> > > 
> > > So if PASID's are optional and not supported by PCIe end point, how SVM
> > > can be achieved ?
> > 
> > It cannot be inferred from that statement that PASID support is not
> > required for SVM.  AIUI, SVM is a software feature enabled by numerous
> > "optional" hardware features, including PASID.  Features that are
> > optional per the hardware specification may be required for specific
> > software features.  Thanks,
> > 
> Thanks for the information Alex. Suppose if an End point doesn't support
> PASID, is it still possible to achieve SVM ?
> Are there any such features in SMMUv3 with which we can achieve it ?

You can achieve SVM in software; this is what HMM (1) is for. But the
hardware must have an MMU with features similar to those of a CPU MMU.
Devices like GPUs do have such an MMU.

You can also mix HMM with PASID/ATS to leverage device memory. HMM allows
you to use device memory inside a process address space for device threads
(i.e. device memory is still considered inaccessible from the CPU; only the
device can access it). Again, very useful for GPUs.

Cheers,
Jérôme

(1) https://lwn.net/Articles/726691/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
       [not found]       ` <73619426-6fcc-21ce-cfd4-8c66bde63f9a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-07-11 10:56         ` Jean-Philippe Brucker
       [not found]           ` <eb132dfa-6708-5898-c2ce-ce7ab08809b1-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-07-11 10:56 UTC (permalink / raw)
  To: valmiki, Alex Williamson
  Cc: tianyu.lan-ral2JQCrhuEAvxtiuMwx3w,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jacob.jun.pan-ral2JQCrhuEAvxtiuMwx3w

Hi Valmiki,

On 09/07/17 04:15, valmiki wrote:
>>> Hi,
>>>
>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>> configurable, and of a size determined by the minimum
>>> of the endpoint".
>>>
>>> So if PASID's are optional and not supported by PCIe end point, how SVM
>>> can be achieved ?
>>
>> It cannot be inferred from that statement that PASID support is not
>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>> "optional" hardware features, including PASID.  Features that are
>> optional per the hardware specification may be required for specific
>> software features.  Thanks,
>>
> Thanks for the information Alex. Suppose if an End point doesn't support
> PASID, is it still possible to achieve SVM ?
> Are there any such features in SMMUv3 with which we can achieve it ?

Not really, we don't plan to share the non-PASID context with a process.

In theory you could achieve something resembling SVM by assigning the
entire endpoint to userspace using VFIO, then use ATS+PRI capabilities
with a bind ioctl. If your device can do SR-IOV, then you can bind one
process per virtual function.

Unless we end up seeing lots of endpoints that implement PRI but not
PASID, I don't plan to add this to VFIO or SMMUv3.

For a PCIe endpoint, the requirements for SVM are ATS, PRI and PASID
enabled. In addition, the SMMU should support DVM (broadcast TLB
maintenance) and must be compatible with the MMU (page sizes, output
address size, ASID bits...)
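
As a rough illustration of the endpoint side of these requirements, here is
a minimal Python sketch that walks the PCIe extended capability list in a
raw configuration-space dump and reports which of ATS, PRI and PASID are
absent. The function names are made up for this example. It assumes the
dump comes from something like the sysfs "config" file of the device (which
typically needs root to read past the first 64 bytes), and it only checks
that the capabilities are present, not that they are enabled. It says
nothing about the SMMU-side requirements.

```python
import struct

# PCIe extended capability IDs, as defined by the PCIe specification.
EXT_CAP_ATS = 0x000F
EXT_CAP_PRI = 0x0013
EXT_CAP_PASID = 0x001B

EXT_CAP_START = 0x100  # extended capabilities start at this offset


def find_ext_caps(cfg: bytes) -> dict:
    """Return {capability ID: offset} for a raw config-space dump."""
    caps = {}
    seen = set()
    offset = EXT_CAP_START
    # Stop on a zero next pointer, a short dump, or a looping list.
    while EXT_CAP_START <= offset <= len(cfg) - 4 and offset not in seen:
        seen.add(offset)
        (header,) = struct.unpack_from("<I", cfg, offset)
        if header in (0, 0xFFFFFFFF):
            break
        caps.setdefault(header & 0xFFFF, offset)  # bits 15:0 hold the ID
        offset = (header >> 20) & 0xFFC           # bits 31:20 hold the next pointer
    return caps


def missing_svm_caps(cfg: bytes) -> list:
    """List the names of the SVM-related capabilities the endpoint lacks."""
    caps = find_ext_caps(cfg)
    required = {"ATS": EXT_CAP_ATS, "PRI": EXT_CAP_PRI, "PASID": EXT_CAP_PASID}
    return [name for name, cap_id in required.items() if cap_id not in caps]
```

An empty list from missing_svm_caps() means the endpoint advertises all
three capabilities; a device without PASID would yield ["PASID"].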

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
       [not found]         ` <20170710193141.GA3813-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-07-12 16:23           ` valmiki
  0 siblings, 0 replies; 33+ messages in thread
From: valmiki @ 2017-07-12 16:23 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: tianyu.lan-ral2JQCrhuEAvxtiuMwx3w,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jacob.jun.pan-ral2JQCrhuEAvxtiuMwx3w

On 7/11/2017 1:01 AM, Jerome Glisse wrote:
> On Sun, Jul 09, 2017 at 08:45:57AM +0530, valmiki wrote:
>>>> Hi,
>>>>
>>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>>> configurable, and of a size determined by the minimum
>>>> of the endpoint".
>>>>
>>>> So if PASID's are optional and not supported by PCIe end point, how SVM
>>>> can be achieved ?
>>>
>>> It cannot be inferred from that statement that PASID support is not
>>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>>> "optional" hardware features, including PASID.  Features that are
>>> optional per the hardware specification may be required for specific
>>> software features.  Thanks,
>>>
>> Thanks for the information Alex. Suppose if an End point doesn't support
>> PASID, is it still possible to achieve SVM ?
>> Are there any such features in SMMUv3 with which we can achieve it ?
>
> You can achieve SVM in software, this is what HMM is for. But the hardware
> must have an mmu with similar features as you get on CPU mmu. Device like
> GPU do have such MMU.
>
> You can also mix HMM with PASID/ATS to leverage device memory. HMM allows
> you to use device memory inside process address space for device threads
> (ie device memory is still consider as un-accessible from CPU, only device
> can access it). Again very useful for GPU.
>
Thanks Jerome, this is interesting and great work. Will try to explore 
more on this option.

Regards,
Valmiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
       [not found]           ` <eb132dfa-6708-5898-c2ce-ce7ab08809b1-5wv7dgnIgG8@public.gmane.org>
@ 2017-07-12 16:27             ` valmiki
  2017-07-12 16:48               ` Jean-Philippe Brucker
  0 siblings, 1 reply; 33+ messages in thread
From: valmiki @ 2017-07-12 16:27 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Alex Williamson
  Cc: tianyu.lan-ral2JQCrhuEAvxtiuMwx3w,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jacob.jun.pan-ral2JQCrhuEAvxtiuMwx3w

On 7/11/2017 4:26 PM, Jean-Philippe Brucker wrote:
> Hi Valmiki,
>
> On 09/07/17 04:15, valmiki wrote:
>>>> Hi,
>>>>
>>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>>> configurable, and of a size determined by the minimum
>>>> of the endpoint".
>>>>
>>>> So if PASID's are optional and not supported by PCIe end point, how SVM
>>>> can be achieved ?
>>>
>>> It cannot be inferred from that statement that PASID support is not
>>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>>> "optional" hardware features, including PASID.  Features that are
>>> optional per the hardware specification may be required for specific
>>> software features.  Thanks,
>>>
>> Thanks for the information Alex. Suppose if an End point doesn't support
>> PASID, is it still possible to achieve SVM ?
>> Are there any such features in SMMUv3 with which we can achieve it ?
>
> Not really, we don't plan to share the non-PASID context with a process.
>
> In theory you could achieve something resembling SVM by assigning the
> entire endpoint to userspace using VFIO, then use ATS+PRI capabilities
> with a bind ioctl. If your device can do SR-IOV, then you can bind one
> process per virtual function.
>
> Unless we end up seeing lots of endpoints that implement PRI but not
> PASID, I don't plan to add this to VFIO or SMMUv3.
>
> For a PCIe endpoint, the requirements for SVM are ATS, PRI and PASID
> enabled. In addition, the SMMU should support DVM (broadcast TLB
> maintenance) and must be compatible with the MMU (page sizes, output
> address size, ASID bits...)
>
Thanks Jean.
The SMMU document states the following:
"When STE.S1DSS==0b10, a transaction without a SubstreamID is accepted
and uses the CD of Substream 0. Under this configuration, transactions
that arrive with SubstreamID 0 are aborted and an event recorded."

Is this mode supported in your previous series of SMMUv3 patches?
If it is supported, is it achieved through VFIO?

Regards,
Valmiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-07-12 16:27             ` valmiki
@ 2017-07-12 16:48               ` Jean-Philippe Brucker
  2017-07-22  2:05                 ` valmiki
  0 siblings, 1 reply; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-07-12 16:48 UTC (permalink / raw)
  To: valmiki, Alex Williamson
  Cc: iommu, kvm, linux-pci, tianyu.lan, kevin.tian, jacob.jun.pan

On 12/07/17 17:27, valmiki wrote:
> On 7/11/2017 4:26 PM, Jean-Philippe Brucker wrote:
>> Hi Valmiki,
>>
>> On 09/07/17 04:15, valmiki wrote:
>>>>> Hi,
>>>>>
>>>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>>>> configurable, and of a size determined by the minimum
>>>>> of the endpoint".
>>>>>
>>>>> So if PASID's are optional and not supported by PCIe end point, how SVM
>>>>> can be achieved ?
>>>>
>>>> It cannot be inferred from that statement that PASID support is not
>>>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>>>> "optional" hardware features, including PASID.  Features that are
>>>> optional per the hardware specification may be required for specific
>>>> software features.  Thanks,
>>>>
>>> Thanks for the information Alex. Suppose if an End point doesn't support
>>> PASID, is it still possible to achieve SVM ?
>>> Are there any such features in SMMUv3 with which we can achieve it ?
>>
>> Not really, we don't plan to share the non-PASID context with a process.
>>
>> In theory you could achieve something resembling SVM by assigning the
>> entire endpoint to userspace using VFIO, then use ATS+PRI capabilities
>> with a bind ioctl. If your device can do SR-IOV, then you can bind one
>> process per virtual function.
>>
>> Unless we end up seeing lots of endpoints that implement PRI but not
>> PASID, I don't plan to add this to VFIO or SMMUv3.
>>
>> For a PCIe endpoint, the requirements for SVM are ATS, PRI and PASID
>> enabled. In addition, the SMMU should support DVM (broadcast TLB
>> maintenance) and must be compatible with the MMU (page sizes, output
>> address size, ASID bits...)
>>
> Thanks Jean.
> In SMMU document it was quoted as follows
> "When STE.S1DSS==0b10, a transaction without a SubstreamID is accepted and
> uses the CD of Substream 0. Under this configuration, transactions that
> arrive with SubstreamID 0 are aborted and an event recorded."
> 
> Is this mode supported in your previous series of SMMUv3 patches ?
> If it is supported is it achieved through VFIO ?

Yes, STE.S1DSS==0b10 is the only supported mode with my patches. And in
VFIO, the non-PASID context (CD 0) is managed with
VFIO_IOMMU_MAP/UNMAP_DMA ioctls, mirroring the current behavior for
devices that don't support PASID. All other CDs, with PASID > 0, are
managed via the new BIND ioctl.
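
To make the S1DSS behaviour concrete, here is a small Python model of how
the SMMU picks a context descriptor for an incoming transaction. It is only
a sketch: the function name and the returned strings are invented for
illustration, the 0b10 case follows the sentence quoted from the
specification above, and the 0b00 (terminate) and 0b01 (bypass) cases
reflect my reading of the SMMUv3 architecture rather than anything stated
in this thread. It assumes substreams are enabled in the STE and that the
SubstreamID is in range.

```python
from typing import Optional

# STE.S1DSS encodings (0b11 is reserved).
S1DSS_TERMINATE = 0b00
S1DSS_BYPASS = 0b01
S1DSS_SSID0 = 0b10


def route_transaction(s1dss: int, ssid: Optional[int]) -> str:
    """Model the stage-1 handling of a transaction.

    ssid is None for a transaction that arrives without a SubstreamID
    (no PASID), otherwise the SubstreamID it carries.
    """
    if s1dss not in (S1DSS_TERMINATE, S1DSS_BYPASS, S1DSS_SSID0):
        raise ValueError("reserved S1DSS encoding")

    if ssid is None:
        if s1dss == S1DSS_SSID0:
            return "CD 0"            # non-PASID traffic uses the CD of Substream 0
        if s1dss == S1DSS_BYPASS:
            return "stage-1 bypass"
        return "abort"               # S1DSS_TERMINATE

    if ssid == 0 and s1dss == S1DSS_SSID0:
        return "abort"               # CD 0 is reserved for non-PASID traffic
    return f"CD {ssid}"
```

With S1DSS == 0b10, route_transaction(0b10, None) selects CD 0, which is
the context VFIO_IOMMU_MAP/UNMAP_DMA manages, while any PASID > 0 selects
its own CD.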

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-07-12 16:48               ` Jean-Philippe Brucker
@ 2017-07-22  2:05                 ` valmiki
       [not found]                   ` <41333a03-bf91-1152-4779-6579845609f6-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: valmiki @ 2017-07-22  2:05 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Alex Williamson
  Cc: iommu, kvm, linux-pci, tianyu.lan, kevin.tian, jacob.jun.pan

On 7/12/2017 10:18 PM, Jean-Philippe Brucker wrote:
> On 12/07/17 17:27, valmiki wrote:
>> On 7/11/2017 4:26 PM, Jean-Philippe Brucker wrote:
>>> Hi Valmiki,
>>>
>>> On 09/07/17 04:15, valmiki wrote:
>>>>>> Hi,
>>>>>>
>>>>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>>>>> configurable, and of a size determined by the minimum
>>>>>> of the endpoint".
>>>>>>
>>>>>> So if PASID's are optional and not supported by PCIe end point, how SVM
>>>>>> can be achieved ?
>>>>>
>>>>> It cannot be inferred from that statement that PASID support is not
>>>>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>>>>> "optional" hardware features, including PASID.  Features that are
>>>>> optional per the hardware specification may be required for specific
>>>>> software features.  Thanks,
>>>>>
>>>> Thanks for the information Alex. Suppose if an End point doesn't support
>>>> PASID, is it still possible to achieve SVM ?
>>>> Are there any such features in SMMUv3 with which we can achieve it ?
>>>
>>> Not really, we don't plan to share the non-PASID context with a process.
>>>
>>> In theory you could achieve something resembling SVM by assigning the
>>> entire endpoint to userspace using VFIO, then use ATS+PRI capabilities
>>> with a bind ioctl. If your device can do SR-IOV, then you can bind one
>>> process per virtual function.
>>>
>>> Unless we end up seeing lots of endpoints that implement PRI but not
>>> PASID, I don't plan to add this to VFIO or SMMUv3.
>>>
>>> For a PCIe endpoint, the requirements for SVM are ATS, PRI and PASID
>>> enabled. In addition, the SMMU should support DVM (broadcast TLB
>>> maintenance) and must be compatible with the MMU (page sizes, output
>>> address size, ASID bits...)
>>>
>> Thanks Jean.
>> In SMMU document it was quoted as follows
>> "When STE.S1DSS==0b10, a transaction without a SubstreamID is accepted and
>> uses the CD of Substream 0. Under this configuration, transactions that
>> arrive with SubstreamID 0 are aborted and an event recorded."
>>
>> Is this mode supported in your previous series of SMMUv3 patches ?
>> If it is supported is it achieved through VFIO ?
>
> Yes, STE.S1DSS==0b10 is the only supported mode with my patches. And in
> VFIO, the non-PASID context (CD 0) is managed with
> VFIO_IOMMU_MAP/UNMAP_DMA ioctls, mirroring the current behavior for
> devices that don't support PASID. All other CDs, with PASID > 0, are
> managed via the new BIND ioctl.

Thanks Jean, this helps a lot.
So I dug through your patches and understood that the BIND ioctls set up
stage-1 translations in the SMMU for an application.
If we use the VFIO_IOMMU_MAP/UNMAP_DMA ioctls, they set up stage-2
translations in the SMMU.
So without PASID support, can we only work with stage-2 translations?


Regards,
valmiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
       [not found]                   ` <41333a03-bf91-1152-4779-6579845609f6-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-08-01  8:26                     ` Jean-Philippe Brucker
       [not found]                       ` <564ba70b-db95-7fe0-86bb-bb4eefcd87ec-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-01  8:26 UTC (permalink / raw)
  To: valmiki, Alex Williamson
  Cc: tianyu.lan-ral2JQCrhuEAvxtiuMwx3w,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jacob.jun.pan-ral2JQCrhuEAvxtiuMwx3w

Hi Valmiki,

Sorry for the delay, I was away last week.

On 22/07/17 03:05, valmiki wrote:
> On 7/12/2017 10:18 PM, Jean-Philippe Brucker wrote:
>> On 12/07/17 17:27, valmiki wrote:
>>> On 7/11/2017 4:26 PM, Jean-Philippe Brucker wrote:
>>>> Hi Valmiki,
>>>>
>>>> On 09/07/17 04:15, valmiki wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>>>>>> configurable, and of a size determined by the minimum
>>>>>>> of the endpoint".
>>>>>>>
>>>>>>> So if PASID's are optional and not supported by PCIe end point, how
>>>>>>> SVM
>>>>>>> can be achieved ?
>>>>>>
>>>>>> It cannot be inferred from that statement that PASID support is not
>>>>>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>>>>>> "optional" hardware features, including PASID.  Features that are
>>>>>> optional per the hardware specification may be required for specific
>>>>>> software features.  Thanks,
>>>>>>
>>>>> Thanks for the information Alex. Suppose if an End point doesn't support
>>>>> PASID, is it still possible to achieve SVM ?
>>>>> Are there any such features in SMMUv3 with which we can achieve it ?
>>>>
>>>> Not really, we don't plan to share the non-PASID context with a process.
>>>>
>>>> In theory you could achieve something resembling SVM by assigning the
>>>> entire endpoint to userspace using VFIO, then use ATS+PRI capabilities
>>>> with a bind ioctl. If your device can do SR-IOV, then you can bind one
>>>> process per virtual function.
>>>>
>>>> Unless we end up seeing lots of endpoints that implement PRI but not
>>>> PASID, I don't plan to add this to VFIO or SMMUv3.
>>>>
>>>> For a PCIe endpoint, the requirements for SVM are ATS, PRI and PASID
>>>> enabled. In addition, the SMMU should support DVM (broadcast TLB
>>>> maintenance) and must be compatible with the MMU (page sizes, output
>>>> address size, ASID bits...)
>>>>
>>> Thanks Jean.
>>> In SMMU document it was quoted as follows
>>> "When STE.S1DSS==0b10, a transaction without a SubstreamID is accepted and
>>> uses the CD of Substream 0. Under this configuration, transactions that
>>> arrive with SubstreamID 0 are aborted and an event recorded."
>>>
>>> Is this mode supported in your previous series of SMMUv3 patches ?
>>> If it is supported is it achieved through VFIO ?
>>
>> Yes, STE.S1DSS==0b10 is the only supported mode with my patches. And in
>> VFIO, the non-PASID context (CD 0) is managed with
>> VFIO_IOMMU_MAP/UNMAP_DMA ioctls, mirroring the current behavior for
>> devices that don't support PASID. All other CDs, with PASID > 0, are
>> managed via the new BIND ioctl.
> 
> Thanks Jean, this helps a lot.
> So i digged through your patches and i understood that using BIND ioctls
> satge-1 translations are setup in SMMU for an application.
> If we use VFIO_IOMMU_MAP/UNMAP_DMA ioctls they are setting up stage-2
> translations in SMMU.
> So without PASID support we can only work with stage-2 translations ?

It depends on what type you use when registering the IOMMU with VFIO_SET_IOMMU:

* If the type is VFIO_TYPE1v2_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
  affects the stage-1 non-PASID context (already the case in mainline).
  In addition, with my patch the BIND ioctl will affect stage-1 PASID
  contexts, and bind process page directories to the device (host SVM).

* If the type is VFIO_TYPE1_NESTING_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
  will affect stage-2 mappings (already in mainline).
  With my SMMU patch series, the BIND ioctl is not supported in this mode.
  But in the future, BIND would allow managing stage-1 as well:
  - bind a process page directory (host SVM with added stage-2), or
  - bind a guest page directory (viommu), or
  - bind the full stage-1 context table (viommu).
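
The two cases above can be summarised as a lookup table. This Python sketch
is only a restatement of the list, not driver code: the numeric type values
are the ones defined in <linux/vfio.h>, the operation names "MAP_DMA" and
"BIND" and the helper affected_stage() are labels invented here, and BIND
refers to the ioctl proposed in my series, which is not in mainline.

```python
from typing import Optional

# IOMMU type values from <linux/vfio.h>.
VFIO_TYPE1v2_IOMMU = 3
VFIO_TYPE1_NESTING_IOMMU = 6

# What each operation affects on the SMMU, per IOMMU type.
# None means the operation is not supported in that mode (with the
# current series).
STAGE_BY_TYPE = {
    VFIO_TYPE1v2_IOMMU: {
        "MAP_DMA": "stage-1, non-PASID context",
        "BIND": "stage-1, PASID contexts",
    },
    VFIO_TYPE1_NESTING_IOMMU: {
        "MAP_DMA": "stage-2",
        "BIND": None,
    },
}


def affected_stage(iommu_type: int, op: str) -> Optional[str]:
    """Return which translation stage/context an operation modifies."""
    return STAGE_BY_TYPE[iommu_type][op]
```

So a device without PASID, used with VFIO_TYPE1v2_IOMMU, still gets
stage-1 translation through MAP_DMA; stage-2 only comes into play with the
nesting type.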

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
       [not found]                       ` <564ba70b-db95-7fe0-86bb-bb4eefcd87ec-5wv7dgnIgG8@public.gmane.org>
@ 2017-08-01 17:38                         ` valmiki
  2017-08-01 18:40                           ` Jean-Philippe Brucker
  2017-08-04  1:49                         ` Tian, Kevin
  1 sibling, 1 reply; 33+ messages in thread
From: valmiki @ 2017-08-01 17:38 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Alex Williamson
  Cc: tianyu.lan-ral2JQCrhuEAvxtiuMwx3w,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	jacob.jun.pan-ral2JQCrhuEAvxtiuMwx3w

On 8/1/2017 1:56 PM, Jean-Philippe Brucker wrote:
> Hi Valmiki,
>
> Sorry for the delay, I was away last week.
>
> On 22/07/17 03:05, valmiki wrote:
>> On 7/12/2017 10:18 PM, Jean-Philippe Brucker wrote:
>>> On 12/07/17 17:27, valmiki wrote:
>>>> On 7/11/2017 4:26 PM, Jean-Philippe Brucker wrote:
>>>>> Hi Valmiki,
>>>>>
>>>>> On 09/07/17 04:15, valmiki wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> In SMMUv3 architecture document i see "PASIDs are optional,
>>>>>>>> configurable, and of a size determined by the minimum
>>>>>>>> of the endpoint".
>>>>>>>>
>>>>>>>> So if PASID's are optional and not supported by PCIe end point, how
>>>>>>>> SVM
>>>>>>>> can be achieved ?
>>>>>>>
>>>>>>> It cannot be inferred from that statement that PASID support is not
>>>>>>> required for SVM.  AIUI, SVM is a software feature enabled by numerous
>>>>>>> "optional" hardware features, including PASID.  Features that are
>>>>>>> optional per the hardware specification may be required for specific
>>>>>>> software features.  Thanks,
>>>>>>>
>>>>>> Thanks for the information Alex. Suppose if an End point doesn't support
>>>>>> PASID, is it still possible to achieve SVM ?
>>>>>> Are there any such features in SMMUv3 with which we can achieve it ?
>>>>>
>>>>> Not really, we don't plan to share the non-PASID context with a process.
>>>>>
>>>>> In theory you could achieve something resembling SVM by assigning the
>>>>> entire endpoint to userspace using VFIO, then use ATS+PRI capabilities
>>>>> with a bind ioctl. If your device can do SR-IOV, then you can bind one
>>>>> process per virtual function.
>>>>>
>>>>> Unless we end up seeing lots of endpoints that implement PRI but not
>>>>> PASID, I don't plan to add this to VFIO or SMMUv3.
>>>>>
>>>>> For a PCIe endpoint, the requirements for SVM are ATS, PRI and PASID
>>>>> enabled. In addition, the SMMU should support DVM (broadcast TLB
>>>>> maintenance) and must be compatible with the MMU (page sizes, output
>>>>> address size, ASID bits...)
>>>>>
>>>> Thanks Jean.
>>>> In SMMU document it was quoted as follows
>>>> "When STE.S1DSS==0b10, a transaction without a SubstreamID is accepted and
>>>> uses the CD of Substream 0. Under this configuration, transactions that
>>>> arrive with SubstreamID 0 are aborted and an event recorded."
>>>>
>>>> Is this mode supported in your previous series of SMMUv3 patches ?
>>>> If it is supported is it achieved through VFIO ?
>>>
>>> Yes, STE.S1DSS==0b10 is the only supported mode with my patches. And in
>>> VFIO, the non-PASID context (CD 0) is managed with
>>> VFIO_IOMMU_MAP/UNMAP_DMA ioctls, mirroring the current behavior for
>>> devices that don't support PASID. All other CDs, with PASID > 0, are
>>> managed via the new BIND ioctl.
>>
>> Thanks Jean, this helps a lot.
>> So i digged through your patches and i understood that using BIND ioctls
>> satge-1 translations are setup in SMMU for an application.
>> If we use VFIO_IOMMU_MAP/UNMAP_DMA ioctls they are setting up stage-2
>> translations in SMMU.
>> So without PASID support we can only work with stage-2 translations ?
>
> It depends what type you use when registering the IOMMU with VFIO_SET_IOMMU:
>
> * If the type is VFIO_TYPE1v2_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
>   affects the stage-1 non-PASID context (already the case in mainline).
>   In addition, with my patch the BIND ioctl will affect stage-1 PASID
>   contexts, and bind process page directories to the device (host SVM).
>
> * If the type is VFIO_TYPE1_NESTING_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
>   will affect stage-2 mappings (already in mainline).
>   With my SMMU patch series, the BIND ioctl is not supported in this mode.
>   But in the future, BIND would allow to manage stage-1 as well:
>   - bind a process page directory (host SVM with added stage-2), or
>   - bind a guest page directory (viommu), or
>   - bind the full stage-1 context table (viommu).

Hi Jean, thanks a lot for this information.
I tried to understand how stage-1 translations are set up if we use the
VFIO_TYPE1v2_IOMMU & VFIO_IOMMU_MAP/UNMAP_DMA flow, but without
success: I couldn't find where the context descriptor is updated
with the stage-1 translation table details in this flow.
I found that in the BIND flow a new context descriptor is created and
assigned the required translation tables.

Can you please point me to the piece of code/patch series showing where
and how the context descriptor is updated for the
VFIO_IOMMU_MAP/UNMAP_DMA flow with VFIO_TYPE1v2_IOMMU?


Regards,
Valmiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-01 17:38                         ` valmiki
@ 2017-08-01 18:40                           ` Jean-Philippe Brucker
  2017-08-05  5:14                             ` valmiki
  0 siblings, 1 reply; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-01 18:40 UTC (permalink / raw)
  To: valmiki, Alex Williamson
  Cc: iommu, kvm, linux-pci, tianyu.lan, kevin.tian, jacob.jun.pan

On 01/08/17 18:38, valmiki wrote:
[...]
>>> So i digged through your patches and i understood that using BIND ioctls
>>> stage-1 translations are setup in SMMU for an application.
>>> If we use VFIO_IOMMU_MAP/UNMAP_DMA ioctls they are setting up stage-2
>>> translations in SMMU.
>>> So without PASID support we can only work with stage-2 translations ?
>>
>> It depends what type you use when registering the IOMMU with
>> VFIO_SET_IOMMU:
>>
>> * If the type is VFIO_TYPE1v2_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
>>   affects the stage-1 non-PASID context (already the case in mainline).
>>   In addition, with my patch the BIND ioctl will affect stage-1 PASID
>>   contexts, and bind process page directories to the device (host SVM).
>>
>> * If the type is VFIO_TYPE1_NESTING_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
>>   will affect stage-2 mappings (already in mainline).
>>   With my SMMU patch series, the BIND ioctl is not supported in this mode.
>>   But in the future, BIND would allow to manage stage-1 as well:
>>   - bind a process page directory (host SVM with added stage-2), or
>>   - bind a guest page directory (viommu), or
>>   - bind the full stage-1 context table (viommu).
> 
> Hi Jean, Thanks a lot for this information.
> I tried to understand how stage-1 translations are setup if we use
> VFIO_TYPE1v2_IOMMU & VFIO_IOMMU_MAP/UNMAP_DMA flow, but I'm not
> successful, I couldn't find when context descriptor details are updated
> with stage-1 translation table details in this flow.
> I found that in BIND flow a new context descriptor being created and
> assigned with required translation tables.
> 
> Can you please point to piece of code/patch series where and how context
> descriptor is updated for VFIO_IOMMU_MAP/UNMAP_DMA flow with
> VFIO_TYPE1v2_IOMMU.

What happens for the SMMU is that during initialization, the
VFIO_GROUP_SET_CONTAINER ioctl calls vfio_iommu_type1_attach_group.
This function creates a new iommu_domain, and then calls
iommu_attach_group(domain, group), which calls arm_smmu_attach_dev(domain,
dev) for all devices in the group.

If necessary we detach the existing domain, by clearing the STE (and all
contexts). Then, since the new domain is only partially initialized, we
call arm_smmu_domain_finalise, which allocates the page tables, and ends
up calling arm_smmu_domain_finalise_s1. That function prepares the
configuration of context descriptor 0, which is used for non-PASID
mappings. Afterwards, back in arm_smmu_attach_dev, context descriptor 0 is
written, along with the STE, and then the domain is live.

Then when userspace does VFIO_IOMMU_MAP/UNMAP_DMA, iommu_map/unmap are
called on that domain. arm_smmu_map/unmap update the page tables attached
to context descriptor 0 (but they don't modify the CD in any way).

This is the current behavior in mainline Linux. My patch series doesn't
change this, as we need to keep backward compatibility for all VFIO IOMMU
types.
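
The flow above can be sketched as a small userspace model. This is only
an illustration, not the arm-smmu-v3 driver code: every toy_* name, the
table sizes and the cd_writes counter are hypothetical. It shows the
point made above, that attach writes context descriptor 0 once, while
map/unmap only touch the page tables that CD 0 refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CDS   4    /* hypothetical context-table size */
#define TOY_NR_PAGES 16   /* hypothetical page-table capacity */

/* Toy context descriptor: refers to one set of stage-1 page tables. */
struct toy_cd {
    bool      valid;
    uint64_t *ttb;        /* stands in for the translation table base */
};

/* Toy domain: page tables plus the context table reachable from the STE. */
struct toy_domain {
    bool          finalised;
    uint64_t      pgtable[TOY_NR_PAGES]; /* iova pfn -> phys pfn, 0 = unmapped */
    struct toy_cd cd[TOY_NR_CDS];        /* cd[0] is the non-PASID context */
    unsigned int  cd_writes;             /* counts context descriptor writes */
};

/* Models domain finalisation: "allocate" page tables, prepare CD 0. */
static void toy_domain_finalise(struct toy_domain *d)
{
    for (size_t i = 0; i < TOY_NR_PAGES; i++)
        d->pgtable[i] = 0;
    d->cd[0].ttb = d->pgtable;
    d->finalised = true;
}

/* Models attach: finalise if needed, then write CD 0 and go live. */
static void toy_attach_dev(struct toy_domain *d)
{
    assert(d != NULL);
    if (!d->finalised)
        toy_domain_finalise(d);
    d->cd[0].valid = true;
    d->cd_writes++;
}

/* Models MAP_DMA: updates the page tables behind CD 0, never the CD. */
static int toy_map(struct toy_domain *d, size_t iova_pfn, uint64_t phys_pfn)
{
    if (!d->cd[0].valid || iova_pfn >= TOY_NR_PAGES || phys_pfn == 0)
        return -1;
    d->cd[0].ttb[iova_pfn] = phys_pfn;
    return 0;
}

/* Models UNMAP_DMA: clears the entry, again leaving the CD untouched. */
static int toy_unmap(struct toy_domain *d, size_t iova_pfn)
{
    if (!d->cd[0].valid || iova_pfn >= TOY_NR_PAGES)
        return -1;
    d->cd[0].ttb[iova_pfn] = 0;
    return 0;
}
```

In this model, calling toy_map() before toy_attach_dev() fails, and any
number of map/unmap calls after attach leave cd_writes at 1.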

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: Support SVM without PASID
       [not found]                       ` <564ba70b-db95-7fe0-86bb-bb4eefcd87ec-5wv7dgnIgG8@public.gmane.org>
  2017-08-01 17:38                         ` valmiki
@ 2017-08-04  1:49                         ` Tian, Kevin
  2017-08-04  9:42                           ` Jean-Philippe Brucker
  1 sibling, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2017-08-04  1:49 UTC (permalink / raw)
  To: Jean-Philippe Brucker, valmiki, Alex Williamson
  Cc: Lan, Tianyu, linux-pci, iommu, Pan, Jacob jun, kvm

> From: Jean-Philippe Brucker
> Sent: Tuesday, August 1, 2017 4:26 PM
> 
> It depends what type you use when registering the IOMMU with
> VFIO_SET_IOMMU:
> 
> * If the type is VFIO_TYPE1v2_IOMMU, then
> VFIO_IOMMU_MAP/UNMAP_DMA
>   affects the stage-1 non-PASID context (already the case in mainline).
>   In addition, with my patch the BIND ioctl will affect stage-1 PASID
>   contexts, and bind process page directories to the device (host SVM).
> 
> * If the type is VFIO_TYPE1_NESTING_IOMMU, then
> VFIO_IOMMU_MAP/UNMAP_DMA
>   will affect stage-2 mappings (already in mainline).
>   With my SMMU patch series, the BIND ioctl is not supported in this mode.
>   But in the future, BIND would allow to manage stage-1 as well:
>   - bind a process page directory (host SVM with added stage-2), or

I thought host SVM would only go through VFIO_TYPE1v2_IOMMU,
since you said in a previous explanation that stage-2 in the ARM SMMU
is only for GPA->HPA usage. Then what does "host SVM with added
stage-2" mean here?

>   - bind a guest page directory (viommu), or
>   - bind the full stage-1 context table (viommu).
> 
> Thanks,
> Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-04  1:49                         ` Tian, Kevin
@ 2017-08-04  9:42                           ` Jean-Philippe Brucker
  2017-08-11  6:29                             ` Tian, Kevin
  2017-08-11 16:25                             ` Raj, Ashok
  0 siblings, 2 replies; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-04  9:42 UTC (permalink / raw)
  To: Tian, Kevin, valmiki, Alex Williamson
  Cc: iommu, kvm, linux-pci, Lan, Tianyu, Pan, Jacob jun

Hi Kevin,

On 04/08/17 02:49, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker
>> Sent: Tuesday, August 1, 2017 4:26 PM
>>
>> It depends what type you use when registering the IOMMU with
>> VFIO_SET_IOMMU:
>>
>> * If the type is VFIO_TYPE1v2_IOMMU, then
>> VFIO_IOMMU_MAP/UNMAP_DMA
>>   affects the stage-1 non-PASID context (already the case in mainline).
>>   In addition, with my patch the BIND ioctl will affect stage-1 PASID
>>   contexts, and bind process page directories to the device (host SVM).
>>
>> * If the type is VFIO_TYPE1_NESTING_IOMMU, then
>> VFIO_IOMMU_MAP/UNMAP_DMA
>>   will affect stage-2 mappings (already in mainline).
>>   With my SMMU patch series, the BIND ioctl is not supported in this mode.
>>   But in the future, BIND would allow to manage stage-1 as well:
>>   - bind a process page directory (host SVM with added stage-2), or
> 
> I thought host SVM will only go through VFIO_TYPE1v2_IOMMU,
> since you said stage-2 in ARM SMMU is only for GPA->HPA usage
> in previous explanation. then what does "host SVM with added 
> stage-2" mean here?

Ah, that's just a crazy idea I had. I'm not sure it is useful or worth
implementing, but it is one of the possibilities offered by nested translation.

Consider the situation where a userspace driver (no virtualization) is
built in a client-server fashion: the server controls a device and spawns
new processes (clients), each sharing a context with the device using its
own PASID. If the server wants to hide parts of the client address space
from the device (e.g. .text), then it could control stage-2 via MAP/UNMAP
to restrict the address space.

It would use different semantics of MAP/UNMAP though, as the ioctl would
only be used to define 1:1 translation windows, not pin memory.
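
As a rough illustration of what "1:1 translation windows" could mean,
here is a toy filter in C. It is a sketch under assumptions, not an
existing VFIO or SMMU interface: the s2_* names and the fixed window
count are hypothetical, and which address space the windows are
expressed in is left open, as it is in the idea above. An address
inside a window passes through unchanged; anything else is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define S2_MAX_WINDOWS 4   /* hypothetical limit */

struct s2_window {
    uint64_t start;
    uint64_t size;
};

struct s2_windows {
    struct s2_window w[S2_MAX_WINDOWS];
    size_t           nr;
};

/* Models MAP with the alternative semantics: open an identity window. */
static int s2_map_window(struct s2_windows *s, uint64_t start, uint64_t size)
{
    assert(s != NULL);
    if (s->nr == S2_MAX_WINDOWS || size == 0 || start + size < start)
        return -1;
    s->w[s->nr].start = start;
    s->w[s->nr].size = size;
    s->nr++;
    return 0;
}

/* Identity translation: succeeds only for addresses inside some window. */
static bool s2_translate(const struct s2_windows *s, uint64_t in, uint64_t *out)
{
    for (size_t i = 0; i < s->nr; i++) {
        if (in >= s->w[i].start && in - s->w[i].start < s->w[i].size) {
            *out = in;   /* 1:1, the address itself is not changed */
            return true;
        }
    }
    return false;
}
```

Note that nothing is pinned here: the windows only decide which
addresses are reachable, which is the difference in semantics mentioned
above.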

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-01 18:40                           ` Jean-Philippe Brucker
@ 2017-08-05  5:14                             ` valmiki
  2017-08-07 10:31                               ` Jean-Philippe Brucker
  0 siblings, 1 reply; 33+ messages in thread
From: valmiki @ 2017-08-05  5:14 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Alex Williamson
  Cc: iommu, kvm, linux-pci, tianyu.lan, kevin.tian, jacob.jun.pan

On 8/2/2017 12:10 AM, Jean-Philippe Brucker wrote:
> On 01/08/17 18:38, valmiki wrote:
> [...]
>>>> So i digged through your patches and i understood that using BIND ioctls
>>>> satge-1 translations are setup in SMMU for an application.
>>>> If we use VFIO_IOMMU_MAP/UNMAP_DMA ioctls they are setting up stage-2
>>>> translations in SMMU.
>>>> So without PASID support we can only work with stage-2 translations ?
>>>
>>> It depends what type you use when registering the IOMMU with
>>> VFIO_SET_IOMMU:
>>>
>>> * If the type is VFIO_TYPE1v2_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
>>>   affects the stage-1 non-PASID context (already the case in mainline).
>>>   In addition, with my patch the BIND ioctl will affect stage-1 PASID
>>>   contexts, and bind process page directories to the device (host SVM).
>>>
>>> * If the type is VFIO_TYPE1_NESTING_IOMMU, then VFIO_IOMMU_MAP/UNMAP_DMA
>>>   will affect stage-2 mappings (already in mainline).
>>>   With my SMMU patch series, the BIND ioctl is not supported in this mode.
>>>   But in the future, BIND would allow to manage stage-1 as well:
>>>   - bind a process page directory (host SVM with added stage-2), or
>>>   - bind a guest page directory (viommu), or
>>>   - bind the full stage-1 context table (viommu).
>>
>> Hi Jean, Thanks a lot for this information.
>> I tried to understand how stage-1 translations are setup if we use
>> VFIO_TYPE1v2_IOMMU & VFIO_IOMMU_MAP/UNMAP_DMA flow, but I'm not
>> successful, I couldn't find when context descriptor details are updated
>> with stage-1 translation table details in this flow.
>> I found that in BIND flow a new context descriptor being created and
>> assigned with required translation tables.
>>
>> Can you please point to piece of code/patch series where and how context
>> descriptor is updated for VFIO_IOMMU_MAP/UNMAP_DMA flow with
>> VFIO_TYPE1v2_IOMMU.
>
> What happens for SMMU is during initialization, the
> VFIO_GROUP_SET_CONTAINER ioctl calls vfio_iommu_type1_attach_group.
> This function creates a new iommu_domain, and then calls
> iommu_attach_group(domain, group), which calls arm_smmu_attach_dev(domain,
> dev) for all devices in the group.
>
> If necessary we detach the existing domain, by clearing the STE (and all
> contexts). Then, since the new domain is only partially initialized, we
> call arm_smmu_domain_finalise, which allocates the page tables, and ends
> up calling arm_smmu_domain_finalise_s1. That function prepares the
> configuration of context descriptor 0, which is used for non-PASID
> mappings. Afterwards, back in arm_smmu_attach_dev, context descriptor 0 is
> written, along with the STE, and then the domain is live.
>
> Then when userspace does VFIO_IOMMU_MAP/UNMAP_DMA, iommu_map/unmap are
> called on that domain. arm_smmu_map/unmap update the page tables attached
> to context descriptor 0 (but they don't modify the CD in any way).
>
Hi Jean, Thanks a lot, now I understand the flow. According to the VFIO
kernel documentation, we fill vaddr and iova in struct
vfio_iommu_type1_dma_map and pass them to VFIO. But if the application
uses dynamic allocation (say malloc), do we need to use the DMA API to
get an iova and then call the VFIO_IOMMU_MAP ioctl?
If the application needs multiple such dynamic allocations, does it need
to allocate a large chunk, program it via the VFIO_IOMMU_MAP ioctl, and
then serve the remaining allocation requests from this buffer?

Regards,
valmiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-05  5:14                             ` valmiki
@ 2017-08-07 10:31                               ` Jean-Philippe Brucker
  2017-08-07 12:18                                 ` Bob Liu
  2017-08-12 12:10                                 ` valmiki
  0 siblings, 2 replies; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-07 10:31 UTC (permalink / raw)
  To: valmiki, Alex Williamson
  Cc: iommu, kvm, linux-pci, tianyu.lan, kevin.tian, jacob.jun.pan

On 05/08/17 06:14, valmiki wrote:
[...]
> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
> and pass them to VFIO. But if we use dynamic allocation in application
> (say malloc), do we need to use dma API to get iova and then call
> VFIO_IOMMU_MAP ioctl ?
> If application needs multiple such dynamic allocations, then it need to
> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
> manage rest allocations requirements from this buffer ?

Yes, without SVM, the application allocates large buffers, allocates IOVAs
itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
DMA API at all, it manages IOVAs as it wants. Sizes passed to
VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
(that is at least 4kB), and both iova and vaddr have to be aligned on that
granularity as well. So malloc isn't really suitable in this case, you'll
need mmap. The application can then implement a small allocator to manage
the DMA pool created with VFIO_IOMMU_MAP.
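
A minimal sketch of such a pool in C, under stated assumptions: the
4kB granule is hard-coded (real code should use the granularity
reported for the IOMMU), the dma_* names are hypothetical, and the
actual VFIO_IOMMU_MAP call is left out so that the example runs without
a device. It checks the alignment rule above, gets page-aligned memory
from mmap rather than malloc, and hands out sub-buffers with a simple
bump allocator.

```c
#define _DEFAULT_SOURCE        /* for MAP_ANONYMOUS under strict -std=c11 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define DMA_GRANULE 4096u      /* assumed granule, not queried from VFIO */

/* iova, vaddr and size must all be multiples of the granule. */
static int dma_map_args_ok(uint64_t iova, uint64_t vaddr, uint64_t size)
{
    uint64_t mask = DMA_GRANULE - 1;
    return size != 0 && ((iova | vaddr | size) & mask) == 0;
}

struct dma_pool {
    uint8_t  *vaddr;   /* CPU address of the pool, from mmap */
    uint64_t  iova;    /* device address, chosen by the application */
    size_t    size;
    size_t    used;    /* bump pointer */
};

struct dma_buf {
    void     *cpu;     /* address the application reads and writes */
    uint64_t  dev;     /* address to hand to the device */
};

/* Create the pool; the caller would then map it with VFIO_IOMMU_MAP. */
static int dma_pool_init(struct dma_pool *p, size_t size, uint64_t iova)
{
    void *va;

    if (!dma_map_args_ok(iova, 0, size))
        return -1;
    va = mmap(NULL, size, PROT_READ | PROT_WRITE,
              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (va == MAP_FAILED)
        return -1;
    p->vaddr = va;
    p->iova = iova;
    p->size = size;
    p->used = 0;
    return 0;
}

/* Carve len bytes, aligned to align (a power of two), out of the pool. */
static int dma_pool_alloc(struct dma_pool *p, size_t len, size_t align,
                          struct dma_buf *out)
{
    size_t off;

    assert(align != 0 && (align & (align - 1)) == 0);
    off = (p->used + align - 1) & ~(align - 1);
    if (len == 0 || off > p->size || len > p->size - off)
        return -1;
    out->cpu = p->vaddr + off;
    out->dev = p->iova + off;
    p->used = off + len;
    return 0;
}
```

A bump allocator never frees individual buffers; an application that
needs that would replace dma_pool_alloc with a free-list or similar,
while keeping the single large mapping.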

With SVM the application binds its address space to the device, and then
uses malloc for all DMA buffers, no need for VFIO_IOMMU_MAP.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-07 10:31                               ` Jean-Philippe Brucker
@ 2017-08-07 12:18                                 ` Bob Liu
  2017-08-07 12:52                                   ` Jean-Philippe Brucker
  2017-08-12 12:10                                 ` valmiki
  1 sibling, 1 reply; 33+ messages in thread
From: Bob Liu @ 2017-08-07 12:18 UTC (permalink / raw)
  To: Jean-Philippe Brucker, valmiki, Alex Williamson
  Cc: tianyu.lan, kevin.tian, kvm, linux-pci, iommu, jacob.jun.pan

On 2017/8/7 18:31, Jean-Philippe Brucker wrote:
> On 05/08/17 06:14, valmiki wrote:
> [...]
>> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
>> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
>> and pass them to VFIO. But if we use dynamic allocation in application
>> (say malloc), do we need to use dma API to get iova and then call
>> VFIO_IOMMU_MAP ioctl ?
>> If application needs multiple such dynamic allocations, then it need to
>> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
>> manage rest allocations requirements from this buffer ?
> 
> Yes, without SVM, the application allocates large buffers, allocates IOVAs
> itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
> DMA API at all, it manages IOVAs as it wants. Sizes passed to
> VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
> (that is at least 4kB), and both iova and vaddr have to be aligned on that
> granularity as well. So malloc isn't really suitable in this case, you'll
> need mmap. The application can then implement a small allocator to manage
> the DMA pool created with VFIO_IOMMU_MAP.
> 
> With SVM the application binds its address space to the device, and then
> uses malloc for all DMA buffers, no need for VFIO_IOMMU_MAP.
>

Hi Jean,

I think there is another way to support SVM without PASID.

Suppose there is a device on the same SoC that accesses memory through
the SMMU (using an internal bus instead of PCIe).
On a page fault, the device sends an event with (vaddr, SubstreamID) to
the SMMU, and the SMMU then triggers an event interrupt.

In the event interrupt handler, we can implement the same logic as the
PRI interrupt in your patch.
What do you think about that?

Thanks,
Bob Liu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-07 12:18                                 ` Bob Liu
@ 2017-08-07 12:52                                   ` Jean-Philippe Brucker
  2017-08-08  0:51                                     ` Bob Liu
  2017-08-11  6:41                                     ` Tian, Kevin
  0 siblings, 2 replies; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-07 12:52 UTC (permalink / raw)
  To: Bob Liu, valmiki, Alex Williamson
  Cc: tianyu.lan, kevin.tian, kvm, linux-pci, iommu, jacob.jun.pan

Hi Bob,

On 07/08/17 13:18, Bob Liu wrote:
> On 2017/8/7 18:31, Jean-Philippe Brucker wrote:
>> On 05/08/17 06:14, valmiki wrote:
>> [...]
>>> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
>>> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
>>> and pass them to VFIO. But if we use dynamic allocation in application
>>> (say malloc), do we need to use dma API to get iova and then call
>>> VFIO_IOMMU_MAP ioctl ?
>>> If application needs multiple such dynamic allocations, then it need to
>>> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
>>> manage rest allocations requirements from this buffer ?
>>
>> Yes, without SVM, the application allocates large buffers, allocates IOVAs
>> itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
>> DMA API at all, it manages IOVAs as it wants. Sizes passed to
>> VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
>> (that is at least 4kB), and both iova and vaddr have to be aligned on that
>> granularity as well. So malloc isn't really suitable in this case, you'll
>> need mmap. The application can then implement a small allocator to manage
>> the DMA pool created with VFIO_IOMMU_MAP.
>>
>> With SVM the application binds its address space to the device, and then
>> uses malloc for all DMA buffers, no need for VFIO_IOMMU_MAP.
>>
> 
> Hi Jean,
> 
> I think there is another way to support SVM without PASID.
> 
> Suppose there is a device in the same SOC-chip, the device access memory through SMMU(using internal bus instead of PCIe)
> Once page fault, the device send an event with (vaddr, substreamID) to SMMU, then SMMU triggers an event interrupt.
> 
> In the event interrupt handler, we can implement the same logic as PRI interrupt in your patch.
> What do you think about that?
What you're describing is the SMMU stall model for platform devices. From
the driver perspective it's the same as PRI and PASID (SubstreamID=PASID).

When a stall-capable device accesses unmapped memory, the SMMU parks the
transaction and sends an event marked "stall" on the event queue, with a
stall tag (STAG, roughly equivalent to PRG Index). The OS handles the
fault and sends a CMD_RESUME command with the status and the STAG. Then
the SMMU completes the access or terminates it.

In a prototype I have, the stall implementation reuses most of the
PASID/PRI code. The main difficulty is defining SSID and stall capability
in firmware, as there is no standard capability probing for platform
devices. Stall-capable devices must be able to wait an indefinite amount
of time for their DMA transactions to return, therefore the stall model
cannot work with PCI, only with some integrated devices.
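
The stall handshake described above can be modelled in a few lines of C.
This is a toy sketch, not the prototype's code: the struct layouts, the
toy_mm address space and the RESUME_* names are hypothetical stand-ins
for the event record and the CMD_RESUME command. The handler looks up
the address space by SubstreamID, tries to resolve the fault, and
answers with the same STAG so the parked transaction can be completed
or terminated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   8       /* hypothetical size of each address space */

enum resume_action { RESUME_RETRY, RESUME_TERMINATE };

/* What the event queue entry carries in this model. */
struct stall_event {
    uint32_t sid;     /* StreamID: identifies the device */
    uint32_t ssid;    /* SubstreamID: plays the role of the PASID */
    uint16_t stag;    /* stall tag of the parked transaction */
    uint64_t addr;    /* faulting address */
};

/* What the OS sends back in this model. */
struct resume_cmd {
    uint32_t           sid;
    uint16_t           stag;
    enum resume_action action;
};

/* Toy process address space: one contiguous range of pages. */
struct toy_mm {
    uint64_t base;
    bool     present[TOY_NR_PAGES];
};

/* Handle one stall event: fault the page in if the access is valid. */
static struct resume_cmd handle_stall(struct toy_mm *mms, size_t nr_mms,
                                      const struct stall_event *ev)
{
    struct resume_cmd cmd = {
        .sid = ev->sid, .stag = ev->stag, .action = RESUME_TERMINATE,
    };

    assert(mms != NULL);
    if (ev->ssid < nr_mms && ev->addr >= mms[ev->ssid].base) {
        struct toy_mm *mm = &mms[ev->ssid];
        uint64_t idx = (ev->addr - mm->base) >> TOY_PAGE_SHIFT;

        if (idx < TOY_NR_PAGES) {
            mm->present[idx] = true;     /* "map" the missing page */
            cmd.action = RESUME_RETRY;   /* let the SMMU retry the access */
        }
    }
    return cmd;
}
```

The same handler shape would serve a PRI page request, with the PRG
Index taking the place of the STAG, which is why most of the code can
be shared.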

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-07 12:52                                   ` Jean-Philippe Brucker
@ 2017-08-08  0:51                                     ` Bob Liu
  2017-08-09 15:01                                       ` Jean-Philippe Brucker
  2017-08-11  6:41                                     ` Tian, Kevin
  1 sibling, 1 reply; 33+ messages in thread
From: Bob Liu @ 2017-08-08  0:51 UTC (permalink / raw)
  To: Jean-Philippe Brucker, valmiki, Alex Williamson
  Cc: tianyu.lan, kevin.tian, kvm, linux-pci, iommu, jacob.jun.pan,
	xieyisheng (A)

On 2017/8/7 20:52, Jean-Philippe Brucker wrote:
> Hi Bob,
> 
> On 07/08/17 13:18, Bob Liu wrote:
>> On 2017/8/7 18:31, Jean-Philippe Brucker wrote:
>>> On 05/08/17 06:14, valmiki wrote:
>>> [...]
>>>> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
>>>> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
>>>> and pass them to VFIO. But if we use dynamic allocation in application
>>>> (say malloc), do we need to use dma API to get iova and then call
>>>> VFIO_IOMMU_MAP ioctl ?
>>>> If application needs multiple such dynamic allocations, then it need to
>>>> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
>>>> manage rest allocations requirements from this buffer ?
>>>
>>> Yes, without SVM, the application allocates large buffers, allocates IOVAs
>>> itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
>>> DMA API at all, it manages IOVAs as it wants. Sizes passed to
>>> VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
>>> (that is at least 4kB), and both iova and vaddr have to be aligned on that
>>> granularity as well. So malloc isn't really suitable in this case, you'll
>>> need mmap. The application can then implement a small allocator to manage
>>> the DMA pool created with VFIO_IOMMU_MAP.
>>>
>>> With SVM the application binds its address space to the device, and then
>>> uses malloc for all DMA buffers, no need for VFIO_IOMMU_MAP.
>>>
>>
>> Hi Jean,
>>
>> I think there is another way to support SVM without PASID.
>>
>> Suppose there is a device in the same SOC-chip, the device access memory through SMMU(using internal bus instead of PCIe)
>> Once page fault, the device send an event with (vaddr, substreamID) to SMMU, then SMMU triggers an event interrupt.
>>
>> In the event interrupt handler, we can implement the same logic as PRI interrupt in your patch.
>> What do you think about that?
> What you're describing is the SMMU stall model for platform devices. From
> the driver perspective it's the same as PRI and PASID (SubstreamID=PASID).
> 

Indeed!

> When a stall-capable device accesses unmapped memory, the SMMU parks the
> transaction and sends an event marked "stall" on the event queue, with a
> stall tag (STAG, roughly equivalent to PRG Index). The OS handles the
> fault and sends a CMD_RESUME command with the status and the STAG. Then
> the SMMU completes the access or terminates it.
> 
> In a prototype I have, the stall implementation reuses most of the

Glad to hear that.
Would you mind sharing the prototype patch with me?

> PASID/PRI code. The main difficulty is defining SSID and stall capability
> in firmware, as there are no standard capability probing for platform
> devices. Stall-capable devices must be able to wait an indefinite amount
> of time that their DMA transactions returns, therefore the stall model
> cannot work with PCI, only some integrated devices.
> 

I happen to have a board with such devices and would like to test this.
I will re-post a full version of the patch upstream once the verification is complete.

Thanks,
Bob Liu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Support SVM without PASID
  2017-08-08  0:51                                     ` Bob Liu
@ 2017-08-09 15:01                                       ` Jean-Philippe Brucker
  0 siblings, 0 replies; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-09 15:01 UTC (permalink / raw)
  To: Bob Liu, valmiki, Alex Williamson
  Cc: tianyu.lan, kevin.tian, kvm, linux-pci, iommu, jacob.jun.pan,
	xieyisheng (A)

On 08/08/17 01:51, Bob Liu wrote:
> On 2017/8/7 20:52, Jean-Philippe Brucker wrote:
>> Hi Bob,
>>
>> On 07/08/17 13:18, Bob Liu wrote:
>>> On 2017/8/7 18:31, Jean-Philippe Brucker wrote:
>>>> On 05/08/17 06:14, valmiki wrote:
>>>> [...]
>>>>> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
>>>>> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
>>>>> and pass them to VFIO. But if we use dynamic allocation in application
>>>>> (say malloc), do we need to use dma API to get iova and then call
>>>>> VFIO_IOMMU_MAP ioctl ?
>>>>> If application needs multiple such dynamic allocations, then it need to
>>>>> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
>>>>> manage rest allocations requirements from this buffer ?
>>>>
>>>> Yes, without SVM, the application allocates large buffers, allocates IOVAs
>>>> itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
>>>> DMA API at all, it manages IOVAs as it wants. Sizes passed to
>>>> VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
>>>> (that is at least 4kB), and both iova and vaddr have to be aligned on that
>>>> granularity as well. So malloc isn't really suitable in this case, you'll
>>>> need mmap. The application can then implement a small allocator to manage
>>>> the DMA pool created with VFIO_IOMMU_MAP.
>>>>
>>>> With SVM the application binds its address space to the device, and then
>>>> uses malloc for all DMA buffers, no need for VFIO_IOMMU_MAP.
>>>>
>>>
>>> Hi Jean,
>>>
>>> I think there is another way to support SVM without PASID.
>>>
>>> Suppose there is a device in the same SOC-chip, the device access memory through SMMU(using internal bus instead of PCIe)
>>> Once page fault, the device send an event with (vaddr, substreamID) to SMMU, then SMMU triggers an event interrupt.
>>>
>>> In the event interrupt handler, we can implement the same logic as PRI interrupt in your patch.
>>> What do you think about that?
>> What you're describing is the SMMU stall model for platform devices. From
>> the driver perspective it's the same as PRI and PASID (SubstreamID=PASID).
>>
> 
> Indeed!
> 
>> When a stall-capable device accesses unmapped memory, the SMMU parks the
>> transaction and sends an event marked "stall" on the event queue, with a
>> stall tag (STAG, roughly equivalent to PRG Index). The OS handles the
>> fault and sends a CMD_RESUME command with the status and the STAG. Then
>> the SMMU completes the access or terminates it.
>>
>> In a prototype I have, the stall implementation reuses most of the
> 
> Glad to hear that.
> Would you mind to share me the prototype patch?
> 
>> PASID/PRI code. The main difficulty is defining SSID and stall capability
>> in firmware, as there are no standard capability probing for platform
>> devices. Stall-capable devices must be able to wait an indefinite amount
>> of time that their DMA transactions returns, therefore the stall model
>> cannot work with PCI, only some integrated devices.
>>
> 
> I happen to have a board with such devices and like to do the test.
> Will re-post a full version patch upstream once completing the verification.

Cool! You can find the prototype here:
git://linux-arm.org/linux-jpb.git svm/stall

Please let me know if you get anywhere with it, I'd like to get the series
moving again.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: Support SVM without PASID
  2017-08-04  9:42                           ` Jean-Philippe Brucker
@ 2017-08-11  6:29                             ` Tian, Kevin
  2017-08-11 16:25                             ` Raj, Ashok
  1 sibling, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2017-08-11  6:29 UTC (permalink / raw)
  To: Jean-Philippe Brucker, valmiki, Alex Williamson
  Cc: iommu, kvm, linux-pci, Lan, Tianyu, Pan, Jacob jun

> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker@arm.com]
> Sent: Friday, August 4, 2017 5:43 PM
> 
> Hi Kevin,
> 
> On 04/08/17 02:49, Tian, Kevin wrote:
> >> From: Jean-Philippe Brucker
> >> Sent: Tuesday, August 1, 2017 4:26 PM
> >>
> >> It depends what type you use when registering the IOMMU with
> >> VFIO_SET_IOMMU:
> >>
> >> * If the type is VFIO_TYPE1v2_IOMMU, then
> >> VFIO_IOMMU_MAP/UNMAP_DMA
> >>   affects the stage-1 non-PASID context (already the case in mainline).
> >>   In addition, with my patch the BIND ioctl will affect stage-1 PASID
> >>   contexts, and bind process page directories to the device (host SVM).
> >>
> >> * If the type is VFIO_TYPE1_NESTING_IOMMU, then
> >> VFIO_IOMMU_MAP/UNMAP_DMA
> >>   will affect stage-2 mappings (already in mainline).
> >>   With my SMMU patch series, the BIND ioctl is not supported in this mode.
> >>   But in the future, BIND would allow to manage stage-1 as well:
> >>   - bind a process page directory (host SVM with added stage-2), or
> >
> > I thought host SVM will only go through VFIO_TYPE1v2_IOMMU,
> > since you said stage-2 in ARM SMMU is only for GPA->HPA usage
> > in previous explanation. then what does "host SVM with added
> > stage-2" mean here?
> 
> Ah, that's just a crazy idea I had. I'm not sure it is useful or worth
> implementing, but it is one of the possibility offered by nested translation.
> 
> Consider the situation where a userspace driver (no virtualization) is
> built in a client-server fashion: the server controls a device and spawns
> new processes (clients), each sharing a context with the device using its
> own PASID. If the server wants to hide parts of the client address space
> from the device (e.g. .text), then it could control stage-2 via MAP/UNMAP
> to restrict the address space.

Stage-1 is linked to the CPU page table (VA->PA) for SVM, while physical
memory is managed by the kernel. I can't think of a good reason why a
server application would need, or have the knowledge, to hide a
resource that it doesn't manage itself...

> 
> It would use different semantics of MAP/UNMAP though, as the ioctl would
> only be used to define 1:1 translation windows, not pin memory.
> 
> Thanks,
> Jean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: Support SVM without PASID
  2017-08-07 12:52                                   ` Jean-Philippe Brucker
  2017-08-08  0:51                                     ` Bob Liu
@ 2017-08-11  6:41                                     ` Tian, Kevin
       [not found]                                       ` <AADFC41AFE54684AB9EE6CBC0274A5D190D6EDB7-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2017-08-11  9:36                                       ` Bob Liu
  1 sibling, 2 replies; 33+ messages in thread
From: Tian, Kevin @ 2017-08-11  6:41 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Bob Liu, valmiki, Alex Williamson
  Cc: Lan, Tianyu, kvm, linux-pci, iommu, Pan, Jacob jun

> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker@arm.com]
> Sent: Monday, August 7, 2017 8:52 PM
> 
> Hi Bob,
> 
> On 07/08/17 13:18, Bob Liu wrote:
> > On 2017/8/7 18:31, Jean-Philippe Brucker wrote:
> >> On 05/08/17 06:14, valmiki wrote:
> >> [...]
> >>> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
> >>> documentation we fill vaddr and iova in struct
> vfio_iommu_type1_dma_map
> >>> and pass them to VFIO. But if we use dynamic allocation in application
> >>> (say malloc), do we need to use dma API to get iova and then call
> >>> VFIO_IOMMU_MAP ioctl ?
> >>> If application needs multiple such dynamic allocations, then it need to
> >>> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and
> then
> >>> manage rest allocations requirements from this buffer ?
> >>
> >> Yes, without SVM, the application allocates large buffers, allocates IOVAs
> >> itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on
> the
> >> DMA API at all, it manages IOVAs as it wants. Sizes passed to
> >> VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page
> granularity
> >> (that is at least 4kB), and both iova and vaddr have to be aligned on that
> >> granularity as well. So malloc isn't really suitable in this case, you'll
> >> need mmap. The application can then implement a small allocator to
> manage
> >> the DMA pool created with VFIO_IOMMU_MAP.
> >>
> >> With SVM the application binds its address space to the device, and then
> >> uses malloc for all DMA buffers, no need for VFIO_IOMMU_MAP.
> >>
> >
> > Hi Jean,
> >
> > I think there is another way to support SVM without PASID.
> >
> > Suppose there is a device in the same SOC-chip, the device access memory
> > through SMMU(using internal bus instead of PCIe)
> > Once page fault, the device send an event with (vaddr, substreamID) to
> > SMMU, then SMMU triggers an event interrupt.
> >
> > In the event interrupt handler, we can implement the same logic as PRI
> > interrupt in your patch.
> > What do you think about that?
> What you're describing is the SMMU stall model for platform devices. From
> the driver perspective it's the same as PRI and PASID (SubstreamID=PASID).
> 
> When a stall-capable device accesses unmapped memory, the SMMU parks the
> transaction and sends an event marked "stall" on the event queue, with a
> stall tag (STAG, roughly equivalent to PRG Index). The OS handles the
> fault and sends a CMD_RESUME command with the status and the STAG. Then
> the SMMU completes the access or terminates it.

Can such a platform device send multiple SubstreamIDs, or only one ID per
device?

Thanks
Kevin


* Re: Support SVM without PASID
       [not found]                                       ` <AADFC41AFE54684AB9EE6CBC0274A5D190D6EDB7-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-08-11  9:25                                         ` Jean-Philippe Brucker
  0 siblings, 0 replies; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-11  9:25 UTC (permalink / raw)
  To: Tian, Kevin, Bob Liu, valmiki, Alex Williamson
  Cc: Lan, Tianyu, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Pan,
	Jacob jun, kvm-u79uwXL29TY76Z2rM5mHXA

On 11/08/17 07:41, Tian, Kevin wrote:
[...]
>>> Hi Jean,
>>>
>>> I think there is another way to support SVM without PASID.
>>>
>>> Suppose there is a device in the same SOC-chip, the device access memory
>>> through SMMU(using internal bus instead of PCIe)
>>> Once page fault, the device send an event with (vaddr, substreamID) to
>>> SMMU, then SMMU triggers an event interrupt.
>>>
>>> In the event interrupt handler, we can implement the same logic as PRI
>>> interrupt in your patch.
>>> What do you think about that?
>> What you're describing is the SMMU stall model for platform devices. From
>> the driver perspective it's the same as PRI and PASID (SubstreamID=PASID).
>>
>> When a stall-capable device accesses unmapped memory, the SMMU parks the
>> transaction and sends an event marked "stall" on the event queue, with a
>> stall tag (STAG, roughly equivalent to PRG Index). The OS handles the
>> fault and sends a CMD_RESUME command with the status and the STAG. Then
>> the SMMU completes the access or terminates it.
> 
> Can such platform device send multiple SubstreamIDs, or one ID per
> device? 

Yes, SVM-capable platform devices should issue multiple SubstreamIDs and a
single StreamID (equivalent to VT-d's Source-ID).
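
To make the stall flow quoted above a bit more concrete, here is a toy
model in Python. It is only a sketch under stated assumptions: the class
StallSmmu, the helper handle_stalls and the dictionary-based "page tables"
are made up for illustration and are not the SMMUv3 programming interface
or the Linux driver. It shows one StreamID carrying several SubstreamIDs,
each with its own table, and a stalled access being resumed by STAG.

```python
from collections import deque


class StallSmmu:
    """Toy model of the stall flow; not the real SMMUv3 interface."""

    def __init__(self):
        self.tables = {}            # (stream_id, substream_id) -> {page: frame}
        self.event_queue = deque()  # events the OS will consume
        self.parked = {}            # stag -> (stream_id, substream_id, page)
        self.next_stag = 0

    def bind(self, stream_id, substream_id, table):
        # One device (StreamID) can have many address spaces (SubstreamIDs).
        self.tables[(stream_id, substream_id)] = table

    def access(self, stream_id, substream_id, page):
        table = self.tables[(stream_id, substream_id)]
        if page in table:
            return ("done", table[page])
        # Unmapped: park the transaction and report a stall event with a tag.
        stag = self.next_stag
        self.next_stag += 1
        self.parked[stag] = (stream_id, substream_id, page)
        self.event_queue.append({"stall": True, "stag": stag,
                                 "stream_id": stream_id,
                                 "substream_id": substream_id, "page": page})
        return ("stalled", stag)

    def cmd_resume(self, stag, retry):
        # The OS answers with the STAG: either retry or terminate the access.
        stream_id, substream_id, page = self.parked.pop(stag)
        if not retry:
            return ("terminated", None)
        return self.access(stream_id, substream_id, page)


def handle_stalls(smmu, resolve):
    """OS side: resolve(substream_id, page) returns a frame, or None to abort."""
    results = []
    while smmu.event_queue:
        ev = smmu.event_queue.popleft()
        frame = resolve(ev["substream_id"], ev["page"])
        if frame is not None:
            key = (ev["stream_id"], ev["substream_id"])
            smmu.tables[key][ev["page"]] = frame
        results.append(smmu.cmd_resume(ev["stag"], retry=frame is not None))
    return results
```

In this model the fault handler plays the role that the PRI handler plays
for PCIe devices: it fixes up the mapping and then resumes the access.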

Thanks,
Jean


* Re: Support SVM without PASID
  2017-08-11  6:41                                     ` Tian, Kevin
       [not found]                                       ` <AADFC41AFE54684AB9EE6CBC0274A5D190D6EDB7-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-08-11  9:36                                       ` Bob Liu
  1 sibling, 0 replies; 33+ messages in thread
From: Bob Liu @ 2017-08-11  9:36 UTC (permalink / raw)
  To: Tian, Kevin, Jean-Philippe Brucker, valmiki, Alex Williamson
  Cc: Lan, Tianyu, kvm, linux-pci, iommu, Pan, Jacob jun

On 2017/8/11 14:41, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker@arm.com]
>> Sent: Monday, August 7, 2017 8:52 PM
>>
>> Hi Bob,
>>
>> On 07/08/17 13:18, Bob Liu wrote:
>>> On 2017/8/7 18:31, Jean-Philippe Brucker wrote:
>>>> On 05/08/17 06:14, valmiki wrote:
>>>> [...]
>>>>> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
>>>>> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
>>>>> and pass them to VFIO. But if we use dynamic allocation in application
>>>>> (say malloc), do we need to use dma API to get iova and then call
>>>>> VFIO_IOMMU_MAP ioctl ?
>>>>> If application needs multiple such dynamic allocations, then it need to
>>>>> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
>>>>> manage rest allocations requirements from this buffer ?
>>>>
>>>> Yes, without SVM, the application allocates large buffers, allocates IOVAs
>>>> itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
>>>> DMA API at all, it manages IOVAs as it wants. Sizes passed to
>>>> VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
>>>> (that is at least 4kB), and both iova and vaddr have to be aligned on that
>>>> granularity as well. So malloc isn't really suitable in this case, you'll
>>>> need mmap. The application can then implement a small allocator to manage
>>>> the DMA pool created with VFIO_IOMMU_MAP.
>>>>
>>>> With SVM the application binds its address space to the device, and then
>>>> uses malloc for all DMA buffers, no need for VFIO_IOMMU_MAP.
>>>>
>>>
>>> Hi Jean,
>>>
>>> I think there is another way to support SVM without PASID.
>>>
>>> Suppose there is a device in the same SOC-chip, the device access memory
>>> through SMMU(using internal bus instead of PCIe)
>>> Once page fault, the device send an event with (vaddr, substreamID) to
>>> SMMU, then SMMU triggers an event interrupt.
>>>
>>> In the event interrupt handler, we can implement the same logic as PRI
>>> interrupt in your patch.
>>> What do you think about that?
>> What you're describing is the SMMU stall model for platform devices. From
>> the driver perspective it's the same as PRI and PASID (SubstreamID=PASID).
>>
>> When a stall-capable device accesses unmapped memory, the SMMU parks the
>> transaction and sends an event marked "stall" on the event queue, with a
>> stall tag (STAG, roughly equivalent to PRG Index). The OS handles the
>> fault and sends a CMD_RESUME command with the status and the STAG. Then
>> the SMMU completes the access or terminates it.
> 
> Can such platform device send multiple SubstreamIDs, or one ID per
> device? 
> 

Software should be able to configure different SubstreamIDs for the device,
but I'm not sure whether our device can send multiple SubstreamIDs at the
same time.
Btw, why do you care about this?

Thanks,
Liubo


* Re: Support SVM without PASID
  2017-08-04  9:42                           ` Jean-Philippe Brucker
  2017-08-11  6:29                             ` Tian, Kevin
@ 2017-08-11 16:25                             ` Raj, Ashok
  2017-08-14  8:00                               ` Tian, Kevin
  1 sibling, 1 reply; 33+ messages in thread
From: Raj, Ashok @ 2017-08-11 16:25 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Tian, Kevin, valmiki, Alex Williamson, Lan, Tianyu, linux-pci,
	iommu, Pan, Jacob jun, kvm, Ashok Raj

On Fri, Aug 04, 2017 at 10:42:41AM +0100, Jean-Philippe Brucker wrote:
> Hi Kevin,
> 
> 
> Consider the situation where a userspace driver (no virtualization) is
> built in a client-server fashion: the server controls a device and spawns
> new processes (clients), each sharing a context with the device using its
> own PASID. If the server wants to hide parts of the client address space

Just to be sure, you aren't expecting the PASIDs to be duplicated or
recreated after a new process is spawned? I would expect each process to
get its own PASID by doing a bind. Threads of the same process would
share the same PASID, since they all share the same first-level
mappings.


> from the device (e.g. .text), then it could control stage-2 via MAP/UNMAP
> to restrict the address space.

I'm confused; maybe this is different from the Intel IOMMU. The first level
requiring a second level is only true when virtualization is in play.

First level is gVA->gPA, and second level is gPA->hPA (sort of a cloned
EPT map, set up via VFIO to populate the second level).

In a native user application, there is no nesting between first
and second level. The first level is directly VA->hPA. Is there any need
for a nested walk in this case?
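
As a rough illustration of the two cases above, here is a small Python
sketch. The flat dictionaries standing in for page tables, the 4kB page
size and the function names are assumptions made for the example. It also
simplifies real hardware, where the stage-1 table walk itself is subject
to stage-2 translation.

```python
PAGE = 4096  # assumed page size for the example


def translate(table, addr):
    """Single-stage lookup; `table` maps page number -> page number."""
    page, offset = divmod(addr, PAGE)
    if page not in table:
        raise LookupError(f"translation fault at {addr:#x}")
    return table[page] * PAGE + offset


def translate_nested(first_level, second_level, va):
    """Nested case: first level gives the intermediate address
    (gVA->gPA), second level gives the final one (gPA->hPA)."""
    return translate(second_level, translate(first_level, va))


# Virtualized: two tables are walked.
guest_first_level = {0x1: 0x5}   # gVA page 1 -> gPA page 5
host_second_level = {0x5: 0x9}   # gPA page 5 -> hPA page 9
nested_pa = translate_nested(guest_first_level, host_second_level, 0x1234)

# Native: the first level alone maps VA -> hPA.
native_first_level = {0x1: 0x9}
native_pa = translate(native_first_level, 0x1234)
```

Both paths end at the same physical address here; the only difference is
whether an intermediate address space sits in between.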

> 
> It would use different semantics of MAP/UNMAP though, as the ioctl would
> only be used to define 1:1 translation windows, not pin memory.
> 


* Re: Support SVM without PASID
  2017-08-07 10:31                               ` Jean-Philippe Brucker
  2017-08-07 12:18                                 ` Bob Liu
@ 2017-08-12 12:10                                 ` valmiki
  2017-08-14  7:49                                   ` Tian, Kevin
  1 sibling, 1 reply; 33+ messages in thread
From: valmiki @ 2017-08-12 12:10 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Alex Williamson
  Cc: iommu, kvm, linux-pci, tianyu.lan, kevin.tian, jacob.jun.pan



On 8/7/2017 4:01 PM, Jean-Philippe Brucker wrote:
> On 05/08/17 06:14, valmiki wrote:
> [...]
>> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
>> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
>> and pass them to VFIO. But if we use dynamic allocation in application
>> (say malloc), do we need to use dma API to get iova and then call
>> VFIO_IOMMU_MAP ioctl ?
>> If application needs multiple such dynamic allocations, then it need to
>> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
>> manage rest allocations requirements from this buffer ?
>
> Yes, without SVM, the application allocates large buffers, allocates IOVAs
> itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
> DMA API at all, it manages IOVAs as it wants. Sizes passed to
> VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
> (that is at least 4kB), and both iova and vaddr have to be aligned on that
> granularity as well. So malloc isn't really suitable in this case, you'll
> need mmap. The application can then implement a small allocator to manage
> the DMA pool created with VFIO_IOMMU_MAP.

Thanks Jean. I am confused about what allocating IOVAs in userspace
means: how can a user application decide on an IOVA? Can it pick any
random IOVA?

Regards,
Valmiki


* RE: Support SVM without PASID
  2017-08-12 12:10                                 ` valmiki
@ 2017-08-14  7:49                                   ` Tian, Kevin
  2017-08-28 13:10                                     ` Bharat Kumar Gogada
  0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2017-08-14  7:49 UTC (permalink / raw)
  To: valmiki, Jean-Philippe Brucker, Alex Williamson
  Cc: iommu, kvm, linux-pci, Lan, Tianyu, Pan, Jacob jun

> From: valmiki [mailto:valmikibow@gmail.com]
> Sent: Saturday, August 12, 2017 8:11 PM
> 
> On 8/7/2017 4:01 PM, Jean-Philippe Brucker wrote:
> > On 05/08/17 06:14, valmiki wrote:
> > [...]
> >> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
> >> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
> >> and pass them to VFIO. But if we use dynamic allocation in application
> >> (say malloc), do we need to use dma API to get iova and then call
> >> VFIO_IOMMU_MAP ioctl ?
> >> If application needs multiple such dynamic allocations, then it need to
> >> allocate large chunk and program it via VFIO_IOMMU_MAP ioctl and then
> >> manage rest allocations requirements from this buffer ?
> >
> > Yes, without SVM, the application allocates large buffers, allocates IOVAs
> > itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't rely on the
> > DMA API at all, it manages IOVAs as it wants. Sizes passed to
> > VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page granularity
> > (that is at least 4kB), and both iova and vaddr have to be aligned on that
> > granularity as well. So malloc isn't really suitable in this case, you'll
> > need mmap. The application can then implement a small allocator to manage
> > the DMA pool created with VFIO_IOMMU_MAP.
> 
> Thanks Jean, I have a confusion allocate IOVA's in userspace means, how
> can user application decide IOVA address, can user application pick any
> random IOVA address ?
> 

Yes, any address. In this mode the whole IOVA address space is owned by the
application, which just needs to use VFIO_IOMMU_MAP to set up the
IOVA->PA mapping in the IOMMU. (As Jean pointed out, the input parameters
are iova and vaddr; VFIO will figure out the PA corresponding to vaddr.)
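
For illustration, the "mmap one big region, pick an IOVA, then
sub-allocate" scheme could be sketched as below. This is a Python model,
not VFIO code: DmaPool and is_aligned are hypothetical names, the granule
is assumed to equal the CPU page size, and the map ioctl itself (named
VFIO_IOMMU_MAP_DMA in the kernel uapi header) is only indicated by a
comment rather than called.

```python
import ctypes
import mmap

GRANULE = mmap.PAGESIZE  # assumption: IOMMU granule equals the CPU page size


def is_aligned(value, granule=GRANULE):
    return value % granule == 0


class DmaPool:
    """Bump allocator over one page-aligned region (hypothetical helper)."""

    def __init__(self, iova_base, size):
        if not (is_aligned(iova_base) and is_aligned(size)):
            raise ValueError("iova and size must be multiples of the granule")
        self.buf = mmap.mmap(-1, size)  # anonymous mapping, page-aligned
        # Address of the mapping in this process, i.e. the vaddr to pass on.
        self.vaddr = ctypes.addressof(ctypes.c_char.from_buffer(self.buf))
        self.iova_base = iova_base      # chosen freely by the application
        self.size = size
        self.next = 0
        # A real program would issue the VFIO map ioctl here with
        # (vaddr, iova, size); this sketch only records the values.

    def alloc(self, length, align=64):
        """Return (iova, offset) of a sub-buffer carved out of the pool."""
        start = (self.next + align - 1) // align * align
        if start + length > self.size:
            raise MemoryError("DMA pool exhausted")
        self.next = start + length
        return self.iova_base + start, start


pool = DmaPool(iova_base=0x10000000, size=4 * GRANULE)
iova_a, off_a = pool.alloc(100)   # device uses iova_a, CPU uses pool.buf[off_a:]
iova_b, off_b = pool.alloc(100)
```

The device is handed the iova values, while the CPU reads and writes the
same memory through pool.buf at the matching offsets.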

Thanks
Kevin


* RE: Support SVM without PASID
  2017-08-11 16:25                             ` Raj, Ashok
@ 2017-08-14  8:00                               ` Tian, Kevin
  2017-08-14  9:07                                 ` Jean-Philippe Brucker
  0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2017-08-14  8:00 UTC (permalink / raw)
  To: Raj, Ashok, Jean-Philippe Brucker
  Cc: Lan, Tianyu, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Pan,
	Jacob jun

> From: Raj, Ashok
> Sent: Saturday, August 12, 2017 12:25 AM
> 
> On Fri, Aug 04, 2017 at 10:42:41AM +0100, Jean-Philippe Brucker wrote:
> > Hi Kevin,
> >
> >
> > Consider the situation where a userspace driver (no virtualization) is
> > built in a client-server fashion: the server controls a device and spawns
> > new processes (clients), each sharing a context with the device using its
> > own PASID. If the server wants to hide parts of the client address space
> 
> Just to be sure, you are't expecting the PASID's to be duplicated or
> recreated after a new process is spawned. I would expect each process to
> get its own PASID by doing a bind. Threads of the same process would be
> sharing the same PASID since they all share the same first level
> mappings.
> 
> 
> > from the device (e.g. .text), then it could control stage-2 via MAP/UNMAP
> > to restrict the address space.
> 
> I'm confused.. maybe this is different from Intel IOMMU. the first level
> requiring a second level is only true when virtualization is in play.
> 
> First level is gVA->gPA, and second level is gPA->hPA (sort of the cloned
> EPT map that is setup via VFIO to set up second level)
> 
> When you are in native user application, there is no nesting between first
> and second level. The first level is directly VA->hPA. There is no need
> for a nested walk in this case?
> 

Strictly speaking, nesting is just a hardware capability, while
virtualization is a use case for that capability. As long as
some software entity (not a hypervisor) can set up two-level
page tables, it should just work, regardless of what the
intermediate address is called.

I think Jean is trying to come up with a non-virtualization usage
of nesting. Of course, the current example that he illustrated
is not very meaningful (as I replied in another mail). :-)

Thanks
Kevin


* Re: Support SVM without PASID
  2017-08-14  8:00                               ` Tian, Kevin
@ 2017-08-14  9:07                                 ` Jean-Philippe Brucker
  0 siblings, 0 replies; 33+ messages in thread
From: Jean-Philippe Brucker @ 2017-08-14  9:07 UTC (permalink / raw)
  To: Tian, Kevin, Raj, Ashok
  Cc: valmiki, Alex Williamson, Lan, Tianyu, linux-pci, iommu, Pan,
	Jacob jun, kvm

On 14/08/17 09:00, Tian, Kevin wrote:
>> From: Raj, Ashok
>> Sent: Saturday, August 12, 2017 12:25 AM
>>
>> On Fri, Aug 04, 2017 at 10:42:41AM +0100, Jean-Philippe Brucker wrote:
>>> Hi Kevin,
>>>
>>>
>>> Consider the situation where a userspace driver (no virtualization) is
>>> built in a client-server fashion: the server controls a device and spawns
>>> new processes (clients), each sharing a context with the device using its
>>> own PASID. If the server wants to hide parts of the client address space
>>
>> Just to be sure, you are't expecting the PASID's to be duplicated or
>> recreated after a new process is spawned. I would expect each process to
>> get its own PASID by doing a bind. Threads of the same process would be
>> sharing the same PASID since they all share the same first level
>> mappings.
>>
>>
>>> from the device (e.g. .text), then it could control stage-2 via MAP/UNMAP
>>> to restrict the address space.
>>
>> I'm confused.. maybe this is different from Intel IOMMU. the first level
>> requiring a second level is only true when virtualization is in play.
>>
>> First level is gVA->gPA, and second level is gPA->hPA (sort of the cloned
>> EPT map that is setup via VFIO to set up second level)
>>
>> When you are in native user application, there is no nesting between first
>> and second level. The first level is directly VA->hPA. There is no need
>> for a nested walk in this case?
>>
> 
> Strictly speaking nesting is just a hardware capability while 
> virtualization is an use case using that capability. As long as 
> some software entity (not hypervisor) can setup two level 
> page tables, it should just work regardless of how the 
> intermediate address is called.

Yes, nested translation is just a tool, it doesn't have to be dedicated to
virtualization.

> I think Jean is trying to come up a non-virtualization usage
> using nesting. Of course current example that he illustrated 
> is not very meaningful (as I replied in another mail). :-)

I also can't come up with a convincing API for setting up the second stage
from userspace. Hopefully we can ignore this case for the moment :)

Thanks,
Jean


* RE: Support SVM without PASID
  2017-08-14  7:49                                   ` Tian, Kevin
@ 2017-08-28 13:10                                     ` Bharat Kumar Gogada
  2017-08-29  1:32                                       ` Tian, Kevin
  0 siblings, 1 reply; 33+ messages in thread
From: Bharat Kumar Gogada @ 2017-08-28 13:10 UTC (permalink / raw)
  To: Tian, Kevin, valmiki, Jean-Philippe Brucker, Alex Williamson
  Cc: iommu, kvm, linux-pci, Lan, Tianyu, Pan, Jacob jun

> Subject: RE: Support SVM without PASID
> 
> > From: valmiki [mailto:valmikibow@gmail.com]
> > Sent: Saturday, August 12, 2017 8:11 PM
> >
> > On 8/7/2017 4:01 PM, Jean-Philippe Brucker wrote:
> > > On 05/08/17 06:14, valmiki wrote:
> > > [...]
> > >> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
> > >> documentation we fill vaddr and iova in struct vfio_iommu_type1_dma_map
> > >> and pass them to VFIO. But if we use dynamic allocation in
> > >> application (say malloc), do we need to use dma API to get iova and
> > >> then call VFIO_IOMMU_MAP ioctl ?
> > >> If application needs multiple such dynamic allocations, then it
> > >> need to allocate large chunk and program it via VFIO_IOMMU_MAP
> > >> ioctl and then manage rest allocations requirements from this buffer ?
> > >
> > > Yes, without SVM, the application allocates large buffers, allocates
> > > IOVAs itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't
> > > rely on the DMA API at all, it manages IOVAs as it wants. Sizes passed
> > > to VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page
> > > granularity (that is at least 4kB), and both iova and vaddr have to be
> > > aligned on that granularity as well. So malloc isn't really suitable in
> > > this case, you'll need mmap. The application can then implement a small
> > > allocator to manage the DMA pool created with VFIO_IOMMU_MAP.
> >
> > Thanks Jean, I have a confusion allocate IOVA's in userspace means,
> > how can user application decide IOVA address, can user application
> > pick any random IOVA address ?
> >
> 
> yes, any address. In this mode the whole IOVA address space is owned by
> application, which just needs to use VFIO_IOMMU_MAP to setup
> IOVA->PA mapping in IOMMU (As Jean pointed out, input paramters
> are iova and vaddr. VFIO will figure out pa corresponding to vaddr).
> 
Hi Kevin, I have a doubt in this case: what if someone uses the virtual
address returned by mmap as the IOVA? The EP will assume it is issuing
requests on an IOVA, but in reality it is using an application-allocated
virtual address, so it looks like we are working with application virtual
addresses directly, without PASID. Is this valid?

Regards,
Bharat


* RE: Support SVM without PASID
  2017-08-28 13:10                                     ` Bharat Kumar Gogada
@ 2017-08-29  1:32                                       ` Tian, Kevin
  0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2017-08-29  1:32 UTC (permalink / raw)
  To: Bharat Kumar Gogada, valmiki, Jean-Philippe Brucker, Alex Williamson
  Cc: iommu, kvm, linux-pci, Lan, Tianyu, Pan, Jacob jun

> From: Bharat Kumar Gogada [mailto:bharatku@xilinx.com]
> Sent: Monday, August 28, 2017 9:10 PM
> 
> > Subject: RE: Support SVM without PASID
> >
> > > From: valmiki [mailto:valmikibow@gmail.com]
> > > Sent: Saturday, August 12, 2017 8:11 PM
> > >
> > > On 8/7/2017 4:01 PM, Jean-Philippe Brucker wrote:
> > > > On 05/08/17 06:14, valmiki wrote:
> > > > [...]
> > > >> Hi Jean, Thanks a lot, now i understood the flow. From vfio kernel
> > > >> documentation we fill vaddr and iova in struct
> > > >> vfio_iommu_type1_dma_map
> > > >> and pass them to VFIO. But if we use dynamic allocation in
> > > >> application (say malloc), do we need to use dma API to get iova and
> > > >> then call VFIO_IOMMU_MAP ioctl ?
> > > >> If application needs multiple such dynamic allocations, then it
> > > >> need to allocate large chunk and program it via VFIO_IOMMU_MAP
> > > >> ioctl and then manage rest allocations requirements from this buffer ?
> > > >
> > > > Yes, without SVM, the application allocates large buffers, allocates
> > > > IOVAs itself, and maps them with VFIO_IOMMU_MAP. Userspace doesn't
> > > > rely on the DMA API at all, it manages IOVAs as it wants. Sizes passed
> > > > to VFIO_IOMMU_MAP have to be multiples of the MMU or IOMMU page
> > > > granularity (that is at least 4kB), and both iova and vaddr have to be
> > > > aligned on that granularity as well. So malloc isn't really suitable in
> > > > this case, you'll need mmap. The application can then implement a small
> > > > allocator to manage the DMA pool created with VFIO_IOMMU_MAP.
> > >
> > > Thanks Jean, I have a confusion allocate IOVA's in userspace means,
> > > how can user application decide IOVA address, can user application
> > > pick any random IOVA address ?
> > >
> >
> > yes, any address. In this mode the whole IOVA address space is owned by
> > application, which just needs to use VFIO_IOMMU_MAP to setup
> > IOVA->PA mapping in IOMMU (As Jean pointed out, input paramters
> > are iova and vaddr. VFIO will figure out pa corresponding to vaddr).
> >
> Hi Kevin, I have a doubt in this case, what if someone assigns mmap
> returned virtual address as iova address, then EP will assume it is
> generating request on IOVA but in reality we are using application
> allocated virtual address, then it looks like we are working with
> application virtual address directly without PASID. Is this valid ?
> 

The IOMMU doesn't care what the address in a DMA transaction from
a device is. It cares only whether the DMA transaction is tagged with a
PASID or not, and accordingly follows a different structure to walk the
corresponding I/O page table and get a translated address. Each I/O page
table hosts a standalone virtual address space. An application can
implement a new allocator to manage that address space (iova != vaddr), or
simply reuse malloc-ed virtual addresses (iova == vaddr); the hardware does
not care which.
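
A toy model of that behaviour, as a sketch only: the ToyIommu class, the
4kB page size and the dictionary page tables are assumptions for the
example, not how any real IOMMU or driver is structured. The point is
that the table is selected by whether a PASID is present, and the
untagged table treats an address the same way whether or not it happens
to equal a process virtual address.

```python
PAGE = 4096  # assumed page size for the example


class ToyIommu:
    def __init__(self):
        self.untagged = {}   # I/O page table for DMA without a PASID
        self.by_pasid = {}   # pasid -> I/O page table

    def _table(self, pasid):
        if pasid is None:
            return self.untagged
        return self.by_pasid.setdefault(pasid, {})

    def map(self, iova, pa, pasid=None):
        self._table(pasid)[iova // PAGE] = pa // PAGE

    def translate(self, addr, pasid=None):
        # Raises KeyError for an unmapped page (stands in for a fault).
        page, offset = divmod(addr, PAGE)
        return self._table(pasid)[page] * PAGE + offset


iommu = ToyIommu()
vaddr = 0x7F0000001000                      # pretend mmap returned this

iommu.map(iova=vaddr, pa=0x2000)            # iova == vaddr
iommu.map(iova=0x10000, pa=0x2000)          # iova != vaddr, same memory
iommu.map(iova=vaddr, pa=0x5000, pasid=3)   # separate space for PASID 3

same_as_vaddr = iommu.translate(vaddr + 8)
own_allocator = iommu.translate(0x10000 + 8)
tagged = iommu.translate(vaddr + 8, pasid=3)
```

The first two lookups reach the same physical address through different
IOVAs; the third uses the same address value but a different table.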

Thanks
Kevin


Thread overview: 33+ messages
2017-07-08 17:03 Support SVM without PASID valmiki
     [not found] ` <a30962a6-2240-9263-96cc-10da1b179fcc-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-07-08 20:02   ` Alex Williamson
2017-07-09  3:15     ` valmiki
2017-07-09  9:29       ` Liu, Yi L
2017-07-10  0:14       ` Bob Liu
2017-07-10 19:31       ` Jerome Glisse
     [not found]         ` <20170710193141.GA3813-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-07-12 16:23           ` valmiki
     [not found]       ` <73619426-6fcc-21ce-cfd4-8c66bde63f9a-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-07-11 10:56         ` Jean-Philippe Brucker
     [not found]           ` <eb132dfa-6708-5898-c2ce-ce7ab08809b1-5wv7dgnIgG8@public.gmane.org>
2017-07-12 16:27             ` valmiki
2017-07-12 16:48               ` Jean-Philippe Brucker
2017-07-22  2:05                 ` valmiki
     [not found]                   ` <41333a03-bf91-1152-4779-6579845609f6-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-08-01  8:26                     ` Jean-Philippe Brucker
     [not found]                       ` <564ba70b-db95-7fe0-86bb-bb4eefcd87ec-5wv7dgnIgG8@public.gmane.org>
2017-08-01 17:38                         ` valmiki
2017-08-01 18:40                           ` Jean-Philippe Brucker
2017-08-05  5:14                             ` valmiki
2017-08-07 10:31                               ` Jean-Philippe Brucker
2017-08-07 12:18                                 ` Bob Liu
2017-08-07 12:52                                   ` Jean-Philippe Brucker
2017-08-08  0:51                                     ` Bob Liu
2017-08-09 15:01                                       ` Jean-Philippe Brucker
2017-08-11  6:41                                     ` Tian, Kevin
     [not found]                                       ` <AADFC41AFE54684AB9EE6CBC0274A5D190D6EDB7-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-08-11  9:25                                         ` Jean-Philippe Brucker
2017-08-11  9:36                                       ` Bob Liu
2017-08-12 12:10                                 ` valmiki
2017-08-14  7:49                                   ` Tian, Kevin
2017-08-28 13:10                                     ` Bharat Kumar Gogada
2017-08-29  1:32                                       ` Tian, Kevin
2017-08-04  1:49                         ` Tian, Kevin
2017-08-04  9:42                           ` Jean-Philippe Brucker
2017-08-11  6:29                             ` Tian, Kevin
2017-08-11 16:25                             ` Raj, Ashok
2017-08-14  8:00                               ` Tian, Kevin
2017-08-14  9:07                                 ` Jean-Philippe Brucker
