From: Jason Wang <jasowang@redhat.com>
To: "Jacob Pan (Jun)" <jacob.jun.pan@intel.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	kevin.tian@intel.com, "Raj, Ashok" <ashok.raj@intel.com>,
	kvm@vger.kernel.org, "Michael S. Tsirkin" <mst@redhat.com>,
	stefanha@gmail.com, yi.y.sun@intel.com,
	Alex Williamson <alex.williamson@redhat.com>,
	iommu@lists.linux-foundation.org,
	Jason Gunthorpe <jgg@nvidia.com>,
	hao.wu@intel.com, jun.j.tian@intel.com
Subject: Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
Date: Fri, 18 Sep 2020 11:58:29 +0800
Message-ID: <ad909852-1201-6a4a-ffb0-9ad0df6c2e81@redhat.com>
In-Reply-To: <20200917111754.00006bcc@intel.com>


On 2020/9/18 2:17 AM, Jacob Pan (Jun) wrote:
> Hi Jason,
> On Thu, 17 Sep 2020 11:53:49 +0800, Jason Wang <jasowang@redhat.com>
> wrote:
>
>> On 2020/9/17 7:09 AM, Jacob Pan (Jun) wrote:
>>> Hi Jason,
>>> On Wed, 16 Sep 2020 15:38:41 -0300, Jason Gunthorpe <jgg@nvidia.com>
>>> wrote:
>>>   
>>>> On Wed, Sep 16, 2020 at 11:21:10AM -0700, Jacob Pan (Jun) wrote:
>>>>> Hi Jason,
>>>>> On Wed, 16 Sep 2020 14:01:13 -0300, Jason Gunthorpe
>>>>> <jgg@nvidia.com> wrote:
>>>>>       
>>>>>> On Wed, Sep 16, 2020 at 09:33:43AM -0700, Raj, Ashok wrote:
>>>>>>> On Wed, Sep 16, 2020 at 12:07:54PM -0300, Jason Gunthorpe
>>>>>>> wrote:
>>>>>>>> On Tue, Sep 15, 2020 at 05:22:26PM -0700, Jacob Pan (Jun)
>>>>>>>> wrote:
>>>>>>>>>> If user space wants to bind page tables, create the PASID
>>>>>>>>>> with /dev/sva, use ioctls there to set up the page table
>>>>>>>>>> the way it wants, then pass the now-configured PASID to a
>>>>>>>>>> driver that can use it.
>>>>>>>>> Are we talking about bare metal SVA?
>>>>>>>> What a weird term.
>>>>>>> Glad you noticed it at v7 :-)
>>>>>>>
>>>>>>> Any suggestions on something less weird than
>>>>>>> Shared Virtual Addressing? There is a reason why we moved from
>>>>>>> SVM to SVA.
>>>>>> SVA is fine, what is "bare metal" supposed to mean?
>>>>>>       
>>>>> What I meant here is sharing a virtual address space between DMA
>>>>> and a host process. This requires that devices perform DMA
>>>>> requests with a PASID and use the IOMMU first-level/stage-1 page
>>>>> tables. This can be further divided into 1) user SVA and 2)
>>>>> supervisor SVA (sharing init_mm)
>>>>>
>>>>> My point is that /dev/sva is not useful here since the driver can
>>>>> perform PASID allocation while doing SVA bind.
>>>> No, you are thinking too small.
>>>>
>>>> Look at VDPA, it has an SVA uAPI. Some HW might use a PASID for
>>>> the SVA.
>>> Could you point me to the SVA uAPI? I couldn't find it in
>>> mainline. It seems VDPA uses the VHOST interface?
>>
>> It's the vhost_iotlb_msg defined in uapi/linux/vhost_types.h.
>>
> Thanks for the pointer. For complete vSVA functionality we would need:
> 1. TLB flush (IOTLB, PASID cache, etc.)
> 2. PASID alloc/free
> 3. bind/unbind page tables or PASID tables
> 4. page request service
>
> Seems vhost_iotlb_msg (reproduced below) can be used for #1
> partially. And the proposal is to move the rest into /dev/sva? That
> seems awkward, as Alex pointed out earlier for a similar situation in
> VFIO.
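
For reference, this is the message format in question, copied from
uapi/linux/vhost_types.h around the time of this thread; note that it
carries no PASID/ASID field:

struct vhost_iotlb_msg {
	__u64 iova;
	__u64 size;
	__u64 uaddr;
#define VHOST_ACCESS_RO      0x1
#define VHOST_ACCESS_WO      0x2
#define VHOST_ACCESS_RW      0x3
	__u8 perm;
#define VHOST_IOTLB_MISS           1
#define VHOST_IOTLB_UPDATE         2
#define VHOST_IOTLB_INVALIDATE     3
#define VHOST_IOTLB_ACCESS_FAIL    4
	__u8 type;
};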


Considering it doesn't have any PASID support yet, my understanding is
that if we go with /dev/sva:

- vhost uAPI will still keep the uAPI for associating an ASID with a
specific virtqueue
- except for this, we can use /dev/sva for all the remaining (P)ASID
operations
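
A minimal userspace sketch of that split, assuming a hypothetical
/dev/sva node with SVA_ALLOC_PASID/SVA_BIND_PGTBL ioctls and a
hypothetical VHOST_SET_VRING_ASID vhost ioctl (none of these names
exist upstream; only struct vhost_vring_state is real):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

struct sva_bind_data {			/* hypothetical bind descriptor */
	__u32 pasid;
	__u64 pgtbl_ptr;		/* guest page table pointer, ... */
};

/* Placeholder ioctl numbers for the hypothetical interfaces. */
#define SVA_ALLOC_PASID      _IOR('S', 0x00, __u32)
#define SVA_BIND_PGTBL       _IOW('S', 0x01, struct sva_bind_data)
#define VHOST_SET_VRING_ASID _IOW(VHOST_VIRTIO, 0x80, struct vhost_vring_state)

static int setup_vq_pasid(int vhost_fd, unsigned int vq_index,
			  struct sva_bind_data *bind)
{
	int sva_fd = open("/dev/sva", O_RDWR);
	__u32 pasid;

	if (sva_fd < 0)
		return -1;

	/* PASID allocation and page-table bind go through /dev/sva ... */
	ioctl(sva_fd, SVA_ALLOC_PASID, &pasid);
	bind->pasid = pasid;
	ioctl(sva_fd, SVA_BIND_PGTBL, bind);

	/* ... while vhost keeps only the virtqueue <-> ASID association. */
	struct vhost_vring_state s = { .index = vq_index, .num = pasid };
	return ioctl(vhost_fd, VHOST_SET_VRING_ASID, &s);
}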


>
>>>   
>>>> When VDPA is used by DPDK it makes sense that the PASID will be SVA
>>>> and 1:1 with the mm_struct.
>>>>   
>>> I still don't see why bare-metal DPDK needs to get a handle on the
>>> PASID.
>>
>> My understanding is that it may:
>>
>> - have a unified uAPI with vSVA: alloc, bind, unbind, free
> Got your point, but vSVA needs more than these


Yes, it's just a subset of what vSVA requires.


>
>> - leave the binding policy to userspace instead of using an implied
>> one in the kernel
>>
> Only if necessary.


Yes, I think it's all about visibility (flexibility) and manageability.

Consider a device that has queues A, B, and C. We dedicate queues A and
B to one PASID (for vSVA) and queue C to another PASID (for SVA). It
looks to me like the current sva_bind() API doesn't support this: we
still need an API for allocating a PASID for SVA and assigning it to
the (mediated) device. This case is pretty common when implementing a
shadow queue for a guest.
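
As a rough kernel-side sketch of the A/B/C case (sva_alloc_pasid() and
dev_assign_queue_pasid() are made-up names standing in for the missing
API; iommu_sva_bind_device()/iommu_sva_get_pasid() are the existing
sva_bind() path):

#include <linux/iommu.h>
#include <linux/sched.h>
#include <linux/err.h>

/* Hypothetical helpers standing in for the missing pieces. */
u32 sva_alloc_pasid(void);
void dev_assign_queue_pasid(struct device *dev, int queue, u32 pasid);

static int assign_queue_pasids(struct device *dev)
{
	struct iommu_sva *handle;

	/* Queues A and B share one PASID for vSVA; the guest page table
	 * will be bound to it later. There is no generic API today to
	 * just allocate this PASID and hand it to a (mediated) device. */
	u32 pasid_vsva = sva_alloc_pasid();

	dev_assign_queue_pasid(dev, 0, pasid_vsva);	/* queue A */
	dev_assign_queue_pasid(dev, 1, pasid_vsva);	/* queue B */

	/* Queue C uses host SVA (e.g. for the shadow queue), bound to
	 * the current process via the existing sva_bind() path. */
	handle = iommu_sva_bind_device(dev, current->mm, NULL);
	if (IS_ERR(handle))
		return PTR_ERR(handle);
	dev_assign_queue_pasid(dev, 2, iommu_sva_get_pasid(handle));

	return 0;
}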


>
>>> Perhaps the SVA patch would explain. Or are you talking about the
>>> vDPA DPDK process that is used to support virtio-net-pmd in the
>>> guest?
>>>> When VDPA is used by qemu it makes sense that the PASID will be an
>>>> arbitrary IOVA map constructed to be 1:1 with the guest vCPU
>>>> physical map. /dev/sva allows a single uAPI to do this kind of
>>>> setup, and qemu can support it while supporting a range of SVA
>>>> kernel drivers. VDPA and vfio-mdev are obvious initial targets.
>>>>
>>>> *BOTH* are needed.
>>>>
>>>> In general any uAPI for PASID should have the option to use either
>>>> the mm_struct SVA PASID *OR* a PASID from /dev/sva. It costs
>>>> virtually nothing to implement this in the driver as PASID is just
>>>> a number, and gives so much more flexibility.
>>>>   
>>> Not really nothing in terms of PASID life cycles. For example, if a
>>> user uses the uacce interface to open an accelerator, it gets an
>>> FD_acc. Then it opens /dev/sva to allocate a PASID and gets another
>>> FD_pasid. Then we pass FD_pasid to the driver to bind page tables,
>>> perhaps for multiple drivers. Now we have to worry about what
>>> happens if FD_pasid gets closed before the FD_acc(s) are closed,
>>> and all these race conditions.
>>
>> I'm not sure I understand this. But this demonstrates the flexibility
>> of a unified uAPI. E.g. it allows a vDPA and a VFIO device to use the
>> same PASID, which can be shared with a process in the guest.
>>
> This is for user DMA, not for vSVA. I was contending that /dev/sva
> creates unnecessary steps for such usage.


A question here is where the PASID management is expected to be done.
I'm not quite sure the silent 1:1 binding done in intel_svm_bind_mm()
can satisfy the requirements of a management layer.


>
> For vSVA, I think vDPA and VFIO can potentially share but I am not
> seeing convincing benefits.
>
> If a guest process wants to do SVA with a VFIO-assigned device and a
> vDPA-backed virtio-net at the same time, it might be a limitation if
> the PASID is not managed via a common interface.


Yes.


> But I am not sure what vDPA
> SVA support will look like. Does it support gIOVA? Does it need
> virtio-IOMMU?


Yes, it supports gIOVA, and it should work with any type of vIOMMU. I
think vDPA will start with Intel vIOMMU support in QEMU.

For virtio-IOMMU, we will probably support it in the future,
considering that it doesn't have any SVA capability yet and doesn't
use a page table that can be nested via a hardware IOMMU.


>> For the race condition, it could probably be solved with a refcnt.
>>
> Agreed, but the best solution might be not to have the problem in the
> first place :)


I agree, it's only worth the bother if it has real benefits.
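
(For completeness, a minimal sketch of what the refcnt approach could
look like; struct sva_pasid and its helpers are hypothetical, while
kref and ioasid_free() are existing kernel primitives.)

#include <linux/kref.h>
#include <linux/slab.h>
#include <linux/ioasid.h>

/* Hypothetical object shared by FD_pasid and every FD_acc binding. */
struct sva_pasid {
	struct kref kref;
	ioasid_t pasid;
};

static void sva_pasid_release(struct kref *kref)
{
	struct sva_pasid *p = container_of(kref, struct sva_pasid, kref);

	ioasid_free(p->pasid);	/* only freed once the last user is gone */
	kfree(p);
}

/* Each driver bind (FD_acc) takes a reference ... */
static void sva_pasid_get(struct sva_pasid *p)
{
	kref_get(&p->kref);
}

/* ... and each fd close drops one, so close ordering no longer matters. */
static void sva_pasid_put(struct sva_pasid *p)
{
	kref_put(&p->kref, sva_pasid_release);
}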

Thanks


>
>> Thanks
>>
>>
>>> If we do not expose FD_pasid to the user, the teardown is much
>>> simpler and streamlined. Following each FD_acc close, PASID unbind
>>> is performed.
>>>>> Yi can correct me, but this set is about VFIO-PCI; VFIO-mdev
>>>>> will be introduced later.
>>>> Last patch is:
>>>>
>>>>     vfio/type1: Add vSVA support for IOMMU-backed mdevs
>>>>
>>>> So pretty hard to see how this is not about vfio-mdev, at least a
>>>> little..
>>>>
>>>> Jason
>>> Thanks,
>>>
>>> Jacob
>>>   
>
> Thanks,
>
> Jacob
>
