From: Jason Wang <jasowang@redhat.com>
To: Yongji Xie <xieyongji@bytedance.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Parav Pandit <parav@nvidia.com>, Bob Liu <bob.liu@oracle.com>,
	Christoph Hellwig <hch@infradead.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	viro@zeniv.linux.org.uk, Jens Axboe <axboe@kernel.dk>,
	bcrl@kvack.org, Jonathan Corbet <corbet@lwn.net>,
	virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, kvm@vger.kernel.org, linux-aio@kvack.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC v4 06/11] vduse: Implement an MMU-based IOMMU driver
Date: Mon, 8 Mar 2021 11:52:18 +0800
Message-ID: <0b671aef-f2b2-6162-f407-7ca5178dbebb@redhat.com>
In-Reply-To: <CACycT3uA5y=jcKPwu6rZ83Lqf1ytuPhnxWLCeMpDYrvRodHFVg@mail.gmail.com>


On 2021/3/8 11:45 AM, Yongji Xie wrote:
> On Mon, Mar 8, 2021 at 11:17 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/3/5 3:59 PM, Yongji Xie wrote:
>>> On Fri, Mar 5, 2021 at 3:27 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/3/5 3:13 PM, Yongji Xie wrote:
>>>>> On Fri, Mar 5, 2021 at 2:52 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2021/3/5 2:15 PM, Yongji Xie wrote:
>>>>>>
>>>>>> Sorry if I've asked this before.
>>>>>>
>>>>>> But what's the reason for maintaining a dedicated IOTLB here? I think we
>>>>>> could reuse vduse_dev->iommu, since the device cannot be used by both
>>>>>> virtio and vhost at the same time, or use vduse_iova_domain->iotlb for
>>>>>> set_map().
>>>>>>
>>>>>> The main difference between domain->iotlb and dev->iotlb is the way they
>>>>>> deal with the bounce buffer. In the domain->iotlb case, the bounce buffer
>>>>>> needs to be mapped for each DMA transfer because we need to get the bounce
>>>>>> pages by an IOVA during DMA unmapping. In the dev->iotlb case, the bounce
>>>>>> buffer only needs to be mapped once during initialization, and that mapping
>>>>>> is used to tell userspace how to do mmap().
>>>>>>
>>>>>> Also, since vhost IOTLB supports a per-mapping token (opaque), can we use
>>>>>> that instead of the bounce_pages *?
>>>>>>
>>>>>> Sorry, I didn't get you here. Which value do you mean to store in the
>>>>>> opaque pointer?
>>>>>>
>>>>>> So I would like to have a way to use a single IOTLB to manage all kinds
>>>>>> of mappings. Two possible ideas:
>>>>>>
>>>>>> 1) map bounce pages one by one in vduse_dev_map_page(); in
>>>>>> VDUSE_IOTLB_GET_FD, try to merge the result if we had the same fd. Then
>>>>>> for bounce pages, userspace still only needs to map them once and we can
>>>>>> maintain the actual mapping by storing the page or pa in the opaque
>>>>>> field of the IOTLB entry.
>>>>>>
>>>>>> Looks like userspace still needs to unmap the old region and map a new
>>>>>> region (size is changed) with the fd in each VDUSE_IOTLB_GET_FD ioctl.
>>>>>>
>>>>>>
>>>>>> I don't get it here. Can you give an example?
>>>>>>
>>>>> For example, userspace needs to process two I/O requests (one page per
>>>>> request). To process the first request, userspace uses
>>>>> VDUSE_IOTLB_GET_FD ioctl to query the iova region (0 ~ 4096) and mmap
>>>>> it.
>>>> I think in this case we should let VDUSE_IOTLB_GET_FD return the maximum
>>>> range as far as they are backed by the same fd.
>>>>
>>> But now the bounce page is mapped one by one. The second page (4096 ~
>>> 8192) might not be mapped when userspace is processing the first
>>> request. So the maximum range is 0 ~ 4096 at that time.
>>>
>>> Thanks,
>>> Yongji
>>
>> A question: if I read the code correctly, VDUSE_IOTLB_GET_FD will return
>> the whole bounce map range which is set up in vduse_dev_map_page()? So my
>> understanding is that userspace may choose to map all its range via mmap().
>>
> Yes.
>
>> So if we 'map' bounce pages one by one in vduse_dev_map_page() (here
>> 'map' means using multiple itree entries instead of a single one), then
>> in VDUSE_IOTLB_GET_FD we can keep traversing the itree (dev->iommu)
>> until the range is backed by a different file.
>>
>> With this, there are no userspace-visible changes and there's no need for
>> the domain->iotlb?
>>
> In this case, I wonder what range can be obtained if userspace calls
> VDUSE_IOTLB_GET_FD when the first I/O (e.g. 4K) occurs. [0, 4K] or [0,
> 64M]? In the current implementation, userspace will map [0, 64M].


It should still be [0, 64M). Do you see any issue?

Thanks


>
> Thanks,
> Yongji
>
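
To make the "keep traversing until the range is backed by a different file" idea concrete, here is a minimal userspace sketch. It is not the VDUSE code: struct map_entry, lookup_merged_range() and the integer file_id are hypothetical stand-ins for the itree entries and the backing struct file, and a sorted array replaces the interval tree. The sketch also extends the range in both directions, whereas the proposal above only mentions traversing forward.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one itree/IOTLB entry: an inclusive IOVA
 * range plus an identifier for the file that backs it. */
struct map_entry {
	uint64_t start;   /* first IOVA covered */
	uint64_t last;    /* last IOVA covered (inclusive) */
	int file_id;      /* stand-in for the backing struct file */
};

/*
 * Find the widest contiguous range around iova that is backed by a single
 * file. entries[] must be sorted by start and must not overlap.
 * Returns 0 and fills the outputs on success, -1 if iova is not mapped.
 */
static int lookup_merged_range(const struct map_entry *entries, size_t n,
			       uint64_t iova, uint64_t *start,
			       uint64_t *last, int *file_id)
{
	size_t i, lo, hi;

	/* Locate the entry that covers iova (linear scan for simplicity). */
	for (i = 0; i < n; i++)
		if (entries[i].start <= iova && iova <= entries[i].last)
			break;
	if (i == n)
		return -1;

	lo = hi = i;
	/* Extend left while the previous entry is adjacent and has the same file. */
	while (lo > 0 && entries[lo - 1].file_id == entries[i].file_id &&
	       entries[lo - 1].last + 1 == entries[lo].start)
		lo--;
	/* Extend right while the next entry is adjacent and has the same file. */
	while (hi + 1 < n && entries[hi + 1].file_id == entries[i].file_id &&
	       entries[hi].last + 1 == entries[hi + 1].start)
		hi++;

	*start = entries[lo].start;
	*last = entries[hi].last;
	*file_id = entries[i].file_id;
	return 0;
}
```

With per-page entries, the result of such a walk depends on which entries exist at the time of the query: if only the first 4K page has an entry, the walk stops at that page, which is the case the question above about [0, 4K] versus [0, 64M] is probing.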

