From: Jason Wang <jasowang@redhat.com>
To: Yongji Xie <xieyongji@bytedance.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Parav Pandit <parav@nvidia.com>, Bob Liu <bob.liu@oracle.com>,
	Christoph Hellwig <hch@infradead.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	viro@zeniv.linux.org.uk, Jens Axboe <axboe@kernel.dk>,
	bcrl@kvack.org, Jonathan Corbet <corbet@lwn.net>,
	virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC v4 06/11] vduse: Implement an MMU-based IOMMU driver
Date: Mon, 8 Mar 2021 11:52:18 +0800
Message-ID: <0b671aef-f2b2-6162-f407-7ca5178dbebb@redhat.com>
In-Reply-To: <CACycT3uA5y=jcKPwu6rZ83Lqf1ytuPhnxWLCeMpDYrvRodHFVg@mail.gmail.com>

On 2021/3/8 11:45 AM, Yongji Xie wrote:
> On Mon, Mar 8, 2021 at 11:17 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/3/5 3:59 PM, Yongji Xie wrote:
>>> On Fri, Mar 5, 2021 at 3:27 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/3/5 3:13 PM, Yongji Xie wrote:
>>>>> On Fri, Mar 5, 2021 at 2:52 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2021/3/5 2:15 PM, Yongji Xie wrote:
>>>>>>
>>>>>> Sorry if I've asked this before.
>>>>>>
>>>>>> But what's the reason for maintaining a dedicated IOTLB here? I think we
>>>>>> could reuse vduse_dev->iommu since the device can not be used by both
>>>>>> virtio and vhost at the same time or use vduse_iova_domain->iotlb for
>>>>>> set_map().
>>>>>>
>>>>>> The main difference between domain->iotlb and dev->iotlb is the way to
>>>>>> deal with the bounce buffer. In the domain->iotlb case, the bounce buffer
>>>>>> needs to be mapped for each DMA transfer because we need to get the bounce
>>>>>> pages by an IOVA during DMA unmapping. In the dev->iotlb case, the bounce
>>>>>> buffer only needs to be mapped once during initialization, which will
>>>>>> be used to tell userspace how to do mmap().
>>>>>>
>>>>>> Also, since vhost IOTLB supports a per-mapping token (opaque), can we use
>>>>>> that instead of the bounce_pages *?
>>>>>>
>>>>>> Sorry, I didn't get you here. Which value do you mean to store in the
>>>>>> opaque pointer?
>>>>>>
>>>>>> So I would like to have a way to use a single IOTLB to manage all kinds
>>>>>> of mappings. Two possible ideas:
>>>>>>
>>>>>> 1) map bounce pages one by one in vduse_dev_map_page(); in
>>>>>> VDUSE_IOTLB_GET_FD, try to merge the result if we had the same fd. Then
>>>>>> for bounce pages, userspace still only needs to map it once and we can
>>>>>> maintain the actual mapping by storing the page or pa in the opaque
>>>>>> field of the IOTLB entry.
>>>>>>
>>>>>> Looks like userspace still needs to unmap the old region and map a new
>>>>>> region (size is changed) with the fd in each VDUSE_IOTLB_GET_FD ioctl.
>>>>>>
>>>>>>
>>>>>> I don't get it here. Can you give an example?
>>>>>>
>>>>> For example, userspace needs to process two I/O requests (one page per
>>>>> request). To process the first request, userspace uses the
>>>>> VDUSE_IOTLB_GET_FD ioctl to query the iova region (0 ~ 4096) and mmap
>>>>> it.
>>>> I think in this case we should let VDUSE_IOTLB_GET_FD return the maximum
>>>> range as long as the entries are backed by the same fd.
>>>>
>>> But now the bounce page is mapped one by one. The second page (4096 ~
>>> 8192) might not be mapped when userspace is processing the first
>>> request. So the maximum range is 0 ~ 4096 at that time.
>>>
>>> Thanks,
>>> Yongji
>>
>> A question: if I read the code correctly, VDUSE_IOTLB_GET_FD will return
>> the whole bounce map range which is set up in vduse_dev_map_page()? So my
>> understanding is that userspace may choose to map all its range via mmap().
>>
> Yes.
>
>> So if we 'map' bounce pages one by one in vduse_dev_map_page() (here
>> 'map' means using multiple itree entries instead of a single one), then
>> in VDUSE_IOTLB_GET_FD we can keep traversing the itree (dev->iommu)
>> until the range is backed by a different file.
>>
>> With this, there are no userspace-visible changes and there's no need for
>> the domain->iotlb?
>>
> In this case, I wonder what range can be obtained if userspace calls
> VDUSE_IOTLB_GET_FD when the first I/O (e.g. 4K) occurs. [0, 4K] or [0,
> 64M]? In the current implementation, userspace will map [0, 64M].

It should still be [0, 64M). Do you see any issue?

Thanks

>
> Thanks,
> Yongji
>
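The "keep traversing until the range is backed by a different file" idea discussed above can be sketched in plain C. This is only an illustration under stated assumptions, not the VDUSE implementation: `struct map_entry`, `file_id`, and `max_range_same_file()` are hypothetical names, and a sorted array stands in for the kernel interval tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one IOTLB (interval tree) entry. */
struct map_entry {
	uint64_t start;   /* first IOVA covered, inclusive */
	uint64_t last;    /* last IOVA covered, inclusive */
	int file_id;      /* identifies the backing file (assumption) */
};

/*
 * Entries must be sorted by start and must not overlap.
 * Finds the entry containing iova, then widens the result in both
 * directions while the neighbouring entry is contiguous and backed
 * by the same file. Returns false if iova is not mapped at all.
 */
static bool max_range_same_file(const struct map_entry *e, size_t n,
				uint64_t iova,
				uint64_t *start, uint64_t *last)
{
	size_t i, lo, hi;

	for (i = 0; i < n; i++)
		if (iova >= e[i].start && iova <= e[i].last)
			break;
	if (i == n)
		return false;

	lo = hi = i;
	/* Extend towards lower addresses. */
	while (lo > 0 && e[lo - 1].file_id == e[i].file_id &&
	       e[lo - 1].last + 1 == e[lo].start)
		lo--;
	/* Extend towards higher addresses. */
	while (hi + 1 < n && e[hi + 1].file_id == e[i].file_id &&
	       e[hi].last + 1 == e[hi + 1].start)
		hi++;

	*start = e[lo].start;
	*last = e[hi].last;
	return true;
}
```

With per-page entries, the merged range only covers pages that are already present in the table, which is the concern raised about the first 4K I/O; whether the returned range stays [0, 64M) depends on how the bounce region is represented, which this sketch does not settle.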