From: Jason Wang <jasowang@redhat.com>
To: Yongji Xie <xieyongji@bytedance.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [External] Re: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
Date: Fri, 23 Oct 2020 16:44:45 +0800
Message-ID: <427448f0-58ba-0730-d199-6c8cd818ea63@redhat.com>
In-Reply-To: <CACycT3s2GZ3yKP+Xn2V83_-=tXg342J4n91ZAb0c-+UD_+sFnA@mail.gmail.com>


On 2020/10/23 10:55 AM, Yongji Xie wrote:
>
>
> On Tue, Oct 20, 2020 at 5:13 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
>     On 2020/10/20 4:35 PM, Yongji Xie wrote:
>     >
>     >
>     > On Tue, Oct 20, 2020 at 4:01 PM Jason Wang <jasowang@redhat.com> wrote:
>     >
>     >
>     >     On 2020/10/20 3:39 PM, Yongji Xie wrote:
>     >     >
>     >     >
>     >     > On Tue, Oct 20, 2020 at 11:20 AM Jason Wang <jasowang@redhat.com> wrote:
>     >     >
>     >     >
>     >     >     On 2020/10/19 10:56 PM, Xie Yongji wrote:
>     >     >     > This series introduces a framework that can be used to
>     >     >     > implement vDPA devices in a userspace program. The work
>     >     >     > consists of two parts: control path emulation and data
>     >     >     > path offloading.
>     >     >     >
>     >     >     > In the control path, the VDUSE driver uses a message
>     >     >     > mechanism to forward actions (get/set features, get/set
>     >     >     > status, get/set config space and set virtqueue states)
>     >     >     > from the virtio-vdpa driver to userspace. Userspace can
>     >     >     > use read()/write() to receive and reply to those control
>     >     >     > messages.
>     >     >     >
>     >     >     > In the data path, the VDUSE driver implements an MMU-based
>     >     >     > on-chip IOMMU driver which supports both direct mapping
>     >     >     > and indirect mapping with a bounce buffer. Userspace can
>     >     >     > then access that IOVA space via mmap(). In addition, the
>     >     >     > eventfd mechanism is used to trigger interrupts and
>     >     >     > forward virtqueue kicks.
>     >     >
>     >     >
>     >     >     This is pretty interesting!
>     >     >
>     >     >     For vhost-vdpa, it should work, but for virtio-vdpa, I think
>     >     >     we should deal carefully with the IOMMU/DMA ops.
>     >     >
>     >     >
>     >     >     I notice that neither dma_map nor set_map is implemented in
>     >     >     vduse_vdpa_config_ops, which means you want to let vhost-vDPA
>     >     >     deal with the IOMMU domains. Any reason for doing that?
>     >     >
>     >     > Actually, this series only focuses on the virtio-vdpa case for
>     >     > now. To support vhost-vdpa, as you said, we need to implement
>     >     > dma_map/dma_unmap. But there is a limitation: the VM's memory
>     >     > can't be anonymous pages, which are forbidden in
>     >     > vm_insert_page(). Maybe we need to add some restrictions on
>     >     > vhost-vdpa?
>     >
>     >
>     >     I'm not sure I get this. Is there any reason that you want to use
>     >     vm_insert_page() on the VM's memory? Or do you mean you want to
>     >     implement some kind of zero-copy?
>     >
>     >
>     >
>     > If my understanding is right, we will have a QEMU (VM) process and a
>     > device emulation process in the vhost-vdpa case, right? When I/O
>     > happens, the virtio driver in the VM will put the IOVA into the vring
>     > and the device emulation process will get the IOVA from the vring.
>     > Then the device emulation process will translate the IOVA to its VA
>     > to access the DMA buffer, which resides in the VM's memory. That means
>     > the device emulation process needs to access the VM's memory, so we
>     > should use vm_insert_page() to build the page table of the device
>     > emulation process.
>
>
>     Ok, I get you now. So it looks to me that the real issue is not the
>     limitation on anonymous pages; see the comment above
>     vm_insert_page():
>
>     "
>
>       * The page has to be a nice clean _individual_ kernel allocation.
>     "
>
>     So I suspect that using vm_insert_page() to share pages between
>     processes is not legal. We need input from MM experts.
>
>
> Yes, vm_insert_page() can't be used in this case. So could we add the
> shmfd to the vhost IOTLB message and pass it to the device emulation
> process as a new iova_domain, just like vhost-user does?
>
> Thanks,
> Yongji


I think vhost-user did that via SET_MEM_TABLE, which is not supported by
vDPA. Note that the current IOTLB message will be used when vIOMMU is
enabled.
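
To make the userspace side concrete, here is a minimal sketch (in Python,
purely for illustration) of what a device emulation process could do with a
set of (IOVA, size, fd) regions, however they end up being delivered: map
each fd and translate an IOVA into an offset within its own mapping. Region,
IovaTable and share_buffer are hypothetical names, not VDUSE or vhost-user
APIs, and a temporary file stands in for the shmfd. The two mappings live in
one process here only to keep the example self-contained.

```python
import bisect
import mmap
import tempfile
from dataclasses import dataclass


@dataclass
class Region:
    """One IOVA range backed by a shared mapping (hypothetical layout)."""
    iova: int        # first I/O virtual address covered by the region
    size: int        # length of the region in bytes
    mem: mmap.mmap   # this process's own mapping of the shared fd


class IovaTable:
    """Table of non-overlapping regions, kept sorted by start IOVA."""

    def __init__(self):
        self._starts = []
        self._regions = []

    def add(self, region):
        # Insert while keeping both lists ordered by start address.
        idx = bisect.bisect_left(self._starts, region.iova)
        self._starts.insert(idx, region.iova)
        self._regions.insert(idx, region)

    def translate(self, iova, length):
        """Return (mapping, offset) for [iova, iova + length)."""
        # Find the last region whose start is <= iova.
        idx = bisect.bisect_right(self._starts, iova) - 1
        if idx < 0:
            raise LookupError(f"IOVA {iova:#x} is not mapped")
        region = self._regions[idx]
        offset = iova - region.iova
        if offset + length > region.size:
            raise LookupError(f"IOVA range at {iova:#x} leaves its region")
        return region.mem, offset

    def read(self, iova, length):
        """Copy `length` bytes out of the buffer located at `iova`."""
        mem, offset = self.translate(iova, length)
        return bytes(mem[offset:offset + length])


def share_buffer(size):
    """Map one file twice, the way two processes sharing an fd would."""
    backing = tempfile.TemporaryFile()   # assumption: stand-in for a shmfd
    backing.truncate(size)
    driver_view = mmap.mmap(backing.fileno(), size)   # VM side
    device_view = mmap.mmap(backing.fileno(), size)   # device emulation side
    return backing, driver_view, device_view
```

With a region registered at IOVA 0x10000, data written through driver_view
at offset 4096 is what table.read(0x11000, n) returns through device_view.
No vm_insert_page() is involved, since each side simply mmap()s the same fd.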

This needs more thought. I will come back if I have any ideas.

Thanks


>
>
>
>
>     >
>     >     I guess that the software device implementation in user space
>     >     only needs to receive IOVA ranges and map them in its own address
>     >     space.
>     >
>     >
>     > How do we map them in its own address space if we don't use
>     > vm_insert_page()?
>


