linux-mm.kvack.org archive mirror
From: Yongji Xie <xieyongji@bytedance.com>
To: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	 virtualization@lists.linux-foundation.org
Subject: Re: [External] Re: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
Date: Tue, 20 Oct 2020 16:35:32 +0800	[thread overview]
Message-ID: <CACycT3ssE-iMquAmrrHGQyBCv7XkQ2WrinFMMPTTubxuuOQ92g@mail.gmail.com> (raw)
In-Reply-To: <c83aad4f-8ac5-3279-0429-ae2154622fe5@redhat.com>

On Tue, Oct 20, 2020 at 4:01 PM Jason Wang <jasowang@redhat.com> wrote:

>
> On 2020/10/20 3:39 PM, Yongji Xie wrote:
> >
> >
> > On Tue, Oct 20, 2020 at 11:20 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> >     On 2020/10/19 10:56 PM, Xie Yongji wrote:
> >     > This series introduces a framework, which can be used to implement
> >     > vDPA devices in a userspace program. To implement it, the work
> >     > consists of two parts: control path emulation and data path
> >     > offloading.
> >     >
> >     > In the control path, the VDUSE driver will make use of a message
> >     > mechanism to forward the actions (get/set features, get/set status,
> >     > get/set config space and set virtqueue states) from the virtio-vdpa
> >     > driver to userspace. Userspace can use read()/write() to
> >     > receive/reply to those control messages.
> >     >
> >     > In the data path, the VDUSE driver implements an MMU-based
> >     > on-chip IOMMU driver which supports both direct mapping and
> >     > indirect mapping with a bounce buffer. Then userspace can access
> >     > that IOVA space via mmap(). Besides, the eventfd mechanism is used
> >     > to trigger interrupts and forward virtqueue kicks.
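
(To make the control path concrete, here is a minimal sketch of the
userspace side. The message layout, type names and helper below are
hypothetical; this RFC does not pin down the exact UAPI.)

#include <stdint.h>
#include <unistd.h>

/* Hypothetical wire format -- not the actual VDUSE UAPI. */
struct vduse_msg {
	uint32_t type;       /* e.g. GET_FEATURES, SET_STATUS, SET_VQ_STATE */
	uint32_t request_id; /* echoed back so the kernel can match replies */
	uint64_t payload;    /* request- or reply-specific data */
};

static void handle_message(struct vduse_msg *msg); /* device-specific */

/* Each read() on the VDUSE char device delivers one control message
 * forwarded from the virtio-vdpa driver; the reply goes back via write(). */
static void vduse_control_loop(int dev_fd)
{
	struct vduse_msg msg;

	while (read(dev_fd, &msg, sizeof(msg)) == sizeof(msg)) {
		handle_message(&msg); /* fill in the reply fields */
		write(dev_fd, &msg, sizeof(msg));
	}
}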
> >
> >
> >     This is pretty interesting!
> >
> >     For vhost-vdpa, it should work, but for virtio-vdpa, I think we
> >     should carefully deal with the IOMMU/DMA ops stuff.
> >
> >
> >     I notice that neither dma_map nor set_map is implemented in
> >     vduse_vdpa_config_ops, which means you want to let vhost-vDPA deal
> >     with the IOMMU domain stuff. Any reason for doing that?
> >
> > Actually, this series only focuses on the virtio-vdpa case for now. To
> > support vhost-vdpa, as you said, we need to implement dma_map/dma_unmap.
> > But there is a limitation: the VM's memory can't be backed by anonymous
> > pages, which are rejected by vm_insert_page(). Maybe we need to add some
> > restrictions on vhost-vdpa?
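
(For context, that rejection comes from the helper behind vm_insert_page()
in mm/memory.c; approximately, in kernels around the time of this thread:)

/* Approximate excerpt from mm/memory.c (circa v5.9): anonymous pages
 * are refused before the PTE is installed. */
static int validate_page_before_insert(struct page *page)
{
	if (PageAnon(page) || PageSlab(page) || page_has_type(page))
		return -EINVAL;
	flush_dcache_page(page);
	return 0;
}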
>
>
> I'm not sure I get this. Any reason why you want to use
> vm_insert_page() on the VM's memory? Or do you mean you want to
> implement some kind of zero-copy?

If my understanding is right, we will have a QEMU (VM) process and a device
emulation process in the vhost-vdpa case, right? When I/O happens, the
virtio driver in the VM will put the IOVA into the vring, and the device
emulation process will get the IOVA from the vring. Then the device
emulation process will translate the IOVA to its own VA to access the DMA
buffer, which resides in the VM's memory. That means the device emulation
process needs to access the VM's memory, so we should use vm_insert_page()
to build the page tables of the device emulation process.
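
(As an illustration, assuming the device emulation process has mmap()ed
the whole IOVA range from the VDUSE fd into a single window, the
translation could be as simple as the sketch below; all names here are
made up for the example.)

#include <stdint.h>

static void *iova_window;   /* VA returned by mmap() on the VDUSE fd */
static uint64_t iova_base;  /* first IOVA covered by the window */
static uint64_t iova_len;   /* size of the window */

/* Translate an IOVA taken from a vring descriptor into a local VA. */
static void *iova_to_va(uint64_t iova, uint64_t size)
{
	if (iova < iova_base || iova - iova_base + size > iova_len)
		return NULL; /* descriptor points outside the mapping */
	return (char *)iova_window + (iova - iova_base);
}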

> I guess from the software device implementation in user space it only
> needs to receive IOVA ranges and map them in its own address space.


How would it map them into its own address space if we don't use
vm_insert_page()?

> >     The reasons for the questions are:
> >
> >     1) You've implemented an on-chip IOMMU driver but don't expose it to
> >     the generic IOMMU layer (or the generic IOMMU layer may need some
> >     extension to support this)
> >     2) We will probably remove the IOMMU domain management in vhost-vDPA
> >     and move it to the device (parent).
> >
> >     So if it's possible, please implement either set_map() or
> >     dma_map()/dma_unmap(); this may align with our future goal and may
> >     speed up the development.
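
(For reference, the callbacks in question live in struct vdpa_config_ops
in include/linux/vdpa.h; roughly, for kernels around the time of this
thread:)

/* Approximate excerpt from include/linux/vdpa.h (circa v5.9): a parent
 * driver implements either set_map() for whole-table updates or the
 * incremental dma_map()/dma_unmap() pair. */
struct vdpa_config_ops {
	/* ... device and virtqueue ops elided ... */
	int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
	int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
		       u64 pa, u32 perm);
	int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
};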
> >
> >     Btw, it would be helpful to give even more details on how the
> >     on-chip IOMMU driver is implemented.
> >
> >
> > The basic idea is treating the MMU (VA->PA) as an IOMMU (IOVA->PA), and
> > using vm_insert_page()/zap_page_range() to do the address
> > mapping/unmapping. The address mapping will be done in the page fault
> > handler because vm_insert_page() can't be called in atomic context such
> > as dma_map_ops->map_page().
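
(A rough sketch of that deferral; the domain struct and lookup helper are
hypothetical stand-ins for what the real patches do.)

/* Mapping is deferred to fault time: vm_insert_page() may sleep, so it
 * cannot run in atomic context such as dma_map_ops->map_page(). */
static vm_fault_t vduse_domain_fault(struct vm_fault *vmf)
{
	struct vduse_iova_domain *domain = vmf->vma->vm_private_data;
	struct page *page;

	/* Hypothetical helper: resolve the faulting IOVA to its page. */
	page = vduse_domain_lookup_page(domain, vmf->pgoff << PAGE_SHIFT);
	if (!page)
		return VM_FAULT_SIGBUS;

	if (vm_insert_page(vmf->vma, vmf->address, page))
		return VM_FAULT_SIGBUS;

	return VM_FAULT_NOPAGE; /* we installed the PTE ourselves */
}

static const struct vm_operations_struct vduse_domain_vm_ops = {
	.fault = vduse_domain_fault,
};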
>
>
> Ok, please add that to the cover letter or patch 2 in the next version.
>

> >
> >     >
> >     > The details and our user case is shown below:
> >     >
> >     > ------------------------     -----------------------------------------------------------
> >     > |                  APP |     | QEMU                                                    |
> >     > |      ---------       |     | --------------------  -------------------+<-->+------  |
> >     > |      |dev/vdx|       |     | | device emulation |  | virtio dataplane |    | BDS |  |
> >     > ------------+-----------     -----------+-----------------------+----------------+-----
> >     >             |                           |                       |                |
> >     >             |                           | emulating             | offloading     |
> >     > ------------+---------------------------+-----------------------+----------------+------
> >     > |    | block device |           |  vduse driver |       |  vdpa device |   | TCP/IP | |
> >     > |    -------+--------           --------+--------       +------+-------    ----+---- |
> >     > |           |                           |               |      |               |     |
> >     > |           |                           |               |      |               |     |
> >     > | ----------+----------       ----------+-----------   |      |               |     |
> >     > | | virtio-blk driver |       | virtio-vdpa driver |   |      |               |     |
> >     > | ----------+----------       ----------+-----------   |      |               |     |
> >     > |           |                           |               |      |               |     |
> >     > |           |                           -----------------      |               |     |
> >     > |           --------------------------------------------    ---+---           |     |
> >     > -----------------------------------------------------------| NIC |------------+------
> >     >                                                            ---+---
> >     >                                                               |
> >     >                                                      ---------+---------
> >     >                                                      | Remote Storages |
> >     >                                                      -------------------
> >
> >
> >     The figure is not very clear to me on the following points:
> >
> >     1) If the device emulation and the virtio dataplane are all
> >     implemented in QEMU, what's the point of doing this? I thought the
> >     device should be a remote process?
> >
> >     2) It would be better to draw a vDPA bus somewhere to help people
> >     understand the architecture.
> >     3) For the "offloading" I guess it should be done via vhost-vDPA,
> >     so it's better to draw a vhost-vDPA block there.
> >
> >
> > This figure only shows the virtio-vdpa case; I will take the vhost-vdpa
> > case into consideration in the next version.
>
>
> Please do that, otherwise this proposal is incomplete.
>
>
Sure.

Thanks,
Yongji


Thread overview: 28+ messages
2020-10-19 14:56 [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace Xie Yongji
2020-10-19 14:56 ` [RFC 1/4] mm: export zap_page_range() for driver use Xie Yongji
2020-10-19 15:14   ` Matthew Wilcox
2020-10-19 15:36     ` [External] " 谢永吉
2020-10-19 14:56 ` [RFC 2/4] vduse: Introduce VDUSE - vDPA Device in Userspace Xie Yongji
2020-10-19 15:08   ` Michael S. Tsirkin
2020-10-19 15:24     ` Randy Dunlap
2020-10-19 15:46       ` [External] " 谢永吉
2020-10-19 15:48     ` 谢永吉
2020-10-19 14:56 ` [RFC 3/4] vduse: grab the module's references until there is no vduse device Xie Yongji
2020-10-19 15:05   ` Michael S. Tsirkin
2020-10-19 15:44     ` [External] " 谢永吉
2020-10-19 15:47       ` Michael S. Tsirkin
2020-10-19 15:56         ` 谢永吉
2020-10-19 16:41           ` Michael S. Tsirkin
2020-10-20  7:42             ` Yongji Xie
2020-10-19 14:56 ` [RFC 4/4] vduse: Add memory shrinker to reclaim bounce pages Xie Yongji
2020-10-19 17:16 ` [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace Michael S. Tsirkin
2020-10-20  2:18   ` [External] " 谢永吉
2020-10-20  2:20     ` Jason Wang
2020-10-20  2:28       ` 谢永吉
2020-10-20  3:20 ` Jason Wang
2020-10-20  7:39   ` [External] " Yongji Xie
2020-10-20  8:01     ` Jason Wang
2020-10-20  8:35       ` Yongji Xie [this message]
2020-10-20  9:12         ` Jason Wang
2020-10-23  2:55           ` Yongji Xie
2020-10-23  8:44             ` Jason Wang
