From: "Michael S. Tsirkin" <mst@redhat.com>
To: Xie Yongji <xieyongji@bytedance.com>
Cc: jasowang@redhat.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, virtualization@lists.linux-foundation.org
Subject: Re: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
Date: Mon, 19 Oct 2020 13:16:10 -0400
Message-ID: <20201019130815-mutt-send-email-mst@kernel.org>
In-Reply-To: <20201019145623.671-1-xieyongji@bytedance.com>

On Mon, Oct 19, 2020 at 10:56:19PM +0800, Xie Yongji wrote:
> This series introduces a framework, which can be used to implement
> vDPA devices in a userspace program. The work consists of two
> parts: control path emulation and data path offloading.
>
> In the control path, the VDUSE driver uses a message
> mechanism to forward the actions (get/set features, get/set status,
> get/set config space and set virtqueue states) from virtio-vdpa
> driver to userspace. Userspace can use read()/write() to
> receive/reply to those control messages.
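To make the control path described above concrete, here is a minimal sketch of how a userspace device might service such messages with read()/write(). It is only an illustration under stated assumptions: the message layout, the message types, and every demo_* name are hypothetical and are not taken from include/uapi/linux/vduse.h in this series.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical message types, for illustration only. */
enum demo_msg_type {
    DEMO_GET_FEATURES = 1,
    DEMO_SET_FEATURES,
    DEMO_GET_STATUS,
    DEMO_SET_STATUS,
};

/* Hypothetical fixed-size control message (request and reply). */
struct demo_msg {
    uint32_t type;    /* one of enum demo_msg_type */
    uint32_t result;  /* reply only: 0 on success, nonzero on failure */
    uint64_t value;   /* feature bits or status byte, depending on type */
};

struct demo_dev {
    uint64_t offered_features; /* what the userspace device supports */
    uint64_t driver_features;  /* what the driver acknowledged */
    uint8_t status;            /* virtio device status byte */
};

/* Fill in the reply for one request; returns 0 if the type was recognized. */
static int demo_handle(struct demo_dev *dev, const struct demo_msg *req,
                       struct demo_msg *resp)
{
    memset(resp, 0, sizeof(*resp));
    resp->type = req->type;

    switch (req->type) {
    case DEMO_GET_FEATURES:
        resp->value = dev->offered_features;
        return 0;
    case DEMO_SET_FEATURES:
        /* Reject feature bits that were never offered. */
        if (req->value & ~dev->offered_features) {
            resp->result = 1;
            return 0;
        }
        dev->driver_features = req->value;
        return 0;
    case DEMO_GET_STATUS:
        resp->value = dev->status;
        return 0;
    case DEMO_SET_STATUS:
        dev->status = (uint8_t)req->value;
        return 0;
    default:
        resp->result = 1;
        return -1;
    }
}

/* Read one request from fd, handle it, and write the reply back to fd. */
static int demo_serve_one(int fd, struct demo_dev *dev)
{
    struct demo_msg req, resp;

    if (read(fd, &req, sizeof(req)) != (ssize_t)sizeof(req))
        return -1;
    demo_handle(dev, &req, &resp);
    if (write(fd, &resp, sizeof(resp)) != (ssize_t)sizeof(resp))
        return -1;
    return 0;
}
```

A daemon would presumably call something like demo_serve_one() in a loop on the device file descriptor. Keeping demo_handle() free of I/O means the message logic can be exercised without the kernel driver present.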
>
> In the data path, the VDUSE driver implements an MMU-based
> on-chip IOMMU driver which supports both direct mapping and
> indirect mapping with a bounce buffer. Userspace can then access
> that IOVA space via mmap(). In addition, an eventfd mechanism is
> used to trigger interrupts and forward virtqueue kicks.
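The eventfd part can be sketched on its own, since eventfd(2) semantics are fixed regardless of the driver. The following is a minimal sketch of signalling and consuming a notification. How an eventfd gets bound to a particular virtqueue or interrupt is defined by the series and is not shown here, and the demo_* names are hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add one event to the counter (a virtqueue kick or an interrupt request). */
static int demo_signal(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Block until the counter is nonzero, then fetch it and reset it to zero. */
static int demo_wait(int efd, uint64_t *count)
{
    return read(efd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

Without EFD_SEMAPHORE, several signals that arrive before a read are coalesced into a single wakeup, so a handler should drain the whole virtqueue on each wakeup rather than assume one request per notification.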
>
> The details and our use case are shown below:
>
> ------------------------     ------------------------------------------------------------
> |          APP         |     |                           QEMU                           |
> |       ---------      |     |  --------------------    -------------------+<-->+------ |
> |       |dev/vdx|      |     |  | device emulation |    | virtio dataplane |    | BDS | |
> ------------+-----------     -----------+-----------------------+-----------------+------
>             |                           |                       |                 |
>             |                           | emulating             | offloading      |
> ------------+---------------------------+-----------------------+-----------------+------
> |    | block device |           |  vduse driver |        |  vdpa device |    | TCP/IP | |
> |    -------+--------           --------+--------        +------+--------    -----+---- |
> |           |                           |                |      |                 |     |
> |           |                           |                |      |                 |     |
> | ----------+----------       ----------+-----------     |      |                 |     |
> | | virtio-blk driver |       | virtio-vdpa driver |     |      |                 |     |
> | ----------+----------       ----------+-----------     |      |                 |     |
> |           |                           |                |      |                 |     |
> |           |                           ------------------      |                 |     |
> |           -----------------------------------------------------              ---+---  |
> ------------------------------------------------------------------------------ | NIC |---
>                                                                                ---+---
>                                                                                   |
>                                                                          ---------+---------
>                                                                          | Remote Storages |
>                                                                          -------------------
> We make use of it to implement a block device connecting to
> our distributed storage, which can be used in containers and
> bare metal.

What is not exactly clear is what the APP above is doing.
Taking virtio blk requests and sending them over the network
in some proprietary way?

> Compared with the qemu-nbd solution, this solution has
> higher performance, and we can have a unified technology stack
> in VMs and containers for remote storage.
>
> To test it with a host disk (e.g. /dev/sdx):
>
> $ qemu-storage-daemon \
> --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
> --monitor chardev=charmonitor \
> --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/sdx,node-name=disk0 \
> --export vduse-blk,id=test,node-name=disk0,writable=on,vduse-id=1,num-queues=16,queue-size=128
>
> The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
>
> Future work:
> - Improve performance (e.g. zero copy implementation in datapath)
> - Config interrupt support
> - Userspace library (find a way to reuse device emulation code in qemu/rust-vmm)

How does this driver compare with vhost-user-blk (which doesn't need
kernel support)?

> Xie Yongji (4):
> mm: export zap_page_range() for driver use
> vduse: Introduce VDUSE - vDPA Device in Userspace
> vduse: grab the module's references until there is no vduse device
> vduse: Add memory shrinker to reclaim bounce pages
>
> drivers/vdpa/Kconfig | 8 +
> drivers/vdpa/Makefile | 1 +
> drivers/vdpa/vdpa_user/Makefile | 5 +
> drivers/vdpa/vdpa_user/eventfd.c | 221 ++++++
> drivers/vdpa/vdpa_user/eventfd.h | 48 ++
> drivers/vdpa/vdpa_user/iova_domain.c | 488 ++++++++++++
> drivers/vdpa/vdpa_user/iova_domain.h | 104 +++
> drivers/vdpa/vdpa_user/vduse.h | 66 ++
> drivers/vdpa/vdpa_user/vduse_dev.c | 1081 ++++++++++++++++++++++++++
> include/uapi/linux/vduse.h | 85 ++
> mm/memory.c | 1 +
> 11 files changed, 2108 insertions(+)
> create mode 100644 drivers/vdpa/vdpa_user/Makefile
> create mode 100644 drivers/vdpa/vdpa_user/eventfd.c
> create mode 100644 drivers/vdpa/vdpa_user/eventfd.h
> create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c
> create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h
> create mode 100644 drivers/vdpa/vdpa_user/vduse.h
> create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
> create mode 100644 include/uapi/linux/vduse.h
>
> --
> 2.25.1