linux-mm.kvack.org archive mirror
* [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
@ 2020-10-19 14:56 Xie Yongji
From: Xie Yongji @ 2020-10-19 14:56 UTC
  To: mst, jasowang, akpm; +Cc: linux-mm, virtualization

This series introduces a framework that can be used to implement
vDPA devices in a userspace program. The work consists of two
parts: control path emulation and data path offloading.

In the control path, the VDUSE driver uses a message mechanism
to forward actions (get/set features, get/set status, get/set
config space and set virtqueue states) from the virtio-vdpa
driver to userspace. Userspace can use read()/write() to
receive and reply to those control messages.
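
As a rough illustration of the read()/write() pattern described above,
here is a minimal sketch of a userspace handler. Everything in it is an
assumption made for illustration: the demo_msg layout, the DEMO_* message
types and the demo_* helpers are hypothetical and are not the definitions
from include/uapi/linux/vduse.h, and a socketpair stands in for the VDUSE
device file descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical message types; NOT the ones defined by the VDUSE uapi. */
enum demo_msg_type {
	DEMO_GET_FEATURES = 1,
	DEMO_SET_STATUS   = 2,
};

/* Hypothetical fixed-size control message shared by request and reply. */
struct demo_msg {
	uint32_t type;       /* one of enum demo_msg_type */
	uint32_t request_id; /* echoed back so the sender can match the reply */
	uint64_t payload;    /* feature bits or status, depending on type */
};
static_assert(sizeof(struct demo_msg) == 16, "fixed-size message expected");

/* Device state kept by the userspace program. */
struct demo_dev {
	uint64_t features;
	uint8_t status;
};

/*
 * Read one request from fd, apply it to the device state and write the
 * reply back on the same fd. Returns 0 on success, -1 on error.
 */
int demo_handle_one(int fd, struct demo_dev *dev)
{
	struct demo_msg msg;

	if (read(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		return -1;

	switch (msg.type) {
	case DEMO_GET_FEATURES:
		msg.payload = dev->features;
		break;
	case DEMO_SET_STATUS:
		dev->status = (uint8_t)msg.payload;
		msg.payload = 0;
		break;
	default:
		return -1; /* unknown request: no reply is sent */
	}

	return write(fd, &msg, sizeof(msg)) == (ssize_t)sizeof(msg) ? 0 : -1;
}

/*
 * Drive one GET_FEATURES round trip over a socketpair, which plays the
 * role of the device fd here. Returns the features seen by the
 * requesting side, or 0 on failure.
 */
uint64_t demo_roundtrip(uint64_t features)
{
	int sv[2];
	struct demo_dev dev = { .features = features, .status = 0 };
	struct demo_msg req = { .type = DEMO_GET_FEATURES, .request_id = 7 };
	struct demo_msg rsp;
	uint64_t result = 0;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return 0;

	if (write(sv[0], &req, sizeof(req)) == (ssize_t)sizeof(req) &&
	    demo_handle_one(sv[1], &dev) == 0 &&
	    read(sv[0], &rsp, sizeof(rsp)) == (ssize_t)sizeof(rsp) &&
	    rsp.request_id == req.request_id)
		result = rsp.payload;

	close(sv[0]);
	close(sv[1]);
	return result;
}
```

In a real program the handler would presumably run in a loop on the
device fd and cover the remaining actions (config space, virtqueue
state); the request_id echo is just one possible way to pair replies
with requests.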

In the data path, the VDUSE driver implements an MMU-based
on-chip IOMMU driver which supports both direct mapping and
indirect mapping with a bounce buffer. Userspace can then access
that IOVA space via mmap(). In addition, the eventfd mechanism is
used to trigger interrupts and forward virtqueue kicks.
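
The combination of a shared mapping and an eventfd notification can be
sketched as follows. This is only an illustration under stated
assumptions: a temporary file mapped twice with MAP_SHARED stands in for
the IOVA space that userspace would mmap() from the VDUSE device, and all
demo_* names are hypothetical. Only eventfd(), mmap() and the other libc
calls are real interfaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/mman.h>
#include <unistd.h>

#define DEMO_REGION_SIZE 4096

/* Signal the eventfd once, as the kicking side would. */
int demo_kick(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Wait for a kick; returns the number of coalesced kicks, 0 on error. */
uint64_t demo_wait_kick(int efd)
{
	uint64_t count = 0;

	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}

/*
 * Write text through one mapping, kick the eventfd, then read the data
 * back through a second mapping of the same backing file. The temporary
 * file is an assumption standing in for the mmap()ed device region.
 * Returns 0 on success, -1 on error.
 */
int demo_transfer(const char *text, char *out, size_t out_len)
{
	int ret = -1;
	int efd = -1;
	char *drv = MAP_FAILED;
	char *dev = MAP_FAILED;
	FILE *backing = tmpfile();

	if (!backing)
		return -1;
	if (strlen(text) >= DEMO_REGION_SIZE || out_len == 0)
		goto out;
	if (ftruncate(fileno(backing), DEMO_REGION_SIZE) < 0)
		goto out;

	/* Two views of the same pages: "driver" side and "device" side. */
	drv = mmap(NULL, DEMO_REGION_SIZE, PROT_READ | PROT_WRITE,
		   MAP_SHARED, fileno(backing), 0);
	dev = mmap(NULL, DEMO_REGION_SIZE, PROT_READ | PROT_WRITE,
		   MAP_SHARED, fileno(backing), 0);
	efd = eventfd(0, 0);
	if (drv == MAP_FAILED || dev == MAP_FAILED || efd < 0)
		goto out;

	strcpy(drv, text);                    /* producer fills the buffer */
	if (demo_kick(efd) < 0 || demo_wait_kick(efd) != 1)
		goto out;                     /* consumer saw exactly one kick */
	snprintf(out, out_len, "%s", dev);    /* consumer reads shared data */
	ret = 0;
out:
	if (efd >= 0)
		close(efd);
	if (dev != MAP_FAILED)
		munmap(dev, DEMO_REGION_SIZE);
	if (drv != MAP_FAILED)
		munmap(drv, DEMO_REGION_SIZE);
	fclose(backing);
	return ret;
}
```

Because an eventfd is a counter, several kicks written before the reader
wakes up are reported as a single read() with a larger count, so a
consumer should drain all pending work after each wakeup.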

The details and our use case are shown below:

------------------------     -----------------------------------------------------------
|                  APP |     |                          QEMU                           |
|       ---------      |     | --------------------    -------------------+<-->+------ |
|       |dev/vdx|      |     | | device emulation |    | virtio dataplane |    | BDS | |
------------+-----------     -----------+-----------------------+-----------------+-----
            |                           |                       |                 |
            |                           | emulating             | offloading      |
------------+---------------------------+-----------------------+-----------------+------
|    | block device |           |  vduse driver |        |  vdpa device |    | TCP/IP | |
|    -------+--------           --------+--------        +------+-------     -----+---- |
|           |                           |                |      |                 |     |
|           |                           |                |      |                 |     |
| ----------+----------       ----------+-----------     |      |                 |     |
| | virtio-blk driver |       | virtio-vdpa driver |     |      |                 |     |
| ----------+----------       ----------+-----------     |      |                 |     |
|           |                           |                |      |                 |     |
|           |                           ------------------      |                 |     |
|           -----------------------------------------------------              ---+---  |
------------------------------------------------------------------------------ | NIC |---
                                                                               ---+---
                                                                                  |
                                                                         ---------+---------
                                                                         | Remote Storages |
                                                                         -------------------

We use it to implement a block device connected to our
distributed storage, which can be used in containers and on
bare metal. Compared with the qemu-nbd solution, this solution has
higher performance, and it gives us a unified technology stack
for remote storage in both VMs and containers.

To test it with a host disk (e.g. /dev/sdx):

  $ qemu-storage-daemon \
      --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
      --monitor chardev=charmonitor \
      --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/sdx,node-name=disk0 \
      --export vduse-blk,id=test,node-name=disk0,writable=on,vduse-id=1,num-queues=16,queue-size=128

The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse

Future work:
  - Improve performance (e.g. zero copy implementation in datapath)
  - Config interrupt support
  - Userspace library (find a way to reuse device emulation code in qemu/rust-vmm)

Xie Yongji (4):
  mm: export zap_page_range() for driver use
  vduse: Introduce VDUSE - vDPA Device in Userspace
  vduse: grab the module's references until there is no vduse device
  vduse: Add memory shrinker to reclaim bounce pages

 drivers/vdpa/Kconfig                 |    8 +
 drivers/vdpa/Makefile                |    1 +
 drivers/vdpa/vdpa_user/Makefile      |    5 +
 drivers/vdpa/vdpa_user/eventfd.c     |  221 ++++++
 drivers/vdpa/vdpa_user/eventfd.h     |   48 ++
 drivers/vdpa/vdpa_user/iova_domain.c |  488 ++++++++++++
 drivers/vdpa/vdpa_user/iova_domain.h |  104 +++
 drivers/vdpa/vdpa_user/vduse.h       |   66 ++
 drivers/vdpa/vdpa_user/vduse_dev.c   | 1081 ++++++++++++++++++++++++++
 include/uapi/linux/vduse.h           |   85 ++
 mm/memory.c                          |    1 +
 11 files changed, 2108 insertions(+)
 create mode 100644 drivers/vdpa/vdpa_user/Makefile
 create mode 100644 drivers/vdpa/vdpa_user/eventfd.c
 create mode 100644 drivers/vdpa/vdpa_user/eventfd.h
 create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c
 create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h
 create mode 100644 drivers/vdpa/vdpa_user/vduse.h
 create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
 create mode 100644 include/uapi/linux/vduse.h

-- 
2.25.1



Thread overview: 28+ messages
2020-10-19 14:56 [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace Xie Yongji
2020-10-19 14:56 ` [RFC 1/4] mm: export zap_page_range() for driver use Xie Yongji
2020-10-19 15:14   ` Matthew Wilcox
2020-10-19 15:36     ` [External] " 谢永吉
2020-10-19 14:56 ` [RFC 2/4] vduse: Introduce VDUSE - vDPA Device in Userspace Xie Yongji
2020-10-19 15:08   ` Michael S. Tsirkin
2020-10-19 15:24     ` Randy Dunlap
2020-10-19 15:46       ` [External] " 谢永吉
2020-10-19 15:48     ` 谢永吉
2020-10-19 14:56 ` [RFC 3/4] vduse: grab the module's references until there is no vduse device Xie Yongji
2020-10-19 15:05   ` Michael S. Tsirkin
2020-10-19 15:44     ` [External] " 谢永吉
2020-10-19 15:47       ` Michael S. Tsirkin
2020-10-19 15:56         ` 谢永吉
2020-10-19 16:41           ` Michael S. Tsirkin
2020-10-20  7:42             ` Yongji Xie
2020-10-19 14:56 ` [RFC 4/4] vduse: Add memory shrinker to reclaim bounce pages Xie Yongji
2020-10-19 17:16 ` [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace Michael S. Tsirkin
2020-10-20  2:18   ` [External] " 谢永吉
2020-10-20  2:20     ` Jason Wang
2020-10-20  2:28       ` 谢永吉
2020-10-20  3:20 ` Jason Wang
2020-10-20  7:39   ` [External] " Yongji Xie
2020-10-20  8:01     ` Jason Wang
2020-10-20  8:35       ` Yongji Xie
2020-10-20  9:12         ` Jason Wang
2020-10-23  2:55           ` Yongji Xie
2020-10-23  8:44             ` Jason Wang
