virtualization.lists.linux-foundation.org archive mirror
  • * Re: [RFC v2 00/13] Introduce VDUSE - vDPA Device in Userspace
           [not found] <20201222145221.711-1-xieyongji@bytedance.com>
           [not found] ` <20201222145221.711-2-xieyongji@bytedance.com>
    @ 2020-12-23  6:38 ` Jason Wang
      2020-12-23  8:14   ` Jason Wang
           [not found]   ` <CACycT3tr-1EDeH4j0AojD+-qM5yqDUU0tQHieVUXgL7AOTHyvQ@mail.gmail.com>
           [not found] ` <20201222145221.711-7-xieyongji@bytedance.com>
                       ` (2 subsequent siblings)
      4 siblings, 2 replies; 22+ messages in thread
    From: Jason Wang @ 2020-12-23  6:38 UTC
      To: Xie Yongji, mst, stefanha, sgarzare, parav, akpm, rdunlap, willy,
    	viro, axboe, bcrl, corbet
      Cc: linux-aio, kvm, netdev, virtualization, linux-mm, linux-fsdevel
    
    
    On 2020/12/22 10:52 PM, Xie Yongji wrote:
    > This series introduces a framework, which can be used to implement
    > vDPA devices in a userspace program. The work consists of two parts:
    > control path forwarding and data path offloading.
    >
    > In the control path, the VDUSE driver uses a message
    > mechanism to forward config operations from the vdpa bus driver
    > to userspace. Userspace can use read()/write() to receive and reply to
    > those control messages.
    >
    > In the data path, the core idea is to map the DMA buffer into the VDUSE
    > daemon's address space, which can be implemented in different ways
    > depending on the vdpa bus to which the vDPA device is attached.
    >
    > In the virtio-vdpa case, we implement an MMU-based on-chip IOMMU driver with
    > a bounce-buffering mechanism to achieve that.
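
For readers following along, the read()/write() control path in the cover letter could look roughly like the sketch below. This is only an illustration: `struct demo_msg`, its fields, the `DEMO_*` message types and `demo_handle_one()` are all made up here. The series defines its real message format in its uapi header (include/uapi/linux/vduse.h in the diffstat), which is not reproduced in this mail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical message types and layout, for illustration only. */
enum demo_msg_type { DEMO_GET_STATUS = 1, DEMO_SET_STATUS = 2 };

struct demo_msg {
    uint32_t type;       /* one of enum demo_msg_type */
    uint32_t request_id; /* echoed back so the sender can match the reply */
    uint32_t value;      /* payload: status to set, or status returned */
    int32_t  result;     /* 0 on success, negative on failure */
};

/*
 * Handle one control request on fd: read() the request, apply it to the
 * emulated device state, then write() the reply on the same fd.
 * Returns 0 on success, -1 on a short read or write.
 */
static int demo_handle_one(int fd, uint32_t *status)
{
    struct demo_msg msg;

    if (read(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return -1;

    msg.result = 0;
    switch (msg.type) {
    case DEMO_GET_STATUS:
        msg.value = *status;
        break;
    case DEMO_SET_STATUS:
        *status = msg.value;
        break;
    default:
        msg.result = -1; /* unknown request: reply with an error */
        break;
    }

    if (write(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return -1;
    return 0;
}
```

A daemon would call something like `demo_handle_one()` in a loop on the device fd. For a quick experiment, one end of a `socketpair()` can stand in for that fd.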
    
    
    Rethinking the bounce buffer approach: instead of using kernel pages
    with mmap(), how about just using userspace pages, as vhost does?
    
    That would mean we need a worker to do the bouncing, but we would no
    longer need to care about annoying things like page reclaim?
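
To be concrete about what "bouncing" means here, below is a minimal userspace sketch of the general technique. It is not the code from this series and not a proposal for the worker itself. All names (`struct bounce_slot`, `bounce_map()`, `bounce_unmap()`, the `BOUNCE_*` constants) are invented for illustration, and a fixed-size array stands in for a page that would be shared with the daemon.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 4096

/* Direction of the transfer, seen from the driver's side. */
enum bounce_dir { BOUNCE_TO_DEVICE, BOUNCE_FROM_DEVICE, BOUNCE_BIDIRECTIONAL };

struct bounce_slot {
    unsigned char buf[BOUNCE_SIZE]; /* stands in for a page the daemon can see */
    void *orig;                     /* the driver's original buffer */
    size_t len;
    enum bounce_dir dir;
};

/*
 * Map: remember the original buffer and, if the device is going to read
 * the data, copy it into the bounce buffer first. Returns the address
 * the device side should use, or NULL if the request does not fit.
 */
static void *bounce_map(struct bounce_slot *s, void *orig, size_t len,
                        enum bounce_dir dir)
{
    if (len > BOUNCE_SIZE)
        return NULL;
    s->orig = orig;
    s->len = len;
    s->dir = dir;
    if (dir == BOUNCE_TO_DEVICE || dir == BOUNCE_BIDIRECTIONAL)
        memcpy(s->buf, orig, len);
    return s->buf;
}

/*
 * Unmap: if the device may have written data, copy it back from the
 * bounce buffer into the original buffer.
 */
static void bounce_unmap(struct bounce_slot *s)
{
    if (s->dir == BOUNCE_FROM_DEVICE || s->dir == BOUNCE_BIDIRECTIONAL)
        memcpy(s->orig, s->buf, s->len);
    s->orig = NULL;
    s->len = 0;
}
```

The two memcpy() calls are the work a bouncing worker would end up doing. The open question is only whose pages back `buf`.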
    
    
    > And in the vhost-vdpa case, the DMA
    > buffer resides in a userspace memory region, which can be shared with the
    > VDUSE userspace process by transferring the shmfd.
    >
    > The details and our use case are shown below:
    >
    > ------------------------    -------------------------   ----------------------------------------------
    > |            Container |    |              QEMU(VM) |   |                               VDUSE daemon |
    > |       ---------      |    |  -------------------  |   | ------------------------- ---------------- |
    > |       |dev/vdx|      |    |  |/dev/vhost-vdpa-x|  |   | | vDPA device emulation | | block driver | |
    > ------------+-----------     -----------+------------   -------------+----------------------+---------
    >              |                           |                            |                      |
    >              |                           |                            |                      |
    > ------------+---------------------------+----------------------------+----------------------+---------
    > |    | block device |           |  vhost device |            | vduse driver |          | TCP/IP |    |
    > |    -------+--------           --------+--------            -------+--------          -----+----    |
    > |           |                           |                           |                       |        |
    > | ----------+----------       ----------+-----------         -------+-------                |        |
    > | | virtio-blk driver |       |  vhost-vdpa driver |         | vdpa device |                |        |
    > | ----------+----------       ----------+-----------         -------+-------                |        |
    > |           |      virtio bus           |                           |                       |        |
    > |   --------+----+-----------           |                           |                       |        |
    > |                |                      |                           |                       |        |
    > |      ----------+----------            |                           |                       |        |
    > |      | virtio-blk device |            |                           |                       |        |
    > |      ----------+----------            |                           |                       |        |
    > |                |                      |                           |                       |        |
    > |     -----------+-----------           |                           |                       |        |
    > |     |  virtio-vdpa driver |           |                           |                       |        |
    > |     -----------+-----------           |                           |                       |        |
    > |                |                      |                           |    vdpa bus           |        |
    > |     -----------+----------------------+---------------------------+------------           |        |
    > |                                                                                        ---+---     |
    > -----------------------------------------------------------------------------------------| NIC |------
    >                                                                                           ---+---
    >                                                                                              |
    >                                                                                     ---------+---------
    >                                                                                     | Remote Storages |
    >                                                                                     -------------------
    >
    > We use it to implement a block device connected to
    > our distributed storage, which can be used in both containers and
    > VMs. Thus, we can have a unified technology stack in these two cases.
    >
    > To test it with null-blk:
    >
    >    $ qemu-storage-daemon \
    >        --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
    >        --monitor chardev=charmonitor \
    >        --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
    >        --export vduse-blk,id=test,node-name=disk0,writable=on,vduse-id=1,num-queues=16,queue-size=128
    >
    > The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
    >
    > Future work:
    >    - Improve performance (e.g. zero copy implementation in datapath)
    >    - Config interrupt support
    >    - Userspace library (find a way to reuse device emulation code in qemu/rust-vmm)
    >
    > This is now based on the series below:
    > https://lore.kernel.org/netdev/20201112064005.349268-1-parav@nvidia.com/
    >
    > V1 to V2:
    > - Add vhost-vdpa support
    
    
    I may be missing something, but I don't see any code to support that.
    E.g., neither set_map nor dma_map/unmap is implemented in the config ops.
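
As a rough illustration of why those hooks matter: as far as I understand it, vhost-vdpa picks its mapping path by checking which of them the device provides, and falls back to the platform IOMMU when none is set. The sketch below is a userspace mock of that check only. `struct mock_config_ops`, `pick_map_path()` and the stub functions are made up for this example and use simplified signatures. This is not the kernel's struct vdpa_config_ops.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of the three mapping hooks, with simplified signatures. */
struct mock_config_ops {
    int (*set_map)(void *dev, void *iotlb);
    int (*dma_map)(void *dev, unsigned long iova, unsigned long size,
                   unsigned long pa, unsigned int perm);
    int (*dma_unmap)(void *dev, unsigned long iova, unsigned long size);
};

enum map_path {
    MAP_VIA_DMA_MAP,        /* device handles each mapping itself */
    MAP_VIA_SET_MAP,        /* device is handed the whole translation table */
    MAP_VIA_PLATFORM_IOMMU  /* neither hook set: fall back to the platform IOMMU */
};

/* Decide which path a mapping request would take for a given device. */
static enum map_path pick_map_path(const struct mock_config_ops *ops)
{
    if (ops->dma_map)
        return MAP_VIA_DMA_MAP;
    if (ops->set_map)
        return MAP_VIA_SET_MAP;
    return MAP_VIA_PLATFORM_IOMMU;
}

/* Stub hooks so the selection logic can be exercised. */
static int stub_set_map(void *dev, void *iotlb)
{
    (void)dev; (void)iotlb;
    return 0;
}

static int stub_dma_map(void *dev, unsigned long iova, unsigned long size,
                        unsigned long pa, unsigned int perm)
{
    (void)dev; (void)iova; (void)size; (void)pa; (void)perm;
    return 0;
}

static int stub_dma_unmap(void *dev, unsigned long iova, unsigned long size)
{
    (void)dev; (void)iova; (void)size;
    return 0;
}
```

With a device that sets none of the hooks, a mapping request takes the last branch, which is why I would expect one of them to be implemented for a userspace-backed device.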
    
    Thanks
    
    
    > - Add some documents
    > - Based on the vdpa management tool
    > - Introduce a workqueue for irq injection
    > - Replace interval tree with array map to store the iova_map
    >
    > Xie Yongji (13):
    >    mm: export zap_page_range() for driver use
    >    eventfd: track eventfd_signal() recursion depth separately in different cases
    >    eventfd: Increase the recursion depth of eventfd_signal()
    >    vdpa: Remove the restriction that only supports virtio-net devices
    >    vdpa: Pass the netlink attributes to ops.dev_add()
    >    vduse: Introduce VDUSE - vDPA Device in Userspace
    >    vduse: support get/set virtqueue state
    >    vdpa: Introduce process_iotlb_msg() in vdpa_config_ops
    >    vduse: Add support for processing vhost iotlb message
    >    vduse: grab the module's references until there is no vduse device
    >    vduse/iova_domain: Support reclaiming bounce pages
    >    vduse: Add memory shrinker to reclaim bounce pages
    >    vduse: Introduce a workqueue for irq injection
    >
    >   Documentation/driver-api/vduse.rst                 |   91 ++
    >   Documentation/userspace-api/ioctl/ioctl-number.rst |    1 +
    >   drivers/vdpa/Kconfig                               |    8 +
    >   drivers/vdpa/Makefile                              |    1 +
    >   drivers/vdpa/vdpa.c                                |    2 +-
    >   drivers/vdpa/vdpa_sim/vdpa_sim.c                   |    3 +-
    >   drivers/vdpa/vdpa_user/Makefile                    |    5 +
    >   drivers/vdpa/vdpa_user/eventfd.c                   |  229 ++++
    >   drivers/vdpa/vdpa_user/eventfd.h                   |   48 +
    >   drivers/vdpa/vdpa_user/iova_domain.c               |  517 ++++++++
    >   drivers/vdpa/vdpa_user/iova_domain.h               |  103 ++
    >   drivers/vdpa/vdpa_user/vduse.h                     |   59 +
    >   drivers/vdpa/vdpa_user/vduse_dev.c                 | 1373 ++++++++++++++++++++
    >   drivers/vhost/vdpa.c                               |   34 +-
    >   fs/aio.c                                           |    3 +-
    >   fs/eventfd.c                                       |   20 +-
    >   include/linux/eventfd.h                            |    5 +-
    >   include/linux/vdpa.h                               |   11 +-
    >   include/uapi/linux/vdpa.h                          |    1 +
    >   include/uapi/linux/vduse.h                         |  119 ++
    >   mm/memory.c                                        |    1 +
    >   21 files changed, 2598 insertions(+), 36 deletions(-)
    >   create mode 100644 Documentation/driver-api/vduse.rst
    >   create mode 100644 drivers/vdpa/vdpa_user/Makefile
    >   create mode 100644 drivers/vdpa/vdpa_user/eventfd.c
    >   create mode 100644 drivers/vdpa/vdpa_user/eventfd.h
    >   create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c
    >   create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h
    >   create mode 100644 drivers/vdpa/vdpa_user/vduse.h
    >   create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
    >   create mode 100644 include/uapi/linux/vduse.h
    >
    
  • [parent not found: <20201222145221.711-7-xieyongji@bytedance.com>]
  • [parent not found: <20201222145221.711-9-xieyongji@bytedance.com>]
  • [parent not found: <20201222145221.711-10-xieyongji@bytedance.com>]

  • end of thread, other threads:[~2021-01-08 13:33 UTC | newest]
    
    Thread overview: 22+ messages
         [not found] <20201222145221.711-1-xieyongji@bytedance.com>
         [not found] ` <20201222145221.711-2-xieyongji@bytedance.com>
    2020-12-22 15:44   ` [RFC v2 01/13] mm: export zap_page_range() for driver use Christoph Hellwig
    2020-12-23  6:38 ` [RFC v2 00/13] Introduce VDUSE - vDPA Device in Userspace Jason Wang
    2020-12-23  8:14   ` Jason Wang
         [not found]   ` <CACycT3tr-1EDeH4j0AojD+-qM5yqDUU0tQHieVUXgL7AOTHyvQ@mail.gmail.com>
    2020-12-24  2:24     ` Jason Wang
         [not found] ` <20201222145221.711-7-xieyongji@bytedance.com>
    2020-12-23  8:08   ` [RFC v2 06/13] vduse: " Jason Wang
         [not found]     ` <CACycT3vYb_CdWz3wZ1OY=KynG=1qZgaa_Ngko2AO0JHn_fFXEA@mail.gmail.com>
    2020-12-24  3:01       ` Jason Wang
         [not found]         ` <CACycT3uKb1P7zXyCBYWDb6VhGXV0cdJPH3CPcRzjwz57tyODgA@mail.gmail.com>
    2020-12-25  6:59           ` Jason Wang
    2021-01-08 13:32   ` Bob Liu
         [not found] ` <20201222145221.711-9-xieyongji@bytedance.com>
    2020-12-23  8:36   ` [RFC v2 08/13] vdpa: Introduce process_iotlb_msg() in vdpa_config_ops Jason Wang
         [not found]     ` <CACycT3tP8mgj043idjJW3BF12qmOhmHzYz8X5FyL8t5MbwLysw@mail.gmail.com>
    2020-12-24  2:36       ` Jason Wang
         [not found] ` <20201222145221.711-10-xieyongji@bytedance.com>
    2020-12-23  9:05   ` [RFC v2 09/13] vduse: Add support for processing vhost iotlb message Jason Wang
         [not found]     ` <CACycT3vVU9vg6R6UujSnSdk8cwxWPVgeJJs0JaBH_Zg4xC-epQ@mail.gmail.com>
    2020-12-24  2:41       ` [External] " Jason Wang
         [not found]         ` <CACycT3s=m=PQb5WFoMGhz8TNGme4+=rmbbBTtrugF9ZmNnWxEw@mail.gmail.com>
    2020-12-25  6:57           ` Jason Wang
         [not found]             ` <CACycT3uwXBYvRbKDWdN3oCekv+o6_Lc=-KTrxejD=fr-zgibGw@mail.gmail.com>
    2020-12-28  7:43               ` Jason Wang
         [not found]                 ` <CACycT3uDV43ecScrMh1QVpStuwDETHykJzzY=pkmZjP2Dd2kvg@mail.gmail.com>
    2020-12-28  8:43                   ` Jason Wang
         [not found]                     ` <CACycT3soQoX5avZiFBLEGBuJpdni6-UxdhAPGpWHBWVf+dEySg@mail.gmail.com>
    2020-12-29  9:11                       ` Jason Wang
         [not found]                         ` <CACycT3sg61yRdupnD+jQEkWKsVEvMWfhkJ=5z_bYZLxCibDiHw@mail.gmail.com>
    2020-12-30  6:10                           ` Jason Wang
         [not found]                             ` <CACycT3vZ7V5WWhCFLBK6FuvVNmPmMj_yc=COOB4cjjC13yHUwg@mail.gmail.com>
    2020-12-30  8:41                               ` Jason Wang
         [not found]                                 ` <CACycT3tD3zyvV6Zy5NT4x=02hBgrRGq35xeTsRXXx-_wPGJXpQ@mail.gmail.com>
    2020-12-31  2:49                                   ` Jason Wang
         [not found]                                     ` <CACycT3vwMU5R7N8dZFBYX4-bxe2YT7EfK_M_jEkH8wzfH_GkBw@mail.gmail.com>
    2020-12-31  5:49                                       ` Jason Wang
         [not found]                                         ` <CACycT3tc2P63k6J9ZkWTpPvHk_H8zUq0_Q6WOqYX_dSigUAnzA@mail.gmail.com>
    2020-12-31  7:11                                           ` Jason Wang
         [not found]           ` <CACycT3tLG=13fDdY0YPzViK2-AUy5F+uJor2cmVDFOGjXTOaYA@mail.gmail.com>
    2020-12-25  7:02             ` Jason Wang
    
