From: Yongji Xie <xieyongji@bytedance.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Jason Wang" <jasowang@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Stefano Garzarella" <sgarzare@redhat.com>,
"Parav Pandit" <parav@nvidia.com>,
"Christoph Hellwig" <hch@infradead.org>,
"Christian Brauner" <christian.brauner@canonical.com>,
"Randy Dunlap" <rdunlap@infradead.org>,
"Matthew Wilcox" <willy@infradead.org>,
"Al Viro" <viro@zeniv.linux.org.uk>,
"Jens Axboe" <axboe@kernel.dk>,
bcrl@kvack.org, "Jonathan Corbet" <corbet@lwn.net>,
"Mika Penttilä" <mika.penttila@nextfour.com>,
"Dan Carpenter" <dan.carpenter@oracle.com>,
joro@8bytes.org, "Greg KH" <gregkh@linuxfoundation.org>,
"He Zhe" <zhe.he@windriver.com>,
"Liu Xiaodong" <xiaodong.liu@intel.com>,
"Joe Perches" <joe@perches.com>,
"Robin Murphy" <robin.murphy@arm.com>,
"Will Deacon" <will@kernel.org>,
"John Garry" <john.garry@huawei.com>,
songmuchun@bytedance.com,
virtualization <virtualization@lists.linux-foundation.org>,
Netdev <netdev@vger.kernel.org>, kvm <kvm@vger.kernel.org>,
linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v12 00/13] Introduce VDUSE - vDPA Device in Userspace
Date: Tue, 11 Jan 2022 11:31:37 +0800
Message-ID: <CACycT3sbJC1Jn7NeWk_ccQ_2_YgKybjugfxmKpfgCP3Ayoju4w@mail.gmail.com>
In-Reply-To: <20220110103938-mutt-send-email-mst@kernel.org>
On Mon, Jan 10, 2022 at 11:44 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 10, 2022 at 11:24:40PM +0800, Yongji Xie wrote:
> > On Mon, Jan 10, 2022 at 11:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Jan 10, 2022 at 09:54:08PM +0800, Yongji Xie wrote:
> > > > On Mon, Jan 10, 2022 at 8:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Aug 30, 2021 at 10:17:24PM +0800, Xie Yongji wrote:
> > > > > > This series introduces a framework that makes it possible to implement
> > > > > > software-emulated vDPA devices in userspace. And to make the device
> > > > > > emulation more secure, the emulated vDPA device's control path is handled
> > > > > > in the kernel and only the data path is implemented in the userspace.
> > > > > >
> > > > > > > Since the emulated vDPA device's control path is handled in the kernel,
> > > > > > > a message mechanism is introduced to make userspace aware of data-path
> > > > > > > related changes. Userspace can use read()/write() to receive and reply to
> > > > > > > the control messages.
> > > > > >
> > > > > > > In the data path, the core task is mapping DMA buffers into the VDUSE
> > > > > > > daemon's address space, which can be implemented in different ways
> > > > > > > depending on the vdpa bus to which the vDPA device is attached.
> > > > > >
> > > > > > > In the virtio-vdpa case, we implement an MMU-based software IOTLB with a
> > > > > > > bounce-buffering mechanism to achieve that. In the vhost-vdpa case, the DMA
> > > > > > > buffer resides in a userspace memory region which can be shared with the
> > > > > > > VDUSE userspace process by transferring the shmfd.
> > > > > >
> > > > > > > The details and our use case are shown below:
> > > > > >
> > > > > > ------------------------ ------------------------- ----------------------------------------------
> > > > > > | Container | | QEMU(VM) | | VDUSE daemon |
> > > > > > | --------- | | ------------------- | | ------------------------- ---------------- |
> > > > > > | |dev/vdx| | | |/dev/vhost-vdpa-x| | | | vDPA device emulation | | block driver | |
> > > > > > ------------+----------- -----------+------------ -------------+----------------------+---------
> > > > > > | | | |
> > > > > > | | | |
> > > > > > ------------+---------------------------+----------------------------+----------------------+---------
> > > > > > | | block device | | vhost device | | vduse driver | | TCP/IP | |
> > > > > > | -------+-------- --------+-------- -------+-------- -----+---- |
> > > > > > | | | | | |
> > > > > > | ----------+---------- ----------+----------- -------+------- | |
> > > > > > | | virtio-blk driver | | vhost-vdpa driver | | vdpa device | | |
> > > > > > | ----------+---------- ----------+----------- -------+------- | |
> > > > > > | | virtio bus | | | |
> > > > > > | --------+----+----------- | | | |
> > > > > > | | | | | |
> > > > > > | ----------+---------- | | | |
> > > > > > | | virtio-blk device | | | | |
> > > > > > | ----------+---------- | | | |
> > > > > > | | | | | |
> > > > > > | -----------+----------- | | | |
> > > > > > | | virtio-vdpa driver | | | | |
> > > > > > | -----------+----------- | | | |
> > > > > > | | | | vdpa bus | |
> > > > > > | -----------+----------------------+---------------------------+------------ | |
> > > > > > | ---+--- |
> > > > > > -----------------------------------------------------------------------------------------| NIC |------
> > > > > > ---+---
> > > > > > |
> > > > > > ---------+---------
> > > > > > | Remote Storages |
> > > > > > -------------------
> > > > > >
> > > > > > > We make use of it to implement a block device connecting to
> > > > > > > our distributed storage, which can be used in both containers and
> > > > > > > VMs. Thus, we can have a unified technology stack in these two cases.
> > > > > >
> > > > > > To test it with null-blk:
> > > > > >
> > > > > > $ qemu-storage-daemon \
> > > > > > --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
> > > > > > --monitor chardev=charmonitor \
> > > > > > --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> > > > > > --export type=vduse-blk,id=test,node-name=disk0,writable=on,name=vduse-null,num-queues=16,queue-size=128
> > > > > >
> > > > > > The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
> > > > >
> > > > > It's been half a year - any plans to upstream this?
> > > >
> > > > Yeah, this is on my to-do list this month.
> > > >
> > > > Sorry for taking so long... I've been working on another project
> > > > enabling userspace RDMA with VDUSE for the past few months. So I
> > > > didn't have much time for this. Anyway, I will submit the first
> > > > version as soon as possible.
> > > >
> > > > Thanks,
> > > > Yongji
> > >
> > > Oh fun. You mean like virtio-rdma? Or RDMA as a backend for regular
> > > virtio?
> > >
> >
> > Yes, like virtio-rdma. Then we can develop something like userspace
> > rxe, siw, or a custom protocol with VDUSE.
> >
> > Thanks,
> > Yongji
>
> Would be interesting to see the spec for that.
Will send it ASAP.
> The issues with RDMA revolved around the fact that current
> apps tend to either use non-standard protocols for connection
> establishment or use UD where there's IIRC no standard
> at all. So QP numbers are hard to virtualize.
> Similarly many use LIDs directly with the same effect.
> GUIDs might be virtualizable but no one went to the effort.
>
Actually, we aim to emulate a soft RDMA device on top of a normal NIC
(without using any RDMA capability) rather than virtualizing a physical
RDMA NIC into several vRDMA devices. In that case, I think we won't have
those issues, right?
> To say nothing about the interaction with memory overcommit.
>
I don't get you here. Could you give me more details?
Thanks,
Yongji
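
For readers following the thread, here is a rough sketch of the
read()/write() control-message exchange that the quoted cover letter
describes. It is not the VDUSE API: the message layouts, request types,
and result codes below are hypothetical and heavily simplified, and two
pipes stand in for the character device so that the example runs without
a VDUSE-capable kernel. The real structures and constants are defined by
the kernel's VDUSE uAPI header.

```python
import os
import struct

# Hypothetical, simplified message layouts for illustration only; the real
# structures are defined by the VDUSE uAPI header and are larger.
REQ_FMT = "<III"   # type, request_id, payload (e.g. a status value)
RSP_FMT = "<II"    # request_id, result
REQ_SIZE = struct.calcsize(REQ_FMT)
RSP_SIZE = struct.calcsize(RSP_FMT)

# Hypothetical request types and result codes.
REQ_SET_STATUS = 1
REQ_UPDATE_IOTLB = 2
RESULT_OK = 0
RESULT_FAILED = 1


def handle_one_request(dev_rd, dev_wr, state):
    """Read one control message, update daemon state, write the reply."""
    raw = os.read(dev_rd, REQ_SIZE)
    if len(raw) != REQ_SIZE:
        raise RuntimeError("short read on control channel")
    req_type, req_id, payload = struct.unpack(REQ_FMT, raw)

    if req_type == REQ_SET_STATUS:
        state["status"] = payload
        result = RESULT_OK
    elif req_type == REQ_UPDATE_IOTLB:
        # A real daemon would drop cached mappings for the affected range.
        state["iotlb_generation"] += 1
        result = RESULT_OK
    else:
        result = RESULT_FAILED

    # The reply echoes the request id so the sender can match it.
    os.write(dev_wr, struct.pack(RSP_FMT, req_id, result))
    return req_type, result


def demo():
    # Two pipes stand in for the character device: one per direction.
    to_daemon_rd, to_daemon_wr = os.pipe()
    to_kernel_rd, to_kernel_wr = os.pipe()
    state = {"status": 0, "iotlb_generation": 0}
    replies = []
    requests = [(REQ_SET_STATUS, 0x0F), (REQ_UPDATE_IOTLB, 0), (99, 0)]
    try:
        for req_id, (req_type, payload) in enumerate(requests):
            # "Kernel" side: post a control message.
            os.write(to_daemon_wr,
                     struct.pack(REQ_FMT, req_type, req_id, payload))
            # Daemon side: receive it and reply.
            handle_one_request(to_daemon_rd, to_kernel_wr, state)
            # "Kernel" side: collect the reply.
            raw = os.read(to_kernel_rd, RSP_SIZE)
            replies.append(struct.unpack(RSP_FMT, raw))
    finally:
        for fd in (to_daemon_rd, to_daemon_wr, to_kernel_rd, to_kernel_wr):
            os.close(fd)
    return state, replies
```

Calling demo() pushes three requests through the loop, the last one with
an unknown type so that the failure path is exercised. In an actual
daemon the same loop would block on the device file descriptor rather
than on a pipe.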
Thread overview: 24+ messages
2021-08-30 14:17 [PATCH v12 00/13] Introduce VDUSE - vDPA Device in Userspace Xie Yongji
2021-08-30 14:17 ` [PATCH v12 01/13] iova: Export alloc_iova_fast() and free_iova_fast() Xie Yongji
2021-08-30 14:17 ` [PATCH v12 02/13] eventfd: Export eventfd_wake_count to modules Xie Yongji
2021-08-30 14:17 ` [PATCH v12 03/13] file: Export receive_fd() " Xie Yongji
2021-08-30 14:17 ` [PATCH v12 04/13] vdpa: Fix some coding style issues Xie Yongji
2021-08-30 14:17 ` [PATCH v12 05/13] vdpa: Add reset callback in vdpa_config_ops Xie Yongji
2021-09-06 5:54 ` Michael S. Tsirkin
2021-08-30 14:17 ` [PATCH v12 06/13] vhost-vdpa: Handle the failure of vdpa_reset() Xie Yongji
2021-08-30 14:17 ` [PATCH v12 07/13] vhost-iotlb: Add an opaque pointer for vhost IOTLB Xie Yongji
2021-08-30 14:17 ` [PATCH v12 08/13] vdpa: Add an opaque pointer for vdpa_config_ops.dma_map() Xie Yongji
2021-08-30 14:17 ` [PATCH v12 09/13] vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap() Xie Yongji
2021-08-30 14:17 ` [PATCH v12 10/13] vdpa: Support transferring virtual addressing during DMA mapping Xie Yongji
2021-08-30 14:17 ` [PATCH v12 11/13] vduse: Implement an MMU-based software IOTLB Xie Yongji
2021-08-30 14:17 ` [PATCH v12 12/13] vduse: Introduce VDUSE - vDPA Device in Userspace Xie Yongji
2021-08-30 14:17 ` [PATCH v12 13/13] Documentation: Add documentation for VDUSE Xie Yongji
2022-01-10 12:56 ` [PATCH v12 00/13] Introduce VDUSE - vDPA Device in Userspace Michael S. Tsirkin
2022-01-10 13:54 ` Yongji Xie
2022-01-10 15:09 ` Michael S. Tsirkin
2022-01-10 15:24 ` Yongji Xie
2022-01-10 15:44 ` Michael S. Tsirkin
2022-01-11 3:31 ` Yongji Xie [this message]
2022-01-11 11:54 ` Michael S. Tsirkin
2022-01-11 12:57 ` Yongji Xie
2022-01-11 13:04 ` Michael S. Tsirkin