From: Yongji Xie <xieyongji@bytedance.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Jason Wang" <jasowang@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>,
	"Parav Pandit" <parav@nvidia.com>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Christian Brauner" <christian.brauner@canonical.com>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Jens Axboe" <axboe@kernel.dk>,
	bcrl@kvack.org, "Jonathan Corbet" <corbet@lwn.net>,
	"Mika Penttilä" <mika.penttila@nextfour.com>,
	"Dan Carpenter" <dan.carpenter@oracle.com>,
	joro@8bytes.org, "Greg KH" <gregkh@linuxfoundation.org>,
	"He Zhe" <zhe.he@windriver.com>,
	"Liu Xiaodong" <xiaodong.liu@intel.com>,
	"Joe Perches" <joe@perches.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Will Deacon" <will@kernel.org>,
	"John Garry" <john.garry@huawei.com>,
	songmuchun@bytedance.com,
	virtualization <virtualization@lists.linux-foundation.org>,
	Netdev <netdev@vger.kernel.org>, kvm <kvm@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v12 00/13] Introduce VDUSE - vDPA Device in Userspace
Date: Tue, 11 Jan 2022 11:31:37 +0800
Message-ID: <CACycT3sbJC1Jn7NeWk_ccQ_2_YgKybjugfxmKpfgCP3Ayoju4w@mail.gmail.com>
In-Reply-To: <20220110103938-mutt-send-email-mst@kernel.org>

On Mon, Jan 10, 2022 at 11:44 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 10, 2022 at 11:24:40PM +0800, Yongji Xie wrote:
> > On Mon, Jan 10, 2022 at 11:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Jan 10, 2022 at 09:54:08PM +0800, Yongji Xie wrote:
> > > > On Mon, Jan 10, 2022 at 8:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Aug 30, 2021 at 10:17:24PM +0800, Xie Yongji wrote:
> > > > > > This series introduces a framework that makes it possible to implement
> > > > > > software-emulated vDPA devices in userspace. To make the device
> > > > > > emulation more secure, the emulated vDPA device's control path is handled
> > > > > > in the kernel and only the data path is implemented in userspace.
> > > > > >
> > > > > > Since the emulated vDPA device's control path is handled in the kernel,
> > > > > > a message mechanism is introduced to make userspace aware of data-path
> > > > > > related changes. Userspace can use read()/write() to receive and reply to
> > > > > > the control messages.
> > > > > >
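For illustration, the control loop this describes could look roughly like
the minimal C sketch below. The struct layouts and message types here are
simplified stand-ins for the series' actual uapi definitions, not exact
copies of them:

  #include <stdint.h>
  #include <unistd.h>

  /* Simplified stand-ins for the VDUSE control messages; the real
   * definitions live in the series' uapi header. */
  struct vduse_dev_request {
          uint32_t type;        /* e.g. vq state, status or IOTLB update */
          uint32_t request_id;
          uint32_t payload[32]; /* request-specific data */
  };

  struct vduse_dev_response {
          uint32_t request_id;  /* echoes the request's id */
          uint32_t result;      /* 0 on success */
          uint32_t reserved[32];
  };

  /* Receive control messages from the kernel with read() and
   * acknowledge them with write(), as described above. */
  static void control_loop(int dev_fd)
  {
          struct vduse_dev_request req;
          struct vduse_dev_response resp = { 0 };

          while (read(dev_fd, &req, sizeof(req)) == sizeof(req)) {
                  /* ... apply the data-path change described by
                   * req.type and req.payload ... */
                  resp.request_id = req.request_id;
                  resp.result = 0;
                  write(dev_fd, &resp, sizeof(resp));
          }
  }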
> > > > > > In the data path, the core task is mapping the DMA buffer into the VDUSE
> > > > > > daemon's address space, which can be implemented in different ways
> > > > > > depending on the vDPA bus to which the vDPA device is attached.
> > > > > >
> > > > > > In the virtio-vdpa case, we implement an MMU-based software IOTLB with a
> > > > > > bounce-buffering mechanism to achieve that. In the vhost-vdpa case, the DMA
> > > > > > buffer resides in a userspace memory region which can be shared with the
> > > > > > VDUSE userspace process by transferring the shmfd.
> > > > > >
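To make the bounce-buffering idea concrete, a map/unmap pair conceptually
does something like the sketch below (illustrative only; the names and
data structures are hypothetical, not the series' actual implementation):

  #include <stddef.h>
  #include <string.h>

  /* One IOTLB entry: DMA is redirected through a bounce page that is
   * also mapped, via the MMU, into the VDUSE daemon's address space,
   * so the daemon never touches the original buffer directly. */
  struct bounce_entry {
          void   *orig;    /* original DMA buffer (kernel side) */
          void   *bounce;  /* bounce page visible to the daemon */
          size_t  size;
  };

  /* On dma_map(): for a transfer towards the device, copy the payload
   * into the bounce page the daemon will read from. */
  static void bounce_map(struct bounce_entry *e, int to_device)
  {
          if (to_device)
                  memcpy(e->bounce, e->orig, e->size);
  }

  /* On dma_unmap(): for a transfer from the device, copy the daemon's
   * result back into the original buffer. */
  static void bounce_unmap(struct bounce_entry *e, int from_device)
  {
          if (from_device)
                  memcpy(e->orig, e->bounce, e->size);
  }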
> > > > > > The details and our use case are shown below:
> > > > > >
> > > > > > ------------------------    -------------------------   ----------------------------------------------
> > > > > > |            Container |    |              QEMU(VM) |   |                               VDUSE daemon |
> > > > > > |       ---------      |    |  -------------------  |   | ------------------------- ---------------- |
> > > > > > |       |dev/vdx|      |    |  |/dev/vhost-vdpa-x|  |   | | vDPA device emulation | | block driver | |
> > > > > > ------------+-----------     -----------+------------   -------------+----------------------+---------
> > > > > >             |                           |                            |                      |
> > > > > >             |                           |                            |                      |
> > > > > > ------------+---------------------------+----------------------------+----------------------+---------
> > > > > > |    | block device |           |  vhost device |            | vduse driver |          | TCP/IP |    |
> > > > > > |    -------+--------           --------+--------            -------+--------          -----+----    |
> > > > > > |           |                           |                           |                       |        |
> > > > > > | ----------+----------       ----------+-----------         -------+-------                |        |
> > > > > > | | virtio-blk driver |       |  vhost-vdpa driver |         | vdpa device |                |        |
> > > > > > | ----------+----------       ----------+-----------         -------+-------                |        |
> > > > > > |           |      virtio bus           |                           |                       |        |
> > > > > > |   --------+----+-----------           |                           |                       |        |
> > > > > > |                |                      |                           |                       |        |
> > > > > > |      ----------+----------            |                           |                       |        |
> > > > > > |      | virtio-blk device |            |                           |                       |        |
> > > > > > |      ----------+----------            |                           |                       |        |
> > > > > > |                |                      |                           |                       |        |
> > > > > > |     -----------+-----------           |                           |                       |        |
> > > > > > |     |  virtio-vdpa driver |           |                           |                       |        |
> > > > > > |     -----------+-----------           |                           |                       |        |
> > > > > > |                |                      |                           |    vdpa bus           |        |
> > > > > > |     -----------+----------------------+---------------------------+------------           |        |
> > > > > > |                                                                                        ---+---     |
> > > > > > -----------------------------------------------------------------------------------------| NIC |------
> > > > > >                                                                                          ---+---
> > > > > >                                                                                             |
> > > > > >                                                                                    ---------+---------
> > > > > >                                                                                    | Remote Storages |
> > > > > >                                                                                    -------------------
> > > > > >
> > > > > > We make use of it to implement a block device connecting to
> > > > > > our distributed storage, which can be used both in containers and
> > > > > > in VMs. Thus, we can have a unified technology stack in both cases.
> > > > > >
> > > > > > To test it with null-blk:
> > > > > >
> > > > > >   $ qemu-storage-daemon \
> > > > > >       --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
> > > > > >       --monitor chardev=charmonitor \
> > > > > >       --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> > > > > >       --export type=vduse-blk,id=test,node-name=disk0,writable=on,name=vduse-null,num-queues=16,queue-size=128
> > > > > >
> > > > > > The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
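Before a backend such as qemu-storage-daemon can export anything, a VDUSE
device has to be created through the control device. A minimal sketch of
that step is below, assuming the uapi in <linux/vduse.h> as it was merged
upstream (/dev/vduse/control, VDUSE_SET_API_VERSION, VDUSE_CREATE_DEV);
details may differ slightly from this v12 posting:

  #include <fcntl.h>
  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/vduse.h>

  static int create_vduse_dev(void)
  {
          uint64_t version = 0;
          struct vduse_dev_config *cfg;
          int ctrl = open("/dev/vduse/control", O_RDWR);

          if (ctrl < 0)
                  return -1;
          /* Negotiate the API version before creating devices. */
          if (ioctl(ctrl, VDUSE_SET_API_VERSION, &version))
                  return -1;

          cfg = calloc(1, sizeof(*cfg)); /* no device config space here */
          strncpy(cfg->name, "vduse-null", sizeof(cfg->name) - 1);
          cfg->device_id = 2; /* VIRTIO_ID_BLOCK */
          cfg->vq_num = 1;
          /* ... a real block device would also set features, vq_align
           * and a virtio-blk config space via config_size/config ... */
          if (ioctl(ctrl, VDUSE_CREATE_DEV, cfg))
                  return -1;
          free(cfg);

          /* The per-device node can now be opened and driven with the
           * read()/write() control loop sketched earlier. */
          return open("/dev/vduse/vduse-null", O_RDWR);
  }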
> > > > >
> > > > > It's been half a year - any plans to upstream this?
> > > >
> > > > Yeah, this is on my to-do list this month.
> > > >
> > > > Sorry for taking so long... I've been working on another project
> > > > enabling userspace RDMA with VDUSE for the past few months. So I
> > > > didn't have much time for this. Anyway, I will submit the first
> > > > version as soon as possible.
> > > >
> > > > Thanks,
> > > > Yongji
> > >
> > > Oh fun. You mean like virtio-rdma? Or RDMA as a backend for regular
> > > virtio?
> > >
> >
> > Yes, like virtio-rdma. Then we can develop something like userspace
> > rxe, siw, or a custom protocol with VDUSE.
> >
> > Thanks,
> > Yongji
>
> Would be interesting to see the spec for that.

Will send it ASAP.

> The issues with RDMA revolved around the fact that current
> apps tend to either use non-standard protocols for connection
> establishment or use UD where there's IIRC no standard
> at all. So QP numbers are hard to virtualize.
> Similarly, many use LIDs directly with the same effect.
> GUIDs might be virtualizable but no one went to the effort.
>

Actually, we aimed at emulating a soft RDMA device over a normal NIC
(not using its RDMA capability) rather than virtualizing a physical RDMA
NIC into several vRDMA devices. If so, I think we won't have those
issues, right?

> To say nothing about the interaction with memory overcommit.
>

I don't quite follow you here. Could you give me more details?

Thanks,
Yongji
