Date: Wed, 14 Apr 2021 03:34:16 -0400
From: "Michael S. Tsirkin"
To: Xie Yongji
Cc: axboe@kernel.dk, kvm@vger.kernel.org, corbet@lwn.net,
 netdev@vger.kernel.org, rdunlap@infradead.org, willy@infradead.org,
 virtualization@lists.linux-foundation.org, hch@infradead.org,
 christian.brauner@canonical.com, bcrl@kvack.org, viro@zeniv.linux.org.uk,
 stefanha@redhat.com, linux-fsdevel@vger.kernel.org,
 dan.carpenter@oracle.com, mika.penttila@nextfour.com
Subject: Re: [PATCH v6 00/10] Introduce VDUSE - vDPA Device in Userspace
Message-ID: <20210414032909-mutt-send-email-mst@kernel.org>
References: <20210331080519.172-1-xieyongji@bytedance.com>
In-Reply-To: <20210331080519.172-1-xieyongji@bytedance.com>

On Wed, Mar 31, 2021 at 04:05:09PM +0800, Xie Yongji wrote:
> This series introduces a framework, which can be used to implement
> vDPA Devices in a userspace program. The work consists of two parts:
> control path forwarding and data path offloading.
>
> In the control path, the VDUSE driver will make use of a message
> mechanism to forward the config operations from the vdpa bus driver
> to userspace. Userspace can use read()/write() to receive and reply to
> those control messages.
>
> In the data path, the core is mapping the DMA buffer into the VDUSE
> daemon's address space, which can be implemented in different ways
> depending on the vdpa bus to which the vDPA device is attached.
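
The control path quoted above (requests forwarded to userspace, which receives them with read() and replies with write()) can be sketched roughly as follows. This is only an illustration under stated assumptions: struct ctrl_msg, its fields and the CTRL_* request codes are hypothetical and are not the uapi defined in include/uapi/linux/vduse.h, and a socketpair stands in for the VDUSE device fd so that the sketch is self-contained.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical message layout, for illustration only; the real
 * definitions live in include/uapi/linux/vduse.h in the series. */
enum { CTRL_GET_STATUS = 1, CTRL_SET_STATUS = 2 };

struct ctrl_msg {
    uint32_t type;       /* request code */
    uint32_t request_id; /* echoed back so the sender can match the reply */
    uint32_t result;     /* 0 on success, filled in by the daemon */
    uint32_t payload;    /* request or response data */
};

/* Daemon side: read one request from fd, act on it, write the reply.
 * Returns 0 on success, -1 on I/O error or short transfer. */
int handle_one_request(int fd, uint32_t *dev_status)
{
    struct ctrl_msg msg;

    if (read(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return -1;

    switch (msg.type) {
    case CTRL_GET_STATUS:
        msg.payload = *dev_status;
        msg.result = 0;
        break;
    case CTRL_SET_STATUS:
        *dev_status = msg.payload;
        msg.result = 0;
        break;
    default:
        msg.result = 1; /* unknown request */
        break;
    }

    if (write(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return -1;
    return 0;
}

/* Round trip over a socketpair standing in for the device fd:
 * sv[0] plays the kernel side, sv[1] the userspace daemon. */
int demo_roundtrip(void)
{
    int sv[2], ret = -1;
    uint32_t dev_status = 0;
    struct ctrl_msg req = {
        .type = CTRL_SET_STATUS, .request_id = 7, .payload = 0xf,
    };
    struct ctrl_msg rsp;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], &req, sizeof(req)) == (ssize_t)sizeof(req) &&
        handle_one_request(sv[1], &dev_status) == 0 &&
        read(sv[0], &rsp, sizeof(rsp)) == (ssize_t)sizeof(rsp) &&
        rsp.request_id == 7 && rsp.result == 0 && dev_status == 0xf)
        ret = 0;

    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

A real daemon would presumably call something like handle_one_request() in a loop, typically driven by poll() or epoll on the device fd.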
>
> In the virtio-vdpa case, we implement an MMU-based on-chip IOMMU driver with
> a bounce-buffering mechanism to achieve that. In the vhost-vdpa case, the DMA
> buffer resides in a userspace memory region which can be shared with the
> VDUSE userspace process by transferring the shmfd.
>
> The details and our use case are shown below:
>
> ------------------------    -------------------------   ----------------------------------------------
> |            Container |    |              QEMU(VM) |   |                               VDUSE daemon |
> |       ---------      |    |  -------------------  |   | ------------------------- ---------------- |
> |       |dev/vdx|      |    |  |/dev/vhost-vdpa-x|  |   | | vDPA device emulation | | block driver | |
> ------------+-----------     -----------+------------   -------------+----------------------+---------
>             |                           |                            |                      |
>             |                           |                            |                      |
> ------------+---------------------------+----------------------------+----------------------+---------
> |    | block device |           |  vhost device |            | vduse driver |          | TCP/IP |    |
> |    -------+--------           --------+--------            -------+--------          -----+----    |
> |           |                           |                           |                       |        |
> | ----------+----------       ----------+-----------         -------+-------                |        |
> | | virtio-blk driver |       |  vhost-vdpa driver |         | vdpa device |                |        |
> | ----------+----------       ----------+-----------         -------+-------                |        |
> |           |      virtio bus           |                           |                       |        |
> |   --------+----+-----------           |                           |                       |        |
> |                |                      |                           |                       |        |
> |      ----------+----------            |                           |                       |        |
> |      | virtio-blk device |            |                           |                       |        |
> |      ----------+----------            |                           |                       |        |
> |                |                      |                           |                       |        |
> |     -----------+-----------           |                           |                       |        |
> |     |  virtio-vdpa driver |           |                           |                       |        |
> |     -----------+-----------           |                           |                       |        |
> |                |                      |                           |    vdpa bus           |        |
> |     -----------+----------------------+---------------------------+------------           |        |
> |                                                                                        ---+---     |
> -----------------------------------------------------------------------------------------| NIC |------
>                                                                                          ---+---
>                                                                                             |
>                                                                                    ---------+---------
>                                                                                    | Remote Storages |
>                                                                                    -------------------

This all looks quite similar to vhost-user-block, except that
vhost-user-block does not need any kernel support at all.
So I am still scratching my head about its advantages over
vhost-user-block.

> We make use of it to implement a block device connecting to
> our distributed storage, which can be used both in containers and
> VMs. Thus, we can have a unified technology stack in these two cases.

Maybe the container part is the answer. How does that stack look?

> To test it with null-blk:
>
>   $ qemu-storage-daemon \
>       --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
>       --monitor chardev=charmonitor \
>       --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
>       --export type=vduse-blk,id=test,node-name=disk0,writable=on,name=vduse-null,num-queues=16,queue-size=128
>
> The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
>
> Future work:
>   - Improve performance
>   - Userspace library (find a way to reuse device emulation code in qemu/rust-vmm)
>
> V5 to V6:
> - Export receive_fd() instead of __receive_fd()
> - Factor out the unmapping logic of pa and va separately
> - Remove the logic of bounce page allocation in page fault handler
> - Use PAGE_SIZE as IOVA allocation granule
> - Add EPOLLOUT support
> - Enable setting API version in userspace
> - Fix some bugs
>
> V4 to V5:
> - Remove the patch for irq binding
> - Use a single IOTLB for all types of mapping
> - Factor out vhost_vdpa_pa_map()
> - Add some sample code to the documentation
> - Use receive_fd_user() to pass file descriptor
> - Fix some bugs
>
> V3 to V4:
> - Rebase to vhost.git
> - Split some patches
> - Add some documentation
> - Use ioctl to inject interrupt rather than eventfd
> - Enable config interrupt support
> - Support binding irq to the specified cpu
> - Add two module parameters to limit bounce/iova size
> - Create char device rather than anon inode per vduse
> - Reuse vhost IOTLB for iova domain
> - Rework the message mechanism in the control path
>
> V2 to V3:
> - Rework the MMU-based IOMMU driver
> - Use the iova domain as iova allocator instead of genpool
> - Support transferring vma->vm_file in vhost-vdpa
> - Add SVA support in vhost-vdpa
> - Remove the patches on bounce pages reclaim
>
> V1 to V2:
> - Add vhost-vdpa support
> - Add some documentation
> - Based on the vdpa management tool
> - Introduce a workqueue for irq injection
> - Replace interval tree with array map to store the iova_map
>
> Xie Yongji (10):
>   file: Export receive_fd() to modules
>   eventfd: Increase the recursion depth of eventfd_signal()
>   vhost-vdpa: protect concurrent access to vhost device iotlb
>   vhost-iotlb: Add an opaque pointer for vhost IOTLB
>   vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
>   vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
>   vdpa: Support transferring virtual addressing during DMA mapping
>   vduse: Implement an MMU-based IOMMU driver
>   vduse: Introduce VDUSE - vDPA Device in Userspace
>   Documentation: Add documentation for VDUSE
>
>  Documentation/userspace-api/index.rst              |    1 +
>  Documentation/userspace-api/ioctl/ioctl-number.rst |    1 +
>  Documentation/userspace-api/vduse.rst              |  212 +++
>  drivers/vdpa/Kconfig                               |   10 +
>  drivers/vdpa/Makefile                              |    1 +
>  drivers/vdpa/ifcvf/ifcvf_main.c                    |    2 +-
>  drivers/vdpa/mlx5/net/mlx5_vnet.c                  |    2 +-
>  drivers/vdpa/vdpa.c                                |    9 +-
>  drivers/vdpa/vdpa_sim/vdpa_sim.c                   |    8 +-
>  drivers/vdpa/vdpa_user/Makefile                    |    5 +
>  drivers/vdpa/vdpa_user/iova_domain.c               |  521 ++++++++
>  drivers/vdpa/vdpa_user/iova_domain.h               |   70 +
>  drivers/vdpa/vdpa_user/vduse_dev.c                 | 1362 ++++++++++++++++++++
>  drivers/vdpa/virtio_pci/vp_vdpa.c                  |    2 +-
>  drivers/vhost/iotlb.c                              |   20 +-
>  drivers/vhost/vdpa.c                               |  154 ++-
>  fs/eventfd.c                                       |    2 +-
>  fs/file.c                                          |    6 +
>  include/linux/eventfd.h                            |    5 +-
>  include/linux/file.h                               |    7 +-
>  include/linux/vdpa.h                               |   21 +-
>  include/linux/vhost_iotlb.h                        |    3 +
>  include/uapi/linux/vduse.h                         |  175 +++
>  23 files changed, 2548 insertions(+), 51 deletions(-)
>  create mode 100644 Documentation/userspace-api/vduse.rst
>  create mode 100644 drivers/vdpa/vdpa_user/Makefile
>  create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c
>  create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h
>  create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
>  create mode 100644 include/uapi/linux/vduse.h
>
> --
> 2.11.0
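
For the vhost-vdpa case in the cover letter, where the DMA buffer lives in a userspace memory region that is shared with the daemon by transferring the shmfd, the general idea of "pass a descriptor, then mmap() it on the other side" can be sketched in plain userspace. This is an assumption-laden illustration, not the mechanism of the series: the series appears to install the descriptor from kernel code (it exports receive_fd()), whereas the sketch uses SCM_RIGHTS over an AF_UNIX socket. The helper names (send_fd, recv_fd, demo_shared_buffer) and the unlinked temporary file standing in for the shared region are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer big enough for one descriptor, suitably aligned. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send descriptor fd over the AF_UNIX socket sock (SCM_RIGHTS). */
int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns the new fd or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;
    int fd;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}

/* Map the same file through the original and the transferred
 * descriptor and check that a write via one mapping is visible
 * through the other. Returns 0 on success, -1 otherwise. */
int demo_shared_buffer(void)
{
    char path[] = "/tmp/shm-sketch-XXXXXX";
    int sv[2] = { -1, -1 }, fd, peer_fd = -1, ret = -1;
    char *src = MAP_FAILED, *dst = MAP_FAILED;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open descriptor keeps the file alive */

    if (ftruncate(fd, 4096) < 0 ||
        socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 ||
        send_fd(sv[0], fd) < 0)
        goto out;
    peer_fd = recv_fd(sv[1]);
    if (peer_fd < 0)
        goto out;

    src = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    dst = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, peer_fd, 0);
    if (src == MAP_FAILED || dst == MAP_FAILED)
        goto out;

    strcpy(src, "dma payload");
    ret = strcmp(dst, "dma payload") == 0 ? 0 : -1;
out:
    if (src != MAP_FAILED) munmap(src, 4096);
    if (dst != MAP_FAILED) munmap(dst, 4096);
    if (peer_fd >= 0) close(peer_fd);
    if (sv[0] >= 0) close(sv[0]);
    if (sv[1] >= 0) close(sv[1]);
    close(fd);
    return ret;
}
```

In this sketch both ends run in one process for brevity; with two processes, the receiver would call recv_fd() and mmap() on its own side of the socket.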