From: Xie Yongji
To: mst@redhat.com, jasowang@redhat.com, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, virtualization@lists.linux-foundation.org
Subject: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
Date: Mon, 19 Oct 2020 22:56:19 +0800
Message-Id: <20201019145623.671-1-xieyongji@bytedance.com>

This series introduces a framework that can be used to implement vDPA
devices in a userspace program. The work consists of two parts:
control path emulation and data path offloading.

In the control path, the VDUSE driver uses a message mechanism to
forward actions (get/set features, get/set status, get/set config space
and set virtqueue states) from the virtio-vdpa driver to userspace.
Userspace can use read()/write() to receive and reply to those control
messages.

In the data path, the VDUSE driver implements an MMU-based on-chip IOMMU
driver that supports both direct mapping and indirect mapping with a
bounce buffer. Userspace can then access the IOVA space via mmap().
In addition, an eventfd mechanism is used to trigger interrupts and
forward virtqueue kicks.
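
As a rough illustration of the eventfd mechanism mentioned above, the
sketch below uses only plain eventfd(2) to show how several kicks are
coalesced into a single counter that the receiver drains with one read.
It does not use the VDUSE interface from this series; the function name
kick_roundtrip() is hypothetical and exists only for this example.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the VDUSE uAPI: signal an eventfd
 * `kicks` times, then drain it once. Returns the number of kicks seen
 * by the reader, 0 if nothing was pending, or -1 on error.
 */
int64_t kick_roundtrip(unsigned int kicks)
{
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (efd < 0)
        return -1;

    /* Each write adds its 8-byte value to the kernel-side counter. */
    for (unsigned int i = 0; i < kicks; i++) {
        uint64_t one = 1;
        if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(efd);
            return -1;
        }
    }

    /* One read returns the accumulated count and resets it to zero. */
    uint64_t seen = 0;
    ssize_t n = read(efd, &seen, sizeof(seen));
    int saved_errno = errno;
    close(efd);

    if (n == (ssize_t)sizeof(seen))
        return (int64_t)seen;
    /* A non-blocking eventfd with a zero counter fails with EAGAIN. */
    return (n < 0 && saved_errno == EAGAIN) ? 0 : -1;
}
```

Here the writer and the reader live in the same function only to keep
the example short. In a kick or interrupt path the two ends would
presumably sit on opposite sides of the kernel/userspace boundary, and
the reader would typically wait on the descriptor with poll() or epoll
instead of reading it immediately.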

The details and our use case are shown below:

------------------------     -----------------------------------------------------------
|          APP         |     |                          QEMU                           |
|       ---------      |     | --------------------    -------------------+<-->+------ |
|       |dev/vdx|      |     | | device emulation |    | virtio dataplane |    | BDS | |
------------+-----------     -----------+-----------------------+-----------------+-----
            |                           |                       |                 |
            |                           | emulating             | offloading      |
------------+---------------------------+-----------------------+-----------------+------
|    | block device |           |  vduse driver |        |  vdpa device |    | TCP/IP | |
|    -------+--------           --------+--------        +------+--------    -----+---- |
|           |                           |                |      |                 |     |
|           |                           |                |      |                 |     |
| ----------+----------       ----------+-----------     |      |                 |     |
| | virtio-blk driver |       | virtio-vdpa driver |     |      |                 |     |
| ----------+----------       ----------+-----------     |      |                 |     |
|           |                           |                |      |                 |     |
|           |                           ------------------      |                 |     |
|           -----------------------------------------------------              ---+---  |
------------------------------------------------------------------------------ | NIC |---
                                                                               ---+---
                                                                                  |
                                                                         ---------+---------
                                                                         | Remote Storages |
                                                                         -------------------

We use it to implement a block device that connects to our distributed
storage and can be used both in containers and on bare metal. Compared
with the qemu-nbd solution, this approach has higher performance, and it
gives us a unified technology stack for remote storage across VMs and
containers.

To test it with a host disk (e.g. /dev/sdx):

  $ qemu-storage-daemon \
      --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
      --monitor chardev=charmonitor \
      --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/sdx,node-name=disk0 \
      --export vduse-blk,id=test,node-name=disk0,writable=on,vduse-id=1,num-queues=16,queue-size=128

The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse

Future work:
  - Improve performance (e.g.
    zero copy implementation in datapath)
  - Config interrupt support
  - Userspace library (find a way to reuse device emulation code in
    qemu/rust-vmm)

Xie Yongji (4):
  mm: export zap_page_range() for driver use
  vduse: Introduce VDUSE - vDPA Device in Userspace
  vduse: grab the module's references until there is no vduse device
  vduse: Add memory shrinker to reclaim bounce pages

 drivers/vdpa/Kconfig                 |    8 +
 drivers/vdpa/Makefile                |    1 +
 drivers/vdpa/vdpa_user/Makefile      |    5 +
 drivers/vdpa/vdpa_user/eventfd.c     |  221 ++++++
 drivers/vdpa/vdpa_user/eventfd.h     |   48 ++
 drivers/vdpa/vdpa_user/iova_domain.c |  488 ++++++++++++
 drivers/vdpa/vdpa_user/iova_domain.h |  104 +++
 drivers/vdpa/vdpa_user/vduse.h       |   66 ++
 drivers/vdpa/vdpa_user/vduse_dev.c   | 1081 ++++++++++++++++++++++++++
 include/uapi/linux/vduse.h           |   85 ++
 mm/memory.c                          |    1 +
 11 files changed, 2108 insertions(+)
 create mode 100644 drivers/vdpa/vdpa_user/Makefile
 create mode 100644 drivers/vdpa/vdpa_user/eventfd.c
 create mode 100644 drivers/vdpa/vdpa_user/eventfd.h
 create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c
 create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h
 create mode 100644 drivers/vdpa/vdpa_user/vduse.h
 create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
 create mode 100644 include/uapi/linux/vduse.h

-- 
2.25.1