From: Yongji Xie <xieyongji@bytedance.com>
To: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	sgarzare@redhat.com, Parav Pandit <parav@nvidia.com>,
	akpm@linux-foundation.org, Randy Dunlap <rdunlap@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org,
	corbet@lwn.net, virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, kvm@vger.kernel.org, linux-aio@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Re: [RFC v2 09/13] vduse: Add support for processing vhost iotlb message
Date: Thu, 24 Dec 2020 15:37:10 +0800
Message-ID: <CACycT3s=m=PQb5WFoMGhz8TNGme4+=rmbbBTtrugF9ZmNnWxEw@mail.gmail.com>
In-Reply-To: <595fe7d6-7876-26e4-0b7c-1d63ca6d7a97@redhat.com>

On Thu, Dec 24, 2020 at 10:41 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/12/23 8:14 PM, Yongji Xie wrote:
> > On Wed, Dec 23, 2020 at 5:05 PM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On 2020/12/22 10:52 PM, Xie Yongji wrote:
> >>> To support vhost-vdpa bus driver, we need a way to share the
> >>> vhost-vdpa backend process's memory with the userspace VDUSE process.
> >>>
> >>> This patch tries to make use of the vhost iotlb message to achieve
> >>> that. We will get the shm file from the iotlb message and pass it
> >>> to the userspace VDUSE process.
> >>>
> >>> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> >>> ---
> >>>    Documentation/driver-api/vduse.rst |  15 +++-
> >>>    drivers/vdpa/vdpa_user/vduse_dev.c | 147 ++++++++++++++++++++++++++++++++++++-
> >>>    include/uapi/linux/vduse.h         |  11 +++
> >>>    3 files changed, 171 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/Documentation/driver-api/vduse.rst b/Documentation/driver-api/vduse.rst
> >>> index 623f7b040ccf..48e4b1ba353f 100644
> >>> --- a/Documentation/driver-api/vduse.rst
> >>> +++ b/Documentation/driver-api/vduse.rst
> >>> @@ -46,13 +46,26 @@ The following types of messages are provided by the VDUSE framework now:
> >>>
> >>>    - VDUSE_GET_CONFIG: Read from device specific configuration space
> >>>
> >>> +- VDUSE_UPDATE_IOTLB: Update the memory mapping in device IOTLB
> >>> +
> >>> +- VDUSE_INVALIDATE_IOTLB: Invalidate the memory mapping in device IOTLB
> >>> +
> >>>    Please see include/linux/vdpa.h for details.
> >>>
> >>> -In the data path, VDUSE framework implements a MMU-based on-chip IOMMU
> >>> +The data path of userspace vDPA device is implemented in different ways
> >>> +depending on the vdpa bus to which it is attached.
> >>> +
> >>> +In virtio-vdpa case, VDUSE framework implements a MMU-based on-chip IOMMU
> >>>    driver which supports mapping the kernel dma buffer to a userspace iova
> >>>    region dynamically. The userspace iova region can be created by passing
> >>>    the userspace vDPA device fd to mmap(2).
> >>>
> >>> +In vhost-vdpa case, the dma buffer resides in a userspace memory region
> >>> +which will be shared to the VDUSE userspace process via the file
> >>> +descriptor in VDUSE_UPDATE_IOTLB message. And the corresponding address
> >>> +mapping (IOVA of dma buffer <-> VA of the memory region) is also included
> >>> +in this message.
> >>> +
> >>>    Besides, the eventfd mechanism is used to trigger interrupt callbacks and
> >>>    receive virtqueue kicks in userspace. The following ioctls on the userspace
> >>>    vDPA device fd are provided to support that:
> >>> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> >>> index b974333ed4e9..d24aaacb6008 100644
> >>> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> >>> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> >>> @@ -34,6 +34,7 @@
> >>>
> >>>    struct vduse_dev_msg {
> >>>        struct vduse_dev_request req;
> >>> +     struct file *iotlb_file;
> >>>        struct vduse_dev_response resp;
> >>>        struct list_head list;
> >>>        wait_queue_head_t waitq;
> >>> @@ -325,12 +326,80 @@ static int vduse_dev_set_vq_state(struct vduse_dev *dev,
> >>>        return ret;
> >>>    }
> >>>
> >>> +static int vduse_dev_update_iotlb(struct vduse_dev *dev, struct file *file,
> >>> +                             u64 offset, u64 iova, u64 size, u8 perm)
> >>> +{
> >>> +     struct vduse_dev_msg *msg;
> >>> +     int ret;
> >>> +
> >>> +     if (!size)
> >>> +             return -EINVAL;
> >>> +
> >>> +     msg = vduse_dev_new_msg(dev, VDUSE_UPDATE_IOTLB);
> >>> +     msg->req.size = sizeof(struct vduse_iotlb);
> >>> +     msg->req.iotlb.offset = offset;
> >>> +     msg->req.iotlb.iova = iova;
> >>> +     msg->req.iotlb.size = size;
> >>> +     msg->req.iotlb.perm = perm;
> >>> +     msg->req.iotlb.fd = -1;
> >>> +     msg->iotlb_file = get_file(file);
> >>> +
> >>> +     ret = vduse_dev_msg_sync(dev, msg);
> >>
> >> My feeling is that we should provide a consistent API for the userspace
> >> device to use.
> >>
> >> E.g. we'd better carry the IOTLB message for both virtio and vhost drivers.
> >>
> >> It looks to me that for virtio drivers we can still use the UPDATE_IOTLB
> >> message by using the VDUSE file as msg->iotlb_file here.
> >>
> > That's fine with me. One problem is when to send the UPDATE_IOTLB
> > message in the virtio case.
>
>
> Instead of generating IOTLB messages for userspace, how about recording
> the mappings (a common case for devices that have an on-chip IOMMU, e.g.
> mlx5e and the vdpa simulator)? Then we can introduce an ioctl for
> userspace to query them.
>

If so, the IOTLB update is actually triggered by an ioctl, but
IOTLB_INVALIDATE is triggered by a message. Isn't that a little odd? Or
how about triggering it when userspace calls mmap() on the device fd?
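
To make the "record the mappings, then let userspace query" idea
concrete, here is a rough, self-contained sketch of such a table. It is
not code from this patch and not an existing VDUSE interface: the names
iotlb_entry, iotlb_table, iotlb_record(), iotlb_invalidate() and
iotlb_query() are all hypothetical, and the fixed-size array is only an
assumption to keep the example short. The fields mirror the ones filled
in by vduse_dev_update_iotlb() above (offset, iova, size, perm).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one mapping: IOVA range -> offset in a shared file. */
struct iotlb_entry {
	uint64_t iova;
	uint64_t size;
	uint64_t offset;
	uint8_t perm;
};

#define IOTLB_MAX 16	/* arbitrary capacity for this sketch */

struct iotlb_table {
	struct iotlb_entry e[IOTLB_MAX];
	size_t n;
};

/* Record a mapping. Returns 0, or -1 on zero size, range wrap-around
 * or a full table. */
static int iotlb_record(struct iotlb_table *t, uint64_t iova, uint64_t size,
			uint64_t offset, uint8_t perm)
{
	assert(t);
	if (!size || iova + (size - 1) < iova || t->n == IOTLB_MAX)
		return -1;
	t->e[t->n++] = (struct iotlb_entry){ iova, size, offset, perm };
	return 0;
}

/* Drop every entry that overlaps [iova, iova + size). Returns the number
 * of entries removed. */
static size_t iotlb_invalidate(struct iotlb_table *t, uint64_t iova,
			       uint64_t size)
{
	size_t i = 0, removed = 0;
	uint64_t last;

	assert(t);
	if (!size)
		return 0;
	last = iova + (size - 1);
	if (last < iova)		/* clamp on wrap-around */
		last = UINT64_MAX;

	while (i < t->n) {
		uint64_t e_last = t->e[i].iova + (t->e[i].size - 1);

		if (t->e[i].iova <= last && iova <= e_last) {
			t->e[i] = t->e[--t->n];	/* swap in the last entry */
			removed++;
		} else {
			i++;
		}
	}
	return removed;
}

/* Query path: translate an IOVA to an offset in the shared file.
 * Returns 0 and sets *offset, or -1 if the IOVA is not mapped. */
static int iotlb_query(const struct iotlb_table *t, uint64_t iova,
		       uint64_t *offset)
{
	assert(t && offset);
	for (size_t i = 0; i < t->n; i++) {
		const struct iotlb_entry *e = &t->e[i];

		if (iova >= e->iova && iova - e->iova < e->size) {
			*offset = e->offset + (iova - e->iova);
			return 0;
		}
	}
	return -1;
}
```

In this model the query is pull-based (userspace asks), while
invalidation still has to be pushed to userspace somehow, which is the
asymmetry in question.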

Thanks,
Yongji

