From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Tiwei Bie <tiwei.bie@intel.com>,
alex.williamson@redhat.com, maxime.coquelin@redhat.com,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
virtualization@lists.linux-foundation.org,
netdev@vger.kernel.org, dan.daly@intel.com,
cunming.liang@intel.com, zhihong.wang@intel.com,
lingshan.zhu@intel.com
Subject: Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
Date: Wed, 18 Sep 2019 10:32:00 -0400 [thread overview]
Message-ID: <20190918102923-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <fa6957f3-19ad-f351-8c43-65bc8342b82e@redhat.com>
On Wed, Sep 18, 2019 at 01:51:21PM +0800, Jason Wang wrote:
>
> On 2019/9/17 下午6:58, Tiwei Bie wrote:
> > On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote:
> > > On 2019/9/17 上午9:02, Tiwei Bie wrote:
> > > > This RFC is to demonstrate below ideas,
> > > >
> > > > a) Build vhost-mdev on top of the same abstraction defined in
> > > > the virtio-mdev series [1];
> > > >
> > > > b) Introduce /dev/vhost-mdev to do vhost ioctls and support
> > > > setting mdev device as backend;
> > > >
> > > > Now the userspace API looks like this:
> > > >
> > > > - Userspace generates a compatible mdev device;
> > > >
> > > > - Userspace opens this mdev device with VFIO API (including
> > > > doing IOMMU programming for this mdev device with VFIO's
> > > > container/group based interface);
> > > >
> > > > - Userspace opens /dev/vhost-mdev and gets vhost fd;
> > > >
> > > > - Userspace uses vhost ioctls to setup vhost (userspace should
> > > > do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device
> > > > fd first before doing other vhost ioctls);
> > > >
> > > > Only compile test has been done for this series for now.
> > >
> > > Have a hard thought on the architecture:
> > Thanks a lot! Do appreciate it!
> >
> > > 1) Create a vhost char device and pass vfio mdev device fd to it as a
> > > backend and translate vhost-mdev ioctl to virtio mdev transport (e.g
> > > read/write). DMA was done through the VFIO DMA mapping on the container that
> > > is attached.
> > Yeah, that's what we are doing in this series.
> >
> > > We have two more choices:
> > >
> > > 2) Use vfio-mdev but do not create vhost-mdev device, instead, just
> > > implement vhost ioctl on vfio_device_ops, and translate them into
> > > virtio-mdev transport or just pass ioctl to parent.
> > Yeah. Instead of introducing /dev/vhost-mdev char device, do
> > vhost ioctls on VFIO device fd directly. That's what we did
> > in RFC v3.
> >
> > > 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still
> > > try to add dev to vfio group and talk to parent with device specific ops
> > If my understanding is correct, this means we need to introduce
> > a new VFIO device driver to replace the existing vfio-mdev driver
> > in our case. Below is a quick draft just to show my understanding:
> >
> > #include <linux/init.h>
> > #include <linux/module.h>
> > #include <linux/device.h>
> > #include <linux/kernel.h>
> > #include <linux/slab.h>
> > #include <linux/vfio.h>
> > #include <linux/mdev.h>
> >
> > #include "mdev_private.h"
> >
> > /* XXX: we need a proper way to include below vhost header. */
> > #include "../../vhost/vhost.h"
> >
> > static int vfio_vhost_mdev_open(void *device_data)
> > {
> > if (!try_module_get(THIS_MODULE))
> > return -ENODEV;
> >
> > /* ... */
> > vhost_dev_init(...);
> >
> > return 0;
> > }
> >
> > static void vfio_vhost_mdev_release(void *device_data)
> > {
> > /* ... */
> > module_put(THIS_MODULE);
> > }
> >
> > static long vfio_vhost_mdev_unlocked_ioctl(void *device_data,
> > unsigned int cmd, unsigned long arg)
> > {
> > struct mdev_device *mdev = device_data;
> > struct mdev_parent *parent = mdev->parent;
> >
> > /*
> > * Use vhost ioctls.
> > *
> > * We will have a different parent_ops design.
> > * And potentially, we can share the same parent_ops
> > * with virtio_mdev.
> > */
> > switch (cmd) {
> > case VHOST_GET_FEATURES:
> > parent->ops->get_features(mdev, ...);
> > break;
> > /* ... */
> > }
> >
> > return 0;
> > }
> >
> > static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf,
> > size_t count, loff_t *ppos)
> > {
> > /* ... */
> > return 0;
> > }
> >
> > static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf,
> > size_t count, loff_t *ppos)
> > {
> > /* ... */
> > return 0;
> > }
> >
> > static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma)
> > {
> > /* ... */
> > return 0;
> > }
> >
> > static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = {
> > .name = "vfio-vhost-mdev",
> > .open = vfio_vhost_mdev_open,
> > .release = vfio_vhost_mdev_release,
> > .ioctl = vfio_vhost_mdev_unlocked_ioctl,
> > .read = vfio_vhost_mdev_read,
> > .write = vfio_vhost_mdev_write,
> > .mmap = vfio_vhost_mdev_mmap,
> > };
> >
> > static int vfio_vhost_mdev_probe(struct device *dev)
> > {
> > struct mdev_device *mdev = to_mdev_device(dev);
> >
> > /* ... */
> > return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev);
> > }
> >
> > static void vfio_vhost_mdev_remove(struct device *dev)
> > {
> > /* ... */
> > vfio_del_group_dev(dev);
> > }
> >
> > static struct mdev_driver vfio_vhost_mdev_driver = {
> > .name = "vfio_vhost_mdev",
> > .probe = vfio_vhost_mdev_probe,
> > .remove = vfio_vhost_mdev_remove,
> > };
> >
> > static int __init vfio_vhost_mdev_init(void)
> > {
> > return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE);
> > }
> > module_init(vfio_vhost_mdev_init)
> >
> > static void __exit vfio_vhost_mdev_exit(void)
> > {
> > mdev_unregister_driver(&vfio_vhost_mdev_driver);
> > }
> > module_exit(vfio_vhost_mdev_exit)
>
>
> Yes, something like this basically.
>
>
> > > So I have some questions:
> > >
> > > 1) Compared to method 2, what's the advantage of creating a new vhost char
> > > device? I guess it's for keep the API compatibility?
> > One benefit is that we can avoid doing vhost ioctls on
> > VFIO device fd.
>
>
> Yes, but any benefit from doing this?
It does seem a bit more modular, but it's certainly not a big deal.
> >
> > > 2) For method 2, is there any easy way for user/admin to distinguish e.g
> > > ordinary vfio-mdev for vhost from ordinary vfio-mdev?
> > I think device-api could be a choice.
>
>
> Ok.
>
>
> >
> > > I saw you introduce
> > > ops matching helper but it's not friendly to management.
> > The ops matching helper is just to check whether a given
> > vfio-device is based on a mdev device.
> >
> > > 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that
> > > assumes the parameter comes from userspace, it prevents support kernel
> > > virtio drivers.
> > >
> > > 4) So comes the idea of method 3, since it register a new vhost-mdev driver,
> > > we can use device specific ops instead of VFIO ones, then we can have a
> > > common API between vDPA parent and vhost-mdev/virtio-mdev drivers.
> > As the above draft shows, this requires introducing a new
> > VFIO device driver. I think Alex's opinion matters here.
>
>
> Yes, it is.
>
> Thanks
>
>
> > Thanks,
> > Tiwei
> >
> > > What's your thoughts?
> > >
> > > Thanks
> > >
> > >
> > > > RFCv3: https://patchwork.kernel.org/patch/11117785/
> > > >
> > > > [1] https://lkml.org/lkml/2019/9/10/135
> > > >
> > > > Tiwei Bie (3):
> > > > vfio: support getting vfio device from device fd
> > > > vfio: support checking vfio driver by device ops
> > > > vhost: introduce mdev based hardware backend
> > > >
> > > > drivers/vfio/mdev/vfio_mdev.c | 3 +-
> > > > drivers/vfio/vfio.c | 32 +++
> > > > drivers/vhost/Kconfig | 9 +
> > > > drivers/vhost/Makefile | 3 +
> > > > drivers/vhost/mdev.c | 462 +++++++++++++++++++++++++++++++
> > > > drivers/vhost/vhost.c | 39 ++-
> > > > drivers/vhost/vhost.h | 6 +
> > > > include/linux/vfio.h | 11 +
> > > > include/uapi/linux/vhost.h | 10 +
> > > > include/uapi/linux/vhost_types.h | 5 +
> > > > 10 files changed, 573 insertions(+), 7 deletions(-)
> > > > create mode 100644 drivers/vhost/mdev.c
> > > >
next prev parent reply other threads:[~2019-09-18 14:32 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-17 1:02 [RFC v4 0/3] vhost: introduce mdev based hardware backend Tiwei Bie
2019-09-17 1:02 ` [RFC v4 1/3] vfio: support getting vfio device from device fd Tiwei Bie
2019-09-17 1:02 ` [RFC v4 2/3] vfio: support checking vfio driver by device ops Tiwei Bie
2019-09-17 1:02 ` [RFC v4 3/3] vhost: introduce mdev based hardware backend Tiwei Bie
2019-09-17 7:26 ` Jason Wang
2019-09-20 4:21 ` Tiwei Bie
2019-09-17 1:29 ` [RFC v4 0/3] " Jason Wang
2019-09-17 3:32 ` Jason Wang
2019-09-17 10:58 ` Tiwei Bie
2019-09-18 5:51 ` Jason Wang
2019-09-18 14:32 ` Michael S. Tsirkin [this message]
2019-09-19 13:08 ` Jason Wang
2019-09-19 15:45 ` Tiwei Bie
2019-09-20 0:59 ` Jason Wang
2019-09-20 1:30 ` Jason Wang
2019-09-20 2:16 ` Tiwei Bie
2019-09-20 2:36 ` Jason Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190918102923-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=cunming.liang@intel.com \
--cc=dan.daly@intel.com \
--cc=jasowang@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=lingshan.zhu@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=maxime.coquelin@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=tiwei.bie@intel.com \
--cc=virtualization@lists.linux-foundation.org \
--cc=zhihong.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).