linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: mst@redhat.com, alex.williamson@redhat.com,
	maxime.coquelin@redhat.com, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, dan.daly@intel.com,
	cunming.liang@intel.com, zhihong.wang@intel.com,
	lingshan.zhu@intel.com
Subject: Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
Date: Wed, 18 Sep 2019 13:51:21 +0800	[thread overview]
Message-ID: <fa6957f3-19ad-f351-8c43-65bc8342b82e@redhat.com> (raw)
In-Reply-To: <20190917105801.GA24855@___>


On 2019/9/17 下午6:58, Tiwei Bie wrote:
> On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote:
>> On 2019/9/17 上午9:02, Tiwei Bie wrote:
>>> This RFC is to demonstrate below ideas,
>>>
>>> a) Build vhost-mdev on top of the same abstraction defined in
>>>      the virtio-mdev series [1];
>>>
>>> b) Introduce /dev/vhost-mdev to do vhost ioctls and support
>>>      setting mdev device as backend;
>>>
>>> Now the userspace API looks like this:
>>>
>>> - Userspace generates a compatible mdev device;
>>>
>>> - Userspace opens this mdev device with VFIO API (including
>>>     doing IOMMU programming for this mdev device with VFIO's
>>>     container/group based interface);
>>>
>>> - Userspace opens /dev/vhost-mdev and gets vhost fd;
>>>
>>> - Userspace uses vhost ioctls to setup vhost (userspace should
>>>     do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device
>>>     fd first before doing other vhost ioctls);
>>>
>>> Only compile test has been done for this series for now.
>>
>> Have a hard thought on the architecture:
> Thanks a lot! Do appreciate it!
>
>> 1) Create a vhost char device and pass vfio mdev device fd to it as a
>> backend and translate vhost-mdev ioctl to virtio mdev transport (e.g
>> read/write). DMA was done through the VFIO DMA mapping on the container that
>> is attached.
> Yeah, that's what we are doing in this series.
>
>> We have two more choices:
>>
>> 2) Use vfio-mdev but do not create vhost-mdev device, instead, just
>> implement vhost ioctl on vfio_device_ops, and translate them into
>> virtio-mdev transport or just pass ioctl to parent.
> Yeah. Instead of introducing /dev/vhost-mdev char device, do
> vhost ioctls on VFIO device fd directly. That's what we did
> in RFC v3.
>
>> 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still
>> try to add dev to vfio group and talk to parent with device specific ops
> If my understanding is correct, this means we need to introduce
> a new VFIO device driver to replace the existing vfio-mdev driver
> in our case. Below is a quick draft just to show my understanding:
>
> #include <linux/init.h>
> #include <linux/module.h>
> #include <linux/device.h>
> #include <linux/kernel.h>
> #include <linux/slab.h>
> #include <linux/vfio.h>
> #include <linux/mdev.h>
>
> #include "mdev_private.h"
>
> /* XXX: we need a proper way to include below vhost header. */
> #include "../../vhost/vhost.h"
>
> static int vfio_vhost_mdev_open(void *device_data)
> {
> 	if (!try_module_get(THIS_MODULE))
> 		return -ENODEV;
>
> 	/* ... */
> 	vhost_dev_init(...);
>
> 	return 0;
> }
>
> static void vfio_vhost_mdev_release(void *device_data)
> {
> 	/* ... */
> 	module_put(THIS_MODULE);
> }
>
> static long vfio_vhost_mdev_unlocked_ioctl(void *device_data,
> 					   unsigned int cmd, unsigned long arg)
> {
> 	struct mdev_device *mdev = device_data;
> 	struct mdev_parent *parent = mdev->parent;
>
> 	/*
> 	 * Use vhost ioctls.
> 	 *
> 	 * We will have a different parent_ops design.
> 	 * And potentially, we can share the same parent_ops
> 	 * with virtio_mdev.
> 	 */
> 	switch (cmd) {
> 	case VHOST_GET_FEATURES:
> 		parent->ops->get_features(mdev, ...);
> 		break;
> 	/* ... */
> 	}
>
> 	return 0;
> }
>
> static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf,
> 				    size_t count, loff_t *ppos)
> {
> 	/* ... */
> 	return 0;
> }
>
> static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf,
> 				     size_t count, loff_t *ppos)
> {
> 	/* ... */
> 	return 0;
> }
>
> static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma)
> {
> 	/* ... */
> 	return 0;
> }
>
> static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = {
> 	.name		= "vfio-vhost-mdev",
> 	.open		= vfio_vhost_mdev_open,
> 	.release	= vfio_vhost_mdev_release,
> 	.ioctl		= vfio_vhost_mdev_unlocked_ioctl,
> 	.read		= vfio_vhost_mdev_read,
> 	.write		= vfio_vhost_mdev_write,
> 	.mmap		= vfio_vhost_mdev_mmap,
> };
>
> static int vfio_vhost_mdev_probe(struct device *dev)
> {
> 	struct mdev_device *mdev = to_mdev_device(dev);
>
> 	/* ... */
> 	return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev);
> }
>
> static void vfio_vhost_mdev_remove(struct device *dev)
> {
> 	/* ... */
> 	vfio_del_group_dev(dev);
> }
>
> static struct mdev_driver vfio_vhost_mdev_driver = {
> 	.name	= "vfio_vhost_mdev",
> 	.probe	= vfio_vhost_mdev_probe,
> 	.remove	= vfio_vhost_mdev_remove,
> };
>
> static int __init vfio_vhost_mdev_init(void)
> {
> 	return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE);
> }
> module_init(vfio_vhost_mdev_init)
>
> static void __exit vfio_vhost_mdev_exit(void)
> {
> 	mdev_unregister_driver(&vfio_vhost_mdev_driver);
> }
> module_exit(vfio_vhost_mdev_exit)


Yes, something like this basically.


>> So I have some questions:
>>
>> 1) Compared to method 2, what's the advantage of creating a new vhost char
>> device? I guess it's for keep the API compatibility?
> One benefit is that we can avoid doing vhost ioctls on
> VFIO device fd.


Yes, but any benefit from doing this?


>
>> 2) For method 2, is there any easy way for user/admin to distinguish e.g
>> ordinary vfio-mdev for vhost from ordinary vfio-mdev?
> I think device-api could be a choice.


Ok.


>
>> I saw you introduce
>> ops matching helper but it's not friendly to management.
> The ops matching helper is just to check whether a given
> vfio-device is based on a mdev device.
>
>> 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that
>> assumes the parameter comes from userspace, it prevents support kernel
>> virtio drivers.
>>
>> 4) So comes the idea of method 3, since it register a new vhost-mdev driver,
>> we can use device specific ops instead of VFIO ones, then we can have a
>> common API between vDPA parent and vhost-mdev/virtio-mdev drivers.
> As the above draft shows, this requires introducing a new
> VFIO device driver. I think Alex's opinion matters here.


Yes, it is.

Thanks


> Thanks,
> Tiwei
>
>> What's your thoughts?
>>
>> Thanks
>>
>>
>>> RFCv3: https://patchwork.kernel.org/patch/11117785/
>>>
>>> [1] https://lkml.org/lkml/2019/9/10/135
>>>
>>> Tiwei Bie (3):
>>>     vfio: support getting vfio device from device fd
>>>     vfio: support checking vfio driver by device ops
>>>     vhost: introduce mdev based hardware backend
>>>
>>>    drivers/vfio/mdev/vfio_mdev.c    |   3 +-
>>>    drivers/vfio/vfio.c              |  32 +++
>>>    drivers/vhost/Kconfig            |   9 +
>>>    drivers/vhost/Makefile           |   3 +
>>>    drivers/vhost/mdev.c             | 462 +++++++++++++++++++++++++++++++
>>>    drivers/vhost/vhost.c            |  39 ++-
>>>    drivers/vhost/vhost.h            |   6 +
>>>    include/linux/vfio.h             |  11 +
>>>    include/uapi/linux/vhost.h       |  10 +
>>>    include/uapi/linux/vhost_types.h |   5 +
>>>    10 files changed, 573 insertions(+), 7 deletions(-)
>>>    create mode 100644 drivers/vhost/mdev.c
>>>

  reply	other threads:[~2019-09-18  5:51 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-17  1:02 [RFC v4 0/3] vhost: introduce mdev based hardware backend Tiwei Bie
2019-09-17  1:02 ` [RFC v4 1/3] vfio: support getting vfio device from device fd Tiwei Bie
2019-09-17  1:02 ` [RFC v4 2/3] vfio: support checking vfio driver by device ops Tiwei Bie
2019-09-17  1:02 ` [RFC v4 3/3] vhost: introduce mdev based hardware backend Tiwei Bie
2019-09-17  7:26   ` Jason Wang
2019-09-20  4:21     ` Tiwei Bie
2019-09-17  1:29 ` [RFC v4 0/3] " Jason Wang
2019-09-17  3:32 ` Jason Wang
2019-09-17 10:58   ` Tiwei Bie
2019-09-18  5:51     ` Jason Wang [this message]
2019-09-18 14:32       ` Michael S. Tsirkin
2019-09-19 13:08         ` Jason Wang
2019-09-19 15:45           ` Tiwei Bie
2019-09-20  0:59             ` Jason Wang
2019-09-20  1:30             ` Jason Wang
2019-09-20  2:16               ` Tiwei Bie
2019-09-20  2:36                 ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fa6957f3-19ad-f351-8c43-65bc8342b82e@redhat.com \
    --to=jasowang@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=cunming.liang@intel.com \
    --cc=dan.daly@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=lingshan.zhu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.coquelin@redhat.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=tiwei.bie@intel.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=zhihong.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).