[Qemu-devel] [RFC 0/2] vhost-vfio: introduce mdev based HW vhost backend

* [Qemu-devel] [RFC 0/2] vhost-vfio: introduce mdev based HW vhost backend
@ 2018-10-16 13:23 Xiao Wang
  2018-10-16 13:23 ` [Qemu-devel] [RFC 1/2] vhost-vfio: introduce vhost-vfio net client Xiao Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Xiao Wang @ 2018-10-16 13:23 UTC (permalink / raw)
  To: jasowang, mst, alex.williamson
  Cc: qemu-devel, tiwei.bie, cunming.liang, xiaolong.ye, zhihong.wang,
	dan.daly, Xiao Wang

What's this
===========
Following the patch (vhost: introduce mdev based hardware vhost backend)
https://lwn.net/Articles/750770/, which defines a generic mdev device for
vhost data path acceleration (aliased as vDPA mdev below), this patch set
introduces a new net client type: vhost-vfio.

Currently we have 2 types of vhost backends in QEMU: vhost kernel (tap)
and vhost-user (e.g. DPDK vhost), in order to have a kernel space HW vhost
acceleration framework, the vDPA mdev device works as a generic configuring
channel. It exposes to user space a non-vendor-specific configuration
interface for setting up a vhost HW accelerator, based on this, this patch
set introduces a third vhost backend called vhost-vfio.

How does it work
================
The vDPA mdev defines 2 BAR regions, BAR0 and BAR1. BAR0 is the main
device interface, vhost messages can be written to or read from this
region following below format. All the regular vhost messages about vring
addr, negotiated features, etc., are written to this region directly.

struct vhost_vfio_op {
	__u64 request;
	__u32 flags;
	/* Flag values: */
#define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */
	__u32 size;
	union {
		__u64 u64;
		struct vhost_vring_state state;
		struct vhost_vring_addr addr;
		struct vhost_memory memory;
	} payload;
};

BAR1 is defined to be a region of doorbells, QEMU can use this region as
host notifier for virtio. To optimize virtio notify, vhost-vfio trys to
mmap the corresponding page on BAR1 for each queue and leverage EPT to let
guest virtio driver kick vDPA device doorbell directly. For virtio 0.95
case in which we cannot set host notifier memory region, QEMU will help to
relay the notify to vDPA device.

Note: EPT mapping requires each queue's notify address locates at the
beginning of a separate page, parameter "page-per-vq=on" could help.

For interrupt setting, vDPA mdev device leverages existing VFIO API to
enable interrupt config in user space. In this way, KVM's irqfd for virtio
can be set to mdev device by QEMU using ioctl().

vhost-vfio net client will set up a vDPA mdev device which is specified
by a "sysfsdev" parameter, during the net client init, the device will be
opened and parsed using VFIO API, the VFIO device fd and device BAR region
offset will be kept in a VhostVFIO structure, this initialization provides
a channel to configure vhost information to the vDPA device driver.

To do later
===========
1. The net client initialization uses raw VFIO API to open vDPA mdev
device, it's better to provide a set of helpers in hw/vfio/common.c
to help vhost-vfio initialize device easily.

2. For device DMA mapping, QEMU passes memory region info to mdev device
and let kernel parent device driver program IOMMU. This is a temporary
implementation, for future when IOMMU driver supports mdev bus, we
can use VFIO API to program IOMMU directly for parent device.
Refer to the patch (vfio/mdev: IOMMU aware mediated device):
https://lkml.org/lkml/2018/10/12/225

Vhost-vfio usage
================
# Query the number of available mdev instances
$ cat /sys/class/mdev_bus/0000:84:00.3/mdev_supported_types/ifcvf_vdpa-vdpa_virtio/available_instances

# Create a mdev instance
$ echo $UUID > /sys/class/mdev_bus/0000:84:00.3/mdev_supported_types/ifcvf_vdpa-vdpa_virtio/create

# Launch QEMU with a virtio-net device
    qemu-system-x86_64 -cpu host -enable-kvm \
    <snip>
    -mem-prealloc \
    -netdev type=vhost-vfio,sysfsdev=/sys/bus/mdev/devices/$UUID,id=mynet\
    -device virtio-net-pci,netdv=mynet,page-per-vq=on \

-------- END --------

Xiao Wang (2):
  vhost-vfio: introduce vhost-vfio net client
  vhost-vfio: implement vhost-vfio backend

 hw/net/vhost_net.c                |  56 ++++-
 hw/vfio/common.c                  |   3 +-
 hw/virtio/Makefile.objs           |   2 +-
 hw/virtio/vhost-backend.c         |   3 +
 hw/virtio/vhost-vfio.c            | 501 ++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  15 ++
 include/hw/virtio/vhost-backend.h |   7 +-
 include/hw/virtio/vhost-vfio.h    |  35 +++
 include/hw/virtio/vhost.h         |   2 +
 include/net/vhost-vfio.h          |  17 ++
 linux-headers/linux/vhost.h       |   9 +
 net/Makefile.objs                 |   1 +
 net/clients.h                     |   3 +
 net/net.c                         |   1 +
 net/vhost-vfio.c                  | 327 +++++++++++++++++++++++++
 qapi/net.json                     |  22 +-
 16 files changed, 996 insertions(+), 8 deletions(-)
 create mode 100644 hw/virtio/vhost-vfio.c
 create mode 100644 include/hw/virtio/vhost-vfio.h
 create mode 100644 include/net/vhost-vfio.h
 create mode 100644 net/vhost-vfio.c

-- 
2.15.1

^ permalink raw reply	[flat|nested] 10+ messages in thread