From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kirti Wankhede
Subject: Re: VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Thu, 28 Jan 2016 02:25:32 +0530
Message-ID: <56A92EC4.5050105@nvidia.com>
References: <569C5071.6080004@intel.com> <1453092476.32741.67.camel@redhat.com> <569CA8AD.6070200@intel.com> <1453143919.32741.169.camel@redhat.com> <569F4C86.2070501@intel.com> <56A6083E.10703@intel.com> <1453757426.32741.614.camel@redhat.com> <20160126102003.GA14400@nvidia.com> <1453838773.15515.1.camel@redhat.com> <56A87A93.3000105@nvidia.com> <1453910459.6261.1.camel@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: "Song, Jike", Gerd Hoffmann, Paolo Bonzini, "Lv, Zhiyuan", "Ruan, Shuai", "kvm@vger.kernel.org", qemu-devel, "igvt-g@lists.01.org"
To: Alex Williamson, Neo Jia, "Tian, Kevin"
Return-path:
Received: from hqemgate15.nvidia.com ([216.228.121.64]:6628 "EHLO hqemgate15.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964941AbcA0Uzn convert rfc822-to-8bit (ORCPT); Wed, 27 Jan 2016 15:55:43 -0500
In-Reply-To: <1453910459.6261.1.camel@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 1/27/2016 9:30 PM, Alex Williamson wrote:
> On Wed, 2016-01-27 at 13:36 +0530, Kirti Wankhede wrote:
>>
>> On 1/27/2016 1:36 AM, Alex Williamson wrote:
>>> On Tue, 2016-01-26 at 02:20 -0800, Neo Jia wrote:
>>>> On Mon, Jan 25, 2016 at 09:45:14PM +0000, Tian, Kevin wrote:
>>>>>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>>>
>>>> Hi Alex, Kevin and Jike,
>>>>
>>>> (Seems I shouldn't use attachment, resend it again to the list, patches are
>>>> inline at the end)
>>>>
>>>> Thanks for adding me to this technical discussion, a great opportunity
>>>> for us to design together which can bring both Intel and NVIDIA vGPU solution to
>>>> KVM platform.
>>>>
>>>> Instead of directly jumping to the proposal that we have been working on
>>>> recently for NVIDIA vGPU on KVM, I think it is better for me to put out couple
>>>> quick comments / thoughts regarding the existing discussions on this thread as
>>>> fundamentally I think we are solving the same problem, DMA, interrupt and MMIO.
>>>>
>>>> Then we can look at what we have, hopefully we can reach some consensus soon.
>>>>
>>>>> Yes, and since you're creating and destroying the vgpu here, this is
>>>>> where I'd expect a struct device to be created and added to an IOMMU
>>>>> group. The lifecycle management should really include links between
>>>>> the vGPU and physical GPU, which would be much, much easier to do with
>>>>> struct devices create here rather than at the point where we start
>>>>> doing vfio "stuff".
>>>>
>>>> Infact to keep vfio-vgpu to be more generic, vgpu device creation and management
>>>> can be centralized and done in vfio-vgpu. That also include adding to IOMMU
>>>> group and VFIO group.
>>> Is this really a good idea? The concept of a vgpu is not unique to
>>> vfio, we want vfio to be a driver for a vgpu, not an integral part of
>>> the lifecycle of a vgpu. That certainly doesn't exclude adding
>>> infrastructure to make lifecycle management of a vgpu more consistent
>>> between drivers, but it should be done independently of vfio. I'll go
>>> back to the SR-IOV model, vfio is often used with SR-IOV VFs, but vfio
>>> does not create the VF, that's done in coordination with the PF making
>>> use of some PCI infrastructure for consistency between drivers.
>>>
>>> It seems like we need to take more advantage of the class and driver
>>> core support to perhaps setup a vgpu bus and class with vfio-vgpu just
>>> being a driver for those devices.
>>
>> For device passthrough or SR-IOV model, PCI devices are created by PCI
>> bus driver and from the probe routine each device is added in vfio group.
>
> An SR-IOV VF is created by the PF driver using standard interfaces
> provided by the PCI core. The IOMMU group for a VF is added by the
> IOMMU driver when the device is created on the pci_bus_type. The probe
> routine of the vfio bus driver (vfio-pci) is what adds the device into
> the vfio group.
>
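For reference, the existing SR-IOV flow Alex describes looks roughly like the
sketch below from the PF driver's side. The my_pf_* names are made up for
illustration; only pci_enable_sriov()/pci_disable_sriov() and the
sriov_configure hook of struct pci_driver are existing PCI core interfaces.

static int my_pf_sriov_configure(struct pci_dev *pdev, int num_vfs)
{
        int ret;

        if (num_vfs == 0) {
                /* tear down all VFs previously created for this PF */
                pci_disable_sriov(pdev);
                return 0;
        }

        /*
         * The PCI core creates the VF pci_dev instances here; the IOMMU
         * driver then places each VF into an IOMMU group, and vfio-pci's
         * probe() adds a VF to a vfio group once it is bound to vfio-pci.
         */
        ret = pci_enable_sriov(pdev, num_vfs);
        return ret ? ret : num_vfs;
}

static struct pci_driver my_pf_driver = {
        .name            = "my-pf",
        /* .id_table, .probe, .remove, ... */
        .sriov_configure = my_pf_sriov_configure,
};

Writing a VF count to the PF's sriov_numvfs attribute in sysfs is what ends up
calling the sriov_configure() hook; vfio itself never creates the VFs.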
>> For vgpu, there should be a common module that create vgpu device, say
>> vgpu module, add vgpu device to an IOMMU group and then add it to vfio
>> group. This module can handle management of vgpus. Advantage of keeping
>> this module a separate module than doing device creation in vendor
>> modules is to have generic interface for vgpu management, for example,
>> files /sys/class/vgpu/vgpu_start and /sys/class/vgpu/vgpu_shutdown and
>> vgpu driver registration interface.
>
> But you're suggesting something very different from the SR-IOV model.
> If we wanted to mimic that model, the GPU specific driver should create
> the vgpu using services provided by a common interface. For instance
> i915 could call a new vgpu_device_create() which creates the device,
> adds it to the vgpu class, etc. That vgpu device should not be assumed
> to be used with vfio though, that should happen via a separate probe
> using a vfio-vgpu driver. It's that vfio bus driver that will add the
> device to a vfio group.
>

In that case the vgpu module should provide a driver registration
interface so that the vfio-vgpu driver can register with it:

struct vgpu_driver {
        const char *name;
        int  (*probe)(struct vgpu_device *vdev);
        void (*remove)(struct vgpu_device *vdev);
};

int vgpu_register_driver(struct vgpu_driver *driver)
{
        ...
}
EXPORT_SYMBOL(vgpu_register_driver);

int vgpu_unregister_driver(struct vgpu_driver *driver)
{
        ...
}
EXPORT_SYMBOL(vgpu_unregister_driver);

The vfio-vgpu driver registers with the vgpu module. Then
vgpu_device_create(), after creating the device, calls
vgpu_driver->probe(vgpu_device), and the vfio-vgpu driver adds the device
to a vfio group.
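A rough sketch of that create path, purely to illustrate the flow (nothing
below exists yet; the vgpu_device_create() signature, vgpu_drv and the
class/IOMMU handling are placeholders):

/* single registered vfio-vgpu driver, set by vgpu_register_driver() */
static struct vgpu_driver *vgpu_drv;

struct vgpu_device *vgpu_device_create(struct device *gpu_dev)
{
        struct vgpu_device *vdev;
        int ret;

        vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
        if (!vdev)
                return ERR_PTR(-ENOMEM);

        /* vgpu.ko owns the device: add it to the vgpu class and to an
         * IOMMU group here, independent of any vfio involvement */

        /* hand the new device to the registered vfio-vgpu driver; its
         * probe() is what adds the device to a vfio group */
        ret = vgpu_drv ? vgpu_drv->probe(vdev) : -ENODEV;
        if (ret) {
                kfree(vdev);
                return ERR_PTR(ret);
        }

        return vdev;
}
EXPORT_SYMBOL(vgpu_device_create);

The GPU driver (i915.ko or nvidia.ko) would then be the only caller of
vgpu_device_create(), which matches the picture below: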
 +--------------+    vgpu_register_driver()+---------------+
 |     __init() +------------------------->+               |
 |              |                          |               |
 |              +<-------------------------+    vgpu.ko    |
 | vfio_vgpu.ko |   probe()/remove()       |               |
 |              |                +---------+               +---------+
 +--------------+                |         +-------+-------+         |
                                 |                 ^                 |
                                 | callback        |                 |
                                 |         +-------+--------+        |
                                 |         |vgpu_register_device()   |
                                 |         |                |        |
                                 +---^-----+-----+    +-----+------+-+
                                     | nvidia.ko |    |   i915.ko   |
                                     |           |    |             |
                                     +-----------+    +------------+

Is my understanding correct?

Thanks,
Kirti

>> In the patch, vgpu_dev.c + vgpu_sysfs.c form such vgpu module and
>> vgpu_vfio.c is for VFIO interface. Each vgpu device should be added to
>> vfio group, so vgpu_group_init() from vgpu_vfio.c should be called per
>> device. In the vgpu module, vgpu devices are created on request, so
>> vgpu_group_init() should be called explicitly for per vgpu device.
>> That’s why had merged the 2 modules, vgpu + vgpu_vfio to form one vgpu
>> module. Vgpu_vfio would remain separate entity but merged with vgpu
>> module.
>
> I disagree with this design, creation of a vgpu necessarily involves the
> GPU driver and should not be tied to use of the vgpu with vfio. vfio
> should be a driver for the device, maybe eventually not the only driver
> for the device. Thanks,
>
> Alex
>