From: "Tian, Kevin" <kevin.tian@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Song, Jike" <jike.song@intel.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"igvt-g@ml01.01.org" <igvt-g@ml01.01.org>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"White, Michael L" <michael.l.white@intel.com>,
	"Dong, Eddie" <eddie.dong@intel.com>,
	"Li, Susie" <susie.li@intel.com>,
	"Cowperthwaite, David J" <david.j.cowperthwaite@intel.com>,
	"Reddy, Raghuveer" <raghuveer.reddy@intel.com>,
	"Zhu, Libo" <libo.zhu@intel.com>,
	"Zhou, Chao" <chao.zhou@intel.com>,
	"Wang, Hongbo" <hongbo.wang@intel.com>,
	"Lv, Zhiyuan" <zhiyuan.lv@intel.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Gerd Hoffmann <kraxel@redhat.com>
Subject: RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel
Date: Fri, 20 Nov 2015 07:09:38 +0000
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D15F717855@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <1447963356.4697.184.camel@redhat.com>


> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Friday, November 20, 2015 4:03 AM
> 
> > >
> > > The proposal is therefore that GPU vendors can expose vGPUs to
> > > userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
> > > supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
> > > module (or extension of i915) can register as a vfio bus driver, create
> > > a struct device per vGPU, create an IOMMU group for that device, and
> > > register that device with the vfio-core.  Since we don't rely on the
> > > system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
> > > extension of the same module) can register a "type1" compliant IOMMU
> > > driver into vfio-core.  From the perspective of QEMU then, all of the
> > > existing vfio-pci code is re-used, QEMU remains largely unaware of any
> > > specifics of the vGPU being assigned, and the only necessary change so
> > > far is how QEMU traverses sysfs to find the device and thus the IOMMU
> > > group leading to the vfio group.
> >
> > GVT-g requires pinning guest memory and querying GPA->HPA information,
> > upon which shadow GTTs will be updated accordingly from (GMA->GPA)
> > to (GMA->HPA). So yes, a dummy or simple "type1"-compliant IOMMU
> > can be introduced here just for this requirement.
> >
> > However, there's one tricky point where I'm not sure whether the
> > overall VFIO concept would be violated. GVT-g doesn't require a system
> > IOMMU to function, but the host system may enable the system IOMMU
> > just for hardening purposes. This means two levels of translation
> > exist (GMA->IOVA->HPA), so the dummy IOMMU driver has to ask the
> > system IOMMU driver to allocate IOVAs for VMs and then set up the
> > IOVA->HPA mappings in the IOMMU page table. In this case, multiple
> > VMs' translations are multiplexed in one IOMMU page table.
> >
> > We might need to create some group/sub-group or parent/child concepts
> > among those IOMMUs for thorough permission control.
> 
> My thought here is that this is all abstracted through the vGPU IOMMU
> and device vfio backends.  It's the GPU driver itself, or some vfio
> extension of that driver, mediating access to the device and deciding
> when to configure GPU MMU mappings.  That driver has access to the GPA
> to HVA translations thanks to the type1 compliant IOMMU it implements
> and can pin pages as needed to create GPA to HPA mappings.  That should
> give it all the pieces it needs to fully setup mappings for the vGPU.
> Whether or not there's a system IOMMU is simply an exercise for that
> driver.  It needs to do a DMA mapping operation through the system IOMMU
> the same way for a vGPU as it would for itself, because they are
> in fact one and the same.  The GMA to IOVA mapping seems like an internal
> detail.  I assume the IOVA is some sort of GPA, and the GMA is managed
> through mediation of the device.

Sorry, I'm not familiar with VFIO internals. My original worry is that the
system IOMMU for the GPU may already be claimed by another vfio driver
(e.g. the host kernel wants to harden the gfx driver against the rest of
the sub-systems, regardless of whether a vGPU is created or not). In that
case the vGPU IOMMU driver shouldn't manage the system IOMMU directly.
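
To make this concrete, below is a rough sketch (not GVT-g code, just my
reading of the vfio interfaces involved; names are hypothetical) of what
such a dummy "type1"-compliant backend could look like: it pins guest
pages to learn GPA->HPA, but programs no hardware IOMMU itself:

#include <linux/err.h>
#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/rbtree.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
#include <linux/vfio.h>

/* Sketch only: per-container state tracking pinned GPA->HPA mappings. */
struct vgpu_iommu {
	struct rb_root dma_list;	/* iova -> pinned pages, not shown */
	struct mutex lock;
};

static void *vgpu_iommu_open(unsigned long arg)
{
	struct vgpu_iommu *iommu;

	if (arg != VFIO_TYPE1_IOMMU)
		return ERR_PTR(-EINVAL);

	iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
	if (!iommu)
		return ERR_PTR(-ENOMEM);
	mutex_init(&iommu->lock);
	iommu->dma_list = RB_ROOT;
	return iommu;
}

static long vgpu_iommu_ioctl(void *iommu_data,
			     unsigned int cmd, unsigned long arg)
{
	if (cmd == VFIO_IOMMU_MAP_DMA) {
		struct vfio_iommu_type1_dma_map map;

		if (copy_from_user(&map, (void __user *)arg, sizeof(map)))
			return -EFAULT;
		/*
		 * Pin the pages backing [vaddr, vaddr + size) with
		 * get_user_pages_fast() and record iova (== GPA) ->
		 * struct page in dma_list.  The GPU driver consults this
		 * to build shadow GTT entries (GMA->HPA).  No hardware
		 * IOMMU is touched here -- that remains with the system
		 * IOMMU driver, if one is active.
		 */
		return 0;	/* pinning details omitted */
	}
	return -ENOTTY;
}

static const struct vfio_iommu_driver_ops vgpu_iommu_ops = {
	.name	= "vgpu-type1",		/* hypothetical */
	.owner	= THIS_MODULE,
	.open	= vgpu_iommu_open,
	.ioctl	= vgpu_iommu_ioctl,
	/* .release, .attach_group, .detach_group omitted */
};

/* Registered at module init via vfio_register_iommu_driver(&vgpu_iommu_ops). */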

btw, I'm curious how VFIO coordinates with the system IOMMU driver today
regarding whether an IOMMU is used to control device assignment or used
for kernel hardening. Somehow the two conflict, since different address
spaces are concerned (GPA vs. IOVA)...
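
(For reference, my rough understanding of how a type1 userspace such as
QEMU drives this today -- it chooses the IOVA itself and simply uses the
GPA, which is why assignment only ever deals with one address space.
Abbreviated, error handling omitted, group number illustrative:)

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Sketch: map one chunk of guest RAM for DMA, with IOVA == GPA. */
static void map_guest_ram(void *hva, unsigned long long gpa,
			  unsigned long long size)
{
	int container = open("/dev/vfio/vfio", O_RDWR);
	int group = open("/dev/vfio/26", O_RDWR);	/* illustrative */
	struct vfio_iommu_type1_dma_map map;

	ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
	ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

	memset(&map, 0, sizeof(map));
	map.argsz = sizeof(map);
	map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	map.vaddr = (unsigned long long)hva;	/* HVA backing guest RAM */
	map.iova  = gpa;			/* IOVA == GPA by convention */
	map.size  = size;
	ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
}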

> 
> 
> > > There are a few areas where we know we'll need to extend the VFIO API to
> > > make this work, but it seems like they can all be done generically.  One
> > > is that PCI BARs are described through the VFIO API as regions and each
> > > region has a single flag describing whether mmap (ie. direct mapping) of
> > > that region is possible.  We expect that vGPUs likely need finer
> > > granularity, enabling some areas within a BAR to be trapped and forwarded
> > > as a read or write access for the vGPU-vfio-device module to emulate,
> > > while other regions, like framebuffers or texture regions, are directly
> > > mapped.  I have prototype code to enable this already.
> >
> > Yes, in GVT-g one BAR resource might be partitioned among multiple vGPUs.
> > If VFIO can support such partial resource assignment, that would be great.
> > A similar parent/child concept might also be required here, so that any
> > resource enumerated on a vGPU doesn't break limitations enforced on the
> > physical device.
> 
> To be clear, I'm talking about partitioning of the BAR exposed to the
> guest.  Partitioning of the physical BAR would be managed by the vGPU
> vfio device driver.  For instance when the guest mmap's a section of the
> virtual BAR, the vGPU device driver would map that to a portion of the
> physical device BAR.
> 
> > One unique requirement for GVT-g here, though, is that the vGPU device
> > model needs to know the guest BAR configuration for proper emulation
> > (e.g. to register an I/O emulation handler with KVM). The same applies
> > to guest MSI vectors for virtual interrupt injection. Not sure how this
> > fits into the common VFIO model. Does VFIO allow vendor-specific
> > extensions today?
> 
> As a vfio device driver all config accesses and interrupt configuration
> would be forwarded to you, so I don't see this being a problem.

Sure, nice to know that.
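
For the archive, a rough sketch of how I understand that forwarding: the
vGPU backend registers vfio_device_ops, and every guest access reaches its
callbacks (names hypothetical, emulation details omitted):

#include <linux/vfio.h>

/* Sketch: the vGPU's vfio backend sees every guest config access. */
static ssize_t vgpu_vfio_write(void *device_data, const char __user *buf,
			       size_t count, loff_t *ppos)
{
	/*
	 * *ppos encodes which region is accessed, per the offsets this
	 * driver reported through VFIO_DEVICE_GET_REGION_INFO.  For the
	 * PCI config region, the device model can shadow guest writes --
	 * e.g. BAR programming -- and register matching I/O emulation
	 * handlers with KVM.  Guest MSI setup likewise arrives via the
	 * VFIO_DEVICE_SET_IRQS ioctl, handled in the .ioctl callback.
	 */
	return count;	/* emulation details omitted */
}

static const struct vfio_device_ops vgpu_vfio_ops = {
	.name	= "vfio-vgpu",		/* hypothetical */
	.write	= vgpu_vfio_write,
	/* .open, .release, .read, .ioctl, .mmap omitted */
};

/* Bound to the per-vGPU struct device with
 * vfio_add_group_dev(dev, &vgpu_vfio_ops, vgpu_private_data). */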

Thanks
Kevin
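
PS: on the partial-BAR mmap point earlier in the thread, I could imagine
the region info carrying a list of directly-mappable sub-ranges, along the
lines below (purely illustrative structs, not an existing uAPI):

/* Illustrative only: describe which sub-ranges of a (virtual) BAR may be
 * mmap'ed directly; everything outside them is trapped and emulated. */
struct vgpu_region_mmap_area {
	__u64 offset;	/* offset of the chunk within the region */
	__u64 size;	/* length of the directly-mapped chunk */
};

struct vgpu_region_mmap_info {
	__u32 argsz;
	__u32 nr_areas;			/* e.g. framebuffer, apertures */
	struct vgpu_region_mmap_area areas[];
};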

Thread overview: 48+ messages
2014-07-25  8:31 [Intel-gfx] [Announcement] Updates to XenGT - a Mediated Graphics Passthrough Solution from Intel Jike Song
2014-07-29 10:09 ` [Xen-devel] " Dario Faggioli
2014-07-30  9:39   ` Jike Song
2014-12-04  2:45 ` [Intel-gfx] [Announcement] 2014-Q3 release of " Jike Song
2014-12-04 10:20   ` [Xen-devel] " Fabio Fantoni
2015-01-09  8:51   ` [Intel-gfx] [Announcement] 2015-Q1 " Jike Song
2015-01-12  3:04     ` [Intel-gfx] [Announcement] 2014-Q4 " Jike Song
2015-04-10 13:23     ` [Intel-gfx] [Announcement] 2015-Q1 " Jike Song
2015-07-07  2:49       ` [Intel-gfx] [Announcement] 2015-Q2 " Jike Song
2015-10-27  9:25         ` [Intel-gfx] [Announcement] 2015-Q3 " Jike Song
2015-11-18 18:12           ` Alex Williamson
2015-11-19  4:06             ` Tian, Kevin
2015-11-19  7:22               ` Jike Song
2015-11-19 15:32                 ` Stefano Stabellini
2015-11-19 15:49                   ` Paolo Bonzini
2015-11-19 16:12                     ` Stefano Stabellini
2015-11-19 15:52                   ` Alex Williamson
2015-11-20  2:58                     ` Jike Song
2015-11-20  4:22                       ` Alex Williamson
2015-11-20  5:51                         ` Jike Song
2015-11-20  6:01                           ` Tian, Kevin
2015-11-20 16:40                           ` Alex Williamson
2015-11-23  4:52                             ` [Qemu-devel] " Jike Song
2015-11-19  8:40               ` Gerd Hoffmann
2015-11-19 11:09                 ` Paolo Bonzini
2015-11-20  2:46                   ` Jike Song
2015-11-20  6:12                 ` Tian, Kevin
2015-11-20  8:26                   ` Gerd Hoffmann
2015-11-20  8:36                     ` Tian, Kevin
2015-11-20  8:46                       ` Zhiyuan Lv
2015-12-03  6:57                     ` Tian, Kevin
2015-12-04 10:13                       ` Gerd Hoffmann
2015-11-19 20:02               ` Alex Williamson
2015-11-20  7:09                 ` Tian, Kevin [this message]
2015-11-20 17:03                   ` Alex Williamson
2015-11-20  8:10                 ` Tian, Kevin
2015-11-20 17:25                   ` Alex Williamson
2015-11-23  5:05                     ` Jike Song
2015-11-24 11:19                 ` Daniel Vetter
2015-11-24 11:49                   ` Chris Wilson
2015-11-24 12:38                   ` Gerd Hoffmann
2015-11-24 13:31                     ` Daniel Vetter
2015-11-24 14:12                       ` Gerd Hoffmann
2015-11-24 14:19                         ` Daniel Vetter
2016-01-27  6:21           ` [Intel-gfx] [Announcement] 2015-Q4 " Jike Song
2016-04-28  5:29             ` [Intel-gfx] [Announcement] 2016-Q1 " Jike Song
2016-07-22  5:42               ` [Intel-gfx] [Announcement] 2016-Q2 " Jike Song
2016-11-06 14:59                 ` [Intel-gfx] [Announcement] 2016-Q3 " Jike Song
