xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	"Lan, Tianyu" <tianyu.lan@intel.com>
Cc: "yang.zhang.wz@gmail.com" <yang.zhang.wz@gmail.com>,
	"xuquan8@huawei.com" <xuquan8@huawei.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Jan Beulich <JBeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	"ian.jackson@eu.citrix.com" <ian.jackson@eu.citrix.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"anthony.perard@citrix.com" <anthony.perard@citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: Xen virtual IOMMU high level design doc V2
Date: Thu, 20 Oct 2016 10:11:01 +0000	[thread overview]
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D18DFD4F3F@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <20161018202631.GO30736@x230.dumpdata.com>

> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Wednesday, October 19, 2016 4:27 AM
> >
> > 2) For physical PCI device
> > DMA operations go though physical IOMMU directly and IO page table for
> > IOVA->HPA should be loaded into physical IOMMU. When guest updates
> > l2 Page-table pointer field, it provides IO page table for
> > IOVA->GPA. vIOMMU needs to shadow l2 translation table, translate
> > GPA->HPA and update shadow page table(IOVA->HPA) pointer to l2
> > Page-table pointer to context entry of physical IOMMU.
> >
> > Now all PCI devices in same hvm domain share one IO page table
> > (GPA->HPA) in physical IOMMU driver of Xen. To support l2
> > translation of vIOMMU, IOMMU driver need to support multiple address
> > spaces per device entry. Using existing IO page table(GPA->HPA)
> > defaultly and switch to shadow IO page table(IOVA->HPA) when l2
> 
> defaultly?
> 
> > translation function is enabled. These change will not affect current
> > P2M logic.
> 
> What happens if the guests IO page tables have incorrect values?
> 
> For example the guest sets up the pagetables to cover some section
> of HPA ranges (which are all good and permitted). But then during execution
> the guest kernel decides to muck around with the pagetables and adds an HPA
> range that is outside what the guest has been allocated.
> 
> What then?

Shadow PTE is controlled by hypervisor. Whatever IOVA->GPA mapping in
guest PTE must be validated (IOVA->GPA->HPA) before updating into the
shadow PTE. So regardless of when guest mucks its PTE, the operation is
always trapped and validated. Why do you think there is a problem?

Also guest only sees GPA. All it can operate is GPA ranges.

> >
> > 3.3 Interrupt remapping
> > Interrupts from virtual devices and physical devices will be delivered
> > to vlapic from vIOAPIC and vMSI. It needs to add interrupt remapping
> > hooks in the vmsi_deliver() and ioapic_deliver() to find target vlapic
> > according interrupt remapping table.
> >
> >
> > 3.4 l1 translation
> > When nested translation is enabled, any address generated by l1
> > translation is used as the input address for nesting with l2
> > translation. Physical IOMMU needs to enable both l1 and l2 translation
> > in nested translation mode(GVA->GPA->HPA) for passthrough
> > device.
> >
> > VT-d context entry points to guest l1 translation table which
> > will be nest-translated by l2 translation table and so it
> > can be directly linked to context entry of physical IOMMU.
> 
> I think this means that the shared_ept will be disabled?
> >
> What about different versions of contexts? Say the V1 is exposed
> to guest but the hardware supports V2? Are there any flags that have
> swapped positions? Or is it pretty backwards compatible?

yes, backward compatible.

> >
> >
> > 3.5 Implementation consideration
> > VT-d spec doesn't define a capability bit for the l2 translation.
> > Architecturally there is no way to tell guest that l2 translation
> > capability is not available. Linux Intel IOMMU driver thinks l2
> > translation is always available when VTD exits and fail to be loaded
> > without l2 translation support even if interrupt remapping and l1
> > translation are available. So it needs to enable l2 translation first
> 
> I am lost on that sentence. Are you saying that it tries to load
> the IOVA and if they fail.. then it keeps on going? What is the result
> of this? That you can't do IOVA (so can't use vfio ?)

It's about VT-d capability. VT-d supports both 1st-level and 2nd-level 
translation, however only the 1st-level translation can be optionally
reported through a capability bit. There is no capability bit to say
a version doesn't support 2nd-level translation. The implication is
that, as long as a vIOMMU is exposed, guest IOMMU driver always
assumes IOVA capability available thru 2nd level translation. 

So we can first emulate a vIOMMU w/ only 2nd-level capability, and
then extend it to support 1st-level and interrupt remapping, but cannot 
do the reverse direction. I think Tianyu's point is more to describe 
enabling sequence based on this fact. :-)

> > 4.1 Qemu vIOMMU framework
> > Qemu has a framework to create virtual IOMMU(e.g. virtual intel VTD and
> > AMD IOMMU) and report in guest ACPI table. So for Xen side, a dummy
> > xen-vIOMMU wrapper is required to connect with actual vIOMMU in Xen.
> > Especially for l2 translation of virtual PCI device because
> > emulations of virtual PCI devices are in the Qemu. Qemu's vIOMMU
> > framework provides callback to deal with l2 translation when
> > DMA operations of virtual PCI devices happen.
> 
> You say AMD and Intel. This sounds quite OS agnostic. Does it mean you
> could expose an vIOMMU to a guest and actually use the AMD IOMMU
> in the hypervisor?

Did you mean "expose an Intel vIOMMU to guest and then use physical
AMD IOMMU in hypervisor"? I didn't think about this, but what's the value
of doing so? :-)
 
Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2016-10-20 10:11 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-26  8:29 Discussion about virtual iommu support for Xen guest Lan Tianyu
2016-05-26  8:42 ` Dong, Eddie
2016-05-27  2:26   ` Lan Tianyu
2016-05-27  8:11     ` Tian, Kevin
2016-05-26 11:35 ` Andrew Cooper
2016-05-27  8:19   ` Lan Tianyu
2016-06-02 15:03     ` Lan, Tianyu
2016-06-02 18:58       ` Andrew Cooper
2016-06-03 11:01         ` Current PVH/HVMlite work and planning (was :Re: Discussion about virtual iommu support for Xen guest) Roger Pau Monne
2016-06-03 11:21           ` Tian, Kevin
2016-06-03 11:52             ` Roger Pau Monne
2016-06-03 12:11               ` Tian, Kevin
2016-06-03 16:56                 ` Stefano Stabellini
2016-06-07  5:48                   ` Tian, Kevin
2016-06-03 11:17         ` Discussion about virtual iommu support for Xen guest Tian, Kevin
2016-06-03 13:09           ` Lan, Tianyu
2016-06-03 14:00             ` Andrew Cooper
2016-06-03 13:51           ` Andrew Cooper
2016-06-03 14:31             ` Jan Beulich
2016-06-03 17:14             ` Stefano Stabellini
2016-06-07  5:14               ` Tian, Kevin
2016-06-07  7:26                 ` Jan Beulich
2016-06-07 10:07                 ` Stefano Stabellini
2016-06-08  8:11                   ` Tian, Kevin
2016-06-26 13:42                     ` Lan, Tianyu
2016-06-29  3:04                       ` Tian, Kevin
2016-07-05 13:37                         ` Lan, Tianyu
2016-07-05 13:57                           ` Jan Beulich
2016-07-05 14:19                             ` Lan, Tianyu
2016-08-17 12:05                             ` Xen virtual IOMMU high level design doc Lan, Tianyu
2016-08-17 12:42                               ` Paul Durrant
2016-08-18  2:57                                 ` Lan, Tianyu
2016-08-25 11:11                               ` Jan Beulich
2016-08-31  8:39                                 ` Lan Tianyu
2016-08-31 12:02                                   ` Jan Beulich
2016-09-01  1:26                                     ` Tian, Kevin
2016-09-01  2:35                                     ` Lan Tianyu
2016-09-15 14:22                               ` Lan, Tianyu
2016-10-05 18:36                                 ` Konrad Rzeszutek Wilk
2016-10-11  1:52                                   ` Lan Tianyu
2016-11-23 18:19                               ` Edgar E. Iglesias
2016-11-23 19:09                                 ` Stefano Stabellini
2016-11-24  2:00                                   ` Tian, Kevin
2016-11-24  4:09                                     ` Edgar E. Iglesias
2016-11-24  6:49                                       ` Lan Tianyu
2016-11-24 13:37                                         ` Edgar E. Iglesias
2016-11-25  2:01                                           ` Xuquan (Quan Xu)
2016-11-25  5:53                                           ` Lan, Tianyu
2016-10-18 14:14                             ` Xen virtual IOMMU high level design doc V2 Lan Tianyu
2016-10-18 19:17                               ` Andrew Cooper
2016-10-20  9:53                                 ` Tian, Kevin
2016-10-20 18:10                                   ` Andrew Cooper
2016-10-20 14:17                                 ` Lan Tianyu
2016-10-20 20:36                                   ` Andrew Cooper
2016-10-22  7:32                                     ` Lan, Tianyu
2016-10-26  9:39                                       ` Jan Beulich
2016-10-26 15:03                                         ` Lan, Tianyu
2016-11-03 15:41                                         ` Lan, Tianyu
2016-10-28 15:36                                     ` Lan Tianyu
2016-10-18 20:26                               ` Konrad Rzeszutek Wilk
2016-10-20 10:11                                 ` Tian, Kevin [this message]
2016-10-20 14:56                                 ` Lan, Tianyu
2016-10-26  9:36                               ` Jan Beulich
2016-10-26 14:53                                 ` Lan, Tianyu
2016-11-17 15:36                             ` Xen virtual IOMMU high level design doc V3 Lan Tianyu
2016-11-18 19:43                               ` Julien Grall
2016-11-21  2:21                                 ` Lan, Tianyu
2016-11-21 13:17                                   ` Julien Grall
2016-11-21 18:24                                     ` Stefano Stabellini
2016-11-21  7:05                               ` Tian, Kevin
2016-11-23  1:36                                 ` Lan Tianyu
2016-11-21 13:41                               ` Andrew Cooper
2016-11-22  6:02                                 ` Tian, Kevin
2016-11-22  8:32                                 ` Lan Tianyu
2016-11-22 10:24                               ` Jan Beulich
2016-11-24  2:34                                 ` Lan Tianyu
2016-06-03 19:51             ` Is: 'basic pci bridge and root device support. 'Was:Re: Discussion about virtual iommu support for Xen guest Konrad Rzeszutek Wilk
2016-06-06  9:55               ` Jan Beulich
2016-06-06 17:25                 ` Konrad Rzeszutek Wilk
2016-08-02 15:15     ` Lan, Tianyu
2016-05-27  8:35   ` Tian, Kevin
2016-05-27  8:46     ` Paul Durrant
2016-05-27  9:39       ` Tian, Kevin
2016-05-31  9:43   ` George Dunlap
2016-05-27  2:26 ` Yang Zhang
2016-05-27  8:13   ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AADFC41AFE54684AB9EE6CBC0274A5D18DFD4F3F@SHSMSX101.ccr.corp.intel.com \
    --to=kevin.tian@intel.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anthony.perard@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jun.nakajima@intel.com \
    --cc=konrad.wilk@oracle.com \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=tianyu.lan@intel.com \
    --cc=xen-devel@lists.xensource.com \
    --cc=xuquan8@huawei.com \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).