xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Lan Tianyu <tianyu.lan@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "yang.zhang.wz@gmail.com" <yang.zhang.wz@gmail.com>,
	xuquan8@huawei.com, Stefano Stabellini <sstabellini@kernel.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	"ian.jackson@eu.citrix.com" <ian.jackson@eu.citrix.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	"anthony.perard@citrix.com" <anthony.perard@citrix.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: Xen virtual IOMMU high level design doc
Date: Wed, 31 Aug 2016 16:39:24 +0800	[thread overview]
Message-ID: <57C697BC.3090506@intel.com> (raw)
In-Reply-To: <57BEEE8F0200007800108EEE@prv-mh.provo.novell.com>

Hi Jan:
	Sorry for later response. Thanks a lot for your comments.

On 2016年08月25日 19:11, Jan Beulich wrote:
>>>> On 17.08.16 at 14:05, <tianyu.lan@intel.com> wrote:
>> 1 Motivation for Xen vIOMMU
>> ============================================================================
>> ===
>> 1.1 Enable more than 255 vcpu support
>> HPC virtualization requires more than 255 vcpus support in a single VM
>> to meet parallel computing requirement. More than 255 vcpus support
>> requires interrupt remapping capability present on vIOMMU to deliver
>> interrupt to #vcpu >255 Otherwise Linux guest fails to boot up with >255
>> vcpus if interrupt remapping is absent.
> 
> I continue to question this as a valid motivation at this point in
> time, for the reasons Andrew has been explaining.

If we want to support Linux guest with >255 vcpus, interrupt remapping
is necessary.

From Linux commit introducing x2apic and IR mode, it said IR was
a pre-requisite for enabling x2apic mode in the CPU.
https://lwn.net/Articles/289881/

So far, no sure behavior on the other OS. We may watch Windows guest
behavior later on KVM and there is still a bug to run Windows guest with
IR function on KVM.


> 
>> 2. Xen vIOMMU Architecture
>> ============================================================================
>> ====
>>
>> * vIOMMU will be inside Xen hypervisor for following factors
>> 	1) Avoid round trips between Qemu and Xen hypervisor
>> 	2) Ease of integration with the rest of the hypervisor
>> 	3) HVMlite/PVH doesn't use Qemu
>> * Dummy xen-vIOMMU in Qemu as a wrapper of new hypercall to create
>> /destory vIOMMU in hypervisor and deal with virtual PCI device's 2th
>> level translation.
> 
> How does the create/destroy part of this match up with 3) right
> ahead of it?

The create/destroy hypercalls will work for both hvm and hvmlite.
Suppose hvmlite has tool stack(E.G libxl) which can call new hypercalls
to create or destroy virtual iommu in hypervisor.

> 
>> 3 Xen hypervisor
>> ==========================================================================
>>
>> 3.1 New hypercall XEN_SYSCTL_viommu_op
>> 1) Definition of "struct xen_sysctl_viommu_op" as new hypercall parameter.
>>
>> struct xen_sysctl_viommu_op {
>> 	u32 cmd;
>> 	u32 domid;
>> 	union {
>> 		struct {
>> 			u32 capabilities;
>> 		} query_capabilities;
>> 		struct {
>> 			u32 capabilities;
>> 			u64 base_address;
>> 		} create_iommu;
>> 		struct {
>> 			u8  bus;
>> 			u8  devfn;
> 
> Please can we avoid introducing any new interfaces without segment/
> domain value, even if for now it'll be always zero?

Sure. Will add segment field.

> 
>> 			u64 iova;
>> 			u64 translated_addr;
>> 			u64 addr_mask; /* Translation page size */
>> 			IOMMUAccessFlags permisson;		
>> 		} 2th_level_translation;
> 
> I suppose "translated_addr" is an output here, but for the following
> fields this already isn't clear. Please add IN and OUT annotations for
> clarity.
> 
> Also, may I suggest to name this "l2_translation"? (But there are
> other implementation specific things to be considered here, which
> I guess don't belong into a design doc discussion.)

How about this?
        struct {
	    /* IN parameters. */
	    u8  segment;
            u8  bus;
            u8  devfn;
            u64 iova;
	    /* Out parameters. */
            u64 translated_addr;
            u64 addr_mask; /* Translation page size */
            IOMMUAccessFlags permisson;
        } l2_translation;

> 
>> };
>>
>> typedef enum {
>> 	IOMMU_NONE = 0,
>> 	IOMMU_RO   = 1,
>> 	IOMMU_WO   = 2,
>> 	IOMMU_RW   = 3,
>> } IOMMUAccessFlags;
>>
>>
>> Definition of VIOMMU subops:
>> #define XEN_SYSCTL_viommu_query_capability		0
>> #define XEN_SYSCTL_viommu_create			1
>> #define XEN_SYSCTL_viommu_destroy			2
>> #define XEN_SYSCTL_viommu_dma_translation_for_vpdev 	3
>>
>> Definition of VIOMMU capabilities
>> #define XEN_VIOMMU_CAPABILITY_1nd_level_translation	(1 << 0)
>> #define XEN_VIOMMU_CAPABILITY_2nd_level_translation	(1 << 1)
> 
> l1 and l2 respectively again, please.

Will update.

> 
>> 3.3 Interrupt remapping
>> Interrupts from virtual devices and physical devices will be delivered
>> to vlapic from vIOAPIC and vMSI. It needs to add interrupt remapping
>> hooks in the vmsi_deliver() and ioapic_deliver() to find target vlapic
>> according interrupt remapping table. The following diagram shows the logic.
> 
> Missing diagram or stale sentence?

Sorry. It's stale sentence and moved the diagram to 2.2 Interrupt
remapping overview.

> 
>> 3.5 Implementation consideration
>> Linux Intel IOMMU driver will fail to be loaded without 2th level
>> translation support even if interrupt remapping and 1th level
>> translation are available. This means it's needed to enable 2th level
>> translation first before other functions.
> 
> Is there a reason for this? I.e. do they unconditionally need that
> functionality?

Yes, Linux intel IOMMU driver unconditionally needs l2 translation.
Driver checks whether there is a valid sagaw(supported Adjusted Guest
Address Widths) during initializing IOMMU data struct and return error
if not.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2016-08-31  8:53 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-26  8:29 Discussion about virtual iommu support for Xen guest Lan Tianyu
2016-05-26  8:42 ` Dong, Eddie
2016-05-27  2:26   ` Lan Tianyu
2016-05-27  8:11     ` Tian, Kevin
2016-05-26 11:35 ` Andrew Cooper
2016-05-27  8:19   ` Lan Tianyu
2016-06-02 15:03     ` Lan, Tianyu
2016-06-02 18:58       ` Andrew Cooper
2016-06-03 11:01         ` Current PVH/HVMlite work and planning (was :Re: Discussion about virtual iommu support for Xen guest) Roger Pau Monne
2016-06-03 11:21           ` Tian, Kevin
2016-06-03 11:52             ` Roger Pau Monne
2016-06-03 12:11               ` Tian, Kevin
2016-06-03 16:56                 ` Stefano Stabellini
2016-06-07  5:48                   ` Tian, Kevin
2016-06-03 11:17         ` Discussion about virtual iommu support for Xen guest Tian, Kevin
2016-06-03 13:09           ` Lan, Tianyu
2016-06-03 14:00             ` Andrew Cooper
2016-06-03 13:51           ` Andrew Cooper
2016-06-03 14:31             ` Jan Beulich
2016-06-03 17:14             ` Stefano Stabellini
2016-06-07  5:14               ` Tian, Kevin
2016-06-07  7:26                 ` Jan Beulich
2016-06-07 10:07                 ` Stefano Stabellini
2016-06-08  8:11                   ` Tian, Kevin
2016-06-26 13:42                     ` Lan, Tianyu
2016-06-29  3:04                       ` Tian, Kevin
2016-07-05 13:37                         ` Lan, Tianyu
2016-07-05 13:57                           ` Jan Beulich
2016-07-05 14:19                             ` Lan, Tianyu
2016-08-17 12:05                             ` Xen virtual IOMMU high level design doc Lan, Tianyu
2016-08-17 12:42                               ` Paul Durrant
2016-08-18  2:57                                 ` Lan, Tianyu
2016-08-25 11:11                               ` Jan Beulich
2016-08-31  8:39                                 ` Lan Tianyu [this message]
2016-08-31 12:02                                   ` Jan Beulich
2016-09-01  1:26                                     ` Tian, Kevin
2016-09-01  2:35                                     ` Lan Tianyu
2016-09-15 14:22                               ` Lan, Tianyu
2016-10-05 18:36                                 ` Konrad Rzeszutek Wilk
2016-10-11  1:52                                   ` Lan Tianyu
2016-11-23 18:19                               ` Edgar E. Iglesias
2016-11-23 19:09                                 ` Stefano Stabellini
2016-11-24  2:00                                   ` Tian, Kevin
2016-11-24  4:09                                     ` Edgar E. Iglesias
2016-11-24  6:49                                       ` Lan Tianyu
2016-11-24 13:37                                         ` Edgar E. Iglesias
2016-11-25  2:01                                           ` Xuquan (Quan Xu)
2016-11-25  5:53                                           ` Lan, Tianyu
2016-10-18 14:14                             ` Xen virtual IOMMU high level design doc V2 Lan Tianyu
2016-10-18 19:17                               ` Andrew Cooper
2016-10-20  9:53                                 ` Tian, Kevin
2016-10-20 18:10                                   ` Andrew Cooper
2016-10-20 14:17                                 ` Lan Tianyu
2016-10-20 20:36                                   ` Andrew Cooper
2016-10-22  7:32                                     ` Lan, Tianyu
2016-10-26  9:39                                       ` Jan Beulich
2016-10-26 15:03                                         ` Lan, Tianyu
2016-11-03 15:41                                         ` Lan, Tianyu
2016-10-28 15:36                                     ` Lan Tianyu
2016-10-18 20:26                               ` Konrad Rzeszutek Wilk
2016-10-20 10:11                                 ` Tian, Kevin
2016-10-20 14:56                                 ` Lan, Tianyu
2016-10-26  9:36                               ` Jan Beulich
2016-10-26 14:53                                 ` Lan, Tianyu
2016-11-17 15:36                             ` Xen virtual IOMMU high level design doc V3 Lan Tianyu
2016-11-18 19:43                               ` Julien Grall
2016-11-21  2:21                                 ` Lan, Tianyu
2016-11-21 13:17                                   ` Julien Grall
2016-11-21 18:24                                     ` Stefano Stabellini
2016-11-21  7:05                               ` Tian, Kevin
2016-11-23  1:36                                 ` Lan Tianyu
2016-11-21 13:41                               ` Andrew Cooper
2016-11-22  6:02                                 ` Tian, Kevin
2016-11-22  8:32                                 ` Lan Tianyu
2016-11-22 10:24                               ` Jan Beulich
2016-11-24  2:34                                 ` Lan Tianyu
2016-06-03 19:51             ` Is: 'basic pci bridge and root device support. 'Was:Re: Discussion about virtual iommu support for Xen guest Konrad Rzeszutek Wilk
2016-06-06  9:55               ` Jan Beulich
2016-06-06 17:25                 ` Konrad Rzeszutek Wilk
2016-08-02 15:15     ` Lan, Tianyu
2016-05-27  8:35   ` Tian, Kevin
2016-05-27  8:46     ` Paul Durrant
2016-05-27  9:39       ` Tian, Kevin
2016-05-31  9:43   ` George Dunlap
2016-05-27  2:26 ` Yang Zhang
2016-05-27  8:13   ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57C697BC.3090506@intel.com \
    --to=tianyu.lan@intel.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anthony.perard@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    --cc=xuquan8@huawei.com \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).