xen-devel.lists.xenproject.org archive mirror
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: "Lan, Tianyu" <tianyu.lan@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"jbeulich@suse.com" <jbeulich@suse.com>,
	"sstabellini@kernel.org" <sstabellini@kernel.org>,
	"ian.jackson@eu.citrix.com" <ian.jackson@eu.citrix.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Dong, Eddie" <eddie.dong@intel.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"yang.zhang.wz@gmail.com" <yang.zhang.wz@gmail.com>,
	"anthony.perard@citrix.com" <anthony.perard@citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: Discussion about virtual iommu support for Xen guest
Date: Fri, 3 Jun 2016 15:00:43 +0100
Message-ID: <57518D8B.8090703@citrix.com>
In-Reply-To: <7971988a-cdca-9556-5d61-bf450221fd4c@intel.com>

On 03/06/16 14:09, Lan, Tianyu wrote:
>
>
> On 6/3/2016 7:17 PM, Tian, Kevin wrote:
>>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>>> Sent: Friday, June 03, 2016 2:59 AM
>>>
>>> On 02/06/16 16:03, Lan, Tianyu wrote:
>>>> On 5/27/2016 4:19 PM, Lan Tianyu wrote:
>>>>> On 2016年05月26日 19:35, Andrew Cooper wrote:
>>>>>> On 26/05/16 09:29, Lan Tianyu wrote:
>>>>>>
>>>>>> To be viable going forwards, any solution must work with PVH/HVMLite
>>>>>> as much as HVM.  This alone negates qemu as a viable option.
>>>>>>
>>>>>> From a design point of view, having Xen needing to delegate to qemu
>>>>>> to inject an interrupt into a guest seems backwards.
>>>>>>
>>>>>
>>>>> Sorry, I am not familiar with HVMlite. HVMlite doesn't use Qemu, so
>>>>> the qemu virtual iommu can't work for it. We have to rewrite the
>>>>> virtual iommu in Xen, right?
>>>>>
>>>>>>
>>>>>> A whole lot of this would be easier to reason about if/when we get a
>>>>>> basic root port implementation in Xen, which is necessary for HVMLite,
>>>>>> and which will make the interaction with qemu rather more clean.  It
>>>>>> is probably worth coordinating work in this area.
>>>>>
>>>>> The virtual iommu also should be under basic root port in Xen, right?
>>>>>
>>>>>>
>>>>>> As for the individual issue of 288vcpu support, there are already
>>>>>> issues with 64vcpu guests at the moment. While it is certainly fine
>>>>>> to remove the hard limit at 255 vcpus, there is a lot of other work
>>>>>> required to even get 128vcpu guests stable.
>>>>>
>>>>>
>>>>> Could you give some pointers to these issues? We are enabling support
>>>>> for more vcpus, and a guest can basically boot 255 vcpus without IR
>>>>> support. It would be very helpful to learn about the known issues.
>>>>>
>>>>> We will also add more 128-vcpu tests to our regular testing to find
>>>>> related bugs. Increasing the max vcpu count to 255 should be a good
>>>>> start.
>>>>
>>>> Hi Andrew:
>>>> Could you give more input about the issues with 64 vcpus and what
>>>> needs to be done to make 128vcpu guests stable? We hope to do
>>>> something to improve them.
>>>>
>>>> What's the progress of the PCI host bridge in Xen? In your opinion,
>>>> we should do that first, right? Thanks.
>>>
>>> Very sorry for the delay.
>>>
>>> There are multiple interacting issues here.  On the one side, it would
>>> be useful if we could have a central point of coordination on
>>> PVH/HVMLite work.  Roger - as the person who last did HVMLite work,
>>> would you mind organising that?
>>>
>>> For the qemu/xen interaction, the current state is woeful and a tangled
>>> mess.  I wish to ensure that we don't make any development decisions
>>> which make the situation worse.
>>>
>>> In your case, the two motivations are quite different; I would recommend
>>> dealing with them independently.
>>>
>>> IIRC, the issue with more than 255 cpus and interrupt remapping is that
>>> you can only use x2apic mode with more than 255 cpus, and IOAPIC RTEs
>>> can't be programmed to generate x2apic interrupts?  In principle, if
>>> you don't have an IOAPIC, are there any other issues to be considered?
>>> What happens if you configure the LAPICs in x2apic mode, but have the
>>> IOAPIC deliver xapic interrupts?
>>
>> The key is the APIC ID. The introduction of x2apic made no changes to
>> existing PCI MSI and IOAPIC. PCI MSI/IOAPIC can only send interrupt
>> messages containing an 8-bit APIC ID, which cannot address >255 cpus.
>> Interrupt remapping supports 32-bit APIC IDs, so it is necessary for
>> enabling >255 cpus with x2apic mode.
>>
>> If the LAPIC is in x2apic mode while interrupt remapping is disabled,
>> the IOAPIC cannot deliver interrupts to all cpus in the system if
>> #cpu > 255.
>
> Another key factor: the Linux kernel disables x2apic mode when the max
> APIC ID is > 255 and there is no interrupt remapping. The reason is what
> Kevin said. So booting >255 cpus relies on interrupt remapping.

That is an implementation decision of Linux, not an architectural
requirement.

We need to carefully distinguish the two (even if it doesn't affect the
planned outcome from Xen's point of view), as Linux is not the only
operating system we virtualise.


One interesting issue in this area is plain, no-frills HVMLite domains,
which have an LAPIC but no IOAPIC, as they have no legacy devices/PCI
bus/etc.  In this scenario, no vIOMMU would be required for x2apic mode,
even if the domain had >255 vcpus.
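
As a rough illustration of the two points above, here is a minimal sketch.
It is not Xen or Linux code.  It assumes the xAPIC-compatible MSI address
layout, where the destination APIC ID occupies the 8-bit field at bits 19:12
of a 0xFEExxxxx address.  The names msi_address_for() and
needs_interrupt_remapping() are hypothetical helpers made up for this
example.

```python
# Illustrative sketch only; names and structure are assumptions, not Xen code.
MSI_ADDR_BASE = 0xFEE00000   # fixed upper bits of a compatibility-format MSI address
MSI_DEST_ID_SHIFT = 12       # destination APIC ID sits in bits 19:12
MSI_DEST_ID_MASK = 0xFF      # the field is only 8 bits wide


def msi_address_for(apic_id: int) -> int:
    """Build a compatibility-format MSI address targeting apic_id.

    Raises ValueError when the APIC ID does not fit the 8-bit destination
    field, which is the case interrupt remapping exists to cover.
    """
    if apic_id < 0:
        raise ValueError("APIC ID must be non-negative")
    if apic_id > MSI_DEST_ID_MASK:
        raise ValueError(
            f"APIC ID {apic_id} does not fit in the 8-bit MSI destination field"
        )
    return MSI_ADDR_BASE | (apic_id << MSI_DEST_ID_SHIFT)


def needs_interrupt_remapping(max_apic_id: int, has_external_irq_sources: bool) -> bool:
    """Architectural view, as opposed to one OS's policy.

    Remapping only matters if something (IOAPIC, MSI-capable device) must
    address a CPU whose APIC ID exceeds 8 bits.  A domain with LAPICs but
    no IOAPIC or PCI devices has no such source.
    """
    return has_external_irq_sources and max_apic_id > MSI_DEST_ID_MASK


if __name__ == "__main__":
    print(hex(msi_address_for(1)))
    print(needs_interrupt_remapping(287, has_external_irq_sources=False))
    print(needs_interrupt_remapping(287, has_external_irq_sources=True))
```

The second helper deliberately takes the presence of external interrupt
sources as an input, so the "no IOAPIC, no PCI bus" HVMLite case stays
distinct from a guest OS choosing to refuse x2apic without remapping.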

~Andrew

