From: George Dunlap <george.dunlap@citrix.com>
To: David Vrabel <david.vrabel@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	Kevin Tian <kevin.tian@intel.com>
Cc: Lars Kurth <lars.kurth@citrix.com>, Feng Wu <feng.wu@intel.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: vmx: VT-d posted-interrupt core logic handling
Date: Thu, 10 Mar 2016 10:46:55 +0000
Message-ID: <56E1509F.1060108@citrix.com>
In-Reply-To: <56E14DDA.3040500@citrix.com>

On 10/03/16 10:35, David Vrabel wrote:
> On 10/03/16 10:18, Jan Beulich wrote:
>>>>> On 10.03.16 at 11:05, <kevin.tian@intel.com> wrote:
>>>>  From: Tian, Kevin
>>>> Sent: Thursday, March 10, 2016 5:20 PM
>>>>
>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>> Sent: Thursday, March 10, 2016 5:06 PM
>>>>>
>>>>>
>>>>>> There are many linked-list usages in the Xen hypervisor today,
>>>>>> each with a different theoretical maximum size. The closest
>>>>>> one to PI is probably the usage in tmem (pool->share_list),
>>>>>> which is page based and so could grow 'overly large'. Other
>>>>>> examples are orders of magnitude smaller, e.g. s->ioreq_vcpu_list
>>>>>> in the ioreq server (which could be 8K in the above example), and
>>>>>> d->arch.hvm_domain.msixtbl_list in MSI-X virtualization (which
>>>>>> could be 2^11 per the spec). Do we also want to construct some
>>>>>> artificial scenarios to examine those, since in actual operation
>>>>>> lists with thousands of entries may also become a problem?
>>>>>>
>>>>>> I just want to figure out how best we can handle all the related
>>>>>> linked-list usages in the current hypervisor.
>>>>>
>>>>> As you say, those are (perhaps with the exception of tmem, which
>>>>> isn't supported anyway due to XSA-15, and which therefore also
>>>>> isn't on by default) in the order of a few thousand list elements.
>>>>> And as mentioned above, different bounds apply for lists traversed
>>>>> in interrupt context versus those traversed only in "normal" context.
>>>>>
>>>>
>>>> That's a good point. Interrupt context should have more restrictions.
>>>
>>> Hi, Jan,
>>>
>>> I've been thinking about your earlier idea of an evenly distributed list:
>>>
>>> --
>>> Ah, right, I think that limitation was mentioned before, yet I've
>>> forgotten about it again. But that only slightly alters the
>>> suggestion: distributing vCPU-s evenly would then require changing
>>> their placement on the pCPU in the course of entering
>>> blocked state.
>>> --
>>>
>>> Actually, after more thinking, there is no hard requirement that
>>> the vcpu must block on the pcpu which is configured in the 'NDST'
>>> field of that vcpu's PI descriptor. What really matters is that the
>>> vcpu is added to the linked list of that very pcpu, so that when a
>>> PI notification comes we can always find the vcpu struct in that
>>> pcpu's linked list. Of course, one drawback of such a placement is
>>> the additional IPI incurred in the wake-up path.
>>>
>>> Then one possible optimized policy within vmx_vcpu_block could 
>>> be:
>>>
>>> (Say PCPU1 is the pcpu which VCPU1 is currently blocked on.)
>>> - As long as the number of vcpus in the linked list on PCPU1 is
>>> below a threshold (say 16), add VCPU1 to that list, with NDST set
>>> to PCPU1. Upon a PI notification on PCPU1, the local linked list is
>>> searched to find VCPU1, which is then unblocked on PCPU1;
>>>
>>> - Otherwise, add VCPU1 to the list of a PCPU2 chosen by a simple
>>> distribution algorithm (e.g. based on vcpu_id/vm_id). VCPU1 still
>>> blocks on PCPU1, but NDST is set to PCPU2. Upon a notification on
>>> PCPU2, the local linked list is searched to find VCPU1, and an IPI
>>> is then sent to PCPU1 to unblock VCPU1;
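
As a purely illustrative sketch of the placement policy described
above (all structures, fields and helpers below are made up for the
example, not taken from the actual series), the blocking side might
look roughly like this:

    /* Hypothetical per-pcpu list of blocked vcpus with PI set up. */
    struct pi_blocked_list {
        spinlock_t lock;
        struct list_head head;
        unsigned int nr_entries;
    };
    static DEFINE_PER_CPU(struct pi_blocked_list, pi_blocked);

    #define PI_LIST_THRESHOLD 16   /* the example threshold from above */

    /* Hypothetical: called when VCPU1 (v) blocks on v->processor (PCPU1).
     * 'pi_blocked_entry' and 'pi_desc.ndst' are illustrative vcpu fields. */
    static void pi_place_blocked_vcpu(struct vcpu *v)
    {
        unsigned int dest = v->processor;                    /* PCPU1 */
        struct pi_blocked_list *pbl = &per_cpu(pi_blocked, dest);

        if ( pbl->nr_entries >= PI_LIST_THRESHOLD )
        {
            /* Spill over to a PCPU2 picked by a simple hash; the vcpu
             * itself keeps blocking on PCPU1.  (Assumes online pcpus
             * are numbered 0 .. num_online_cpus() - 1 for simplicity.) */
            dest = (v->domain->domain_id + v->vcpu_id) % num_online_cpus();
            pbl = &per_cpu(pi_blocked, dest);
        }

        spin_lock(&pbl->lock);
        list_add_tail(&v->pi_blocked_entry, &pbl->head);
        pbl->nr_entries++;
        spin_unlock(&pbl->lock);

        /* Route the posted-interrupt notification to 'dest'. */
        write_atomic(&v->pi_desc.ndst, cpu_physical_id(dest));
    }
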
>>
>> Sounds possible, if the lock handling can be got right. But of
>> course there can't be any hard limit like 16, at least not alone
>> (on a system with extremely many mostly idle vCPU-s we'd
>> need to allow larger counts - see my earlier explanations in this
>> regard).
> 
> You could also consider only waking the first N VCPUs and just making
> the rest runnable.  If you wake more VCPUs than PCPUs at the same time
> most of them won't actually be scheduled.

"Waking" a vcpu means "changing from blocked to runnable", so those two
things are the same.  And I can't figure out what you mean instead --
can you elaborate?

Waking up 1000 vcpus is going to take strictly more time than checking
whether there's a PI interrupt pending on 1000 vcpus to see if they need
to be woken up.
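
To put some concreteness behind that: on a PI notification, the handler
only needs a cheap per-vcpu test on that pcpu's blocked list, and only
pays the full wakeup cost for vcpus that actually have an interrupt
posted.  A rough sketch, reusing the hypothetical per-cpu list and vcpu
fields from the sketch above (again, names are illustrative, not the
exact ones from the series):

    /* Hypothetical PI notification handler body on this pcpu
     * (locking elided for brevity). */
    struct pi_blocked_list *pbl = &this_cpu(pi_blocked);
    struct vcpu *v;

    list_for_each_entry ( v, &pbl->head, pi_blocked_entry )
        /* Cheap check: is the "outstanding notification" (ON) bit of
         * the PI descriptor set? */
        if ( v->pi_desc.on )
            vcpu_unblock(v);   /* only now pay the cost of a wakeup */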

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Thread overview: 53+ messages
2016-02-29  3:00 [PATCH v14 0/2] Add VT-d Posted-Interrupts support Feng Wu
2016-02-29  3:00 ` [PATCH v14 1/2] vmx: VT-d posted-interrupt core logic handling Feng Wu
2016-02-29 13:33   ` Jan Beulich
2016-02-29 13:52     ` Dario Faggioli
2016-03-01  5:39       ` Wu, Feng
2016-03-01  9:24         ` Jan Beulich
2016-03-01 10:16     ` George Dunlap
2016-03-01 13:06       ` Wu, Feng
2016-03-01  5:24   ` Tian, Kevin
2016-03-01  5:39     ` Wu, Feng
2016-03-04 22:00   ` Ideas " Konrad Rzeszutek Wilk
2016-03-07 11:21     ` George Dunlap
2016-03-07 15:53       ` Konrad Rzeszutek Wilk
2016-03-07 16:19         ` Dario Faggioli
2016-03-07 20:23           ` Konrad Rzeszutek Wilk
2016-03-08 12:02         ` George Dunlap
2016-03-08 13:10           ` Wu, Feng
2016-03-08 14:42             ` George Dunlap
2016-03-08 15:42               ` Jan Beulich
2016-03-08 17:05                 ` George Dunlap
2016-03-08 17:26                   ` Jan Beulich
2016-03-08 18:38                     ` George Dunlap
2016-03-09  5:06                       ` Wu, Feng
2016-03-09 13:39                       ` Jan Beulich
2016-03-09 16:01                         ` George Dunlap
2016-03-09 16:31                           ` Jan Beulich
2016-03-09 16:23                         ` On setting clear criteria for declaring a feature acceptable (was "vmx: VT-d posted-interrupt core logic handling") George Dunlap
2016-03-09 16:58                           ` On setting clear criteria for declaring a feature acceptable Jan Beulich
2016-03-09 18:02                           ` On setting clear criteria for declaring a feature acceptable (was "vmx: VT-d posted-interrupt core logic handling") David Vrabel
2016-03-10  1:15                             ` Wu, Feng
2016-03-10  9:30                             ` George Dunlap
2016-03-10  5:09                           ` Tian, Kevin
2016-03-10  8:07                             ` vmx: VT-d posted-interrupt core logic handling Jan Beulich
2016-03-10  8:43                               ` Tian, Kevin
2016-03-10  9:05                                 ` Jan Beulich
2016-03-10  9:20                                   ` Tian, Kevin
2016-03-10 10:05                                   ` Tian, Kevin
2016-03-10 10:18                                     ` Jan Beulich
2016-03-10 10:35                                       ` David Vrabel
2016-03-10 10:46                                         ` George Dunlap [this message]
2016-03-10 11:16                                           ` David Vrabel
2016-03-10 11:49                                             ` George Dunlap
2016-03-10 13:24                                             ` Jan Beulich
2016-03-10 11:00                                       ` George Dunlap
2016-03-10 11:21                                         ` Dario Faggioli
2016-03-10 13:36                                     ` Wu, Feng
2016-05-17 13:27                                       ` Konrad Rzeszutek Wilk
2016-05-19  7:22                                         ` Wu, Feng
2016-03-10 10:41                               ` George Dunlap
2016-03-09  5:22                   ` Ideas Re: [PATCH v14 1/2] " Wu, Feng
2016-03-09 11:25                     ` George Dunlap
2016-03-09 12:06                       ` Wu, Feng
2016-02-29  3:00 ` [PATCH v14 2/2] Add a command line parameter for VT-d posted-interrupts Feng Wu
