xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Juergen Gross <jgross@suse.com>
To: George Dunlap <george.dunlap@citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	David Vrabel <david.vrabel@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Mon, 27 Jul 2015 14:01:41 +0200	[thread overview]
Message-ID: <55B61DA5.5030903@suse.com> (raw)
In-Reply-To: <55B611F1.80508@citrix.com>

On 07/27/2015 01:11 PM, George Dunlap wrote:
> On 07/27/2015 11:54 AM, Juergen Gross wrote:
>> On 07/27/2015 12:43 PM, George Dunlap wrote:
>>> On Mon, Jul 27, 2015 at 5:35 AM, Juergen Gross <jgross@suse.com> wrote:
>>>> On 07/24/2015 06:44 PM, Boris Ostrovsky wrote:
>>>>>
>>>>> On 07/24/2015 12:39 PM, Juergen Gross wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> I don't say mangling cpuids can't solve the scheduling problem. It
>>>>>> surely can. But it can't solve the scheduling problem without hiding
>>>>>> information like number of sockets or cores which might be required
>>>>>> for license purposes. If we don't care, fine.
>>>>>>
>>>>>
>>>>> (this is somewhat repeating the email I just sent)
>>>>>
>>>>> Why can's we construct socket/core info with CPUID (and *possibly* ACPI
>>>>> changes) that we present a reasonable (licensing-wise) picture?
>>>>>
>>>>> Can you suggest an example where it will not work and then maybe we can
>>>>> figure something out?
>>>>
>>>>
>>>> Let's assume a software with license based on core count. You have a
>>>> system with a 2 8 core processors and hyperthreads enabled, summing up
>>>> to 32 logical processors. Your license is valid for up to 16 cores, so
>>>> running the software on bare metal on your system is fine.
>>>>
>>>> Now you are running the software inside a virtual machine with 24 vcpus
>>>> in a cpupool with 24 logical cpus limited to 12 cores (6 cores of each
>>>> processor). As we have to hide hyperthreading in order to not to have
>>>> to pin each vcpu to just a single logical processor, the topology
>>>> resulting from this picture will have to present 24 cores. The license
>>>> will not cover this hardware.
>>>
>>> But how does doing a PV topology help this situation?  Because we're
>>> telling one thing to the OS (via our PV interface) and another thing
>>> to applications (via direct CPUID access)?
>>
>> Exactly.
>>
>> In my example it would even work to not modify the cpuid information at
>> all. The kernel wouldn't try to be extremely clever regarding scheduling
>> and the user land would see the cpuid information from the real hardware
>> (only the 12 cores it is running on, of course).
>
> Right; so it seems
>
> 1. Userspace applications are in the habit of reading CPUID to determine
> the topology of the system they're running on
>
> 2. Many use the topology information to help themselves make better
> scheduling decisions.  Because a vcpu is not typically pinned to a
> specific pcpu, we may need to lie here slightly (e.g., not mention
> threads) to get the optimal behavior overall.
>
> 3. Others use the topology information to implement licensing
> restrictions.  Because threads are treated differently to cores, we want
> to tell the truth here (i.e., make sure we mention that some of these
> are threads) to get the optimal behavior overall.
>
> Numbers #2 and #3 lead to contradictory courses of action; we cannot
> optimize for both at the same time.
>
> I think at some level we need to just try to accommodate both -- if the
> user doesn't have licensing issues, or prefers performance over
> licensing, then present a unified topology in PVH / HVM using CPUID,
> ACPI, &c.  I think this should be the default.
>
> If the user has licensing issues, and doesn't mind having wonky or
> unreliable topology to its guests, then let the raw CPUID through.  But
> it would, in this case, be good to try to give the guest OS scheduler a
> hint that it shouldn't really bother trying to read the topology or do
> placement as a result, as any decisions will be unreliable.
>
> Or alternately, if the user wants to give up on the "consolidation"
> aspect of virtualization, they can pin vcpus to pcpus and then pass in
> the actual host topology (hyperthreads and all).

There would be another solution, of course:

Support hyperthreads in the Xen scheduler via gang scheduling. While
this is not a simple solution, it is a fair one. Hyperthreads on one
core can influence each other rather much. With both threads always
running vcpus of the same guest the penalty/advantage would stay in the
same domain. The guest could make really sensible scheduling decisions
and the licensing would still work as desired.

Just an idea, but maybe worth to explore further instead of tweaking
more and more bits to make the virtual system somehow act sane.


Juergen

  reply	other threads:[~2015-07-27 12:01 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-16 10:32 PV-vNUMA issue: topology is misinterpreted by the guest Dario Faggioli
2015-07-16 10:47 ` Jan Beulich
2015-07-16 10:56   ` Andrew Cooper
2015-07-16 15:25     ` Wei Liu
2015-07-16 15:45       ` Andrew Cooper
2015-07-16 15:50         ` Boris Ostrovsky
2015-07-16 16:29           ` Jan Beulich
2015-07-16 16:39             ` Andrew Cooper
2015-07-16 16:59               ` Boris Ostrovsky
2015-07-17  6:09                 ` Jan Beulich
2015-07-17  7:27                   ` Dario Faggioli
2015-07-17  7:42                     ` Jan Beulich
2015-07-17  8:44                     ` Wei Liu
2015-07-17 18:17                     ` Boris Ostrovsky
2015-07-20 14:09                       ` Dario Faggioli
2015-07-20 14:43                         ` Boris Ostrovsky
2015-07-21 20:00                           ` Boris Ostrovsky
2015-07-22 13:36                             ` Dario Faggioli
2015-07-22 13:50                               ` Juergen Gross
2015-07-22 13:58                                 ` Boris Ostrovsky
2015-07-22 14:09                                   ` Juergen Gross
2015-07-22 14:44                                     ` Boris Ostrovsky
2015-07-23  4:43                                       ` Juergen Gross
2015-07-23  7:28                                         ` Jan Beulich
2015-07-23  9:42                                         ` Andrew Cooper
2015-07-23 14:07                                         ` Dario Faggioli
2015-07-23 14:13                                           ` Juergen Gross
2015-07-24 10:28                                           ` Juergen Gross
2015-07-24 14:44                                             ` Dario Faggioli
2015-07-24 15:14                                               ` Juergen Gross
2015-07-24 15:24                                                 ` Juergen Gross
2015-07-24 15:58                                                   ` Dario Faggioli
2015-07-24 16:09                                                     ` Konrad Rzeszutek Wilk
2015-07-24 16:14                                                       ` Dario Faggioli
2015-07-24 16:18                                                       ` Juergen Gross
2015-07-24 16:29                                                         ` Konrad Rzeszutek Wilk
2015-07-24 16:39                                                           ` Juergen Gross
2015-07-24 16:44                                                             ` Boris Ostrovsky
2015-07-27  4:35                                                               ` Juergen Gross
2015-07-27 10:43                                                                 ` George Dunlap
2015-07-27 10:54                                                                   ` Andrew Cooper
2015-07-27 11:13                                                                     ` Juergen Gross
2015-07-27 10:54                                                                   ` Juergen Gross
2015-07-27 11:11                                                                     ` George Dunlap
2015-07-27 12:01                                                                       ` Juergen Gross [this message]
2015-07-27 12:16                                                                         ` Tim Deegan
2015-07-27 13:23                                                                         ` Dario Faggioli
2015-07-27 14:02                                                                           ` Juergen Gross
2015-07-27 14:02                                                                       ` Dario Faggioli
2015-07-27 10:41                                                       ` George Dunlap
2015-07-27 10:49                                                         ` Andrew Cooper
2015-07-27 13:11                                                           ` Dario Faggioli
2015-07-24 16:10                                                     ` Juergen Gross
2015-07-24 16:40                                                       ` Boris Ostrovsky
2015-07-24 16:48                                                         ` Juergen Gross
2015-07-24 17:11                                                           ` Boris Ostrovsky
2015-07-27 13:40                                                             ` Dario Faggioli
2015-07-27  4:24                                                         ` Juergen Gross
2015-07-27 14:09                                                       ` Dario Faggioli
2015-07-27 14:34                                                         ` Boris Ostrovsky
2015-07-27 14:43                                                           ` Juergen Gross
2015-07-27 14:51                                                             ` Boris Ostrovsky
2015-07-27 15:03                                                               ` Juergen Gross
2015-07-27 14:47                                                           ` Juergen Gross
2015-07-27 14:58                                                           ` Dario Faggioli
2015-07-28  4:29                                                         ` Juergen Gross
2015-07-28 15:11                                                           ` Juergen Gross
2015-07-28 16:17                                                             ` Dario Faggioli
2015-07-28 17:13                                                               ` Dario Faggioli
2015-07-29  6:04                                                               ` Juergen Gross
2015-07-29  7:09                                                                 ` Dario Faggioli
2015-07-29  7:44                                                             ` Dario Faggioli
2015-07-24 16:05                                                 ` Dario Faggioli
2015-07-28 10:05                                                   ` Wei Liu
2015-07-28 15:17                                                     ` Dario Faggioli
2015-07-24 20:27                                               ` Elena Ufimtseva
2015-07-22 14:50                                     ` Dario Faggioli
2015-07-22 15:32                                       ` Boris Ostrovsky
2015-07-22 15:49                                         ` Dario Faggioli
2015-07-22 18:10                                           ` Boris Ostrovsky
2015-07-23  7:25                                             ` Jan Beulich
2015-07-24 16:03                                               ` Boris Ostrovsky
2015-07-23 13:46                                             ` Dario Faggioli
2015-07-17 10:17                 ` Andrew Cooper
2015-07-16 15:26 ` Wei Liu
2015-07-27 15:13 ` David Vrabel
2015-07-27 16:02   ` Dario Faggioli
2015-07-27 16:31     ` David Vrabel
2015-07-27 16:33       ` Andrew Cooper
2015-07-27 17:42         ` Dario Faggioli
2015-07-27 17:50           ` Konrad Rzeszutek Wilk
2015-07-27 23:19           ` Andrew Cooper
2015-07-28  3:52             ` Juergen Gross
2015-07-28  9:40               ` Andrew Cooper
2015-07-28  9:28             ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55B61DA5.5030903@suse.com \
    --to=jgross@suse.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dario.faggioli@citrix.com \
    --cc=david.vrabel@citrix.com \
    --cc=elena.ufimtseva@oracle.com \
    --cc=george.dunlap@citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).