* What is the target CPU "topology" of an SMP HVM machine?
@ 2013-08-14 19:23 Eric Shelton
  2013-08-14 20:06 ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Shelton @ 2013-08-14 19:23 UTC (permalink / raw)
  To: xen-devel, Keir Fraser, Jan Beulich

In doing some work to run OS X under Xen on my MacBook Air 2012 (Ivy
Bridge), I ran into some issues in Darwin's probing of what it refers
to as the CPU topology.  Although the Darwin kernel may make certain
assumptions about the platforms on which it is being run, it
nevertheless appears the various values Xen returns via CPUID and MSR
are not wholly consistent.  For example, when I configured the domain
to have only 1 vcpu, Darwin was still able to infer that the system
had multiple processors (maybe even the correct numbers of cores and
processors).

Adding the following to the domain config file got things to move past
a divide by zero resulting from the topology info reported by Xen:

cpuid = [ '4,3:eax=0001xxxxxxxxxx1111xxxxxxxxxxxxxx' ]

The '1111' portion is the key part, and was merely copied from the
bits natively reported by the CPU outside of Xen - a configuration
providing 4 logical processors.
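
To make the bit string concrete, here is a minimal Python sketch of how such a 32-character string could be applied to a register value. It assumes the xl.cfg conventions ('1' and '0' force a bit, 'x' keeps the default policy bit, 'k' and 's' take the host bit). The helper names and the all-zero policy value are hypothetical, chosen only to show which bits the string forces. The field layout follows Intel's description of CPUID leaf 4 EAX, where bits 25:14 and 31:26 hold maximum addressable ID counts minus one, so they are upper bounds rather than exact processor counts.

```python
def apply_xl_mask(mask: str, policy: int, host: int) -> int:
    """Apply a 32-character xl-style bit string (most significant bit first)."""
    if len(mask) != 32:
        raise ValueError("mask must be exactly 32 characters")
    result = 0
    for i, ch in enumerate(mask):
        bit = 31 - i
        if ch == "1":
            value = 1
        elif ch == "0":
            value = 0
        elif ch == "x":            # keep the default policy bit
            value = (policy >> bit) & 1
        elif ch in "ks":           # pass the host bit through
            value = (host >> bit) & 1
        else:
            raise ValueError(f"unexpected mask character {ch!r}")
        result |= value << bit
    return result


def leaf4_eax_fields(eax: int) -> dict:
    """Split CPUID leaf 4 EAX into the fields relevant to cache sharing."""
    return {
        "cache_type": eax & 0x1F,
        "cache_level": (eax >> 5) & 0x7,
        # Both fields are stored as (maximum addressable IDs - 1).
        "max_ids_sharing_cache": ((eax >> 14) & 0xFFF) + 1,
        "max_core_ids_in_package": ((eax >> 26) & 0x3F) + 1,
    }


# Hypothetical all-zero policy and host values: only the forced bits survive.
eax = apply_xl_mask("0001xxxxxxxxxx1111xxxxxxxxxxxxxx", policy=0, host=0)
print(hex(eax), leaf4_eax_fields(eax))
```

With every 'x' bit at zero, the forced '1111' alone makes the sharing field non-zero, which is presumably what lets the guest's division go through.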

So, seeing as this information is being closely interrogated, what is
the target virtual CPU topology?  How should this be reported via
CPUID and MSR?  Darwin appears to be trying to determine or take into
account things such as a number of packages, dies per package, cores
per die & package, and threads/logical CPUs per core & package; the
degrees of sharing of caches by CPUs at various cache levels, and the
presence of hyperthreading.

For example,  Darwin's osfmk/i386/cpu_threads.c (thankfully open
source), will report the following - I believe just based on the CPUID
and MSR values:

    TOPO_DBG("\nCache Topology Parameters:\n");
    TOPO_DBG("\tLLC Depth:           %d\n", topoParms.LLCDepth);
    TOPO_DBG("\tCores Sharing LLC:   %d\n", topoParms.nCoresSharingLLC);
    TOPO_DBG("\tThreads Sharing LLC: %d\n", topoParms.nLCPUsSharingLLC);
    TOPO_DBG("\tmax Sharing of LLC:  %d\n", topoParms.maxSharingLLC);

    TOPO_DBG("\nLogical Topology Parameters:\n");
    TOPO_DBG("\tThreads per Core:  %d\n", topoParms.nLThreadsPerCore);
    TOPO_DBG("\tCores per Die:     %d\n", topoParms.nLCoresPerDie);
    TOPO_DBG("\tThreads per Die:   %d\n", topoParms.nLThreadsPerDie);
    TOPO_DBG("\tDies per Package:  %d\n", topoParms.nLDiesPerPackage);
    TOPO_DBG("\tCores per Package: %d\n", topoParms.nLCoresPerPackage);
    TOPO_DBG("\tThreads per Package: %d\n", topoParms.nLThreadsPerPackage);

    TOPO_DBG("\nPhysical Topology Parameters:\n");
    TOPO_DBG("\tThreads per Core: %d\n", topoParms.nPThreadsPerCore);
    TOPO_DBG("\tCores per Die:     %d\n", topoParms.nPCoresPerDie);
    TOPO_DBG("\tThreads per Die:   %d\n", topoParms.nPThreadsPerDie);
    TOPO_DBG("\tDies per Package:  %d\n", topoParms.nPDiesPerPackage);
    TOPO_DBG("\tCores per Package: %d\n", topoParms.nPCoresPerPackage);
    TOPO_DBG("\tThreads per Package: %d\n", topoParms.nPThreadsPerPackage);

In addition to CPUID and MSR, does any of this get reflected in the
ACPI tables?  Also, is there a presumed relationship between the
number of dies or cores and the number of HPET comparators to be
concerned with?

Finally, included in all of this is the use of an undocumented MSR
0x35, which appears to be available on at least Nehalem on up, which
reports the number of cores and processors, and reports this
information slightly differently between some of the Intel
architectures.  Would it be OK to trap & emulate this behavior where
CPUID is reporting a model that implements MSR 0x35?  Would it be
better to be able to override MSR values in the domain config file,
much as with CPUID?

Thanks,
Eric


* Re: What is the target CPU "topology" of an SMP HVM machine?
  2013-08-14 19:23 What is the target CPU "topology" of an SMP HVM machine? Eric Shelton
@ 2013-08-14 20:06 ` Andrew Cooper
  2013-08-14 20:52   ` Dario Faggioli
  2013-08-14 22:59   ` Eric Shelton
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Cooper @ 2013-08-14 20:06 UTC (permalink / raw)
  To: Eric Shelton; +Cc: Keir Fraser, Jan Beulich, xen-devel

On 14/08/13 20:23, Eric Shelton wrote:
> In doing some work to run OS X under Xen on my MacBook Air 2012 (Ivy
> Bridge), I ran into some issues in Darwin's probing of what it refers
> to as the CPU topology.  Although the Darwin kernel may make certain
> assumptions about the platforms on which it is being run, it
> nevertheless appears the various values Xen returns via CPUID and MSR
> are not wholly consistent.  For example, when I configured the domain
> to have only 1 vcpu, Darwin was still able to infer that the system
> had multiple processors (maybe even the correct numbers of cores and
> processors).

The extended model name is passed through from the real CPU, so Darwin
could easily be working on logic such as "I have found a CPU which
claims to be this type of IvyBridge - I know it has these details"

>
> Adding the following to the domain config file got things to move past
> a divide by zero resulting from the topology info reported by Xen:
>
> cpuid = [ '4,3:eax=0001xxxxxxxxxx1111xxxxxxxxxxxxxx' ]
>
> The '1111' portion is the key part, and was merely copied from the
> bits natively reported by the CPU outside of Xen - a configuration
> providing 4 logical processors.
>
> So, seeing as this information is being closely interrogated, what is
> the target virtual CPU topology?  How should this be reported via
> CPUID and MSR?  Darwin appears to be trying to determine or take into
> account things such as a number of packages, dies per package, cores
> per die & package, and threads/logical CPUs per core & package; the
> degrees of sharing of caches by CPUs at various cache levels, and the
> presence of hyperthreading.

Xen by default advertises all VCPUs as separate sockets, to try and
dissuade "clever" schedulers from doing dumb things based on false
information.

>
> For example,  Darwin's osfmk/i386/cpu_threads.c (thankfully open
> source), will report the following - I believe just based on the CPUID
> and MSR values:
>
>     TOPO_DBG("\nCache Topology Parameters:\n");
>     TOPO_DBG("\tLLC Depth:           %d\n", topoParms.LLCDepth);
>     TOPO_DBG("\tCores Sharing LLC:   %d\n", topoParms.nCoresSharingLLC);
>     TOPO_DBG("\tThreads Sharing LLC: %d\n", topoParms.nLCPUsSharingLLC);
>     TOPO_DBG("\tmax Sharing of LLC:  %d\n", topoParms.maxSharingLLC);
>
>     TOPO_DBG("\nLogical Topology Parameters:\n");
>     TOPO_DBG("\tThreads per Core:  %d\n", topoParms.nLThreadsPerCore);
>     TOPO_DBG("\tCores per Die:     %d\n", topoParms.nLCoresPerDie);
>     TOPO_DBG("\tThreads per Die:   %d\n", topoParms.nLThreadsPerDie);
>     TOPO_DBG("\tDies per Package:  %d\n", topoParms.nLDiesPerPackage);
>     TOPO_DBG("\tCores per Package: %d\n", topoParms.nLCoresPerPackage);
>     TOPO_DBG("\tThreads per Package: %d\n", topoParms.nLThreadsPerPackage);
>
>     TOPO_DBG("\nPhysical Topology Parameters:\n");
>     TOPO_DBG("\tThreads per Core: %d\n", topoParms.nPThreadsPerCore);
>     TOPO_DBG("\tCores per Die:     %d\n", topoParms.nPCoresPerDie);
>     TOPO_DBG("\tThreads per Die:   %d\n", topoParms.nPThreadsPerDie);
>     TOPO_DBG("\tDies per Package:  %d\n", topoParms.nPDiesPerPackage);
>     TOPO_DBG("\tCores per Package: %d\n", topoParms.nPCoresPerPackage);
>     TOPO_DBG("\tThreads per Package: %d\n", topoParms.nPThreadsPerPackage);
>
> In addition to CPUID and MSR, does any of this get reflected in the
> ACPI tables?  Also, is there a presumed relationship between the
> number of dies or cores and the number of HPET comparators to be
> concerned with?

There is a distinct lack of consistency between the various mechanisms. 
The ACPI tables are essentially static from build time.

>
> Finally, included in all of this is the use of an undocumented MSR
> 0x35, which appears to be available on at least Nehalem on up, which
> reports the number of cores and processors, and reports this
> information slightly differently between some of the Intel
> architectures.  Would it be OK to trap & emulate this behavior where
> CPUID is reporting a model that implements MSR 0x35?  Would it be
> better to be able to override MSR values in the domain config file,
> much as with CPUID?
>
> Thanks,
> Eric

From a quick glance at the documentation, there are several different
generations of processors which use MSR 0x35 for different purposes,
although it does appear to be somewhat common as performance counters of
one form or another.

What reference are you using to find out that this msr provides topology
information?

It would certainly be a good project to try and make this information
more consistent and easier to configure.

~Andrew


* Re: What is the target CPU "topology" of an SMP HVM machine?
  2013-08-14 20:06 ` Andrew Cooper
@ 2013-08-14 20:52   ` Dario Faggioli
  2013-08-15  2:18     ` Elena Ufimtseva
  2013-08-21 21:30     ` Matt Wilson
  2013-08-14 22:59   ` Eric Shelton
  1 sibling, 2 replies; 6+ messages in thread
From: Dario Faggioli @ 2013-08-14 20:52 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, Keir Fraser, Elena Ufimtseva, Jan Beulich, Eric Shelton



On mer, 2013-08-14 at 21:06 +0100, Andrew Cooper wrote:
> On 14/08/13 20:23, Eric Shelton wrote:
> > So, seeing as this information is being closely interrogated, what is
> > the target virtual CPU topology?  How should this be reported via
> > CPUID and MSR?  Darwin appears to be trying to determine or take into
> > account things such as a number of packages, dies per package, cores
> > per die & package, and threads/logical CPUs per core & package; the
> > degrees of sharing of caches by CPUs at various cache levels, and the
> > presence of hyperthreading.
> 
> Xen by default advertises all VCPUs as separate sockets, to try and
> dissuade "clever" schedulers from doing dumb things based on false
> information.
> 
Are we absolutely sure about this? I'm asking because Elena ran into a
similar issue, i.e., seeing some vCPUs being advertised as
threads/siblings (although that was a PV guest)... Am I right, Elena?

I think she also has a patch that she may be able to share soon, which
correctly masks some of the CPUID information, as it looks like some
false information was reaching the Linux scheduler! :-O

I'm not sure this is the exact same issue, though... Elena, could you
tell us something more about this?

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: What is the target CPU "topology" of an SMP HVM machine?
  2013-08-14 20:06 ` Andrew Cooper
  2013-08-14 20:52   ` Dario Faggioli
@ 2013-08-14 22:59   ` Eric Shelton
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Shelton @ 2013-08-14 22:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Jan Beulich, xen-devel

On Wed, Aug 14, 2013 at 4:06 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 14/08/13 20:23, Eric Shelton wrote:
>> In doing some work to run OS X under Xen on my MacBook Air 2012 (Ivy
>> Bridge), I ran into some issues in Darwin's probing of what it refers
>> to as the CPU topology.  Although the Darwin kernel may make certain
>> assumptions about the platforms on which it is being run, it
>> nevertheless appears the various values Xen returns via CPUID and MSR
>> are not wholly consistent.  For example, when I configured the domain
>> to have only 1 vcpu, Darwin was still able to infer that the system
>> had multiple processors (maybe even the correct numbers of cores and
>> processors).
>
> The extended model name is passed through from the real CPU, so Darwin
> could easily be working on logic such as "I have found a CPU which
> claims to be this type of IvyBridge - I know it has these details"

In this case, I do not have that kind of a table lookup problem.  The
kernel source indicates that values obtained via CPUID and MSR are
being used to make these determinations.  The Ivy Bridge model id also
triggers reading from MSRs 0x35, 0xCE, and 0x194; and CPUID leaf 7.
MSR 0x35 is the one that affects the topology calculations.

> Xen by default advertises all VCPUs as separate sockets, to try and
> dissuade "clever" schedulers from doing dumb things based on false
> information.

I think OS X has such a scheduler, and I imagine more so with the next
release (10.9), which has some new emphasis on power consumption.

>> In addition to CPUID and MSR, does any of this get reflected in the
>> ACPI tables?  Also, is there a presumed relationship between the
>> number of dies or cores and the number of HPET comparators to be
>> concerned with?
>
> There is a distinct lack of consistency between the various mechanisms.
> The ACPI tables are essentially static from build time.

Although it may be nontrivial to do runtime generation or modification
of the ACPI tables, are there any specific items in the ACPI tables
that come to mind which maybe should vary according to the number of
CPUs for better consistency?  Having 256 CPU entries seems to be
harmless.

> From a quick glance at the documentation, there are several different
> generations of processors which use MSR 0x35 for different purposes,
> although it does appear to be somewhat common as performance counters of
> one form or another.
>
> What reference are you using to find out that this msr provides topology
> information?

From review of the kernel source (osfmk/i386/cpuid.c), which indicates
for MSR 0x35 that on Westmere bits 19-16 are the core count, and bits
15-0 are the thread count; on Nehalem, Sandy Bridge, and Ivy Bridge
bits 31-16 are the core count, and 15-0 are the thread count.  There
is no indication as to what the other bits are (e.g., the performance
counters you mentioned).  My Core i5 returns 0x00020004 in the lower
32 bits.  It sounds like in an HVM both of these bit ranges should be
equal to the number of vcpus.
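
The layout described above can be sketched in Python as follows. This is only an illustration of the bit ranges as read from the Darwin source, not documented Intel behaviour. The function names and the microarchitecture strings are hypothetical, and the synthesized value simply follows the guess that both fields should equal the number of vcpus.

```python
def decode_msr_0x35(value: int, microarch: str) -> tuple:
    """Return (cores, threads) from the low 32 bits of MSR 0x35.

    Bit ranges follow the description above: Westmere keeps the core
    count in bits 19-16, the other listed microarchitectures use
    bits 31-16. The thread count is bits 15-0 in both cases.
    """
    threads = value & 0xFFFF
    if microarch == "westmere":
        cores = (value >> 16) & 0xF
    else:  # nehalem, sandybridge, ivybridge
        cores = (value >> 16) & 0xFFFF
    return cores, threads


def synthesize_msr_0x35(vcpus: int) -> int:
    """Build a value where both fields report the vcpu count (an assumption)."""
    if not 0 < vcpus <= 0xFFFF:
        raise ValueError("vcpu count does not fit in a 16-bit field")
    return (vcpus << 16) | vcpus


cores, threads = decode_msr_0x35(0x00020004, "ivybridge")
print(cores, threads)                 # 2 cores, 4 threads
print(hex(synthesize_msr_0x35(4)))    # 0x40004
```

A Westmere guest would need the core count capped at 15 to fit the 4-bit field, which the sketch does not attempt.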

> It would certainly be a good project to try and make this information
> more consistent and easier to configure.

I'm looking at least to have a few more of the CPUID and MSR values
line up with the number of vcpus and their simple topology, and see
where I end up.
http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
appears to give a good amount of information as to what values are
involved.
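
The enumeration scheme that article describes boils down to slicing the APIC ID into thread, core, and package fields. A minimal Python sketch, assuming the legacy-style derivation where each field width is the number of bits needed to address the maximum ID count at that level; the function names are hypothetical and nothing here is taken from Xen or Darwin:

```python
def mask_width(count: int) -> int:
    """Bits needed to address `count` distinct IDs, i.e. ceil(log2(count))."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return (count - 1).bit_length()


def split_apic_id(apic_id: int, threads_per_core: int, cores_per_package: int):
    """Return (package, core, thread) for an APIC ID."""
    smt_bits = mask_width(threads_per_core)
    core_bits = mask_width(cores_per_package)
    thread = apic_id & ((1 << smt_bits) - 1)
    core = (apic_id >> smt_bits) & ((1 << core_bits) - 1)
    package = apic_id >> (smt_bits + core_bits)
    return package, core, thread


# Two threads per core, two cores per package.
print(split_apic_id(5, threads_per_core=2, cores_per_package=2))  # (1, 0, 1)

# One thread and one core per package: every APIC ID is its own package.
print(split_apic_id(3, threads_per_core=1, cores_per_package=1))  # (3, 0, 0)
```

The second call corresponds to the "every vcpu is a separate socket" presentation mentioned earlier in the thread: with both widths at zero, the whole APIC ID becomes the package number.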


* Re: What is the target CPU "topology" of an SMP HVM machine?
  2013-08-14 20:52   ` Dario Faggioli
@ 2013-08-15  2:18     ` Elena Ufimtseva
  2013-08-21 21:30     ` Matt Wilson
  1 sibling, 0 replies; 6+ messages in thread
From: Elena Ufimtseva @ 2013-08-15  2:18 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Andrew Cooper, xen-devel, Keir Fraser, Jan Beulich, Eric Shelton



Hi,

Well, I see that's for an HVM guest, right?

On a PV guest I ran into the following when assigning vCPUs to the virtual
NUMA nodes:

[    0.004000] ------------[ cut here ]------------
[    0.004000] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/smpboot.c:324 topology_sane.isra.7+0x67/0x79()
[    0.004000] sched: CPU #1's smt-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.004000] Modules linked in:
[    0.004000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc4+ #36
[    0.004000]  0000000000000000 0000000000000009 ffffffff813b9ce7 ffff88001f1b1e60
[    0.004000]  ffffffff810462f0 ffff88001f1b1e70 ffffffff8102f5bb ffff880000000000
[    0.004000]  0000000000000001 ffff88003f613880 0000000000000000 000000000000b0c0
[    0.004000] Call Trace:
[    0.004000]  [<ffffffff813b9ce7>] ? dump_stack+0x41/0x51
[    0.004000]  [<ffffffff810462f0>] ? warn_slowpath_common+0x78/0x90
[    0.004000]  [<ffffffff8102f5bb>] ? topology_sane.isra.7+0x67/0x79
[    0.004000]  [<ffffffff810463a0>] ? warn_slowpath_fmt+0x45/0x4a
[    0.004000]  [<ffffffff8102f5bb>] ? topology_sane.isra.7+0x67/0x79
[    0.004000]  [<ffffffff8102f80d>] ? set_cpu_sibling_map+0x1c9/0x3fc
[    0.004000]  [<ffffffff8100b9e1>] ? cpu_bringup+0x47/0x86
[    0.004000]  [<ffffffff8100ba41>] ? cpu_bringup_and_idle+0x7/0x12
[    0.004000] ---[ end trace 62b6815bad5814b4 ]---

I just added code to the CPUID trap handler that masks out the SMT width
for the initial APIC ID in leaf 0x1, so that the topology becomes one
logical processor per physical package. I am not sure whether the SMT cache
topology should be masked out as well in this case. I also masked out
X86_FEATURE_HT in leaf 0x1.
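
One possible reading of that masking, sketched in Python purely for illustration: clear the HTT flag (CPUID leaf 1 EDX bit 28, which Linux calls X86_FEATURE_HT) and report a single addressable logical processor in EBX bits 23:16. This is an assumption about what such a filter might look like, not the actual patch. The function and constant names are hypothetical, and the register values in the example are made up.

```python
HTT_BIT = 1 << 28                 # CPUID.1:EDX bit 28 (hyper-threading flag)
LOGICAL_COUNT_SHIFT = 16          # CPUID.1:EBX bits 23:16
LOGICAL_COUNT_MASK = 0xFF << LOGICAL_COUNT_SHIFT


def hide_smt_in_leaf1(ebx: int, edx: int) -> tuple:
    """Return (ebx, edx) with SMT hidden: one logical processor per package."""
    edx &= ~HTT_BIT
    ebx = (ebx & ~LOGICAL_COUNT_MASK) | (1 << LOGICAL_COUNT_SHIFT)
    return ebx & 0xFFFFFFFF, edx & 0xFFFFFFFF


# Made-up host values: 16 addressable logical processors, HTT set.
ebx, edx = hide_smt_in_leaf1(0x00100800, 0xBFEBFBFF)
print(hex(ebx), hex(edx))
```

Whether the cache-sharing fields in leaf 4 also need adjusting, as asked above, is left open by this sketch.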

Elena

On Wed, Aug 14, 2013 at 4:52 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:

> On mer, 2013-08-14 at 21:06 +0100, Andrew Cooper wrote:
> > On 14/08/13 20:23, Eric Shelton wrote:
> > > So, seeing as this information is being closely interrogated, what is
> > > the target virtual CPU topology?  How should this be reported via
> > > CPUID and MSR?  Darwin appears to be trying to determine or take into
> > > account things such as a number of packages, dies per package, cores
> > > per die & package, and threads/logical CPUs per core & package; the
> > > degrees of sharing of caches by CPUs at various cache levels, and the
> > > presence of hyperthreading.
> >
> > Xen by default advertises all VCPUs as separate sockets, to try and
> > dissuade "clever" schedulers from doing dumb things based on false
> > information.
> >
> Are we absolutely sure about this? I'm asking because Elena ran into a
> similar issue, i.e., seeing some vCPUs being advertised as
> threads/siblings (although that was a PV guest)... Am I right, Elena?
>
> I think she also has a patch that she may be able to share soon, which
> correctly masks some of the CPUID information, as it looks like some
> false information was reaching the Linux scheduler! :-O
>
> I'm not sure this is the exact same issue, though... Elena, could you
> tell us something more about this?
>
> Regards,
> Dario
>
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>
>


-- 
Elena


* Re: What is the target CPU "topology" of an SMP HVM machine?
  2013-08-14 20:52   ` Dario Faggioli
  2013-08-15  2:18     ` Elena Ufimtseva
@ 2013-08-21 21:30     ` Matt Wilson
  1 sibling, 0 replies; 6+ messages in thread
From: Matt Wilson @ 2013-08-21 21:30 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Keir Fraser, Elena Ufimtseva, Andrew Cooper, xen-devel,
	Jan Beulich, Eric Shelton

On Wed, Aug 14, 2013 at 10:52:54PM +0200, Dario Faggioli wrote:
> On mer, 2013-08-14 at 21:06 +0100, Andrew Cooper wrote:
> > On 14/08/13 20:23, Eric Shelton wrote:
> > > So, seeing as this information is being closely interrogated, what is
> > > the target virtual CPU topology?  How should this be reported via
> > > CPUID and MSR?  Darwin appears to be trying to determine or take into
> > > account things such as a number of packages, dies per package, cores
> > > per die & package, and threads/logical CPUs per core & package; the
> > > degrees of sharing of caches by CPUs at various cache levels, and the
> > > presence of hyperthreading.
> > 
> > Xen by default advertises all VCPUs as separate sockets, to try and
> > dissuade "clever" schedulers from doing dumb things based on false
> > information.
> > 
> Are we absolutely sure about this? I'm asking because Elena ran into a
> similar issue, i.e., seeing some vCPUs being advertised as
> threads/siblings (although that was a PV guest)... Am I right, Elena?

Yes, this is the current behavior when we set up HVM guests. My guest
NUMA patch for HVM guests adds the necessary features to adjust the
initial local APIC ID so that CPU topology enumeration works.

We should figure out how we want topology to be presented to PV
guests.

--msw

> I think she also has a patch that she may be able to share soon, which
> correctly masks some of the CPUID information, as it looks like some
> false information was reaching the Linux scheduler! :-O
>
> I'm not sure this is the exact same issue, though... Elena, could you
> tell us something more about this?
