xen-devel.lists.xenproject.org archive mirror
* PV-vNUMA issue: topology is misinterpreted by the guest
@ 2015-07-16 10:32 Dario Faggioli
  2015-07-16 10:47 ` Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 95+ messages in thread
From: Dario Faggioli @ 2015-07-16 10:32 UTC
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Andrew Cooper, David Vrabel,
	Jan Beulich, Boris Ostrovsky


Hey,

This started on IRC, but it's actually appropriate to have the
conversation here.

I just discovered an issue with vNUMA when PV guests are used. In fact,
after creating a 4-vCPU PV guest and setting things up so that all 4
vCPUs should be busy, I see this:

root@Zhaman:~# xl vcpu-list test
Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
test                                 4     0    5   r--    1481.9  all / 0-7
test                                 4     1    2   r--    1479.4  all / 0-7
test                                 4     2   15   -b-       7.5  all / 8-15
test                                 4     3   10   -b-    1324.8  all / 8-15

Checking inside the guest confirms that *everything* runs on
vCPUs 0 and 1. However, using schedtool or taskset, I can force tasks to
execute on vCPUs 2 and 3.

Inspecting the guest's dmesg, I've seen this:

[    0.128416] ------------[ cut here ]------------
[    0.128416] WARNING: CPU: 2 PID: 0 at ../arch/x86/kernel/smpboot.c:317 topology_sane.isra.2+0x74/0x88()
[    0.128416] sched: CPU #2's smt-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.128416] Modules linked in:
[    0.128416] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0+ #1
[    0.128416]  0000000000000009 ffff88001ee3bdd0 ffffffff81657c7b ffffffff810bbd2c
[    0.128416]  ffff88001ee3be20 ffff88001ee3be10 ffffffff81081510 ffff88001ee3bea0
[    0.128416]  ffffffff8103aa02 ffff88003ea0a001 0000000000000000 ffff88001f20a040
[    0.128416] Call Trace:
[    0.128416]  [<ffffffff81657c7b>] dump_stack+0x4f/0x7b
[    0.128416]  [<ffffffff810bbd2c>] ? up+0x39/0x3e
[    0.128416]  [<ffffffff81081510>] warn_slowpath_common+0xa1/0xbb
[    0.128416]  [<ffffffff8103aa02>] ? topology_sane.isra.2+0x74/0x88
[    0.128416]  [<ffffffff81081570>] warn_slowpath_fmt+0x46/0x48
[    0.128416]  [<ffffffff8101eeb1>] ? __cpuid.constprop.0+0x15/0x19
[    0.128416]  [<ffffffff8103aa02>] topology_sane.isra.2+0x74/0x88
[    0.128416]  [<ffffffff8103ac70>] set_cpu_sibling_map+0x21a/0x444
[    0.128416]  [<ffffffff81056ac3>] ? numa_add_cpu+0x98/0x9f
[    0.128416]  [<ffffffff8100b8f2>] cpu_bringup+0x63/0xa8
[    0.128416]  [<ffffffff8100b945>] cpu_bringup_and_idle+0xe/0x1a
[    0.128416] ---[ end trace 95bff1aef57ee1b1 ]---

So, basically, Linux is complaining that we're trying to put two vCPUs
that look like SMT siblings on different NUMA nodes. And, yes, I
think this is quite disruptive for the Linux scheduler's internal logic.

The vnuma bits of the guest config are these:

 vnuma = [ [ "pnode=0","size=512","vcpus=0-1","vdistances=10,20"  ],
           [ "pnode=1","size=512","vcpus=2-3","vdistances=20,10"  ] ]
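
To spell out what this config asks for, here is a small Python sketch
that expands the two vnuma entries above into a vCPU-to-vnode map. The
helper names (parse_cpu_list, parse_vnuma) are made up for illustration
and are not part of xl or libxl; the list-of-strings form simply mirrors
the config as written.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-1" or "0,2-3" into a sorted list of ints."""
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def parse_vnuma(spec):
    """Turn a list of "key=value" vnode specs into one dict per vnode."""
    vnodes = []
    for entry in spec:
        fields = dict(item.split("=", 1) for item in entry)
        vnodes.append({
            "pnode": int(fields["pnode"]),
            "size": int(fields["size"]),
            "vcpus": parse_cpu_list(fields["vcpus"]),
            "vdistances": [int(d) for d in fields["vdistances"].split(",")],
        })
    return vnodes


# The two entries from the guest config quoted above.
VNUMA = [
    ["pnode=0", "size=512", "vcpus=0-1", "vdistances=10,20"],
    ["pnode=1", "size=512", "vcpus=2-3", "vdistances=20,10"],
]

vnodes = parse_vnuma(VNUMA)

# Index in the list is the vnode number; map each vCPU to its vnode.
vcpu_to_vnode = {
    vcpu: vnode for vnode, node in enumerate(vnodes) for vcpu in node["vcpus"]
}
print(vcpu_to_vnode)  # {0: 0, 1: 0, 2: 1, 3: 1}
```

So the intent is vCPUs 0-1 on vnode 0 and vCPUs 2-3 on vnode 1.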

From inside the guest, the topology looks like this:

root@test:~# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1
node 0 size: 475 MB
node 0 free: 382 MB
node 1 cpus: 2 3
node 1 size: 495 MB
node 1 free: 475 MB
node distances:
node   0   1 
  0:  10  10 
  1:  20  10

root@test:~# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list 
0-1
root@test:~# cat /sys/devices/system/cpu/cpu0/topology/core_siblings_list 
0-3
root@test:~# cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list 
2-3
root@test:~# cat /sys/devices/system/cpu/cpu2/topology/core_siblings_list 
0-3

So the complaint during boot seems to be against 'core_siblings' (which
was not what I expected, but perhaps I misremember the meaning of
"core_siblings" vs. "thread_siblings" vs. smt-siblings in Linux; I'll
double check).
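
To make the mismatch concrete, the following Python sketch feeds the
numactl and sysfs values quoted above into a simplified cross-node
check. It is only an illustration of the idea, not the kernel's
topology_sane() code, and the helper names are hypothetical.

```python
def parse_cpu_list(text):
    """Expand a sysfs-style CPU list such as "0-3" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cross_node_pairs(siblings, cpu_to_node):
    """Return (cpu, sibling) pairs that sit on different NUMA nodes."""
    pairs = []
    for cpu, text in sorted(siblings.items()):
        for sib in parse_cpu_list(text):
            if cpu_to_node[sib] != cpu_to_node[cpu]:
                pairs.append((cpu, sib))
    return pairs


# Node membership as reported by numactl --hardware in the guest.
cpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}

# Sibling lists as read from sysfs for cpu0 and cpu2.
thread_siblings = {0: "0-1", 2: "2-3"}
core_siblings = {0: "0-3", 2: "0-3"}

print(cross_node_pairs(thread_siblings, cpu_to_node))  # []
print(cross_node_pairs(core_siblings, cpu_to_node))
# [(0, 2), (0, 3), (2, 0), (2, 1)]
```

With the values from this guest, the thread_siblings lists stay within
one node, while the core_siblings lists span both nodes.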

Anyway, is there anything we can do to fix or work around this?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

