On Tue, 2014-07-22 at 15:48 +0100, Wei Liu wrote:
> On Tue, Jul 22, 2014 at 04:03:44PM +0200, Dario Faggioli wrote:
> > I mean, even right now, PV guests see completely random cache-sharing
> > topology, and that does (at least potentially) affect performance, as
> > the guest scheduler will make incorrect/inconsistent assumptions.
> >
>
> Correct. It's just that it might be more obvious to see the problem with
> vNUMA.
>
Yep.

> > > Yes, given that you derive numa memory allocation from cpu pinning or
> > > use combination of cpu pinning, vcpu to vnode map and vnode to pnode
> > > map, in those cases those IDs might reflect the right topology.
> > >
> > Well, pinning does (should?) not always happen, as a consequence of a
> > virtual topology being used.
> >
>
> That's true. I was just referring to the current status of the patch
> series. AIUI that's how it is implemented now, not necessarily the way it
> has to be.
>
Ok.

> > With the following guest configuration, in terms of vcpu pinning:
> >
> > 1) 2 vCPUs ==> same pCPUs
>
> 4 vcpus, I think.
>
> > root@benny:~# xl vcpu-list
> > Name                         ID  VCPU   CPU State   Time(s) CPU Affinity
> > debian.guest.osstest          9     0     0   -b-       2.7  0
> > debian.guest.osstest          9     1     0   -b-       5.2  0
> > debian.guest.osstest          9     2     7   -b-       2.4  7
> > debian.guest.osstest          9     3     7   -b-       4.4  7
>
What I meant with "2 vCPUs" was that I was putting 2 vCPUs of the guest
(0 and 1) on the same pCPU (0), and the other 2 (2 and 3) on another (7).
That should have resulted in a topology where at least the lowest cache
level is not shared inside the guest, but that is not what I see.

> > 2) no SMT
> > root@benny:~# xl vcpu-list
> > Name                         ID  VCPU   CPU State   Time(s) CPU Affinity
> > debian.guest.osstest         11     0     0   -b-       0.6  0
> > debian.guest.osstest         11     1     2   -b-       0.4  2
> > debian.guest.osstest         11     2     4   -b-       1.5  4
> > debian.guest.osstest         11     3     6   -b-       0.5  6
> >
> > 3) Random
> > root@benny:~# xl vcpu-list
> > Name                         ID  VCPU   CPU State   Time(s) CPU Affinity
> > debian.guest.osstest         12     0     3   -b-       1.6  all
> > debian.guest.osstest         12     1     1   -b-       1.4  all
> > debian.guest.osstest         12     2     5   -b-       2.4  all
> > debian.guest.osstest         12     3     7   -b-       1.5  all
> >
> > 4) yes SMT
> > root@benny:~# xl vcpu-list
> > Name                         ID  VCPU   CPU State   Time(s) CPU Affinity
> > debian.guest.osstest         14     0     1   -b-       1.0  1
> > debian.guest.osstest         14     1     2   -b-       1.8  2
> > debian.guest.osstest         14     2     6   -b-       1.1  6
> > debian.guest.osstest         14     3     7   -b-       0.8  7
> >
> > And, in *all* these 4 cases, here's what I see:
> >
> > root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/core_siblings_list
> > 0-3
> > 0-3
> > 0-3
> > 0-3
> >
> > root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
> > 0-3
> > 0-3
> > 0-3
> > 0-3
> >
> > root@debian:~# lstopo
> > Machine (488MB) + Socket L#0 + L3 L#0 (8192KB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
> >   PU L#0 (P#0)
> >   PU L#1 (P#1)
> >   PU L#2 (P#2)
> >   PU L#3 (P#3)
> >
>
> I won't be surprised if the guest builds up a wrong topology, as what real
> "ID"s it sees depends very much on what pcpus you pick.
>
Exactly, but if I pin all the guest vCPUs on specific host pCPUs from the
very beginning (pinning specified in the config file, which is what I'm
doing), I should be able to control that...

> Have you tried pinning vcpus to pcpus [0, 1, 2, 3]? That way you should
> be able to see the same topology as the one you saw in Dom0?
>
Well, at least some of the examples above should have shown some
non-shared cache levels already.
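By "pinning specified in the config file", above, I mean something along
these lines; a minimal sketch, assuming the list form of xl's cpus= option
(one entry per vCPU), with the values from case 1 above:

  # guest config file (illustrative): pin each vCPU to a fixed pCPU at creation
  vcpus = 4
  # one list entry per vCPU: vCPUs 0 and 1 go to pCPU 0, vCPUs 2 and 3 to pCPU 7
  cpus  = ["0", "0", "7", "7"]

With something like that, the affinity is already in place when the guest is
created, rather than being adjusted afterwards with `xl vcpu-pin'.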
Anyway, here it comes:

root@benny:~# xl vcpu-list
Name                         ID  VCPU   CPU State   Time(s) CPU Affinity
debian.guest.osstest         15     0     0   -b-       1.8  0
debian.guest.osstest         15     1     1   -b-       0.7  1
debian.guest.osstest         15     2     2   -b-       0.6  2
debian.guest.osstest         15     3     3   -b-       0.7  3

root@debian:~# hwloc-ls --of console
Machine (488MB) + Socket L#0 + L3 L#0 (8192KB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
  PU L#0 (P#0)
  PU L#1 (P#1)
  PU L#2 (P#2)
  PU L#3 (P#3)

root@debian:~# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    4
Core(s) per socket:    1
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Stepping:              3
CPU MHz:               3591.780
BogoMIPS:              7183.56
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K

So, no, that is not giving the same result as in Dom0. :-(

> > This is not the case for dom0 where (I booted with dom0_max_vcpus=4 on
> > the xen command line) I see this:
> >
>
> I guess this is because you're basically picking pcpu 0-3 for Dom0. It
> doesn't matter if you pin them or not.
>
That makes total sense, and in fact I was not surprised about Dom0 looking
like this... what I am surprised about is not being able to get a similar
topology for the guest, no matter how I pin it... :-/

Dario

--
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)