* [PATCH 0/3][RFC] NUMA: add host side pinning
@ 2010-06-23 21:09 Andre Przywara
2010-06-23 21:09 ` [PATCH 1/3] NUMA: add Linux libnuma detection Andre Przywara
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Andre Przywara @ 2010-06-23 21:09 UTC (permalink / raw)
To: kvm; +Cc: anthony, agraf
Hi,
these three patches add basic NUMA pinning to KVM. According to a
user-provided assignment, parts of the guest's memory will be bound to
different host nodes. This should increase performance in large virtual
machines and on loaded hosts.
These patches are quite basic (but work), and I am sending them as an RFC
to get some feedback before implementing stuff in vain.
To use it you need to provide a guest NUMA configuration; this can be
as simple as "-numa node -numa node" to give the guest two nodes. Then
you pin these guest nodes to different host nodes with a separate
command-line option: "-numa pin,nodeid=0,host=0 -numa pin,nodeid=1,host=2"
This separation of host and guest config sounds a bit complicated, but
was demanded last time I submitted a similar version.
I refrained from binding the vCPUs to physical CPUs for now, but this
can be added later with a "cpubind" option to "-numa pin,". Alternatively,
this could be done from a management application by using sched_setaffinity().
Please note that this is currently made for qemu-kvm, although I am not
up-to-date regarding the current status of upstream QEMU's true SMP
capabilities. The final patch will be made against upstream QEMU anyway.
Also this is currently for Linux hosts (any other KVM hosts alive?) and
for PC guests only. I think both can be fixed easily if someone requests
it (and gives me a pointer to further information).
Please comment on the approach in general and the implementation.
Thanks and Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/3] NUMA: add Linux libnuma detection
2010-06-23 21:09 [PATCH 0/3][RFC] NUMA: add host side pinning Andre Przywara
@ 2010-06-23 21:09 ` Andre Przywara
2010-06-23 21:09 ` [PATCH 2/3] NUMA: add parsing of host NUMA pin option Andre Przywara
` (2 subsequent siblings)
3 siblings, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2010-06-23 21:09 UTC (permalink / raw)
To: kvm; +Cc: anthony, agraf, Andre Przywara
Add detection of libnuma (usually shipped in the numactl package)
to the configure script. Currently this is Linux-only, but it can be
extended later should the need for other interfaces come up.
It can be enabled or disabled on the command line; the default is to
use it if available.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
configure | 33 +++++++++++++++++++++++++++++++++
1 files changed, 33 insertions(+), 0 deletions(-)
diff --git a/configure b/configure
index 08883e7..3e2dc5b 100755
--- a/configure
+++ b/configure
@@ -281,6 +281,7 @@ vnc_sasl=""
xen=""
linux_aio=""
vhost_net=""
+numa="yes"
gprof="no"
debug_tcg="no"
@@ -721,6 +722,10 @@ for opt do
;;
--enable-vhost-net) vhost_net="yes"
;;
+ --disable-numa) numa="no"
+ ;;
+ --enable-numa) numa="yes"
+ ;;
--*dir)
;;
*) echo "ERROR: unknown option $opt"; show_help="yes"
@@ -905,6 +910,8 @@ echo " --enable-docs enable documentation build"
echo " --disable-docs disable documentation build"
echo " --disable-vhost-net disable vhost-net acceleration support"
echo " --enable-vhost-net enable vhost-net acceleration support"
+echo " --disable-numa disable host Linux NUMA support"
+echo " --enable-numa enable host Linux NUMA support"
echo ""
echo "NOTE: The object files are built at the place where configure is launched"
exit 1
@@ -1962,6 +1969,28 @@ if compile_prog "" "" ; then
signalfd=yes
fi
+##########################################
+# libnuma probe
+
+if test "$numa" = "yes" ; then
+ numa=no
+ cat > $TMPC << EOF
+#include <numa.h>
+int main(void) { return numa_available(); }
+EOF
+
+ if compile_prog "" "-lnuma" ; then
+ numa=yes
+ libs_softmmu="-lnuma $libs_softmmu"
+ else
+ if test "$numa" = "yes" ; then
+ feature_not_found "linux NUMA (install numactl?)"
+ fi
+ numa=no
+ fi
+fi
+
+
# check if eventfd is supported
eventfd=no
cat > $TMPC << EOF
@@ -2245,6 +2274,7 @@ echo "preadv support $preadv"
echo "fdatasync $fdatasync"
echo "uuid support $uuid"
echo "vhost-net support $vhost_net"
+echo "NUMA host support $numa"
if test $sdl_too_old = "yes"; then
echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -2468,6 +2498,9 @@ if test $cpu_emulation = "yes"; then
else
echo "CONFIG_NO_CPU_EMULATION=y" >> $config_host_mak
fi
+if test "$numa" = "yes"; then
+ echo "CONFIG_NUMA=y" >> $config_host_mak
+fi
# XXX: suppress that
if [ "$bsd" = "yes" ] ; then
--
1.6.4
* [PATCH 2/3] NUMA: add parsing of host NUMA pin option
2010-06-23 21:09 [PATCH 0/3][RFC] NUMA: add host side pinning Andre Przywara
2010-06-23 21:09 ` [PATCH 1/3] NUMA: add Linux libnuma detection Andre Przywara
@ 2010-06-23 21:09 ` Andre Przywara
2010-06-23 21:09 ` [PATCH 3/3] NUMA: realize NUMA memory pinning Andre Przywara
2010-06-23 22:21 ` [PATCH 0/3][RFC] NUMA: add host side pinning Anthony Liguori
3 siblings, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2010-06-23 21:09 UTC (permalink / raw)
To: kvm; +Cc: anthony, agraf, Andre Przywara
Introduce another variant of QEMU's -numa option to allow host node
pinning. This was separated from the guest-relevant configuration
to make it cleaner to use, especially for management applications.
The syntax is -numa pin,nodeid=n,host=m to assign guest node n
to host node m.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
sysemu.h | 1 +
vl.c | 18 ++++++++++++++++++
2 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/sysemu.h b/sysemu.h
index 6018d97..1b3f77b 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -139,6 +139,7 @@ extern long hpagesize;
extern int nb_numa_nodes;
extern uint64_t node_mem[MAX_NODES];
extern uint64_t node_cpumask[MAX_NODES];
+extern int node_pin[MAX_NODES];
#define MAX_OPTION_ROMS 16
extern const char *option_rom[MAX_OPTION_ROMS];
diff --git a/vl.c b/vl.c
index 0ee963c..02e0bed 100644
--- a/vl.c
+++ b/vl.c
@@ -234,6 +234,7 @@ int boot_menu;
int nb_numa_nodes;
uint64_t node_mem[MAX_NODES];
uint64_t node_cpumask[MAX_NODES];
+int node_pin[MAX_NODES];
static QEMUTimer *nographic_timer;
@@ -771,6 +772,22 @@ static void numa_add(const char *optarg)
node_cpumask[nodenr] = value;
}
nb_numa_nodes++;
+ } else if (!strcmp(option, "pin")) {
+ if (get_param_value(option, 128, "nodeid", optarg) == 0) {
+ fprintf(stderr, "error: need nodeid for -numa pin,...\n");
+ exit(1);
+ } else {
+ nodenr = strtoull(option, NULL, 10);
+ if (nodenr >= nb_numa_nodes) {
fprintf(stderr, "nodeid exceeds specified NUMA nodes\n");
+ exit(1);
+ }
+ }
+ if (get_param_value(option, 128, "host", optarg) == 0) {
+ node_pin[nodenr] = -1;
+ } else {
+ node_pin[nodenr] = strtoull(option, NULL, 10);
+ }
}
return;
}
@@ -1873,6 +1890,7 @@ int main(int argc, char **argv, char **envp)
for (i = 0; i < MAX_NODES; i++) {
node_mem[i] = 0;
node_cpumask[i] = 0;
+ node_pin[i] = -1;
}
assigned_devices_index = 0;
--
1.6.4
* [PATCH 3/3] NUMA: realize NUMA memory pinning
2010-06-23 21:09 [PATCH 0/3][RFC] NUMA: add host side pinning Andre Przywara
2010-06-23 21:09 ` [PATCH 1/3] NUMA: add Linux libnuma detection Andre Przywara
2010-06-23 21:09 ` [PATCH 2/3] NUMA: add parsing of host NUMA pin option Andre Przywara
@ 2010-06-23 21:09 ` Andre Przywara
2010-06-23 22:21 ` [PATCH 0/3][RFC] NUMA: add host side pinning Anthony Liguori
3 siblings, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2010-06-23 21:09 UTC (permalink / raw)
To: kvm; +Cc: anthony, agraf, Andre Przywara
According to the user-provided assignment, bind the respective part
of the guest's memory to the given host node. This uses Linux's
libnuma interface to realize the pinning right after the allocation.
Failures are not fatal, but produce a warning.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
hw/pc.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 51 insertions(+), 0 deletions(-)
diff --git a/hw/pc.c b/hw/pc.c
index 1f61609..b6d4d7a 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -41,6 +41,11 @@
#include "device-assignment.h"
#include "kvm.h"
+#ifdef CONFIG_NUMA
+#include <numa.h>
+#include <numaif.h>
+#endif
+
/* output Bochs bios info messages */
//#define DEBUG_BIOS
@@ -874,6 +879,49 @@ void pc_cpus_init(const char *cpu_model)
}
}
+static void bind_numa(ram_addr_t ram_addr, ram_addr_t border_4g,
+ int below_4g)
+{
+#ifdef CONFIG_NUMA
+ int i, skip;
+ char* ram_ptr;
+ nodemask_t nodemask;
+ ram_addr_t len, ram_offset;
+
+ ram_ptr = qemu_get_ram_ptr(ram_addr);
+
+ ram_offset = 0;
+ skip = !below_4g;
+ for (i = 0; i < nb_numa_nodes; i++) {
+ len = node_mem[i];
+ if (ram_offset <= border_4g && ram_offset + len > border_4g) {
+ len = border_4g - ram_offset;
+ if (skip) {
+ ram_offset = 0;
+ len = node_mem[i] - len;
+ skip = 0;
+ }
+ }
+ if (skip && ram_offset + len <= border_4g) {
+ ram_offset += len;
+ continue;
+ }
+ if (!skip && node_pin[i] >= 0) {
+ nodemask_zero(&nodemask);
+ nodemask_set_compat(&nodemask, node_pin[i]);
+ if (mbind(ram_ptr + ram_offset, len, MPOL_BIND,
+ nodemask.n, NUMA_NUM_NODES, 0)) {
+ perror("mbind");
+ }
+ }
+ ram_offset += len;
+ if (below_4g && ram_offset >= border_4g)
+ return;
+ }
+#endif
+ return;
+}
+
void pc_memory_init(ram_addr_t ram_size,
const char *kernel_filename,
const char *kernel_cmdline,
@@ -906,6 +954,8 @@ void pc_memory_init(ram_addr_t ram_size,
below_4g_mem_size - 0x100000,
ram_addr + 0x100000);
+ bind_numa(ram_addr, below_4g_mem_size, 1);
+
/* above 4giga memory allocation */
if (above_4g_mem_size > 0) {
#if TARGET_PHYS_ADDR_BITS == 32
@@ -915,6 +965,7 @@ void pc_memory_init(ram_addr_t ram_size,
cpu_register_physical_memory(0x100000000ULL,
above_4g_mem_size,
ram_addr);
+ bind_numa(ram_addr, below_4g_mem_size, 0);
#endif
}
--
1.6.4
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-23 21:09 [PATCH 0/3][RFC] NUMA: add host side pinning Andre Przywara
` (2 preceding siblings ...)
2010-06-23 21:09 ` [PATCH 3/3] NUMA: realize NUMA memory pinning Andre Przywara
@ 2010-06-23 22:21 ` Anthony Liguori
2010-06-23 22:29 ` Alexander Graf
` (2 more replies)
3 siblings, 3 replies; 20+ messages in thread
From: Anthony Liguori @ 2010-06-23 22:21 UTC (permalink / raw)
To: Andre Przywara; +Cc: kvm, agraf
On 06/23/2010 04:09 PM, Andre Przywara wrote:
> Hi,
>
> these three patches add basic NUMA pinning to KVM. According to a user
> provided assignment parts of the guest's memory will be bound to different
> host nodes. This should increase performance in large virtual machines
> and on loaded hosts.
> These patches are quite basic (but work) and I send them as RFC to get
> some feedback before implementing stuff in vain.
>
> To use it you need to provide a guest NUMA configuration, this could be
> as simple as "-numa node -numa node" to give two nodes in the guest. Then
> you pin these nodes on a separate command line option to different host
> nodes: "-numa pin,nodeid=0,host=0 -numa pin,nodeid=1,host=2"
> This separation of host and guest config sounds a bit complicated, but
> was demanded last time I submitted a similar version.
> I refrained from binding the vCPUs to physical CPUs for now, but this
> can be added later with a "cpubind" option to "-numa pin,". Also this
> could be done from a management application by using sched_setaffinity().
>
> Please note that this is currently made for qemu-kvm, although I am not
> up-to-date regarding the current status of upstream QEMU's true SMP
> capabilities. The final patch will be made against upstream QEMU anyway.
> Also this is currently for Linux hosts (any other KVM hosts alive?) and
> for PC guests only. I think both can be fixed easily if someone requests
> it (and gives me a pointer to further information).
>
> Please comment on the approach in general and the implementation.
>
If we extended -mem-path to integrate with -numa such that a different
path could be used for each NUMA node (and we let an explicit file be
specified instead of just a directory), then, if I understand correctly,
we could use numactl without any specific integration in qemu. Does
this sound correct?
IOW:
qemu -numa node,mem=1G,nodeid=0,cpus=0-1,memfile=/dev/shm/node0.mem
-numa node,mem=2G,nodeid=1,cpus=1-2,memfile=/dev/shm/node1.mem
It's then possible to say:
numactl --file /dev/shm/node0.mem --interleave=0,1
numactl --file /dev/shm/node1.mem --membind=2
I think this approach is nicer because it gives the user a lot more
flexibility without having us chase other tools like numactl. For
instance, your patches only support pinning and not interleaving.
Regards,
Anthony Liguori
> Thanks and Regards,
> Andre.
>
>
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-23 22:21 ` [PATCH 0/3][RFC] NUMA: add host side pinning Anthony Liguori
@ 2010-06-23 22:29 ` Alexander Graf
2010-06-24 10:58 ` Andre Przywara
2010-06-24 6:44 ` Andre Przywara
2010-06-24 13:14 ` Andi Kleen
2 siblings, 1 reply; 20+ messages in thread
From: Alexander Graf @ 2010-06-23 22:29 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Andre Przywara, kvm
On 24.06.2010, at 00:21, Anthony Liguori wrote:
> On 06/23/2010 04:09 PM, Andre Przywara wrote:
>> Hi,
>>
>> these three patches add basic NUMA pinning to KVM. According to a user
>> provided assignment parts of the guest's memory will be bound to different
>> host nodes. This should increase performance in large virtual machines
>> and on loaded hosts.
>> These patches are quite basic (but work) and I send them as RFC to get
>> some feedback before implementing stuff in vain.
>>
>> To use it you need to provide a guest NUMA configuration, this could be
>> as simple as "-numa node -numa node" to give two nodes in the guest. Then
>> you pin these nodes on a separate command line option to different host
>> nodes: "-numa pin,nodeid=0,host=0 -numa pin,nodeid=1,host=2"
>> This separation of host and guest config sounds a bit complicated, but
>> was demanded last time I submitted a similar version.
>> I refrained from binding the vCPUs to physical CPUs for now, but this
>> can be added later with a "cpubind" option to "-numa pin,". Also this
>> could be done from a management application by using sched_setaffinity().
>>
>> Please note that this is currently made for qemu-kvm, although I am not
>> up-to-date regarding the current status of upstream QEMU's true SMP
>> capabilities. The final patch will be made against upstream QEMU anyway.
>> Also this is currently for Linux hosts (any other KVM hosts alive?) and
>> for PC guests only. I think both can be fixed easily if someone requests
>> it (and gives me a pointer to further information).
>>
>> Please comment on the approach in general and the implementation.
>>
>
> If we extended integrated -mem-path with -numa such that a different path could be used with each numa node (and we let an explicit file be specified instead of just a directory), then if I understand correctly, we could use numactl without any specific integration in qemu. Does this sound correct?
>
> IOW:
>
> qemu -numa node,mem=1G,nodeid=0,cpus=0-1,memfile=/dev/shm/node0.mem -numa node,mem=2G,nodeid=1,cpus=1-2,memfile=/dev/shm/node1.mem
>
> It's then possible to say:
>
> numactl --file /dev/shm/node0.mem --interleave=0,1
> numactl --file /dev/shm/node1.mem --membind=2
>
> I think this approach is nicer because it gives the user a lot more flexibility without having us chase other tools like numactl. For instance, your patches only support pinning and not interleaving.
Interesting idea.
So who would create the /dev/shm/nodeXX files? I can imagine starting numactl before qemu, even though that's cumbersome. I don't think it's feasible to start numactl after qemu is running. That'd involve way too much magic; I'd prefer qemu to call numactl itself.
Alex
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-23 22:21 ` [PATCH 0/3][RFC] NUMA: add host side pinning Anthony Liguori
2010-06-23 22:29 ` Alexander Graf
@ 2010-06-24 6:44 ` Andre Przywara
2010-06-24 13:14 ` Andi Kleen
2 siblings, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2010-06-24 6:44 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm, agraf
Anthony Liguori wrote:
> On 06/23/2010 04:09 PM, Andre Przywara wrote:
>> Hi,
>>
>> these three patches add basic NUMA pinning to KVM. According to a user
>> provided assignment parts of the guest's memory will be bound to different
>> host nodes. This should increase performance in large virtual machines
>> and on loaded hosts.
>> These patches are quite basic (but work) and I send them as RFC to get
>> some feedback before implementing stuff in vain.
>>
>> ....
>>
>> Please comment on the approach in general and the implementation.
>>
>
> If we extended integrated -mem-path with -numa such that a different
> path could be used with each numa node (and we let an explicit file be
> specified instead of just a directory), then if I understand correctly,
> we could use numactl without any specific integration in qemu. Does
> this sound correct?
In general, yes. But I consider the whole hugetlbfs approach broken.
Since 2.6.32 or so you can use MAP_HUGETLB together with MAP_ANONYMOUS
in mmap() to avoid hugetlbfs altogether, and I bet that the future will
bring transparent hugepages anyway (RHEL6 already has them).
I am not sure whether you want to keep the -memfile option and extend it
with some pseudo-compat glue (faked directory names to be interpreted by
QEMU) to make it work in the future. But in these cases the external
numactl approach would not work anymore anyway.
> IOW:
>
> qemu -numa node,mem=1G,nodeid=0,cpus=0-1,memfile=/dev/shm/node0.mem
> -numa node,mem=2G,nodeid=1,cpus=1-2,memfile=/dev/shm/node1.mem
>
> It's then possible to say:
>
> numactl --file /dev/shm/node0.mem --interleave=0,1
> numactl --file /dev/shm/node1.mem --membind=2
>
> I think this approach is nicer because it gives the user a lot more
> flexibility without having us chase other tools like numactl. For
> instance, your patches only support pinning and not interleaving.
That's right. I put it on the list ;-)
Thanks for the good hint on the huge pages issue, as this is not
properly handled in the current implementation. I will think about a
proper way to handle this, but would still opt for an (at least
partially) QEMU-integrated solution.
Still open for discussion, though, as I see your point about avoiding a
duplicate NUMA implementation between numactl and QEMU.
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 488-3567-12
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-23 22:29 ` Alexander Graf
@ 2010-06-24 10:58 ` Andre Przywara
2010-06-24 11:12 ` Avi Kivity
0 siblings, 1 reply; 20+ messages in thread
From: Andre Przywara @ 2010-06-24 10:58 UTC (permalink / raw)
To: Alexander Graf; +Cc: Anthony Liguori, kvm
Alexander Graf wrote:
> On 24.06.2010, at 00:21, Anthony Liguori wrote:
>
>> On 06/23/2010 04:09 PM, Andre Przywara wrote:
>>> Hi,
>>>
>>> these three patches add basic NUMA pinning to KVM. According to a user
>>> provided assignment parts of the guest's memory will be bound to different
>>> host nodes. This should increase performance in large virtual machines
>>> and on loaded hosts.
>>> These patches are quite basic (but work) and I send them as RFC to get
>>> some feedback before implementing stuff in vain.
>>>
>>> To use it you need to provide a guest NUMA configuration, this could be
>>> as simple as "-numa node -numa node" to give two nodes in the guest. Then
>>> you pin these nodes on a separate command line option to different host
>>> nodes: "-numa pin,nodeid=0,host=0 -numa pin,nodeid=1,host=2"
>>> This separation of host and guest config sounds a bit complicated, but
>>> was demanded last time I submitted a similar version.
>>> I refrained from binding the vCPUs to physical CPUs for now, but this
>>> can be added later with a "cpubind" option to "-numa pin,". Also this
>>> could be done from a management application by using sched_setaffinity().
>>>
>>> Please note that this is currently made for qemu-kvm, although I am not
>>> up-to-date regarding the current status of upstream QEMU's true SMP
>>> capabilities. The final patch will be made against upstream QEMU anyway.
>>> Also this is currently for Linux hosts (any other KVM hosts alive?) and
>>> for PC guests only. I think both can be fixed easily if someone requests
>>> it (and gives me a pointer to further information).
>>>
>>> Please comment on the approach in general and the implementation.
>>>
>> If we extended integrated -mem-path with -numa such that a different path could be used with each numa node (and we let an explicit file be specified instead of just a directory), then if I understand correctly, we could use numactl without any specific integration in qemu. Does this sound correct?
>>
>> IOW:
>>
>> qemu -numa node,mem=1G,nodeid=0,cpus=0-1,memfile=/dev/shm/node0.mem -numa node,mem=2G,nodeid=1,cpus=1-2,memfile=/dev/shm/node1.mem
>>
>> It's then possible to say:
>>
>> numactl --file /dev/shm/node0.mem --interleave=0,1
>> numactl --file /dev/shm/node1.mem --membind=2
>>
>> I think this approach is nicer because it gives the user a lot more flexibility without having us chase other tools like numactl. For instance, your patches only support pinning and not interleaving.
>
> Interesting idea.
>
> So who would create the /dev/shm/nodeXX files?
Currently it is QEMU. It creates a somewhat unique filename, opens and
unlinks it. The difference would be to name the file after the option
and to not unlink it.
> I can imagine starting numactl before qemu, even though that's
> cumbersome. I don't think it's feasible to start numactl after
> qemu is running. That'd involve way too much magic; I'd prefer
> qemu to call numactl itself.
With the current code the files would not exist before QEMU allocates
RAM, and after that QEMU could already have touched pages before numactl
sets the policy.
To avoid this I'd like to see the pinning done from within QEMU. I am
not sure whether calling numactl via system() and friends is OK; I'd
prefer to run the syscalls directly (like in patch 3/3) and pull the
necessary options into the -numa pin,... command line. We could mimic
numactl's syntax here.
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-24 10:58 ` Andre Przywara
@ 2010-06-24 11:12 ` Avi Kivity
2010-06-24 11:34 ` Andre Przywara
2010-06-28 16:17 ` Anthony Liguori
0 siblings, 2 replies; 20+ messages in thread
From: Avi Kivity @ 2010-06-24 11:12 UTC (permalink / raw)
To: Andre Przywara; +Cc: Alexander Graf, Anthony Liguori, kvm
On 06/24/2010 01:58 PM, Andre Przywara wrote:
>> So who would create the /dev/shm/nodeXX files?
>
> Currently it is QEMU. It creates a somewhat unique filename, opens and
> unlinks it. The difference would be to name the file after the option
> and to not unlink it.
>
> > I can imagine starting numactl before qemu, even though that's
> > cumbersome. I don't think it's feasible to start numactl after
> > qemu is running. That'd involve way too much magic; I'd prefer
> > qemu to call numactl itself.
> Using the current code the files would not exist before QEMU allocated
> RAM, and after that it could already touch pages before numactl set
> the policy.
Non-anonymous memory doesn't work well with ksm and transparent
hugepages. Is it possible to use anonymous memory rather than file-backed?
> To avoid this I'd like to see the pinning done from within QEMU. I am
> not sure whether calling numactl via system() and friends is OK, I'd
> prefer to run the syscalls directly (like in patch 3/3) and pull the
> necessary options into the -numa pin,... command line. We could mimic
> numactl's syntax here.
Definitely not use system(), but IIRC numactl has a library interface?
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-24 11:12 ` Avi Kivity
@ 2010-06-24 11:34 ` Andre Przywara
2010-06-24 11:42 ` Avi Kivity
2010-06-25 11:00 ` Jes Sorensen
2010-06-28 16:17 ` Anthony Liguori
1 sibling, 2 replies; 20+ messages in thread
From: Andre Przywara @ 2010-06-24 11:34 UTC (permalink / raw)
To: Avi Kivity; +Cc: Alexander Graf, Anthony Liguori, kvm
Avi Kivity wrote:
> On 06/24/2010 01:58 PM, Andre Przywara wrote:
>>> So who would create the /dev/shm/nodeXX files?
>> Currently it is QEMU. It creates a somewhat unique filename, opens and
>> unlinks it. The difference would be to name the file after the option
>> and to not unlink it.
>>
>>> I can imagine starting numactl before qemu, even though that's
>>> cumbersome. I don't think it's feasible to start numactl after
>>> qemu is running. That'd involve way too much magic; I'd prefer
>>> qemu to call numactl itself.
>> Using the current code the files would not exist before QEMU allocated
>> RAM, and after that it could already touch pages before numactl set
>> the policy.
>
> Non-anonymous memory doesn't work well with ksm and transparent
> hugepages. Is it possible to use anonymous memory rather than file backed?
I'd prefer non-file backed, too. But that is how the current huge pages
implementation is done. We could use MAP_HUGETLB and declare NUMA _and_
huge pages as 2.6.32+ only. Unfortunately I didn't find an easy way to
detect the presence of the MAP_HUGETLB flag. If the kernel does not
support it, it seems that mmap silently ignores it and uses 4KB pages
instead.
>> To avoid this I'd like to see the pinning done from within QEMU. I am
>> not sure whether calling numactl via system() and friends is OK, I'd
>> prefer to run the syscalls directly (like in patch 3/3) and pull the
>> necessary options into the -numa pin,... command line. We could mimic
>> numactl's syntax here.
>
> Definitely not use system(), but IIRC numactl has a library interface?
Right, that is what I include in patch 3/3 and use. I got the impression
Anthony wanted to avoid reimplementing parts of numactl, especially
enabling the full flexibility of the command line interface (like
specifying nodes, policies and interleaving).
I want QEMU to use the library and pull the necessary options into the
-numa pin,... parsing, even if this means duplicating numactl functionality.
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-24 11:34 ` Andre Przywara
@ 2010-06-24 11:42 ` Avi Kivity
2010-06-28 16:20 ` Anthony Liguori
2010-06-25 11:00 ` Jes Sorensen
1 sibling, 1 reply; 20+ messages in thread
From: Avi Kivity @ 2010-06-24 11:42 UTC (permalink / raw)
To: Andre Przywara; +Cc: Alexander Graf, Anthony Liguori, kvm
On 06/24/2010 02:34 PM, Andre Przywara wrote:
>> Non-anonymous memory doesn't work well with ksm and transparent
>> hugepages. Is it possible to use anonymous memory rather than file
>> backed?
>
> I'd prefer non-file backed, too. But that is how the current huge
> pages implementation is done. We could use MAP_HUGETLB and declare
> NUMA _and_ huge pages as 2.6.32+ only. Unfortunately I didn't find an
> easy way to detect the presence of the MAP_HUGETLB flag. If the kernel
> does not support it, it seems that mmap silently ignores it and uses
> 4KB pages instead.
That sucks; unfortunately it is normal practice. However, it is a soft
failure: everything works, just a bit slower. So it's probably acceptable.
>>> To avoid this I'd like to see the pinning done from within QEMU. I
>>> am not sure whether calling numactl via system() and friends is OK,
>>> I'd prefer to run the syscalls directly (like in patch 3/3) and pull
>>> the necessary options into the -numa pin,... command line. We could
>>> mimic numactl's syntax here.
>>
>> Definitely not use system(), but IIRC numactl has a library interface?
> Right, that is what I include in patch 3/3 and use. I got the
> impression Anthony wanted to avoid reimplementing parts of numactl,
> especially enabling the full flexibility of the command line interface
> (like specifying nodes, policies and interleaving).
> I want QEMU to use the library and pull the necessary options into the
> -numa pin,... parsing, even if this means duplicating numactl
> functionality.
>
I agree with that. It's a lot easier to use a single tool than to try
to integrate things yourself, the unix tradition of grep | sort | uniq
-c | sort -n notwithstanding. Especially when one of the tools is qemu.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-23 22:21 ` [PATCH 0/3][RFC] NUMA: add host side pinning Anthony Liguori
2010-06-23 22:29 ` Alexander Graf
2010-06-24 6:44 ` Andre Przywara
@ 2010-06-24 13:14 ` Andi Kleen
2 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2010-06-24 13:14 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Andre Przywara, kvm, agraf
Anthony Liguori <anthony@codemonkey.ws> writes:
>
> If we extended integrated -mem-path with -numa such that a different
> path could be used with each numa node (and we let an explicit file be
> specified instead of just a directory), then if I understand
> correctly, we could use numactl without any specific integration in
> qemu. Does this sound correct?
It's a bit tricky to coordinate, because numactl policy only helps
before the first fault (unless you want to migrate pages, but that has
more overhead), and if you run numactl in parallel with qemu you never
know who faults first. So you would need another step to pre-create the
files before starting qemu.
Another issue with using tmpfs this way is that you first need to resize
it to be larger than 0.5*RAM. So more configuration hassle.
Overall it would be rather a lot of steps this way. I guess most people
would put them into a wrapper, but why not have that wrapper in qemu
directly? Supporting interleave too would be rather straightforward.
Also, a lot of the things you could do with numactl on shm can be done
after the fact with cpusets as well.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-24 11:34 ` Andre Przywara
2010-06-24 11:42 ` Avi Kivity
@ 2010-06-25 11:00 ` Jes Sorensen
2010-06-25 11:06 ` Andre Przywara
1 sibling, 1 reply; 20+ messages in thread
From: Jes Sorensen @ 2010-06-25 11:00 UTC (permalink / raw)
To: Andre Przywara; +Cc: Avi Kivity, Alexander Graf, Anthony Liguori, kvm
On 06/24/10 13:34, Andre Przywara wrote:
> Avi Kivity wrote:
>> On 06/24/2010 01:58 PM, Andre Przywara wrote:
>> Non-anonymous memory doesn't work well with ksm and transparent
>> hugepages. Is it possible to use anonymous memory rather than file
>> backed?
> I'd prefer non-file backed, too. But that is how the current huge pages
> implementation is done. We could use MAP_HUGETLB and declare NUMA _and_
> huge pages as 2.6.32+ only. Unfortunately I didn't find an easy way to
> detect the presence of the MAP_HUGETLB flag. If the kernel does not
> support it, it seems that mmap silently ignores it and uses 4KB pages
> instead.
Bit behind on the mailing list, but I think this look very promising.
I really think it makes more sense to make QEMU aware of the NUMA setup
as well, rather than relying on numactl to do the work outside.
One thing you need to consider is what happens with migration once a
user specifies -numa. IMHO it is acceptable to simply disable migration
for the given guest.
Cheers,
Jes
PS: Are you planning on submitting anything to Linux Plumbers Conference
about this? :)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-25 11:00 ` Jes Sorensen
@ 2010-06-25 11:06 ` Andre Przywara
2010-06-25 11:37 ` Jes Sorensen
0 siblings, 1 reply; 20+ messages in thread
From: Andre Przywara @ 2010-06-25 11:06 UTC (permalink / raw)
To: Jes Sorensen; +Cc: Avi Kivity, Alexander Graf, Anthony Liguori, kvm
Jes Sorensen wrote:
> On 06/24/10 13:34, Andre Przywara wrote:
>> Avi Kivity wrote:
>>> On 06/24/2010 01:58 PM, Andre Przywara wrote:
>>> Non-anonymous memory doesn't work well with ksm and transparent
>>> hugepages. Is it possible to use anonymous memory rather than file
>>> backed?
>> I'd prefer non-file backed, too. But that is how the current huge pages
>> implementation is done. We could use MAP_HUGETLB and declare NUMA _and_
>> huge pages as 2.6.32+ only. Unfortunately I didn't find an easy way to
>> detect the presence of the MAP_HUGETLB flag. If the kernel does not
>> support it, it seems that mmap silently ignores it and uses 4KB pages
>> instead.
>
> Bit behind on the mailing list, but I think this looks very promising.
>
> I really think it makes more sense to make QEMU aware of the NUMA setup
> as well, rather than relying on numactl to do the work outside.
>
> One thing you need to consider is what happens with migration once a
> user specifies -numa. IMHO it is acceptable to simply disable migration
> for the given guest.
Is that really a problem? You create the guest on the target with a NUMA
setup specific to the target machine. That could mean that you pin
multiple guest nodes to the same host node, but that shouldn't break
anything, right? The guest part can (and should be!) migrated along
with all the other device state. I think this is still missing from the
current implementation.
>
> Cheers,
> Jes
>
> PS: Are you planning on submitting anything to Linux Plumbers Conference
> about this? :)
Yes, I was planning to submit a proposal, as I saw NUMA mentioned in the
topics list. AFAIK the deadline is July 19th, right? That gives me
another week after my vacation (for which I leave in a few minutes).
Regards,
Andre.
--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-25 11:06 ` Andre Przywara
@ 2010-06-25 11:37 ` Jes Sorensen
0 siblings, 0 replies; 20+ messages in thread
From: Jes Sorensen @ 2010-06-25 11:37 UTC (permalink / raw)
To: Andre Przywara; +Cc: Avi Kivity, Alexander Graf, Anthony Liguori, kvm
On 06/25/10 13:06, Andre Przywara wrote:
> Jes Sorensen wrote:
>> On 06/24/10 13:34, Andre Przywara wrote:
>> I really think it makes more sense to make QEMU aware of the NUMA setup
>> as well, rather than relying on numactl to do the work outside.
>>
>> One thing you need to consider is what happens with migration once a
>> user specifies -numa. IMHO it is acceptable to simply disable migration
>> for the given guest.
> Is that really a problem? You create the guest on the target with a NUMA
> setup specific to the target machine. That could mean that you pin
> multiple guest nodes to the same host node, but that shouldn't break
> anything, right? The guest part can (and should be!) migrated along
> with all the other device state. I think this is still missing from the
> current implementation.
It may be hard to guarantee the memory layout on the target machine; it
may have a completely different topology. The numa bindings ought to go
into the state and be checked against the target machine's state, but
for instance you could be trying to bind things to nodes 7-8 on the first
host while the migration target only has 2 nodes, but plenty of memory. Or
you use more nodes on the first host than you have on the second. It's a
very complicated matrix to try and match.
>> PS: Are you planning on submitting anything to Linux Plumbers Conference
>> about this? :)
> Yes, I was planning to submit a proposal, as I saw NUMA mentioned in the
> topics list. AFAIK the deadline is July 19th, right? That gives me
> another week after my vacation (for which I leave in a few minutes).
Excellent! Yes, it should still be July 19th.
Enjoy your vacation!
Jes
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-24 11:12 ` Avi Kivity
2010-06-24 11:34 ` Andre Przywara
@ 2010-06-28 16:17 ` Anthony Liguori
2010-06-29 9:48 ` Avi Kivity
1 sibling, 1 reply; 20+ messages in thread
From: Anthony Liguori @ 2010-06-28 16:17 UTC (permalink / raw)
To: Avi Kivity; +Cc: Andre Przywara, Alexander Graf, kvm
On 06/24/2010 06:12 AM, Avi Kivity wrote:
> On 06/24/2010 01:58 PM, Andre Przywara wrote:
>>> So who would create the /dev/shm/nodeXX files?
>>
>> Currently it is QEMU. It creates a somewhat unique filename, opens
>> and unlinks it. The difference would be to name the file after the
>> option and to not unlink it.
>>
>> > I can imagine starting numactl before qemu, even though that's
>> > cumbersome. I don't think it's feasible to start numactl after
>> > qemu is running. That'd involve way too much magic that I'd prefer
>> > qemu to call numactl itself.
>> Using the current code the files would not exist before QEMU
>> allocated RAM, and after that it could already touch pages before
>> numactl set the policy.
>
> Non-anonymous memory doesn't work well with ksm and transparent
> hugepages. Is it possible to use anonymous memory rather than file
> backed?
You aren't going to be doing NUMA pinning and KSM AFAICT.
Regards,
Anthony Liguori
>> To avoid this I'd like to see the pinning done from within QEMU. I am
>> not sure whether calling numactl via system() and friends is OK, I'd
>> prefer to run the syscalls directly (like in patch 3/3) and pull the
>> necessary options into the -numa pin,... command line. We could mimic
>> numactl's syntax here.
>
> Definitely not use system(), but IIRC numactl has a library interface?
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-24 11:42 ` Avi Kivity
@ 2010-06-28 16:20 ` Anthony Liguori
2010-06-28 16:26 ` Alexander Graf
2010-06-29 9:46 ` Avi Kivity
0 siblings, 2 replies; 20+ messages in thread
From: Anthony Liguori @ 2010-06-28 16:20 UTC (permalink / raw)
To: Avi Kivity; +Cc: Andre Przywara, Alexander Graf, kvm
On 06/24/2010 06:42 AM, Avi Kivity wrote:
> On 06/24/2010 02:34 PM, Andre Przywara wrote:
>>> Non-anonymous memory doesn't work well with ksm and transparent
>>> hugepages. Is it possible to use anonymous memory rather than file
>>> backed?
>>
>> I'd prefer non-file backed, too. But that is how the current huge
>> pages implementation is done. We could use MAP_HUGETLB and declare
>> NUMA _and_ huge pages as 2.6.32+ only. Unfortunately I didn't find an
>> easy way to detect the presence of the MAP_HUGETLB flag. If the
>> kernel does not support it, it seems that mmap silently ignores it
>> and uses 4KB pages instead.
>
> That sucks, unfortunately it is normal practice. However it is a soft
> failure, everything works just a bit slower. So it's probably
> acceptable.
>
>>>> To avoid this I'd like to see the pinning done from within QEMU. I
>>>> am not sure whether calling numactl via system() and friends is OK,
>>>> I'd prefer to run the syscalls directly (like in patch 3/3) and
>>>> pull the necessary options into the -numa pin,... command line. We
>>>> could mimic numactl's syntax here.
>>>
>>> Definitely not use system(), but IIRC numactl has a library interface?
>> Right, that is what I include in patch 3/3 and use. I got the
>> impression Anthony wanted to avoid reimplementing parts of numactl,
>> especially enabling the full flexibility of the command line
>> interface (like specifying nodes, policies and interleaving).
>> I want QEMU to use the library and pull the necessary options into
>> the -numa pin,... parsing, even if this means duplicating numactl
>> functionality.
>>
>
> I agree with that. It's a lot easier to use a single tool than to try
> to integrate things yourself, the unix tradition of grep | sort | uniq
> -c | sort -n notwithstanding. Especially when one of the tools is qemu.
I couldn't disagree more here. This is why we don't support CPU pinning
and instead provide PID information for each VCPU thread.
The folks that want to use pinning are not novice users. They are not
going to be happy unless they can make full use of existing tools. That
means replicating all of numactl's functionality (which is not what the
current patches do) or enabling numactl to be used with a guest.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-28 16:20 ` Anthony Liguori
@ 2010-06-28 16:26 ` Alexander Graf
2010-06-29 9:46 ` Avi Kivity
1 sibling, 0 replies; 20+ messages in thread
From: Alexander Graf @ 2010-06-28 16:26 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Avi Kivity, Andre Przywara, kvm
Anthony Liguori wrote:
> On 06/24/2010 06:42 AM, Avi Kivity wrote:
>> On 06/24/2010 02:34 PM, Andre Przywara wrote:
>>>> Non-anonymous memory doesn't work well with ksm and transparent
>>>> hugepages. Is it possible to use anonymous memory rather than file
>>>> backed?
>>>
>>> I'd prefer non-file backed, too. But that is how the current huge
>>> pages implementation is done. We could use MAP_HUGETLB and declare
>>> NUMA _and_ huge pages as 2.6.32+ only. Unfortunately I didn't find
>>> an easy way to detect the presence of the MAP_HUGETLB flag. If the
>>> kernel does not support it, it seems that mmap silently ignores it
>>> and uses 4KB pages instead.
>>
>> That sucks, unfortunately it is normal practice. However it is a
>> soft failure, everything works just a bit slower. So it's probably
>> acceptable.
>>
>>>>> To avoid this I'd like to see the pinning done from within QEMU. I
>>>>> am not sure whether calling numactl via system() and friends is
>>>>> OK, I'd prefer to run the syscalls directly (like in patch 3/3)
>>>>> and pull the necessary options into the -numa pin,... command
>>>>> line. We could mimic numactl's syntax here.
>>>>
>>>> Definitely not use system(), but IIRC numactl has a library interface?
>>> Right, that is what I include in patch 3/3 and use. I got the
>>> impression Anthony wanted to avoid reimplementing parts of numactl,
>>> especially enabling the full flexibility of the command line
>>> interface (like specifying nodes, policies and interleaving).
>>> I want QEMU to use the library and pull the necessary options into
>>> the -numa pin,... parsing, even if this means duplicating numactl
>>> functionality.
>>>
>>
>> I agree with that. It's a lot easier to use a single tool than to
>> try to integrate things yourself, the unix tradition of grep | sort |
>> uniq -c | sort -n notwithstanding. Especially when one of the tools
>> is qemu.
>
> I couldn't disagree more here. This is why we don't support CPU pinning
> and instead provide PID information for each VCPU thread.
>
> The folks that want to use pinning are not novice users. They are not
> going to be happy unless they can make full use of existing tools.
> That means replicating all of numactl's functionality (which is not
> what the current patches do) or enabling numactl to be used with a guest.
So how about some QMP plumbing that would allow numactl to create the
VMs at defined ranges? So you'd basically get numactl --run-qemu --
qemu-kvm -blah -foo
Alex
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-28 16:20 ` Anthony Liguori
2010-06-28 16:26 ` Alexander Graf
@ 2010-06-29 9:46 ` Avi Kivity
1 sibling, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2010-06-29 9:46 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Andre Przywara, Alexander Graf, kvm
On 06/28/2010 07:20 PM, Anthony Liguori wrote:
>>
>>>>> To avoid this I'd like to see the pinning done from within QEMU. I
>>>>> am not sure whether calling numactl via system() and friends is
>>>>> OK, I'd prefer to run the syscalls directly (like in patch 3/3)
>>>>> and pull the necessary options into the -numa pin,... command
>>>>> line. We could mimic numactl's syntax here.
>>>>
>>>> Definitely not use system(), but IIRC numactl has a library interface?
>>> Right, that is what I include in patch 3/3 and use. I got the
>>> impression Anthony wanted to avoid reimplementing parts of numactl,
>>> especially enabling the full flexibility of the command line
>>> interface (like specifying nodes, policies and interleaving).
>>> I want QEMU to use the library and pull the necessary options into
>>> the -numa pin,... parsing, even if this means duplicating numactl
>>> functionality.
>>>
>>
>> I agree with that. It's a lot easier to use a single tool than to
>> try to integrate things yourself, the unix tradition of grep | sort |
>> uniq -c | sort -n notwithstanding. Especially when one of the tools
>> is qemu.
>
>
> I couldn't disagree more here. This is why we don't support CPU pinning
> and instead provide PID information for each VCPU thread.
Good point. That also allows setting priority, etc.
>
> The folks that want to use pinning are not novice users. They are not
> going to be happy unless they can make full use of existing tools.
> That means replicating all of numactl's functionality (which is not
> what the current patches do) or enabling numactl to be used with a guest.
>
Yeah. Unfortunately, that also forces us to use non-anonymous memory.
So it isn't just where to put the functionality, it also has side effects.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3][RFC] NUMA: add host side pinning
2010-06-28 16:17 ` Anthony Liguori
@ 2010-06-29 9:48 ` Avi Kivity
0 siblings, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2010-06-29 9:48 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Andre Przywara, Alexander Graf, kvm
On 06/28/2010 07:17 PM, Anthony Liguori wrote:
> On 06/24/2010 06:12 AM, Avi Kivity wrote:
>> On 06/24/2010 01:58 PM, Andre Przywara wrote:
>>>> So who would create the /dev/shm/nodeXX files?
>>>
>>> Currently it is QEMU. It creates a somewhat unique filename, opens
>>> and unlinks it. The difference would be to name the file after the
>>> option and to not unlink it.
>>>
>>> > I can imagine starting numactl before qemu, even though that's
>>> > cumbersome. I don't think it's feasible to start numactl after
>>> > qemu is running. That'd involve way too much magic that I'd prefer
>>> > qemu to call numactl itself.
>>> Using the current code the files would not exist before QEMU
>>> allocated RAM, and after that it could already touch pages before
>>> numactl set the policy.
>>
>> Non-anonymous memory doesn't work well with ksm and transparent
>> hugepages. Is it possible to use anonymous memory rather than file
>> backed?
>
> You aren't going to be doing NUMA pinning and KSM AFAICT.
What about transparent hugepages?
Conceptually, all of this belongs in the scheduler, so whatever we do
ends up a poorly integrated hack.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2010-06-29 9:48 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-23 21:09 [PATCH 0/3][RFC] NUMA: add host side pinning Andre Przywara
2010-06-23 21:09 ` [PATCH 1/3] NUMA: add Linux libnuma detection Andre Przywara
2010-06-23 21:09 ` [PATCH 2/3] NUMA: add parsing of host NUMA pin option Andre Przywara
2010-06-23 21:09 ` [PATCH 3/3] NUMA: realize NUMA memory pinning Andre Przywara
2010-06-23 22:21 ` [PATCH 0/3][RFC] NUMA: add host side pinning Anthony Liguori
2010-06-23 22:29 ` Alexander Graf
2010-06-24 10:58 ` Andre Przywara
2010-06-24 11:12 ` Avi Kivity
2010-06-24 11:34 ` Andre Przywara
2010-06-24 11:42 ` Avi Kivity
2010-06-28 16:20 ` Anthony Liguori
2010-06-28 16:26 ` Alexander Graf
2010-06-29 9:46 ` Avi Kivity
2010-06-25 11:00 ` Jes Sorensen
2010-06-25 11:06 ` Andre Przywara
2010-06-25 11:37 ` Jes Sorensen
2010-06-28 16:17 ` Anthony Liguori
2010-06-29 9:48 ` Avi Kivity
2010-06-24 6:44 ` Andre Przywara
2010-06-24 13:14 ` Andi Kleen