* [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes
@ 2013-06-18  8:09 Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 1/7] Add numa_info structure to contain numa nodes info Wanlong Gao
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong


As you know, QEMU currently cannot control where its memory is allocated
on the host, which may cause a performance regression when the guest
accesses memory across host NUMA nodes.
Worse, if PCI passthrough is used, the directly attached device uses DMA
transfers between the device and the QEMU process, so all pages of the
guest are pinned by get_user_pages().

KVM_ASSIGN_PCI_DEVICE ioctl
  kvm_vm_ioctl_assign_device()
    =>kvm_assign_device()
      => kvm_iommu_map_memslots()
        => kvm_iommu_map_pages()
           => kvm_pin_pages()

So, with a directly attached device, the page count of every guest page
is raised by one and page migration no longer works. AutoNUMA does not
work either.

Therefore, we should set the memory allocation policy of the guest nodes
before the pages are actually mapped.

With this patch set, the memory policy of guest nodes can be set as
follows:

 -numa node,nodeid=0,mem=1024,cpus=0,membind=0-1
 -numa node,nodeid=1,mem=1024,cpus=1,interleave=1

The supported format is "{membind|interleave|preferred}=[+|!]{all|N-N}".

Patch 5/7 adds a QMP command "set-mpol" to set the memory policy of a
guest node:
    set-mpol nodeid=0 mpol=membind nodemask=0-1

Patch 6/7 adds a monitor command "set-mpol" that does the same.

With patch 7/7, the monitor command "info numa" shows the current memory
policy of each guest node, for example:

    (qemu) info numa
    2 nodes
    node 0 cpus: 0
    node 0 size: 1024 MB
    node 0 mempolicy: membind=0,1
    node 1 cpus: 1
    node 1 size: 1024 MB
    node 1 mempolicy: interleave=1



Wanlong Gao (7):
  Add numa_info structure to contain numa nodes info
  Add Linux libnuma detection
  NUMA: parse guest numa nodes memory policy
  NUMA: set guest numa nodes memory policy
  NUMA: add qmp command set-mpol to set memory policy for NUMA node
  NUMA: add hmp command set-mpol
  NUMA: show host memory policy info in info numa command

 configure               |  32 +++++++++++
 cpus.c                  | 149 +++++++++++++++++++++++++++++++++++++++++++++++-
 hmp-commands.hx         |  16 ++++++
 hmp.c                   |  22 +++++++
 hmp.h                   |   1 +
 hw/i386/pc.c            |   4 +-
 include/sysemu/sysemu.h |  17 +++++-
 monitor.c               |  44 +++++++++++++-
 qapi-schema.json        |  13 +++++
 qmp-commands.hx         |  35 ++++++++++++
 vl.c                    | 102 ++++++++++++++++++++++++++++-----
 11 files changed, 416 insertions(+), 19 deletions(-)

-- 
1.8.3.rc2.10.g0c2b1cf


* [Qemu-devel] [PATCH 1/7] Add numa_info structure to contain numa nodes info
  2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
@ 2013-06-18  8:09 ` Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 2/7] Add Linux libnuma detection Wanlong Gao
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong

Add the numa_info structure to hold the memory size and VCPU information
of each NUMA node, and, in later patches, the host memory policy of each
node.
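
For illustration only (not part of the patch): a minimal standalone
sketch of the per-node layout and of the default distribution that vl.c
applies when no sizes or CPUs are given. All sketch_* names and the
helper bitmap functions are hypothetical stand-ins for QEMU's
struct node_info and qemu/bitmap.h, so that the snippet builds outside
the QEMU tree.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_NODES 64
#define SKETCH_MAX_CPUS  255
#define BITS_PER_WORD    (8 * sizeof(unsigned long))
#define BITMAP_WORDS(n)  (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Stand-in for struct node_info: memory size plus an embedded CPU
 * bitmap (no separate allocation, unlike the old node_cpumask[]). */
struct sketch_node_info {
    uint64_t node_mem;
    unsigned long node_cpu[BITMAP_WORDS(SKETCH_MAX_CPUS)];
};

static void sketch_set_bit(unsigned int nr, unsigned long *map)
{
    map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

static int sketch_test_bit(unsigned int nr, const unsigned long *map)
{
    return (int)((map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL);
}

/* Give every node but the last an equal share rounded down to 8 MB;
 * the last node receives whatever is left. */
static void sketch_split_memory(struct sketch_node_info *nodes, int nb_nodes,
                                uint64_t ram_size)
{
    uint64_t used = 0;
    int i;

    for (i = 0; i < nb_nodes - 1; i++) {
        nodes[i].node_mem = (ram_size / nb_nodes) &
                            ~((UINT64_C(1) << 23) - 1);
        used += nodes[i].node_mem;
    }
    nodes[i].node_mem = ram_size - used;
}

/* Assign VCPUs to nodes round-robin. */
static void sketch_assign_cpus(struct sketch_node_info *nodes, int nb_nodes,
                               int max_cpus)
{
    for (int i = 0; i < max_cpus; i++) {
        sketch_set_bit((unsigned int)i, nodes[i % nb_nodes].node_cpu);
    }
}
```

With three nodes and 1000 MB of RAM, the first two nodes would get
328 MB each and the last one 344 MB.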

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 cpus.c                  |  2 +-
 hw/i386/pc.c            |  4 ++--
 include/sysemu/sysemu.h |  8 ++++++--
 monitor.c               |  2 +-
 vl.c                    | 26 +++++++++++++-------------
 5 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/cpus.c b/cpus.c
index c8bc8ad..e123d3f 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1195,7 +1195,7 @@ void set_numa_modes(void)
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
         cpu = ENV_GET_CPU(env);
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (test_bit(cpu->cpu_index, node_cpumask[i])) {
+            if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
                 cpu->numa_node = i;
             }
         }
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e0fbb86..935241b 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -651,14 +651,14 @@ static FWCfgState *bochs_bios_init(void)
         unsigned int apic_id = x86_cpu_apic_id_from_index(i);
         assert(apic_id < apic_id_limit);
         for (j = 0; j < nb_numa_nodes; j++) {
-            if (test_bit(i, node_cpumask[j])) {
+            if (test_bit(i, numa_info[j].node_cpu)) {
                 numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
                 break;
             }
         }
     }
     for (i = 0; i < nb_numa_nodes; i++) {
-        numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(node_mem[i]);
+        numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(numa_info[i].node_mem);
     }
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg,
                      (1 + apic_id_limit + nb_numa_nodes) *
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 2fb71af..70fd2ed 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -9,6 +9,7 @@
 #include "qapi-types.h"
 #include "qemu/notify.h"
 #include "qemu/main-loop.h"
+#include "qemu/bitmap.h"
 
 /* vl.c */
 
@@ -130,8 +131,11 @@ extern QEMUClock *rtc_clock;
 #define MAX_NODES 64
 #define MAX_CPUMASK_BITS 255
 extern int nb_numa_nodes;
-extern uint64_t node_mem[MAX_NODES];
-extern unsigned long *node_cpumask[MAX_NODES];
+struct node_info {
+    uint64_t node_mem;
+    DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+};
+extern struct node_info numa_info[MAX_NODES];
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/monitor.c b/monitor.c
index 70ae8f5..61dbebb 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1819,7 +1819,7 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
         }
         monitor_printf(mon, "\n");
         monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
-            node_mem[i] >> 20);
+            numa_info[i].node_mem >> 20);
     }
 }
 
diff --git a/vl.c b/vl.c
index f94ec9c..42dec5e 100644
--- a/vl.c
+++ b/vl.c
@@ -250,8 +250,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
     QTAILQ_HEAD_INITIALIZER(fw_boot_order);
 
 int nb_numa_nodes;
-uint64_t node_mem[MAX_NODES];
-unsigned long *node_cpumask[MAX_NODES];
+struct node_info numa_info[MAX_NODES];
 
 uint8_t qemu_uuid[16];
 
@@ -1341,7 +1340,7 @@ static void numa_node_parse_cpus(int nodenr, const char *cpus)
         goto error;
     }
 
-    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+    bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
     return;
 
 error:
@@ -1381,7 +1380,7 @@ static void numa_add(const char *optarg)
         }
 
         if (get_param_value(option, 128, "mem", optarg) == 0) {
-            node_mem[nodenr] = 0;
+            numa_info[nodenr].node_mem = 0;
         } else {
             int64_t sval;
             sval = strtosz(option, &endptr);
@@ -1389,7 +1388,7 @@ static void numa_add(const char *optarg)
                 fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
                 exit(1);
             }
-            node_mem[nodenr] = sval;
+            numa_info[nodenr].node_mem = sval;
         }
         if (get_param_value(option, 128, "cpus", optarg) != 0) {
             numa_node_parse_cpus(nodenr, option);
@@ -2921,8 +2920,8 @@ int main(int argc, char **argv, char **envp)
     translation = BIOS_ATA_TRANSLATION_AUTO;
 
     for (i = 0; i < MAX_NODES; i++) {
-        node_mem[i] = 0;
-        node_cpumask[i] = bitmap_new(MAX_CPUMASK_BITS);
+        numa_info[i].node_mem = 0;
+        bitmap_zero(numa_info[i].node_cpu, MAX_CPUMASK_BITS);
     }
 
     nb_numa_nodes = 0;
@@ -4228,7 +4227,7 @@ int main(int argc, char **argv, char **envp)
          * and distribute the available memory equally across all nodes
          */
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (node_mem[i] != 0)
+            if (numa_info[i].node_mem != 0)
                 break;
         }
         if (i == nb_numa_nodes) {
@@ -4238,14 +4237,15 @@ int main(int argc, char **argv, char **envp)
              * the final node gets the rest.
              */
             for (i = 0; i < nb_numa_nodes - 1; i++) {
-                node_mem[i] = (ram_size / nb_numa_nodes) & ~((1 << 23UL) - 1);
-                usedmem += node_mem[i];
+                numa_info[i].node_mem = (ram_size / nb_numa_nodes) &
+                                        ~((1 << 23UL) - 1);
+                usedmem += numa_info[i].node_mem;
             }
-            node_mem[i] = ram_size - usedmem;
+            numa_info[i].node_mem = ram_size - usedmem;
         }
 
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
+            if (!bitmap_empty(numa_info[i].node_cpu, MAX_CPUMASK_BITS)) {
                 break;
             }
         }
@@ -4255,7 +4255,7 @@ int main(int argc, char **argv, char **envp)
          */
         if (i == nb_numa_nodes) {
             for (i = 0; i < max_cpus; i++) {
-                set_bit(i, node_cpumask[i % nb_numa_nodes]);
+                set_bit(i, numa_info[i % nb_numa_nodes].node_cpu);
             }
         }
     }
-- 
1.8.3.rc2.10.g0c2b1cf


* [Qemu-devel] [PATCH 2/7] Add Linux libnuma detection
  2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 1/7] Add numa_info structure to contain numa nodes info Wanlong Gao
@ 2013-06-18  8:09 ` Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy Wanlong Gao
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong

Add detection of libnuma (usually shipped in the numactl package)
to the configure script. It can be enabled or disabled on the command
line; by default it is used if available.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 configure | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/configure b/configure
index ad32f87..2d2b177 100755
--- a/configure
+++ b/configure
@@ -242,6 +242,7 @@ gtk=""
 gtkabi="2.0"
 tpm="no"
 libssh2=""
+numa=""
 
 # parse CC options first
 for opt do
@@ -944,6 +945,10 @@ for opt do
   ;;
   --enable-libssh2) libssh2="yes"
   ;;
+  --disable-numa) numa="no"
+  ;;
+  --enable-numa) numa="yes"
+  ;;
   *) echo "ERROR: unknown option $opt"; show_help="yes"
   ;;
   esac
@@ -1158,6 +1163,8 @@ echo "  --gcov=GCOV              use specified gcov [$gcov_tool]"
 echo "  --enable-tpm             enable TPM support"
 echo "  --disable-libssh2        disable ssh block device support"
 echo "  --enable-libssh2         enable ssh block device support"
+echo "  --disable-numa           disable libnuma support"
+echo "  --enable-numa            enable libnuma support"
 echo ""
 echo "NOTE: The object files are built at the place where configure is launched"
 exit 1
@@ -2389,6 +2396,27 @@ EOF
 fi
 
 ##########################################
+# libnuma probe
+
+if test "$numa" != "no" ; then
+  numa=no
+  cat > $TMPC << EOF
+#include <numa.h>
+int main(void) { return numa_available(); }
+EOF
+
+  if compile_prog "" "-lnuma" ; then
+    numa=yes
+    libs_softmmu="-lnuma $libs_softmmu"
+  else
+    if test "$numa" = "yes" ; then
+      feature_not_found "linux NUMA (install numactl?)"
+    fi
+    numa=no
+  fi
+fi
+
+##########################################
 # linux-aio probe
 
 if test "$linux_aio" != "no" ; then
@@ -3556,6 +3584,7 @@ echo "TPM support       $tpm"
 echo "libssh2 support   $libssh2"
 echo "TPM passthrough   $tpm_passthrough"
 echo "QOM debugging     $qom_cast_debug"
+echo "NUMA host support $numa"
 
 if test "$sdl_too_old" = "yes"; then
 echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -3589,6 +3618,9 @@ echo "extra_cflags=$EXTRA_CFLAGS" >> $config_host_mak
 echo "extra_ldflags=$EXTRA_LDFLAGS" >> $config_host_mak
 echo "qemu_localedir=$qemu_localedir" >> $config_host_mak
 echo "libs_softmmu=$libs_softmmu" >> $config_host_mak
+if test "$numa" = "yes"; then
+  echo "CONFIG_NUMA=y" >> $config_host_mak
+fi
 
 echo "ARCH=$ARCH" >> $config_host_mak
 
-- 
1.8.3.rc2.10.g0c2b1cf


* [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 1/7] Add numa_info structure to contain numa nodes info Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 2/7] Add Linux libnuma detection Wanlong Gao
@ 2013-06-18  8:09 ` Wanlong Gao
  2013-06-18  9:20   ` Paolo Bonzini
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 4/7] NUMA: set " Wanlong Gao
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong

The memory policy setting has the following format:
{membind|interleave|preferred}=[+|!]{all|N-N}
It is added as a suboption of "-numa", so the memory policy can be set
as follows:
 -numa node,nodeid=0,mem=1024,cpus=0,membind=0-1
 -numa node,nodeid=1,mem=1024,cpus=1,interleave=1
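
For illustration only (not the code in this patch): a standalone sketch
of how a "[+|!]{all|N-N}" value could be turned into a node mask. The
names sketch_nodemask and sketch_parse_nodemask are hypothetical, and the
sketch assumes at most 64 host nodes so that the mask fits in a uint64_t.
Unlike the patch, it rejects an out-of-range end value instead of
clamping it, and it rejects "!all" because that would leave an empty
mask.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_HOST_NODES 64

struct sketch_nodemask {
    uint64_t bits;   /* bit n set: host node n is part of the mask */
    bool relative;   /* '+' prefix: node numbers are relative */
};

/* Returns 0 on success, -1 on malformed input. */
static int sketch_parse_nodemask(const char *str, struct sketch_nodemask *out)
{
    bool negate = false;
    unsigned long first, last;
    uint64_t range;
    char *end;

    out->bits = 0;
    out->relative = false;

    /* Same prefix order as the patch: '!' first, then '+'. */
    if (*str == '!') {
        negate = true;
        str++;
    }
    if (*str == '+') {
        out->relative = true;
        str++;
    }

    if (strcmp(str, "all") == 0) {
        if (negate) {
            return -1;          /* sketch choice: no empty masks */
        }
        out->bits = UINT64_MAX;
        return 0;
    }

    /* strtoul() accepts signs and leading blanks, so check for a digit. */
    if (*str < '0' || *str > '9') {
        return -1;
    }
    errno = 0;
    first = strtoul(str, &end, 10);
    if (errno != 0) {
        return -1;
    }
    if (*end == '-') {
        const char *p = end + 1;
        if (*p < '0' || *p > '9') {
            return -1;
        }
        last = strtoul(p, &end, 10);
        if (errno != 0 || *end != '\0') {
            return -1;
        }
    } else if (*end == '\0') {
        last = first;
    } else {
        return -1;
    }
    if (last < first || last >= SKETCH_MAX_HOST_NODES) {
        return -1;
    }

    /* A run of (last - first + 1) one bits starting at bit 'first';
     * the full-width case is special because shifting by 64 is undefined. */
    if (last - first == 63) {
        range = UINT64_MAX;
    } else {
        range = ((UINT64_C(1) << (last - first + 1)) - 1) << first;
    }
    out->bits = negate ? ~range : range;
    return 0;
}
```

For "membind=0-1" the value "0-1" would give the mask 0x3, while "!0"
would select every node except node 0.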

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 include/sysemu/sysemu.h |  8 ++++++
 vl.c                    | 76 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 70fd2ed..993b8e0 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -130,10 +130,18 @@ extern QEMUClock *rtc_clock;
 
 #define MAX_NODES 64
 #define MAX_CPUMASK_BITS 255
+#define NODE_HOST_NONE        0x00
+#define NODE_HOST_BIND        0x01
+#define NODE_HOST_INTERLEAVE  0x02
+#define NODE_HOST_PREFERRED   0x03
+#define NODE_HOST_POLICY_MASK 0x03
+#define NODE_HOST_RELATIVE    0x04
 extern int nb_numa_nodes;
 struct node_info {
     uint64_t node_mem;
     DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+    DECLARE_BITMAP(host_mem, MAX_CPUMASK_BITS);
+    unsigned int flags;
 };
 extern struct node_info numa_info[MAX_NODES];
 
diff --git a/vl.c b/vl.c
index 42dec5e..ada9fb2 100644
--- a/vl.c
+++ b/vl.c
@@ -1348,11 +1348,68 @@ error:
     exit(1);
 }
 
+static unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
+{
+    unsigned long long value, endvalue;
+    char *endptr;
+    unsigned int flags = 0;
+
+    if (str[0] == '!') {
+        flags |= 2;
+        bitmap_fill(bm, MAX_CPUMASK_BITS);
+        str++;
+    }
+    if (str[0] == '+') {
+        flags |= 1;
+        str++;
+    }
+
+    if (!strcmp(str, "all")) {
+        bitmap_fill(bm, MAX_CPUMASK_BITS);
+        return flags;
+    }
+
+    if (parse_uint(str, &value, &endptr, 10) < 0)
+        goto error;
+    if (*endptr == '-') {
+        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
+            goto error;
+        }
+    } else if (*endptr == '\0') {
+        endvalue = value;
+    } else {
+        goto error;
+    }
+
+    if (endvalue >= MAX_CPUMASK_BITS) {
+        endvalue = MAX_CPUMASK_BITS - 1;
+        fprintf(stderr,
+            "qemu: NUMA: A max of %d host nodes are supported\n",
+             MAX_CPUMASK_BITS);
+    }
+
+    if (endvalue < value) {
+        goto error;
+    }
+
+    if (flags & 2)
+        bitmap_clear(bm, value, endvalue - value + 1);
+    else
+        bitmap_set(bm, value, endvalue - value + 1);
+
+    return flags;
+
+error:
+    fprintf(stderr, "qemu: Invalid host NUMA nodes range: %s\n", str);
+    return 4;
+}
+
 static void numa_add(const char *optarg)
 {
     char option[128];
     char *endptr;
     unsigned long long nodenr;
+    unsigned int ret;
 
     optarg = get_opt_name(option, 128, optarg, ',');
     if (*optarg == ',') {
@@ -1393,6 +1450,23 @@ static void numa_add(const char *optarg)
         if (get_param_value(option, 128, "cpus", optarg) != 0) {
             numa_node_parse_cpus(nodenr, option);
         }
+
+        option[0] = 0;
+        if (get_param_value(option, 128, "interleave", optarg) != 0)
+            numa_info[nodenr].flags |= NODE_HOST_INTERLEAVE;
+        else if (get_param_value(option, 128, "preferred", optarg) != 0)
+            numa_info[nodenr].flags |= NODE_HOST_PREFERRED;
+        else if (get_param_value(option, 128, "membind", optarg) != 0)
+            numa_info[nodenr].flags |= NODE_HOST_BIND;
+        if (option[0] != 0) {
+            ret = numa_node_parse_mpol(option, numa_info[nodenr].host_mem);
+            if (ret == 4) {
+                exit(1);
+            } else if (ret & 1) {
+                numa_info[nodenr].flags |= NODE_HOST_RELATIVE;
+            }
+        }
+
         nb_numa_nodes++;
     } else {
         fprintf(stderr, "Invalid -numa option: %s\n", option);
@@ -2922,6 +2996,8 @@ int main(int argc, char **argv, char **envp)
     for (i = 0; i < MAX_NODES; i++) {
         numa_info[i].node_mem = 0;
         bitmap_zero(numa_info[i].node_cpu, MAX_CPUMASK_BITS);
+        bitmap_zero(numa_info[i].host_mem, MAX_CPUMASK_BITS);
+        numa_info[i].flags = NODE_HOST_NONE;
     }
 
     nb_numa_nodes = 0;
-- 
1.8.3.rc2.10.g0c2b1cf


* [Qemu-devel] [PATCH 4/7] NUMA: set guest numa nodes memory policy
  2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
                   ` (2 preceding siblings ...)
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy Wanlong Gao
@ 2013-06-18  8:09 ` Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node Wanlong Gao
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong

Set the memory policies of the guest NUMA nodes, node by node, using the
mbind(2) system call.
After this patch, the memory policies of guest nodes can be set through
QEMU options. This aims to solve the performance issue of guest memory
accesses across host nodes.
As you all know, if PCI passthrough is used, the directly attached device
uses DMA transfers between the device and the QEMU process, so all pages
of the guest are pinned by get_user_pages().

KVM_ASSIGN_PCI_DEVICE ioctl
  kvm_vm_ioctl_assign_device()
    =>kvm_assign_device()
      => kvm_iommu_map_memslots()
        => kvm_iommu_map_pages()
           => kvm_pin_pages()

So, with a directly attached device, the page count of every guest page
is raised by one and page migration no longer works. AutoNUMA does not
work either.

Therefore, we should set the memory allocation policies of the guest
nodes before the pages are actually mapped.
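
For illustration only: a standalone sketch of the two computations that
precede each mbind(2) call, namely translating the per-node flags into an
mbind mode and finding where a guest node starts inside the guest RAM
block. The SK_* names are hypothetical; the SK_MPOL_* values are copied
from the Linux <linux/mempolicy.h> header and the SK_NODE_HOST_* values
from patch 3/7, so that the snippet builds without libnuma. It assumes
guest RAM is one contiguous block laid out in node order.

```c
#include <stddef.h>
#include <stdint.h>

/* Policy modes and mode flags as defined by Linux <linux/mempolicy.h>. */
enum {
    SK_MPOL_DEFAULT    = 0,
    SK_MPOL_PREFERRED  = 1,
    SK_MPOL_BIND       = 2,
    SK_MPOL_INTERLEAVE = 3
};
#define SK_MPOL_F_STATIC_NODES   (1 << 15)
#define SK_MPOL_F_RELATIVE_NODES (1 << 14)

/* Per-node flag layout introduced in patch 3/7. */
#define SK_NODE_HOST_NONE        0x00
#define SK_NODE_HOST_BIND        0x01
#define SK_NODE_HOST_INTERLEAVE  0x02
#define SK_NODE_HOST_PREFERRED   0x03
#define SK_NODE_HOST_POLICY_MASK 0x03
#define SK_NODE_HOST_RELATIVE    0x04

/* Translate the guest node flags into the 'mode' argument of mbind(2). */
static int sk_mbind_mode(unsigned int flags)
{
    int mode;

    switch (flags & SK_NODE_HOST_POLICY_MASK) {
    case SK_NODE_HOST_BIND:
        mode = SK_MPOL_BIND;
        break;
    case SK_NODE_HOST_INTERLEAVE:
        mode = SK_MPOL_INTERLEAVE;
        break;
    case SK_NODE_HOST_PREFERRED:
        mode = SK_MPOL_PREFERRED;
        break;
    default:
        /* The default policy takes no mode flags. */
        return SK_MPOL_DEFAULT;
    }

    mode |= (flags & SK_NODE_HOST_RELATIVE) ? SK_MPOL_F_RELATIVE_NODES
                                            : SK_MPOL_F_STATIC_NODES;
    return mode;
}

/* Byte offset of guest node 'nodeid' within the guest RAM block:
 * the sum of the sizes of all lower-numbered nodes. */
static uint64_t sk_node_offset(const uint64_t *node_mem, unsigned int nodeid)
{
    uint64_t offset = 0;

    for (unsigned int i = 0; i < nodeid; i++) {
        offset += node_mem[i];
    }
    return offset;
}
```

The resulting offset and the node's size give the address range passed
to mbind(2), and the mode value selects the policy for that range.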


Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 cpus.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/cpus.c b/cpus.c
index e123d3f..b868932 100644
--- a/cpus.c
+++ b/cpus.c
@@ -60,6 +60,15 @@
 
 #endif /* CONFIG_LINUX */
 
+#ifdef CONFIG_NUMA
+#include <numa.h>
+#include <numaif.h>
+#ifndef MPOL_F_RELATIVE_NODES
+#define MPOL_F_RELATIVE_NODES (1 << 14)
+#define MPOL_F_STATIC_NODES   (1 << 15)
+#endif
+#endif
+
 static CPUArchState *next_cpu;
 
 static bool cpu_thread_is_idle(CPUArchState *env)
@@ -1186,6 +1195,75 @@ static void tcg_exec_all(void)
     exit_request = 0;
 }
 
+#ifdef CONFIG_NUMA
+static int node_parse_bind_mode(unsigned int nodeid)
+{
+    int bind_mode;
+
+    switch (numa_info[nodeid].flags & NODE_HOST_POLICY_MASK) {
+    case NODE_HOST_BIND:
+        bind_mode = MPOL_BIND;
+        break;
+    case NODE_HOST_INTERLEAVE:
+        bind_mode = MPOL_INTERLEAVE;
+        break;
+    case NODE_HOST_PREFERRED:
+        bind_mode = MPOL_PREFERRED;
+        break;
+    default:
+        bind_mode = MPOL_DEFAULT;
+        return bind_mode;
+    }
+
+    bind_mode |= (numa_info[nodeid].flags & NODE_HOST_RELATIVE) ?
+        MPOL_F_RELATIVE_NODES : MPOL_F_STATIC_NODES;
+
+    return bind_mode;
+}
+#endif
+
+static int set_node_mpol(unsigned int nodeid)
+{
+#ifdef CONFIG_NUMA
+    void *ram_ptr;
+    RAMBlock *block;
+    ram_addr_t len, ram_offset = 0;
+    int bind_mode;
+    int i;
+
+    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+        if (!strcmp(block->mr->name, "pc.ram")) {
+            break;
+        }
+    }
+
+    if (block == NULL || block->host == NULL)
+        return -1;
+
+    ram_ptr = block->host;
+    for (i = 0; i < nodeid; i++) {
+        len = numa_info[i].node_mem;
+        ram_offset += len;
+    }
+
+    len = numa_info[i].node_mem;
+    bind_mode = node_parse_bind_mode(i);
+
+    /* This is a workaround for a long standing bug in Linux'
+     * mbind implementation, which cuts off the last specified
+     * node. To stay compatible should this bug be fixed, we
+     * specify one more node and zero this one out.
+     */
+    clear_bit(numa_num_configured_nodes() + 1, numa_info[i].host_mem);
+    if (mbind(ram_ptr + ram_offset, len, bind_mode,
+        numa_info[i].host_mem, numa_num_configured_nodes() + 1, 0)) {
+            perror("mbind");
+            return -1;
+    }
+#endif
+    return 0;
+}
+
 void set_numa_modes(void)
 {
     CPUArchState *env;
@@ -1200,6 +1278,14 @@ void set_numa_modes(void)
             }
         }
     }
+
+#ifdef CONFIG_NUMA
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if (set_node_mpol(i) == -1) {
+            fprintf(stderr, "qemu: can't set host memory policy for node%d\n", i);
+        }
+    }
+#endif
 }
 
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
-- 
1.8.3.rc2.10.g0c2b1cf


* [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node
  2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
                   ` (3 preceding siblings ...)
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 4/7] NUMA: set " Wanlong Gao
@ 2013-06-18  8:09 ` Wanlong Gao
  2013-06-18  9:21   ` Paolo Bonzini
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 6/7] NUMA: add hmp command set-mpol Wanlong Gao
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 7/7] NUMA: show host memory policy info in info numa command Wanlong Gao
  6 siblings, 1 reply; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong

This QMP command makes it possible to set a node's memory policy
through the QMP protocol. In qmp-shell, the command looks like:
    set-mpol nodeid=0 mpol=membind nodemask=0-1
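
For illustration only: a standalone sketch of the update semantics
described for this command (no "mpol" means the default policy, no
"nodemask" means all nodes, and a failed update must leave the old
policy in place). All sk_* names are hypothetical. Note that the sketch
builds the new policy in a local copy and commits it only on success,
whereas the patch modifies numa_info[] first and restores the saved
values on error; both aim at the same result.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sk_mode {
    SK_POL_DEFAULT,
    SK_POL_MEMBIND,
    SK_POL_INTERLEAVE,
    SK_POL_PREFERRED
};

struct sk_policy {
    enum sk_mode mode;
    uint64_t nodemask;   /* bit n set: host node n */
};

/* Callback that pushes a policy to the host (mbind(2) in the patch). */
typedef int (*sk_apply_fn)(const struct sk_policy *policy);

/* Demo callbacks: one that always succeeds and one that always fails. */
static int sk_apply_ok(const struct sk_policy *policy)
{
    (void)policy;
    return 0;
}

static int sk_apply_fail(const struct sk_policy *policy)
{
    (void)policy;
    return -1;
}

/* Returns 0 and updates *cur on success; returns -1 and leaves *cur
 * untouched on any error. 'mpol' and 'nodemask' may be NULL. */
static int sk_set_policy(struct sk_policy *cur, const char *mpol,
                         const uint64_t *nodemask, sk_apply_fn apply)
{
    struct sk_policy next = { SK_POL_DEFAULT, 0 };

    if (mpol != NULL) {
        if (strcmp(mpol, "membind") == 0) {
            next.mode = SK_POL_MEMBIND;
        } else if (strcmp(mpol, "interleave") == 0) {
            next.mode = SK_POL_INTERLEAVE;
        } else if (strcmp(mpol, "preferred") == 0) {
            next.mode = SK_POL_PREFERRED;
        } else {
            return -1;                     /* unknown policy name */
        }
        /* A missing node mask means "all". */
        next.nodemask = nodemask ? *nodemask : UINT64_MAX;
    }

    if (apply(&next) != 0) {
        return -1;                         /* old policy stays in place */
    }
    *cur = next;
    return 0;
}
```

Calling sk_set_policy() with sk_apply_fail shows the rollback property:
the current policy is unchanged after the failed call.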

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 cpus.c                  | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/sysemu/sysemu.h |  1 +
 qapi-schema.json        | 13 +++++++++++
 qmp-commands.hx         | 35 ++++++++++++++++++++++++++++
 vl.c                    |  2 +-
 5 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index b868932..a2836e9 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1431,3 +1431,64 @@ void qmp_inject_nmi(Error **errp)
     error_set(errp, QERR_UNSUPPORTED);
 #endif
 }
+
+void qmp_set_mpol(int64_t nodeid, bool has_mpol, const char *mpol,
+                  bool has_nodemask, const char *nodemask, Error **errp)
+{
+    unsigned int ret = 0;
+    unsigned int flags;
+    DECLARE_BITMAP(host_mem, MAX_CPUMASK_BITS);
+
+    if (nodeid >= nb_numa_nodes) {
+        error_setg(errp, "Only has '%d' NUMA nodes", nb_numa_nodes);
+        return;
+    }
+
+    bitmap_copy(host_mem, numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
+    flags = numa_info[nodeid].flags;
+
+    numa_info[nodeid].flags = NODE_HOST_NONE;
+    bitmap_zero(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
+
+    if (!has_mpol) {
+        if (set_node_mpol(nodeid) == -1) {
+            goto error;
+        }
+        return;
+    }
+
+    if (!strcmp(mpol, "membind")) {
+        numa_info[nodeid].flags |= NODE_HOST_BIND;
+    } else if (!strcmp(mpol, "interleave")) {
+        numa_info[nodeid].flags |= NODE_HOST_INTERLEAVE;
+    } else if (!strcmp(mpol, "preferred")) {
+        numa_info[nodeid].flags |= NODE_HOST_PREFERRED;
+    } else {
+        error_setg(errp, "Invalid NUMA policy '%s'", mpol);
+        goto error;
+    }
+
+    if (!has_nodemask) {
+        bitmap_fill(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
+    }
+
+    if (nodemask) {
+        ret = numa_node_parse_mpol(nodemask, numa_info[nodeid].host_mem);
+    }
+    if (ret == 4) {
+        goto error;
+    } else if (ret & 1) {
+        numa_info[nodeid].flags |= NODE_HOST_RELATIVE;
+    }
+
+    if (set_node_mpol(nodeid) == -1) {
+        goto error;
+    }
+
+    return;
+
+error:
+    bitmap_copy(numa_info[nodeid].host_mem, host_mem, MAX_CPUMASK_BITS);
+    numa_info[nodeid].flags = flags;
+    return;
+}
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 993b8e0..7d804af 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -144,6 +144,7 @@ struct node_info {
     unsigned int flags;
 };
 extern struct node_info numa_info[MAX_NODES];
+extern unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm);
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/qapi-schema.json b/qapi-schema.json
index a80ee40..403c703 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3608,3 +3608,16 @@
             '*cpuid-input-ecx': 'int',
             'cpuid-register': 'X86CPURegister32',
             'features': 'int' } }
+
+# @set-mpol:
+#
+# Set the host memory binding policy for guest NUMA node.
+#
+# @nodeid: The node ID of guest NUMA node to set memory policy to.
+#
+# @mpol: The memory policy string to set.
+#
+# Since: 1.6.0
+##
+{ 'command': 'set-mpol', 'data': {'nodeid': 'int', '*mpol': 'str',
+                                  '*nodemask': 'str'} }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 8cea5e5..930c844 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2997,3 +2997,38 @@ Example:
 <- { "return": {} }
 
 EQMP
+
+    {
+        .name      = "set-mpol",
+        .args_type = "nodeid:i,mpol:s?,nodemask:s?",
+        .help      = "Set the host memory binding policy for guest NUMA node",
+        .mhandler.cmd_new = qmp_marshal_input_set_mpol,
+    },
+
+SQMP
+set-mpol
+------
+
+Set the host memory binding policy for guest NUMA node
+
+Arguments:
+
+- "nodeid": The nodeid of guest NUMA node to set memory policy to.
+              (json-int)
+- "mpol": The memory policy string to set.
+           (json-string, optional)
+- "nodemask": The host node mask that "mpol" applies to.
+                (json-string, optional)
+
+Example:
+
+-> { "execute": "set-mpol", "arguments": { "nodeid": 0, "mpol": "membind",
+                                           "nodemask": "0-1" }}
+<- { "return": {} }
+
+Notes:
+    1. If "mpol" is not set, the memory policy of this "nodeid" will be set
+       to "default".
+    2. If "nodemask" is not set, the node mask of this "mpol" will be set
+       to "all".
+EQMP
diff --git a/vl.c b/vl.c
index ada9fb2..73af85e 100644
--- a/vl.c
+++ b/vl.c
@@ -1348,7 +1348,7 @@ error:
     exit(1);
 }
 
-static unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
+unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
 {
     unsigned long long value, endvalue;
     char *endptr;
-- 
1.8.3.rc2.10.g0c2b1cf


* [Qemu-devel] [PATCH 6/7] NUMA: add hmp command set-mpol
  2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
                   ` (4 preceding siblings ...)
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node Wanlong Gao
@ 2013-06-18  8:09 ` Wanlong Gao
  2013-06-18  9:23   ` Paolo Bonzini
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 7/7] NUMA: show host memory policy info in info numa command Wanlong Gao
  6 siblings, 1 reply; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong

Add the HMP command set-mpol to set the host memory policy for a guest
NUMA node. A node's memory policy can then also be set with a monitor
command such as:
    (qemu) set-mpol 0 membind 0-1

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 hmp-commands.hx | 16 ++++++++++++++++
 hmp.c           | 22 ++++++++++++++++++++++
 hmp.h           |  1 +
 3 files changed, 39 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 915b0d1..fd3505e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1567,6 +1567,22 @@ Executes a qemu-io command on the given block device.
 ETEXI
 
     {
+        .name       = "set-mpol",
+        .args_type  = "nodeid:i,mpol:s?,nodemask:s?",
+        .params     = "nodeid [mpol] [nodemask]",
+        .help       = "set host memory policy for a guest NUMA node",
+        .mhandler.cmd = hmp_set_mpol,
+    },
+
+STEXI
+@item set-mpol @var{nodeid} @var{mpol} @var{nodemask}
+@findex set-mpol
+
+Set host memory policy for a guest NUMA node
+
+ETEXI
+
+    {
         .name       = "info",
         .args_type  = "item:s?",
         .params     = "[subcommand]",
diff --git a/hmp.c b/hmp.c
index 494a9aa..2e5315e 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1464,3 +1464,25 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 
     hmp_handle_error(mon, &err);
 }
+
+void hmp_set_mpol(Monitor *mon, const QDict *qdict)
+{
+    Error *local_err = NULL;
+    bool has_mpol = true;
+    bool has_nodemask = true;
+
+    uint64_t nodeid = qdict_get_int(qdict, "nodeid");
+    const char *mpol = qdict_get_try_str(qdict, "mpol");
+    const char *nodemask = qdict_get_try_str(qdict, "nodemask");
+
+    if (mpol == NULL) {
+        has_mpol = false;
+    }
+
+    if (nodemask == NULL) {
+        has_nodemask = false;
+    }
+
+    qmp_set_mpol(nodeid, has_mpol, mpol, has_nodemask, nodemask, &local_err);
+    hmp_handle_error(mon, &local_err);
+}
diff --git a/hmp.h b/hmp.h
index 56d2e92..81f631b 100644
--- a/hmp.h
+++ b/hmp.h
@@ -86,5 +86,6 @@ void hmp_nbd_server_stop(Monitor *mon, const QDict *qdict);
 void hmp_chardev_add(Monitor *mon, const QDict *qdict);
 void hmp_chardev_remove(Monitor *mon, const QDict *qdict);
 void hmp_qemu_io(Monitor *mon, const QDict *qdict);
+void hmp_set_mpol(Monitor *mon, const QDict *qdict);
 
 #endif
-- 
1.8.3.rc2.10.g0c2b1cf


* [Qemu-devel] [PATCH 7/7] NUMA: show host memory policy info in info numa command
  2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
                   ` (5 preceding siblings ...)
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 6/7] NUMA: add hmp command set-mpol Wanlong Gao
@ 2013-06-18  8:09 ` Wanlong Gao
  6 siblings, 0 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  8:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: andre.przywara, aliguori, ehabkost, pbonzini, y-goto, afaerber,
	gaowanlong

Show the host memory policy of each node in the "info numa" monitor
command. After this patch, "info numa" prints output like the
following when host NUMA support is enabled:

    (qemu) info numa
    2 nodes
    node 0 cpus: 0
    node 0 size: 1024 MB
    node 0 mempolicy: membind=0,1
    node 1 cpus: 1
    node 1 size: 1024 MB
    node 1 mempolicy: interleave=1


Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 monitor.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/monitor.c b/monitor.c
index 61dbebb..b6e93e5 100644
--- a/monitor.c
+++ b/monitor.c
@@ -74,6 +74,11 @@
 #endif
 #include "hw/lm32/lm32_pic.h"
 
+#ifdef CONFIG_NUMA
+#include <numa.h>
+#include <numaif.h>
+#endif
+
 //#define DEBUG
 //#define DEBUG_COMPLETION
 
@@ -1807,6 +1812,7 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
     int i;
     CPUArchState *env;
     CPUState *cpu;
+    unsigned long first, next;
 
     monitor_printf(mon, "%d nodes\n", nb_numa_nodes);
     for (i = 0; i < nb_numa_nodes; i++) {
@@ -1820,6 +1826,42 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "\n");
         monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
             numa_info[i].node_mem >> 20);
+
+#ifdef CONFIG_NUMA
+        monitor_printf(mon, "node %d mempolicy: ", i);
+        switch (numa_info[i].flags & NODE_HOST_POLICY_MASK) {
+        case NODE_HOST_BIND:
+            monitor_printf(mon, "membind=");
+            break;
+        case NODE_HOST_INTERLEAVE:
+            monitor_printf(mon, "interleave=");
+            break;
+        case NODE_HOST_PREFERRED:
+            monitor_printf(mon, "preferred=");
+            break;
+        default:
+            monitor_printf(mon, "default\n");
+            continue;
+        }
+
+        if (numa_info[i].flags & NODE_HOST_RELATIVE)
+            monitor_printf(mon, "+");
+
+        next = first = find_first_bit(numa_info[i].host_mem, MAX_CPUMASK_BITS);
+        monitor_printf(mon, "%lu", first);
+        do {
+            if (next == numa_max_node())
+                break;
+            next = find_next_bit(numa_info[i].host_mem, MAX_CPUMASK_BITS,
+                                 next + 1);
+            if (next > numa_max_node() || next == MAX_CPUMASK_BITS)
+                break;
+
+            monitor_printf(mon, ",%lu", next);
+        } while (true);
+
+        monitor_printf(mon, "\n");
+#endif
     }
 }
 
-- 
1.8.3.rc2.10.g0c2b1cf
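
For illustration only (not part of the patch): the loop above that prints
host_mem as a comma-separated list can be sketched in standalone C as
follows. The names format_node_list and SKETCH_MAX_NODES are assumptions
made for this sketch, and a plain 64-bit mask stands in for QEMU's bitmap
type and for find_first_bit()/find_next_bit().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical limit for this sketch; the patch uses MAX_CPUMASK_BITS. */
#define SKETCH_MAX_NODES 64

/*
 * Write the indices of the bits set in mask, up to and including max_node,
 * as a comma-separated list such as "0,1". Returns the number of characters
 * written, or -1 if the buffer is too small.
 */
static int format_node_list(unsigned long long mask, unsigned int max_node,
                            char *buf, size_t len)
{
    size_t used = 0;
    unsigned int node;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (node = 0; node <= max_node && node < SKETCH_MAX_NODES; node++) {
        int n;

        if (!(mask & (1ULL << node))) {
            continue;
        }
        /* Prefix every entry except the first with a comma. */
        n = snprintf(buf + used, len - used, "%s%u", used ? "," : "", node);
        if (n < 0 || (size_t)n >= len - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

In this sketch an empty mask yields an empty string, so a caller would have
to decide what to print in that case.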

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy Wanlong Gao
@ 2013-06-18  9:20   ` Paolo Bonzini
  2013-06-18  9:54     ` Wanlong Gao
  2013-06-18 19:00     ` Eduardo Habkost
  0 siblings, 2 replies; 20+ messages in thread
From: Paolo Bonzini @ 2013-06-18  9:20 UTC (permalink / raw)
  To: Wanlong Gao
  Cc: andre.przywara, aliguori, ehabkost, qemu-devel, y-goto, afaerber

Il 18/06/2013 10:09, Wanlong Gao ha scritto:
> +static unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
> +{
> +    unsigned long long value, endvalue;
> +    char *endptr;
> +    unsigned int flags = 0;
> +
> +    if (str[0] == '!') {
> +        flags |= 2;

clear = true;

> +        bitmap_fill(bm, MAX_CPUMASK_BITS);
> +        str++;
> +    }
> +    if (str[0] == '+') {
> +        flags |= 1;

flags = NODE_HOST_RELATIVE

> +        str++;
> +    }
> +
> +    if (!strcmp(str, "all")) {
> +        bitmap_fill(bm, MAX_CPUMASK_BITS);
> +        return flags;
> +    }
> +
> +    if (parse_uint(str, &value, &endptr, 10) < 0)
> +        goto error;
> +    if (*endptr == '-') {
> +        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
> +            goto error;
> +        }
> +    } else if (*endptr == '\0') {
> +        endvalue = value;
> +    } else {
> +        goto error;
> +    }
> +
> +    if (endvalue >= MAX_CPUMASK_BITS) {
> +        endvalue = MAX_CPUMASK_BITS - 1;
> +        fprintf(stderr,
> +            "qemu: NUMA: A max of %d host nodes are supported\n",
> +             MAX_CPUMASK_BITS);
> +    }
> +
> +    if (endvalue < value) {
> +        goto error;
> +    }
> +
> +    if (flags & 2)

if (clear)

> +        bitmap_clear(bm, value, endvalue - value + 1);
> +    else
> +        bitmap_set(bm, value, endvalue - value + 1);
> +
> +    return flags;
> +
> +error:
> +    fprintf(stderr, "qemu: Invalid host NUMA nodes range: %s\n", str);

Please change the functions (numa_add and numa_node_parse_mpol) to
accept an Error *.  This will make it much easier to reuse them for e.g.
memory hotplug in the future.

> +    return 4;

return -EINVAL;

> +}

> +        if (get_param_value(option, 128, "interleave", optarg) != 0)
> +            numa_info[nodenr].flags |= NODE_HOST_INTERLEAVE;
> +        else if (get_param_value(option, 128, "preferred", optarg) != 0)
> +            numa_info[nodenr].flags |= NODE_HOST_PREFERRED;
> +        else if (get_param_value(option, 128, "membind", optarg) != 0)
> +            numa_info[nodenr].flags |= NODE_HOST_BIND;

You're not handling the case where someone specifies more than one option.

What about:

   policy={interleave,preferred,bind},mem-hostnode=0

?

Also, please use QemuOpts instead of yet another homegrown parser.
Eduardo, I think you had the most recent attempt to convert -numa to
QemuOpts?

> +        if (option[0] != 0) {
> +            ret = numa_node_parse_mpol(option, numa_info[nodenr].host_mem);
> +            if (ret == 4) {

if (ret < 0)

> +                exit(1);
> +            } else if (ret & 1) {
> +                numa_info[nodenr].flags |= NODE_HOST_RELATIVE;

else {
    numa_info[nodenr].flags |= ret;
}

> +            }
> +        }
> +

Paolo
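
A minimal standalone sketch of the shape these comments point at: a local
"clear" boolean instead of the magic flag 2, a named flag instead of 1, and
a negative errno value instead of 4 on error. It is not the QEMU code. The
names parse_host_nodes, SKETCH_MAX_NODES and SKETCH_HOST_RELATIVE are
assumptions, strtoul() stands in for parse_uint()/parse_uint_full(), a
64-bit mask stands in for the bitmap, and an out-of-range end value is
rejected here rather than clamped as in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_NODES     64   /* hypothetical; the patch uses MAX_CPUMASK_BITS */
#define SKETCH_HOST_RELATIVE 0x1  /* hypothetical stand-in for NODE_HOST_RELATIVE */

/*
 * Parse "[!][+]{all|N|N-M}" into *mask.
 * Returns the flags (>= 0) on success or -EINVAL on a malformed string.
 */
static int parse_host_nodes(const char *str, unsigned long long *mask)
{
    bool clear = false;
    int flags = 0;
    unsigned long first, last;
    unsigned long long range;
    char *end;

    assert(str != NULL && mask != NULL);

    if (*str == '!') {              /* start from "all", then remove the range */
        clear = true;
        *mask = ~0ULL;
        str++;
    }
    if (*str == '+') {
        flags |= SKETCH_HOST_RELATIVE;
        str++;
    }
    if (strcmp(str, "all") == 0) {
        *mask = ~0ULL;
        return flags;
    }

    /* strtoul() accepts signs and leading blanks, so insist on a digit. */
    if (*str < '0' || *str > '9') {
        return -EINVAL;
    }
    errno = 0;
    first = strtoul(str, &end, 10);
    if (errno != 0) {
        return -EINVAL;
    }
    if (*end == '-') {
        const char *p = end + 1;

        if (*p < '0' || *p > '9') {
            return -EINVAL;
        }
        last = strtoul(p, &end, 10);
        if (errno != 0 || *end != '\0') {
            return -EINVAL;
        }
    } else if (*end == '\0') {
        last = first;
    } else {
        return -EINVAL;
    }
    if (last < first || last >= SKETCH_MAX_NODES) {
        return -EINVAL;
    }

    /* Build the bits first..last without shifting by the full word width. */
    range = (last - first == 63) ? ~0ULL
                                 : ((1ULL << (last - first + 1)) - 1) << first;
    if (clear) {
        *mask &= ~range;
    } else {
        *mask |= range;
    }
    return flags;
}
```

With this return convention the caller can test "ret < 0" and otherwise OR
the result straight into the node's flags.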

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node Wanlong Gao
@ 2013-06-18  9:21   ` Paolo Bonzini
  2013-06-18  9:44     ` Wanlong Gao
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2013-06-18  9:21 UTC (permalink / raw)
  To: Wanlong Gao
  Cc: andre.przywara, aliguori, ehabkost, qemu-devel, y-goto, afaerber

Il 18/06/2013 10:09, Wanlong Gao ha scritto:
> The QMP command let it be able to set node's memory policy
> through the QMP protocol. The qmp-shell command is like:
>     set-mpol nodeid=0 mpol=membind nodemask=0-1
> 
> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>

How would this work with mem-path?

Paolo

> ---
>  cpus.c                  | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/sysemu/sysemu.h |  1 +
>  qapi-schema.json        | 13 +++++++++++
>  qmp-commands.hx         | 35 ++++++++++++++++++++++++++++
>  vl.c                    |  2 +-
>  5 files changed, 111 insertions(+), 1 deletion(-)
> 
> diff --git a/cpus.c b/cpus.c
> index b868932..a2836e9 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1431,3 +1431,64 @@ void qmp_inject_nmi(Error **errp)
>      error_set(errp, QERR_UNSUPPORTED);
>  #endif
>  }
> +
> +void qmp_set_mpol(int64_t nodeid, bool has_mpol, const char *mpol,
> +                  bool has_nodemask, const char *nodemask, Error **errp)
> +{
> +    unsigned int ret;
> +    unsigned int flags;
> +    DECLARE_BITMAP(host_mem, MAX_CPUMASK_BITS);
> +
> +    if (nodeid >= nb_numa_nodes) {
> +        error_setg(errp, "Only has '%d' NUMA nodes", nb_numa_nodes);
> +        return;
> +    }
> +
> +    bitmap_copy(host_mem, numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
> +    flags = numa_info[nodeid].flags;
> +
> +    numa_info[nodeid].flags = NODE_HOST_NONE;
> +    bitmap_zero(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
> +
> +    if (!has_mpol) {
> +        if (set_node_mpol(nodeid) == -1) {
> +            goto error;
> +        }
> +        return;
> +    }
> +
> +    if (!strcmp(mpol, "membind")) {
> +        numa_info[nodeid].flags |= NODE_HOST_BIND;
> +    } else if (!strcmp(mpol, "interleave")) {
> +        numa_info[nodeid].flags |= NODE_HOST_INTERLEAVE;
> +    } else if (!strcmp(mpol, "preferred")) {
> +        numa_info[nodeid].flags |= NODE_HOST_PREFERRED;
> +    } else {
> +        error_setg(errp, "Invalid NUMA policy '%s'", mpol);
> +        goto error;
> +    }
> +
> +    if (!has_nodemask) {
> +        bitmap_fill(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
> +    }
> +
> +    if (nodemask) {
> +        ret = numa_node_parse_mpol(nodemask, numa_info[nodeid].host_mem);
> +    }
> +    if (ret == 4) {
> +        goto error;
> +    } else if (ret & 1) {
> +        numa_info[nodeid].flags |= NODE_HOST_RELATIVE;
> +    }
> +
> +    if (set_node_mpol(nodeid) == -1) {
> +        goto error;
> +    }
> +
> +    return;
> +
> +error:
> +    bitmap_copy(numa_info[nodeid].host_mem, host_mem, MAX_CPUMASK_BITS);
> +    numa_info[nodeid].flags = flags;
> +    return;
> +}
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 993b8e0..7d804af 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -144,6 +144,7 @@ struct node_info {
>      unsigned int flags;
>  };
>  extern struct node_info numa_info[MAX_NODES];
> +extern unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm);
>  
>  #define MAX_OPTION_ROMS 16
>  typedef struct QEMUOptionRom {
> diff --git a/qapi-schema.json b/qapi-schema.json
> index a80ee40..403c703 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -3608,3 +3608,16 @@
>              '*cpuid-input-ecx': 'int',
>              'cpuid-register': 'X86CPURegister32',
>              'features': 'int' } }
> +
> +# @set-mpol:
> +#
> +# Set the host memory binding policy for guest NUMA node.
> +#
> +# @node-id: The node ID of guest NUMA node to set memory policy to.
> +#
> +# @mpol: The memory policy string to set.
> +#
> +# Since: 1.6.0
> +##
> +{ 'command': 'set-mpol', 'data': {'nodeid': 'int', '*mpol': 'str',
> +                                  '*nodemask': 'str'} }
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index 8cea5e5..930c844 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -2997,3 +2997,38 @@ Example:
>  <- { "return": {} }
>  
>  EQMP
> +
> +    {
> +        .name      = "set-mpol",
> +        .args_type = "nodeid:i,mpol:s?,nodemask:s?",
> +        .help      = "Set the host memory binding policy for guest NUMA node",
> +        .mhandler.cmd_new = qmp_marshal_input_set_mpol,
> +    },
> +
> +SQMP
> +set-mpol
> +------
> +
> +Set the host memory binding policy for guest NUMA node
> +
> +Arguments:
> +
> +- "nodeid": The nodeid of guest NUMA node to set memory policy to.
> +              (json-int)
> +- "mpol": The memory policy string to set.
> +           (json-string, optional)
> +- "nodemask": The node mask contained to mpol.
> +                (json-string, optional)
> +
> +Example:
> +
> +-> { "execute": "set-mpol", "arguments": { "nodeid": 0, "mpol": "membind",
> +                                           "nodemask": "0-1" }}
> +<- { "return": {} }
> +
> +Notes:
> +    1. If "mpol" is not set, the memory policy of this "nodeid" will be set
> +       to "default".
> +    2. If "nodemask" is not set, the node mask of this "mpol" will be set
> +       to "all".
> +EQMP
> diff --git a/vl.c b/vl.c
> index ada9fb2..73af85e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1348,7 +1348,7 @@ error:
>      exit(1);
>  }
>  
> -static unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
> +unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
>  {
>      unsigned long long value, endvalue;
>      char *endptr;
> 
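
As an aside, the save-and-restore pattern used by qmp_set_mpol() in the
quoted patch can be sketched in isolation like this. Everything here is
hypothetical: struct sketch_node, the SKETCH_* policy values, and
sketch_apply(), which only stands in for set_node_mpol() and fails when a
non-default policy has an empty node mask. Note that in the quoted function
"ret" appears to be read even when no nodemask was given; the sketch avoids
that by treating a missing mask as "all" before anything is applied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy values for this sketch only. */
enum { SKETCH_NONE = 0, SKETCH_BIND = 1, SKETCH_INTERLEAVE = 2 };

struct sketch_node {
    unsigned int flags;            /* current policy */
    unsigned long long host_mem;   /* host node mask, one bit per node */
};

/* Stand-in for set_node_mpol(): reject a real policy with an empty mask. */
static int sketch_apply(const struct sketch_node *node)
{
    if (node->flags != SKETCH_NONE && node->host_mem == 0) {
        return -1;
    }
    return 0;
}

/*
 * Install a new policy; on failure put the previous settings back so the
 * node is left exactly as it was. A NULL mask means "all host nodes".
 */
static int set_policy(struct sketch_node *node, unsigned int policy,
                      const unsigned long long *mask)
{
    struct sketch_node saved;

    assert(node != NULL);
    saved = *node;                          /* snapshot for rollback */

    node->flags = policy;
    node->host_mem = mask ? *mask : ~0ULL;

    if (sketch_apply(node) < 0) {
        *node = saved;                      /* roll back on error */
        return -1;
    }
    return 0;
}
```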

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 6/7] NUMA: add hmp command set-mpol
  2013-06-18  8:09 ` [Qemu-devel] [PATCH 6/7] NUMA: add hmp command set-mpol Wanlong Gao
@ 2013-06-18  9:23   ` Paolo Bonzini
  2013-06-18  9:49     ` Wanlong Gao
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2013-06-18  9:23 UTC (permalink / raw)
  To: Wanlong Gao
  Cc: andre.przywara, aliguori, ehabkost, qemu-devel, y-goto, afaerber

Il 18/06/2013 10:09, Wanlong Gao ha scritto:
> Add hmp command set-mpol to set host memory policy for a guest
> NUMA node. Then we can also set node's memory policy using
> the monitor command like:
>     (qemu) set-mpol 0 membind 0-1

I suggest something similar to what chardev-add does: Just make it
"set-mpol <nodeid> <qemuopts>", for example

 (qemu) set-mpol 0 mem-policy=membind,mem-hostnode=0-1

Similar to the command-line syntax.

Paolo

> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
> ---
>  hmp-commands.hx | 16 ++++++++++++++++
>  hmp.c           | 22 ++++++++++++++++++++++
>  hmp.h           |  1 +
>  3 files changed, 39 insertions(+)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 915b0d1..fd3505e 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1567,6 +1567,22 @@ Executes a qemu-io command on the given block device.
>  ETEXI
>  
>      {
> +        .name       = "set-mpol",
> +        .args_type  = "nodeid:i,mpol:s?,nodemask:s?",
> +        .params     = "nodeid [mpol] [nodemask]",
> +        .help       = "set host memory policy for a guest NUMA node",
> +        .mhandler.cmd = hmp_set_mpol,
> +    },
> +
> +STEXI
> +@item set-mpol @var{nodeid} @var{mpol} @var{nodemask}
> +@findex set-mpol
> +
> +Set host memory policy for a guest NUMA node
> +
> +ETEXI
> +
> +    {
>          .name       = "info",
>          .args_type  = "item:s?",
>          .params     = "[subcommand]",
> diff --git a/hmp.c b/hmp.c
> index 494a9aa..2e5315e 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1464,3 +1464,25 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
>  
>      hmp_handle_error(mon, &err);
>  }
> +
> +void hmp_set_mpol(Monitor *mon, const QDict *qdict)
> +{
> +    Error *local_err = NULL;
> +    bool has_mpol = true;
> +    bool has_nodemask = true;
> +
> +    uint64_t nodeid = qdict_get_int(qdict, "nodeid");
> +    const char *mpol = qdict_get_try_str(qdict, "mpol");
> +    const char *nodemask = qdict_get_try_str(qdict, "nodemask");
> +
> +    if (mpol == NULL) {
> +        has_mpol = false;
> +    }
> +
> +    if (nodemask == NULL) {
> +        has_nodemask = false;
> +    }
> +
> +    qmp_set_mpol(nodeid, has_mpol, mpol, has_nodemask, nodemask, &local_err);
> +    hmp_handle_error(mon, &local_err);
> +}
> diff --git a/hmp.h b/hmp.h
> index 56d2e92..81f631b 100644
> --- a/hmp.h
> +++ b/hmp.h
> @@ -86,5 +86,6 @@ void hmp_nbd_server_stop(Monitor *mon, const QDict *qdict);
>  void hmp_chardev_add(Monitor *mon, const QDict *qdict);
>  void hmp_chardev_remove(Monitor *mon, const QDict *qdict);
>  void hmp_qemu_io(Monitor *mon, const QDict *qdict);
> +void hmp_set_mpol(Monitor *mon, const QDict *qdict);
>  
>  #endif
> 
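
To make the suggested "mem-policy=...,mem-hostnode=..." syntax concrete,
here is a small standalone sketch that splits such a string into its two
values and rejects unknown or repeated keys. It deliberately does not use
QemuOpts, which is what the real code should use; the hand-written parser
and the names parse_mpol_opts and struct sketch_mpol_opts are assumptions
made purely to illustrate the syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_mpol_opts {
    char policy[16];    /* value of mem-policy, e.g. "membind" */
    char hostnode[32];  /* value of mem-hostnode, e.g. "0-1" */
};

/* Store one value; reject duplicates, empty values and oversized values. */
static int store_value(char *dst, size_t cap, const char *val, size_t len)
{
    if (dst[0] != '\0' || len == 0 || len >= cap) {
        return -1;
    }
    memcpy(dst, val, len);
    dst[len] = '\0';
    return 0;
}

/* Compare a key that is not NUL-terminated against a known name. */
static int key_is(const char *key, size_t len, const char *name)
{
    return len == strlen(name) && memcmp(key, name, len) == 0;
}

/* Parse "key=value[,key=value...]". Returns 0 on success, -1 on error. */
static int parse_mpol_opts(const char *opts, struct sketch_mpol_opts *out)
{
    const char *p = opts;

    assert(opts != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t item_len = comma ? (size_t)(comma - p) : strlen(p);
        const char *eq = memchr(p, '=', item_len);
        size_t key_len, val_len;
        int ret;

        if (eq == NULL) {
            return -1;                      /* item without '=' */
        }
        key_len = (size_t)(eq - p);
        val_len = item_len - key_len - 1;

        if (key_is(p, key_len, "mem-policy")) {
            ret = store_value(out->policy, sizeof(out->policy),
                              eq + 1, val_len);
        } else if (key_is(p, key_len, "mem-hostnode")) {
            ret = store_value(out->hostnode, sizeof(out->hostnode),
                              eq + 1, val_len);
        } else {
            ret = -1;                       /* unknown key */
        }
        if (ret < 0) {
            return -1;
        }

        p += item_len;
        if (*p == ',') {
            p++;
        }
    }
    return 0;
}
```

Rejecting a repeated key also covers the case where more than one policy is
given for the same node.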

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node
  2013-06-18  9:21   ` Paolo Bonzini
@ 2013-06-18  9:44     ` Wanlong Gao
  2013-06-18  9:57       ` Paolo Bonzini
  0 siblings, 1 reply; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  9:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: andre.przywara, aliguori, ehabkost, qemu-devel, y-goto, afaerber,
	Wanlong Gao

On 06/18/2013 05:21 PM, Paolo Bonzini wrote:
> Il 18/06/2013 10:09, Wanlong Gao ha scritto:
>> The QMP command let it be able to set node's memory policy
>> through the QMP protocol. The qmp-shell command is like:
>>     set-mpol nodeid=0 mpol=membind nodemask=0-1
>>
>> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
> 
> How would this work with mem-path?

This can also set mempolicy for mem-path backed memory in
guest nodes. So we don't need to know if we are using
mem-path.

Thanks,
Wanlong Gao

> 
> Paolo
> 
>> ---
>>  cpus.c                  | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  include/sysemu/sysemu.h |  1 +
>>  qapi-schema.json        | 13 +++++++++++
>>  qmp-commands.hx         | 35 ++++++++++++++++++++++++++++
>>  vl.c                    |  2 +-
>>  5 files changed, 111 insertions(+), 1 deletion(-)
>>
>> diff --git a/cpus.c b/cpus.c
>> index b868932..a2836e9 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -1431,3 +1431,64 @@ void qmp_inject_nmi(Error **errp)
>>      error_set(errp, QERR_UNSUPPORTED);
>>  #endif
>>  }
>> +
>> +void qmp_set_mpol(int64_t nodeid, bool has_mpol, const char *mpol,
>> +                  bool has_nodemask, const char *nodemask, Error **errp)
>> +{
>> +    unsigned int ret;
>> +    unsigned int flags;
>> +    DECLARE_BITMAP(host_mem, MAX_CPUMASK_BITS);
>> +
>> +    if (nodeid >= nb_numa_nodes) {
>> +        error_setg(errp, "Only has '%d' NUMA nodes", nb_numa_nodes);
>> +        return;
>> +    }
>> +
>> +    bitmap_copy(host_mem, numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
>> +    flags = numa_info[nodeid].flags;
>> +
>> +    numa_info[nodeid].flags = NODE_HOST_NONE;
>> +    bitmap_zero(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
>> +
>> +    if (!has_mpol) {
>> +        if (set_node_mpol(nodeid) == -1) {
>> +            goto error;
>> +        }
>> +        return;
>> +    }
>> +
>> +    if (!strcmp(mpol, "membind")) {
>> +        numa_info[nodeid].flags |= NODE_HOST_BIND;
>> +    } else if (!strcmp(mpol, "interleave")) {
>> +        numa_info[nodeid].flags |= NODE_HOST_INTERLEAVE;
>> +    } else if (!strcmp(mpol, "preferred")) {
>> +        numa_info[nodeid].flags |= NODE_HOST_PREFERRED;
>> +    } else {
>> +        error_setg(errp, "Invalid NUMA policy '%s'", mpol);
>> +        goto error;
>> +    }
>> +
>> +    if (!has_nodemask) {
>> +        bitmap_fill(numa_info[nodeid].host_mem, MAX_CPUMASK_BITS);
>> +    }
>> +
>> +    if (nodemask) {
>> +        ret = numa_node_parse_mpol(nodemask, numa_info[nodeid].host_mem);
>> +    }
>> +    if (ret == 4) {
>> +        goto error;
>> +    } else if (ret & 1) {
>> +        numa_info[nodeid].flags |= NODE_HOST_RELATIVE;
>> +    }
>> +
>> +    if (set_node_mpol(nodeid) == -1) {
>> +        goto error;
>> +    }
>> +
>> +    return;
>> +
>> +error:
>> +    bitmap_copy(numa_info[nodeid].host_mem, host_mem, MAX_CPUMASK_BITS);
>> +    numa_info[nodeid].flags = flags;
>> +    return;
>> +}
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index 993b8e0..7d804af 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -144,6 +144,7 @@ struct node_info {
>>      unsigned int flags;
>>  };
>>  extern struct node_info numa_info[MAX_NODES];
>> +extern unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm);
>>  
>>  #define MAX_OPTION_ROMS 16
>>  typedef struct QEMUOptionRom {
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index a80ee40..403c703 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -3608,3 +3608,16 @@
>>              '*cpuid-input-ecx': 'int',
>>              'cpuid-register': 'X86CPURegister32',
>>              'features': 'int' } }
>> +
>> +# @set-mpol:
>> +#
>> +# Set the host memory binding policy for guest NUMA node.
>> +#
>> +# @node-id: The node ID of guest NUMA node to set memory policy to.
>> +#
>> +# @mpol: The memory policy string to set.
>> +#
>> +# Since: 1.6.0
>> +##
>> +{ 'command': 'set-mpol', 'data': {'nodeid': 'int', '*mpol': 'str',
>> +                                  '*nodemask': 'str'} }
>> diff --git a/qmp-commands.hx b/qmp-commands.hx
>> index 8cea5e5..930c844 100644
>> --- a/qmp-commands.hx
>> +++ b/qmp-commands.hx
>> @@ -2997,3 +2997,38 @@ Example:
>>  <- { "return": {} }
>>  
>>  EQMP
>> +
>> +    {
>> +        .name      = "set-mpol",
>> +        .args_type = "nodeid:i,mpol:s?,nodemask:s?",
>> +        .help      = "Set the host memory binding policy for guest NUMA node",
>> +        .mhandler.cmd_new = qmp_marshal_input_set_mpol,
>> +    },
>> +
>> +SQMP
>> +set-mpol
>> +------
>> +
>> +Set the host memory binding policy for guest NUMA node
>> +
>> +Arguments:
>> +
>> +- "nodeid": The nodeid of guest NUMA node to set memory policy to.
>> +              (json-int)
>> +- "mpol": The memory policy string to set.
>> +           (json-string, optional)
>> +- "nodemask": The node mask contained to mpol.
>> +                (json-string, optional)
>> +
>> +Example:
>> +
>> +-> { "execute": "set-mpol", "arguments": { "nodeid": 0, "mpol": "membind",
>> +                                           "nodemask": "0-1" }}
>> +<- { "return": {} }
>> +
>> +Notes:
>> +    1. If "mpol" is not set, the memory policy of this "nodeid" will be set
>> +       to "default".
>> +    2. If "nodemask" is not set, the node mask of this "mpol" will be set
>> +       to "all".
>> +EQMP
>> diff --git a/vl.c b/vl.c
>> index ada9fb2..73af85e 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -1348,7 +1348,7 @@ error:
>>      exit(1);
>>  }
>>  
>> -static unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
>> +unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
>>  {
>>      unsigned long long value, endvalue;
>>      char *endptr;
>>
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 6/7] NUMA: add hmp command set-mpol
  2013-06-18  9:23   ` Paolo Bonzini
@ 2013-06-18  9:49     ` Wanlong Gao
  0 siblings, 0 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: andre.przywara, aliguori, ehabkost, qemu-devel, y-goto, afaerber,
	Wanlong Gao

On 06/18/2013 05:23 PM, Paolo Bonzini wrote:
> Il 18/06/2013 10:09, Wanlong Gao ha scritto:
>> Add hmp command set-mpol to set host memory policy for a guest
>> NUMA node. Then we can also set node's memory policy using
>> the monitor command like:
>>     (qemu) set-mpol 0 membind 0-1
> 
> I suggest something similar to what chardev-add does: Just make it
> "set-mpol <nodeid> <qemuopts>", for example
> 
>  (qemu) set-mpol 0 mem-policy=membind,mem-hostnode=0-1
> 
> Similar to the command-line syntax.

That's better. I'll try to change it like this, thank you very much. ;)

Regards,
Wanlong Gao

> 
> Paolo
> 
>> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>> ---
>>  hmp-commands.hx | 16 ++++++++++++++++
>>  hmp.c           | 22 ++++++++++++++++++++++
>>  hmp.h           |  1 +
>>  3 files changed, 39 insertions(+)
>>
>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>> index 915b0d1..fd3505e 100644
>> --- a/hmp-commands.hx
>> +++ b/hmp-commands.hx
>> @@ -1567,6 +1567,22 @@ Executes a qemu-io command on the given block device.
>>  ETEXI
>>  
>>      {
>> +        .name       = "set-mpol",
>> +        .args_type  = "nodeid:i,mpol:s?,nodemask:s?",
>> +        .params     = "nodeid [mpol] [nodemask]",
>> +        .help       = "set host memory policy for a guest NUMA node",
>> +        .mhandler.cmd = hmp_set_mpol,
>> +    },
>> +
>> +STEXI
>> +@item set-mpol @var{nodeid} @var{mpol} @var{nodemask}
>> +@findex set-mpol
>> +
>> +Set host memory policy for a guest NUMA node
>> +
>> +ETEXI
>> +
>> +    {
>>          .name       = "info",
>>          .args_type  = "item:s?",
>>          .params     = "[subcommand]",
>> diff --git a/hmp.c b/hmp.c
>> index 494a9aa..2e5315e 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -1464,3 +1464,25 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
>>  
>>      hmp_handle_error(mon, &err);
>>  }
>> +
>> +void hmp_set_mpol(Monitor *mon, const QDict *qdict)
>> +{
>> +    Error *local_err = NULL;
>> +    bool has_mpol = true;
>> +    bool has_nodemask = true;
>> +
>> +    uint64_t nodeid = qdict_get_int(qdict, "nodeid");
>> +    const char *mpol = qdict_get_try_str(qdict, "mpol");
>> +    const char *nodemask = qdict_get_try_str(qdict, "nodemask");
>> +
>> +    if (mpol == NULL) {
>> +        has_mpol = false;
>> +    }
>> +
>> +    if (nodemask == NULL) {
>> +        has_nodemask = false;
>> +    }
>> +
>> +    qmp_set_mpol(nodeid, has_mpol, mpol, has_nodemask, nodemask, &local_err);
>> +    hmp_handle_error(mon, &local_err);
>> +}
>> diff --git a/hmp.h b/hmp.h
>> index 56d2e92..81f631b 100644
>> --- a/hmp.h
>> +++ b/hmp.h
>> @@ -86,5 +86,6 @@ void hmp_nbd_server_stop(Monitor *mon, const QDict *qdict);
>>  void hmp_chardev_add(Monitor *mon, const QDict *qdict);
>>  void hmp_chardev_remove(Monitor *mon, const QDict *qdict);
>>  void hmp_qemu_io(Monitor *mon, const QDict *qdict);
>> +void hmp_set_mpol(Monitor *mon, const QDict *qdict);
>>  
>>  #endif
>>
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-18  9:20   ` Paolo Bonzini
@ 2013-06-18  9:54     ` Wanlong Gao
  2013-06-18 19:00     ` Eduardo Habkost
  1 sibling, 0 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-18  9:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: andre.przywara, aliguori, ehabkost, qemu-devel, y-goto, afaerber,
	Wanlong Gao

On 06/18/2013 05:20 PM, Paolo Bonzini wrote:
> Il 18/06/2013 10:09, Wanlong Gao ha scritto:
>> +static unsigned int numa_node_parse_mpol(const char *str, unsigned long *bm)
>> +{
>> +    unsigned long long value, endvalue;
>> +    char *endptr;
>> +    unsigned int flags = 0;
>> +
>> +    if (str[0] == '!') {
>> +        flags |= 2;
> 
> clear = true;
> 
>> +        bitmap_fill(bm, MAX_CPUMASK_BITS);
>> +        str++;
>> +    }
>> +    if (str[0] == '+') {
>> +        flags |= 1;
> 
> flags = NODE_HOST_RELATIVE
> 
>> +        str++;
>> +    }
>> +
>> +    if (!strcmp(str, "all")) {
>> +        bitmap_fill(bm, MAX_CPUMASK_BITS);
>> +        return flags;
>> +    }
>> +
>> +    if (parse_uint(str, &value, &endptr, 10) < 0)
>> +        goto error;
>> +    if (*endptr == '-') {
>> +        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
>> +            goto error;
>> +        }
>> +    } else if (*endptr == '\0') {
>> +        endvalue = value;
>> +    } else {
>> +        goto error;
>> +    }
>> +
>> +    if (endvalue >= MAX_CPUMASK_BITS) {
>> +        endvalue = MAX_CPUMASK_BITS - 1;
>> +        fprintf(stderr,
>> +            "qemu: NUMA: A max of %d host nodes are supported\n",
>> +             MAX_CPUMASK_BITS);
>> +    }
>> +
>> +    if (endvalue < value) {
>> +        goto error;
>> +    }
>> +
>> +    if (flags & 2)
> 
> if (clear)
> 
>> +        bitmap_clear(bm, value, endvalue - value + 1);
>> +    else
>> +        bitmap_set(bm, value, endvalue - value + 1);
>> +
>> +    return flags;
>> +
>> +error:
>> +    fprintf(stderr, "qemu: Invalid host NUMA nodes range: %s\n", str);
> 
> Please change the functions (numa_add and numa_node_parse_mpol) to
> accept an Error *.  This will make it much easier to reuse them for e.g.
> memory hotplug in the future.

Got it, I'll try. Thank you.

> 
>> +    return 4;
> 
> return -EINVAL;
> 
>> +}
> 
>> +        if (get_param_value(option, 128, "interleave", optarg) != 0)
>> +            numa_info[nodenr].flags |= NODE_HOST_INTERLEAVE;
>> +        else if (get_param_value(option, 128, "preferred", optarg) != 0)
>> +            numa_info[nodenr].flags |= NODE_HOST_PREFERRED;
>> +        else if (get_param_value(option, 128, "membind", optarg) != 0)
>> +            numa_info[nodenr].flags |= NODE_HOST_BIND;
> 
> You're not handling the case where someone specifies more than one option.
> 
> What about:
> 
>    policy={interleave,preferred,bind},mem-hostnode=0
> 
> ?

OK, will follow this, thank you.

> 
> Also, please use QemuOpts instead of yet another homegrown parser.
> Eduardo, I think you had the most recent attempt to convert -numa to
> QemuOpts?

So, are there any patches I can base this on, or should I change it myself?  Eduardo?


Thanks,
Wanlong Gao

> 
>> +        if (option[0] != 0) {
>> +            ret = numa_node_parse_mpol(option, numa_info[nodenr].host_mem);
>> +            if (ret == 4) {
> 
> if (ret < 0)
> 
>> +                exit(1);
>> +            } else if (ret & 1) {
>> +                numa_info[nodenr].flags |= NODE_HOST_RELATIVE;
> 
> else {
>     numa_info[nodenr].flags |= ret;
> }
> 
>> +            }
>> +        }
>> +
> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node
  2013-06-18  9:44     ` Wanlong Gao
@ 2013-06-18  9:57       ` Paolo Bonzini
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2013-06-18  9:57 UTC (permalink / raw)
  To: gaowanlong; +Cc: aliguori, ehabkost, qemu-devel, y-goto, afaerber

Il 18/06/2013 11:44, Wanlong Gao ha scritto:
> On 06/18/2013 05:21 PM, Paolo Bonzini wrote:
>> Il 18/06/2013 10:09, Wanlong Gao ha scritto:
>>> The QMP command let it be able to set node's memory policy
>>> through the QMP protocol. The qmp-shell command is like:
>>>     set-mpol nodeid=0 mpol=membind nodemask=0-1
>>>
>>> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>>
>> How would this work with mem-path?
> 
> This can also set mempolicy for mem-path backed memory in
> guest nodes. So we don't need to know if we are using
> mem-path.

Cool, I didn't know this.  Thanks.

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-18  9:20   ` Paolo Bonzini
  2013-06-18  9:54     ` Wanlong Gao
@ 2013-06-18 19:00     ` Eduardo Habkost
  2013-06-18 20:19       ` Bandan Das
  1 sibling, 1 reply; 20+ messages in thread
From: Eduardo Habkost @ 2013-06-18 19:00 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: andre.przywara, aliguori, qemu-devel, y-goto, afaerber, Wanlong Gao

On Tue, Jun 18, 2013 at 11:20:37AM +0200, Paolo Bonzini wrote:
[...]
> Also, please use QemuOpts instead of yet another homegrown parser.
> Eduardo, I think you had the most recent attempt to convert -numa to
> QemuOpts?

I had one, but I believe it is more complex than it should have been. I
was creating a "numa-node" config section while keeping "-numa" just for
compatibility, but I don't think we really need to do that.

If you want to take a look, the old attempt is at:
  git://github.com/ehabkost/qemu-hacks.git work/old-numa-node-config-section-experiment

-- 
Eduardo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-18 19:00     ` Eduardo Habkost
@ 2013-06-18 20:19       ` Bandan Das
  2013-06-19  8:01         ` Wanlong Gao
  0 siblings, 1 reply; 20+ messages in thread
From: Bandan Das @ 2013-06-18 20:19 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: andre.przywara, aliguori, qemu-devel, Paolo Bonzini, y-goto,
	afaerber, Wanlong Gao

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Tue, Jun 18, 2013 at 11:20:37AM +0200, Paolo Bonzini wrote:
> [...]
>> Also, please use QemuOpts instead of yet another homegrown parser.
>> Eduardo, I think you had the most recent attempt to convert -numa to
>> QemuOpts?
>
> I had one, but I believe it is more complex than it should have been. I
> was creating a "numa-node" config section while keeping "-numa" just for
> compatibility, but I don't think we really need to do that.

Ah, I was working on an update to Eduardo's earlier proposals for multiple CPU ranges,
and part of the change was to convert to QemuOpts.

It probably needs more testing, but I posted it anyway since we are already discussing this:
[PATCH v3] vl.c: Support multiple CPU ranges on -numa option
(hasn't shown up in the archives yet)

Bandan

> If you want to take a look, the old attempt is at:
>   git://github.com/ehabkost/qemu-hacks.git work/old-numa-node-config-section-experiment


* Re: [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-18 20:19       ` Bandan Das
@ 2013-06-19  8:01         ` Wanlong Gao
  2013-06-19 17:39           ` Paolo Bonzini
  0 siblings, 1 reply; 20+ messages in thread
From: Wanlong Gao @ 2013-06-19  8:01 UTC (permalink / raw)
  To: Bandan Das
  Cc: andre.przywara, aliguori, Eduardo Habkost, qemu-devel, y-goto,
	Paolo Bonzini, afaerber, Wanlong Gao

On 06/19/2013 04:19 AM, Bandan Das wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
>> On Tue, Jun 18, 2013 at 11:20:37AM +0200, Paolo Bonzini wrote:
>> [...]
>>> Also, please use QemuOpts instead of yet another homegrown parser.
>>> Eduardo, I think you had the most recent attempt to convert -numa to
>>> QemuOpts?
>>
>> I had one, but I believe it is more complex than it should have been. I
>> was creating a "numa-node" config section while keeping "-numa" just for
>> compatibility, but I don't think we really need to do that.
> 
> Ah, I was working on an update to Eduardo's earlier proposals for multiple CPU ranges
> and part of the change was to convert to QemuOpts. 
> 
> It probably needs more testing, but I posted it anyway since we are already discussing this:
> [PATCH v3] vl.c: Support multiple CPU ranges on -numa option
> (hasn't shown up in the archives yet)

Here is the archive: http://thread.gmane.org/gmane.comp.emulators.qemu/217491

So, do you all ACK this? And are we disregarding compatibility by using
"cpu" instead of "cpus" here?


Thanks,
Wanlong Gao

> 
> Bandan
> 
>> If you want to take a look, the old attempt is at:
>>   git://github.com/ehabkost/qemu-hacks.git work/old-numa-node-config-section-experiment
> 


* Re: [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-19  8:01         ` Wanlong Gao
@ 2013-06-19 17:39           ` Paolo Bonzini
  2013-06-20  0:01             ` Wanlong Gao
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2013-06-19 17:39 UTC (permalink / raw)
  To: gaowanlong
  Cc: andre.przywara, aliguori, Eduardo Habkost, qemu-devel,
	Bandan Das, y-goto, afaerber

On 19/06/2013 10:01, Wanlong Gao wrote:
> On 06/19/2013 04:19 AM, Bandan Das wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>>
>>> On Tue, Jun 18, 2013 at 11:20:37AM +0200, Paolo Bonzini wrote:
>>> [...]
>>>> Also, please use QemuOpts instead of yet another homegrown parser.
>>>> Eduardo, I think you had the most recent attempt to convert -numa to
>>>> QemuOpts?
>>>
>>> I had one, but I believe it is more complex than it should have been. I
>>> was creating a "numa-node" config section while keeping "-numa" just for
>>> compatibility, but I don't think we really need to do that.
>>
>> Ah, I was working on an update to Eduardo's earlier proposals for multiple CPU ranges
>> and part of the change was to convert to QemuOpts. 
>>
>> It probably needs more testing, but I posted it anyway since we are already discussing this:
>> [PATCH v3] vl.c: Support multiple CPU ranges on -numa option
>> (hasn't shown up in the archives yet)
> 
> Here is the archive: http://thread.gmane.org/gmane.comp.emulators.qemu/217491
> 
> So, do you all ACK this? And are we disregarding compatibility by using
> "cpu" instead of "cpus" here?

No; as Eduardo pointed out, the "cpus" must be kept.  But apart from
that, picking up Bandan's patch in v2 of this series should be fine.

Paolo


* Re: [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy
  2013-06-19 17:39           ` Paolo Bonzini
@ 2013-06-20  0:01             ` Wanlong Gao
  0 siblings, 0 replies; 20+ messages in thread
From: Wanlong Gao @ 2013-06-20  0:01 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: andre.przywara, aliguori, Eduardo Habkost, qemu-devel,
	Bandan Das, y-goto, afaerber, Wanlong Gao

On 06/20/2013 01:39 AM, Paolo Bonzini wrote:
> On 19/06/2013 10:01, Wanlong Gao wrote:
>> On 06/19/2013 04:19 AM, Bandan Das wrote:
>>> Eduardo Habkost <ehabkost@redhat.com> writes:
>>>
>>>> On Tue, Jun 18, 2013 at 11:20:37AM +0200, Paolo Bonzini wrote:
>>>> [...]
>>>>> Also, please use QemuOpts instead of yet another homegrown parser.
>>>>> Eduardo, I think you had the most recent attempt to convert -numa to
>>>>> QemuOpts?
>>>>
>>>> I had one, but I believe it is more complex than it should have been. I
>>>> was creating a "numa-node" config section while keeping "-numa" just for
>>>> compatibility, but I don't think we really need to do that.
>>>
>>> Ah, I was working on an update to Eduardo's earlier proposals for multiple CPU ranges
>>> and part of the change was to convert to QemuOpts. 
>>>
>>> It probably needs more testing, but I posted it anyway since we are already discussing this:
>>> [PATCH v3] vl.c: Support multiple CPU ranges on -numa option
>>> (hasn't shown up in the archives yet)
>>
>> Here is the archive: http://thread.gmane.org/gmane.comp.emulators.qemu/217491
>>
>> So, do you all ACK this? And are we disregarding compatibility by using
>> "cpu" instead of "cpus" here?
> 
> No; as Eduardo pointed out, the "cpus" must be kept.  But apart from
> that, picking up Bandan's patch in v2 of this series should be fine.

Got it, thank you.

Regards,
Wanlong Gao

> 
> Paolo
> 
> 



Thread overview: 20+ messages
2013-06-18  8:09 [Qemu-devel] [PATCH 0/7] Add support for binding guest numa nodes to host numa nodes Wanlong Gao
2013-06-18  8:09 ` [Qemu-devel] [PATCH 1/7] Add numa_info structure to contain numa nodes info Wanlong Gao
2013-06-18  8:09 ` [Qemu-devel] [PATCH 2/7] Add Linux libnuma detection Wanlong Gao
2013-06-18  8:09 ` [Qemu-devel] [PATCH 3/7] NUMA: parse guest numa nodes memory policy Wanlong Gao
2013-06-18  9:20   ` Paolo Bonzini
2013-06-18  9:54     ` Wanlong Gao
2013-06-18 19:00     ` Eduardo Habkost
2013-06-18 20:19       ` Bandan Das
2013-06-19  8:01         ` Wanlong Gao
2013-06-19 17:39           ` Paolo Bonzini
2013-06-20  0:01             ` Wanlong Gao
2013-06-18  8:09 ` [Qemu-devel] [PATCH 4/7] NUMA: set " Wanlong Gao
2013-06-18  8:09 ` [Qemu-devel] [PATCH 5/7] NUMA: add qmp command set-mpol to set memory policy for NUMA node Wanlong Gao
2013-06-18  9:21   ` Paolo Bonzini
2013-06-18  9:44     ` Wanlong Gao
2013-06-18  9:57       ` Paolo Bonzini
2013-06-18  8:09 ` [Qemu-devel] [PATCH 6/7] NUMA: add hmp command set-mpol Wanlong Gao
2013-06-18  9:23   ` Paolo Bonzini
2013-06-18  9:49     ` Wanlong Gao
2013-06-18  8:09 ` [Qemu-devel] [PATCH 7/7] NUMA: show host memory policy info in info numa command Wanlong Gao
