* [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
@ 2014-03-04 14:00 Paolo Bonzini
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

This series includes all the pending work on QOMifying the memory
backends.  Some of the patches posted so far didn't build correctly
on all targets, so I'm posting the whole series as the basis for
further work (including memory hotplug!).  It's available in the
"numa" branch of my GitHub repository.

To recap, the idea is to delegate all properties of the memory
backend to a new QOM class hierarchy, in which the concrete classes
are hostmem-ram and hostmem-file.  The backend is passed to the
machine via "-numa node,memdev=foo" where "foo" is the id of the
backend object.

Patches 1-6: generic numa improvements
Patches 7-8: convert -m to QemuOpts, from the memory hotplug series
Patches 9-11: improvements to object-add and -object
Patches 12-15: infrastructure for QOM memory backends
Patches 16-20: preparation for replacing -mem-path with a QOM backend
Patches 21-24: new-style replacement for -mem-path, -mem-prealloc and madvise
Patch 25: support for MAP_SHARED (requested by vhost-user)
Patches 26-28: rest of Wanlong Gao and Hu Tao's NUMA policy work

New patches are patch 9, patch 11, patches 16-25.  Please review
them.

Missing, needed before merge:
1. conversion to memory_region_allocate_system_memory of all boards
2. more documentation

Missing, nice to have:
1. improvements to string visitors for ranges
2. HMP "info memdev" command

If anyone wants to pick up management of this series from now on,
just shout.

Paolo

Hu Tao (2):
  hostmem: add properties for NUMA memory policy
  qmp: add query-memdev

Igor Mammedov (3):
  vl: convert -m to QemuOpts
  qmp: allow object-add completion handler to get canonical path
  add memdev backend infrastructure

Luiz Capitulino (1):
  man: improve -numa doc

Paolo Bonzini (16):
  qemu-option: introduce qemu_find_opts_singleton
  vl: redo -object parsing
  qmp: improve error reporting for -object and object-add
  pc: pass QEMUMachineInitArgs to pc_memory_init
  numa: introduce memory_region_allocate_system_memory
  numa: add -numa node,memdev= option
  memory: reorganize file-based allocation
  memory: move mem_path handling to memory_region_allocate_system_memory
  memory: add error propagation to file-based RAM allocation
  memory: move preallocation code out of exec.c
  memory: move RAM_PREALLOC_MASK to exec.c, rename
  hostmem: add file-based HostMemoryBackend
  hostmem: separate allocation from UserCreatable complete method
  hostmem: add merge and dump properties
  hostmem: allow preallocation of any memory region
  hostmem: add property to map memory with MAP_SHARED

Wanlong Gao (6):
  NUMA: move numa related code to new file numa.c
  NUMA: check if the total numa memory size is equal to ram_size
  NUMA: Add numa_info structure to contain numa nodes info
  NUMA: convert -numa option to use OptsVisitor
  NUMA: expand MAX_NODES from 64 to 128
  configure: add Linux libnuma detection

 Makefile.target            |   2 +-
 backends/Makefile.objs     |   3 +
 backends/hostmem-file.c    | 134 +++++++++++++++++
 backends/hostmem-ram.c     |  50 +++++++
 backends/hostmem.c         | 346 ++++++++++++++++++++++++++++++++++++++++++++
 configure                  |  33 +++++
 cpus.c                     |  14 --
 exec.c                     | 206 +++++++++++++--------------
 hw/i386/pc.c               |  27 ++--
 hw/i386/pc_piix.c          |   8 +-
 hw/i386/pc_q35.c           |   4 +-
 hw/ppc/spapr.c             |  10 +-
 include/exec/cpu-all.h     |   8 --
 include/exec/cpu-common.h  |   2 +
 include/exec/memory.h      |  33 +++++
 include/exec/ram_addr.h    |   3 +
 include/hw/boards.h        |   4 +
 include/hw/i386/pc.h       |   7 +-
 include/qemu/config-file.h |   2 +
 include/qemu/osdep.h       |  12 ++
 include/sysemu/cpus.h      |   1 -
 include/sysemu/hostmem.h   |  68 +++++++++
 include/sysemu/sysemu.h    |  18 ++-
 memory.c                   |  29 ++++
 monitor.c                  |   2 +-
 numa.c                     | 347 +++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json           |  89 ++++++++++++
 qemu-options.hx            |  25 +++-
 qmp-commands.hx            |  32 +++++
 qmp.c                      |  15 +-
 util/oslib-posix.c         |  73 ++++++++++
 util/qemu-config.c         |  14 ++
 vl.c                       | 323 ++++++++++++++---------------------------
 33 files changed, 1549 insertions(+), 395 deletions(-)
 create mode 100644 backends/hostmem-file.c
 create mode 100644 backends/hostmem-ram.c
 create mode 100644 backends/hostmem.c
 create mode 100644 include/sysemu/hostmem.h
 create mode 100644 numa.c

-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 01/28] NUMA: move numa related code to new file numa.c
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Wanlong Gao <gaowanlong@cn.fujitsu.com>

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Makefile.target           |   2 +-
 cpus.c                    |  14 ----
 include/exec/cpu-all.h    |   2 -
 include/exec/cpu-common.h |   2 +
 include/sysemu/cpus.h     |   1 -
 include/sysemu/sysemu.h   |   3 +
 numa.c                    | 186 ++++++++++++++++++++++++++++++++++++++++++++++
 vl.c                      | 139 +---------------------------------
 8 files changed, 193 insertions(+), 156 deletions(-)
 create mode 100644 numa.c

diff --git a/Makefile.target b/Makefile.target
index ba12340..234d9cd 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -109,7 +109,7 @@ endif #CONFIG_BSD_USER
 #########################################################
 # System emulator target
 ifdef CONFIG_SOFTMMU
-obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
+obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
 obj-y += qtest.o
 obj-y += hw/
 obj-$(CONFIG_FDT) += device_tree.o
diff --git a/cpus.c b/cpus.c
index 945d85b..891d062 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1299,20 +1299,6 @@ static void tcg_exec_all(void)
     exit_request = 0;
 }
 
-void set_numa_modes(void)
-{
-    CPUState *cpu;
-    int i;
-
-    CPU_FOREACH(cpu) {
-        for (i = 0; i < nb_numa_nodes; i++) {
-            if (test_bit(cpu->cpu_index, node_cpumask[i])) {
-                cpu->numa_node = i;
-            }
-        }
-    }
-}
-
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
 {
     /* XXX: implement xxx_cpu_list for targets that still miss it */
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 4cb4b4a..e66ab5b 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -438,8 +438,6 @@ void cpu_watchpoint_remove_all(CPUArchState *env, int mask);
 
 /* memory API */
 
-extern ram_addr_t ram_size;
-
 /* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */
 #define RAM_PREALLOC_MASK   (1 << 0)
 
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index a21b65a..e8c7970 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -45,6 +45,8 @@ typedef uintptr_t ram_addr_t;
 #  define RAM_ADDR_FMT "%" PRIxPTR
 #endif
 
+extern ram_addr_t ram_size;
+
 /* memory API */
 
 typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value);
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index 6502488..4f79081 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -23,7 +23,6 @@ extern int smp_threads;
 #define smp_threads 1
 #endif
 
-void set_numa_modes(void);
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg);
 
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 495dae8..2509649 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -136,6 +136,9 @@ extern QEMUClockType rtc_clock;
 extern int nb_numa_nodes;
 extern uint64_t node_mem[MAX_NODES];
 extern unsigned long *node_cpumask[MAX_NODES];
+void numa_add(const char *optarg);
+void set_numa_nodes(void);
+void set_numa_modes(void);
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/numa.c b/numa.c
new file mode 100644
index 0000000..395c14f
--- /dev/null
+++ b/numa.c
@@ -0,0 +1,186 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2013 Fujitsu Ltd.
+ * Author: Wanlong Gao <gaowanlong@cn.fujitsu.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "sysemu/sysemu.h"
+#include "exec/cpu-common.h"
+#include "qemu/bitmap.h"
+#include "qom/cpu.h"
+
+static void numa_node_parse_cpus(int nodenr, const char *cpus)
+{
+    char *endptr;
+    unsigned long long value, endvalue;
+
+    /* Empty CPU range strings will be considered valid, they will simply
+     * not set any bit in the CPU bitmap.
+     */
+    if (!*cpus) {
+        return;
+    }
+
+    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
+        goto error;
+    }
+    if (*endptr == '-') {
+        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
+            goto error;
+        }
+    } else if (*endptr == '\0') {
+        endvalue = value;
+    } else {
+        goto error;
+    }
+
+    if (endvalue >= MAX_CPUMASK_BITS) {
+        endvalue = MAX_CPUMASK_BITS - 1;
+        fprintf(stderr,
+            "qemu: NUMA: A max of %d VCPUs are supported\n",
+             MAX_CPUMASK_BITS);
+    }
+
+    if (endvalue < value) {
+        goto error;
+    }
+
+    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+    return;
+
+error:
+    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
+    exit(1);
+}
+
+void numa_add(const char *optarg)
+{
+    char option[128];
+    char *endptr;
+    unsigned long long nodenr;
+
+    optarg = get_opt_name(option, 128, optarg, ',');
+    if (*optarg == ',') {
+        optarg++;
+    }
+    if (!strcmp(option, "node")) {
+
+        if (nb_numa_nodes >= MAX_NODES) {
+            fprintf(stderr, "qemu: too many NUMA nodes\n");
+            exit(1);
+        }
+
+        if (get_param_value(option, 128, "nodeid", optarg) == 0) {
+            nodenr = nb_numa_nodes;
+        } else {
+            if (parse_uint_full(option, &nodenr, 10) < 0) {
+                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
+                exit(1);
+            }
+        }
+
+        if (nodenr >= MAX_NODES) {
+            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
+            exit(1);
+        }
+
+        if (get_param_value(option, 128, "mem", optarg) == 0) {
+            node_mem[nodenr] = 0;
+        } else {
+            int64_t sval;
+            sval = strtosz(option, &endptr);
+            if (sval < 0 || *endptr) {
+                fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
+                exit(1);
+            }
+            node_mem[nodenr] = sval;
+        }
+        if (get_param_value(option, 128, "cpus", optarg) != 0) {
+            numa_node_parse_cpus(nodenr, option);
+        }
+        nb_numa_nodes++;
+    } else {
+        fprintf(stderr, "Invalid -numa option: %s\n", option);
+        exit(1);
+    }
+}
+
+void set_numa_nodes(void)
+{
+    if (nb_numa_nodes > 0) {
+        int i;
+
+        if (nb_numa_nodes > MAX_NODES) {
+            nb_numa_nodes = MAX_NODES;
+        }
+
+        /* If no memory size if given for any node, assume the default case
+         * and distribute the available memory equally across all nodes
+         */
+        for (i = 0; i < nb_numa_nodes; i++) {
+            if (node_mem[i] != 0) {
+                break;
+            }
+        }
+        if (i == nb_numa_nodes) {
+            uint64_t usedmem = 0;
+
+            /* On Linux, the each node's border has to be 8MB aligned,
+             * the final node gets the rest.
+             */
+            for (i = 0; i < nb_numa_nodes - 1; i++) {
+                node_mem[i] = (ram_size / nb_numa_nodes) & ~((1 << 23UL) - 1);
+                usedmem += node_mem[i];
+            }
+            node_mem[i] = ram_size - usedmem;
+        }
+
+        for (i = 0; i < nb_numa_nodes; i++) {
+            if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
+                break;
+            }
+        }
+        /* assigning the VCPUs round-robin is easier to implement, guest OSes
+         * must cope with this anyway, because there are BIOSes out there in
+         * real machines which also use this scheme.
+         */
+        if (i == nb_numa_nodes) {
+            for (i = 0; i < max_cpus; i++) {
+                set_bit(i, node_cpumask[i % nb_numa_nodes]);
+            }
+        }
+    }
+}
+
+void set_numa_modes(void)
+{
+    CPUState *cpu;
+    int i;
+
+    CPU_FOREACH(cpu) {
+        for (i = 0; i < nb_numa_nodes; i++) {
+            if (test_bit(cpu->cpu_index, node_cpumask[i])) {
+                cpu->numa_node = i;
+            }
+        }
+    }
+}
diff --git a/vl.c b/vl.c
index 1d27b34..f4d143e 100644
--- a/vl.c
+++ b/vl.c
@@ -1211,102 +1211,6 @@ char *get_boot_devices_list(size_t *size)
     return list;
 }
 
-static void numa_node_parse_cpus(int nodenr, const char *cpus)
-{
-    char *endptr;
-    unsigned long long value, endvalue;
-
-    /* Empty CPU range strings will be considered valid, they will simply
-     * not set any bit in the CPU bitmap.
-     */
-    if (!*cpus) {
-        return;
-    }
-
-    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
-        goto error;
-    }
-    if (*endptr == '-') {
-        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
-            goto error;
-        }
-    } else if (*endptr == '\0') {
-        endvalue = value;
-    } else {
-        goto error;
-    }
-
-    if (endvalue >= MAX_CPUMASK_BITS) {
-        endvalue = MAX_CPUMASK_BITS - 1;
-        fprintf(stderr,
-            "qemu: NUMA: A max of %d VCPUs are supported\n",
-             MAX_CPUMASK_BITS);
-    }
-
-    if (endvalue < value) {
-        goto error;
-    }
-
-    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
-    return;
-
-error:
-    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
-    exit(1);
-}
-
-static void numa_add(const char *optarg)
-{
-    char option[128];
-    char *endptr;
-    unsigned long long nodenr;
-
-    optarg = get_opt_name(option, 128, optarg, ',');
-    if (*optarg == ',') {
-        optarg++;
-    }
-    if (!strcmp(option, "node")) {
-
-        if (nb_numa_nodes >= MAX_NODES) {
-            fprintf(stderr, "qemu: too many NUMA nodes\n");
-            exit(1);
-        }
-
-        if (get_param_value(option, 128, "nodeid", optarg) == 0) {
-            nodenr = nb_numa_nodes;
-        } else {
-            if (parse_uint_full(option, &nodenr, 10) < 0) {
-                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
-                exit(1);
-            }
-        }
-
-        if (nodenr >= MAX_NODES) {
-            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
-            exit(1);
-        }
-
-        if (get_param_value(option, 128, "mem", optarg) == 0) {
-            node_mem[nodenr] = 0;
-        } else {
-            int64_t sval;
-            sval = strtosz(option, &endptr);
-            if (sval < 0 || *endptr) {
-                fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
-                exit(1);
-            }
-            node_mem[nodenr] = sval;
-        }
-        if (get_param_value(option, 128, "cpus", optarg) != 0) {
-            numa_node_parse_cpus(nodenr, option);
-        }
-        nb_numa_nodes++;
-    } else {
-        fprintf(stderr, "Invalid -numa option: %s\n", option);
-        exit(1);
-    }
-}
-
 static QemuOptsList qemu_smp_opts = {
     .name = "smp-opts",
     .implied_opt_name = "cpus",
@@ -4156,48 +4060,7 @@ int main(int argc, char **argv, char **envp)
 
     register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL);
 
-    if (nb_numa_nodes > 0) {
-        int i;
-
-        if (nb_numa_nodes > MAX_NODES) {
-            nb_numa_nodes = MAX_NODES;
-        }
-
-        /* If no memory size if given for any node, assume the default case
-         * and distribute the available memory equally across all nodes
-         */
-        for (i = 0; i < nb_numa_nodes; i++) {
-            if (node_mem[i] != 0)
-                break;
-        }
-        if (i == nb_numa_nodes) {
-            uint64_t usedmem = 0;
-
-            /* On Linux, the each node's border has to be 8MB aligned,
-             * the final node gets the rest.
-             */
-            for (i = 0; i < nb_numa_nodes - 1; i++) {
-                node_mem[i] = (ram_size / nb_numa_nodes) & ~((1 << 23UL) - 1);
-                usedmem += node_mem[i];
-            }
-            node_mem[i] = ram_size - usedmem;
-        }
-
-        for (i = 0; i < nb_numa_nodes; i++) {
-            if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
-                break;
-            }
-        }
-        /* assigning the VCPUs round-robin is easier to implement, guest OSes
-         * must cope with this anyway, because there are BIOSes out there in
-         * real machines which also use this scheme.
-         */
-        if (i == nb_numa_nodes) {
-            for (i = 0; i < max_cpus; i++) {
-                set_bit(i, node_cpumask[i % nb_numa_nodes]);
-            }
-        }
-    }
+    set_numa_nodes();
 
     if (qemu_opts_foreach(qemu_find_opts("mon"), mon_init_func, NULL, 1) != 0) {
         exit(1);
-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 02/28] NUMA: check if the total numa memory size is equal to ram_size
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Wanlong Gao <gaowanlong@cn.fujitsu.com>

If the total memory assigned to the NUMA nodes is not equal to the
assigned RAM size, QEMU writes wrong data to the ACPI table.  The
guest then ignores the invalid ACPI table and places all memory in
a single node.  Check the total up front so that only correct data
is written to the ACPI table.
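
As an illustration only, the consistency check can be sketched as a
standalone helper.  The function name numa_sizes_match and its signature
are hypothetical and do not appear in the patch, which performs the same
summation inline in set_numa_nodes() and exits on a mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: returns true when the
 * per-node memory sizes add up to exactly ram_size.
 */
static bool numa_sizes_match(const uint64_t *node_mem, size_t nb_nodes,
                             uint64_t ram_size)
{
    uint64_t total = 0;
    size_t i;

    /* An empty node list is allowed; otherwise the array must exist. */
    assert(node_mem != NULL || nb_nodes == 0);

    for (i = 0; i < nb_nodes; i++) {
        total += node_mem[i];
    }
    return total == ram_size;
}
```

A caller would report an error and stop when the helper returns false,
which mirrors the fprintf() plus exit(1) path added by the patch.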

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 numa.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/numa.c b/numa.c
index 395c14f..8ba66f1 100644
--- a/numa.c
+++ b/numa.c
@@ -127,6 +127,7 @@ void numa_add(const char *optarg)
 void set_numa_nodes(void)
 {
     if (nb_numa_nodes > 0) {
+        uint64_t numa_total;
         int i;
 
         if (nb_numa_nodes > MAX_NODES) {
@@ -154,6 +155,16 @@ void set_numa_nodes(void)
             node_mem[i] = ram_size - usedmem;
         }
 
+        numa_total = 0;
+        for (i = 0; i < nb_numa_nodes; i++) {
+            numa_total += node_mem[i];
+        }
+        if (numa_total != ram_size) {
+            fprintf(stderr, "qemu: numa nodes total memory size "
+                            "should equal to ram_size\n");
+            exit(1);
+        }
+
         for (i = 0; i < nb_numa_nodes; i++) {
             if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
                 break;
-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 03/28] NUMA: Add numa_info structure to contain numa nodes info
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andre Przywara, ehabkost, hutao, mtosatti, imammedo, a.motakis,
	gaowanlong

From: Wanlong Gao <gaowanlong@cn.fujitsu.com>

Add the numa_info structure to hold each NUMA node's memory size and
VCPU bitmap.  It will also hold the per-node host memory policies
that later patches add.
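
A rough standalone sketch of the layout follows: one record per node,
with the CPU bitmap embedded in the structure rather than allocated
separately.  The names NodeInfoSketch, BITMAP_LONGS, node_set_cpu and
node_has_cpu are made up for this example; the patch itself uses
DECLARE_BITMAP(), bitmap_set() and test_bit() from qemu/bitmap.h.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUMASK_BITS 255
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
/* Number of unsigned longs needed to hold nbits bits (hypothetical macro). */
#define BITMAP_LONGS(nbits) (((nbits) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for the NodeInfo type introduced by the patch. */
typedef struct NodeInfoSketch {
    uint64_t node_mem;                                      /* bytes of RAM */
    unsigned long node_cpu[BITMAP_LONGS(MAX_CPUMASK_BITS)]; /* VCPU bitmap */
} NodeInfoSketch;

/* Mark VCPU index cpu as belonging to node n. */
static void node_set_cpu(NodeInfoSketch *n, unsigned cpu)
{
    assert(cpu < MAX_CPUMASK_BITS);
    n->node_cpu[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

/* Return true if VCPU index cpu belongs to node n. */
static bool node_has_cpu(const NodeInfoSketch *n, unsigned cpu)
{
    assert(cpu < MAX_CPUMASK_BITS);
    return (n->node_cpu[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}
```

Because the bitmap is part of the structure, a zero-initialized array of
such records starts with empty CPU sets, which is why the vl.c hunk can
replace bitmap_new() with bitmap_zero().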

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
[Fix hw/ppc/spapr.c - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/pc.c            | 12 ++++++++----
 hw/ppc/spapr.c          | 10 +++++-----
 include/sysemu/sysemu.h |  8 ++++++--
 monitor.c               |  2 +-
 numa.c                  | 23 ++++++++++++-----------
 vl.c                    |  7 +++----
 6 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e715a33..a464e48 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -674,14 +674,14 @@ static FWCfgState *bochs_bios_init(void)
         unsigned int apic_id = x86_cpu_apic_id_from_index(i);
         assert(apic_id < apic_id_limit);
         for (j = 0; j < nb_numa_nodes; j++) {
-            if (test_bit(i, node_cpumask[j])) {
+            if (test_bit(i, numa_info[j].node_cpu)) {
                 numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
                 break;
             }
         }
     }
     for (i = 0; i < nb_numa_nodes; i++) {
-        numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(node_mem[i]);
+        numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(numa_info[i].node_mem);
     }
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg,
                      (1 + apic_id_limit + nb_numa_nodes) *
@@ -1077,8 +1077,12 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
     guest_info->apic_id_limit = pc_apic_id_limit(max_cpus);
     guest_info->apic_xrupt_override = kvm_allows_irq0_override();
     guest_info->numa_nodes = nb_numa_nodes;
-    guest_info->node_mem = g_memdup(node_mem, guest_info->numa_nodes *
+    guest_info->node_mem = g_malloc0(guest_info->numa_nodes *
                                     sizeof *guest_info->node_mem);
+    for (i = 0; i < nb_numa_nodes; i++) {
+        guest_info->node_mem[i] = numa_info[i].node_mem;
+    }
+
     guest_info->node_cpu = g_malloc0(guest_info->apic_id_limit *
                                      sizeof *guest_info->node_cpu);
 
@@ -1086,7 +1090,7 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
         unsigned int apic_id = x86_cpu_apic_id_from_index(i);
         assert(apic_id < guest_info->apic_id_limit);
         for (j = 0; j < nb_numa_nodes; j++) {
-            if (test_bit(i, node_cpumask[j])) {
+            if (test_bit(i, numa_info[j].node_cpu)) {
                 guest_info->node_cpu[apic_id] = j;
                 break;
             }
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 93d02c1e..6db9df3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -531,8 +531,8 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
     int i, off;
 
     /* memory node(s) */
-    if (nb_numa_nodes > 1 && node_mem[0] < ram_size) {
-        node0_size = node_mem[0];
+    if (nb_numa_nodes > 1 && numa_info[0].node_mem < ram_size) {
+        node0_size = numa_info[0].node_mem;
     } else {
         node0_size = ram_size;
     }
@@ -570,7 +570,7 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
         if (mem_start >= ram_size) {
             node_size = 0;
         } else {
-            node_size = node_mem[i];
+            node_size = numa_info[i].node_mem;
             if (node_size > ram_size - mem_start) {
                 node_size = ram_size - mem_start;
             }
@@ -697,7 +697,7 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
 
     /* Update the RMA size if necessary */
     if (spapr->vrma_adjust) {
-        hwaddr node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
+        hwaddr node0_size = (nb_numa_nodes > 1) ? numa_info[0].node_mem : ram_size;
         spapr->rma_size = kvmppc_rma_size(node0_size, spapr->htab_shift);
     }
 }
@@ -1115,7 +1115,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
     MemoryRegion *sysmem = get_system_memory();
     MemoryRegion *ram = g_new(MemoryRegion, 1);
     hwaddr rma_alloc_size;
-    hwaddr node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
+    hwaddr node0_size = (nb_numa_nodes > 1) ? numa_info[0].node_mem : ram_size;
     uint32_t initrd_base = 0;
     long kernel_size = 0, initrd_size = 0;
     long load_limit, rtas_limit, fw_size;
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 2509649..d873b42 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -9,6 +9,7 @@
 #include "qapi-types.h"
 #include "qemu/notify.h"
 #include "qemu/main-loop.h"
+#include "qemu/bitmap.h"
 
 /* vl.c */
 
@@ -134,8 +135,11 @@ extern QEMUClockType rtc_clock;
 #define MAX_NODES 64
 #define MAX_CPUMASK_BITS 255
 extern int nb_numa_nodes;
-extern uint64_t node_mem[MAX_NODES];
-extern unsigned long *node_cpumask[MAX_NODES];
+typedef struct node_info {
+    uint64_t node_mem;
+    DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+} NodeInfo;
+extern NodeInfo numa_info[MAX_NODES];
 void numa_add(const char *optarg);
 void set_numa_nodes(void);
 void set_numa_modes(void);
diff --git a/monitor.c b/monitor.c
index aebcbd8..d7f1ade 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2010,7 +2010,7 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
         }
         monitor_printf(mon, "\n");
         monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
-            node_mem[i] >> 20);
+            numa_info[i].node_mem >> 20);
     }
 }
 
diff --git a/numa.c b/numa.c
index 8ba66f1..36c5134 100644
--- a/numa.c
+++ b/numa.c
@@ -64,7 +64,7 @@ static void numa_node_parse_cpus(int nodenr, const char *cpus)
         goto error;
     }
 
-    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+    bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
     return;
 
 error:
@@ -104,7 +104,7 @@ void numa_add(const char *optarg)
         }
 
         if (get_param_value(option, 128, "mem", optarg) == 0) {
-            node_mem[nodenr] = 0;
+            numa_info[nodenr].node_mem = 0;
         } else {
             int64_t sval;
             sval = strtosz(option, &endptr);
@@ -112,7 +112,7 @@ void numa_add(const char *optarg)
                 fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
                 exit(1);
             }
-            node_mem[nodenr] = sval;
+            numa_info[nodenr].node_mem = sval;
         }
         if (get_param_value(option, 128, "cpus", optarg) != 0) {
             numa_node_parse_cpus(nodenr, option);
@@ -138,7 +138,7 @@ void set_numa_nodes(void)
          * and distribute the available memory equally across all nodes
          */
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (node_mem[i] != 0) {
+            if (numa_info[i].node_mem != 0) {
                 break;
             }
         }
@@ -149,15 +149,16 @@ void set_numa_nodes(void)
              * the final node gets the rest.
              */
             for (i = 0; i < nb_numa_nodes - 1; i++) {
-                node_mem[i] = (ram_size / nb_numa_nodes) & ~((1 << 23UL) - 1);
-                usedmem += node_mem[i];
+                numa_info[i].node_mem = (ram_size / nb_numa_nodes) &
+                                        ~((1 << 23UL) - 1);
+                usedmem += numa_info[i].node_mem;
             }
-            node_mem[i] = ram_size - usedmem;
+            numa_info[i].node_mem = ram_size - usedmem;
         }
 
         numa_total = 0;
         for (i = 0; i < nb_numa_nodes; i++) {
-            numa_total += node_mem[i];
+            numa_total += numa_info[i].node_mem;
         }
         if (numa_total != ram_size) {
             fprintf(stderr, "qemu: numa nodes total memory size "
@@ -166,7 +167,7 @@ void set_numa_nodes(void)
         }
 
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
+            if (!bitmap_empty(numa_info[i].node_cpu, MAX_CPUMASK_BITS)) {
                 break;
             }
         }
@@ -176,7 +177,7 @@ void set_numa_nodes(void)
          */
         if (i == nb_numa_nodes) {
             for (i = 0; i < max_cpus; i++) {
-                set_bit(i, node_cpumask[i % nb_numa_nodes]);
+                set_bit(i, numa_info[i % nb_numa_nodes].node_cpu);
             }
         }
     }
@@ -189,7 +190,7 @@ void set_numa_modes(void)
 
     CPU_FOREACH(cpu) {
         for (i = 0; i < nb_numa_nodes; i++) {
-            if (test_bit(cpu->cpu_index, node_cpumask[i])) {
+            if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
                 cpu->numa_node = i;
             }
         }
diff --git a/vl.c b/vl.c
index f4d143e..69649fc 100644
--- a/vl.c
+++ b/vl.c
@@ -196,8 +196,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
     QTAILQ_HEAD_INITIALIZER(fw_boot_order);
 
 int nb_numa_nodes;
-uint64_t node_mem[MAX_NODES];
-unsigned long *node_cpumask[MAX_NODES];
+NodeInfo numa_info[MAX_NODES];
 
 uint8_t qemu_uuid[16];
 bool qemu_uuid_set;
@@ -2788,8 +2787,8 @@ int main(int argc, char **argv, char **envp)
     translation = BIOS_ATA_TRANSLATION_AUTO;
 
     for (i = 0; i < MAX_NODES; i++) {
-        node_mem[i] = 0;
-        node_cpumask[i] = bitmap_new(MAX_CPUMASK_BITS);
+        numa_info[i].node_mem = 0;
+        bitmap_zero(numa_info[i].node_cpu, MAX_CPUMASK_BITS);
     }
 
     nb_numa_nodes = 0;
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 04/28] NUMA: convert -numa option to use OptsVisitor
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (2 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 03/28] NUMA: Add numa_info structure to contain numa nodes info Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 05/28] NUMA: expand MAX_NODES from 64 to 128 Paolo Bonzini
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Wanlong Gao <gaowanlong@cn.fujitsu.com>

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/sysemu/sysemu.h |   3 +-
 numa.c                  | 145 +++++++++++++++++++++++-------------------------
 qapi-schema.json        |  32 +++++++++++
 vl.c                    |  11 +++-
 4 files changed, 114 insertions(+), 77 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index d873b42..20b05a3 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -140,9 +140,10 @@ typedef struct node_info {
     DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
 } NodeInfo;
 extern NodeInfo numa_info[MAX_NODES];
-void numa_add(const char *optarg);
 void set_numa_nodes(void);
 void set_numa_modes(void);
+extern QemuOptsList qemu_numa_opts;
+int numa_init_func(QemuOpts *opts, void *opaque);
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/numa.c b/numa.c
index 36c5134..6563232 100644
--- a/numa.c
+++ b/numa.c
@@ -27,101 +27,96 @@
 #include "exec/cpu-common.h"
 #include "qemu/bitmap.h"
 #include "qom/cpu.h"
-
-static void numa_node_parse_cpus(int nodenr, const char *cpus)
+#include "qapi-visit.h"
+#include "qapi/opts-visitor.h"
+#include "qapi/dealloc-visitor.h"
+#include "qapi/qmp/qerror.h"
+
+QemuOptsList qemu_numa_opts = {
+    .name = "numa",
+    .implied_opt_name = "type",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_numa_opts.head),
+    .desc = { { 0 } } /* validated with OptsVisitor */
+};
+
+static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
 {
-    char *endptr;
-    unsigned long long value, endvalue;
+    uint16_t nodenr;
+    uint16List *cpus = NULL;
 
-    /* Empty CPU range strings will be considered valid, they will simply
-     * not set any bit in the CPU bitmap.
-     */
-    if (!*cpus) {
-        return;
-    }
-
-    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
-        goto error;
-    }
-    if (*endptr == '-') {
-        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
-            goto error;
-        }
-    } else if (*endptr == '\0') {
-        endvalue = value;
+    if (node->has_nodeid) {
+        nodenr = node->nodeid;
     } else {
-        goto error;
+        nodenr = nb_numa_nodes;
     }
 
-    if (endvalue >= MAX_CPUMASK_BITS) {
-        endvalue = MAX_CPUMASK_BITS - 1;
-        fprintf(stderr,
-            "qemu: NUMA: A max of %d VCPUs are supported\n",
-             MAX_CPUMASK_BITS);
+    if (nodenr >= MAX_NODES) {
+        error_setg(errp, "Max number of NUMA nodes reached: %"
+                   PRIu16, nodenr);
+        return;
     }
 
-    if (endvalue < value) {
-        goto error;
+    for (cpus = node->cpus; cpus; cpus = cpus->next) {
+        if (cpus->value >= MAX_CPUMASK_BITS) {
+            error_setg(errp, "CPU number %" PRIu16 " is bigger than %d",
+                       cpus->value, MAX_CPUMASK_BITS - 1);
+            return;
+        }
+        bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
     }
 
-    bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
-    return;
-
-error:
-    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
-    exit(1);
+    if (node->has_mem) {
+        uint64_t mem_size = node->mem;
+        const char *mem_str = qemu_opt_get(opts, "mem");
+        /* Fix up legacy suffix-less format */
+        if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) {
+            mem_size <<= 20;
+        }
+        numa_info[nodenr].node_mem = mem_size;
+    }
 }
 
-void numa_add(const char *optarg)
+int numa_init_func(QemuOpts *opts, void *opaque)
 {
-    char option[128];
-    char *endptr;
-    unsigned long long nodenr;
+    NumaOptions *object = NULL;
+    Error *err = NULL;
 
-    optarg = get_opt_name(option, 128, optarg, ',');
-    if (*optarg == ',') {
-        optarg++;
+    {
+        OptsVisitor *ov = opts_visitor_new(opts);
+        visit_type_NumaOptions(opts_get_visitor(ov), &object, NULL, &err);
+        opts_visitor_cleanup(ov);
     }
-    if (!strcmp(option, "node")) {
 
-        if (nb_numa_nodes >= MAX_NODES) {
-            fprintf(stderr, "qemu: too many NUMA nodes\n");
-            exit(1);
-        }
+    if (err) {
+        goto error;
+    }
 
-        if (get_param_value(option, 128, "nodeid", optarg) == 0) {
-            nodenr = nb_numa_nodes;
-        } else {
-            if (parse_uint_full(option, &nodenr, 10) < 0) {
-                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
-                exit(1);
-            }
+    switch (object->kind) {
+    case NUMA_OPTIONS_KIND_NODE:
+        numa_node_parse(object->node, opts, &err);
+        if (err) {
+            goto error;
         }
+        nb_numa_nodes++;
+        break;
+    default:
+        abort();
+    }
 
-        if (nodenr >= MAX_NODES) {
-            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
-            exit(1);
-        }
+    return 0;
 
-        if (get_param_value(option, 128, "mem", optarg) == 0) {
-            numa_info[nodenr].node_mem = 0;
-        } else {
-            int64_t sval;
-            sval = strtosz(option, &endptr);
-            if (sval < 0 || *endptr) {
-                fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
-                exit(1);
-            }
-            numa_info[nodenr].node_mem = sval;
-        }
-        if (get_param_value(option, 128, "cpus", optarg) != 0) {
-            numa_node_parse_cpus(nodenr, option);
-        }
-        nb_numa_nodes++;
-    } else {
-        fprintf(stderr, "Invalid -numa option: %s\n", option);
-        exit(1);
+error:
+    qerror_report_err(err);
+    error_free(err);
+
+    if (object) {
+        QapiDeallocVisitor *dv = qapi_dealloc_visitor_new();
+        visit_type_NumaOptions(qapi_dealloc_get_visitor(dv),
+                               &object, NULL, NULL);
+        qapi_dealloc_visitor_cleanup(dv);
     }
+
+    return -1;
 }
 
 void set_numa_nodes(void)
diff --git a/qapi-schema.json b/qapi-schema.json
index ac8ad24..ca8c11e 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4518,3 +4518,35 @@
 # Since: 1.7
 ##
 { 'command': 'blockdev-add', 'data': { 'options': 'BlockdevOptions' } }
+
+##
+# @NumaOptions
+#
+# A discriminated record of NUMA options. (for OptsVisitor)
+#
+# Since: 2.1
+##
+{ 'union': 'NumaOptions',
+  'data': {
+    'node': 'NumaNodeOptions' }}
+
+##
+# @NumaNodeOptions
+#
+# Create a guest NUMA node. (for OptsVisitor)
+#
+# @nodeid: #optional NUMA node ID (increase by 1 from 0 if omitted)
+#
+# @cpus: #optional VCPUs belonging to this node (assign VCPUS round-robin
+#         if omitted)
+#
+# @mem: #optional memory size of this node (equally divide total memory among
+#        nodes if omitted)
+#
+# Since: 2.1
+##
+{ 'type': 'NumaNodeOptions',
+  'data': {
+   '*nodeid': 'uint16',
+   '*cpus':   ['uint16'],
+   '*mem':    'size' }}
diff --git a/vl.c b/vl.c
index 69649fc..899b63f 100644
--- a/vl.c
+++ b/vl.c
@@ -2766,6 +2766,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_tpmdev_opts);
     qemu_add_opts(&qemu_realtime_opts);
     qemu_add_opts(&qemu_msg_opts);
+    qemu_add_opts(&qemu_numa_opts);
 
     runstate_init();
 
@@ -2963,7 +2964,10 @@ int main(int argc, char **argv, char **envp)
                 }
                 break;
             case QEMU_OPTION_numa:
-                numa_add(optarg);
+                opts = qemu_opts_parse(qemu_find_opts("numa"), optarg, 1);
+                if (!opts) {
+                    exit(1);
+                }
                 break;
             case QEMU_OPTION_display:
                 display_type = select_display(optarg);
@@ -4059,6 +4063,11 @@ int main(int argc, char **argv, char **envp)
 
     register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL);
 
+    if (qemu_opts_foreach(qemu_find_opts("numa"), numa_init_func,
+                          NULL, 1) != 0) {
+        exit(1);
+    }
+
     set_numa_nodes();
 
     if (qemu_opts_foreach(qemu_find_opts("mon"), mon_init_func, NULL, 1) != 0) {
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 05/28] NUMA: expand MAX_NODES from 64 to 128
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (3 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 04/28] NUMA: convert -numa option to use OptsVisitor Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 06/28] man: improve -numa doc Paolo Bonzini
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Wanlong Gao <gaowanlong@cn.fujitsu.com>

libnuma chose 128 for MAX_NODES, so we follow libnuma here.

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/sysemu/sysemu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 20b05a3..4c94cf5 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -132,7 +132,7 @@ extern size_t boot_splash_filedata_size;
 extern uint8_t qemu_extra_params_fw[2];
 extern QEMUClockType rtc_clock;
 
-#define MAX_NODES 64
+#define MAX_NODES 128
 #define MAX_CPUMASK_BITS 255
 extern int nb_numa_nodes;
 typedef struct node_info {
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 06/28] man: improve -numa doc
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (4 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 05/28] NUMA: expand MAX_NODES from 64 to 128 Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-11 18:53   ` Eduardo Habkost
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton Paolo Bonzini
                   ` (22 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, hutao, mtosatti, Luiz Capitulino, imammedo, a.motakis,
	gaowanlong

From: Luiz Capitulino <lcapitulino@redhat.com>

The -numa option documentation in qemu's manpage lacks the command-line
options and some information regarding how it relates to options -m and
-smp. This commit fills in the missing text.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qemu-options.hx | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 56e5fdf..f948f28 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -97,10 +97,14 @@ ETEXI
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
     "-numa node[,mem=size][,cpus=cpu[-cpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
 STEXI
-@item -numa @var{opts}
+@item -numa node[,mem=@var{size}][,cpus=@var{cpu[-cpu]}][,nodeid=@var{node}]
 @findex -numa
-Simulate a multi node NUMA system. If mem and cpus are omitted, resources
-are split equally.
+Simulate a multi node NUMA system. If @samp{mem}
+and @samp{cpus} are omitted, resources are split equally. Also, note
+that the @option{-numa} option doesn't allocate any of the specified
+resources. That is, it just assigns existing resources to NUMA nodes. This
+means that one still has to use the @option{-m}, @option{-smp} options
+to respectively allocate RAM and vCPUs.
 ETEXI
 
 DEF("add-fd", HAS_ARG, QEMU_OPTION_add_fd,
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (5 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 06/28] man: improve -numa doc Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-05 10:08   ` Andreas Färber
                     ` (2 more replies)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts Paolo Bonzini
                   ` (21 subsequent siblings)
  28 siblings, 3 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/config-file.h |  2 ++
 util/qemu-config.c         | 14 ++++++++++++++
 vl.c                       | 11 +----------
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/include/qemu/config-file.h b/include/qemu/config-file.h
index dbd97c4..d4ba20e 100644
--- a/include/qemu/config-file.h
+++ b/include/qemu/config-file.h
@@ -8,6 +8,8 @@
 
 QemuOptsList *qemu_find_opts(const char *group);
 QemuOptsList *qemu_find_opts_err(const char *group, Error **errp);
+QemuOpts *qemu_find_opts_singleton(const char *group);
+
 void qemu_add_opts(QemuOptsList *list);
 void qemu_add_drive_opts(QemuOptsList *list);
 int qemu_set_option(const char *str);
diff --git a/util/qemu-config.c b/util/qemu-config.c
index f610101..60051df 100644
--- a/util/qemu-config.c
+++ b/util/qemu-config.c
@@ -39,6 +39,20 @@ QemuOptsList *qemu_find_opts(const char *group)
     return ret;
 }
 
+QemuOpts *qemu_find_opts_singleton(const char *group)
+{
+    QemuOptsList *list;
+    QemuOpts *opts;
+
+    list = qemu_find_opts(group);
+    assert(list);
+    opts = qemu_opts_find(list, NULL);
+    if (!opts) {
+        opts = qemu_opts_create(list, NULL, 0, &error_abort);
+    }
+    return opts;
+}
+
 static CommandLineParameterInfoList *query_option_descs(const QemuOptDesc *desc)
 {
     CommandLineParameterInfoList *param_list = NULL, *entry;
diff --git a/vl.c b/vl.c
index 899b63f..dafe6f6 100644
--- a/vl.c
+++ b/vl.c
@@ -485,16 +485,7 @@ static QemuOptsList qemu_msg_opts = {
  */
 QemuOpts *qemu_get_machine_opts(void)
 {
-    QemuOptsList *list;
-    QemuOpts *opts;
-
-    list = qemu_find_opts("machine");
-    assert(list);
-    opts = qemu_opts_find(list, NULL);
-    if (!opts) {
-        opts = qemu_opts_create(list, NULL, 0, &error_abort);
-    }
-    return opts;
+    return qemu_find_opts_singleton("machine");
 }
 
 const char *qemu_get_vm_name(void)
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (6 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-05 10:06   ` Andreas Färber
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 09/28] vl: redo -object parsing Paolo Bonzini
                   ` (20 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Igor Mammedov <imammedo@redhat.com>

Add a "mem" option to -m:
 "mem" - startup memory amount

For compatibility with the legacy CLI, a suffix-less number is
interpreted as an amount in megabytes.

Otherwise the user is free to pass a suffixed number, using the
suffixes b, k/K, M or G.
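
As an illustration of the compatibility rule above, here is a minimal
standalone sketch.  It assumes the value has already been parsed as a
byte count; fixup_legacy_mem is a hypothetical name, and plain isdigit()
stands in for the g_ascii_isdigit() call used in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: apply the legacy
 * "a suffix-less number means megabytes" rule to a parsed size.
 * Returns 0 and stores the result in *out, or -1 on empty input
 * or overflow.
 */
static int fixup_legacy_mem(const char *mem_str, uint64_t parsed,
                            uint64_t *out)
{
    size_t len = strlen(mem_str);

    if (len == 0) {
        return -1;                  /* empty value, nothing to fix up */
    }
    /* A trailing digit means the user gave no suffix. */
    if (isdigit((unsigned char)mem_str[len - 1])) {
        uint64_t scaled = parsed << 20;

        /* Shifting back must give the original value, else we overflowed. */
        if ((scaled >> 20) != parsed) {
            return -1;
        }
        parsed = scaled;
    }
    *out = parsed;
    return 0;
}
```

With this rule "-m 512" and "-m 512M" end up with the same byte count,
while "-m 1G" is left as parsed.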

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qemu-options.hx |  9 +++++---
 vl.c            | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index f948f28..98e78ca 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -214,10 +214,13 @@ use is discouraged as it may be removed from future versions.
 ETEXI
 
 DEF("m", HAS_ARG, QEMU_OPTION_m,
-    "-m megs         set virtual RAM size to megs MB [default="
-    stringify(DEFAULT_RAM_SIZE) "]\n", QEMU_ARCH_ALL)
+    "-m [mem=]megs\n"
+    "                configure guest RAM\n"
+    "                mem: initial amount of guest memory (default: "
+    stringify(DEFAULT_RAM_SIZE) "MiB)\n",
+    QEMU_ARCH_ALL)
 STEXI
-@item -m @var{megs}
+@item -m [mem=]@var{megs}
 @findex -m
 Set virtual RAM size to @var{megs} megabytes. Default is 128 MiB.  Optionally,
 a suffix of ``M'' or ``G'' can be used to signify a value in megabytes or
diff --git a/vl.c b/vl.c
index dafe6f6..ac5f425 100644
--- a/vl.c
+++ b/vl.c
@@ -478,6 +478,20 @@ static QemuOptsList qemu_msg_opts = {
     },
 };
 
+static QemuOptsList qemu_mem_opts = {
+    .name = "memory",
+    .implied_opt_name = "mem",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_mem_opts.head),
+    .merge_lists = true,
+    .desc = {
+        {
+            .name = "mem",
+            .type = QEMU_OPT_SIZE,
+        },
+        { /* end of list */ }
+    },
+};
+
 /**
  * Get machine options
  *
@@ -2718,6 +2732,7 @@ int main(int argc, char **argv, char **envp)
     };
     const char *trace_events = NULL;
     const char *trace_file = NULL;
+    const ram_addr_t default_ram_size = (ram_addr_t)DEFAULT_RAM_SIZE * 1024 * 1024;
 
     atexit(qemu_run_exit_notifiers);
     error_set_progname(argv[0]);
@@ -2758,6 +2773,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_realtime_opts);
     qemu_add_opts(&qemu_msg_opts);
     qemu_add_opts(&qemu_numa_opts);
+    qemu_add_opts(&qemu_mem_opts);
 
     runstate_init();
 
@@ -2773,7 +2789,7 @@ int main(int argc, char **argv, char **envp)
     module_call_init(MODULE_INIT_MACHINE);
     machine = find_default_machine();
     cpu_model = NULL;
-    ram_size = 0;
+    ram_size = default_ram_size;
     snapshot = 0;
     cyls = heads = secs = 0;
     translation = BIOS_ATA_TRANSLATION_AUTO;
@@ -3063,20 +3079,50 @@ int main(int argc, char **argv, char **envp)
                 exit(0);
                 break;
             case QEMU_OPTION_m: {
-                int64_t value;
                 uint64_t sz;
-                char *end;
+                const char *mem_str;
 
-                value = strtosz(optarg, &end);
-                if (value < 0 || *end) {
-                    fprintf(stderr, "qemu: invalid ram size: %s\n", optarg);
-                    exit(1);
+                opts = qemu_opts_parse(qemu_find_opts("memory"),
+                                       optarg, 1);
+                if (!opts) {
+                    exit(EXIT_FAILURE);
+                }
+
+                mem_str = qemu_opt_get(opts, "mem");
+                if (!mem_str) {
+                    fprintf(stderr, "qemu: invalid -m option, missing "
+                            "'mem' option\n");
+                    exit(EXIT_FAILURE);
+                }
+                if (!*mem_str) {
+                    fprintf(stderr, "qemu: missing 'mem' option value\n");
+                    exit(EXIT_FAILURE);
+                }
+
+                sz = qemu_opt_get_size(opts, "mem", ram_size);
+
+                /* Fix up legacy suffix-less format */
+                if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) {
+                    uint64_t overflow_check = sz;
+
+                    sz <<= 20;
+                    if ((sz >> 20) != overflow_check) {
+                        fprintf(stderr, "qemu: too large 'mem' option "
+                                "value\n");
+                        exit(EXIT_FAILURE);
+                    }
+                }
+
+                /* backward compatibility behaviour for case "-m 0" */
+                if (sz == 0) {
+                    sz = default_ram_size;
                 }
-                sz = QEMU_ALIGN_UP((uint64_t)value, 8192);
+
+                sz = QEMU_ALIGN_UP(sz, 8192);
                 ram_size = sz;
                 if (ram_size != sz) {
                     fprintf(stderr, "qemu: ram size too large\n");
-                    exit(1);
+                    exit(EXIT_FAILURE);
                 }
                 break;
             }
@@ -3921,10 +3967,8 @@ int main(int argc, char **argv, char **envp)
         exit(1);
     }
 
-    /* init the memory */
-    if (ram_size == 0) {
-        ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
-    }
+    /* store value for the future use */
+    qemu_opt_set_number(qemu_find_opts_singleton("memory"), "mem", ram_size);
 
     if (qemu_opts_foreach(qemu_find_opts("device"), device_help_func, NULL, 0)
         != 0) {
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 09/28] vl: redo -object parsing
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (7 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-07  2:56   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 10/28] qmp: allow object-add completion handler to get canonical path Paolo Bonzini
                   ` (19 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Follow the lines of the HMP implementation, using OptsVisitor
to parse the options.  This gives access to OptsVisitor's
rich parsing of integer lists.
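
To make "integer lists" concrete: a value such as "0-3" is meant to
stand for the elements 0, 1, 2 and 3.  The following standalone sketch
shows that kind of expansion.  It is not OptsVisitor's code; the names
sketch_mark_range and SKETCH_MAX_CPUS are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_MAX_CPUS 255     /* hypothetical limit for the example */

/*
 * Hypothetical helper: parse "N" or "N-M" and set flags[N..M] to 1.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int sketch_mark_range(const char *s,
                             unsigned char flags[SKETCH_MAX_CPUS])
{
    char *end;
    unsigned long lo, hi, i;

    if (!isdigit((unsigned char)*s)) {
        return -1;              /* rejects "", "-3", " 1" */
    }
    errno = 0;
    lo = strtoul(s, &end, 10);
    if (errno) {
        return -1;
    }
    if (*end == '-') {
        const char *p = end + 1;

        if (!isdigit((unsigned char)*p)) {
            return -1;          /* rejects "1-" */
        }
        hi = strtoul(p, &end, 10);
        if (errno) {
            return -1;
        }
    } else {
        hi = lo;                /* single value */
    }
    if (*end != '\0' || hi < lo || hi >= SKETCH_MAX_CPUS) {
        return -1;
    }
    for (i = lo; i <= hi; i++) {
        flags[i] = 1;
    }
    return 0;
}
```

A reversed range such as "5-2" is rejected here rather than silently
swapped.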

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 vl.c | 87 +++++++++++++++++++++++++++-----------------------------------------
 1 file changed, 35 insertions(+), 52 deletions(-)

diff --git a/vl.c b/vl.c
index ac5f425..e8709ee 100644
--- a/vl.c
+++ b/vl.c
@@ -119,8 +119,7 @@ int main(int argc, char **argv)
 #include "qemu/osdep.h"
 
 #include "ui/qemu-spice.h"
-#include "qapi/string-input-visitor.h"
-#include "qom/object_interfaces.h"
+#include "qapi/opts-visitor.h"
 
 #define DEFAULT_RAM_SIZE 128
 
@@ -2629,69 +2628,53 @@ static void free_and_trace(gpointer mem)
     free(mem);
 }
 
-static int object_set_property(const char *name, const char *value, void *opaque)
-{
-    Object *obj = OBJECT(opaque);
-    StringInputVisitor *siv;
-    Error *local_err = NULL;
-
-    if (strcmp(name, "qom-type") == 0 || strcmp(name, "id") == 0) {
-        return 0;
-    }
-
-    siv = string_input_visitor_new(value);
-    object_property_set(obj, string_input_get_visitor(siv), name, &local_err);
-    string_input_visitor_cleanup(siv);
-
-    if (local_err) {
-        qerror_report_err(local_err);
-        error_free(local_err);
-        return -1;
-    }
-
-    return 0;
-}
-
 static int object_create(QemuOpts *opts, void *opaque)
 {
-    const char *type = qemu_opt_get(opts, "qom-type");
-    const char *id = qemu_opts_id(opts);
-    Error *local_err = NULL;
-    Object *obj;
-
-    g_assert(type != NULL);
-
-    if (id == NULL) {
-        qerror_report(QERR_MISSING_PARAMETER, "id");
-        return -1;
+    Error *err = NULL;
+    char *type = NULL;
+    char *id = NULL;
+    void *dummy = NULL;
+    OptsVisitor *ov;
+    QDict *pdict;
+
+    ov = opts_visitor_new(opts);
+    pdict = qemu_opts_to_qdict(opts, NULL);
+
+    visit_start_struct(opts_get_visitor(ov), &dummy, NULL, NULL, 0, &err);
+    if (err) {
+        goto out;
     }
 
-    obj = object_new(type);
-    if (qemu_opt_foreach(opts, object_set_property, obj, 1) < 0) {
-        object_unref(obj);
-        return -1;
+    qdict_del(pdict, "qom-type");
+    visit_type_str(opts_get_visitor(ov), &type, "qom-type", &err);
+    if (err) {
+        goto out;
     }
 
-    if (!object_dynamic_cast(obj, TYPE_USER_CREATABLE)) {
-        error_setg(&local_err, "object '%s' isn't supported by -object",
-                   id);
+    qdict_del(pdict, "id");
+    visit_type_str(opts_get_visitor(ov), &id, "id", &err);
+    if (err) {
         goto out;
     }
 
-    user_creatable_complete(obj, &local_err);
-    if (local_err) {
+    object_add(type, id, pdict, opts_get_visitor(ov), &err);
+    if (err) {
         goto out;
     }
-
-    object_property_add_child(container_get(object_get_root(), "/objects"),
-                              id, obj, &local_err);
+    visit_end_struct(opts_get_visitor(ov), &err);
+    if (err) {
+        qmp_object_del(id, NULL);
+    }
 
 out:
-    object_unref(obj);
-    if (local_err) {
-        qerror_report_err(local_err);
-        error_free(local_err);
-        return -1;
+    opts_visitor_cleanup(ov);
+
+    QDECREF(pdict);
+    g_free(id);
+    g_free(type);
+    g_free(dummy);
+    if (err) {
+        qerror_report_err(err);
     }
     return 0;
 }
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 10/28] qmp: allow object-add completion handler to get canonical path
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (8 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 09/28] vl: redo -object parsing Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 11/28] qmp: improve error reporting for -object and object-add Paolo Bonzini
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Igor Mammedov <imammedo@redhat.com>

Add the object to /objects before calling the user_creatable_complete()
handler, so that the object can call object_get_canonical_path() in its
completion handler.  If completion fails, the object is removed from
/objects again.
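
The ordering can be shown with a toy model.  This is only a sketch: the
registry, SketchObject and every sketch_* name are invented for the
example and do not correspond to QOM APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SLOTS 8

typedef struct SketchObject SketchObject;
struct SketchObject {
    const char *id;
    int (*complete)(SketchObject *obj);     /* returns 0 on success */
};

/* Toy stand-in for the /objects container. */
static SketchObject *sketch_registry[SKETCH_SLOTS];

static SketchObject *sketch_lookup(const char *id)
{
    int i;

    for (i = 0; i < SKETCH_SLOTS; i++) {
        if (sketch_registry[i] && !strcmp(sketch_registry[i]->id, id)) {
            return sketch_registry[i];
        }
    }
    return NULL;
}

/* A completion handler that needs to find itself in the registry. */
static int sketch_complete_needs_lookup(SketchObject *obj)
{
    return sketch_lookup(obj->id) == obj ? 0 : -1;
}

/* A completion handler that always fails. */
static int sketch_complete_fails(SketchObject *obj)
{
    (void)obj;
    return -1;
}

/* Register first, then complete; undo the registration on failure. */
static int sketch_object_add(SketchObject *obj)
{
    int slot;

    for (slot = 0; slot < SKETCH_SLOTS; slot++) {
        if (sketch_registry[slot] == NULL) {
            break;
        }
    }
    if (slot == SKETCH_SLOTS) {
        return -1;                          /* registry full */
    }
    sketch_registry[slot] = obj;            /* visible to the handler */
    if (obj->complete(obj) != 0) {
        sketch_registry[slot] = NULL;       /* roll back */
        return -1;
    }
    return 0;
}
```

Had the registration happened after the handler ran, the first handler
above could not have found its own entry.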

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qmp.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/qmp.c b/qmp.c
index d0d98e7..2ff943d 100644
--- a/qmp.c
+++ b/qmp.c
@@ -561,13 +561,15 @@ void object_add(const char *type, const char *id, const QDict *qdict,
         goto out;
     }
 
+    object_property_add_child(container_get(object_get_root(), "/objects"),
+                              id, obj, &local_err);
+
     user_creatable_complete(obj, &local_err);
     if (local_err) {
+        object_property_del(container_get(object_get_root(), "/objects"),
+                            id, &error_abort);
         goto out;
     }
-
-    object_property_add_child(container_get(object_get_root(), "/objects"),
-                              id, obj, &local_err);
 out:
     if (local_err) {
         error_propagate(errp, local_err);
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 11/28] qmp: improve error reporting for -object and object-add
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (9 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 10/28] qmp: allow object-add completion handler to get canonical path Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-07  3:07   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 12/28] pc: pass QEMUMachineInitArgs to pc_memory_init Paolo Bonzini
                   ` (17 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Use QERR_INVALID_PARAMETER_VALUE for consistency, and avoid an assertion
failure if the class name is incorrect.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qmp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/qmp.c b/qmp.c
index 2ff943d..a3b0b73 100644
--- a/qmp.c
+++ b/qmp.c
@@ -541,7 +541,8 @@ void object_add(const char *type, const char *id, const QDict *qdict,
     Error *local_err = NULL;
 
     if (!object_class_by_name(type)) {
-        error_setg(errp, "invalid class name");
+        error_set(errp, QERR_INVALID_PARAMETER_VALUE,
+                  "qom-type", "a valid class name");
         return;
     }
 
@@ -556,8 +557,8 @@ void object_add(const char *type, const char *id, const QDict *qdict,
     }
 
     if (!object_dynamic_cast(obj, TYPE_USER_CREATABLE)) {
-        error_setg(&local_err, "object '%s' isn't supported by object-add",
-                   id);
+        error_setg(&local_err, "class '%s' isn't supported by object-add",
+                   type);
         goto out;
     }
 
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 12/28] pc: pass QEMUMachineInitArgs to pc_memory_init
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (10 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 11/28] qmp: improve error reporting for -object and object-add Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-07  3:09   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 13/28] numa: introduce memory_region_allocate_system_memory Paolo Bonzini
                   ` (16 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/pc.c         | 11 +++++------
 hw/i386/pc_piix.c    |  8 +++-----
 hw/i386/pc_q35.c     |  4 +---
 include/hw/i386/pc.h |  7 +++----
 4 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index a464e48..17d4820 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1145,10 +1145,8 @@ void pc_acpi_init(const char *default_dsdt)
     }
 }
 
-FWCfgState *pc_memory_init(MemoryRegion *system_memory,
-                           const char *kernel_filename,
-                           const char *kernel_cmdline,
-                           const char *initrd_filename,
+FWCfgState *pc_memory_init(QEMUMachineInitArgs *args,
+                           MemoryRegion *system_memory,
                            ram_addr_t below_4g_mem_size,
                            ram_addr_t above_4g_mem_size,
                            MemoryRegion *rom_memory,
@@ -1160,7 +1158,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
     MemoryRegion *ram_below_4g, *ram_above_4g;
     FWCfgState *fw_cfg;
 
-    linux_boot = (kernel_filename != NULL);
+    linux_boot = (args->kernel_filename != NULL);
 
     /* Allocate RAM.  We allocate it as a single memory region and use
      * aliases to address portions of it, mostly for backwards compatibility
@@ -1201,7 +1199,8 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
     rom_set_fw(fw_cfg);
 
     if (linux_boot) {
-        load_linux(fw_cfg, kernel_filename, initrd_filename, kernel_cmdline, below_4g_mem_size);
+        load_linux(fw_cfg, args->kernel_filename, args->initrd_filename,
+                   args->kernel_cmdline, below_4g_mem_size);
     }
 
     for (i = 0; i < nb_option_roms; i++) {
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index d5dc1ef..96adc01 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -151,11 +151,9 @@ static void pc_init1(QEMUMachineInitArgs *args,
 
     /* allocate ram and load rom/bios */
     if (!xen_enabled()) {
-        fw_cfg = pc_memory_init(system_memory,
-                       args->kernel_filename, args->kernel_cmdline,
-                       args->initrd_filename,
-                       below_4g_mem_size, above_4g_mem_size,
-                       rom_memory, &ram_memory, guest_info);
+        fw_cfg = pc_memory_init(args, system_memory,
+                                below_4g_mem_size, above_4g_mem_size,
+                                rom_memory, &ram_memory, guest_info);
     }
 
     gsi_state = g_malloc0(sizeof(*gsi_state));
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index a7f6260..95fa01fc 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -138,9 +138,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 
     /* allocate ram and load rom/bios */
     if (!xen_enabled()) {
-        pc_memory_init(get_system_memory(),
-                       args->kernel_filename, args->kernel_cmdline,
-                       args->initrd_filename,
+        pc_memory_init(args, get_system_memory(),
                        below_4g_mem_size, above_4g_mem_size,
                        rom_memory, &ram_memory, guest_info);
     }
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 9010246..8fc0527 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -3,6 +3,7 @@
 
 #include "qemu-common.h"
 #include "exec/memory.h"
+#include "hw/boards.h"
 #include "hw/isa/isa.h"
 #include "hw/block/fdc.h"
 #include "net/net.h"
@@ -134,10 +135,8 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
 void pc_pci_as_mapping_init(Object *owner, MemoryRegion *system_memory,
                             MemoryRegion *pci_address_space);
 
-FWCfgState *pc_memory_init(MemoryRegion *system_memory,
-                           const char *kernel_filename,
-                           const char *kernel_cmdline,
-                           const char *initrd_filename,
+FWCfgState *pc_memory_init(QEMUMachineInitArgs *args,
+                           MemoryRegion *system_memory,
                            ram_addr_t below_4g_mem_size,
                            ram_addr_t above_4g_mem_size,
                            MemoryRegion *rom_memory,
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 13/28] numa: introduce memory_region_allocate_system_memory
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (11 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 12/28] pc: pass QEMUMachineInitArgs to pc_memory_init Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-07  3:18   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 14/28] add memdev backend infrastructure Paolo Bonzini
                   ` (15 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/pc.c            |  4 +---
 include/hw/boards.h     |  4 ++++
 include/sysemu/sysemu.h |  1 +
 numa.c                  | 11 +++++++++++
 4 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 17d4820..ff078fb 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1165,9 +1165,7 @@ FWCfgState *pc_memory_init(QEMUMachineInitArgs *args,
      * with older qemus that used qemu_ram_alloc().
      */
     ram = g_malloc(sizeof(*ram));
-    memory_region_init_ram(ram, NULL, "pc.ram",
-                           below_4g_mem_size + above_4g_mem_size);
-    vmstate_register_ram_global(ram);
+    memory_region_allocate_system_memory(ram, NULL, "pc.ram", args);
     *ram_memory = ram;
     ram_below_4g = g_malloc(sizeof(*ram_below_4g));
     memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 2151460..8b68878 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -48,6 +48,10 @@ struct QEMUMachine {
     const char *hw_version;
 };
 
+void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,
+                                          const char *name,
+                                          QEMUMachineInitArgs *args);
+
 int qemu_register_machine(QEMUMachine *m);
 QEMUMachine *find_default_machine(void);
 
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 4c94cf5..54a6f28 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -10,6 +10,7 @@
 #include "qemu/notify.h"
 #include "qemu/main-loop.h"
 #include "qemu/bitmap.h"
+#include "qom/object.h"
 
 /* vl.c */
 
diff --git a/numa.c b/numa.c
index 6563232..930f49d 100644
--- a/numa.c
+++ b/numa.c
@@ -31,6 +31,7 @@
 #include "qapi/opts-visitor.h"
 #include "qapi/dealloc-visitor.h"
 #include "qapi/qmp/qerror.h"
+#include "hw/boards.h"
 
 QemuOptsList qemu_numa_opts = {
     .name = "numa",
@@ -191,3 +192,13 @@ void set_numa_modes(void)
         }
     }
 }
+
+void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,
+                                          const char *name,
+                                          QEMUMachineInitArgs *args)
+{
+    uint64_t ram_size = args->ram_size;
+
+    memory_region_init_ram(mr, owner, name, ram_size);
+    vmstate_register_ram_global(mr);
+}
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 14/28] add memdev backend infrastructure
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (12 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 13/28] numa: introduce memory_region_allocate_system_memory Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-07  3:31   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option Paolo Bonzini
                   ` (14 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Igor Mammedov <imammedo@redhat.com>

Provide a framework for splitting host RAM allocation and policies
into a separate backend that can be used by devices.

Initially only the legacy RAM backend is provided.  It uses the
memory_region_init_ram() allocator and is compatible with every
CLI option that affects memory_region_init_ram().
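
The backend life cycle described above can be sketched in plain C,
outside QEMU.  This is only an illustration under stated assumptions:
Backend, backend_set_size(), ram_backend_complete() and the other names
are hypothetical stand-ins, not the QOM types or functions added by this
patch, and calloc() stands in for memory_region_init_ram().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for HostMemoryBackend; not the QEMU type. */
typedef struct Backend Backend;

struct Backend {
    uint64_t size;                  /* requested size, set before completion */
    void *mem;                      /* NULL until complete() has run */
    bool (*complete)(Backend *b);   /* per-type hook, like a "complete" callback */
};

/* "ram" flavour: a heap allocation stands in for the real RAM allocator. */
bool ram_backend_complete(Backend *b)
{
    if (b->size == 0) {
        return false;               /* a zero-sized backend is rejected */
    }
    b->mem = calloc(1, (size_t)b->size);
    return b->mem != NULL;
}

/* Setter: reject zero, and reject any change once the memory exists. */
bool backend_set_size(Backend *b, uint64_t size)
{
    if (b->mem != NULL || size == 0) {
        return false;
    }
    b->size = size;
    return true;
}

/* Consumers see memory only after the backend has been completed. */
void *backend_get_memory(Backend *b)
{
    return b->mem;
}

/* Release the memory, if any was allocated. */
void backend_finalize(Backend *b)
{
    free(b->mem);
    b->mem = NULL;
}
```

The point of the split is that the size is an ordinary property that can
be set freely until completion, after which the backend owns a fixed
block of memory and the property becomes read-only.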

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 backends/Makefile.objs   |   2 +
 backends/hostmem-ram.c   |  52 ++++++++++++++++++++++
 backends/hostmem.c       | 110 +++++++++++++++++++++++++++++++++++++++++++++++
 include/sysemu/hostmem.h |  60 ++++++++++++++++++++++++++
 4 files changed, 224 insertions(+)
 create mode 100644 backends/hostmem-ram.c
 create mode 100644 backends/hostmem.c
 create mode 100644 include/sysemu/hostmem.h

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 42557d5..e6bdc11 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -6,3 +6,5 @@ common-obj-$(CONFIG_BRLAPI) += baum.o
 $(obj)/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS) 
 
 common-obj-$(CONFIG_TPM) += tpm.o
+
+common-obj-y += hostmem.o hostmem-ram.o
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
new file mode 100644
index 0000000..ce06fbe
--- /dev/null
+++ b/backends/hostmem-ram.c
@@ -0,0 +1,52 @@
+/*
+ * QEMU Host Memory Backend
+ *
+ * Copyright (C) 2013 Red Hat Inc
+ *
+ * Authors:
+ *   Igor Mammedov <imammedo@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "sysemu/hostmem.h"
+#include "qom/object_interfaces.h"
+
+#define TYPE_MEMORY_BACKEND_RAM "memory-ram"
+
+
+static void
+ram_backend_memory_init(UserCreatable *uc, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
+
+    if (!backend->size) {
+        error_setg(errp, "can't create backend with size 0");
+        return;
+    }
+
+    memory_region_init_ram(&backend->mr, OBJECT(backend),
+                           object_get_canonical_path(OBJECT(backend)),
+                           backend->size);
+}
+
+static void
+ram_backend_class_init(ObjectClass *oc, void *data)
+{
+    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+
+    ucc->complete = ram_backend_memory_init;
+}
+
+static const TypeInfo ram_backend_info = {
+    .name = TYPE_MEMORY_BACKEND_RAM,
+    .parent = TYPE_MEMORY_BACKEND,
+    .class_init = ram_backend_class_init,
+};
+
+static void register_types(void)
+{
+    type_register_static(&ram_backend_info);
+}
+
+type_init(register_types);
diff --git a/backends/hostmem.c b/backends/hostmem.c
new file mode 100644
index 0000000..06817dd
--- /dev/null
+++ b/backends/hostmem.c
@@ -0,0 +1,110 @@
+/*
+ * QEMU Host Memory Backend
+ *
+ * Copyright (C) 2013 Red Hat Inc
+ *
+ * Authors:
+ *   Igor Mammedov <imammedo@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "sysemu/hostmem.h"
+#include "sysemu/sysemu.h"
+#include "qapi/visitor.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/config-file.h"
+#include "qom/object_interfaces.h"
+
+static void
+host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
+                            const char *name, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    uint64_t value = backend->size;
+
+    visit_type_size(v, &value, name, errp);
+}
+
+static void
+host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
+                            const char *name, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    uint64_t value;
+
+    if (memory_region_size(&backend->mr)) {
+        error_setg(errp, "cannot change property value");
+        return;
+    }
+
+    visit_type_size(v, &value, name, errp);
+    if (error_is_set(errp)) {
+        return;
+    }
+    if (!value) {
+        error_setg(errp, "Property '%s.%s' doesn't take value '%" PRIu64 "'",
+                   object_get_typename(obj), name , value);
+        return;
+    }
+    backend->size = value;
+}
+
+static void host_memory_backend_initfn(Object *obj)
+{
+    object_property_add(obj, "size", "int",
+                        host_memory_backend_get_size,
+                        host_memory_backend_set_size, NULL, NULL, NULL);
+}
+
+static void host_memory_backend_finalize(Object *obj)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    if (memory_region_size(&backend->mr)) {
+        memory_region_destroy(&backend->mr);
+    }
+}
+
+static void
+host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
+{
+    error_setg(errp, "memory_init is not implemented for type [%s]",
+               object_get_typename(OBJECT(uc)));
+}
+
+MemoryRegion *
+host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
+{
+    return memory_region_size(&backend->mr) ? &backend->mr : NULL;
+}
+
+static void
+host_memory_backend_class_init(ObjectClass *oc, void *data)
+{
+    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+
+    ucc->complete = host_memory_backend_memory_init;
+}
+
+static const TypeInfo host_memory_backend_info = {
+    .name = TYPE_MEMORY_BACKEND,
+    .parent = TYPE_OBJECT,
+    .abstract = true,
+    .class_size = sizeof(HostMemoryBackendClass),
+    .class_init = host_memory_backend_class_init,
+    .instance_size = sizeof(HostMemoryBackend),
+    .instance_init = host_memory_backend_initfn,
+    .instance_finalize = host_memory_backend_finalize,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_USER_CREATABLE },
+        { }
+    }
+};
+
+static void register_types(void)
+{
+    type_register_static(&host_memory_backend_info);
+}
+
+type_init(register_types);
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
new file mode 100644
index 0000000..bc3ffb3
--- /dev/null
+++ b/include/sysemu/hostmem.h
@@ -0,0 +1,60 @@
+/*
+ * QEMU Host Memory Backend
+ *
+ * Copyright (C) 2013 Red Hat Inc
+ *
+ * Authors:
+ *   Igor Mammedov <imammedo@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_RAM_H
+#define QEMU_RAM_H
+
+#include "qom/object.h"
+#include "qapi/error.h"
+#include "exec/memory.h"
+#include "qemu/option.h"
+
+#define TYPE_MEMORY_BACKEND "memory"
+#define MEMORY_BACKEND(obj) \
+    OBJECT_CHECK(HostMemoryBackend, (obj), TYPE_MEMORY_BACKEND)
+#define MEMORY_BACKEND_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(HostMemoryBackendClass, (obj), TYPE_MEMORY_BACKEND)
+#define MEMORY_BACKEND_CLASS(klass) \
+    OBJECT_CLASS_CHECK(HostMemoryBackendClass, (klass), TYPE_MEMORY_BACKEND)
+
+typedef struct HostMemoryBackend HostMemoryBackend;
+typedef struct HostMemoryBackendClass HostMemoryBackendClass;
+
+/**
+ * HostMemoryBackendClass:
+ * @parent_class: opaque parent class container
+ */
+struct HostMemoryBackendClass {
+    ObjectClass parent_class;
+};
+
+/**
+ * @HostMemoryBackend
+ *
+ * @parent: opaque parent object container
+ * @size: amount of memory backend provides
+ * @id: unique identification string in memdev namespace
+ * @mr: MemoryRegion representing host memory belonging to backend
+ */
+struct HostMemoryBackend {
+    /* private */
+    Object parent;
+
+    /* protected */
+    uint64_t size;
+
+    MemoryRegion mr;
+};
+
+MemoryRegion *host_memory_backend_get_memory(HostMemoryBackend *backend,
+                                             Error **errp);
+
+#endif
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (13 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 14/28] add memdev backend infrastructure Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 17:52   ` Eric Blake
  2014-03-07  5:33   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation Paolo Bonzini
                   ` (13 subsequent siblings)
  28 siblings, 2 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

This option provides the infrastructure for binding guest NUMA nodes
to host NUMA nodes.  For example:

 -object memory-ram,size=1024M,policy=membind,host-nodes=0,id=ram-node0 \
 -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
 -object memory-ram,size=1024M,policy=interleave,host-nodes=1-3,id=ram-node1 \
 -numa node,nodeid=1,cpus=1,memdev=ram-node1

The option replaces "-numa node,mem=".
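
The two rules this patch enforces (mem= and memdev= are mutually
exclusive, and memdev= must be used by all nodes or by none) and the
back-to-back placement of the per-node regions can be sketched as
follows.  This is a simplified model, not the code in numa.c: NodeSpec,
validate_nodes() and layout_nodes() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node description; not QEMU's NodeInfo. */
typedef struct {
    bool has_mem;       /* node was given as mem=<size> */
    bool has_memdev;    /* node was given as memdev=<id> */
    uint64_t size;      /* size of the node's memory in bytes */
    uint64_t base;      /* guest address, filled in by layout_nodes() */
} NodeSpec;

/*
 * Returns false if a node uses both mem= and memdev=, or if memdev= is
 * used by some nodes but not by all of them.
 */
bool validate_nodes(const NodeSpec *nodes, size_t n)
{
    int have_memdevs = -1;      /* -1: undecided, 0: no node, 1: every node */

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].has_mem && nodes[i].has_memdev) {
            return false;
        }
        if (have_memdevs == -1) {
            have_memdevs = nodes[i].has_memdev;
        }
        if (nodes[i].has_memdev != have_memdevs) {
            return false;
        }
    }
    return true;
}

/* Places each node's memory directly after the previous one; returns the total. */
uint64_t layout_nodes(NodeSpec *nodes, size_t n)
{
    uint64_t addr = 0;

    for (size_t i = 0; i < n; i++) {
        nodes[i].base = addr;
        addr += nodes[i].size;
    }
    return addr;
}
```

With the two 1024M backends from the example command line, node 0 would
start at address 0 and node 1 at 1024M in this model.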

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/sysemu/sysemu.h |  1 +
 numa.c                  | 63 +++++++++++++++++++++++++++++++++++++++++++++++--
 qapi-schema.json        |  8 ++++++-
 qemu-options.hx         | 12 ++++++----
 4 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 54a6f28..4870129 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -139,6 +139,7 @@ extern int nb_numa_nodes;
 typedef struct node_info {
     uint64_t node_mem;
     DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+    struct HostMemoryBackend *node_memdev;
 } NodeInfo;
 extern NodeInfo numa_info[MAX_NODES];
 void set_numa_nodes(void);
diff --git a/numa.c b/numa.c
index 930f49d..b00ef90 100644
--- a/numa.c
+++ b/numa.c
@@ -32,6 +32,7 @@
 #include "qapi/dealloc-visitor.h"
 #include "qapi/qmp/qerror.h"
 #include "hw/boards.h"
+#include "sysemu/hostmem.h"
 
 QemuOptsList qemu_numa_opts = {
     .name = "numa",
@@ -40,6 +41,8 @@ QemuOptsList qemu_numa_opts = {
     .desc = { { 0 } } /* validated with OptsVisitor */
 };
 
+static int have_memdevs = -1;
+
 static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
 {
     uint16_t nodenr;
@@ -66,6 +69,20 @@ static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
         bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
     }
 
+    if (node->has_mem && node->has_memdev) {
+        error_setg(errp, "qemu: cannot specify both mem= and memdev=");
+        return;
+    }
+
+    if (have_memdevs == -1) {
+        have_memdevs = node->has_memdev;
+    }
+    if (node->has_memdev != have_memdevs) {
+        error_setg(errp, "qemu: memdev option must be specified for either "
+                   "all or no nodes");
+        return;
+    }
+
     if (node->has_mem) {
         uint64_t mem_size = node->mem;
         const char *mem_str = qemu_opt_get(opts, "mem");
@@ -75,6 +92,18 @@ static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
         }
         numa_info[nodenr].node_mem = mem_size;
     }
+    if (node->has_memdev) {
+        Object *o;
+        o = object_resolve_path_type(node->memdev, TYPE_MEMORY_BACKEND, NULL);
+        if (!o) {
+            error_setg(errp, "memdev=%s is ambiguous", node->memdev);
+            return;
+        }
+
+        object_ref(o);
+        numa_info[nodenr].node_mem = object_property_get_int(o, "size", NULL);
+        numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
+    }
 }
 
 int numa_init_func(QemuOpts *opts, void *opaque)
@@ -193,12 +222,42 @@ void set_numa_modes(void)
     }
 }
 
+static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
+                                           const char *name,
+                                           QEMUMachineInitArgs *args)
+{
+    uint64_t ram_size = args->ram_size;
+
+    memory_region_init_ram(mr, owner, name, ram_size);
+    vmstate_register_ram_global(mr);
+}
+
 void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,
                                           const char *name,
                                           QEMUMachineInitArgs *args)
 {
     uint64_t ram_size = args->ram_size;
+    uint64_t addr = 0;
+    int i;
 
-    memory_region_init_ram(mr, owner, name, ram_size);
-    vmstate_register_ram_global(mr);
+    if (nb_numa_nodes == 0 || !have_memdevs) {
+        allocate_system_memory_nonnuma(mr, owner, name, args);
+        return;
+    }
+
+    memory_region_init(mr, owner, name, ram_size);
+    for (i = 0; i < nb_numa_nodes; i++) {
+        Error *local_err = NULL;
+        uint64_t size = numa_info[i].node_mem;
+        HostMemoryBackend *backend = numa_info[i].node_memdev;
+        MemoryRegion *seg = host_memory_backend_get_memory(backend, &local_err);
+        if (local_err) {
+            qerror_report_err(local_err);
+            exit(1);
+        }
+
+        memory_region_add_subregion(mr, addr, seg);
+        vmstate_register_ram_global(seg);
+        addr += size;
+    }
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index ca8c11e..8bd84da 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4543,10 +4543,16 @@
 # @mem: #optional memory size of this node (equally divide total memory among
 #        nodes if omitted)
 #
+# @memdev: #optional memory backend object.  If specified for one node,
+#          it must be specified for all nodes.
+#
+# @mem: #optional memory size of this node; mutually exclusive with @memdev.
+#
 # Since: 2.1
 ##
 { 'type': 'NumaNodeOptions',
   'data': {
    '*nodeid': 'uint16',
    '*cpus':   ['uint16'],
-   '*mem':    'size' }}
+   '*mem':    'size',
+   '*memdev': 'str' }}
diff --git a/qemu-options.hx b/qemu-options.hx
index 98e78ca..f3bf291 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -95,16 +95,20 @@ specifies the maximum number of hotpluggable CPUs.
 ETEXI
 
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
-    "-numa node[,mem=size][,cpus=cpu[-cpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
+    "-numa node[,mem=size][,memdev=id][,cpus=cpu[-cpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
 STEXI
-@item -numa node[,mem=@var{size}][,cpus=@var{cpu[-cpu]}][,nodeid=@var{node}]
+@item -numa node[,mem=@var{size}][,memdev=@var{id}][,cpus=@var{cpu[-cpu]}][,nodeid=@var{node}]
 @findex -numa
-Simulate a multi node NUMA system. If @samp{mem}
+Simulate a multi node NUMA system. If @samp{mem}, @samp{memdev}
 and @samp{cpus} are omitted, resources are split equally. Also, note
 that the -@option{numa} option doesn't allocate any of the specified
 resources. That is, it just assigns existing resources to NUMA nodes. This
 means that one still has to use the @option{-m}, @option{-smp} options
-to respectively allocate RAM and vCPUs.
+to respectively allocate RAM and vCPUs, and possibly @option{-object}
+to specify the memory backend for the @samp{memdev} suboption.
+
+@samp{mem} and @samp{memdev} are mutually exclusive.  Furthermore, if one
+node uses @samp{memdev}, all of them have to use it.
 ETEXI
 
 DEF("add-fd", HAS_ARG, QEMU_OPTION_add_fd,
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (14 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-07  6:09   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory Paolo Bonzini
                   ` (12 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Split the internal interface in exec.c to a separate function, and
push the check on mem_path up to memory_region_init_ram.
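
The shape of the split can be sketched with a toy allocator: each front
end fills in a block descriptor, and a common function allocates host
memory only when the caller has not supplied any.  This is a simplified
model, not the code in exec.c; Block, block_add() and the other names
are hypothetical, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u      /* assumed page size for the sketch */

/* Hypothetical block descriptor; not QEMU's RAMBlock. */
typedef struct {
    size_t length;      /* page-aligned length in bytes */
    void *host;         /* host memory backing the block */
    bool caller_owned;  /* true if the caller supplied host */
} Block;

/* Rounds size up to a multiple of the page size. */
size_t page_align(size_t size)
{
    return (size + SKETCH_PAGE_SIZE - 1) & ~(size_t)(SKETCH_PAGE_SIZE - 1);
}

/* Common tail: allocate only if no front end provided host memory. */
Block *block_add(Block *b)
{
    if (!b->host) {
        b->host = calloc(1, b->length);
        if (!b->host) {
            free(b);
            return NULL;
        }
    }
    return b;
}

/* Front end: wrap caller-provided memory, or allocate if host is NULL. */
Block *block_alloc_from_ptr(size_t size, void *host)
{
    Block *b = calloc(1, sizeof(*b));

    if (!b) {
        return NULL;
    }
    b->length = page_align(size);
    b->host = host;
    b->caller_owned = (host != NULL);
    return block_add(b);
}

/* Front end: plain allocation. */
Block *block_alloc(size_t size)
{
    return block_alloc_from_ptr(size, NULL);
}

/* Frees the descriptor, and the memory unless the caller owns it. */
void block_free(Block *b)
{
    if (!b) {
        return;
    }
    if (!b->caller_owned) {
        free(b->host);
    }
    free(b);
}
```

A file-backed front end would slot in the same way: it would set host to
a mapping of the file before handing the descriptor to the common tail.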

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                  | 105 +++++++++++++++++++++++++++++-------------------
 include/exec/cpu-all.h  |   3 --
 include/exec/ram_addr.h |   2 +
 include/sysemu/sysemu.h |   2 +
 memory.c                |   7 +++-
 5 files changed, 73 insertions(+), 46 deletions(-)

diff --git a/exec.c b/exec.c
index b69fd29..0aa4947 100644
--- a/exec.c
+++ b/exec.c
@@ -1240,56 +1240,30 @@ static int memory_try_enable_merging(void *addr, size_t len)
     return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
 }
 
-ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
-                                   MemoryRegion *mr)
+static ram_addr_t ram_block_add(RAMBlock *new_block)
 {
-    RAMBlock *block, *new_block;
+    RAMBlock *block;
     ram_addr_t old_ram_size, new_ram_size;
 
     old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
 
-    size = TARGET_PAGE_ALIGN(size);
-    new_block = g_malloc0(sizeof(*new_block));
-    new_block->fd = -1;
-
     /* This assumes the iothread lock is taken here too.  */
     qemu_mutex_lock_ramlist();
-    new_block->mr = mr;
-    new_block->offset = find_ram_offset(size);
-    if (host) {
-        new_block->host = host;
-        new_block->flags |= RAM_PREALLOC_MASK;
-    } else if (xen_enabled()) {
-        if (mem_path) {
-            fprintf(stderr, "-mem-path not supported with Xen\n");
-            exit(1);
-        }
-        xen_ram_alloc(new_block->offset, size, mr);
-    } else {
-        if (mem_path) {
-            if (phys_mem_alloc != qemu_anon_ram_alloc) {
-                /*
-                 * file_ram_alloc() needs to allocate just like
-                 * phys_mem_alloc, but we haven't bothered to provide
-                 * a hook there.
-                 */
-                fprintf(stderr,
-                        "-mem-path not supported with this accelerator\n");
-                exit(1);
-            }
-            new_block->host = file_ram_alloc(new_block, size, mem_path);
-        }
-        if (!new_block->host) {
-            new_block->host = phys_mem_alloc(size);
+    new_block->offset = find_ram_offset(new_block->length);
+
+    if (!new_block->host) {
+        if (xen_enabled()) {
+            xen_ram_alloc(new_block->offset, new_block->length, new_block->mr);
+        } else {
+            new_block->host = phys_mem_alloc(new_block->length);
             if (!new_block->host) {
                 fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
                         new_block->mr->name, strerror(errno));
                 exit(1);
             }
-            memory_try_enable_merging(new_block->host, size);
+            memory_try_enable_merging(new_block->host, new_block->length);
         }
     }
-    new_block->length = size;
 
     /* Keep the list sorted from biggest to smallest block.  */
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
@@ -1317,18 +1291,65 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    old_ram_size, new_ram_size);
        }
     }
-    cpu_physical_memory_set_dirty_range(new_block->offset, size);
+    cpu_physical_memory_set_dirty_range(new_block->offset, new_block->length);
 
-    qemu_ram_setup_dump(new_block->host, size);
-    qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
-    qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
+    qemu_ram_setup_dump(new_block->host, new_block->length);
+    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_HUGEPAGE);
+    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_DONTFORK);
 
-    if (kvm_enabled())
-        kvm_setup_guest_memory(new_block->host, size);
+    if (kvm_enabled()) {
+        kvm_setup_guest_memory(new_block->host, new_block->length);
+    }
 
     return new_block->offset;
 }
 
+ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
+                                    const char *mem_path)
+{
+    RAMBlock *new_block;
+
+    if (xen_enabled()) {
+        fprintf(stderr, "-mem-path not supported with Xen\n");
+        exit(1);
+    }
+
+    if (phys_mem_alloc != qemu_anon_ram_alloc) {
+        /*
+         * file_ram_alloc() needs to allocate just like
+         * phys_mem_alloc, but we haven't bothered to provide
+         * a hook there.
+         */
+        fprintf(stderr,
+                "-mem-path not supported with this accelerator\n");
+        exit(1);
+    }
+
+    size = TARGET_PAGE_ALIGN(size);
+    new_block = g_malloc0(sizeof(*new_block));
+    new_block->mr = mr;
+    new_block->length = size;
+    new_block->host = file_ram_alloc(new_block, size, mem_path);
+    return ram_block_add(new_block);
+}
+
+ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
+                                   MemoryRegion *mr)
+{
+    RAMBlock *new_block;
+
+    size = TARGET_PAGE_ALIGN(size);
+    new_block = g_malloc0(sizeof(*new_block));
+    new_block->mr = mr;
+    new_block->length = size;
+    new_block->fd = -1;
+    new_block->host = host;
+    if (host) {
+        new_block->flags |= RAM_PREALLOC_MASK;
+    }
+    return ram_block_add(new_block);
+}
+
 ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
 {
     return qemu_ram_alloc_from_ptr(size, NULL, mr);
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index e66ab5b..b44babb 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -466,9 +466,6 @@ typedef struct RAMList {
 } RAMList;
 extern RAMList ram_list;
 
-extern const char *mem_path;
-extern int mem_prealloc;
-
 /* Flags stored in the low bits of the TLB virtual address.  These are
    defined so that fast path ram access is all zeros.  */
 /* Zero if TLB entry is valid.  */
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 2edfa96..dedb258 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -22,6 +22,8 @@
 #ifndef CONFIG_USER_ONLY
 #include "hw/xen/xen.h"
 
+ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
+                                    const char *mem_path);
 ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr);
 ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 4870129..03f5ee5 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -132,6 +132,8 @@ extern uint8_t *boot_splash_filedata;
 extern size_t boot_splash_filedata_size;
 extern uint8_t qemu_extra_params_fw[2];
 extern QEMUClockType rtc_clock;
+extern const char *mem_path;
+extern int mem_prealloc;
 
 #define MAX_NODES 128
 #define MAX_CPUMASK_BITS 255
diff --git a/memory.c b/memory.c
index 59ecc28..32b17a8 100644
--- a/memory.c
+++ b/memory.c
@@ -23,6 +23,7 @@
 
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
+#include "sysemu/sysemu.h"
 
 //#define DEBUG_UNASSIGNED
 
@@ -1016,7 +1017,11 @@ void memory_region_init_ram(MemoryRegion *mr,
     mr->ram = true;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
-    mr->ram_addr = qemu_ram_alloc(size, mr);
+    if (mem_path) {
+        mr->ram_addr = qemu_ram_alloc_from_file(size, mr, mem_path);
+    } else {
+        mr->ram_addr = qemu_ram_alloc(size, mr);
+    }
 }
 
 void memory_region_init_ram_ptr(MemoryRegion *mr,
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (15 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-11  3:50   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 18/28] memory: add error propagation to file-based RAM allocation Paolo Bonzini
                   ` (11 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

As the previous patch did in exec.c, split memory_region_init_ram into
memory_region_init_ram and memory_region_init_ram_from_file, and push
the mem_path check one step further up.  RAM regions other than system
memory will now be backed by regular RAM.

Also, boards that do not use memory_region_allocate_system_memory will
not support -mem-path anymore.  This can be changed before the patches
are merged by migrating boards to use the function.
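
The effect of making this choice at the point where system memory is
allocated can be sketched with plain POSIX calls.  This is a rough
illustration, not the QEMU implementation: alloc_system_ram() and the
temporary file name are hypothetical.  Unlike the real code, the sketch
closes the file descriptor after mapping and reports failure by
returning NULL instead of exiting.

```c
#define _GNU_SOURCE                 /* for mkstemp() and MAP_ANONYMOUS */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns a private mapping of size bytes, backed
 * by an unlinked temporary file in mem_path if mem_path is non-NULL and
 * by anonymous memory otherwise.  Returns NULL on failure.
 */
void *alloc_system_ram(size_t size, const char *mem_path)
{
    void *p;

    if (mem_path) {
        char name[4096];
        int n = snprintf(name, sizeof(name), "%s/ram-sketch.XXXXXX", mem_path);
        int fd;

        if (n < 0 || (size_t)n >= sizeof(name)) {
            return NULL;
        }
        fd = mkstemp(name);
        if (fd < 0) {
            return NULL;
        }
        unlink(name);               /* the file lives only as long as the mapping */
        if (ftruncate(fd, (off_t)size) < 0) {
            close(fd);
            return NULL;
        }
        p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        close(fd);                  /* the mapping stays valid after close */
    } else {
        p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    return p == MAP_FAILED ? NULL : p;
}
```

Only the caller that allocates system memory passes a path here; every
other RAM region would call the helper with NULL and get anonymous
memory.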

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                | 10 ++--------
 include/exec/memory.h | 18 ++++++++++++++++++
 memory.c              | 21 ++++++++++++++++-----
 numa.c                | 11 ++++++++++-
 4 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/exec.c b/exec.c
index 0aa4947..4f05584 100644
--- a/exec.c
+++ b/exec.c
@@ -1123,14 +1123,6 @@ static void *file_ram_alloc(RAMBlock *block,
     block->fd = fd;
     return area;
 }
-#else
-static void *file_ram_alloc(RAMBlock *block,
-                            ram_addr_t memory,
-                            const char *path)
-{
-    fprintf(stderr, "-mem-path not supported on this host\n");
-    exit(1);
-}
 #endif
 
 static ram_addr_t find_ram_offset(ram_addr_t size)
@@ -1304,6 +1296,7 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
     return new_block->offset;
 }
 
+#ifdef __linux__
 ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
                                     const char *mem_path)
 {
@@ -1332,6 +1325,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
     new_block->host = file_ram_alloc(new_block, size, mem_path);
     return ram_block_add(new_block);
 }
+#endif
 
 ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9101fc3..54bdb4d 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -311,6 +311,24 @@ void memory_region_init_ram(MemoryRegion *mr,
                             const char *name,
                             uint64_t size);
 
+#ifdef __linux__
+/**
+ * memory_region_init_ram_from_file:  Initialize RAM memory region with a
+ *                                    mmap-ed backend.
+ *
+ * @mr: the #MemoryRegion to be initialized.
+ * @owner: the object that tracks the region's reference count
+ * @name: the name of the region.
+ * @size: size of the region.
+ * @path: the path in which to allocate the RAM.
+ */
+void memory_region_init_ram_from_file(MemoryRegion *mr,
+                                      struct Object *owner,
+                                      const char *name,
+                                      uint64_t size,
+                                      const char *path);
+#endif
+
 /**
  * memory_region_init_ram_ptr:  Initialize RAM memory region from a
  *                              user-provided pointer.  Accesses into the
diff --git a/memory.c b/memory.c
index 32b17a8..1636351 100644
--- a/memory.c
+++ b/memory.c
@@ -1017,13 +1017,24 @@ void memory_region_init_ram(MemoryRegion *mr,
     mr->ram = true;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
-    if (mem_path) {
-        mr->ram_addr = qemu_ram_alloc_from_file(size, mr, mem_path);
-    } else {
-        mr->ram_addr = qemu_ram_alloc(size, mr);
-    }
+    mr->ram_addr = qemu_ram_alloc(size, mr);
 }
 
+#ifdef __linux__
+void memory_region_init_ram_from_file(MemoryRegion *mr,
+                                      struct Object *owner,
+                                      const char *name,
+                                      uint64_t size,
+                                      const char *path)
+{
+    memory_region_init(mr, owner, name, size);
+    mr->ram = true;
+    mr->terminates = true;
+    mr->destructor = memory_region_destructor_ram;
+    mr->ram_addr = qemu_ram_alloc_from_file(size, mr, path);
+}
+#endif
+
 void memory_region_init_ram_ptr(MemoryRegion *mr,
                                 Object *owner,
                                 const char *name,
diff --git a/numa.c b/numa.c
index b00ef90..1afa017 100644
--- a/numa.c
+++ b/numa.c
@@ -228,7 +228,16 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
 {
     uint64_t ram_size = args->ram_size;
 
-    memory_region_init_ram(mr, owner, name, ram_size);
+    if (mem_path) {
+#ifdef __linux__
+        memory_region_init_ram_from_file(mr, owner, name, ram_size, mem_path);
+#else
+        fprintf(stderr, "-mem-path not supported on this host\n");
+        exit(1);
+#endif
+    } else {
+        memory_region_init_ram(mr, owner, name, ram_size);
+    }
     vmstate_register_ram_global(mr);
 }
 
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 18/28] memory: add error propagation to file-based RAM allocation
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (16 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 19/28] memory: move preallocation code out of exec.c Paolo Bonzini
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Right now, -mem-path will fall back to RAM-based allocation in some
cases.  This should never happen with "-object memory-file"; prepare
the code by adding correct error propagation.
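
As a rough illustration of the pattern (not the QEMU Error API itself):
the callee stops printing and exiting, and instead hands the failure
back through an out-parameter plus a NULL return, so each caller can
choose its own policy.  ToyError and the toy_* functions are
hypothetical stand-ins invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Store a message in *errp when the caller passed somewhere to put it. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    if (!errp) {
        return;                 /* caller does not want the details */
    }
    assert(*errp == NULL);      /* never overwrite a pending error */
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* Callee: reports failure through errp and a NULL return, never exit(1). */
static void *toy_file_alloc(size_t size, const char *path, ToyError **errp)
{
    if (!path || !*path) {
        toy_error_set(errp, "no backing path given");
        return NULL;
    }
    /* Pretend the mapping worked; a real version would mmap a file. */
    return calloc(1, size);
}

/* Legacy-style caller: report the error, then fall back to plain memory. */
static void *toy_alloc_legacy(size_t size, const char *path, int *fell_back)
{
    ToyError *err = NULL;
    void *p = toy_file_alloc(size, path, &err);

    *fell_back = 0;
    if (!p) {
        fprintf(stderr, "%s\n", err ? err->msg : "allocation failed");
        free(err);
        p = calloc(1, size);
        *fell_back = 1;
    }
    return p;
}
```

A strict caller, which is what the new backend object is meant to be,
would pass its own errp straight through and fail instead of falling
back.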

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                  | 32 ++++++++++++++++++++------------
 include/exec/memory.h   |  5 ++++-
 include/exec/ram_addr.h |  2 +-
 memory.c                |  5 +++--
 numa.c                  | 13 ++++++++++++-
 5 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/exec.c b/exec.c
index 4f05584..ce1f3b1 100644
--- a/exec.c
+++ b/exec.c
@@ -1020,7 +1020,8 @@ static void sigbus_handler(int signal)
 
 static void *file_ram_alloc(RAMBlock *block,
                             ram_addr_t memory,
-                            const char *path)
+                            const char *path,
+                            Error **errp)
 {
     char *filename;
     char *sanitized_name;
@@ -1039,7 +1040,7 @@ static void *file_ram_alloc(RAMBlock *block,
     }
 
     if (kvm_enabled() && !kvm_has_sync_mmu()) {
-        fprintf(stderr, "host lacks kvm mmu notifiers, -mem-path unsupported\n");
+        error_setg(errp, "host lacks kvm mmu notifiers, -mem-path unsupported");
         return NULL;
     }
 
@@ -1056,7 +1057,7 @@ static void *file_ram_alloc(RAMBlock *block,
 
     fd = mkstemp(filename);
     if (fd < 0) {
-        perror("unable to create backing store for hugepages");
+        error_setg_errno(errp, errno, "unable to create backing store for hugepages");
         g_free(filename);
         return NULL;
     }
@@ -1076,9 +1077,9 @@ static void *file_ram_alloc(RAMBlock *block,
 
     area = mmap(0, memory, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
     if (area == MAP_FAILED) {
-        perror("file_ram_alloc: can't mmap RAM pages");
+        error_setg_errno(errp, errno, "unable to map backing store for hugepages");
         close(fd);
-        return (NULL);
+        return NULL;
     }
 
     if (mem_prealloc) {
@@ -1298,13 +1299,14 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
 
 #ifdef __linux__
 ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-                                    const char *mem_path)
+                                    const char *mem_path,
+                                    Error **errp)
 {
     RAMBlock *new_block;
 
     if (xen_enabled()) {
-        fprintf(stderr, "-mem-path not supported with Xen\n");
-        exit(1);
+        error_setg(errp, "-mem-path not supported with Xen");
+        return -1;
     }
 
     if (phys_mem_alloc != qemu_anon_ram_alloc) {
@@ -1313,16 +1315,22 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
          * phys_mem_alloc, but we haven't bothered to provide
          * a hook there.
          */
-        fprintf(stderr,
-                "-mem-path not supported with this accelerator\n");
-        exit(1);
+        error_setg(errp,
+                   "-mem-path not supported with this accelerator");
+        return -1;
     }
 
     size = TARGET_PAGE_ALIGN(size);
     new_block = g_malloc0(sizeof(*new_block));
     new_block->mr = mr;
     new_block->length = size;
-    new_block->host = file_ram_alloc(new_block, size, mem_path);
+    new_block->host = file_ram_alloc(new_block, size,
+                                     mem_path, errp);
+    if (!new_block->host) {
+        g_free(new_block);
+        return -1;
+    }
+
     return ram_block_add(new_block);
 }
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 54bdb4d..e2c603c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -31,6 +31,7 @@
 #include "qemu/queue.h"
 #include "qemu/int128.h"
 #include "qemu/notify.h"
+#include "qapi/error.h"
 
 #define MAX_PHYS_ADDR_SPACE_BITS 62
 #define MAX_PHYS_ADDR            (((hwaddr)1 << MAX_PHYS_ADDR_SPACE_BITS) - 1)
@@ -321,12 +322,14 @@ void memory_region_init_ram(MemoryRegion *mr,
  * @name: the name of the region.
  * @size: size of the region.
  * @path: the path in which to allocate the RAM.
+ * @errp: pointer to Error*, to store an error if it happens.
  */
 void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       struct Object *owner,
                                       const char *name,
                                       uint64_t size,
-                                      const char *path);
+                                      const char *path,
+                                      Error **errp);
 #endif
 
 /**
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index dedb258..f9518a6 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -23,7 +23,7 @@
 #include "hw/xen/xen.h"
 
 ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-                                    const char *mem_path);
+                                    const char *mem_path, Error **errp);
 ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr);
 ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
diff --git a/memory.c b/memory.c
index 1636351..b27bcda 100644
--- a/memory.c
+++ b/memory.c
@@ -1025,13 +1025,14 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       struct Object *owner,
                                       const char *name,
                                       uint64_t size,
-                                      const char *path)
+                                      const char *path,
+                                      Error **errp)
 {
     memory_region_init(mr, owner, name, size);
     mr->ram = true;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
-    mr->ram_addr = qemu_ram_alloc_from_file(size, mr, path);
+    mr->ram_addr = qemu_ram_alloc_from_file(size, mr, path, errp);
 }
 #endif
 
diff --git a/numa.c b/numa.c
index 1afa017..bf22848 100644
--- a/numa.c
+++ b/numa.c
@@ -230,7 +230,18 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
 
     if (mem_path) {
 #ifdef __linux__
-        memory_region_init_ram_from_file(mr, owner, name, ram_size, mem_path);
+        Error *err = NULL;
+        memory_region_init_ram_from_file(mr, owner, name, ram_size,
+                                         mem_path, &err);
+
+        /* Legacy behavior: if allocation failed, fall back to
+         * regular RAM allocation.
+         */
+        if (!memory_region_size(mr)) {
+            qerror_report_err(err);
+            error_free(err);
+            memory_region_init_ram(mr, owner, name, ram_size);
+        }
 #else
         fprintf(stderr, "-mem-path not supported on this host\n");
         exit(1);
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 19/28] memory: move preallocation code out of exec.c
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (17 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 18/28] memory: add error propagation to file-based RAM allocation Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 20/28] memory: move RAM_PREALLOC_MASK to exec.c, rename Paolo Bonzini
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

So that backends can use it.
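
The core of the moved code is the page-touching loop.  A simplified
sketch of just that part follows; it leaves out the SIGBUS handling and
the hugetlbfs page-size lookup, and the toy_* names are made up for
illustration.  It assumes the page size is a power of two, which is
what makes the "& -hpagesize" style of rounding in the patch valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round sz up to a multiple of pagesize (must be a power of two). */
static size_t toy_round_up(size_t sz, size_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return (sz + pagesize - 1) & ~(pagesize - 1);
}

/*
 * Write one byte at the start of every page so the kernel has to back
 * the page now rather than at first guest access.  The area must span
 * the rounded-up size.  Returns the number of pages touched.
 */
static size_t toy_touch_pages(char *area, size_t sz, size_t pagesize)
{
    size_t pages = toy_round_up(sz, pagesize) / pagesize;

    for (size_t i = 0; i < pages; i++) {
        memset(area + pagesize * i, 0, 1);
    }
    return pages;
}
```

Rounding up first means a region whose size is not a multiple of the
page size still has its last, partial page touched.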

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c               | 44 +------------------------------
 include/qemu/osdep.h |  2 ++
 util/oslib-posix.c   | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 43 deletions(-)

diff --git a/exec.c b/exec.c
index ce1f3b1..7f05f14 100644
--- a/exec.c
+++ b/exec.c
@@ -1011,13 +1011,6 @@ static long gethugepagesize(const char *path)
     return fs.f_bsize;
 }
 
-static sigjmp_buf sigjump;
-
-static void sigbus_handler(int signal)
-{
-    siglongjmp(sigjump, 1);
-}
-
 static void *file_ram_alloc(RAMBlock *block,
                             ram_addr_t memory,
                             const char *path,
@@ -1083,42 +1076,7 @@ static void *file_ram_alloc(RAMBlock *block,
     }
 
     if (mem_prealloc) {
-        int ret, i;
-        struct sigaction act, oldact;
-        sigset_t set, oldset;
-
-        memset(&act, 0, sizeof(act));
-        act.sa_handler = &sigbus_handler;
-        act.sa_flags = 0;
-
-        ret = sigaction(SIGBUS, &act, &oldact);
-        if (ret) {
-            perror("file_ram_alloc: failed to install signal handler");
-            exit(1);
-        }
-
-        /* unblock SIGBUS */
-        sigemptyset(&set);
-        sigaddset(&set, SIGBUS);
-        pthread_sigmask(SIG_UNBLOCK, &set, &oldset);
-
-        if (sigsetjmp(sigjump, 1)) {
-            fprintf(stderr, "file_ram_alloc: failed to preallocate pages\n");
-            exit(1);
-        }
-
-        /* MAP_POPULATE silently ignores failures */
-        for (i = 0; i < (memory/hpagesize); i++) {
-            memset(area + (hpagesize*i), 0, 1);
-        }
-
-        ret = sigaction(SIGBUS, &oldact, NULL);
-        if (ret) {
-            perror("file_ram_alloc: failed to reinstall signal handler");
-            exit(1);
-        }
-
-        pthread_sigmask(SIG_SETMASK, &oldset, NULL);
+        os_mem_prealloc(fd, area, memory);
     }
 
     block->fd = fd;
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index ffb2966..9c1a119 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -251,4 +251,6 @@ void qemu_init_auxval(char **envp);
 
 void qemu_set_tty_echo(int fd, bool echo);
 
+void os_mem_prealloc(int fd, char *area, size_t sz);
+
 #endif
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index c2eeb4f..7af05e4 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -46,6 +46,7 @@ extern int daemon(int, int);
 #else
 #  define QEMU_VMALLOC_ALIGN getpagesize()
 #endif
+#define HUGETLBFS_MAGIC       0x958458f6
 
 #include <termios.h>
 #include <unistd.h>
@@ -58,9 +59,12 @@ extern int daemon(int, int);
 #include "qemu/sockets.h"
 #include <sys/mman.h>
 #include <libgen.h>
+#include <setjmp.h>
+#include <sys/signal.h>
 
 #ifdef CONFIG_LINUX
 #include <sys/syscall.h>
+#include <sys/vfs.h>
 #endif
 
 int qemu_get_thread_id(void)
@@ -328,3 +332,72 @@ char *qemu_get_exec_dir(void)
 {
     return g_strdup(exec_dir);
 }
+
+static sigjmp_buf sigjump;
+
+static void sigbus_handler(int signal)
+{
+    siglongjmp(sigjump, 1);
+}
+
+static size_t fd_getpagesize(int fd)
+{
+#ifdef CONFIG_LINUX
+    struct statfs fs;
+    int ret;
+
+    if (fd != -1) {
+        do {
+            ret = fstatfs(fd, &fs);
+        } while (ret != 0 && errno == EINTR);
+
+        if (ret == 0 && fs.f_type == HUGETLBFS_MAGIC) {
+            return fs.f_bsize;
+        }
+    }
+#endif
+
+    return getpagesize();
+}
+
+void os_mem_prealloc(int fd, char *area, size_t memory)
+{
+    int ret, i;
+    struct sigaction act, oldact;
+    sigset_t set, oldset;
+    size_t hpagesize = fd_getpagesize(fd);
+
+    memset(&act, 0, sizeof(act));
+    act.sa_handler = &sigbus_handler;
+    act.sa_flags = 0;
+
+    ret = sigaction(SIGBUS, &act, &oldact);
+    if (ret) {
+        perror("os_mem_prealloc: failed to install signal handler");
+        exit(1);
+    }
+
+    /* unblock SIGBUS */
+    sigemptyset(&set);
+    sigaddset(&set, SIGBUS);
+    pthread_sigmask(SIG_UNBLOCK, &set, &oldset);
+
+    if (sigsetjmp(sigjump, 1)) {
+        fprintf(stderr, "os_mem_prealloc: failed to preallocate pages\n");
+        exit(1);
+    }
+
+    /* MAP_POPULATE silently ignores failures */
+    memory = (memory + hpagesize - 1) & -hpagesize;
+    for (i = 0; i < (memory/hpagesize); i++) {
+        memset(area + (hpagesize*i), 0, 1);
+    }
+
+    ret = sigaction(SIGBUS, &oldact, NULL);
+    if (ret) {
+        perror("os_mem_prealloc: failed to reinstall signal handler");
+        exit(1);
+    }
+
+    pthread_sigmask(SIG_SETMASK, &oldset, NULL);
+}
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 20/28] memory: move RAM_PREALLOC_MASK to exec.c, rename
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (18 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 19/28] memory: move preallocation code out of exec.c Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend Paolo Bonzini
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Prepare for adding more flags.  No other flag uses the "_MASK" suffix,
so drop it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                 | 9 ++++++---
 include/exec/cpu-all.h | 3 ---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 7f05f14..387eb9c 100644
--- a/exec.c
+++ b/exec.c
@@ -71,6 +71,9 @@ AddressSpace address_space_memory;
 MemoryRegion io_mem_rom, io_mem_notdirty;
 static MemoryRegion io_mem_unassigned;
 
+/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */
+#define RAM_PREALLOC   (1 << 0)
+
 #endif
 
 struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus);
@@ -1305,7 +1308,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     new_block->fd = -1;
     new_block->host = host;
     if (host) {
-        new_block->flags |= RAM_PREALLOC_MASK;
+        new_block->flags |= RAM_PREALLOC;
     }
     return ram_block_add(new_block);
 }
@@ -1344,7 +1347,7 @@ void qemu_ram_free(ram_addr_t addr)
             QTAILQ_REMOVE(&ram_list.blocks, block, next);
             ram_list.mru_block = NULL;
             ram_list.version++;
-            if (block->flags & RAM_PREALLOC_MASK) {
+            if (block->flags & RAM_PREALLOC) {
                 ;
             } else if (xen_enabled()) {
                 xen_invalidate_map_cache_entry(block->host);
@@ -1376,7 +1379,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
         offset = addr - block->offset;
         if (offset < block->length) {
             vaddr = block->host + offset;
-            if (block->flags & RAM_PREALLOC_MASK) {
+            if (block->flags & RAM_PREALLOC) {
                 ;
             } else if (xen_enabled()) {
                 abort();
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index b44babb..fa4807d 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -438,9 +438,6 @@ void cpu_watchpoint_remove_all(CPUArchState *env, int mask);
 
 /* memory API */
 
-/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */
-#define RAM_PREALLOC_MASK   (1 << 0)
-
 typedef struct RAMBlock {
     struct MemoryRegion *mr;
     uint8_t *host;
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (19 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 20/28] memory: move RAM_PREALLOC_MASK to exec.c, rename Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 17:38   ` Eric Blake
  2014-03-07  6:57   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 22/28] hostmem: separate allocation from UserCreatable complete method Paolo Bonzini
                   ` (7 subsequent siblings)
  28 siblings, 2 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 backends/Makefile.objs  |   1 +
 backends/hostmem-file.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 109 insertions(+)
 create mode 100644 backends/hostmem-file.c

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index e6bdc11..509e4a3 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -8,3 +8,4 @@ $(obj)/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS)
 common-obj-$(CONFIG_TPM) += tpm.o
 
 common-obj-y += hostmem.o hostmem-ram.o
+common-obj-$(CONFIG_LINUX) += hostmem-file.o
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
new file mode 100644
index 0000000..8c6ea5d
--- /dev/null
+++ b/backends/hostmem-file.c
@@ -0,0 +1,108 @@
+/*
+ * QEMU Host Memory Backend for hugetlbfs
+ *
+ * Copyright (C) 2013 Red Hat Inc
+ *
+ * Authors:
+ *   Paolo Bonzini <pbonzini@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "sysemu/hostmem.h"
+#include "qom/object_interfaces.h"
+
+/* hostmem-file.c */
+/**
+ * @TYPE_MEMORY_BACKEND_FILE:
+ * name of backend that uses mmap on a file descriptor
+ */
+#define TYPE_MEMORY_BACKEND_FILE "memory-file"
+
+#define MEMORY_BACKEND_FILE(obj) \
+    OBJECT_CHECK(HostMemoryBackendFile, (obj), TYPE_MEMORY_BACKEND_FILE)
+
+typedef struct HostMemoryBackendFile HostMemoryBackendFile;
+
+struct HostMemoryBackendFile {
+    HostMemoryBackend parent_obj;
+    char *mem_path;
+};
+
+static void
+file_backend_memory_init(UserCreatable *uc, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(uc);
+
+    if (!backend->size) {
+        error_setg(errp, "can't create backend with size 0");
+        return;
+    }
+    if (!fb->mem_path) {
+        error_setg(errp, "mem-path property not set");
+        return;
+    }
+#ifndef CONFIG_LINUX
+    error_setg(errp, "-mem-path not supported on this host");
+#else
+    if (!memory_region_size(&backend->mr)) {
+        memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
+                                         object_get_canonical_path(OBJECT(backend)),
+                                         backend->size,
+                                         fb->mem_path, errp);
+    }
+#endif
+}
+
+static void
+file_backend_class_init(ObjectClass *oc, void *data)
+{
+    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+
+    ucc->complete = file_backend_memory_init;
+}
+
+static char *get_mem_path(Object *o, Error **errp)
+{
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+
+    return g_strdup(fb->mem_path);
+}
+
+static void set_mem_path(Object *o, const char *str, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(o);
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+
+    if (memory_region_size(&backend->mr)) {
+        error_setg(errp, "cannot change property value");
+        return;
+    }
+    if (fb->mem_path) {
+        g_free(fb->mem_path);
+    }
+    fb->mem_path = g_strdup(str);
+}
+
+static void
+file_backend_instance_init(Object *o)
+{
+    object_property_add_str(o, "mem-path", get_mem_path,
+                            set_mem_path, NULL);
+}
+
+static const TypeInfo file_backend_info = {
+    .name = TYPE_MEMORY_BACKEND_FILE,
+    .parent = TYPE_MEMORY_BACKEND,
+    .class_init = file_backend_class_init,
+    .instance_init = file_backend_instance_init,
+    .instance_size = sizeof(HostMemoryBackendFile),
+};
+
+static void register_types(void)
+{
+    type_register_static(&file_backend_info);
+}
+
+type_init(register_types);
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 22/28] hostmem: separate allocation from UserCreatable complete method
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (20 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-07  7:08   ` Hu Tao
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 23/28] hostmem: add merge and dump properties Paolo Bonzini
                   ` (6 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

This allows the superclass to set various policies on the memory
region that the subclass creates.
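
In other words, the base class keeps ownership of the completion step
and calls down into a per-class allocation hook.  A stripped-down
sketch of that shape, with plain structs in place of QOM classes (all
Toy*/toy_* names are hypothetical, and "merge" is just one example
policy):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyBackend ToyBackend;

/* Per-class hook table; "alloc" plays the role of the new class method. */
typedef struct ToyBackendClass {
    bool (*alloc)(ToyBackend *backend);   /* returns true on success */
} ToyBackendClass;

struct ToyBackend {
    const ToyBackendClass *klass;
    size_t size;
    bool allocated;
    bool merge;           /* policy requested by the user */
    bool merge_applied;   /* set by the base class after allocation */
};

/* Subclass part: only creates the memory. */
static bool toy_ram_alloc(ToyBackend *b)
{
    if (b->size == 0) {
        return false;
    }
    b->allocated = true;
    return true;
}

static const ToyBackendClass toy_ram_class = { .alloc = toy_ram_alloc };

/* Base-class part: run the hook, then apply common policy in one place. */
static bool toy_backend_complete(ToyBackend *b)
{
    if (!b->klass || !b->klass->alloc) {
        return false;     /* no allocation method for this type */
    }
    if (!b->klass->alloc(b)) {
        return false;     /* allocation failed: skip the policy step */
    }
    b->merge_applied = b->merge;
    return true;
}
```

Each subclass then only has to know how to create its memory; anything
that applies to every backend is written once in the base class.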

Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 backends/hostmem-file.c  |  9 ++++-----
 backends/hostmem-ram.c   |  8 +++-----
 backends/hostmem.c       | 12 ++++++++++--
 include/sysemu/hostmem.h |  2 ++
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 8c6ea5d..7e91665 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -30,10 +30,9 @@ struct HostMemoryBackendFile {
 };
 
 static void
-file_backend_memory_init(UserCreatable *uc, Error **errp)
+file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 {
-    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
-    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(uc);
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
 
     if (!backend->size) {
         error_setg(errp, "can't create backend with size 0");
@@ -58,9 +57,9 @@ file_backend_memory_init(UserCreatable *uc, Error **errp)
 static void
 file_backend_class_init(ObjectClass *oc, void *data)
 {
-    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+    HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
 
-    ucc->complete = file_backend_memory_init;
+    bc->alloc = file_backend_memory_alloc;
 }
 
 static char *get_mem_path(Object *o, Error **errp)
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index ce06fbe..e4d244a 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -16,10 +16,8 @@
 
 
 static void
-ram_backend_memory_init(UserCreatable *uc, Error **errp)
+ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 {
-    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
-
     if (!backend->size) {
         error_setg(errp, "can't create backend with size 0");
         return;
@@ -33,9 +31,9 @@ ram_backend_memory_init(UserCreatable *uc, Error **errp)
 static void
 ram_backend_class_init(ObjectClass *oc, void *data)
 {
-    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+    HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
 
-    ucc->complete = ram_backend_memory_init;
+    bc->alloc = ram_backend_memory_alloc;
 }
 
 static const TypeInfo ram_backend_info = {
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 06817dd..7d6199f 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -69,8 +69,16 @@ static void host_memory_backend_finalize(Object *obj)
 static void
 host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
 {
-    error_setg(errp, "memory_init is not implemented for type [%s]",
-               object_get_typename(OBJECT(uc)));
+    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
+    HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
+
+    if (!bc->alloc) {
+        error_setg(errp, "memory_init is not implemented for type [%s]",
+                   object_get_typename(OBJECT(uc)));
+        return;
+    }
+
+    bc->alloc(backend, errp);
 }
 
 MemoryRegion *
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index bc3ffb3..4738107 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -34,6 +34,8 @@ typedef struct HostMemoryBackendClass HostMemoryBackendClass;
  */
 struct HostMemoryBackendClass {
     ObjectClass parent_class;
+
+    void (*alloc)(HostMemoryBackend *backend, Error **errp);
 };
 
 /**
-- 
1.8.5.3

* [Qemu-devel] [PATCH 2.1 23/28] hostmem: add merge and dump properties
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (21 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 22/28] hostmem: separate allocation from UserCreatable complete method Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 24/28] hostmem: allow preallocation of any memory region Paolo Bonzini
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 backends/hostmem.c       | 83 +++++++++++++++++++++++++++++++++++++++++++++++-
 include/qemu/osdep.h     | 10 ++++++
 include/sysemu/hostmem.h |  1 +
 3 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 7d6199f..161c494 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -50,8 +50,72 @@ host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
     backend->size = value;
 }
 
+static bool host_memory_backend_get_merge(Object *obj, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    return backend->merge;
+}
+
+static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    if (!memory_region_size(&backend->mr)) {
+        backend->merge = value;
+        return;
+    }
+
+    if (value != backend->merge) {
+        void *ptr = memory_region_get_ram_ptr(&backend->mr);
+        uint64_t sz = memory_region_size(&backend->mr);
+
+        qemu_madvise(ptr, sz,
+                     value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE);
+        backend->merge = value;
+    }
+}
+
+static bool host_memory_backend_get_dump(Object *obj, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    return backend->dump;
+}
+
+static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    if (!memory_region_size(&backend->mr)) {
+        backend->dump = value;
+        return;
+    }
+
+    if (value != backend->dump) {
+        void *ptr = memory_region_get_ram_ptr(&backend->mr);
+        uint64_t sz = memory_region_size(&backend->mr);
+
+        qemu_madvise(ptr, sz,
+                     value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP);
+        backend->dump = value;
+    }
+}
+
+
 static void host_memory_backend_initfn(Object *obj)
 {
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    backend->merge = qemu_opt_get_bool(qemu_get_machine_opts(), "mem-merge", true);
+    backend->dump = qemu_opt_get_bool(qemu_get_machine_opts(), "dump-guest-core", true);
+
+    object_property_add_bool(obj, "merge",
+                        host_memory_backend_get_merge,
+                        host_memory_backend_set_merge, NULL);
+    object_property_add_bool(obj, "dump",
+                        host_memory_backend_get_dump,
+                        host_memory_backend_set_dump, NULL);
     object_property_add(obj, "size", "int",
                         host_memory_backend_get_size,
                         host_memory_backend_set_size, NULL, NULL, NULL);
@@ -71,6 +135,9 @@ host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(uc);
     HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
+    Error *local_err = NULL;
+    void *ptr;
+    uint64_t sz;
 
     if (!bc->alloc) {
         error_setg(errp, "memory_init is not implemented for type [%s]",
@@ -78,7 +145,21 @@ host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
         return;
     }
 
-    bc->alloc(backend, errp);
+    bc->alloc(backend, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    ptr = memory_region_get_ram_ptr(&backend->mr);
+    sz = memory_region_size(&backend->mr);
+
+    if (backend->merge) {
+        qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
+    }
+    if (!backend->dump) {
+        qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
+    }
 }
 
 MemoryRegion *
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 9c1a119..820c5d0 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -116,6 +116,16 @@ void qemu_anon_ram_free(void *ptr, size_t size);
 #else
 #define QEMU_MADV_MERGEABLE QEMU_MADV_INVALID
 #endif
+#ifdef MADV_UNMERGEABLE
+#define QEMU_MADV_UNMERGEABLE MADV_UNMERGEABLE
+#else
+#define QEMU_MADV_UNMERGEABLE QEMU_MADV_INVALID
+#endif
+#ifdef MADV_DODUMP
+#define QEMU_MADV_DODUMP MADV_DODUMP
+#else
+#define QEMU_MADV_DODUMP QEMU_MADV_INVALID
+#endif
 #ifdef MADV_DONTDUMP
 #define QEMU_MADV_DONTDUMP MADV_DONTDUMP
 #else
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index 4738107..f33d433 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -52,6 +52,7 @@ struct HostMemoryBackend {
 
     /* protected */
     uint64_t size;
+    bool merge, dump;
 
     MemoryRegion mr;
 };
-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 24/28] hostmem: allow preallocation of any memory region
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (22 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 23/28] hostmem: add merge and dump properties Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 25/28] hostmem: add property to map memory with MAP_SHARED Paolo Bonzini
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Add a "prealloc" property so that any memory region can be preallocated,
and allow preallocation of file-based memory even without -mem-prealloc.
Some care is necessary because -mem-prealloc does not allow disabling
preallocation for hostmem-file.
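
For readers unfamiliar with preallocation, here is a minimal sketch of the
general idea: touch every page of a mapping once so that the kernel backs
it up front instead of on first guest access.  This is not QEMU's
os_mem_prealloc(); the helpers touch_pages() and alloc_prefaulted() are
hypothetical names used only for illustration, and an anonymous mapping
stands in for the backend's memory region.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not QEMU's os_mem_prealloc(): fault in every page
 * of [area, area + size) by rewriting its first byte with its own value.
 * Returns the number of pages touched. */
static size_t touch_pages(void *area, size_t size)
{
    long page_size = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = area;
    size_t touched = 0;

    assert(page_size > 0);
    for (size_t off = 0; off < size; off += (size_t)page_size) {
        p[off] = p[off];   /* a write forces the kernel to back the page */
        touched++;
    }
    return touched;
}

/* Map an anonymous region and fault it in; returns NULL if mmap fails. */
static void *alloc_prefaulted(size_t size)
{
    void *area = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (area == MAP_FAILED) {
        return NULL;
    }
    touch_pages(area, size);
    return area;
}
```

In the patch itself the equivalent step runs either at backend creation
(when "prealloc" is already set) or later, from the property setter, once
the region exists.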

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 backends/hostmem-file.c  |  3 +++
 backends/hostmem.c       | 40 ++++++++++++++++++++++++++++++++++++++++
 exec.c                   |  7 +++++++
 include/exec/memory.h    | 10 ++++++++++
 include/exec/ram_addr.h  |  1 +
 include/sysemu/hostmem.h |  1 +
 memory.c                 | 11 +++++++++++
 7 files changed, 73 insertions(+)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 7e91665..6199a27 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -9,7 +9,9 @@
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
  */
+#include "qemu-common.h"
 #include "sysemu/hostmem.h"
+#include "sysemu/sysemu.h"
 #include "qom/object_interfaces.h"
 
 /* hostmem-file.c */
@@ -46,6 +48,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     error_setg(errp, "-mem-path not supported on this host");
 #else
     if (!memory_region_size(&backend->mr)) {
+        backend->force_prealloc = mem_prealloc;
         memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
                                          object_get_canonical_path(OBJECT(backend)),
                                          backend->size,
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 161c494..49985dc 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -102,6 +102,39 @@ static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
     }
 }
 
+static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    return backend->prealloc || backend->force_prealloc;
+}
+
+static void host_memory_backend_set_prealloc(Object *obj, bool value, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    if (backend->force_prealloc) {
+        if (value) {
+            error_setg(errp, "remove -mem-prealloc to use the prealloc property");
+            return;
+        }
+    }
+
+    if (!memory_region_size(&backend->mr)) {
+        backend->prealloc = value;
+        return;
+    }
+
+    if (value && !backend->prealloc) {
+        int fd = memory_region_get_fd(&backend->mr);
+        void *ptr = memory_region_get_ram_ptr(&backend->mr);
+        uint64_t sz = memory_region_size(&backend->mr);
+
+        os_mem_prealloc(fd, ptr, sz);
+        backend->prealloc = true;
+    }
+}
+
 
 static void host_memory_backend_initfn(Object *obj)
 {
@@ -109,6 +142,7 @@ static void host_memory_backend_initfn(Object *obj)
 
     backend->merge = qemu_opt_get_bool(qemu_get_machine_opts(), "mem-merge", true);
     backend->dump = qemu_opt_get_bool(qemu_get_machine_opts(), "dump-guest-core", true);
+    backend->prealloc = mem_prealloc;
 
     object_property_add_bool(obj, "merge",
                         host_memory_backend_get_merge,
@@ -116,6 +150,9 @@ static void host_memory_backend_initfn(Object *obj)
     object_property_add_bool(obj, "dump",
                         host_memory_backend_get_dump,
                         host_memory_backend_set_dump, NULL);
+    object_property_add_bool(obj, "prealloc",
+                        host_memory_backend_get_prealloc,
+                        host_memory_backend_set_prealloc, NULL);
     object_property_add(obj, "size", "int",
                         host_memory_backend_get_size,
                         host_memory_backend_set_size, NULL, NULL, NULL);
@@ -160,6 +197,9 @@ host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
     if (!backend->dump) {
         qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
     }
+    if (backend->prealloc) {
+        os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
+    }
 }
 
 MemoryRegion *
diff --git a/exec.c b/exec.c
index 387eb9c..312db25 100644
--- a/exec.c
+++ b/exec.c
@@ -1422,6 +1422,13 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
 }
 #endif /* !_WIN32 */
 
+int qemu_get_ram_fd(ram_addr_t addr)
+{
+    RAMBlock *block = qemu_get_ram_block(addr);
+
+    return block->fd;
+}
+
 /* Return a host pointer to ram allocated with qemu_ram_alloc.
    With the exception of the softmmu code in this file, this should
    only be used for local memory (e.g. video ram) that the device owns,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index e2c603c..d7a47c4 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -534,6 +534,16 @@ bool memory_region_is_logging(MemoryRegion *mr);
 bool memory_region_is_rom(MemoryRegion *mr);
 
 /**
+ * memory_region_get_fd: Get a file descriptor backing a RAM memory region.
+ *
+ * Returns a file descriptor backing a file-based RAM memory region,
+ * or -1 if the region is not a file-based RAM memory region.
+ *
+ * @mr: the RAM or alias memory region being queried.
+ */
+int memory_region_get_fd(MemoryRegion *mr);
+
+/**
  * memory_region_get_ram_ptr: Get a pointer into a RAM memory region.
  *
  * Returns a host pointer to a RAM memory region (created with
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index f9518a6..d352f60 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -27,6 +27,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
 ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr);
 ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
+int qemu_get_ram_fd(ram_addr_t addr);
 void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index f33d433..ae72fa5 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -53,6 +53,7 @@ struct HostMemoryBackend {
     /* protected */
     uint64_t size;
     bool merge, dump;
+    bool prealloc, force_prealloc;
 
     MemoryRegion mr;
 };
diff --git a/memory.c b/memory.c
index b27bcda..6909c16 100644
--- a/memory.c
+++ b/memory.c
@@ -1258,6 +1258,17 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr addr,
     cpu_physical_memory_reset_dirty(mr->ram_addr + addr, size, client);
 }
 
+int memory_region_get_fd(MemoryRegion *mr)
+{
+    if (mr->alias) {
+        return memory_region_get_fd(mr->alias);
+    }
+
+    assert(mr->terminates);
+
+    return qemu_get_ram_fd(mr->ram_addr & TARGET_PAGE_MASK);
+}
+
 void *memory_region_get_ram_ptr(MemoryRegion *mr)
 {
     if (mr->alias) {
-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 25/28] hostmem: add property to map memory with MAP_SHARED
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (23 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 24/28] hostmem: allow preallocation of any memory region Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 26/28] configure: add Linux libnuma detection Paolo Bonzini
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

A new "share" property can be used with the "memory-file" backend to
map memory with MAP_SHARED instead of MAP_PRIVATE.
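
To illustrate what the flag changes, the sketch below maps a temporary
file, stores one byte through the mapping and reads the same offset back
from the file.  The helper write_via_mapping() and the /tmp path are
assumptions made for this example; they are not part of the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map a fresh temporary file, store one byte through
 * the mapping, then read the same offset back with pread().  Returns the
 * byte seen in the file, or -1 on any failure. */
static int write_via_mapping(bool share)
{
    char path[] = "/tmp/share-demo-XXXXXX";
    unsigned char seen = 0;
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    unlink(path);                      /* file lives only as long as fd */

    if (ftruncate(fd, 4096) == 0) {
        unsigned char *area = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                   share ? MAP_SHARED : MAP_PRIVATE, fd, 0);
        if (area != MAP_FAILED) {
            area[0] = 0x5a;            /* store through the mapping */
            if (pread(fd, &seen, 1, 0) == 1) {
                result = seen;         /* what the file itself contains */
            }
            munmap(area, 4096);
        }
    }
    close(fd);
    return result;
}
```

On Linux, write_via_mapping(true) should return 0x5a because the store
reaches the file, while write_via_mapping(false) should return 0 because
a MAP_PRIVATE store stays in a private copy-on-write page.  That
visibility is what another process sharing the file (such as a
vhost-user backend) relies on.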

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 backends/hostmem-file.c | 26 +++++++++++++++++++++++++-
 exec.c                  | 17 +++++++++--------
 include/exec/memory.h   |  2 ++
 include/exec/ram_addr.h |  2 +-
 memory.c                |  3 ++-
 numa.c                  |  2 +-
 6 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 6199a27..1b65445 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -28,6 +28,8 @@ typedef struct HostMemoryBackendFile HostMemoryBackendFile;
 
 struct HostMemoryBackendFile {
     HostMemoryBackend parent_obj;
+
+    bool share;
     char *mem_path;
 };
 
@@ -51,7 +53,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
         backend->force_prealloc = mem_prealloc;
         memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
                                          object_get_canonical_path(OBJECT(backend)),
-                                         backend->size,
+                                         backend->size, fb->share,
                                          fb->mem_path, errp);
     }
 #endif
@@ -87,9 +89,31 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
     fb->mem_path = g_strdup(str);
 }
 
+static bool file_memory_backend_get_share(Object *o, Error **errp)
+{
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+
+    return fb->share;
+}
+
+static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(o);
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+
+    if (memory_region_size(&backend->mr)) {
+        error_setg(errp, "cannot change property value");
+        return;
+    }
+    fb->share = value;
+}
+
 static void
 file_backend_instance_init(Object *o)
 {
+    object_property_add_bool(o, "share",
+                        file_memory_backend_get_share,
+                        file_memory_backend_set_share, NULL);
     object_property_add_str(o, "mem-path", get_mem_path,
                             set_mem_path, NULL);
 }
diff --git a/exec.c b/exec.c
index 312db25..ecef8a4 100644
--- a/exec.c
+++ b/exec.c
@@ -74,6 +74,9 @@ static MemoryRegion io_mem_unassigned;
 /* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */
 #define RAM_PREALLOC   (1 << 0)
 
+/* RAM is mmap-ed with MAP_SHARED */
+#define RAM_SHARED     (1 << 1)
+
 #endif
 
 struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus);
@@ -1071,7 +1074,9 @@ static void *file_ram_alloc(RAMBlock *block,
     if (ftruncate(fd, memory))
         perror("ftruncate");
 
-    area = mmap(0, memory, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+    area = mmap(0, memory, PROT_READ | PROT_WRITE,
+                (block->flags & RAM_SHARED ? MAP_SHARED : MAP_PRIVATE),
+                fd, 0);
     if (area == MAP_FAILED) {
         error_setg_errno(errp, errno, "unable to map backing store for hugepages");
         close(fd);
@@ -1260,7 +1265,7 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
 
 #ifdef __linux__
 ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-                                    const char *mem_path,
+                                    bool share, const char *mem_path,
                                     Error **errp)
 {
     RAMBlock *new_block;
@@ -1285,6 +1290,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
     new_block = g_malloc0(sizeof(*new_block));
     new_block->mr = mr;
     new_block->length = size;
+    new_block->flags = share ? RAM_SHARED : 0;
     new_block->host = file_ram_alloc(new_block, size,
                                      mem_path, errp);
     if (!new_block->host) {
@@ -1387,12 +1393,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
                 flags = MAP_FIXED;
                 munmap(vaddr, length);
                 if (block->fd >= 0) {
-#ifdef MAP_POPULATE
-                    flags |= mem_prealloc ? MAP_POPULATE | MAP_SHARED :
-                        MAP_PRIVATE;
-#else
-                    flags |= MAP_PRIVATE;
-#endif
+                    flags |= (block->flags & RAM_SHARED ? MAP_SHARED : MAP_PRIVATE);
                     area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
                                 flags, block->fd, offset);
                 } else {
diff --git a/include/exec/memory.h b/include/exec/memory.h
index d7a47c4..c60f66d 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -321,6 +321,7 @@ void memory_region_init_ram(MemoryRegion *mr,
  * @owner: the object that tracks the region's reference count
  * @name: the name of the region.
  * @size: size of the region.
+ * @share: %true if memory must be mmaped with the MAP_SHARED flag
  * @path: the path in which to allocate the RAM.
  * @errp: pointer to Error*, to store an error if it happens.
  */
@@ -328,6 +329,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       struct Object *owner,
                                       const char *name,
                                       uint64_t size,
+                                      bool share,
                                       const char *path,
                                       Error **errp);
 #endif
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index d352f60..5f8d30d 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -23,7 +23,7 @@
 #include "hw/xen/xen.h"
 
 ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-                                    const char *mem_path, Error **errp);
+                                    bool share, const char *mem_path, Error **errp);
 ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr);
 ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
diff --git a/memory.c b/memory.c
index 6909c16..b071cbf 100644
--- a/memory.c
+++ b/memory.c
@@ -1025,6 +1025,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       struct Object *owner,
                                       const char *name,
                                       uint64_t size,
+                                      bool share,
                                       const char *path,
                                       Error **errp)
 {
@@ -1032,7 +1033,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
     mr->ram = true;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
-    mr->ram_addr = qemu_ram_alloc_from_file(size, mr, path, errp);
+    mr->ram_addr = qemu_ram_alloc_from_file(size, mr, share, path, errp);
 }
 #endif
 
diff --git a/numa.c b/numa.c
index bf22848..4a1ba0c 100644
--- a/numa.c
+++ b/numa.c
@@ -231,7 +231,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
     if (mem_path) {
 #ifdef __linux__
         Error *err = NULL;
-        memory_region_init_ram_from_file(mr, owner, name, ram_size,
+        memory_region_init_ram_from_file(mr, owner, name, ram_size, false,
                                          mem_path, &err);
 
 	/* Legacy behavior: if allocation failed, fall back to
-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 26/28] configure: add Linux libnuma detection
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (24 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 25/28] hostmem: add property to map memory with MAP_SHARED Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 27/28] hostmem: add properties for NUMA memory policy Paolo Bonzini
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Andre Przywara, ehabkost, hutao, mtosatti, imammedo, a.motakis,
	gaowanlong

From: Wanlong Gao <gaowanlong@cn.fujitsu.com>

Add detection of libnuma (usually shipped in the numactl package)
to the configure script. It can be enabled or disabled on the configure
command line with --enable-numa or --disable-numa; the default is to
use it if available.
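
As a rough sketch of how code can consume the result of this probe, the
fragment below guards libnuma usage with CONFIG_NUMA, the symbol this
patch writes to config-host.mak.  The helper host_numa_usable() is a
hypothetical name; numa_available() is the same libnuma call that the
configure test program uses.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef CONFIG_NUMA
#include <numa.h>
#endif

/* Hypothetical helper: true only when the binary was built with libnuma
 * and the running kernel reports the NUMA API as available. */
static bool host_numa_usable(void)
{
#ifdef CONFIG_NUMA
    return numa_available() != -1;   /* -1 means the NUMA API is unusable */
#else
    return false;                    /* built without libnuma */
#endif
}
```

Without CONFIG_NUMA the helper compiles to a constant false, so callers
need no conditional compilation of their own.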

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 configure | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/configure b/configure
index 8ad03ea..585b40c 100755
--- a/configure
+++ b/configure
@@ -303,6 +303,7 @@ tpm="no"
 libssh2=""
 vhdx=""
 quorum="no"
+numa=""
 
 # parse CC options first
 for opt do
@@ -1051,6 +1052,10 @@ for opt do
   ;;
   --enable-quorum) quorum="yes"
   ;;
+  --disable-numa) numa="no"
+  ;;
+  --enable-numa) numa="yes"
+  ;;
   *) echo "ERROR: unknown option $opt"; show_help="yes"
   ;;
   esac
@@ -1310,6 +1315,8 @@ Advanced options (experts only):
   --enable-vhdx            enable support for the Microsoft VHDX image format
   --disable-quorum         disable quorum block filter support
   --enable-quorum          enable quorum block filter support
+  --disable-numa           disable libnuma support
+  --enable-numa            enable libnuma support
 
 NOTE: The object files are built at the place where configure is launched
 EOF
@@ -2983,6 +2990,27 @@ if compile_prog "" "" ; then
 fi
 
 ##########################################
+# libnuma probe
+
+if test "$numa" != "no" ; then
+  numa=no
+  cat > $TMPC << EOF
+#include <numa.h>
+int main(void) { return numa_available(); }
+EOF
+
+  if compile_prog "" "-lnuma" ; then
+    numa=yes
+    libs_softmmu="-lnuma $libs_softmmu"
+  else
+    if test "$numa" = "yes" ; then
+      feature_not_found "linux NUMA (install numactl?)"
+    fi
+    numa=no
+  fi
+fi
+
+##########################################
 # signalfd probe
 signalfd="no"
 cat > $TMPC << EOF
@@ -4045,6 +4073,7 @@ echo "TPM passthrough   $tpm_passthrough"
 echo "QOM debugging     $qom_cast_debug"
 echo "vhdx              $vhdx"
 echo "Quorum            $quorum"
+echo "NUMA host support $numa"
 
 if test "$sdl_too_old" = "yes"; then
 echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -4983,6 +5012,10 @@ if [ "$dtc_internal" = "yes" ]; then
   echo "config-host.h: subdir-dtc" >> $config_host_mak
 fi
 
+if test "$numa" = "yes"; then
+  echo "CONFIG_NUMA=y" >> $config_host_mak
+fi
+
 # build tree in object directory in case the source is not in the current directory
 DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests"
 DIRS="$DIRS fsdev"
-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 27/28] hostmem: add properties for NUMA memory policy
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (25 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 26/28] configure: add Linux libnuma detection Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev Paolo Bonzini
  2014-03-05 11:05 ` [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Andreas Färber
  28 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Hu Tao <hutao@cn.fujitsu.com>

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Raise errors on setting properties if !CONFIG_NUMA.  Add BUILD_BUG_ON
 checks. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
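
The two less obvious parts of this patch, the build-time checks and the
maxnode argument passed to mbind(), can be sketched in isolation.  All
DEMO_* names below are stand-ins invented for this example: the enum
mirrors the order of the QAPI HostMemPolicy values, the DEMO_MPOL_*
macros mirror the Linux MPOL_* constants, and the bitmap helpers only
imitate bitmap_set() and find_last_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed stand-in for the generated HostMemPolicy enum. */
enum DemoHostMemPolicy {
    DEMO_POLICY_DEFAULT,
    DEMO_POLICY_PREFERRED,
    DEMO_POLICY_MEMBIND,
    DEMO_POLICY_INTERLEAVE,
};

/* Assumed stand-ins for the MPOL_* values from <numaif.h>. */
#define DEMO_MPOL_DEFAULT    0
#define DEMO_MPOL_PREFERRED  1
#define DEMO_MPOL_BIND       2
#define DEMO_MPOL_INTERLEAVE 3

/* Compile-time checks in the spirit of QEMU_BUILD_BUG_ON: the enum value
 * can be handed to mbind() only if the numbering matches. */
_Static_assert(DEMO_POLICY_DEFAULT == DEMO_MPOL_DEFAULT, "default");
_Static_assert(DEMO_POLICY_PREFERRED == DEMO_MPOL_PREFERRED, "preferred");
_Static_assert(DEMO_POLICY_MEMBIND == DEMO_MPOL_BIND, "membind");
_Static_assert(DEMO_POLICY_INTERLEAVE == DEMO_MPOL_INTERLEAVE, "interleave");

#define DEMO_MAX_NODES 64
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS \
    ((DEMO_MAX_NODES + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

/* Mark one host node in the mask. */
static void demo_set_node(unsigned long *mask, unsigned node)
{
    assert(node < DEMO_MAX_NODES);
    mask[node / DEMO_BITS_PER_WORD] |= 1UL << (node % DEMO_BITS_PER_WORD);
}

/* Index of the highest set bit, or DEMO_MAX_NODES if the mask is empty. */
static unsigned demo_last_node(const unsigned long *mask)
{
    for (unsigned n = DEMO_MAX_NODES; n-- > 0; ) {
        if (mask[n / DEMO_BITS_PER_WORD] &
            (1UL << (n % DEMO_BITS_PER_WORD))) {
            return n;
        }
    }
    return DEMO_MAX_NODES;
}

/* maxnode as the patch computes it: highest node index plus 2. */
static unsigned long demo_mbind_maxnode(const unsigned long *mask)
{
    return (unsigned long)demo_last_node(mask) + 2;
}
```

With nodes 0 and 3 set, the highest index is 3 and the value passed as
maxnode is 5: one more than the 4 bits that are actually meaningful,
which is the workaround the patch comment describes for mbind() cutting
off the last specified node.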
---
 backends/hostmem.c       | 109 ++++++++++++++++++++++++++++++++++++++++++++++-
 include/sysemu/hostmem.h |   4 ++
 qapi-schema.json         |  20 +++++++++
 3 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 49985dc..895d27c 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -10,12 +10,21 @@
  * See the COPYING file in the top-level directory.
  */
 #include "sysemu/hostmem.h"
-#include "sysemu/sysemu.h"
 #include "qapi/visitor.h"
+#include "qapi-types.h"
+#include "qapi-visit.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/config-file.h"
 #include "qom/object_interfaces.h"
 
+#ifdef CONFIG_NUMA
+#include <numaif.h>
+QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_DEFAULT != MPOL_DEFAULT);
+QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_PREFERRED != MPOL_PREFERRED);
+QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_MEMBIND != MPOL_BIND);
+QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
+#endif
+
 static void
 host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
                             const char *name, Error **errp)
@@ -50,6 +59,84 @@ host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
     backend->size = value;
 }
 
+static void
+get_host_nodes(Object *obj, Visitor *v, void *opaque, const char *name,
+               Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    uint16List *host_nodes = NULL;
+    uint16List **node = &host_nodes;
+    unsigned long value;
+
+    value = find_first_bit(backend->host_nodes, MAX_NODES);
+    if (value == MAX_NODES) {
+        return;
+    }
+
+    *node = g_malloc0(sizeof(**node));
+    (*node)->value = value;
+    node = &(*node)->next;
+
+    do {
+        value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
+        if (value == MAX_NODES) {
+            break;
+        }
+
+        *node = g_malloc0(sizeof(**node));
+        (*node)->value = value;
+        node = &(*node)->next;
+    } while (true);
+
+    visit_type_uint16List(v, &host_nodes, name, errp);
+}
+
+static void
+set_host_nodes(Object *obj, Visitor *v, void *opaque, const char *name,
+               Error **errp)
+{
+#ifdef CONFIG_NUMA
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    uint16List *l = NULL;
+
+    visit_type_uint16List(v, &l, name, errp);
+
+    while (l) {
+        bitmap_set(backend->host_nodes, l->value, 1);
+        l = l->next;
+    }
+#else
+    error_setg(errp, "NUMA node bindings are not supported by this QEMU");
+#endif
+}
+
+static void
+get_policy(Object *obj, Visitor *v, void *opaque, const char *name,
+           Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    int policy = backend->policy;
+
+    visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
+}
+
+static void
+set_policy(Object *obj, Visitor *v, void *opaque, const char *name,
+           Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    int policy = HOST_MEM_POLICY_DEFAULT;
+
+    visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
+#ifndef CONFIG_NUMA
+    if (policy != HOST_MEM_POLICY_DEFAULT) {
+        error_setg(errp, "NUMA policies are not supported by this QEMU");
+        return;
+    }
+#endif
+    backend->policy = policy;
+}
+
 static bool host_memory_backend_get_merge(Object *obj, Error **errp)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -156,6 +243,12 @@ static void host_memory_backend_initfn(Object *obj)
     object_property_add(obj, "size", "int",
                         host_memory_backend_get_size,
                         host_memory_backend_set_size, NULL, NULL, NULL);
+    object_property_add(obj, "host-nodes", "int",
+                        get_host_nodes,
+                        set_host_nodes, NULL, NULL, NULL);
+    object_property_add(obj, "policy", "str",
+                        get_policy,
+                        set_policy, NULL, NULL, NULL);
 }
 
 static void host_memory_backend_finalize(Object *obj)
@@ -200,6 +293,20 @@ host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
     if (backend->prealloc) {
         os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
     }
+
+#ifdef CONFIG_NUMA
+    unsigned long maxnode = find_last_bit(backend->host_nodes, MAX_NODES);
+
+    /* This is a workaround for a long standing bug in Linux'
+     * mbind implementation, which cuts off the last specified
+     * node.
+     */
+    if (mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 2, 0)) {
+        error_setg_errno(errp, errno,
+                         "cannot bind memory to host NUMA nodes");
+        return;
+    }
+#endif
 }
 
 MemoryRegion *
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index ae72fa5..0b7fef2 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -12,10 +12,12 @@
 #ifndef QEMU_RAM_H
 #define QEMU_RAM_H
 
+#include "sysemu/sysemu.h" /* for MAX_NODES */
 #include "qom/object.h"
 #include "qapi/error.h"
 #include "exec/memory.h"
 #include "qemu/option.h"
+#include "qemu/bitmap.h"
 
 #define TYPE_MEMORY_BACKEND "memory"
 #define MEMORY_BACKEND(obj) \
@@ -54,6 +56,8 @@ struct HostMemoryBackend {
     uint64_t size;
     bool merge, dump;
     bool prealloc, force_prealloc;
+    DECLARE_BITMAP(host_nodes, MAX_NODES);
+    HostMemPolicy policy;
 
     MemoryRegion mr;
 };
diff --git a/qapi-schema.json b/qapi-schema.json
index 8bd84da..b11b279 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4556,3 +4556,23 @@
    '*cpus':   ['uint16'],
    '*mem':    'size',
    '*memdev': 'str' }}
+
+##
+# @HostMemPolicy
+#
+# Host memory policy types
+#
+# @default: restore default policy, remove any nondefault policy
+#
+# @preferred: set the preferred host nodes for allocation
+#
+# @membind: a strict policy that restricts memory allocation to the
+#           host nodes specified
+#
+# @interleave: memory allocations are interleaved across the set
+#              of host nodes specified
+#
+# Since: 2.1
+##
+{ 'enum': 'HostMemPolicy',
+  'data': [ 'default', 'preferred', 'membind', 'interleave' ] }
-- 
1.8.5.3


* [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (26 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 27/28] hostmem: add properties for NUMA memory policy Paolo Bonzini
@ 2014-03-04 14:00 ` Paolo Bonzini
  2014-03-04 17:37   ` Eric Blake
  2014-03-05  3:48   ` Hu Tao
  2014-03-05 11:05 ` [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Andreas Färber
  28 siblings, 2 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

From: Hu Tao <hutao@cn.fujitsu.com>

Add the QMP command query-memdev to query information about
memory devices.
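
One detail of the handler worth keeping in mind: each entry is prepended
to the result list, so entries come back in reverse NUMA node order.
The sketch below shows that pattern with hypothetical DemoMemdev types
standing in for the generated Memdev/MemdevList structures; only the
"size" and "policy" fields are modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the generated Memdev/MemdevList types. */
typedef struct DemoMemdev {
    uint64_t size;
    const char *policy;
} DemoMemdev;

typedef struct DemoMemdevList {
    DemoMemdev *value;
    struct DemoMemdevList *next;
} DemoMemdevList;

/* Prepend one entry; returns the new head, or the old head unchanged if
 * an allocation fails. */
static DemoMemdevList *demo_prepend(DemoMemdevList *list,
                                    uint64_t size, const char *policy)
{
    DemoMemdevList *m = calloc(1, sizeof(*m));
    DemoMemdev *value = calloc(1, sizeof(*value));

    assert(policy != NULL);
    if (!m || !value) {
        free(m);
        free(value);
        return list;
    }
    value->size = size;
    value->policy = policy;
    m->value = value;
    m->next = list;        /* new entry becomes the head */
    return m;
}

/* Free every list cell together with its value. */
static void demo_free(DemoMemdevList *list)
{
    while (list) {
        DemoMemdevList *m = list;

        list = list->next;
        free(m->value);
        free(m);
    }
}
```

A client that needs entries in node order would have to reverse the
returned list, or the handler could append instead of prepend.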

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Use QMP visitors instead of String visitors. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 numa.c           | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json | 31 +++++++++++++++++++++++++++
 qmp-commands.hx  | 32 ++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+)

diff --git a/numa.c b/numa.c
index 4a1ba0c..159f5af 100644
--- a/numa.c
+++ b/numa.c
@@ -30,9 +30,12 @@
 #include "qapi-visit.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/dealloc-visitor.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qapi/qmp-input-visitor.h"
 #include "qapi/qmp/qerror.h"
 #include "hw/boards.h"
 #include "sysemu/hostmem.h"
+#include "qmp-commands.h"
 
 QemuOptsList qemu_numa_opts = {
     .name = "numa",
@@ -281,3 +284,64 @@ void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,
         addr += size;
     }
 }
+
+MemdevList *qmp_query_memdev(Error **errp)
+{
+    QmpOutputVisitor *ov = qmp_output_visitor_new();
+    QmpInputVisitor *iv;
+    QObject *obj;
+    MemdevList *list = NULL, *m;
+    HostMemoryBackend *backend;
+    Error *err = NULL;
+    int i;
+
+    for (i = 0; i < nb_numa_nodes; i++) {
+        backend = numa_info[i].node_memdev;
+
+        m = g_malloc0(sizeof(*m));
+        m->value = g_malloc0(sizeof(*m->value));
+        m->value->size = object_property_get_int(OBJECT(backend), "size",
+                                                 &err);
+        if (err) {
+            goto error;
+        }
+        m->value->policy = object_property_get_str(OBJECT(backend), "policy",
+                                                   &err);
+        if (err) {
+            goto error;
+        }
+        object_property_get(OBJECT(backend), qmp_output_get_visitor(ov),
+                            "host-nodes", &err);
+        if (err) {
+            goto error;
+        }
+        obj = qmp_output_get_qobject(ov);
+        iv = qmp_input_visitor_new(obj);
+        qobject_decref(obj);
+
+        visit_type_uint16List(qmp_input_get_visitor(iv),
+                              &m->value->host_nodes, NULL, &err);
+        if (err) {
+            qmp_input_visitor_cleanup(iv);
+            goto error;
+        }
+
+        m->next = list;
+        list = m;
+        qmp_input_visitor_cleanup(iv);
+    }
+
+    qmp_output_visitor_cleanup(ov);
+    return list;
+
+error:
+    while (list) {
+        m = list;
+        list = list->next;
+        g_free(m->value);
+        g_free(m);
+    }
+    qerror_report_err(err);
+    qmp_output_visitor_cleanup(ov);
+    return NULL;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index b11b279..30d0d12 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4576,3 +4576,34 @@
 ##
 { 'enum': 'HostMemPolicy',
   'data': [ 'default', 'preferred', 'membind', 'interleave' ] }
+
+##
+# @Memdev:
+#
+# Information of memory device
+#
+# @size: memory device size
+#
+# @host-nodes: host nodes for its memory policy
+#
+# @policy: memory policy of memory device
+#
+# Since: 2.1
+##
+
+{ 'type': 'Memdev',
+  'data': {
+    'size':       'size',
+    'host-nodes': ['uint16'],
+    'policy':     'str' }}
+
+##
+# @query-memdev:
+#
+# Returns information for all memory devices.
+#
+# Returns: a list of @Memdev.
+#
+# Since: 2.1
+##
+{ 'command': 'query-memdev', 'returns': ['Memdev'] }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 8a0e832..903a48a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3498,3 +3498,35 @@ Example:
                    } } ] }
 
 EQMP
+
+    {
+        .name       = "query-memdev",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_query_memdev,
+    },
+
+SQMP
+query-memdev
+------------
+
+Show memory devices information.
+
+
+Example (1):
+
+-> { "execute": "query-memdev" }
+<- { "return": [
+       {
+         "size": 536870912,
+         "host-nodes": [0, 1],
+         "policy": "bind"
+       },
+       {
+         "size": 536870912,
+         "host-nodes": [2, 3],
+         "policy": "preferred"
+       }
+     ]
+   }
+
+EQMP
-- 
1.8.5.3


* Re: [Qemu-devel] [PATCH 2.1 02/28] NUMA: check if the total numa memory size is equal to ram_size
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 02/28] NUMA: check if the total numa memory size is equal to ram_size Paolo Bonzini
@ 2014-03-04 17:00   ` Eric Blake
  2014-03-04 17:19     ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Eric Blake @ 2014-03-04 17:00 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
> From: Wanlong Gao <gaowanlong@cn.fujitsu.com>
> 
> If the total number of the assigned numa nodes memory is not
> equal to the assigned ram size, it will write the wrong data
> to ACPI talb, then the guest will ignore the wrong ACPI table

s/talb/table/

> and recognize all memory to one node. It's buggy, we should
> check it to ensure that we write the right data to ACPI table.
> 
> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  numa.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 

> +        if (numa_total != ram_size) {
> +            fprintf(stderr, "qemu: numa nodes total memory size "
> +                            "should equal to ram_size\n");

Is it worth also printing numa_total or ram_size values in this error
message?
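
As a rough sketch of what that could look like (the helper name
check_numa_total and the exact wording are only illustrative, and I am
assuming both values fit in 64 bits), formatting into a buffer so the
result is easy to inspect:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: returns 0 when the sum of
 * the per-node sizes matches ram_size, otherwise writes a diagnostic
 * that includes both values into buf and returns -1.
 */
static int check_numa_total(uint64_t numa_total, uint64_t ram_size,
                            char *buf, size_t len)
{
    if (numa_total == ram_size) {
        return 0;
    }
    snprintf(buf, len,
             "qemu: total memory for NUMA nodes (%" PRIu64 ") "
             "should equal RAM size (%" PRIu64 ")",
             numa_total, ram_size);
    return -1;
}
```

With both numbers in the message, a user can tell at a glance which
-numa node sizes need adjusting.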

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] [PATCH 2.1 02/28] NUMA: check if the total numa memory size is equal to ram_size
  2014-03-04 17:00   ` Eric Blake
@ 2014-03-04 17:19     ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 17:19 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Il 04/03/2014 18:00, Eric Blake ha scritto:
> On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
>> From: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>>
>> If the total number of the assigned numa nodes memory is not
>> equal to the assigned ram size, it will write the wrong data
>> to ACPI talb, then the guest will ignore the wrong ACPI table
>
> s/talb/table/
>
>> and recognize all memory to one node. It's buggy, we should
>> check it to ensure that we write the right data to ACPI table.
>>
>> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  numa.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>
>> +        if (numa_total != ram_size) {
>> +            fprintf(stderr, "qemu: numa nodes total memory size "
>> +                            "should equal to ram_size\n");
>
> Is it worth also printing numa_total or ram_size values in this error
> message?

Good idea.

Paolo


* Re: [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev Paolo Bonzini
@ 2014-03-04 17:37   ` Eric Blake
  2014-03-04 18:11     ` Paolo Bonzini
  2014-03-05  3:48   ` Hu Tao
  1 sibling, 1 reply; 70+ messages in thread
From: Eric Blake @ 2014-03-04 17:37 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
> From: Hu Tao <hutao@cn.fujitsu.com>
> 
> Add qmp command query-memdev to query for information
> of memory devices
> 
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> [Use QMP visitors instead of String visitors. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  numa.c           | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  qapi-schema.json | 31 +++++++++++++++++++++++++++
>  qmp-commands.hx  | 32 ++++++++++++++++++++++++++++
>  3 files changed, 127 insertions(+)
> 

> +++ b/qapi-schema.json
> @@ -4576,3 +4576,34 @@
>  ##
>  { 'enum': 'HostMemPolicy',
>    'data': [ 'default', 'preferred', 'membind', 'interleave' ] }
> +
> +##
> +# @Memdev:
> +#
> +# Information of memory device
> +#
> +# @size: memory device size
> +#
> +# @host-nodes: host nodes for its memory policy
> +#
> +# @policy: memory policy of memory device
> +#
> +# Since: 2.1
> +##
> +
> +{ 'type': 'Memdev',
> +  'data': {
> +    'size':       'size',
> +    'host-nodes': ['uint16'],
> +    'policy':     'str' }}

Why is policy 'str', when you just defined 'HostMemPolicy' as an enum in
the previous patch?  Should this be using the enum?


> +<- { "return": [
> +       {
> +         "size": 536870912,
> +         "host-nodes": [0, 1],
> +         "policy": "bind"

"bind" is not one of the values of HostMemPolicy - is that missing from
patch 27/28?
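
To illustrate the mismatch, here is a minimal sketch of a name lookup
over the enum values as they appear in the schema quoted above.  The
table and function names are made up for the example and are not what
the QAPI generator emits:

```c
#include <stddef.h>
#include <string.h>

/* HostMemPolicy values, in the order listed in the quoted schema. */
static const char *const host_mem_policy_names[] = {
    "default", "preferred", "membind", "interleave",
};

/*
 * Hypothetical lookup: returns the index of name in the table above,
 * or -1 if name is not a HostMemPolicy value.
 */
static int host_mem_policy_parse(const char *name)
{
    size_t n = sizeof(host_mem_policy_names) /
               sizeof(host_mem_policy_names[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(name, host_mem_policy_names[i]) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

With this table, "membind" resolves to a valid value while the "bind"
used in the documentation example does not, so the example and the enum
need to agree on one spelling.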

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend Paolo Bonzini
@ 2014-03-04 17:38   ` Eric Blake
  2014-03-04 18:12     ` Paolo Bonzini
  2014-03-07  6:57   ` Hu Tao
  1 sibling, 1 reply; 70+ messages in thread
From: Eric Blake @ 2014-03-04 17:38 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  backends/Makefile.objs  |   1 +
>  backends/hostmem-file.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 109 insertions(+)
>  create mode 100644 backends/hostmem-file.c
> 

> +++ b/backends/hostmem-file.c
> @@ -0,0 +1,108 @@
> +/*
> + * QEMU Host Memory Backend for hugetlbfs
> + *
> + * Copyright (C) 2013 Red Hat Inc

It's 2014 now

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option Paolo Bonzini
@ 2014-03-04 17:52   ` Eric Blake
  2014-03-07  5:33   ` Hu Tao
  1 sibling, 0 replies; 70+ messages in thread
From: Eric Blake @ 2014-03-04 17:52 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
> This option provides the infrastructure for binding guest NUMA nodes
> to host NUMA nodes.  For example:
> 
>  -object memory-ram,size=1024M,policy=membind,host-nodes=0,id=ram-node0 \
>  -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
>  -object memory-ram,size=1024M,policy=interleave,host-nodes=1-3,id=ram-node1 \
>  -numa node,nodeid=1,cpus=1,memdev=ram-node1
> 
> The option replaces "-numa node,mem=".
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---

> +++ b/qapi-schema.json
> @@ -4543,10 +4543,16 @@
>  # @mem: #optional memory size of this node (equally divide total memory among
>  #        nodes if omitted)
>  #
> +# @memdev: #optional memory backend object.  If specified for one node,
> +#          it must be specified for all nodes.
> +#
> +# @mem: #optional memory size of this node; mutually exclusive with @memdev.

No need to list @mem a second time, just add the mutual exclusion
comment to the first listing.


>  
>  DEF("numa", HAS_ARG, QEMU_OPTION_numa,
> -    "-numa node[,mem=size][,cpus=cpu[-cpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
> +    "-numa node[,mem=size][,memdev=id][,cpus=cpu[-cpu]][,nodeid=node]\n", QEMU_ARCH_ALL)

Since this is mutually exclusive, would it be better to split into two
lines, as in:

"-numa node[,mem=size][,cpus=cpu[-cpu]][,nodeid=node]\n"
"-numa node,memdev=id[,cpus=cpu[-cpu]][,nodeid=node]\n"

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev
  2014-03-04 17:37   ` Eric Blake
@ 2014-03-04 18:11     ` Paolo Bonzini
  2014-03-05  3:50       ` Hu Tao
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 18:11 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Il 04/03/2014 18:37, Eric Blake ha scritto:
> On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
>> From: Hu Tao <hutao@cn.fujitsu.com>
>>
>> Add qmp command query-memdev to query for information
>> of memory devices
>>
>> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
>> [Use QMP visitors instead of String visitors. - Paolo]
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  numa.c           | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  qapi-schema.json | 31 +++++++++++++++++++++++++++
>>  qmp-commands.hx  | 32 ++++++++++++++++++++++++++++
>>  3 files changed, 127 insertions(+)
>>
>
>> +++ b/qapi-schema.json
>> @@ -4576,3 +4576,34 @@
>>  ##
>>  { 'enum': 'HostMemPolicy',
>>    'data': [ 'default', 'preferred', 'membind', 'interleave' ] }
>> +
>> +##
>> +# @Memdev:
>> +#
>> +# Information of memory device
>> +#
>> +# @size: memory device size
>> +#
>> +# @host-nodes: host nodes for its memory policy
>> +#
>> +# @policy: memory policy of memory device
>> +#
>> +# Since: 2.1
>> +##
>> +
>> +{ 'type': 'Memdev',
>> +  'data': {
>> +    'size':       'size',
>> +    'host-nodes': ['uint16'],
>> +    'policy':     'str' }}
>
> Why is policy 'str', when you just defined 'HostMemPolicy' as an enum in
> the previous patch?  Should this be using the enum?

Good catch.

>
>> +<- { "return": [
>> +       {
>> +         "size": 536870912,
>> +         "host-nodes": [0, 1],
>> +         "policy": "bind"
>
> "bind" is not one of the values of HostMemPolicy - is that missing from
> patch 27/28?

Should be membind.

Paolo


* Re: [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend
  2014-03-04 17:38   ` Eric Blake
@ 2014-03-04 18:12     ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-04 18:12 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Il 04/03/2014 18:38, Eric Blake ha scritto:
> On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  backends/Makefile.objs  |   1 +
>>  backends/hostmem-file.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 109 insertions(+)
>>  create mode 100644 backends/hostmem-file.c
>>
>
>> +++ b/backends/hostmem-file.c
>> @@ -0,0 +1,108 @@
>> +/*
>> + * QEMU Host Memory Backend for hugetlbfs
>> + *
>> + * Copyright (C) 2013 Red Hat Inc
>
> It's 2014 now

The commit is still pointing to 2013.  I'll change it depending on 
whether it will be posted or sent directly with a pull request.

Paolo


* Re: [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev Paolo Bonzini
  2014-03-04 17:37   ` Eric Blake
@ 2014-03-05  3:48   ` Hu Tao
  1 sibling, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-05  3:48 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

<...>

> +
> +MemdevList *qmp_query_memdev(Error **errp)
> +{
> +    QmpOutputVisitor *ov = qmp_output_visitor_new();
> +    QmpInputVisitor *iv;
> +    QObject *obj;
> +    MemdevList *list = NULL, *m;
> +    HostMemoryBackend *backend;
> +    Error *err = NULL;
> +    int i;
> +
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        backend = numa_info[i].node_memdev;
> +
> +        m = g_malloc0(sizeof(*m));
> +        m->value = g_malloc0(sizeof(*m->value));
> +        m->value->size = object_property_get_int(OBJECT(backend), "size",
> +                                                 &err);
> +        if (err) {
> +            goto error;
> +        }
> +        m->value->policy = object_property_get_str(OBJECT(backend), "policy",
> +                                                   &err);
> +        if (err) {
> +            goto error;
> +        }
> +        object_property_get(OBJECT(backend), qmp_output_get_visitor(ov),
> +                            "host-nodes", &err);
> +        if (err) {
> +            goto error;
> +        }
> +        obj = qmp_output_get_qobject(ov);

Unlike the string output visitor, the QMP output visitor retains its
internal state, so it should be cleaned up at the end of every loop
iteration.

> +        iv = qmp_input_visitor_new(obj);
> +        qobject_decref(obj);
> +
> +        visit_type_uint16List(qmp_input_get_visitor(iv),
> +                              &m->value->host_nodes, NULL, &err);
> +        if (err) {
> +            qmp_input_visitor_cleanup(iv);
> +            goto error;
> +        }
> +
> +        m->next = list;
> +        list = m;
> +        qmp_input_visitor_cleanup(iv);
> +    }
> +
> +    qmp_output_visitor_cleanup(ov);
> +    return list;
> +
> +error:
> +    while (list) {
> +        m = list;
> +        list = list->next;
> +        g_free(m->value);
> +        g_free(m);
> +    }
> +    qerror_report_err(err);
> +    qmp_output_visitor_cleanup(ov);
> +    return NULL;
> +}


* Re: [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev
  2014-03-04 18:11     ` Paolo Bonzini
@ 2014-03-05  3:50       ` Hu Tao
  2014-03-05  8:17         ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Hu Tao @ 2014-03-05  3:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, a.motakis, imammedo, gaowanlong

On Tue, Mar 04, 2014 at 07:11:53PM +0100, Paolo Bonzini wrote:
> Il 04/03/2014 18:37, Eric Blake ha scritto:
> >On 03/04/2014 07:00 AM, Paolo Bonzini wrote:
> >>From: Hu Tao <hutao@cn.fujitsu.com>
> >>
> >>Add qmp command query-memdev to query for information
> >>of memory devices
> >>
> >>Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> >>[Use QMP visitors instead of String visitors. - Paolo]
> >>Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >>---
> >> numa.c           | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> qapi-schema.json | 31 +++++++++++++++++++++++++++
> >> qmp-commands.hx  | 32 ++++++++++++++++++++++++++++
> >> 3 files changed, 127 insertions(+)
> >>
> >
> >>+++ b/qapi-schema.json
> >>@@ -4576,3 +4576,34 @@
> >> ##
> >> { 'enum': 'HostMemPolicy',
> >>   'data': [ 'default', 'preferred', 'membind', 'interleave' ] }
> >>+
> >>+##
> >>+# @Memdev:
> >>+#
> >>+# Information of memory device
> >>+#
> >>+# @size: memory device size
> >>+#
> >>+# @host-nodes: host nodes for its memory policy
> >>+#
> >>+# @policy: memory policy of memory device
> >>+#
> >>+# Since: 2.1
> >>+##
> >>+
> >>+{ 'type': 'Memdev',
> >>+  'data': {
> >>+    'size':       'size',
> >>+    'host-nodes': ['uint16'],
> >>+    'policy':     'str' }}
> >
> >Why is policy 'str', when you just defined 'HostMemPolicy' as an enum in
> >the previous patch?  Should this be using the enum?
> 
> Good catch.
> 
> >
> >>+<- { "return": [
> >>+       {
> >>+         "size": 536870912,
> >>+         "host-nodes": [0, 1],
> >>+         "policy": "bind"
> >
> >"bind" is not one of the values of HostMemPolicy - is that missing from
> >patch 27/28?
> 
> Should be membind.

How about changing it to bind, to stay in line with MPOL_BIND?


* Re: [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev
  2014-03-05  3:50       ` Hu Tao
@ 2014-03-05  8:17         ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-05  8:17 UTC (permalink / raw)
  To: Hu Tao; +Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

Il 05/03/2014 04:50, Hu Tao ha scritto:
>>> > >"bind" is not one of the values of HostMemPolicy - is that missing from
>>> > >patch 27/28?
>> >
>> > Should be membind.
> How about change it into bind to keep in line with MPOL_BIND?
>
>
>

Sure!

Paolo


* Re: [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts Paolo Bonzini
@ 2014-03-05 10:06   ` Andreas Färber
  2014-03-05 10:31     ` Paolo Bonzini
  2014-03-05 15:09     ` Igor Mammedov
  0 siblings, 2 replies; 70+ messages in thread
From: Andreas Färber @ 2014-03-05 10:06 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, imammedo
  Cc: hutao, mtosatti, ehabkost, gaowanlong, a.motakis

Am 04.03.2014 15:00, schrieb Paolo Bonzini:
> From: Igor Mammedov <imammedo@redhat.com>
> 
> Adds option to -m
>  "mem" - startup memory amount

Sorry for jumping in late, but assuming that -m is for "memory" already,
wouldn't it make more sense to name it "size" instead of "mem"?

> 
> For compatibility with legacy CLI if suffix-less number is passed,
> it assumes amount in Mb.
> 
> Otherwise user is free to use suffixed number using suffixes b,k/K,M,G
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qemu-options.hx |  9 +++++---
>  vl.c            | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------
>  2 files changed, 63 insertions(+), 16 deletions(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index f948f28..98e78ca 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -214,10 +214,13 @@ use is discouraged as it may be removed from future versions.
>  ETEXI
>  
>  DEF("m", HAS_ARG, QEMU_OPTION_m,
> -    "-m megs         set virtual RAM size to megs MB [default="
> -    stringify(DEFAULT_RAM_SIZE) "]\n", QEMU_ARCH_ALL)
> +    "-m [mem=]megs\n"
> +    "                configure guest RAM\n"
> +    "                mem: initial amount of guest memory (default: "
> +    stringify(DEFAULT_RAM_SIZE) "MiB)\n",
> +    QEMU_ARCH_ALL)
>  STEXI
> -@item -m @var{megs}
> +@item -m [mem=]@var{megs}
>  @findex -m
>  Set virtual RAM size to @var{megs} megabytes. Default is 128 MiB.  Optionally,
>  a suffix of ``M'' or ``G'' can be used to signify a value in megabytes or
> diff --git a/vl.c b/vl.c
> index dafe6f6..ac5f425 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -478,6 +478,20 @@ static QemuOptsList qemu_msg_opts = {
>      },
>  };
>  
> +static QemuOptsList qemu_mem_opts = {
> +    .name = "memory",
> +    .implied_opt_name = "mem",
> +    .head = QTAILQ_HEAD_INITIALIZER(qemu_mem_opts.head),
> +    .merge_lists = true,
> +    .desc = {
> +        {
> +            .name = "mem",
> +            .type = QEMU_OPT_SIZE,
> +        },
> +        { /* end of list */ }
> +    },
> +};
> +
>  /**
>   * Get machine options
>   *
> @@ -2718,6 +2732,7 @@ int main(int argc, char **argv, char **envp)
>      };
>      const char *trace_events = NULL;
>      const char *trace_file = NULL;
> +    const ram_addr_t default_ram_size = (ram_addr_t)DEFAULT_RAM_SIZE * 1024 * 1024;
>  
>      atexit(qemu_run_exit_notifiers);
>      error_set_progname(argv[0]);
> @@ -2758,6 +2773,7 @@ int main(int argc, char **argv, char **envp)
>      qemu_add_opts(&qemu_realtime_opts);
>      qemu_add_opts(&qemu_msg_opts);
>      qemu_add_opts(&qemu_numa_opts);
> +    qemu_add_opts(&qemu_mem_opts);
>  
>      runstate_init();
>  
> @@ -2773,7 +2789,7 @@ int main(int argc, char **argv, char **envp)
>      module_call_init(MODULE_INIT_MACHINE);
>      machine = find_default_machine();
>      cpu_model = NULL;
> -    ram_size = 0;
> +    ram_size = default_ram_size;
>      snapshot = 0;
>      cyls = heads = secs = 0;
>      translation = BIOS_ATA_TRANSLATION_AUTO;
> @@ -3063,20 +3079,50 @@ int main(int argc, char **argv, char **envp)
>                  exit(0);
>                  break;
>              case QEMU_OPTION_m: {
> -                int64_t value;
>                  uint64_t sz;
> -                char *end;
> +                const char *mem_str;
>  
> -                value = strtosz(optarg, &end);
> -                if (value < 0 || *end) {
> -                    fprintf(stderr, "qemu: invalid ram size: %s\n", optarg);
> -                    exit(1);
> +                opts = qemu_opts_parse(qemu_find_opts("memory"),
> +                                       optarg, 1);
> +                if (!opts) {
> +                    exit(EXIT_FAILURE);
> +                }
> +
> +                mem_str = qemu_opt_get(opts, "mem");
> +                if (!mem_str) {
> +                    fprintf(stderr, "qemu: invalid -m option, missing "
> +                            "'mem' option\n");

error_report(), in particular to fix "qemu: "

> +                    exit(EXIT_FAILURE);
> +                }
> +                if (!*mem_str) {
> +                    fprintf(stderr, "qemu: missing 'mem' option value\n");

error_report()

> +                    exit(EXIT_FAILURE);
> +                }
> +
> +                sz = qemu_opt_get_size(opts, "mem", ram_size);
> +
> +                /* Fix up legacy suffix-less format */
> +                if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) {
> +                    uint64_t overflow_check = sz;
> +
> +                    sz <<= 20;
> +                    if ((sz >> 20) != overflow_check) {
> +                        fprintf(stderr, "qemu: too large 'mem' option "
> +                                "value\n");

error_report()

> +                        exit(EXIT_FAILURE);
> +                    }
> +                }
> +
> +                /* backward compatibility behaviour for case "-m 0" */
> +                if (sz == 0) {
> +                    sz = default_ram_size;
>                  }
> -                sz = QEMU_ALIGN_UP((uint64_t)value, 8192);
> +
> +                sz = QEMU_ALIGN_UP(sz, 8192);
>                  ram_size = sz;
>                  if (ram_size != sz) {
>                      fprintf(stderr, "qemu: ram size too large\n");

error_report() while at it?

> -                    exit(1);
> +                    exit(EXIT_FAILURE);
>                  }
>                  break;
>              }
> @@ -3921,10 +3967,8 @@ int main(int argc, char **argv, char **envp)
>          exit(1);
>      }
>  
> -    /* init the memory */
> -    if (ram_size == 0) {
> -        ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
> -    }
> +    /* store value for the future use */
> +    qemu_opt_set_number(qemu_find_opts_singleton("memory"), "mem", ram_size);
>  
>      if (qemu_opts_foreach(qemu_find_opts("device"), device_help_func, NULL, 0)
>          != 0) {

This depends on the preceding patch.  Other than that, could these two
be applied independently of the rest via qemu-trivial?

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton Paolo Bonzini
@ 2014-03-05 10:08   ` Andreas Färber
  2014-03-07  2:27   ` Hu Tao
  2014-03-11 18:55   ` Eduardo Habkost
  2 siblings, 0 replies; 70+ messages in thread
From: Andreas Färber @ 2014-03-05 10:08 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

Am 04.03.2014 15:00, schrieb Paolo Bonzini:
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Andreas Färber <afaerber@suse.de>

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts
  2014-03-05 10:06   ` Andreas Färber
@ 2014-03-05 10:31     ` Paolo Bonzini
  2014-03-05 15:09     ` Igor Mammedov
  1 sibling, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-05 10:31 UTC (permalink / raw)
  To: Andreas Färber, qemu-devel, imammedo
  Cc: hutao, mtosatti, ehabkost, gaowanlong, a.motakis

Il 05/03/2014 11:06, Andreas Färber ha scritto:
> Am 04.03.2014 15:00, schrieb Paolo Bonzini:
>> From: Igor Mammedov <imammedo@redhat.com>
>>
>> Adds option to -m
>>  "mem" - startup memory amount
>
> Sorry for jumping in late, but assuming that -m is for "memory" already,
> wouldn't it make more sense to name it "size" instead of "mem"?

Sure.

Paolo

>>
>> For compatibility with legacy CLI if suffix-less number is passed,
>> it assumes amount in Mb.
>>
>> Otherwise user is free to use suffixed number using suffixes b,k/K,M,G
>>
>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  qemu-options.hx |  9 +++++---
>>  vl.c            | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------
>>  2 files changed, 63 insertions(+), 16 deletions(-)
>>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index f948f28..98e78ca 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -214,10 +214,13 @@ use is discouraged as it may be removed from future versions.
>>  ETEXI
>>
>>  DEF("m", HAS_ARG, QEMU_OPTION_m,
>> -    "-m megs         set virtual RAM size to megs MB [default="
>> -    stringify(DEFAULT_RAM_SIZE) "]\n", QEMU_ARCH_ALL)
>> +    "-m [mem=]megs\n"
>> +    "                configure guest RAM\n"
>> +    "                mem: initial amount of guest memory (default: "
>> +    stringify(DEFAULT_RAM_SIZE) "MiB)\n",
>> +    QEMU_ARCH_ALL)
>>  STEXI
>> -@item -m @var{megs}
>> +@item -m [mem=]@var{megs}
>>  @findex -m
>>  Set virtual RAM size to @var{megs} megabytes. Default is 128 MiB.  Optionally,
>>  a suffix of ``M'' or ``G'' can be used to signify a value in megabytes or
>> diff --git a/vl.c b/vl.c
>> index dafe6f6..ac5f425 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -478,6 +478,20 @@ static QemuOptsList qemu_msg_opts = {
>>      },
>>  };
>>
>> +static QemuOptsList qemu_mem_opts = {
>> +    .name = "memory",
>> +    .implied_opt_name = "mem",
>> +    .head = QTAILQ_HEAD_INITIALIZER(qemu_mem_opts.head),
>> +    .merge_lists = true,
>> +    .desc = {
>> +        {
>> +            .name = "mem",
>> +            .type = QEMU_OPT_SIZE,
>> +        },
>> +        { /* end of list */ }
>> +    },
>> +};
>> +
>>  /**
>>   * Get machine options
>>   *
>> @@ -2718,6 +2732,7 @@ int main(int argc, char **argv, char **envp)
>>      };
>>      const char *trace_events = NULL;
>>      const char *trace_file = NULL;
>> +    const ram_addr_t default_ram_size = (ram_addr_t)DEFAULT_RAM_SIZE * 1024 * 1024;
>>
>>      atexit(qemu_run_exit_notifiers);
>>      error_set_progname(argv[0]);
>> @@ -2758,6 +2773,7 @@ int main(int argc, char **argv, char **envp)
>>      qemu_add_opts(&qemu_realtime_opts);
>>      qemu_add_opts(&qemu_msg_opts);
>>      qemu_add_opts(&qemu_numa_opts);
>> +    qemu_add_opts(&qemu_mem_opts);
>>
>>      runstate_init();
>>
>> @@ -2773,7 +2789,7 @@ int main(int argc, char **argv, char **envp)
>>      module_call_init(MODULE_INIT_MACHINE);
>>      machine = find_default_machine();
>>      cpu_model = NULL;
>> -    ram_size = 0;
>> +    ram_size = default_ram_size;
>>      snapshot = 0;
>>      cyls = heads = secs = 0;
>>      translation = BIOS_ATA_TRANSLATION_AUTO;
>> @@ -3063,20 +3079,50 @@ int main(int argc, char **argv, char **envp)
>>                  exit(0);
>>                  break;
>>              case QEMU_OPTION_m: {
>> -                int64_t value;
>>                  uint64_t sz;
>> -                char *end;
>> +                const char *mem_str;
>>
>> -                value = strtosz(optarg, &end);
>> -                if (value < 0 || *end) {
>> -                    fprintf(stderr, "qemu: invalid ram size: %s\n", optarg);
>> -                    exit(1);
>> +                opts = qemu_opts_parse(qemu_find_opts("memory"),
>> +                                       optarg, 1);
>> +                if (!opts) {
>> +                    exit(EXIT_FAILURE);
>> +                }
>> +
>> +                mem_str = qemu_opt_get(opts, "mem");
>> +                if (!mem_str) {
>> +                    fprintf(stderr, "qemu: invalid -m option, missing "
>> +                            "'mem' option\n");
>
> error_report(), in particular to fix "qemu: "
>
>> +                    exit(EXIT_FAILURE);
>> +                }
>> +                if (!*mem_str) {
>> +                    fprintf(stderr, "qemu: missing 'mem' option value\n");
>
> error_report()
>
>> +                    exit(EXIT_FAILURE);
>> +                }
>> +
>> +                sz = qemu_opt_get_size(opts, "mem", ram_size);
>> +
>> +                /* Fix up legacy suffix-less format */
>> +                if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) {
>> +                    uint64_t overflow_check = sz;
>> +
>> +                    sz <<= 20;
>> +                    if ((sz >> 20) != overflow_check) {
>> +                        fprintf(stderr, "qemu: too large 'mem' option "
>> +                                "value\n");
>
> error_report()
>
>> +                        exit(EXIT_FAILURE);
>> +                    }
>> +                }
>> +
>> +                /* backward compatibility behaviour for case "-m 0" */
>> +                if (sz == 0) {
>> +                    sz = default_ram_size;
>>                  }
>> -                sz = QEMU_ALIGN_UP((uint64_t)value, 8192);
>> +
>> +                sz = QEMU_ALIGN_UP(sz, 8192);
>>                  ram_size = sz;
>>                  if (ram_size != sz) {
>>                      fprintf(stderr, "qemu: ram size too large\n");
>
> error_report() while at it?
>
>> -                    exit(1);
>> +                    exit(EXIT_FAILURE);
>>                  }
>>                  break;
>>              }
>> @@ -3921,10 +3967,8 @@ int main(int argc, char **argv, char **envp)
>>          exit(1);
>>      }
>>
>> -    /* init the memory */
>> -    if (ram_size == 0) {
>> -        ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
>> -    }
>> +    /* store value for the future use */
>> +    qemu_opt_set_number(qemu_find_opts_singleton("memory"), "mem", ram_size);
>>
>>      if (qemu_opts_foreach(qemu_find_opts("device"), device_help_func, NULL, 0)
>>          != 0) {
>
> Here's a dependency on the preceding patch - other than that, these two
> could be applied independently of the rest via qemu-trivial?
>
> Regards,
> Andreas
>
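
For readers following the review, the tail of the QEMU_OPTION_m case
quoted above (the "-m 0" fallback, the 8192 alignment and the "ram size
too large" check) can be sketched in isolation. This is only a sketch:
the toy_* names and the TOY_* macros are made up for illustration,
toy_ram_addr_t is assumed to be 32 bits wide purely so that truncation
is easy to trigger, and the extra guard against wrap-around in the
alignment step is an addition that is not in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for illustration only; these are not QEMU's definitions. */
#define TOY_DEFAULT_RAM_MIB 128
#define TOY_ALIGN_UP(n, m) ((((n) + (m) - 1) / (m)) * (m))

/* Assumed narrow on purpose, so that the truncation check can fail. */
typedef uint32_t toy_ram_addr_t;

/*
 * Apply the "-m 0" fallback, align to 8192 bytes and store the result.
 * Returns false, leaving *ram_size untouched, if the size does not fit.
 */
static bool toy_finish_ram_size(uint64_t sz, toy_ram_addr_t *ram_size)
{
    toy_ram_addr_t narrowed;

    /* Backward compatibility: "-m 0" selects the default size. */
    if (sz == 0) {
        sz = (uint64_t)TOY_DEFAULT_RAM_MIB << 20;
    }

    /* Not in the patch: refuse sizes whose alignment would wrap around. */
    if (sz > UINT64_MAX - 8191) {
        return false;
    }
    sz = TOY_ALIGN_UP(sz, (uint64_t)8192);

    /* Same idea as "ram_size = sz; if (ram_size != sz)" in the patch. */
    narrowed = (toy_ram_addr_t)sz;
    if (narrowed != sz) {
        return false;
    }
    *ram_size = narrowed;
    return true;
}
```

A caller would report the error and exit when the helper returns false,
which is where the error_report() suggestions above would apply.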


* Re: [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
  2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
                   ` (27 preceding siblings ...)
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev Paolo Bonzini
@ 2014-03-05 11:05 ` Andreas Färber
  2014-03-05 11:30   ` Paolo Bonzini
  28 siblings, 1 reply; 70+ messages in thread
From: Andreas Färber @ 2014-03-05 11:05 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, Chen Fan
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

On 04.03.2014 15:00, Paolo Bonzini wrote:
> This series includes all the pending work on QOMifying the memory
> backends.
[snip]

There's also a recent RFC from Chen Fan about how to model the
association between NUMA nodes and CPU socket/core/thread that
would/should influence this series if we're aiming for 2.1 now.

I didn't review it in-depth yet, but minor technical issues apart, I
think we need to keep NUMA and CPU separate, which then brings up the
question Chen Fan asked about whether we need to support splitting CPU
threads of one core or CPU cores of one socket onto different NUMA
nodes. If we can stop supporting this, 2.0 would be a good point in time
to catch this with an error message at least, even if the remodeling
depending on it happens post-2.0.

For example, assuming associativity at CPU socket level, I'd imagine:

/machine
  /numa-node[0]
    cpu[0] -> /machine/unassigned/device[0]
  /unassigned
    /device[0]
      /core[0]
        /thread[0]

I.e. the CPU socket is a self-contained object in /machine/unassigned
when machine-created or in /machine/peripheral when device_add'ed.

Hotplug will need a link<> property somewhere, and for a standard PC
this can be on /machine (Qseven modules, SoCs etc. would require it a
level down on their container but we don't support those yet). Having
the link<cpu>s on the NUMA nodes corresponds to having multiple buses in
the qdev world.

Compare that to:

/machine
  /node[0] # This is not really telling!
    /socket[0]
      /core[0]
        /thread[0] # So CPUState != thread?
          cpu -> /machine/unassigned/device[0]
  /unassigned
    /device[0]

Note that according to my interpretation of QOM ABI stability rules we
can't just turn a link<cpu> into a child<cpu> without renaming, thus
trying to be forward-looking for where we want to go design-wise.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
  2014-03-05 11:05 ` [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Andreas Färber
@ 2014-03-05 11:30   ` Paolo Bonzini
  2014-03-07 11:59     ` Andreas Färber
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-05 11:30 UTC (permalink / raw)
  To: Andreas Färber, qemu-devel, Chen Fan
  Cc: ehabkost, hutao, mtosatti, imammedo, a.motakis, gaowanlong

On 05/03/2014 12:05, Andreas Färber wrote:
> On 04.03.2014 15:00, Paolo Bonzini wrote:
>> This series includes all the pending work on QOMifying the memory
>> backends.
> [snip]
>
> There's also a recent RFC from Chen Fan about how to model the
> association between NUMA nodes and CPU socket/core/thread that
> would/should influence this series if we're aiming for 2.1 now.

I don't think it should, apart from conflicts.  This series only changes 
things about memory.  CPUs are handled the same before and after the 
patches.

> I didn't review it in-depth yet, but minor technical issues apart, I
> think we need to keep NUMA and CPU separate,

I agree.

> Compare that to:
>
> /machine
>   /node[0] # This is not really telling!
>     /socket[0]
>       /core[0]
>         /thread[0] # So CPUState != thread?
>           cpu -> /machine/unassigned/device[0]
>   /unassigned
>     /device[0]

I think this is better; in our world we can have multiple sockets in the 
same NUMA node.  But CPUState == thread, so you can have just /thread[0] 
-> /machine/unassigned/device[0].

Alternatively, and to keep CPU + NUMA even *more* separate:

   /machine
     /node[0]
        /cpu[0] -> /machine/unassigned/device[0]
        ...
     /socket[0]
        /core[0]
           /thread[0] -> /machine/unassigned/device[0]
     /unassigned
        /device[0]

> which then brings up the
> question Chen Fan asked about whether we need to support splitting CPU
> threads of one core or CPU cores of one socket onto different NUMA
> nodes. If we can stop supporting this, 2.0 would be a good point in time
> to catch this with an error message at least, even if the remodeling
> depending on it happens post-2.0.

> Note that according to my interpretation of QOM ABI stability rules we
> can't just turn a link<cpu> into a child<cpu> without renaming, thus
> trying to be forward-looking for where we want to go design-wise.

I think we can.  Children and links look exactly the same from the outside.
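
As a toy model of the path-resolution part of this argument, the sketch
below resolves a path one component at a time without ever looking at
whether a property is a child or a link. Everything here (ToyObject,
ToyProp, toy_resolve_path) is invented for illustration and is not the
QOM implementation. It also does not model anything else a client might
observe about a property, such as its type string, which is where the
ABI question raised above would come in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; not QOM's real data structures. */
typedef enum { TOY_CHILD, TOY_LINK } ToyKind;

typedef struct ToyObject ToyObject;

typedef struct ToyProp {
    const char *name;
    ToyKind kind;        /* recorded, but never consulted when resolving */
    ToyObject *target;
} ToyProp;

struct ToyObject {
    const ToyProp *props;
    size_t nprops;
};

/* Resolve a single path component of length len below obj. */
static ToyObject *toy_resolve_one(const ToyObject *obj,
                                  const char *name, size_t len)
{
    for (size_t i = 0; i < obj->nprops; i++) {
        const ToyProp *p = &obj->props[i];

        if (strlen(p->name) == len && memcmp(p->name, name, len) == 0) {
            return p->target;
        }
    }
    return NULL;
}

/* Walk a '/'-separated path from root; returns NULL if a part is missing. */
static ToyObject *toy_resolve_path(ToyObject *root, const char *path)
{
    ToyObject *obj = root;

    while (obj && *path) {
        size_t len;

        while (*path == '/') {
            path++;
        }
        len = strcspn(path, "/");
        if (len == 0) {
            break;
        }
        obj = toy_resolve_one(obj, path, len);
        path += len;
    }
    return obj;
}
```

With a tree shaped like the one above, "/node[0]/cpu[0]" (a link) and
"/unassigned/device[0]" (a child) resolve to the same object.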

Paolo


* Re: [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts
  2014-03-05 10:06   ` Andreas Färber
  2014-03-05 10:31     ` Paolo Bonzini
@ 2014-03-05 15:09     ` Igor Mammedov
  1 sibling, 0 replies; 70+ messages in thread
From: Igor Mammedov @ 2014-03-05 15:09 UTC (permalink / raw)
  To: Andreas Färber
  Cc: ehabkost, hutao, mtosatti, qemu-devel, a.motakis, Paolo Bonzini,
	gaowanlong

On Wed, 05 Mar 2014 11:06:18 +0100
Andreas Färber <afaerber@suse.de> wrote:

> On 04.03.2014 15:00, Paolo Bonzini wrote:
> > From: Igor Mammedov <imammedo@redhat.com>
> > 
> > Adds option to -m
> >  "mem" - startup memory amount
> 
> Sorry for jumping in late, but assuming that -m is for "memory" already,
> wouldn't it make more sense to name it "size" instead of "mem"?
> 
> > 
> > For compatibility with the legacy CLI, a suffix-less number is
> > interpreted as an amount in MiB.
> > 
> > Otherwise the user is free to pass a number with one of the suffixes
> > b, k/K, M, G.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > Reviewed-by: Eric Blake <eblake@redhat.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  qemu-options.hx |  9 +++++---
> >  vl.c            | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------
> >  2 files changed, 63 insertions(+), 16 deletions(-)
> > 
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index f948f28..98e78ca 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -214,10 +214,13 @@ use is discouraged as it may be removed from future versions.
> >  ETEXI
> >  
> >  DEF("m", HAS_ARG, QEMU_OPTION_m,
> > -    "-m megs         set virtual RAM size to megs MB [default="
> > -    stringify(DEFAULT_RAM_SIZE) "]\n", QEMU_ARCH_ALL)
> > +    "-m [mem=]megs\n"
> > +    "                configure guest RAM\n"
> > +    "                mem: initial amount of guest memory (default: "
> > +    stringify(DEFAULT_RAM_SIZE) "MiB)\n",
> > +    QEMU_ARCH_ALL)
> >  STEXI
> > -@item -m @var{megs}
> > +@item -m [mem=]@var{megs}
> >  @findex -m
> >  Set virtual RAM size to @var{megs} megabytes. Default is 128 MiB.  Optionally,
> >  a suffix of ``M'' or ``G'' can be used to signify a value in megabytes or
> > diff --git a/vl.c b/vl.c
> > index dafe6f6..ac5f425 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -478,6 +478,20 @@ static QemuOptsList qemu_msg_opts = {
> >      },
> >  };
> >  
> > +static QemuOptsList qemu_mem_opts = {
> > +    .name = "memory",
> > +    .implied_opt_name = "mem",
> > +    .head = QTAILQ_HEAD_INITIALIZER(qemu_mem_opts.head),
> > +    .merge_lists = true,
> > +    .desc = {
> > +        {
> > +            .name = "mem",
> > +            .type = QEMU_OPT_SIZE,
> > +        },
> > +        { /* end of list */ }
> > +    },
> > +};
> > +
> >  /**
> >   * Get machine options
> >   *
> > @@ -2718,6 +2732,7 @@ int main(int argc, char **argv, char **envp)
> >      };
> >      const char *trace_events = NULL;
> >      const char *trace_file = NULL;
> > +    const ram_addr_t default_ram_size = (ram_addr_t)DEFAULT_RAM_SIZE * 1024 * 1024;
> >  
> >      atexit(qemu_run_exit_notifiers);
> >      error_set_progname(argv[0]);
> > @@ -2758,6 +2773,7 @@ int main(int argc, char **argv, char **envp)
> >      qemu_add_opts(&qemu_realtime_opts);
> >      qemu_add_opts(&qemu_msg_opts);
> >      qemu_add_opts(&qemu_numa_opts);
> > +    qemu_add_opts(&qemu_mem_opts);
> >  
> >      runstate_init();
> >  
> > @@ -2773,7 +2789,7 @@ int main(int argc, char **argv, char **envp)
> >      module_call_init(MODULE_INIT_MACHINE);
> >      machine = find_default_machine();
> >      cpu_model = NULL;
> > -    ram_size = 0;
> > +    ram_size = default_ram_size;
> >      snapshot = 0;
> >      cyls = heads = secs = 0;
> >      translation = BIOS_ATA_TRANSLATION_AUTO;
> > @@ -3063,20 +3079,50 @@ int main(int argc, char **argv, char **envp)
> >                  exit(0);
> >                  break;
> >              case QEMU_OPTION_m: {
> > -                int64_t value;
> >                  uint64_t sz;
> > -                char *end;
> > +                const char *mem_str;
> >  
> > -                value = strtosz(optarg, &end);
> > -                if (value < 0 || *end) {
> > -                    fprintf(stderr, "qemu: invalid ram size: %s\n", optarg);
> > -                    exit(1);
> > +                opts = qemu_opts_parse(qemu_find_opts("memory"),
> > +                                       optarg, 1);
> > +                if (!opts) {
> > +                    exit(EXIT_FAILURE);
> > +                }
> > +
> > +                mem_str = qemu_opt_get(opts, "mem");
> > +                if (!mem_str) {
> > +                    fprintf(stderr, "qemu: invalid -m option, missing "
> > +                            "'mem' option\n");
> 
> error_report(), in particular to fix "qemu: "
> 
> > +                    exit(EXIT_FAILURE);
> > +                }
> > +                if (!*mem_str) {
> > +                    fprintf(stderr, "qemu: missing 'mem' option value\n");
> 
> error_report()
> 
> > +                    exit(EXIT_FAILURE);
> > +                }
> > +
> > +                sz = qemu_opt_get_size(opts, "mem", ram_size);
> > +
> > +                /* Fix up legacy suffix-less format */
> > +                if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) {
> > +                    uint64_t overflow_check = sz;
> > +
> > +                    sz <<= 20;
> > +                    if ((sz >> 20) != overflow_check) {
> > +                        fprintf(stderr, "qemu: too large 'mem' option "
> > +                                "value\n");
> 
> error_report()
> 
> > +                        exit(EXIT_FAILURE);
> > +                    }
> > +                }
> > +
> > +                /* backward compatibility behaviour for case "-m 0" */
> > +                if (sz == 0) {
> > +                    sz = default_ram_size;
> >                  }
> > -                sz = QEMU_ALIGN_UP((uint64_t)value, 8192);
> > +
> > +                sz = QEMU_ALIGN_UP(sz, 8192);
> >                  ram_size = sz;
> >                  if (ram_size != sz) {
> >                      fprintf(stderr, "qemu: ram size too large\n");
> 
> error_report() while at it?
> 
> > -                    exit(1);
> > +                    exit(EXIT_FAILURE);
> >                  }
> >                  break;
> >              }
> > @@ -3921,10 +3967,8 @@ int main(int argc, char **argv, char **envp)
> >          exit(1);
> >      }
> >  
> > -    /* init the memory */
> > -    if (ram_size == 0) {
> > -        ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
> > -    }
> > +    /* store value for the future use */
> > +    qemu_opt_set_number(qemu_find_opts_singleton("memory"), "mem", ram_size);
> >  
> >      if (qemu_opts_foreach(qemu_find_opts("device"), device_help_func, NULL, 0)
> >          != 0) {
> 
> Here's a dependency on the preceding patch - other than that, these two
> could be applied independently of the rest via qemu-trivial?
I'll amend the patch according to your notes and resubmit it together with
its only dependency, CCing trivial-patches this time.
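
For reference, the legacy suffix-less fix-up from the quoted hunk can be
exercised on its own. The sketch below is hedged: fixup_legacy_mem_size
is a hypothetical helper and not part of the patch, the option parsing
itself is not modelled (the caller passes the value that was already
parsed, in bytes), and the standard isdigit() stands in for
g_ascii_isdigit().

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: apply the legacy "-m" rule
 * to a size that has already been parsed from mem_str.
 * Returns false if mem_str is empty or if scaling would overflow 64 bits.
 */
static bool fixup_legacy_mem_size(const char *mem_str, uint64_t *sz)
{
    size_t len = strlen(mem_str);

    /* An empty string has no last character to inspect. */
    if (len == 0) {
        return false;
    }

    /* A trailing digit means no suffix was given: treat the value as MiB. */
    if (isdigit((unsigned char)mem_str[len - 1])) {
        uint64_t before = *sz;
        uint64_t scaled = before << 20;

        /* Shifting back must reproduce the input, else bits were lost. */
        if ((scaled >> 20) != before) {
            return false;
        }
        *sz = scaled;
    }
    return true;
}
```

So "512" with a parsed value of 512 becomes 512 MiB, while "1G" is left
alone because its last character is not a digit.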

> 
> Regards,
> Andreas
> 


* Re: [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton Paolo Bonzini
  2014-03-05 10:08   ` Andreas Färber
@ 2014-03-07  2:27   ` Hu Tao
  2014-03-11 18:55   ` Eduardo Habkost
  2 siblings, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  2:27 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:35PM +0100, Paolo Bonzini wrote:
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qemu/config-file.h |  2 ++
>  util/qemu-config.c         | 14 ++++++++++++++
>  vl.c                       | 11 +----------
>  3 files changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/include/qemu/config-file.h b/include/qemu/config-file.h
> index dbd97c4..d4ba20e 100644
> --- a/include/qemu/config-file.h
> +++ b/include/qemu/config-file.h
> @@ -8,6 +8,8 @@
>  
>  QemuOptsList *qemu_find_opts(const char *group);
>  QemuOptsList *qemu_find_opts_err(const char *group, Error **errp);
> +QemuOpts *qemu_find_opts_singleton(const char *group);
> +
>  void qemu_add_opts(QemuOptsList *list);
>  void qemu_add_drive_opts(QemuOptsList *list);
>  int qemu_set_option(const char *str);
> diff --git a/util/qemu-config.c b/util/qemu-config.c
> index f610101..60051df 100644
> --- a/util/qemu-config.c
> +++ b/util/qemu-config.c
> @@ -39,6 +39,20 @@ QemuOptsList *qemu_find_opts(const char *group)
>      return ret;
>  }
>  
> +QemuOpts *qemu_find_opts_singleton(const char *group)
> +{
> +    QemuOptsList *list;
> +    QemuOpts *opts;
> +
> +    list = qemu_find_opts(group);
> +    assert(list);
> +    opts = qemu_opts_find(list, NULL);
> +    if (!opts) {
> +        opts = qemu_opts_create(list, NULL, 0, &error_abort);
> +    }
> +    return opts;
> +}
> +
>  static CommandLineParameterInfoList *query_option_descs(const QemuOptDesc *desc)
>  {
>      CommandLineParameterInfoList *param_list = NULL, *entry;
> diff --git a/vl.c b/vl.c
> index 899b63f..dafe6f6 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -485,16 +485,7 @@ static QemuOptsList qemu_msg_opts = {
>   */
>  QemuOpts *qemu_get_machine_opts(void)
>  {
> -    QemuOptsList *list;
> -    QemuOpts *opts;
> -
> -    list = qemu_find_opts("machine");
> -    assert(list);
> -    opts = qemu_opts_find(list, NULL);
> -    if (!opts) {
> -        opts = qemu_opts_create(list, NULL, 0, &error_abort);
> -    }
> -    return opts;
> +    return qemu_find_opts_singleton("machine");
>  }
>  
>  const char *qemu_get_vm_name(void)
> -- 
> 1.8.5.3
> 

This patch itself has no problem, so:

Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>

BTW, why not let qemu_opt_find() return NULL for a NULL QemuOpts? Then
we can avoid creating an empty QemuOpts.
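
A minimal sketch of what that could look like, using made-up stand-in
types (ToyOpt, ToyOpts) rather than the real QemuOpts code: the lookup
treats a NULL option set as "nothing found", so a getter with a default
needs no empty object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types for illustration only; these are not QEMU's definitions. */
typedef struct ToyOpt {
    const char *name;
    const char *value;
    struct ToyOpt *next;
} ToyOpt;

typedef struct ToyOpts {
    ToyOpt *head;
} ToyOpts;

/* Look up an option by name; a NULL opts simply yields "not found". */
static const ToyOpt *toy_opt_find(const ToyOpts *opts, const char *name)
{
    const ToyOpt *opt;

    if (!opts) {
        return NULL;
    }
    for (opt = opts->head; opt; opt = opt->next) {
        if (strcmp(opt->name, name) == 0) {
            return opt;
        }
    }
    return NULL;
}

/* Getter with a default, so readers need no empty ToyOpts to exist. */
static const char *toy_opt_get(const ToyOpts *opts, const char *name,
                               const char *defval)
{
    const ToyOpt *opt = toy_opt_find(opts, name);

    return opt ? opt->value : defval;
}
```

This only covers readers. A caller that stores a value, as patch 08 does
with qemu_opt_set_number() on the "memory" singleton, would still need
an object to write into.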


* Re: [Qemu-devel] [PATCH 2.1 09/28] vl: redo -object parsing
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 09/28] vl: redo -object parsing Paolo Bonzini
@ 2014-03-07  2:56   ` Hu Tao
  2014-03-07  7:39     ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Hu Tao @ 2014-03-07  2:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:37PM +0100, Paolo Bonzini wrote:
> Follow the lines of the HMP implementation, using OptsVisitor
> to parse the options.  This gives access to OptsVisitor's
> rich parsing of integer lists.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  vl.c | 87 +++++++++++++++++++++++++++-----------------------------------------
>  1 file changed, 35 insertions(+), 52 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index ac5f425..e8709ee 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -119,8 +119,7 @@ int main(int argc, char **argv)
>  #include "qemu/osdep.h"
>  
>  #include "ui/qemu-spice.h"
> -#include "qapi/string-input-visitor.h"
> -#include "qom/object_interfaces.h"
> +#include "qapi/opts-visitor.h"
>  
>  #define DEFAULT_RAM_SIZE 128
>  
> @@ -2629,69 +2628,53 @@ static void free_and_trace(gpointer mem)
>      free(mem);
>  }
>  
> -static int object_set_property(const char *name, const char *value, void *opaque)
> -{
> -    Object *obj = OBJECT(opaque);
> -    StringInputVisitor *siv;
> -    Error *local_err = NULL;
> -
> -    if (strcmp(name, "qom-type") == 0 || strcmp(name, "id") == 0) {
> -        return 0;
> -    }
> -
> -    siv = string_input_visitor_new(value);
> -    object_property_set(obj, string_input_get_visitor(siv), name, &local_err);
> -    string_input_visitor_cleanup(siv);
> -
> -    if (local_err) {
> -        qerror_report_err(local_err);
> -        error_free(local_err);
> -        return -1;
> -    }
> -
> -    return 0;
> -}
> -
>  static int object_create(QemuOpts *opts, void *opaque)
>  {
> -    const char *type = qemu_opt_get(opts, "qom-type");
> -    const char *id = qemu_opts_id(opts);
> -    Error *local_err = NULL;
> -    Object *obj;
> -
> -    g_assert(type != NULL);
> -
> -    if (id == NULL) {
> -        qerror_report(QERR_MISSING_PARAMETER, "id");
> -        return -1;
> +    Error *err = NULL;
> +    char *type = NULL;
> +    char *id = NULL;
> +    void *dummy = NULL;
> +    OptsVisitor *ov;
> +    QDict *pdict;
> +
> +    ov = opts_visitor_new(opts);
> +    pdict = qemu_opts_to_qdict(opts, NULL);
> +
> +    visit_start_struct(opts_get_visitor(ov), &dummy, NULL, NULL, 0, &err);
> +    if (err) {
> +        goto out;
>      }
>  
> -    obj = object_new(type);
> -    if (qemu_opt_foreach(opts, object_set_property, obj, 1) < 0) {
> -        object_unref(obj);
> -        return -1;
> +    qdict_del(pdict, "qom-type");
> +    visit_type_str(opts_get_visitor(ov), &type, "qom-type", &err);
> +    if (err) {
> +        goto out;
>      }

Can be moved up right before creating qdict.

>  
> -    if (!object_dynamic_cast(obj, TYPE_USER_CREATABLE)) {
> -        error_setg(&local_err, "object '%s' isn't supported by -object",
> -                   id);
> +    qdict_del(pdict, "id");
> +    visit_type_str(opts_get_visitor(ov), &id, "id", &err);
> +    if (err) {
>          goto out;
>      }

Can be moved up right before creating qdict.

>  
> -    user_creatable_complete(obj, &local_err);
> -    if (local_err) {
> +    object_add(type, id, pdict, opts_get_visitor(ov), &err);

I think it's better to move object_add() from qmp.c to qom/object.c.

> +    if (err) {
>          goto out;
>      }
> -
> -    object_property_add_child(container_get(object_get_root(), "/objects"),
> -                              id, obj, &local_err);
> +    visit_end_struct(opts_get_visitor(ov), &err);
> +    if (err) {
> +        qmp_object_del(id, NULL);
> +    }
>  
>  out:
> -    object_unref(obj);
> -    if (local_err) {
> -        qerror_report_err(local_err);
> -        error_free(local_err);
> -        return -1;
> +    opts_visitor_cleanup(ov);
> +
> +    QDECREF(pdict);
> +    g_free(id);
> +    g_free(type);
> +    g_free(dummy);
> +    if (err) {
> +        qerror_report_err(err);
>      }
>      return 0;
>  }
> -- 
> 1.8.5.3
> 


* Re: [Qemu-devel] [PATCH 2.1 11/28] qmp: improve error reporting for -object and object-add
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 11/28] qmp: improve error reporting for -object and object-add Paolo Bonzini
@ 2014-03-07  3:07   ` Hu Tao
  2014-03-07  7:57     ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Hu Tao @ 2014-03-07  3:07 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:39PM +0100, Paolo Bonzini wrote:
> Use QERR_INVALID_PARAMETER_VALUE for consistency, and avoid an assertion
> failure if the class name is incorrect.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qmp.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/qmp.c b/qmp.c
> index 2ff943d..a3b0b73 100644
> --- a/qmp.c
> +++ b/qmp.c
> @@ -541,7 +541,8 @@ void object_add(const char *type, const char *id, const QDict *qdict,
>      Error *local_err = NULL;
>  
>      if (!object_class_by_name(type)) {
> -        error_setg(errp, "invalid class name");
> +        error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> +                  "qom-type", "a valid class name");
>          return;
>      }
>  
> @@ -556,8 +557,8 @@ void object_add(const char *type, const char *id, const QDict *qdict,
>      }
>  
>      if (!object_dynamic_cast(obj, TYPE_USER_CREATABLE)) {
> -        error_setg(&local_err, "object '%s' isn't supported by object-add",
> -                   id);
> +        error_setg(&local_err, "class '%s' isn't supported by object-add",
> +                   type);
>          goto out;
>      }

There is already an accepted version of this change, commit de580dafade551.

Paolo, I found that your numa tree is about 99 commits behind current
master. I'd like to take over this series if you have no time for it.


* Re: [Qemu-devel] [PATCH 2.1 12/28] pc: pass QEMUMachineInitArgs to pc_memory_init
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 12/28] pc: pass QEMUMachineInitArgs to pc_memory_init Paolo Bonzini
@ 2014-03-07  3:09   ` Hu Tao
  0 siblings, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  3:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>

On Tue, Mar 04, 2014 at 03:00:40PM +0100, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/i386/pc.c         | 11 +++++------
>  hw/i386/pc_piix.c    |  8 +++-----
>  hw/i386/pc_q35.c     |  4 +---
>  include/hw/i386/pc.h |  7 +++----
>  4 files changed, 12 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index a464e48..17d4820 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1145,10 +1145,8 @@ void pc_acpi_init(const char *default_dsdt)
>      }
>  }
>  
> -FWCfgState *pc_memory_init(MemoryRegion *system_memory,
> -                           const char *kernel_filename,
> -                           const char *kernel_cmdline,
> -                           const char *initrd_filename,
> +FWCfgState *pc_memory_init(QEMUMachineInitArgs *args,
> +                           MemoryRegion *system_memory,
>                             ram_addr_t below_4g_mem_size,
>                             ram_addr_t above_4g_mem_size,
>                             MemoryRegion *rom_memory,
> @@ -1160,7 +1158,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
>      MemoryRegion *ram_below_4g, *ram_above_4g;
>      FWCfgState *fw_cfg;
>  
> -    linux_boot = (kernel_filename != NULL);
> +    linux_boot = (args->kernel_filename != NULL);
>  
>      /* Allocate RAM.  We allocate it as a single memory region and use
>       * aliases to address portions of it, mostly for backwards compatibility
> @@ -1201,7 +1199,8 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
>      rom_set_fw(fw_cfg);
>  
>      if (linux_boot) {
> -        load_linux(fw_cfg, kernel_filename, initrd_filename, kernel_cmdline, below_4g_mem_size);
> +        load_linux(fw_cfg, args->kernel_filename, args->initrd_filename,
> +                   args->kernel_cmdline, below_4g_mem_size);
>      }
>  
>      for (i = 0; i < nb_option_roms; i++) {
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index d5dc1ef..96adc01 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -151,11 +151,9 @@ static void pc_init1(QEMUMachineInitArgs *args,
>  
>      /* allocate ram and load rom/bios */
>      if (!xen_enabled()) {
> -        fw_cfg = pc_memory_init(system_memory,
> -                       args->kernel_filename, args->kernel_cmdline,
> -                       args->initrd_filename,
> -                       below_4g_mem_size, above_4g_mem_size,
> -                       rom_memory, &ram_memory, guest_info);
> +        fw_cfg = pc_memory_init(args, system_memory,
> +                                below_4g_mem_size, above_4g_mem_size,
> +                                rom_memory, &ram_memory, guest_info);
>      }
>  
>      gsi_state = g_malloc0(sizeof(*gsi_state));
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index a7f6260..95fa01fc 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -138,9 +138,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
>  
>      /* allocate ram and load rom/bios */
>      if (!xen_enabled()) {
> -        pc_memory_init(get_system_memory(),
> -                       args->kernel_filename, args->kernel_cmdline,
> -                       args->initrd_filename,
> +        pc_memory_init(args, get_system_memory(),
>                         below_4g_mem_size, above_4g_mem_size,
>                         rom_memory, &ram_memory, guest_info);
>      }
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 9010246..8fc0527 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -3,6 +3,7 @@
>  
>  #include "qemu-common.h"
>  #include "exec/memory.h"
> +#include "hw/boards.h"
>  #include "hw/isa/isa.h"
>  #include "hw/block/fdc.h"
>  #include "net/net.h"
> @@ -134,10 +135,8 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
>  void pc_pci_as_mapping_init(Object *owner, MemoryRegion *system_memory,
>                              MemoryRegion *pci_address_space);
>  
> -FWCfgState *pc_memory_init(MemoryRegion *system_memory,
> -                           const char *kernel_filename,
> -                           const char *kernel_cmdline,
> -                           const char *initrd_filename,
> +FWCfgState *pc_memory_init(QEMUMachineInitArgs *args,
> +                           MemoryRegion *system_memory,
>                             ram_addr_t below_4g_mem_size,
>                             ram_addr_t above_4g_mem_size,
>                             MemoryRegion *rom_memory,
> -- 
> 1.8.5.3
> 


* Re: [Qemu-devel] [PATCH 2.1 13/28] numa: introduce memory_region_allocate_system_memory
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 13/28] numa: introduce memory_region_allocate_system_memory Paolo Bonzini
@ 2014-03-07  3:18   ` Hu Tao
  0 siblings, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  3:18 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:41PM +0100, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/i386/pc.c            |  4 +---
>  include/hw/boards.h     |  4 ++++
>  include/sysemu/sysemu.h |  1 +
>  numa.c                  | 11 +++++++++++
>  4 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 17d4820..ff078fb 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1165,9 +1165,7 @@ FWCfgState *pc_memory_init(QEMUMachineInitArgs *args,
>       * with older qemus that used qemu_ram_alloc().
>       */
>      ram = g_malloc(sizeof(*ram));
> -    memory_region_init_ram(ram, NULL, "pc.ram",
> -                           below_4g_mem_size + above_4g_mem_size);
> -    vmstate_register_ram_global(ram);
> +    memory_region_allocate_system_memory(ram, NULL, "pc.ram", args);
>      *ram_memory = ram;
>      ram_below_4g = g_malloc(sizeof(*ram_below_4g));
>      memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 2151460..8b68878 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -48,6 +48,10 @@ struct QEMUMachine {
>      const char *hw_version;
>  };
>  
> +void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,
> +                                          const char *name,
> +                                          QEMUMachineInitArgs *args);
> +
>  int qemu_register_machine(QEMUMachine *m);
>  QEMUMachine *find_default_machine(void);
>  
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 4c94cf5..54a6f28 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -10,6 +10,7 @@
>  #include "qemu/notify.h"
>  #include "qemu/main-loop.h"
>  #include "qemu/bitmap.h"
> +#include "qom/object.h"
>  
>  /* vl.c */
>  
> diff --git a/numa.c b/numa.c
> index 6563232..930f49d 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -31,6 +31,7 @@
>  #include "qapi/opts-visitor.h"
>  #include "qapi/dealloc-visitor.h"
>  #include "qapi/qmp/qerror.h"
> +#include "hw/boards.h"
>  
>  QemuOptsList qemu_numa_opts = {
>      .name = "numa",
> @@ -191,3 +192,13 @@ void set_numa_modes(void)
>          }
>      }
>  }
> +
> +void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,
> +                                          const char *name,
> +                                          QEMUMachineInitArgs *args)

Only ram_size is needed here, not the whole args structure.

> +{
> +    uint64_t ram_size = args->ram_size;
> +
> +    memory_region_init_ram(mr, owner, name, ram_size);
> +    vmstate_register_ram_global(mr);
> +}
> -- 
> 1.8.5.3
> 


* Re: [Qemu-devel] [PATCH 2.1 14/28] add memdev backend infrastructure
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 14/28] add memdev backend infrastructure Paolo Bonzini
@ 2014-03-07  3:31   ` Hu Tao
  0 siblings, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  3:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:42PM +0100, Paolo Bonzini wrote:
> From: Igor Mammedov <imammedo@redhat.com>
> 
> Provides framework for splitting host RAM allocation/
> policies into a separate backend that could be used
> by devices.
> 
> Initially only legacy RAM backend is provided, which
> uses memory_region_init_ram() allocator and compatible
> with every CLI option that affects memory_region_init_ram().
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  backends/Makefile.objs   |   2 +
>  backends/hostmem-ram.c   |  52 ++++++++++++++++++++++
>  backends/hostmem.c       | 110 +++++++++++++++++++++++++++++++++++++++++++++++
>  include/sysemu/hostmem.h |  60 ++++++++++++++++++++++++++
>  4 files changed, 224 insertions(+)
>  create mode 100644 backends/hostmem-ram.c
>  create mode 100644 backends/hostmem.c
>  create mode 100644 include/sysemu/hostmem.h
> 
> diff --git a/backends/Makefile.objs b/backends/Makefile.objs
> index 42557d5..e6bdc11 100644
> --- a/backends/Makefile.objs
> +++ b/backends/Makefile.objs
> @@ -6,3 +6,5 @@ common-obj-$(CONFIG_BRLAPI) += baum.o
>  $(obj)/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS) 
>  
>  common-obj-$(CONFIG_TPM) += tpm.o
> +
> +common-obj-y += hostmem.o hostmem-ram.o
> diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> new file mode 100644
> index 0000000..ce06fbe
> --- /dev/null
> +++ b/backends/hostmem-ram.c
> @@ -0,0 +1,52 @@
> +/*
> + * QEMU Host Memory Backend
> + *
> + * Copyright (C) 2013 Red Hat Inc
> + *
> + * Authors:
> + *   Igor Mammedov <imammedo@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#include "sysemu/hostmem.h"
> +#include "qom/object_interfaces.h"
> +
> +#define TYPE_MEMORY_BACKEND_RAM "memory-ram"
> +
> +
> +static void
> +ram_backend_memory_init(UserCreatable *uc, Error **errp)
> +{
> +    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
> +
> +    if (!backend->size) {
> +        error_setg(errp, "can't create backend with size 0");
> +        return;
> +    }
> +
> +    memory_region_init_ram(&backend->mr, OBJECT(backend),
> +                           object_get_canonical_path(OBJECT(backend)),
> +                           backend->size);
> +}
> +
> +static void
> +ram_backend_class_init(ObjectClass *oc, void *data)
> +{
> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
> +
> +    ucc->complete = ram_backend_memory_init;
> +}
> +
> +static const TypeInfo ram_backend_info = {
> +    .name = TYPE_MEMORY_BACKEND_RAM,
> +    .parent = TYPE_MEMORY_BACKEND,
> +    .class_init = ram_backend_class_init,
> +};
> +
> +static void register_types(void)
> +{
> +    type_register_static(&ram_backend_info);
> +}
> +
> +type_init(register_types);
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> new file mode 100644
> index 0000000..06817dd
> --- /dev/null
> +++ b/backends/hostmem.c
> @@ -0,0 +1,110 @@
> +/*
> + * QEMU Host Memory Backend
> + *
> + * Copyright (C) 2013 Red Hat Inc
> + *
> + * Authors:
> + *   Igor Mammedov <imammedo@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#include "sysemu/hostmem.h"
> +#include "sysemu/sysemu.h"
> +#include "qapi/visitor.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qemu/config-file.h"
> +#include "qom/object_interfaces.h"
> +
> +static void
> +host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
> +                            const char *name, Error **errp)
> +{
> +    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
> +    uint64_t value = backend->size;
> +
> +    visit_type_size(v, &value, name, errp);
> +}
> +
> +static void
> +host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
> +                            const char *name, Error **errp)
> +{
> +    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
> +    uint64_t value;
> +
> +    if (memory_region_size(&backend->mr)) {
> +        error_setg(errp, "cannot change property value\n");
> +        return;
> +    }
> +
> +    visit_type_size(v, &value, name, errp);
> +    if (error_is_set(errp)) {
> +        return;
> +    }
> +    if (!value) {
> +        error_setg(errp, "Property '%s.%s' doesn't take value '%" PRIu64 "'",
> +                   object_get_typename(obj), name , value);
> +        return;
> +    }
> +    backend->size = value;
> +}
> +
> +static void host_memory_backend_initfn(Object *obj)
> +{
> +    object_property_add(obj, "size", "int",
> +                        host_memory_backend_get_size,
> +                        host_memory_backend_set_size, NULL, NULL, NULL);
> +}
> +
> +static void host_memory_backend_finalize(Object *obj)
> +{
> +    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
> +
> +    if (memory_region_size(&backend->mr)) {
> +        memory_region_destroy(&backend->mr);
> +    }
> +}
> +
> +static void
> +host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
> +{
> +    error_setg(errp, "memory_init is not implemented for type [%s]",
> +               object_get_typename(OBJECT(uc)));
> +}
> +
> +MemoryRegion *
> +host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
> +{
> +    return memory_region_size(&backend->mr) ? &backend->mr : NULL;
> +}
> +
> +static void
> +host_memory_backend_class_init(ObjectClass *oc, void *data)
> +{
> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
> +
> +    ucc->complete = host_memory_backend_memory_init;
> +}
> +
> +static const TypeInfo host_memory_backend_info = {
> +    .name = TYPE_MEMORY_BACKEND,
> +    .parent = TYPE_OBJECT,
> +    .abstract = true,
> +    .class_size = sizeof(HostMemoryBackendClass),
> +    .class_init = host_memory_backend_class_init,
> +    .instance_size = sizeof(HostMemoryBackend),
> +    .instance_init = host_memory_backend_initfn,
> +    .instance_finalize = host_memory_backend_finalize,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_USER_CREATABLE },
> +        { }
> +    }
> +};
> +
> +static void register_types(void)
> +{
> +    type_register_static(&host_memory_backend_info);
> +}
> +
> +type_init(register_types);
> diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
> new file mode 100644
> index 0000000..bc3ffb3
> --- /dev/null
> +++ b/include/sysemu/hostmem.h
> @@ -0,0 +1,60 @@
> +/*
> + * QEMU Host Memory Backend
> + *
> + * Copyright (C) 2013 Red Hat Inc
> + *
> + * Authors:
> + *   Igor Mammedov <imammedo@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#ifndef QEMU_RAM_H
> +#define QEMU_RAM_H
> +
> +#include "qom/object.h"
> +#include "qapi/error.h"

This include is not needed.

> +#include "exec/memory.h"
> +#include "qemu/option.h"

This include is not needed either.

> +
> +#define TYPE_MEMORY_BACKEND "memory"
> +#define MEMORY_BACKEND(obj) \
> +    OBJECT_CHECK(HostMemoryBackend, (obj), TYPE_MEMORY_BACKEND)
> +#define MEMORY_BACKEND_GET_CLASS(obj) \
> +    OBJECT_GET_CLASS(HostMemoryBackendClass, (obj), TYPE_MEMORY_BACKEND)
> +#define MEMORY_BACKEND_CLASS(klass) \
> +    OBJECT_CLASS_CHECK(HostMemoryBackendClass, (klass), TYPE_MEMORY_BACKEND)
> +
> +typedef struct HostMemoryBackend HostMemoryBackend;
> +typedef struct HostMemoryBackendClass HostMemoryBackendClass;
> +
> +/**
> + * HostMemoryBackendClass:
> + * @parent_class: opaque parent class container
> + */
> +struct HostMemoryBackendClass {
> +    ObjectClass parent_class;
> +};
> +
> +/**
> + * @HostMemoryBackend
> + *
> + * @parent: opaque parent object container
> + * @size: amount of memory backend provides
> + * @id: unique identification string in memdev namespace
> + * @mr: MemoryRegion representing host memory belonging to backend
> + */
> +struct HostMemoryBackend {
> +    /* private */
> +    Object parent;
> +
> +    /* protected */
> +    uint64_t size;
> +
> +    MemoryRegion mr;
> +};
> +
> +MemoryRegion *host_memory_backend_get_memory(HostMemoryBackend *backend,
> +                                             Error **errp);
> +
> +#endif
> -- 
> 1.8.5.3
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option Paolo Bonzini
  2014-03-04 17:52   ` Eric Blake
@ 2014-03-07  5:33   ` Hu Tao
  2014-03-07  7:41     ` Paolo Bonzini
  1 sibling, 1 reply; 70+ messages in thread
From: Hu Tao @ 2014-03-07  5:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:43PM +0100, Paolo Bonzini wrote:
> This option provides the infrastructure for binding guest NUMA nodes
> to host NUMA nodes.  For example:
> 
>  -object memory-ram,size=1024M,policy=membind,host-nodes=0,id=ram-node0 \
>  -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
>  -object memory-ram,size=1024M,policy=interleave,host-nodes=1-3,id=ram-node1 \
>  -numa node,nodeid=1,cpus=1,memdev=ram-node1
> 
> The option replaces "-numa node,mem=".
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/sysemu/sysemu.h |  1 +
>  numa.c                  | 63 +++++++++++++++++++++++++++++++++++++++++++++++--
>  qapi-schema.json        |  8 ++++++-
>  qemu-options.hx         | 12 ++++++----
>  4 files changed, 77 insertions(+), 7 deletions(-)
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 54a6f28..4870129 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -139,6 +139,7 @@ extern int nb_numa_nodes;
>  typedef struct node_info {
>      uint64_t node_mem;
>      DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
> +    struct HostMemoryBackend *node_memdev;
>  } NodeInfo;
>  extern NodeInfo numa_info[MAX_NODES];
>  void set_numa_nodes(void);
> diff --git a/numa.c b/numa.c
> index 930f49d..b00ef90 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -32,6 +32,7 @@
>  #include "qapi/dealloc-visitor.h"
>  #include "qapi/qmp/qerror.h"
>  #include "hw/boards.h"
> +#include "sysemu/hostmem.h"
>  
>  QemuOptsList qemu_numa_opts = {
>      .name = "numa",
> @@ -40,6 +41,8 @@ QemuOptsList qemu_numa_opts = {
>      .desc = { { 0 } } /* validated with OptsVisitor */
>  };
>  
> +static int have_memdevs = -1;
> +

Could this be a bool?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation Paolo Bonzini
@ 2014-03-07  6:09   ` Hu Tao
  2014-03-07  6:34     ` Hu Tao
  2014-03-07  7:47     ` Paolo Bonzini
  0 siblings, 2 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  6:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:44PM +0100, Paolo Bonzini wrote:
> Split the internal interface in exec.c to a separate function, and
> push the check on mem_path up to memory_region_init_ram.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                  | 105 +++++++++++++++++++++++++++++-------------------
>  include/exec/cpu-all.h  |   3 --
>  include/exec/ram_addr.h |   2 +
>  include/sysemu/sysemu.h |   2 +
>  memory.c                |   7 +++-
>  5 files changed, 73 insertions(+), 46 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index b69fd29..0aa4947 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1240,56 +1240,30 @@ static int memory_try_enable_merging(void *addr, size_t len)
>      return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
>  }
>  
> -ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> -                                   MemoryRegion *mr)
> +static ram_addr_t ram_block_add(RAMBlock *new_block)
>  {
> -    RAMBlock *block, *new_block;
> +    RAMBlock *block;
>      ram_addr_t old_ram_size, new_ram_size;
>  
>      old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
>  
> -    size = TARGET_PAGE_ALIGN(size);
> -    new_block = g_malloc0(sizeof(*new_block));
> -    new_block->fd = -1;
> -
>      /* This assumes the iothread lock is taken here too.  */
>      qemu_mutex_lock_ramlist();
> -    new_block->mr = mr;
> -    new_block->offset = find_ram_offset(size);
> -    if (host) {
> -        new_block->host = host;
> -        new_block->flags |= RAM_PREALLOC_MASK;
> -    } else if (xen_enabled()) {
> -        if (mem_path) {
> -            fprintf(stderr, "-mem-path not supported with Xen\n");
> -            exit(1);
> -        }
> -        xen_ram_alloc(new_block->offset, size, mr);
> -    } else {
> -        if (mem_path) {
> -            if (phys_mem_alloc != qemu_anon_ram_alloc) {
> -                /*
> -                 * file_ram_alloc() needs to allocate just like
> -                 * phys_mem_alloc, but we haven't bothered to provide
> -                 * a hook there.
> -                 */
> -                fprintf(stderr,
> -                        "-mem-path not supported with this accelerator\n");
> -                exit(1);
> -            }
> -            new_block->host = file_ram_alloc(new_block, size, mem_path);
> -        }
> -        if (!new_block->host) {
> -            new_block->host = phys_mem_alloc(size);
> +    new_block->offset = find_ram_offset(new_block->length);
> +
> +    if (!new_block->host) {
> +        if (xen_enabled()) {
> +            xen_ram_alloc(new_block->offset, new_block->length, new_block->mr);
> +        } else {
> +            new_block->host = phys_mem_alloc(new_block->length);
>              if (!new_block->host) {
>                  fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
>                          new_block->mr->name, strerror(errno));
>                  exit(1);
>              }
> -            memory_try_enable_merging(new_block->host, size);
> +            memory_try_enable_merging(new_block->host, new_block->length);
>          }
>      }
> -    new_block->length = size;
>  
>      /* Keep the list sorted from biggest to smallest block.  */
>      QTAILQ_FOREACH(block, &ram_list.blocks, next) {
> @@ -1317,18 +1291,65 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>                                     old_ram_size, new_ram_size);
>         }
>      }
> -    cpu_physical_memory_set_dirty_range(new_block->offset, size);
> +    cpu_physical_memory_set_dirty_range(new_block->offset, new_block->length);
>  
> -    qemu_ram_setup_dump(new_block->host, size);
> -    qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
> -    qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
> +    qemu_ram_setup_dump(new_block->host, new_block->length);
> +    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_HUGEPAGE);
> +    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_DONTFORK);
>  
> -    if (kvm_enabled())
> -        kvm_setup_guest_memory(new_block->host, size);
> +    if (kvm_enabled()) {
> +        kvm_setup_guest_memory(new_block->host, new_block->length);
> +    }
>  
>      return new_block->offset;
>  }
>  
> +ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> +                                    const char *mem_path)
> +{
> +    RAMBlock *new_block;
> +
> +    if (xen_enabled()) {
> +        fprintf(stderr, "-mem-path not supported with Xen\n");
> +        exit(1);
> +    }
> +
> +    if (phys_mem_alloc != qemu_anon_ram_alloc) {
> +        /*
> +         * file_ram_alloc() needs to allocate just like
> +         * phys_mem_alloc, but we haven't bothered to provide
> +         * a hook there.
> +         */
> +        fprintf(stderr,
> +                "-mem-path not supported with this accelerator\n");
> +        exit(1);
> +    }
> +
> +    size = TARGET_PAGE_ALIGN(size);
> +    new_block = g_malloc0(sizeof(*new_block));
> +    new_block->mr = mr;
> +    new_block->length = size;
> +    new_block->host = file_ram_alloc(new_block, size, mem_path);
> +    return ram_block_add(new_block);
> +}
> +
> +ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> +                                   MemoryRegion *mr)
> +{
> +    RAMBlock *new_block;
> +
> +    size = TARGET_PAGE_ALIGN(size);
> +    new_block = g_malloc0(sizeof(*new_block));
> +    new_block->mr = mr;
> +    new_block->length = size;
> +    new_block->fd = -1;
> +    new_block->host = host;
> +    if (host) {
> +        new_block->flags |= RAM_PREALLOC_MASK;
> +    }
> +    return ram_block_add(new_block);
> +}
> +
>  ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
>  {
>      return qemu_ram_alloc_from_ptr(size, NULL, mr);
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index e66ab5b..b44babb 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -466,9 +466,6 @@ typedef struct RAMList {
>  } RAMList;
>  extern RAMList ram_list;
>  
> -extern const char *mem_path;
> -extern int mem_prealloc;
> -
>  /* Flags stored in the low bits of the TLB virtual address.  These are
>     defined so that fast path ram access is all zeros.  */
>  /* Zero if TLB entry is valid.  */
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 2edfa96..dedb258 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -22,6 +22,8 @@
>  #ifndef CONFIG_USER_ONLY
>  #include "hw/xen/xen.h"
>  
> +ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> +                                    const char *mem_path);
>  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>                                     MemoryRegion *mr);
>  ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 4870129..03f5ee5 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -132,6 +132,8 @@ extern uint8_t *boot_splash_filedata;
>  extern size_t boot_splash_filedata_size;
>  extern uint8_t qemu_extra_params_fw[2];
>  extern QEMUClockType rtc_clock;
> +extern const char *mem_path;
> +extern int mem_prealloc;
>  
>  #define MAX_NODES 128
>  #define MAX_CPUMASK_BITS 255
> diff --git a/memory.c b/memory.c
> index 59ecc28..32b17a8 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -23,6 +23,7 @@
>  
>  #include "exec/memory-internal.h"
>  #include "exec/ram_addr.h"
> +#include "sysemu/sysemu.h"
>  
>  //#define DEBUG_UNASSIGNED
>  
> @@ -1016,7 +1017,11 @@ void memory_region_init_ram(MemoryRegion *mr,
>      mr->ram = true;
>      mr->terminates = true;
>      mr->destructor = memory_region_destructor_ram;
> -    mr->ram_addr = qemu_ram_alloc(size, mr);
> +    if (mem_path) {
> +        mr->ram_addr = qemu_ram_alloc_from_file(size, mr, mem_path);
> +    } else {
> +        mr->ram_addr = qemu_ram_alloc(size, mr);
> +    }
>  }

This changes the logic of the original code:

  if (mem_path) {
      ...
      new_block->host = file_ram_alloc(new_block, size, mem_path);
  }
  if (!new_block->host) {
      new_block->host = phys_mem_alloc(size);
      ...
  }
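
For reference, the control flow of the original allocation step can
be reduced to the standalone sketch below. stub_file_alloc() and
stub_anon_alloc() are hypothetical stand-ins for file_ram_alloc() and
phys_mem_alloc(), and alloc_like_original() is an illustrative name;
none of this is QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for file_ram_alloc(): always fails here. */
static void *stub_file_alloc(size_t size, const char *path)
{
    (void)size;
    (void)path;
    return NULL;
}

/* Hypothetical stand-in for phys_mem_alloc(). */
static void *stub_anon_alloc(size_t size)
{
    return malloc(size);
}

/*
 * Mirrors the original flow: try the file-backed allocation when a
 * path is given, then use anonymous memory if no host pointer was
 * obtained. *used_anon reports which allocator supplied the memory.
 */
static void *alloc_like_original(size_t size, const char *path,
                                 int *used_anon)
{
    void *host = NULL;

    assert(used_anon != NULL);
    *used_anon = 0;
    if (path) {
        host = stub_file_alloc(size, path);
    }
    if (!host) {
        host = stub_anon_alloc(size);
        *used_anon = 1;
    }
    return host;
}
```

With the failing stub, a call such as alloc_like_original(4096,
"/dev/hugepages", &used_anon) returns anonymous memory and sets
used_anon to 1, which is the fallback behaviour in question.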

>  
>  void memory_region_init_ram_ptr(MemoryRegion *mr,
> -- 
> 1.8.5.3
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation
  2014-03-07  6:09   ` Hu Tao
@ 2014-03-07  6:34     ` Hu Tao
  2014-03-07  7:47     ` Paolo Bonzini
  1 sibling, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  6:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, a.motakis, imammedo, gaowanlong

On Fri, Mar 07, 2014 at 02:09:25PM +0800, Hu Tao wrote:
> On Tue, Mar 04, 2014 at 03:00:44PM +0100, Paolo Bonzini wrote:
> > Split the internal interface in exec.c to a separate function, and
> > push the check on mem_path up to memory_region_init_ram.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  exec.c                  | 105 +++++++++++++++++++++++++++++-------------------
> >  include/exec/cpu-all.h  |   3 --
> >  include/exec/ram_addr.h |   2 +
> >  include/sysemu/sysemu.h |   2 +
> >  memory.c                |   7 +++-
> >  5 files changed, 73 insertions(+), 46 deletions(-)
> > 
> > diff --git a/exec.c b/exec.c
> > index b69fd29..0aa4947 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -1240,56 +1240,30 @@ static int memory_try_enable_merging(void *addr, size_t len)
> >      return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
> >  }
> >  
> > -ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> > -                                   MemoryRegion *mr)
> > +static ram_addr_t ram_block_add(RAMBlock *new_block)
> >  {
> > -    RAMBlock *block, *new_block;
> > +    RAMBlock *block;
> >      ram_addr_t old_ram_size, new_ram_size;
> >  
> >      old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
> >  
> > -    size = TARGET_PAGE_ALIGN(size);
> > -    new_block = g_malloc0(sizeof(*new_block));
> > -    new_block->fd = -1;
> > -
> >      /* This assumes the iothread lock is taken here too.  */
> >      qemu_mutex_lock_ramlist();
> > -    new_block->mr = mr;
> > -    new_block->offset = find_ram_offset(size);
> > -    if (host) {
> > -        new_block->host = host;
> > -        new_block->flags |= RAM_PREALLOC_MASK;
> > -    } else if (xen_enabled()) {
> > -        if (mem_path) {
> > -            fprintf(stderr, "-mem-path not supported with Xen\n");
> > -            exit(1);
> > -        }
> > -        xen_ram_alloc(new_block->offset, size, mr);
> > -    } else {
> > -        if (mem_path) {
> > -            if (phys_mem_alloc != qemu_anon_ram_alloc) {
> > -                /*
> > -                 * file_ram_alloc() needs to allocate just like
> > -                 * phys_mem_alloc, but we haven't bothered to provide
> > -                 * a hook there.
> > -                 */
> > -                fprintf(stderr,
> > -                        "-mem-path not supported with this accelerator\n");
> > -                exit(1);
> > -            }
> > -            new_block->host = file_ram_alloc(new_block, size, mem_path);
> > -        }
> > -        if (!new_block->host) {
> > -            new_block->host = phys_mem_alloc(size);
> > +    new_block->offset = find_ram_offset(new_block->length);
> > +
> > +    if (!new_block->host) {
> > +        if (xen_enabled()) {
> > +            xen_ram_alloc(new_block->offset, new_block->length, new_block->mr);
> > +        } else {
> > +            new_block->host = phys_mem_alloc(new_block->length);
> >              if (!new_block->host) {
> >                  fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
> >                          new_block->mr->name, strerror(errno));
> >                  exit(1);
> >              }
> > -            memory_try_enable_merging(new_block->host, size);
> > +            memory_try_enable_merging(new_block->host, new_block->length);
> >          }
> >      }
> > -    new_block->length = size;
> >  
> >      /* Keep the list sorted from biggest to smallest block.  */
> >      QTAILQ_FOREACH(block, &ram_list.blocks, next) {
> > @@ -1317,18 +1291,65 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> >                                     old_ram_size, new_ram_size);
> >         }
> >      }
> > -    cpu_physical_memory_set_dirty_range(new_block->offset, size);
> > +    cpu_physical_memory_set_dirty_range(new_block->offset, new_block->length);
> >  
> > -    qemu_ram_setup_dump(new_block->host, size);
> > -    qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
> > -    qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
> > +    qemu_ram_setup_dump(new_block->host, new_block->length);
> > +    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_HUGEPAGE);
> > +    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_DONTFORK);
> >  
> > -    if (kvm_enabled())
> > -        kvm_setup_guest_memory(new_block->host, size);
> > +    if (kvm_enabled()) {
> > +        kvm_setup_guest_memory(new_block->host, new_block->length);
> > +    }
> >  
> >      return new_block->offset;
> >  }
> >  
> > +ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> > +                                    const char *mem_path)
> > +{
> > +    RAMBlock *new_block;
> > +
> > +    if (xen_enabled()) {
> > +        fprintf(stderr, "-mem-path not supported with Xen\n");
> > +        exit(1);
> > +    }
> > +
> > +    if (phys_mem_alloc != qemu_anon_ram_alloc) {
> > +        /*
> > +         * file_ram_alloc() needs to allocate just like
> > +         * phys_mem_alloc, but we haven't bothered to provide
> > +         * a hook there.
> > +         */
> > +        fprintf(stderr,
> > +                "-mem-path not supported with this accelerator\n");
> > +        exit(1);
> > +    }
> > +
> > +    size = TARGET_PAGE_ALIGN(size);
> > +    new_block = g_malloc0(sizeof(*new_block));
> > +    new_block->mr = mr;
> > +    new_block->length = size;
> > +    new_block->host = file_ram_alloc(new_block, size, mem_path);
> > +    return ram_block_add(new_block);
> > +}
> > +
> > +ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> > +                                   MemoryRegion *mr)
> > +{
> > +    RAMBlock *new_block;
> > +
> > +    size = TARGET_PAGE_ALIGN(size);
> > +    new_block = g_malloc0(sizeof(*new_block));
> > +    new_block->mr = mr;
> > +    new_block->length = size;
> > +    new_block->fd = -1;
> > +    new_block->host = host;
> > +    if (host) {
> > +        new_block->flags |= RAM_PREALLOC_MASK;
> > +    }
> > +    return ram_block_add(new_block);
> > +}
> > +
> >  ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
> >  {
> >      return qemu_ram_alloc_from_ptr(size, NULL, mr);
> > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> > index e66ab5b..b44babb 100644
> > --- a/include/exec/cpu-all.h
> > +++ b/include/exec/cpu-all.h
> > @@ -466,9 +466,6 @@ typedef struct RAMList {
> >  } RAMList;
> >  extern RAMList ram_list;
> >  
> > -extern const char *mem_path;
> > -extern int mem_prealloc;
> > -
> >  /* Flags stored in the low bits of the TLB virtual address.  These are
> >     defined so that fast path ram access is all zeros.  */
> >  /* Zero if TLB entry is valid.  */
> > diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> > index 2edfa96..dedb258 100644
> > --- a/include/exec/ram_addr.h
> > +++ b/include/exec/ram_addr.h
> > @@ -22,6 +22,8 @@
> >  #ifndef CONFIG_USER_ONLY
> >  #include "hw/xen/xen.h"
> >  
> > +ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> > +                                    const char *mem_path);
> >  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> >                                     MemoryRegion *mr);
> >  ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
> > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > index 4870129..03f5ee5 100644
> > --- a/include/sysemu/sysemu.h
> > +++ b/include/sysemu/sysemu.h
> > @@ -132,6 +132,8 @@ extern uint8_t *boot_splash_filedata;
> >  extern size_t boot_splash_filedata_size;
> >  extern uint8_t qemu_extra_params_fw[2];
> >  extern QEMUClockType rtc_clock;
> > +extern const char *mem_path;
> > +extern int mem_prealloc;
> >  
> >  #define MAX_NODES 128
> >  #define MAX_CPUMASK_BITS 255
> > diff --git a/memory.c b/memory.c
> > index 59ecc28..32b17a8 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -23,6 +23,7 @@
> >  
> >  #include "exec/memory-internal.h"
> >  #include "exec/ram_addr.h"
> > +#include "sysemu/sysemu.h"
> >  
> >  //#define DEBUG_UNASSIGNED
> >  
> > @@ -1016,7 +1017,11 @@ void memory_region_init_ram(MemoryRegion *mr,
> >      mr->ram = true;
> >      mr->terminates = true;
> >      mr->destructor = memory_region_destructor_ram;
> > -    mr->ram_addr = qemu_ram_alloc(size, mr);
> > +    if (mem_path) {
> > +        mr->ram_addr = qemu_ram_alloc_from_file(size, mr, mem_path);
> > +    } else {
> > +        mr->ram_addr = qemu_ram_alloc(size, mr);
> > +    }
> >  }
> 
> This changes the logic of the original code:
> 
>   if (mem_path) {
>       ...
>       new_block->host = file_ram_alloc(new_block, size, mem_path);
>   }
>   if (!new_block->host) {
>       new_block->host = phys_mem_alloc(size);
>       ...
>   }

Never mind. I'm now at patch 18.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend Paolo Bonzini
  2014-03-04 17:38   ` Eric Blake
@ 2014-03-07  6:57   ` Hu Tao
  1 sibling, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  6:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:49PM +0100, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  backends/Makefile.objs  |   1 +
>  backends/hostmem-file.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 109 insertions(+)
>  create mode 100644 backends/hostmem-file.c
> 
> diff --git a/backends/Makefile.objs b/backends/Makefile.objs
> index e6bdc11..509e4a3 100644
> --- a/backends/Makefile.objs
> +++ b/backends/Makefile.objs
> @@ -8,3 +8,4 @@ $(obj)/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS)
>  common-obj-$(CONFIG_TPM) += tpm.o
>  
>  common-obj-y += hostmem.o hostmem-ram.o
> +common-obj-$(CONFIG_LINUX) += hostmem-file.o
> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> new file mode 100644
> index 0000000..8c6ea5d
> --- /dev/null
> +++ b/backends/hostmem-file.c
> @@ -0,0 +1,108 @@
> +/*
> + * QEMU Host Memory Backend for hugetlbfs
> + *
> + * Copyright (C) 2013 Red Hat Inc
> + *
> + * Authors:
> + *   Paolo Bonzini <pbonzini@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#include "sysemu/hostmem.h"
> +#include "qom/object_interfaces.h"
> +
> +/* hostmem-file.c */
> +/**
> + * @TYPE_MEMORY_BACKEND_FILE:
> + * name of backend that uses mmap on a file descriptor
> + */
> +#define TYPE_MEMORY_BACKEND_FILE "memory-file"
> +
> +#define MEMORY_BACKEND_FILE(obj) \
> +    OBJECT_CHECK(HostMemoryBackendFile, (obj), TYPE_MEMORY_BACKEND_FILE)
> +
> +typedef struct HostMemoryBackendFile HostMemoryBackendFile;
> +
> +struct HostMemoryBackendFile {
> +    HostMemoryBackend parent_obj;
> +    char *mem_path;
> +};
> +
> +static void
> +file_backend_memory_init(UserCreatable *uc, Error **errp)
> +{
> +    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
> +    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(uc);
> +
> +    if (!backend->size) {
> +        error_setg(errp, "can't create backend with size 0");
> +        return;
> +    }
> +    if (!fb->mem_path) {
> +        error_setg(errp, "mem-path property not set");
> +        return;
> +    }
> +#ifndef CONFIG_LINUX
> +    error_setg(errp, "-mem-path not supported on this host");
> +#else
> +    if (!memory_region_size(&backend->mr)) {
> +        memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
> +                                         object_get_canonical_path(OBJECT(backend)),
> +                                         backend->size,
> +                                         fb->mem_path, errp);
> +    }
> +#endif
> +}
> +
> +static void
> +file_backend_class_init(ObjectClass *oc, void *data)
> +{
> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
> +
> +    ucc->complete = file_backend_memory_init;
> +}
> +
> +static char *get_mem_path(Object *o, Error **errp)
> +{
> +    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
> +
> +    return g_strdup(fb->mem_path);
> +}
> +
> +static void set_mem_path(Object *o, const char *str, Error **errp)
> +{
> +    HostMemoryBackend *backend = MEMORY_BACKEND(o);
> +    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
> +
> +    if (memory_region_size(&backend->mr)) {
> +        error_setg(errp, "cannot change property value");

This message is ambiguous: it doesn't explain why the property value
can't be changed.
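
Something along these lines would be clearer.  This is only a standalone
sketch, not QEMU code: ToyFileBackend and toy_set_mem_path() are made-up
names, and a plain buffer stands in for the Error ** argument.  It assumes
the setter is refused because the memory region has already been
allocated, which is what the memory_region_size() check suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for HostMemoryBackendFile. */
typedef struct {
    size_t region_size;   /* non-zero once the backing memory exists */
    char path[64];
} ToyFileBackend;

/*
 * Set the path, or report *why* the change is refused.
 * Returns true on success, false with a message in err otherwise.
 */
static bool toy_set_mem_path(ToyFileBackend *fb, const char *str,
                             char *err, size_t errlen)
{
    if (fb->region_size) {
        snprintf(err, errlen,
                 "cannot change 'mem-path': memory region is already "
                 "allocated (%zu bytes)", fb->region_size);
        return false;
    }
    snprintf(fb->path, sizeof(fb->path), "%s", str);
    return true;
}
```

The point is only that the message names the property and the reason;
the exact wording is up to you.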

> +        return;
> +    }
> +    if (fb->mem_path) {
> +        g_free(fb->mem_path);
> +    }
> +    fb->mem_path = g_strdup(str);
> +}
> +
> +static void
> +file_backend_instance_init(Object *o)
> +{
> +    object_property_add_str(o, "mem-path", get_mem_path,
> +                            set_mem_path, NULL);
> +}
> +
> +static const TypeInfo file_backend_info = {
> +    .name = TYPE_MEMORY_BACKEND_FILE,
> +    .parent = TYPE_MEMORY_BACKEND,
> +    .class_init = file_backend_class_init,
> +    .instance_init = file_backend_instance_init,
> +    .instance_size = sizeof(HostMemoryBackendFile),
> +};
> +
> +static void register_types(void)
> +{
> +    type_register_static(&file_backend_info);
> +}
> +
> +type_init(register_types);
> -- 
> 1.8.5.3
> 

* Re: [Qemu-devel] [PATCH 2.1 22/28] hostmem: separate allocation from UserCreatable complete method
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 22/28] hostmem: separate allocation from UserCreatable complete method Paolo Bonzini
@ 2014-03-07  7:08   ` Hu Tao
  0 siblings, 0 replies; 70+ messages in thread
From: Hu Tao @ 2014-03-07  7:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:50PM +0100, Paolo Bonzini wrote:
> This allows the superclass to set various policies on the memory
> region that the subclass creates.
> 
> Suggested-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  backends/hostmem-file.c  |  9 ++++-----
>  backends/hostmem-ram.c   |  8 +++-----
>  backends/hostmem.c       | 12 ++++++++++--
>  include/sysemu/hostmem.h |  2 ++
>  4 files changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> index 8c6ea5d..7e91665 100644
> --- a/backends/hostmem-file.c
> +++ b/backends/hostmem-file.c
> @@ -30,10 +30,9 @@ struct HostMemoryBackendFile {
>  };
>  
>  static void
> -file_backend_memory_init(UserCreatable *uc, Error **errp)
> +file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>  {
> -    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
> -    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(uc);
> +    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
>  
>      if (!backend->size) {
>          error_setg(errp, "can't create backend with size 0");
> @@ -58,9 +57,9 @@ file_backend_memory_init(UserCreatable *uc, Error **errp)
>  static void
>  file_backend_class_init(ObjectClass *oc, void *data)
>  {
> -    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
> +    HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
>  
> -    ucc->complete = file_backend_memory_init;
> +    bc->alloc = file_backend_memory_alloc;
>  }
>  
>  static char *get_mem_path(Object *o, Error **errp)
> diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> index ce06fbe..e4d244a 100644
> --- a/backends/hostmem-ram.c
> +++ b/backends/hostmem-ram.c
> @@ -16,10 +16,8 @@
>  
>  
>  static void
> -ram_backend_memory_init(UserCreatable *uc, Error **errp)
> +ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>  {
> -    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
> -
>      if (!backend->size) {
>          error_setg(errp, "can't create backend with size 0");
>          return;
> @@ -33,9 +31,9 @@ ram_backend_memory_init(UserCreatable *uc, Error **errp)
>  static void
>  ram_backend_class_init(ObjectClass *oc, void *data)
>  {
> -    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
> +    HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
>  
> -    ucc->complete = ram_backend_memory_init;
> +    bc->alloc = ram_backend_memory_alloc;
>  }
>  
>  static const TypeInfo ram_backend_info = {
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 06817dd..7d6199f 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -69,8 +69,16 @@ static void host_memory_backend_finalize(Object *obj)
>  static void
>  host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
>  {
> -    error_setg(errp, "memory_init is not implemented for type [%s]",
> -               object_get_typename(OBJECT(uc)));
> +    HostMemoryBackend *backend = MEMORY_BACKEND(uc);
> +    HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
> +
> +    if (!bc->alloc) {
> +        error_setg(errp, "memory_init is not implemented for type [%s]",

s/memory_init/memory_alloc/ ?

> +                   object_get_typename(OBJECT(uc)));
> +        return;
> +    }
> +
> +    bc->alloc(backend, errp);
>  }
>  
>  MemoryRegion *
> diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
> index bc3ffb3..4738107 100644
> --- a/include/sysemu/hostmem.h
> +++ b/include/sysemu/hostmem.h
> @@ -34,6 +34,8 @@ typedef struct HostMemoryBackendClass HostMemoryBackendClass;
>   */
>  struct HostMemoryBackendClass {
>      ObjectClass parent_class;
> +
> +    void (*alloc)(HostMemoryBackend *backend, Error **errp);
>  };
>  
>  /**
> -- 
> 1.8.5.3
> 

* Re: [Qemu-devel] [PATCH 2.1 09/28] vl: redo -object parsing
  2014-03-07  2:56   ` Hu Tao
@ 2014-03-07  7:39     ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-07  7:39 UTC (permalink / raw)
  To: Hu Tao; +Cc: ehabkost, mtosatti, qemu-devel, a.motakis, imammedo, gaowanlong

Il 07/03/2014 03:56, Hu Tao ha scritto:
>> -    obj = object_new(type);
>> -    if (qemu_opt_foreach(opts, object_set_property, obj, 1) < 0) {
>> -        object_unref(obj);
>> -        return -1;
>> +    qdict_del(pdict, "qom-type");
>> +    visit_type_str(opts_get_visitor(ov), &type, "qom-type", &err);
>> +    if (err) {
>> +        goto out;
>>      }
>
> Can be moved up right before creating qdict.
>
>>
>> -    if (!object_dynamic_cast(obj, TYPE_USER_CREATABLE)) {
>> -        error_setg(&local_err, "object '%s' isn't supported by -object",
>> -                   id);
>> +    qdict_del(pdict, "id");
>> +    visit_type_str(opts_get_visitor(ov), &id, "id", &err);
>> +    if (err) {
>>          goto out;
>>      }
>
> Can be moved up right before creating qdict.

In both cases I prefer to keep the qdict_del and visit_type_str together.

>>
>> -    user_creatable_complete(obj, &local_err);
>> -    if (local_err) {
>> +    object_add(type, id, pdict, opts_get_visitor(ov), &err);
>
> I think it's better to move object_add() from qmp.c to qom/object.c.

No, I don't think so.  qom/object.c does not use QDict.  It is common
for "human user interface" files (hmp.c, ui/gtk.c, in this case vl.c) to
use qmp.c, so the structure should be:

           hmp.c, ui/gtk.c, vl.c
                  |
                qmp.c
                  |
         qom/, cpus.c, etc.

We could move parts of qmp.c to qom/qmp.c, that would be fine.

Paolo

>> +    if (err) {
>>          goto out;
>>      }
>> -
>> -    object_property_add_child(container_get(object_get_root(), "/objects"),
>> -                              id, obj, &local_err);
>> +    visit_end_struct(opts_get_visitor(ov), &err);
>> +    if (err) {
>> +        qmp_object_del(id, NULL);
>> +    }
>>
>>  out:
>> -    object_unref(obj);
>> -    if (local_err) {
>> -        qerror_report_err(local_err);
>> -        error_free(local_err);
>> -        return -1;
>> +    opts_visitor_cleanup(ov);
>> +
>> +    QDECREF(pdict);
>> +    g_free(id);
>> +    g_free(type);
>> +    g_free(dummy);
>> +    if (err) {
>> +        qerror_report_err(err);
>>      }
>>      return 0;
>>  }
>> --
>> 1.8.5.3
>>
>
>

* Re: [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option
  2014-03-07  5:33   ` Hu Tao
@ 2014-03-07  7:41     ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-07  7:41 UTC (permalink / raw)
  To: Hu Tao; +Cc: ehabkost, mtosatti, qemu-devel, a.motakis, imammedo, gaowanlong

Il 07/03/2014 06:33, Hu Tao ha scritto:
> On Tue, Mar 04, 2014 at 03:00:43PM +0100, Paolo Bonzini wrote:
>> This option provides the infrastructure for binding guest NUMA nodes
>> to host NUMA nodes.  For example:
>>
>>  -object memory-ram,size=1024M,policy=membind,host-nodes=0,id=ram-node0 \
>>  -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
>>  -object memory-ram,size=1024M,policy=interleave,host-nodes=1-3,id=ram-node1 \
>>  -numa node,nodeid=1,cpus=1,memdev=ram-node1
>>
>> The option replaces "-numa node,mem=".
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  include/sysemu/sysemu.h |  1 +
>>  numa.c                  | 63 +++++++++++++++++++++++++++++++++++++++++++++++--
>>  qapi-schema.json        |  8 ++++++-
>>  qemu-options.hx         | 12 ++++++----
>>  4 files changed, 77 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index 54a6f28..4870129 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -139,6 +139,7 @@ extern int nb_numa_nodes;
>>  typedef struct node_info {
>>      uint64_t node_mem;
>>      DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
>> +    struct HostMemoryBackend *node_memdev;
>>  } NodeInfo;
>>  extern NodeInfo numa_info[MAX_NODES];
>>  void set_numa_nodes(void);
>> diff --git a/numa.c b/numa.c
>> index 930f49d..b00ef90 100644
>> --- a/numa.c
>> +++ b/numa.c
>> @@ -32,6 +32,7 @@
>>  #include "qapi/dealloc-visitor.h"
>>  #include "qapi/qmp/qerror.h"
>>  #include "hw/boards.h"
>> +#include "sysemu/hostmem.h"
>>
>>  QemuOptsList qemu_numa_opts = {
>>      .name = "numa",
>> @@ -40,6 +41,8 @@ QemuOptsList qemu_numa_opts = {
>>      .desc = { { 0 } } /* validated with OptsVisitor */
>>  };
>>
>> +static int have_memdevs = -1;
>> +
>
> bool?
>
>

It is three-state, and "-1" means "I haven't seen -numa yet".  Because of
the nodeid parameter, you cannot replace "have_memdevs == -1" with, for
example, "nodenr == 0".
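
To illustrate the three-state flag, here is a minimal standalone sketch,
not the code from the patch.  The enum names and numa_note_node() are made
up for this example, and it assumes the flag is used to reject a mix of
nodes with and without memdev=.  The first -numa option that is parsed may
carry any nodeid, so "nothing seen yet" has to be tracked separately
rather than derived from a node number.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three states of have_memdevs. */
enum {
    MEMDEV_UNKNOWN = -1,   /* no -numa option seen yet */
    MEMDEV_NO      = 0,    /* nodes seen so far do not use memdev= */
    MEMDEV_YES     = 1     /* nodes seen so far use memdev= */
};

/*
 * Record whether a newly parsed node uses memdev=.
 * Returns false if this conflicts with the nodes seen so far.
 */
static bool numa_note_node(int *have_memdevs, bool node_uses_memdev)
{
    int wanted = node_uses_memdev ? MEMDEV_YES : MEMDEV_NO;

    if (*have_memdevs == MEMDEV_UNKNOWN) {
        /* The first node decides, whatever its nodeid is. */
        *have_memdevs = wanted;
        return true;
    }
    return *have_memdevs == wanted;
}
```

A plain bool could not tell "no node seen yet" apart from "nodes seen,
none of them with memdev=".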

Paolo

* Re: [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation
  2014-03-07  6:09   ` Hu Tao
  2014-03-07  6:34     ` Hu Tao
@ 2014-03-07  7:47     ` Paolo Bonzini
  1 sibling, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-07  7:47 UTC (permalink / raw)
  To: Hu Tao; +Cc: ehabkost, mtosatti, qemu-devel, a.motakis, imammedo, gaowanlong

Il 07/03/2014 07:09, Hu Tao ha scritto:
> On Tue, Mar 04, 2014 at 03:00:44PM +0100, Paolo Bonzini wrote:
>> Split the internal interface in exec.c to a separate function, and
>> push the check on mem_path up to memory_region_init_ram.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  exec.c                  | 105 +++++++++++++++++++++++++++++-------------------
>>  include/exec/cpu-all.h  |   3 --
>>  include/exec/ram_addr.h |   2 +
>>  include/sysemu/sysemu.h |   2 +
>>  memory.c                |   7 +++-
>>  5 files changed, 73 insertions(+), 46 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index b69fd29..0aa4947 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -1240,56 +1240,30 @@ static int memory_try_enable_merging(void *addr, size_t len)
>>      return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
>>  }
>>
>> -ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>> -                                   MemoryRegion *mr)
>> +static ram_addr_t ram_block_add(RAMBlock *new_block)
>>  {
>> -    RAMBlock *block, *new_block;
>> +    RAMBlock *block;
>>      ram_addr_t old_ram_size, new_ram_size;
>>
>>      old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
>>
>> -    size = TARGET_PAGE_ALIGN(size);
>> -    new_block = g_malloc0(sizeof(*new_block));
>> -    new_block->fd = -1;
>> -
>>      /* This assumes the iothread lock is taken here too.  */
>>      qemu_mutex_lock_ramlist();
>> -    new_block->mr = mr;
>> -    new_block->offset = find_ram_offset(size);
>> -    if (host) {
>> -        new_block->host = host;
>> -        new_block->flags |= RAM_PREALLOC_MASK;
>> -    } else if (xen_enabled()) {
>> -        if (mem_path) {
>> -            fprintf(stderr, "-mem-path not supported with Xen\n");
>> -            exit(1);
>> -        }
>> -        xen_ram_alloc(new_block->offset, size, mr);
>> -    } else {
>> -        if (mem_path) {
>> -            if (phys_mem_alloc != qemu_anon_ram_alloc) {
>> -                /*
>> -                 * file_ram_alloc() needs to allocate just like
>> -                 * phys_mem_alloc, but we haven't bothered to provide
>> -                 * a hook there.
>> -                 */
>> -                fprintf(stderr,
>> -                        "-mem-path not supported with this accelerator\n");
>> -                exit(1);
>> -            }
>> -            new_block->host = file_ram_alloc(new_block, size, mem_path);
>> -        }
>> -        if (!new_block->host) {
>> -            new_block->host = phys_mem_alloc(size);
>> +    new_block->offset = find_ram_offset(new_block->length);
>> +
>> +    if (!new_block->host) {
>> +        if (xen_enabled()) {
>> +            xen_ram_alloc(new_block->offset, new_block->length, new_block->mr);
>> +        } else {
>> +            new_block->host = phys_mem_alloc(new_block->length);
>>              if (!new_block->host) {
>>                  fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
>>                          new_block->mr->name, strerror(errno));
>>                  exit(1);
>>              }
>> -            memory_try_enable_merging(new_block->host, size);
>> +            memory_try_enable_merging(new_block->host, new_block->length);
>>          }
>>      }
>> -    new_block->length = size;
>>
>>      /* Keep the list sorted from biggest to smallest block.  */
>>      QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>> @@ -1317,18 +1291,65 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>>                                     old_ram_size, new_ram_size);
>>         }
>>      }
>> -    cpu_physical_memory_set_dirty_range(new_block->offset, size);
>> +    cpu_physical_memory_set_dirty_range(new_block->offset, new_block->length);
>>
>> -    qemu_ram_setup_dump(new_block->host, size);
>> -    qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
>> -    qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
>> +    qemu_ram_setup_dump(new_block->host, new_block->length);
>> +    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_HUGEPAGE);
>> +    qemu_madvise(new_block->host, new_block->length, QEMU_MADV_DONTFORK);
>>
>> -    if (kvm_enabled())
>> -        kvm_setup_guest_memory(new_block->host, size);
>> +    if (kvm_enabled()) {
>> +        kvm_setup_guest_memory(new_block->host, new_block->length);
>> +    }
>>
>>      return new_block->offset;
>>  }
>>
>> +ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
>> +                                    const char *mem_path)
>> +{
>> +    RAMBlock *new_block;
>> +
>> +    if (xen_enabled()) {
>> +        fprintf(stderr, "-mem-path not supported with Xen\n");
>> +        exit(1);
>> +    }
>> +
>> +    if (phys_mem_alloc != qemu_anon_ram_alloc) {
>> +        /*
>> +         * file_ram_alloc() needs to allocate just like
>> +         * phys_mem_alloc, but we haven't bothered to provide
>> +         * a hook there.
>> +         */
>> +        fprintf(stderr,
>> +                "-mem-path not supported with this accelerator\n");
>> +        exit(1);
>> +    }
>> +
>> +    size = TARGET_PAGE_ALIGN(size);
>> +    new_block = g_malloc0(sizeof(*new_block));
>> +    new_block->mr = mr;
>> +    new_block->length = size;
>> +    new_block->host = file_ram_alloc(new_block, size, mem_path);
>> +    return ram_block_add(new_block);
>> +}
>> +
>> +ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>> +                                   MemoryRegion *mr)
>> +{
>> +    RAMBlock *new_block;
>> +
>> +    size = TARGET_PAGE_ALIGN(size);
>> +    new_block = g_malloc0(sizeof(*new_block));
>> +    new_block->mr = mr;
>> +    new_block->length = size;
>> +    new_block->fd = -1;
>> +    new_block->host = host;
>> +    if (host) {
>> +        new_block->flags |= RAM_PREALLOC_MASK;
>> +    }
>> +    return ram_block_add(new_block);
>> +}
>> +
>>  ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
>>  {
>>      return qemu_ram_alloc_from_ptr(size, NULL, mr);
>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
>> index e66ab5b..b44babb 100644
>> --- a/include/exec/cpu-all.h
>> +++ b/include/exec/cpu-all.h
>> @@ -466,9 +466,6 @@ typedef struct RAMList {
>>  } RAMList;
>>  extern RAMList ram_list;
>>
>> -extern const char *mem_path;
>> -extern int mem_prealloc;
>> -
>>  /* Flags stored in the low bits of the TLB virtual address.  These are
>>     defined so that fast path ram access is all zeros.  */
>>  /* Zero if TLB entry is valid.  */
>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>> index 2edfa96..dedb258 100644
>> --- a/include/exec/ram_addr.h
>> +++ b/include/exec/ram_addr.h
>> @@ -22,6 +22,8 @@
>>  #ifndef CONFIG_USER_ONLY
>>  #include "hw/xen/xen.h"
>>
>> +ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
>> +                                    const char *mem_path);
>>  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>>                                     MemoryRegion *mr);
>>  ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index 4870129..03f5ee5 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -132,6 +132,8 @@ extern uint8_t *boot_splash_filedata;
>>  extern size_t boot_splash_filedata_size;
>>  extern uint8_t qemu_extra_params_fw[2];
>>  extern QEMUClockType rtc_clock;
>> +extern const char *mem_path;
>> +extern int mem_prealloc;
>>
>>  #define MAX_NODES 128
>>  #define MAX_CPUMASK_BITS 255
>> diff --git a/memory.c b/memory.c
>> index 59ecc28..32b17a8 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -23,6 +23,7 @@
>>
>>  #include "exec/memory-internal.h"
>>  #include "exec/ram_addr.h"
>> +#include "sysemu/sysemu.h"
>>
>>  //#define DEBUG_UNASSIGNED
>>
>> @@ -1016,7 +1017,11 @@ void memory_region_init_ram(MemoryRegion *mr,
>>      mr->ram = true;
>>      mr->terminates = true;
>>      mr->destructor = memory_region_destructor_ram;
>> -    mr->ram_addr = qemu_ram_alloc(size, mr);
>> +    if (mem_path) {
>> +        mr->ram_addr = qemu_ram_alloc_from_file(size, mr, mem_path);
>> +    } else {
>> +        mr->ram_addr = qemu_ram_alloc(size, mr);
>> +    }
>>  }
>
> This changes the logic of the original code:
>
>   if (mem_path) {
>       ...
>       new_block->host = file_ram_alloc(new_block, size, mem_path);
>   }
>   if (!new_block->host) {
>       new_block->host = phys_mem_alloc(size);
>       ...
>   }

ram_block_add is still calling phys_mem_alloc:

     new_block = g_malloc0(sizeof(*new_block));
     new_block->mr = mr;
     new_block->length = size;
     new_block->host = file_ram_alloc(new_block, size, mem_path);
     return ram_block_add(new_block);
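
The fallback can be sketched in isolation like this.  It is a toy model
under stated assumptions, not the exec.c code: ToyBlock and the toy_*
functions are invented for the example, malloc() plays the role of
phys_mem_alloc(), and the file allocator is assumed to return NULL when
it fails without exiting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for RAMBlock. */
typedef struct {
    void *host;
    size_t length;
    int from_fallback;   /* set when the generic allocator was used */
} ToyBlock;

/* Stand-in for file_ram_alloc(): returns NULL on failure. */
static void *toy_file_alloc(size_t size, int simulate_failure)
{
    return simulate_failure ? NULL : malloc(size);
}

/* Stand-in for ram_block_add(): allocate only if host is still NULL. */
static void toy_block_add(ToyBlock *b)
{
    if (!b->host) {
        b->host = malloc(b->length);   /* role of phys_mem_alloc() */
        b->from_fallback = 1;
    }
}

/* Stand-in for qemu_ram_alloc_from_file(). */
static void toy_alloc_from_file(ToyBlock *b, size_t size,
                                int simulate_failure)
{
    b->length = size;
    b->from_fallback = 0;
    b->host = toy_file_alloc(size, simulate_failure);
    toy_block_add(b);
}
```

So the "if (!new_block->host)" test from the original code is still
reached; it has only moved into the common helper.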

Paolo

>>
>>  void memory_region_init_ram_ptr(MemoryRegion *mr,
>> --
>> 1.8.5.3
>>
>
>

* Re: [Qemu-devel] [PATCH 2.1 11/28] qmp: improve error reporting for -object and object-add
  2014-03-07  3:07   ` Hu Tao
@ 2014-03-07  7:57     ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-07  7:57 UTC (permalink / raw)
  To: Hu Tao; +Cc: ehabkost, mtosatti, qemu-devel, a.motakis, imammedo, gaowanlong

Il 07/03/2014 04:07, Hu Tao ha scritto:
> There is already an accepted version de580dafade551.
>
> Paolo, I found that your numa tree is behind current master about 99
> commits. I'd like to take over this series if you have no time on it.

Sure, I rebased it and pushed it again (I have not tested the rebase yet).

Paolo

* Re: [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
  2014-03-05 11:30   ` Paolo Bonzini
@ 2014-03-07 11:59     ` Andreas Färber
  2014-03-07 12:20       ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Andreas Färber @ 2014-03-07 11:59 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: ehabkost, hutao, mtosatti, Anthony Liguori, imammedo, Chen Fan,
	a.motakis, gaowanlong

Am 05.03.2014 12:30, schrieb Paolo Bonzini:
> Il 05/03/2014 12:05, Andreas Färber ha scritto:
>> Am 04.03.2014 15:00, schrieb Paolo Bonzini:
>>> This series includes all the pending work on QOMifying the memory
>>> backends.
>> [snip]
>>
>> There's also a recent RFC from Chen Fan about how to model the
>> association between NUMA nodes and CPU socket/core/thread that
>> would/should influence this series if we're aiming for 2.1 now.
> 
> I don't think it should, apart from conflicts.  This series only changes
> things about memory.  CPUs are handled the same before and after the
> patches.
> 
>> I didn't review it in-depth yet, but minor technical issues apart, I
>> think we need to keep NUMA and CPU separate,
> 
> I agree.

Here you agreed ...

>> Compare that to:
>>
>> /machine
>>   /node[0] # This is not really telling!
>>     /socket[0]
>>       /core[0]
>>         /thread[0] # So CPUState != thread?
>>           cpu -> /machine/unassigned/device[0]
>>   /unassigned
>>     /device[0]
> 
> I think this is better; in our world we can have multiple sockets in the
> same NUMA node.  But CPUState == thread, so you can have just /thread[0]
> -> /machine/unassigned/device[0].

... but you seem to have missed my point about separation. Here the
socket object is a child<> of the NUMA node and would get realized
together with it but separate from the link<>ed CPUState.

> Alternatively, and to keep CPU + NUMA even *more* separate:
> 
>   /machine
>     /node[0]
>        /cpu[0] -> /machine/unassigned/device[0]
>        ...
>     /socket[0]
>        /core[0]
>           /thread[0] -> /machine/unassigned/device[0]
>     /unassigned
>        /device[0]

Now this is pretty much my proposal ;) except that you retained the
criticized "node" as name and moved "socket[0]" out of
/machine/unassigned (I had /machine/peripheral in mind for -device) and
keep the CPUState out of the socket object.

Anthony had requested hot-add to happen via "device_add Xeon4242",
adding a full socket object with 6 cores at once. In that case CPUState
needs to be an integral part of that socket-derived device for recursive
realization. Objects that are just link<>ed to wouldn't get
automatically realized.

Since the only two other places for creating an X86CPU are PC code plus
cpu-add I don't envision problems with adding it as child<> to its core.

>> which then brings up the
>> question Chen Fan asked about whether we need to support splitting CPU
>> threads of one core or CPU cores of one socket onto different NUMA
>> nodes. If we can stop supporting this, 2.0 would be a good point in time
>> to catch this with an error message at least, even if the remodeling
>> depending on it happens post-2.0.
> 
>> Note that according to my interpretation of QOM ABI stability rules we
>> can't just turn a link<cpu> into a child<cpu> without renaming, thus
>> trying to be forward-looking for where we want to go design-wise.
> 
> I think we can.  Children and links look exactly the same from the outside.

Well, we can't qom-get/qom-set a path string from/to a child<> property,
can we? But paths can indeed be resolved either way.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

* Re: [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
  2014-03-07 11:59     ` Andreas Färber
@ 2014-03-07 12:20       ` Paolo Bonzini
  2014-03-07 12:56         ` Igor Mammedov
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-07 12:20 UTC (permalink / raw)
  To: Andreas Färber, qemu-devel
  Cc: ehabkost, hutao, mtosatti, Anthony Liguori, imammedo, Chen Fan,
	a.motakis, gaowanlong

Il 07/03/2014 12:59, Andreas Färber ha scritto:
> Am 05.03.2014 12:30, schrieb Paolo Bonzini:
>> Il 05/03/2014 12:05, Andreas Färber ha scritto:
>>> I didn't review it in-depth yet, but minor technical issues apart, I
>>> think we need to keep NUMA and CPU separate,
>>
>> I agree.
>
> Here you agreed ...
>
>>> Compare that to:
>>>
>>> /machine
>>>   /node[0] # This is not really telling!
>>>     /socket[0]
>>>       /core[0]
>>>         /thread[0] # So CPUState != thread?
>>>           cpu -> /machine/unassigned/device[0]
>>>   /unassigned
>>>     /device[0]
>>
>> I think this is better; in our world we can have multiple sockets in the
>> same NUMA node.  But CPUState == thread, so you can have just /thread[0]
>> -> /machine/unassigned/device[0].
>
> ... but you seem to have missed my point about separation. Here the
> socket object is a child<> of the NUMA node and would get realized
> together with it but separate from the link<>ed CPUState.

Ah, I didn't think of the socket object as anything but a container.  For
me, "keep NUMA and CPU separate" meant "keep NUMA and CPUState separate".

>> Alternatively, and to keep CPU + NUMA even *more* separate:
>>
>>   /machine
>>     /node[0]
>>        /cpu[0] -> /machine/unassigned/device[0]
>>        ...
>>     /socket[0]
>>        /core[0]
>>           /thread[0] -> /machine/unassigned/device[0]
>>     /unassigned
>>        /device[0]
>
> Now this is pretty much my proposal ;) except that you retained the
> criticized "node" as name and moved "socket[0]" out of
> /machine/unassigned (I had /machine/peripheral in mind for -device) and
> keep the CPUState out of the socket object.
>
> Anthony had requested hot-add to happen via "device_add Xeon4242",
> adding a full socket object with 6 cores at once. In that case CPUState
> needs to be an integral part of that socket-derived device for recursive
> realization. Objects that are just link<>ed to wouldn't get
> automatically realized.

Yes, if you want to do this then you're right and /socket[n] needs to be 
a device.

However, I'd still like it to be mostly a container, and that is why I 
liked the idea of having /node[n] with "flat" links to the actual 
CPUStates (and also memdevs).

>> I think we can.  Children and links look exactly the same from the outside.
>
> Well, we can't qom-get/qom-set a path string from/to a child<> property,
> can we?

We can get it but not set it.  But Stefan's series provides a way to 
make links read-only too, and these links should be read-only I think.

Paolo

* Re: [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
  2014-03-07 12:20       ` Paolo Bonzini
@ 2014-03-07 12:56         ` Igor Mammedov
  2014-03-07 13:35           ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Igor Mammedov @ 2014-03-07 12:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, hutao, mtosatti, qemu-devel, Anthony Liguori, Chen Fan,
	a.motakis, Andreas Färber, gaowanlong

On Fri, 07 Mar 2014 13:20:45 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 07/03/2014 12:59, Andreas Färber ha scritto:
> > Am 05.03.2014 12:30, schrieb Paolo Bonzini:
> >> Il 05/03/2014 12:05, Andreas Färber ha scritto:
> >>> I didn't review it in-depth yet, but minor technical issues apart, I
> >>> think we need to keep NUMA and CPU separate,
> >>
> >> I agree.
> >
> > Here you agreed ...
> >
> >>> Compare that to:
> >>>
> >>> /machine
> >>>   /node[0] # This is not really telling!
> >>>     /socket[0]
> >>>       /core[0]
> >>>         /thread[0] # So CPUState != thread?
> >>>           cpu -> /machine/unassigned/device[0]
> >>>   /unassigned
> >>>     /device[0]
> >>
> >> I think this is better; in our world we can have multiple sockets in the
> >> same NUMA node.  But CPUState == thread, so you can have just /thread[0]
> >> -> /machine/unassigned/device[0].
> >
> > ... but you seem to have missed my point about separation. Here the
> > socket object is a child<> of the NUMA node and would get realized
> > together with it but separate from the link<>ed CPUState.
> 
> Ah, I didn't think the socket object as anything but a container.  For 
> me, "keep NUMA and CPU separate" meant "keep NUMA and CPUState separate".
> 
> >> Alternatively, and to keep CPU + NUMA even *more* separate:
> >>
> >>   /machine
> >>     /node[0]
> >>        /cpu[0] -> /machine/unassigned/device[0]
> >>        ...
> >>     /socket[0]
> >>        /core[0]
> >>           /thread[0] -> /machine/unassigned/device[0]
> >>     /unassigned
> >>        /device[0]
> >
> > Now this is pretty much my proposal ;) except that you retained the
> > criticized "node" as name and moved "socket[0]" out of
> > /machine/unassigned (I had /machine/peripheral in mind for -device) and
> > keep the CPUState out of the socket object.
> >
> > Anthony had requested hot-add to happen via "device_add Xeon4242",
> > adding a full socket object with 6 cores at once. In that case CPUState
> > needs to be an integral part of that socket-derived device for recursive
> > realization. Objects that are just link<>ed to wouldn't get
> > automatically realized.
We possibly can't do it in an arch-independent manner, but if we are talking
about the 'pc' machine and ever want to model the real CPU composition there
(is there a reason to do it?), then the composite CPU object could still stay
in the internal /machine/unassigned or /machine/peripheral trees, in parallel
with the public /machine/node[x]/socket[y]/core[z]/link<CPUstate>[j] topology
interface.

> 
> Yes, if you want to do this then you're right and /socket[n] needs to be 
> a device.
> 
> However, I'd still like it to be mostly a container, and that is why I 
> liked the idea of having /node[n] with "flat" links to the actual 
> CPUStates (and also memdevs).
Is there a point in having flat links to CPUState at the /nodeX level?

The idea of creating the [*] /node[x]/socket[y]/core[z]/link<CPUstate>[j] tree
was suggested as a way:
 1. to expose a stable, arch-independent topology interface to the user
 2. to use [*] as the argument to -device / device_add/del cpu,path=foo, to
    avoid exposing the arch-dependent APIC ID to the user
while keeping the /machine/node/socket/core objects mostly as containers that
express the above.

> 
> >> I think we can.  Children and links look exactly the same from the outside.
> >
> > Well, we can't qom-get/qom-set a path string from/to a child<> property,
> > can we?
> 
> We can get it but not set it.  But Stefan's series provides a way to 
> make links read-only too, and these links should be read-only I think.
CPUState links are read-only only as long as hotplug is not supported.

> 
> Paolo


-- 
Regards,
  Igor

* Re: [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
  2014-03-07 12:56         ` Igor Mammedov
@ 2014-03-07 13:35           ` Paolo Bonzini
  2014-03-07 14:54             ` Igor Mammedov
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-07 13:35 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, hutao, mtosatti, qemu-devel, Anthony Liguori, Chen Fan,
	a.motakis, Andreas Färber, gaowanlong

Il 07/03/2014 13:56, Igor Mammedov ha scritto:
>> However, I'd still like it to be mostly a container, and that is why I
>> liked the idea of having /node[n] with "flat" links to the actual
>> CPUStates (and also memdevs).
>
> Is there a point in having flat links to CPUState at the /nodeX level?

Easily getting thread ids for the VCPU thread and pinning them to host 
nodes?  For this you need to match the CPU numbers passed to "-numa 
node", not some socket topology that can be completely arbitrary.
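
A minimal sketch of the matching step described above, not code from the
series: given the CPU numbers assigned to one guest node and a list of
(CPU index, host thread id) pairs, collect the thread ids that belong to
that node. The VcpuThread type and the node_vcpu_threads() helper are
hypothetical names used only for illustration; the pairs are assumed to come
from something like the CPU index and thread_id fields reported by the QMP
query-cpus command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record pairing a guest CPU index with its host thread id. */
typedef struct {
    int cpu_index;
    int thread_id;
} VcpuThread;

/*
 * Copy into out[] the thread ids of all VCPUs whose cpu_index bit is set in
 * node_cpus (bit i set means guest CPU i was assigned to the node).
 * Only CPU indexes 0..63 are handled in this sketch.
 * Returns the number of ids written, which is at most out_len.
 */
static size_t node_vcpu_threads(const VcpuThread *vcpus, size_t n,
                                uint64_t node_cpus,
                                int *out, size_t out_len)
{
    size_t count = 0;

    assert(out != NULL || out_len == 0);
    for (size_t i = 0; i < n && count < out_len; i++) {
        int idx = vcpus[i].cpu_index;

        if (idx >= 0 && idx < 64 && (node_cpus & (UINT64_C(1) << idx))) {
            out[count++] = vcpus[i].thread_id;
        }
    }
    return count;
}
```

On a Linux host, each returned thread id could then be pinned to the CPUs of
the chosen host node with sched_setaffinity().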

Paolo

> The idea of creating a [*] /node[x]/socket[y]/core[z]/link<CPUstate>[j] tree was
> suggested as a way to:
>  1. expose a stable, arch-independent topology interface to the user
>  2. use [*] as the argument to -device / device_add/del cpu,path=foo, to avoid
>     exposing the arch-dependent APIC ID to the user
> while keeping the /machine/node/socket/core objects mostly as containers that
> express the above.
>
>>
>>>> I think we can.  Children and links look exactly the same from the outside.
>>>
>>> Well, we can't qom-get/qom-set a path string from/to a child<> property,
>>> can we?
>>
>> We can get it but not set it.  But Stefan's series provides a way to
>> make links read-only too, and these links should be read-only I think.
> CPUState links are read-only only as long as hotplug is not supported.
>
>>
>> Paolo
>
>


* Re: [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements
  2014-03-07 13:35           ` Paolo Bonzini
@ 2014-03-07 14:54             ` Igor Mammedov
  0 siblings, 0 replies; 70+ messages in thread
From: Igor Mammedov @ 2014-03-07 14:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, hutao, mtosatti, qemu-devel, Anthony Liguori, Chen Fan,
	a.motakis, Andreas Färber, gaowanlong

On Fri, 07 Mar 2014 14:35:09 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 07/03/2014 13:56, Igor Mammedov ha scritto:
> >> However, I'd still like it to be mostly a container, and that is why I
> >> liked the idea of having /node[n] with "flat" links to the actual
> >> CPUStates (and also memdevs).
> >
> > Is there a point in having flat links to CPUState at the /nodeX level?
> 
> Easily getting thread ids for the VCPU thread and pinning them to host 
> nodes?  For this you need to match the CPU numbers passed to "-numa 
> node", not some socket topology that can be completely arbitrary.
The CPU numbers on -numa node come from the legacy cpu_index; shouldn't
we try to get rid of it in favor of something more manageable?
Since CPUs are now devices, one solution would be to use "id" to specify CPUs
on -numa node; another would be to use path names, as with memdev.


> 
> Paolo
> 
> > The idea of creating a [*] /node[x]/socket[y]/core[z]/link<CPUstate>[j] tree was
> > suggested as a way to:
> >  1. expose a stable, arch-independent topology interface to the user
> >  2. use [*] as the argument to -device / device_add/del cpu,path=foo, to avoid
> >     exposing the arch-dependent APIC ID to the user
> > while keeping the /machine/node/socket/core objects mostly as containers that
> > express the above.
> >
> >>
> >>>> I think we can.  Children and links look exactly the same from the outside.
> >>>
> >>> Well, we can't qom-get/qom-set a path string from/to a child<> property,
> >>> can we?
> >>
> >> We can get it but not set it.  But Stefan's series provides a way to
> >> make links read-only too, and these links should be read-only I think.
> > CPUState links are read-only only as long as hotplug is not supported.
> >
> >>
> >> Paolo
> >
> >
> 


-- 
Regards,
  Igor


* Re: [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory Paolo Bonzini
@ 2014-03-11  3:50   ` Hu Tao
  2014-03-11  8:03     ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Hu Tao @ 2014-03-11  3:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:45PM +0100, Paolo Bonzini wrote:
> Like the previous patch did in exec.c, split memory_region_init_ram and
> memory_region_init_ram_from_file, and push mem_path one step further up.
> Other RAM regions than system memory will now be backed by regular RAM.

This changes qemu's behaviour regarding the use of hugetlbfs, especially when
the size of the other RAM regions is significant compared to system memory. Will
this be a problem (compatibility, user configurations...)?

> 
> Also, boards that do not use memory_region_allocate_system_memory will
> not support -mem-path anymore.  This can be changed before the patches
> are merged by migrating boards to use the function.

IIUC, memory_region_allocate_system_memory() is only called once, at
board initialization time. In most cases, it just works. But there
are cases where system memory is not initialized by a single call to
memory_region_allocate_system_memory(): total memory is split
into banks, each of which is initialized individually; see
ppc4xx_sdram_adjust(). How do we convert to
memory_region_allocate_system_memory() in this case? Should we map
banks into numa nodes?

> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                | 10 ++--------
>  include/exec/memory.h | 18 ++++++++++++++++++
>  memory.c              | 21 ++++++++++++++++-----
>  numa.c                | 11 ++++++++++-
>  4 files changed, 46 insertions(+), 14 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 0aa4947..4f05584 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1123,14 +1123,6 @@ static void *file_ram_alloc(RAMBlock *block,
>      block->fd = fd;
>      return area;
>  }
> -#else
> -static void *file_ram_alloc(RAMBlock *block,
> -                            ram_addr_t memory,
> -                            const char *path)
> -{
> -    fprintf(stderr, "-mem-path not supported on this host\n");
> -    exit(1);
> -}
>  #endif
>  
>  static ram_addr_t find_ram_offset(ram_addr_t size)
> @@ -1304,6 +1296,7 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
>      return new_block->offset;
>  }
>  
> +#ifdef __linux__
>  ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
>                                      const char *mem_path)
>  {
> @@ -1332,6 +1325,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
>      new_block->host = file_ram_alloc(new_block, size, mem_path);
>      return ram_block_add(new_block);
>  }
> +#endif
>  
>  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>                                     MemoryRegion *mr)
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 9101fc3..54bdb4d 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -311,6 +311,24 @@ void memory_region_init_ram(MemoryRegion *mr,
>                              const char *name,
>                              uint64_t size);
>  
> +#ifdef __linux__
> +/**
> + * memory_region_init_ram_from_file:  Initialize RAM memory region with a
> + *                                    mmap-ed backend.
> + *
> + * @mr: the #MemoryRegion to be initialized.
> + * @owner: the object that tracks the region's reference count
> + * @name: the name of the region.
> + * @size: size of the region.
> + * @path: the path in which to allocate the RAM.
> + */
> +void memory_region_init_ram_from_file(MemoryRegion *mr,
> +                                      struct Object *owner,
> +                                      const char *name,
> +                                      uint64_t size,
> +                                      const char *path);
> +#endif
> +
>  /**
>   * memory_region_init_ram_ptr:  Initialize RAM memory region from a
>   *                              user-provided pointer.  Accesses into the
> diff --git a/memory.c b/memory.c
> index 32b17a8..1636351 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1017,13 +1017,24 @@ void memory_region_init_ram(MemoryRegion *mr,
>      mr->ram = true;
>      mr->terminates = true;
>      mr->destructor = memory_region_destructor_ram;
> -    if (mem_path) {
> -        mr->ram_addr = qemu_ram_alloc_from_file(size, mr, mem_path);
> -    } else {
> -        mr->ram_addr = qemu_ram_alloc(size, mr);
> -    }
> +    mr->ram_addr = qemu_ram_alloc(size, mr);
>  }
>  
> +#ifdef __linux__
> +void memory_region_init_ram_from_file(MemoryRegion *mr,
> +                                      struct Object *owner,
> +                                      const char *name,
> +                                      uint64_t size,
> +                                      const char *path)
> +{
> +    memory_region_init(mr, owner, name, size);
> +    mr->ram = true;
> +    mr->terminates = true;
> +    mr->destructor = memory_region_destructor_ram;
> +    mr->ram_addr = qemu_ram_alloc_from_file(size, mr, path);
> +}
> +#endif
> +
>  void memory_region_init_ram_ptr(MemoryRegion *mr,
>                                  Object *owner,
>                                  const char *name,
> diff --git a/numa.c b/numa.c
> index b00ef90..1afa017 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -228,7 +228,16 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>  {
>      uint64_t ram_size = args->ram_size;
>  
> -    memory_region_init_ram(mr, owner, name, ram_size);
> +    if (mem_path) {
> +#ifdef __linux__
> +        memory_region_init_ram_from_file(mr, owner, name, ram_size, mem_path);
> +#else
> +        fprintf(stderr, "-mem-path not supported on this host\n");
> +        exit(1);
> +#endif
> +    } else {
> +        memory_region_init_ram(mr, owner, name, ram_size);
> +    }
>      vmstate_register_ram_global(mr);
>  }
>  
> -- 
> 1.8.5.3
> 


* Re: [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory
  2014-03-11  3:50   ` Hu Tao
@ 2014-03-11  8:03     ` Paolo Bonzini
  2014-03-12  2:08       ` Marcelo Tosatti
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2014-03-11  8:03 UTC (permalink / raw)
  To: Hu Tao; +Cc: ehabkost, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

Il 11/03/2014 04:50, Hu Tao ha scritto:
> On Tue, Mar 04, 2014 at 03:00:45PM +0100, Paolo Bonzini wrote:
>> Like the previous patch did in exec.c, split memory_region_init_ram and
>> memory_region_init_ram_from_file, and push mem_path one step further up.
>> Other RAM regions than system memory will now be backed by regular RAM.
>
> This changes qemu's behaviour regarding the use of hugetlbfs, especially when
> the size of the other RAM regions is significant compared to system memory. Will
> this be a problem (compatibility, user configurations...)?

I think it's a bugfix, especially with 1G hugepages.  Wasting 1G on VGA 
memory is not really a good idea.

>>
>> Also, boards that do not use memory_region_allocate_system_memory will
>> not support -mem-path anymore.  This can be changed before the patches
>> are merged by migrating boards to use the function.
>
> IIUC, memory_region_allocate_system_memory() is only called once, at
> board initialization time. In most cases, it just works. But there
> are cases where system memory is not initialized by a single call to
> memory_region_allocate_system_memory(): total memory is split
> into banks, each of which is initialized individually; see
> ppc4xx_sdram_adjust(). How do we convert to
> memory_region_allocate_system_memory() in this case? Should we map
> banks into numa nodes?

The solution would be to use aliases to split the system RAM into banks.
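
A rough, self-contained sketch of the bookkeeping behind the alias approach,
not code from the series: one system RAM region is allocated once, and each
bank becomes a window (offset, size) into it. BankWindow and
split_into_banks() are hypothetical names; in QEMU each window would
presumably be turned into an alias with memory_region_init_alias() and mapped
at the bank's guest address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one bank as a window into system RAM. */
typedef struct {
    uint64_t offset; /* offset into the single system RAM region */
    uint64_t size;   /* number of bytes covered by this bank */
} BankWindow;

/*
 * Carve ram_size bytes into consecutive windows, one per entry of
 * bank_sizes[], truncating the last window if RAM runs out.
 * out[] must have room for nbanks entries.
 * Returns the number of windows written to out[].
 */
static size_t split_into_banks(uint64_t ram_size,
                               const uint64_t *bank_sizes, size_t nbanks,
                               BankWindow *out)
{
    uint64_t offset = 0;
    size_t used = 0;

    assert(bank_sizes != NULL && out != NULL);
    for (size_t i = 0; i < nbanks && offset < ram_size; i++) {
        uint64_t left = ram_size - offset;
        uint64_t size = bank_sizes[i] < left ? bank_sizes[i] : left;

        if (size == 0) {
            continue; /* skip empty bank slots */
        }
        /* In QEMU, this is where an alias of the RAM region would be made. */
        out[used].offset = offset;
        out[used].size = size;
        used++;
        offset += size;
    }
    return used;
}
```

With this layout the board still makes a single allocation call, so a file
backing applies to the whole of system RAM rather than to each bank.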

Paolo


* Re: [Qemu-devel] [PATCH 2.1 06/28] man: improve -numa doc
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 06/28] man: improve -numa doc Paolo Bonzini
@ 2014-03-11 18:53   ` Eduardo Habkost
  0 siblings, 0 replies; 70+ messages in thread
From: Eduardo Habkost @ 2014-03-11 18:53 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: hutao, mtosatti, qemu-devel, Luiz Capitulino, a.motakis,
	imammedo, gaowanlong

On Tue, Mar 04, 2014 at 03:00:34PM +0100, Paolo Bonzini wrote:
> From: Luiz Capitulino <lcapitulino@redhat.com>
> 
> The -numa option documentation in qemu's manpage lacks the command-line
> options and some information regarding how it relates to options -m and
> -smp. This commit fills in the missing text.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton
  2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton Paolo Bonzini
  2014-03-05 10:08   ` Andreas Färber
  2014-03-07  2:27   ` Hu Tao
@ 2014-03-11 18:55   ` Eduardo Habkost
  2 siblings, 0 replies; 70+ messages in thread
From: Eduardo Habkost @ 2014-03-11 18:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: hutao, mtosatti, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 04, 2014 at 03:00:35PM +0100, Paolo Bonzini wrote:
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory
  2014-03-11  8:03     ` Paolo Bonzini
@ 2014-03-12  2:08       ` Marcelo Tosatti
  0 siblings, 0 replies; 70+ messages in thread
From: Marcelo Tosatti @ 2014-03-12  2:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, Hu Tao, qemu-devel, imammedo, a.motakis, gaowanlong

On Tue, Mar 11, 2014 at 09:03:29AM +0100, Paolo Bonzini wrote:
> Il 11/03/2014 04:50, Hu Tao ha scritto:
> >On Tue, Mar 04, 2014 at 03:00:45PM +0100, Paolo Bonzini wrote:
> >>Like the previous patch did in exec.c, split memory_region_init_ram and
> >>memory_region_init_ram_from_file, and push mem_path one step further up.
> >>Other RAM regions than system memory will now be backed by regular RAM.
> >
> >This changes qemu's behaviour regarding the use of hugetlbfs, especially when
> >the size of the other RAM regions is significant compared to system memory. Will
> >this be a problem (compatibility, user configurations...)?
> 
> I think it's a bugfix, especially with 1G hugepages.  Wasting 1G on
> VGA memory is not really a good idea.

What about this

    if (memory < hpagesize) {
        return NULL;
    }
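
To make the effect of that check concrete, here is a small standalone sketch,
not the QEMU code itself. It assumes that a file-backed allocation smaller
than one huge page is refused (so the caller falls back to regular RAM) and
that larger requests are rounded up to a multiple of the huge page size; the
rounding and the helper name hugepage_backed_size() are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the number of bytes of huge-page-backed memory a region of the
 * given size would occupy, or 0 when the region is smaller than one huge
 * page and would therefore fall back to regular RAM.
 * hpagesize must be a non-zero power of two.
 */
static uint64_t hugepage_backed_size(uint64_t memory, uint64_t hpagesize)
{
    assert(hpagesize != 0 && (hpagesize & (hpagesize - 1)) == 0);

    if (memory < hpagesize) {
        return 0; /* too small: not placed on huge pages at all */
    }
    /* Round up to the next multiple of the huge page size. */
    return (memory + hpagesize - 1) & ~(hpagesize - 1);
}
```

Under these assumptions a 16 MiB region would not consume a 1 GiB huge page,
while a 1.5 GiB region would occupy 2 GiB.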

> >>
> >>Also, boards that do not use memory_region_allocate_system_memory will
> >>not support -mem-path anymore.  This can be changed before the patches
> >>are merged by migrating boards to use the function.
> >
> >IIUC, memory_region_allocate_system_memory() is only called once, at
> >board initialization time. In most cases, it just works. But there
> >are cases where system memory is not initialized by a single call to
> >memory_region_allocate_system_memory(): total memory is split
> >into banks, each of which is initialized individually; see
> >ppc4xx_sdram_adjust(). How do we convert to
> >memory_region_allocate_system_memory() in this case? Should we map
> >banks into numa nodes?
> 
> The solution would be to use aliases to split the system RAM into banks.
> 
> Paolo


end of thread, other threads:[~2014-03-12  2:09 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-04 14:00 [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 01/28] NUMA: move numa related code to new file numa.c Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 02/28] NUMA: check if the total numa memory size is equal to ram_size Paolo Bonzini
2014-03-04 17:00   ` Eric Blake
2014-03-04 17:19     ` Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 03/28] NUMA: Add numa_info structure to contain numa nodes info Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 04/28] NUMA: convert -numa option to use OptsVisitor Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 05/28] NUMA: expand MAX_NODES from 64 to 128 Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 06/28] man: improve -numa doc Paolo Bonzini
2014-03-11 18:53   ` Eduardo Habkost
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 07/28] qemu-option: introduce qemu_find_opts_singleton Paolo Bonzini
2014-03-05 10:08   ` Andreas Färber
2014-03-07  2:27   ` Hu Tao
2014-03-11 18:55   ` Eduardo Habkost
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 08/28] vl: convert -m to QemuOpts Paolo Bonzini
2014-03-05 10:06   ` Andreas Färber
2014-03-05 10:31     ` Paolo Bonzini
2014-03-05 15:09     ` Igor Mammedov
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 09/28] vl: redo -object parsing Paolo Bonzini
2014-03-07  2:56   ` Hu Tao
2014-03-07  7:39     ` Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 10/28] qmp: allow object-add completion handler to get canonical path Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 11/28] qmp: improve error reporting for -object and object-add Paolo Bonzini
2014-03-07  3:07   ` Hu Tao
2014-03-07  7:57     ` Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 12/28] pc: pass QEMUMachineInitArgs to pc_memory_init Paolo Bonzini
2014-03-07  3:09   ` Hu Tao
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 13/28] numa: introduce memory_region_allocate_system_memory Paolo Bonzini
2014-03-07  3:18   ` Hu Tao
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 14/28] add memdev backend infrastructure Paolo Bonzini
2014-03-07  3:31   ` Hu Tao
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 15/28] numa: add -numa node, memdev= option Paolo Bonzini
2014-03-04 17:52   ` Eric Blake
2014-03-07  5:33   ` Hu Tao
2014-03-07  7:41     ` Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 16/28] memory: reorganize file-based allocation Paolo Bonzini
2014-03-07  6:09   ` Hu Tao
2014-03-07  6:34     ` Hu Tao
2014-03-07  7:47     ` Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 17/28] memory: move mem_path handling to memory_region_allocate_system_memory Paolo Bonzini
2014-03-11  3:50   ` Hu Tao
2014-03-11  8:03     ` Paolo Bonzini
2014-03-12  2:08       ` Marcelo Tosatti
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 18/28] memory: add error propagation to file-based RAM allocation Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 19/28] memory: move preallocation code out of exec.c Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 20/28] memory: move RAM_PREALLOC_MASK to exec.c, rename Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 21/28] hostmem: add file-based HostMemoryBackend Paolo Bonzini
2014-03-04 17:38   ` Eric Blake
2014-03-04 18:12     ` Paolo Bonzini
2014-03-07  6:57   ` Hu Tao
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 22/28] hostmem: separate allocation from UserCreatable complete method Paolo Bonzini
2014-03-07  7:08   ` Hu Tao
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 23/28] hostmem: add merge and dump properties Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 24/28] hostmem: allow preallocation of any memory region Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 25/28] hostmem: add property to map memory with MAP_SHARED Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 26/28] configure: add Linux libnuma detection Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 27/28] hostmem: add properties for NUMA memory policy Paolo Bonzini
2014-03-04 14:00 ` [Qemu-devel] [PATCH 2.1 28/28] qmp: add query-memdev Paolo Bonzini
2014-03-04 17:37   ` Eric Blake
2014-03-04 18:11     ` Paolo Bonzini
2014-03-05  3:50       ` Hu Tao
2014-03-05  8:17         ` Paolo Bonzini
2014-03-05  3:48   ` Hu Tao
2014-03-05 11:05 ` [Qemu-devel] [PATCH 2.1 00/28] Current state of NUMA series, and hostmem improvements Andreas Färber
2014-03-05 11:30   ` Paolo Bonzini
2014-03-07 11:59     ` Andreas Färber
2014-03-07 12:20       ` Paolo Bonzini
2014-03-07 12:56         ` Igor Mammedov
2014-03-07 13:35           ` Paolo Bonzini
2014-03-07 14:54             ` Igor Mammedov
