All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option
@ 2017-05-10 11:29 Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 01/18] numa: move source of default CPUs to NUMA node mapping into boards Igor Mammedov
                   ` (18 more replies)
  0 siblings, 19 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Changes since v2:
   * rebased on top of numa-next tree
   * add comment to spapr_cpu_index_to_props() explaining why
     cpu_index can't used as index in possible_cpus[] (Eduardo)
   * remove obsolete comment in parse_numa_opts() (Eduardo)
   * add to commit message of "pc: add node-id property to CPU"
     why changing CPUState::numa_node default is safe (Eduardo)
   * amend error msg to print:
      "NUMA is not supported by this machine-type" (Eduardo)
   * add comment documenting machine_set_cpu_numa_node() semantics (Eduardo)
   * assert if props->has_node_id == false (Eduardo)
   * drop numa_[has_]node_id() wrappers (ehabkost, dwg)
   * use qapi_NumaCpuOptions_base() instead of copying fields manually (Eduardo)
Changes since v1:
   * arm (Drew)
       - s/virt_idx2mp_affinity/virt_cpu_mp_affinity/
       - add arm_cpu_mp_affinity() to reduce code duplication
       - use cpu list returned by possible_cpu_arch_ids() instead of
         directly accessing MachineState::possible_cpus field
   * various checkpatch/cleanups/spelling fixes (Eric/Drew)
   * simplify spapr_cpu_index_to_props() (David)
   * add/use numa_[has_]node_id() wrappers (Drew)
   * use new NumaCpuOptions instead of CpuInstanceProperties
     in NumaOptions (Eduardo)
   * better documment NumaCpuOptions.node-id in qapi-schema.json (Eduardo)
Changes since RFC:
    * convert all targets that support numa (Eduardo)
    * add numa CLI tests
    * support wildcard matching with "-numa cpu,..." (Paolo)

Series introduces a new CLI option to allow mapping cpus to numa
nodes using public properties [socket|core|thread]-ids instead of
internal cpu_index and moving internal handling of cpu<->node
mapping from cpu_index based global bitmaps to MachineState.

New '-numa cpu' option is supported only on PC and SPAPR
machines that implement hotpluggable-cpus query.
ARM machine user-facing interface stays cpu_index based due to
lack of hotpluggable-cpus support, but internally cpu<->node
mapping will be using the common for PC/SPAPR/ARM approach
(i.e. store mapping info in MachineState:possible_cpus)

It only provides CLI interface to do mapping, there is no QMP
one as I haven't found a suitable place/way to update/set mapping
after machine_done for QEMU started with -S (stopped mode) so that
mgmt could query hopluggable-cpus first, then map them to numa nodes
in runtime before actually allowing guest to run.

Another alternative I've been considering is to add CLI option
similar to -S but that would pause initialization before machine_init()
callback is run so that user can get CPU layout with hopluggable-cpus,
then map CPUs to numa nodes and unpause to let machine_init() initialize
machine using previously predefined numa mapping.
Such option might also be useful for other usecases.


git repo for testing:
   https://github.com/imammedo/qemu.git cphp_numa_cfg_v3
reference to v2:
   http://lists.gnu.org/archive/html/qemu-devel/2017-05/msg00526.html

CC: Eduardo Habkost <ehabkost@redhat.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Andrew Jones <drjones@redhat.com>
CC: David Gibson <david@gibson.dropbear.id.au>
CC: Eric Blake <eblake@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Shannon Zhao <zhaoshenglong@huawei.com>
CC: qemu-arm@nongnu.org
CC: qemu-ppc@nongnu.org

Igor Mammedov (18):
  numa: move source of default CPUs to NUMA node mapping into boards
  spapr: add node-id property to sPAPR core
  pc: add node-id property to CPU
  virt-arm: add node-id property to CPU
  numa: add check that board supports cpu_index to node mapping
  numa: mirror cpu to node mapping in MachineState::possible_cpus
  numa: do default mapping based on possible_cpus instead of node_cpu
    bitmaps
  pc: get numa node mapping from possible_cpus instead of
    numa_get_node_for_cpu()
  spapr: get numa node mapping from possible_cpus instead of
    numa_get_node_for_cpu()
  virt-arm: get numa node mapping from possible_cpus instead of
    numa_get_node_for_cpu()
  QMP: include CpuInstanceProperties into query_cpus output output
  tests: numa: add case for QMP command query-cpus
  numa: remove no longer need numa_post_machine_init()
  machine: call machine init from wrapper
  numa: use possible_cpus for not mapped CPUs check
  numa: remove node_cpu bitmaps as they are no longer used
  numa: add '-numa cpu,...' option for property based node mapping
  tests: check -numa node,cpu=props_list usecase

 include/hw/boards.h             |  12 ++-
 include/hw/ppc/spapr_cpu_core.h |   1 +
 include/qom/cpu.h               |   2 +
 include/sysemu/numa.h           |   9 +-
 cpus.c                          |  10 ++
 hw/acpi/cpu.c                   |   7 +-
 hw/arm/virt-acpi-build.c        |  19 ++--
 hw/arm/virt.c                   |  44 +++++++--
 hw/core/machine.c               | 160 ++++++++++++++++++++++++++++++++
 hw/i386/acpi-build.c            |  11 +--
 hw/i386/pc.c                    |  54 ++++++++---
 hw/ppc/spapr.c                  |  45 +++++++--
 hw/ppc/spapr_cpu_core.c         |  21 ++---
 numa.c                          | 150 ++++++++++--------------------
 qapi-schema.json                |  27 +++++-
 qemu-options.hx                 |  20 ++++
 target/arm/cpu.c                |   1 +
 target/i386/cpu.c               |   1 +
 tests/numa-test.c               | 196 ++++++++++++++++++++++++++++++++++++++++
 vl.c                            |   6 +-
 20 files changed, 618 insertions(+), 178 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 01/18] numa: move source of default CPUs to NUMA node mapping into boards
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-11  4:36   ` David Gibson
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 02/18] spapr: add node-id property to sPAPR core Igor Mammedov
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Originally CPU threads were by default assigned in
round-robin fashion. However it was causing issues in
guest since CPU threads from the same socket/core could
be placed on different NUMA nodes.
Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
fixed it by grouping threads within a socket on the same node
introducing cpu_index_to_socket_id() callback and commit
20bb648d (spapr: Fix default NUMA node allocation for threads)
reused callback to fix similar issues for SPAPR machine
even though socket doesn't make much sense there.

As result QEMU ended up having 3 default distribution rules
used by 3 targets /virt-arm, spapr, pc/.

In effort of moving NUMA mapping for CPUs into possible_cpus,
generalize default mapping in numa.c by making boards decide
on default mapping and let them explicitly tell generic
numa code to which node a CPU thread belongs to by replacing
cpu_index_to_socket_id() with @cpu_index_to_instance_props()
which provides default node_id assigned by board to specified
cpu_index.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
---
Patch only moves source of default mapping to possible_cpus[]
and leaves the rest of NUMA handling to numa_info[node_id].node_cpu
bitmaps. It's up to follow up patches to replace bitmaps
with possible_cpus[] internally.

v3:
  - drop duplicate ';' (Drew)
  - add comment to spapr_cpu_index_to_props() explaining why
    cpu_index can't used as index in possible_cpus[] (Eduardo)
  - remove obsolete comment in parse_numa_opts() (Eduardo)

v2:
  - use cpu_index instead of core_id directly in spapr_cpu_index_to_props()
       (David Gibson)
  - ammend comment message s/board/numa init code/
       (David Gibson)
---
 include/hw/boards.h   |  8 ++++++--
 include/sysemu/numa.h |  2 +-
 hw/arm/virt.c         | 20 ++++++++++++++++++--
 hw/i386/pc.c          | 23 +++++++++++++++++------
 hw/ppc/spapr.c        | 28 +++++++++++++++++++++-------
 numa.c                | 24 +++++++++++-------------
 vl.c                  |  2 +-
 7 files changed, 75 insertions(+), 32 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 99458eb..3ffa255 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -74,7 +74,10 @@ typedef struct {
  *    of HotplugHandler object, which handles hotplug operation
  *    for a given @dev. It may return NULL if @dev doesn't require
  *    any actions to be performed by hotplug handler.
- * @cpu_index_to_socket_id:
+ * @cpu_index_to_instance_props:
+ *    used to provide @cpu_index to socket/core/thread number mapping, allowing
+ *    legacy code to perform maping from cpu_index to topology properties
+ *    Returns: tuple of socket/core/thread ids given cpu_index belongs to.
  *    used to provide @cpu_index to socket number mapping, allowing
  *    a machine to group CPU threads belonging to the same socket/package
  *    Returns: socket number given cpu_index belongs to.
@@ -141,7 +144,8 @@ struct MachineClass {
 
     HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
                                            DeviceState *dev);
-    unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
+    CpuInstanceProperties (*cpu_index_to_instance_props)(MachineState *machine,
+                                                         unsigned cpu_index);
     const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
 };
 
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 70e5621..027830c 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -26,7 +26,7 @@ struct node_info {
 };
 
 extern NodeInfo numa_info[MAX_NODES];
-void parse_numa_opts(MachineClass *mc);
+void parse_numa_opts(MachineState *ms);
 void numa_post_machine_init(void);
 void query_numa_node_mem(uint64_t node_mem[]);
 extern QemuOptsList qemu_numa_opts;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index acc748e..dfd6fd4 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1539,6 +1539,16 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
     }
 }
 
+static CpuInstanceProperties
+virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
+
+    assert(cpu_index < possible_cpus->len);
+    return possible_cpus->cpus[cpu_index].props;
+}
+
 static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
 {
     int n;
@@ -1558,8 +1568,13 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[n].props.has_thread_id = true;
         ms->possible_cpus->cpus[n].props.thread_id = n;
 
-        /* TODO: add 'has_node/node' here to describe
-           to which node core belongs */
+        /* default distribution of CPUs over NUMA nodes */
+        if (nb_numa_nodes) {
+            /* preset values but do not enable them i.e. 'has_node_id = false',
+             * numa init code will enable them later if manual mapping wasn't
+             * present on CLI */
+            ms->possible_cpus->cpus[n].props.node_id = n % nb_numa_nodes;
+        }
     }
     return ms->possible_cpus;
 }
@@ -1581,6 +1596,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     /* We know we will never create a pre-ARMv7 CPU which needs 1K pages */
     mc->minimum_page_bits = 12;
     mc->possible_cpu_arch_ids = virt_possible_cpu_arch_ids;
+    mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
 }
 
 static const TypeInfo virt_machine_info = {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f3b372a..01693d5 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2243,12 +2243,14 @@ static void pc_machine_reset(void)
     }
 }
 
-static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
+static CpuInstanceProperties
+pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 {
-    X86CPUTopoInfo topo;
-    x86_topo_ids_from_idx(smp_cores, smp_threads, cpu_index,
-                          &topo);
-    return topo.pkg_id;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
+
+    assert(cpu_index < possible_cpus->len);
+    return possible_cpus->cpus[cpu_index].props;
 }
 
 static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
@@ -2280,6 +2282,15 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
         ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
+
+        /* default distribution of CPUs over NUMA nodes */
+        if (nb_numa_nodes) {
+            /* preset values but do not enable them i.e. 'has_node_id = false',
+             * numa init code will enable them later if manual mapping wasn't
+             * present on CLI */
+            ms->possible_cpus->cpus[i].props.node_id =
+                topo.pkg_id % nb_numa_nodes;
+        }
     }
     return ms->possible_cpus;
 }
@@ -2322,7 +2333,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     pcmc->acpi_data_size = 0x20000 + 0x8000;
     pcmc->save_tsc_khz = true;
     mc->get_hotplug_handler = pc_get_hotpug_handler;
-    mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
+    mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
     mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
     mc->has_hotpluggable_cpus = true;
     mc->default_boot_order = "cad";
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bdc31ce..2077e4b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2981,11 +2981,18 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
     return NULL;
 }
 
-static unsigned spapr_cpu_index_to_socket_id(unsigned cpu_index)
+static CpuInstanceProperties
+spapr_cpu_index_to_props(MachineState *machine, unsigned cpu_index)
 {
-    /* Allocate to NUMA nodes on a "socket" basis (not that concept of
-     * socket means much for the paravirtualized PAPR platform) */
-    return cpu_index / smp_threads / smp_cores;
+    CPUArchId *core_slot;
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
+
+    /* make sure possible_cpu are intialized */
+    mc->possible_cpu_arch_ids(machine);
+    /* get CPU core slot containing thread that matches cpu_index */
+    core_slot = spapr_find_cpu_slot(machine, cpu_index, NULL);
+    assert(core_slot);
+    return core_slot->props;
 }
 
 static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
@@ -3012,8 +3019,15 @@ static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
         machine->possible_cpus->cpus[i].arch_id = core_id;
         machine->possible_cpus->cpus[i].props.has_core_id = true;
         machine->possible_cpus->cpus[i].props.core_id = core_id;
-        /* TODO: add 'has_node/node' here to describe
-           to which node core belongs */
+
+        /* default distribution of CPUs over NUMA nodes */
+        if (nb_numa_nodes) {
+            /* preset values but do not enable them i.e. 'has_node_id = false',
+             * numa init code will enable them later if manual mapping wasn't
+             * present on CLI */
+            machine->possible_cpus->cpus[i].props.node_id =
+                core_id / smp_threads / smp_cores % nb_numa_nodes;
+        }
     }
     return machine->possible_cpus;
 }
@@ -3138,7 +3152,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     hc->pre_plug = spapr_machine_device_pre_plug;
     hc->plug = spapr_machine_device_plug;
     hc->unplug = spapr_machine_device_unplug;
-    mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
+    mc->cpu_index_to_instance_props = spapr_cpu_index_to_props;
     mc->possible_cpu_arch_ids = spapr_possible_cpu_arch_ids;
     hc->unplug_request = spapr_machine_device_unplug_request;
 
diff --git a/numa.c b/numa.c
index d753687..bcdfca2 100644
--- a/numa.c
+++ b/numa.c
@@ -443,9 +443,10 @@ void numa_default_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,
     nodes[i].node_mem = size - usedmem;
 }
 
-void parse_numa_opts(MachineClass *mc)
+void parse_numa_opts(MachineState *ms)
 {
     int i;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
 
     for (i = 0; i < MAX_NODES; i++) {
         numa_info[i].node_cpu = bitmap_new(max_cpus);
@@ -511,21 +512,18 @@ void parse_numa_opts(MachineClass *mc)
                 break;
             }
         }
-        /* Historically VCPUs were assigned in round-robin order to NUMA
-         * nodes. However it causes issues with guest not handling it nice
-         * in case where cores/threads from a multicore CPU appear on
-         * different nodes. So allow boards to override default distribution
-         * rule grouping VCPUs by socket so that VCPUs from the same socket
-         * would be on the same node.
-         */
+
+        /* assign CPUs to nodes using board provided default mapping */
+        if (!mc->cpu_index_to_instance_props) {
+            error_report("default CPUs to NUMA node mapping isn't supported");
+            exit(1);
+        }
         if (i == nb_numa_nodes) {
             for (i = 0; i < max_cpus; i++) {
-                unsigned node_id = i % nb_numa_nodes;
-                if (mc->cpu_index_to_socket_id) {
-                    node_id = mc->cpu_index_to_socket_id(i) % nb_numa_nodes;
-                }
+                CpuInstanceProperties props;
+                props = mc->cpu_index_to_instance_props(ms, i);
 
-                set_bit(i, numa_info[node_id].node_cpu);
+                set_bit(i, numa_info[props.node_id].node_cpu);
             }
         }
 
diff --git a/vl.c b/vl.c
index f46e070..c63f4d5 100644
--- a/vl.c
+++ b/vl.c
@@ -4506,7 +4506,7 @@ int main(int argc, char **argv, char **envp)
     default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
     default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
 
-    parse_numa_opts(machine_class);
+    parse_numa_opts(current_machine);
 
     if (qemu_opts_foreach(qemu_find_opts("mon"),
                           mon_init_func, NULL, NULL)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 02/18] spapr: add node-id property to sPAPR core
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 01/18] numa: move source of default CPUs to NUMA node mapping into boards Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 03/18] pc: add node-id property to CPU Igor Mammedov
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

it will allow switching from cpu_index to core based numa
mapping in follow up patches.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 include/hw/ppc/spapr_cpu_core.h |  1 +
 include/qom/cpu.h               |  2 ++
 hw/ppc/spapr.c                  | 17 +++++++++++++++++
 hw/ppc/spapr_cpu_core.c         | 11 ++++++++---
 4 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index 3c35665..93051e9 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -27,6 +27,7 @@ typedef struct sPAPRCPUCore {
 
     /*< public >*/
     void *threads;
+    int node_id;
 } sPAPRCPUCore;
 
 typedef struct sPAPRCPUCoreClass {
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 5d10359..55214ce 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -258,6 +258,8 @@ typedef void (*run_on_cpu_func)(CPUState *cpu, run_on_cpu_data data);
 
 struct qemu_work_item;
 
+#define CPU_UNSET_NUMA_NODE_ID -1
+
 /**
  * CPUState:
  * @cpu_index: CPU index (informative).
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 2077e4b..a952a39 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2824,9 +2824,11 @@ static void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);
     Error *local_err = NULL;
     CPUCore *cc = CPU_CORE(dev);
+    sPAPRCPUCore *sc = SPAPR_CPU_CORE(dev);
     char *base_core_type = spapr_get_cpu_core_type(machine->cpu_model);
     const char *type = object_get_typename(OBJECT(dev));
     CPUArchId *core_slot;
+    int node_id;
     int index;
 
     if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
@@ -2861,6 +2863,21 @@ static void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         goto out;
     }
 
+    node_id = numa_get_node_for_cpu(cc->core_id);
+    if (node_id == nb_numa_nodes) {
+        /* by default CPUState::numa_node was 0 if it's not set via CLI
+         * keep it this way for now but in future we probably should
+         * refuse to start up with incomplete numa mapping */
+        node_id = 0;
+    }
+    if (sc->node_id == CPU_UNSET_NUMA_NODE_ID) {
+        sc->node_id = node_id;
+    } else if (sc->node_id != node_id) {
+        error_setg(&local_err, "node-id %d must match numa node specified"
+            "with -numa option for cpu-index %d", sc->node_id, cc->core_id);
+        goto out;
+    }
+
 out:
     g_free(base_core_type);
     error_propagate(errp, local_err);
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 4389ef4..9de7a56 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -176,7 +176,6 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
     const char *typename = object_class_get_name(scc->cpu_class);
     size_t size = object_type_get_instance_size(typename);
     Error *local_err = NULL;
-    int core_node_id = numa_get_node_for_cpu(cc->core_id);;
     void *obj;
     int i, j;
 
@@ -194,10 +193,10 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
 
         /* Set NUMA node for the added CPUs  */
         node_id = numa_get_node_for_cpu(cs->cpu_index);
-        if (node_id != core_node_id) {
+        if (node_id != sc->node_id) {
             error_setg(&local_err, "Invalid node-id=%d of thread[cpu-index: %d]"
                 " on CPU[core-id: %d, node-id: %d], node-id must be the same",
-                 node_id, cs->cpu_index, cc->core_id, core_node_id);
+                 node_id, cs->cpu_index, cc->core_id, sc->node_id);
             goto err;
         }
         if (node_id < nb_numa_nodes) {
@@ -263,6 +262,11 @@ static const char *spapr_core_models[] = {
     "POWER9_v1.0",
 };
 
+static Property spapr_cpu_core_properties[] = {
+    DEFINE_PROP_INT32("node-id", sPAPRCPUCore, node_id, CPU_UNSET_NUMA_NODE_ID),
+    DEFINE_PROP_END_OF_LIST()
+};
+
 void spapr_cpu_core_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
@@ -270,6 +274,7 @@ void spapr_cpu_core_class_init(ObjectClass *oc, void *data)
 
     dc->realize = spapr_cpu_core_realize;
     dc->unrealize = spapr_cpu_core_unrealizefn;
+    dc->props = spapr_cpu_core_properties;
     scc->cpu_class = cpu_class_by_name(TYPE_POWERPC_CPU, data);
     g_assert(scc->cpu_class);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 03/18] pc: add node-id property to CPU
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 01/18] numa: move source of default CPUs to NUMA node mapping into boards Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 02/18] spapr: add node-id property to sPAPR core Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 04/18] virt-arm: " Igor Mammedov
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

it will allow switching from cpu_index to property based
numa mapping in follow up patches.

PS:
patch changes default value of CPUState::numa_node from 0
to CPU_UNSET_NUMA_NODE_ID. The only place for x86 that
would affected is monitor's 'infor numa' command which
uses that field. However legacy 0 value is still preserved
by pc_cpu_pre_plug() in this patch if user/numa.c hasn't
set it explicitly, so there is no change in behavior.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/i386/pc.c      | 17 +++++++++++++++++
 target/i386/cpu.c |  1 +
 2 files changed, 18 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 01693d5..455300f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1893,6 +1893,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
                             DeviceState *dev, Error **errp)
 {
     int idx;
+    int node_id;
     CPUState *cs;
     CPUArchId *cpu_slot;
     X86CPUTopoInfo topo;
@@ -1982,6 +1983,22 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
     cs = CPU(cpu);
     cs->cpu_index = idx;
+
+    node_id = numa_get_node_for_cpu(cs->cpu_index);
+    if (node_id == nb_numa_nodes) {
+        /* by default CPUState::numa_node was 0 if it's not set via CLI
+         * keep it this way for now but in future we probably should
+         * refuse to start up with incomplete numa mapping */
+        node_id = 0;
+    }
+    if (cs->numa_node == CPU_UNSET_NUMA_NODE_ID) {
+        cs->numa_node = node_id;
+    } else if (cs->numa_node != node_id) {
+            error_setg(errp, "node-id %d must match numa node specified"
+                "with -numa option for cpu-index %d",
+                cs->numa_node, cs->cpu_index);
+            return;
+    }
 }
 
 static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 13c0985..007a5bd 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3982,6 +3982,7 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
 #endif
+    DEFINE_PROP_INT32("node-id", CPUState, numa_node, CPU_UNSET_NUMA_NODE_ID),
     DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
     { .name  = "hv-spinlocks", .info  = &qdev_prop_spinlocks },
     DEFINE_PROP_BOOL("hv-relaxed", X86CPU, hyperv_relaxed_timing, false),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 04/18] virt-arm: add node-id property to CPU
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (2 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 03/18] pc: add node-id property to CPU Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 05/18] numa: add check that board supports cpu_index to node mapping Igor Mammedov
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

it will allow switching from cpu_index to property based
numa mapping in follow up patches.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 hw/arm/virt.c    | 15 +++++++++++++++
 target/arm/cpu.c |  1 +
 2 files changed, 16 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index dfd6fd4..653b4d7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1350,6 +1350,7 @@ static void machvirt_init(MachineState *machine)
     for (n = 0; n < possible_cpus->len; n++) {
         Object *cpuobj;
         CPUState *cs;
+        int node_id;
 
         if (n >= smp_cpus) {
             break;
@@ -1362,6 +1363,20 @@ static void machvirt_init(MachineState *machine)
         cs = CPU(cpuobj);
         cs->cpu_index = n;
 
+        node_id = numa_get_node_for_cpu(cs->cpu_index);
+        if (node_id == nb_numa_nodes) {
+            /* by default CPUState::numa_node was 0 if it's not set via CLI
+             * keep it this way for now but in future we probably should
+             * refuse to start up with incomplete numa mapping */
+             node_id = 0;
+        }
+        if (cs->numa_node == CPU_UNSET_NUMA_NODE_ID) {
+            cs->numa_node = node_id;
+        } else {
+            /* CPU isn't device_add compatible yet, this shouldn't happen */
+            error_setg(&error_abort, "user set node-id not implemented");
+        }
+
         if (!vms->secure) {
             object_property_set_bool(cpuobj, false, "has_el3", NULL);
         }
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index ee1406d..c185eb1 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1573,6 +1573,7 @@ static Property arm_cpu_properties[] = {
     DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
                         mp_affinity, ARM64_AFFINITY_INVALID),
+    DEFINE_PROP_INT32("node-id", CPUState, numa_node, CPU_UNSET_NUMA_NODE_ID),
     DEFINE_PROP_END_OF_LIST()
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 05/18] numa: add check that board supports cpu_index to node mapping
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (3 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 04/18] virt-arm: " Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus Igor Mammedov
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Default node mapping initialization already checks that board
supports cpu_index to node mapping and refuses to start if
it's not supported. Do the same for explicitly provided
mapping "-numa node,cpus=..."

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
v3:
  - amend erre msg to print:
     "NUMA is not supported by this machine-type"
   (Eduardo)
---
 numa.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/numa.c b/numa.c
index bcdfca2..7182481 100644
--- a/numa.c
+++ b/numa.c
@@ -141,10 +141,12 @@ uint32_t numa_get_node(ram_addr_t addr, Error **errp)
     return -1;
 }
 
-static void parse_numa_node(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
+static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
+                            QemuOpts *opts, Error **errp)
 {
     uint16_t nodenr;
     uint16List *cpus = NULL;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
 
     if (node->has_nodeid) {
         nodenr = node->nodeid;
@@ -163,6 +165,10 @@ static void parse_numa_node(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
         return;
     }
 
+    if (!mc->cpu_index_to_instance_props) {
+        error_report("NUMA is not supported by this machine-type");
+        exit(1);
+    }
     for (cpus = node->cpus; cpus; cpus = cpus->next) {
         if (cpus->value >= max_cpus) {
             error_setg(errp,
@@ -253,6 +259,7 @@ static void parse_numa_distance(NumaDistOptions *dist, Error **errp)
 static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
 {
     NumaOptions *object = NULL;
+    MachineState *ms = opaque;
     Error *err = NULL;
 
     {
@@ -267,7 +274,7 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
 
     switch (object->type) {
     case NUMA_OPTIONS_TYPE_NODE:
-        parse_numa_node(&object->u.node, opts, &err);
+        parse_numa_node(ms, &object->u.node, opts, &err);
         if (err) {
             goto end;
         }
@@ -452,7 +459,7 @@ void parse_numa_opts(MachineState *ms)
         numa_info[i].node_cpu = bitmap_new(max_cpus);
     }
 
-    if (qemu_opts_foreach(qemu_find_opts("numa"), parse_numa, NULL, NULL)) {
+    if (qemu_opts_foreach(qemu_find_opts("numa"), parse_numa, ms, NULL)) {
         exit(1);
     }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (4 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 05/18] numa: add check that board supports cpu_index to node mapping Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-26 15:46   ` Eduardo Habkost
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 07/18] numa: do default mapping based on possible_cpus instead of node_cpu bitmaps Igor Mammedov
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Introduce machine_set_cpu_numa_node() helper that stores
node mapping for CPU in MachineState::possible_cpus.
CPU and node it belongs to is specified by 'props' argument.

Patch doesn't remove old way of storing mapping in
numa_info[X].node_cpu as removing it at the same time
makes patch rather big. Instead it just mirrors mapping
in possible_cpus and follow up per target patches will
switch to possible_cpus and numa_info[X].node_cpu will
be removed once there isn't any users left.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
v3:
  - add comment documenting machine_set_cpu_numa_node() semantics
    (Eduardo)
  - assert if props->has_node_id == false (Eduardo)
---
 include/hw/boards.h |  3 ++
 hw/core/machine.c   | 96 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 numa.c              |  8 +++++
 3 files changed, 107 insertions(+)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 3ffa255..4e14ff0 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -42,6 +42,9 @@ bool machine_dump_guest_core(MachineState *machine);
 bool machine_mem_merge(MachineState *machine);
 void machine_register_compat_props(MachineState *machine);
 HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine);
+void machine_set_cpu_numa_node(MachineState *machine,
+                               const CpuInstanceProperties *props,
+                               Error **errp);
 
 /**
  * CPUArchId:
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 2482c63..420c8c4 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -389,6 +389,102 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)
     return head;
 }
 
+/**
+ * machine_set_cpu_numa_node:
+ * @machine: machine object to modify
+ * @props: specifies which cpu objects to assign to
+ *         numa node specified by @props.node_id
+ * @errp: if an error occurs, a pointer to an area to store the error
+ *
+ * Associate NUMA node specified by @props.node_id with cpu slots that
+ * match socket/core/thread-ids specified by @props. It's recommended to use
+ * query-hotpluggable-cpus.props values to specify affected cpu slots,
+ * which would lead to exact 1:1 mapping of cpu slots to NUMA node.
+ *
+ * However for CLI convenience it's possible to pass in subset of properties,
+ * which would affect all cpu slots that match it.
+ * Ex for pc machine:
+ *    -smp 4,cores=2,sockets=2 -numa node,nodeid=0 -numa node,nodeid=1 \
+ *    -numa cpu,node-id=0,socket_id=0 \
+ *    -numa cpu,node-id=1,socket_id=1
+ * will assign all child cores of socket 0 to node 0 and
+ * of socket 1 to node 1.
+ *
+ * On attempt of reassigning (already assigned) cpu slot to another NUMA node,
+ * return error.
+ * Empty subset is disallowed and function will return with error in this case.
+ */
+void machine_set_cpu_numa_node(MachineState *machine,
+                               const CpuInstanceProperties *props, Error **errp)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
+    bool match = false;
+    int i;
+
+    if (!mc->possible_cpu_arch_ids) {
+        error_setg(errp, "mapping of CPUs to NUMA node is not supported");
+        return;
+    }
+
+    /* disabling node mapping is not supported, forbid it */
+    assert(props->has_node_id);
+
+    /* force board to initialize possible_cpus if it hasn't been done yet */
+    mc->possible_cpu_arch_ids(machine);
+
+    for (i = 0; i < machine->possible_cpus->len; i++) {
+        CPUArchId *slot = &machine->possible_cpus->cpus[i];
+
+        /* reject unsupported by board properties */
+        if (props->has_thread_id && !slot->props.has_thread_id) {
+            error_setg(errp, "thread-id is not supported");
+            return;
+        }
+
+        if (props->has_core_id && !slot->props.has_core_id) {
+            error_setg(errp, "core-id is not supported");
+            return;
+        }
+
+        if (props->has_socket_id && !slot->props.has_socket_id) {
+            error_setg(errp, "socket-id is not supported");
+            return;
+        }
+
+        /* skip slots with explicit mismatch */
+        if (props->has_thread_id && props->thread_id != slot->props.thread_id) {
+                continue;
+        }
+
+        if (props->has_core_id && props->core_id != slot->props.core_id) {
+                continue;
+        }
+
+        if (props->has_socket_id && props->socket_id != slot->props.socket_id) {
+                continue;
+        }
+
+        /* reject assignment if slot is already assigned, for compatibility
+         * of legacy cpu_index mapping with SPAPR core based mapping do not
+         * error out if cpu thread and matched core have the same node-id */
+        if (slot->props.has_node_id &&
+            slot->props.node_id != props->node_id) {
+            error_setg(errp, "CPU is already assigned to node-id: %" PRId64,
+                       slot->props.node_id);
+            return;
+        }
+
+        /* assign slot to node as it's matched '-numa cpu' key */
+        match = true;
+        slot->props.node_id = props->node_id;
+        slot->props.has_node_id = props->has_node_id;
+    }
+
+    if (!match) {
+        error_setg(errp, "no match found");
+    }
+}
+
 static void machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
diff --git a/numa.c b/numa.c
index 7182481..7db5dde 100644
--- a/numa.c
+++ b/numa.c
@@ -170,6 +170,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
         exit(1);
     }
     for (cpus = node->cpus; cpus; cpus = cpus->next) {
+        CpuInstanceProperties props;
         if (cpus->value >= max_cpus) {
             error_setg(errp,
                        "CPU index (%" PRIu16 ")"
@@ -178,6 +179,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
             return;
         }
         bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
+        props = mc->cpu_index_to_instance_props(ms, cpus->value);
+        props.node_id = nodenr;
+        props.has_node_id = true;
+        machine_set_cpu_numa_node(ms, &props, &error_fatal);
     }
 
     if (node->has_mem && node->has_memdev) {
@@ -528,9 +533,12 @@ void parse_numa_opts(MachineState *ms)
         if (i == nb_numa_nodes) {
             for (i = 0; i < max_cpus; i++) {
                 CpuInstanceProperties props;
+                /* fetch default mapping from board and enable it */
                 props = mc->cpu_index_to_instance_props(ms, i);
+                props.has_node_id = true;
 
                 set_bit(i, numa_info[props.node_id].node_cpu);
+                machine_set_cpu_numa_node(ms, &props, &error_fatal);
             }
         }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 07/18] numa: do default mapping based on possible_cpus instead of node_cpu bitmaps
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (5 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 08/18] pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() Igor Mammedov
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 numa.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/numa.c b/numa.c
index 7db5dde..c89fc2d 100644
--- a/numa.c
+++ b/numa.c
@@ -458,6 +458,7 @@ void numa_default_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,
 void parse_numa_opts(MachineState *ms)
 {
     int i;
+    const CPUArchIdList *possible_cpus;
     MachineClass *mc = MACHINE_GET_CLASS(ms);
 
     for (i = 0; i < MAX_NODES; i++) {
@@ -519,18 +520,21 @@ void parse_numa_opts(MachineState *ms)
 
         numa_set_mem_ranges();
 
-        for (i = 0; i < nb_numa_nodes; i++) {
-            if (!bitmap_empty(numa_info[i].node_cpu, max_cpus)) {
-                break;
-            }
-        }
-
         /* assign CPUs to nodes using board provided default mapping */
-        if (!mc->cpu_index_to_instance_props) {
+        if (!mc->cpu_index_to_instance_props || !mc->possible_cpu_arch_ids) {
             error_report("default CPUs to NUMA node mapping isn't supported");
             exit(1);
         }
-        if (i == nb_numa_nodes) {
+
+        possible_cpus = mc->possible_cpu_arch_ids(ms);
+        for (i = 0; i < possible_cpus->len; i++) {
+            if (possible_cpus->cpus[i].props.has_node_id) {
+                break;
+            }
+        }
+
+        /* no CPUs are assigned to NUMA nodes */
+        if (i == possible_cpus->len) {
             for (i = 0; i < max_cpus; i++) {
                 CpuInstanceProperties props;
                 /* fetch default mapping from board and enable it */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 08/18] pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (6 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 07/18] numa: do default mapping based on possible_cpus instead of node_cpu bitmaps Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 09/18] spapr: " Igor Mammedov
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
v3:
   drop numa_[has_]node_id() wrappers (ehabkost, dwg)
v2:
   use numa_[has_]node_id() wrappers (Drew)
---
 hw/acpi/cpu.c        |  7 +++----
 hw/i386/acpi-build.c | 11 ++++-------
 hw/i386/pc.c         | 18 ++++++++++--------
 3 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 8c719d3..a233fe1 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -503,7 +503,6 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
 
         /* build Processor object for each processor */
         for (i = 0; i < arch_ids->len; i++) {
-            int j;
             Aml *dev;
             Aml *uid = aml_int(i);
             GArray *madt_buf = g_array_new(0, 1, 1);
@@ -557,9 +556,9 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
              * as a result _PXM is required for all CPUs which might
              * be hot-plugged. For simplicity, add it for all CPUs.
              */
-            j = numa_get_node_for_cpu(i);
-            if (j < nb_numa_nodes) {
-                aml_append(dev, aml_name_decl("_PXM", aml_int(j)));
+            if (arch_ids->cpus[i].props.has_node_id) {
+                aml_append(dev, aml_name_decl("_PXM",
+                           aml_int(arch_ids->cpus[i].props.node_id)));
             }
 
             aml_append(cpus_dev, dev);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2458ebc..e3f8254 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2306,7 +2306,8 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
     srat->reserved1 = cpu_to_le32(1);
 
     for (i = 0; i < apic_ids->len; i++) {
-        int j = numa_get_node_for_cpu(i);
+        int node_id = apic_ids->cpus[i].props.has_node_id ?
+            apic_ids->cpus[i].props.node_id : 0;
         uint32_t apic_id = apic_ids->cpus[i].arch_id;
 
         if (apic_id < 255) {
@@ -2316,9 +2317,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
             core->type = ACPI_SRAT_PROCESSOR_APIC;
             core->length = sizeof(*core);
             core->local_apic_id = apic_id;
-            if (j < nb_numa_nodes) {
-                core->proximity_lo = j;
-            }
+            core->proximity_lo = node_id;
             memset(core->proximity_hi, 0, 3);
             core->local_sapic_eid = 0;
             core->flags = cpu_to_le32(1);
@@ -2329,9 +2328,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
             core->type = ACPI_SRAT_PROCESSOR_x2APIC;
             core->length = sizeof(*core);
             core->x2apic_id = cpu_to_le32(apic_id);
-            if (j < nb_numa_nodes) {
-                core->proximity_domain = cpu_to_le32(j);
-            }
+            core->proximity_domain = cpu_to_le32(node_id);
             core->flags = cpu_to_le32(1);
         }
     }
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 455300f..e36a375 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -747,7 +747,9 @@ static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms)
 {
     FWCfgState *fw_cfg;
     uint64_t *numa_fw_cfg;
-    int i, j;
+    int i;
+    const CPUArchIdList *cpus;
+    MachineClass *mc = MACHINE_GET_CLASS(pcms);
 
     fw_cfg = fw_cfg_init_io_dma(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4, as);
     fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
@@ -782,12 +784,12 @@ static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms)
      */
     numa_fw_cfg = g_new0(uint64_t, 1 + pcms->apic_id_limit + nb_numa_nodes);
     numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
-    for (i = 0; i < max_cpus; i++) {
-        unsigned int apic_id = x86_cpu_apic_id_from_index(i);
+    cpus = mc->possible_cpu_arch_ids(MACHINE(pcms));
+    for (i = 0; i < cpus->len; i++) {
+        unsigned int apic_id = cpus->cpus[i].arch_id;
         assert(apic_id < pcms->apic_id_limit);
-        j = numa_get_node_for_cpu(i);
-        if (j < nb_numa_nodes) {
-            numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
+        if (cpus->cpus[i].props.has_node_id) {
+            numa_fw_cfg[apic_id + 1] = cpu_to_le64(cpus->cpus[i].props.node_id);
         }
     }
     for (i = 0; i < nb_numa_nodes; i++) {
@@ -1984,8 +1986,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     cs = CPU(cpu);
     cs->cpu_index = idx;
 
-    node_id = numa_get_node_for_cpu(cs->cpu_index);
-    if (node_id == nb_numa_nodes) {
+    node_id = cpu_slot->props.node_id;
+    if (!cpu_slot->props.has_node_id) {
         /* by default CPUState::numa_node was 0 if it's not set via CLI
          * keep it this way for now but in future we probably should
          * refuse to start up with incomplete numa mapping */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 09/18] spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (7 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 08/18] pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 10/18] virt-arm: " Igor Mammedov
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

it's safe to remove thread node_id != core node_id error
branch as machine_set_cpu_numa_node() also does mismatch
check and is called even before any CPU is created.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c          |  4 ++--
 hw/ppc/spapr_cpu_core.c | 14 ++------------
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a952a39..504161f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2863,8 +2863,8 @@ static void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         goto out;
     }
 
-    node_id = numa_get_node_for_cpu(cc->core_id);
-    if (node_id == nb_numa_nodes) {
+    node_id = core_slot->props.node_id;
+    if (!core_slot->props.has_node_id) {
         /* by default CPUState::numa_node was 0 if it's not set via CLI
          * keep it this way for now but in future we probably should
          * refuse to start up with incomplete numa mapping */
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 9de7a56..a17ea07 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -181,7 +181,6 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
 
     sc->threads = g_malloc0(size * cc->nr_threads);
     for (i = 0; i < cc->nr_threads; i++) {
-        int node_id;
         char id[32];
         CPUState *cs;
 
@@ -191,17 +190,8 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
         cs = CPU(obj);
         cs->cpu_index = cc->core_id + i;
 
-        /* Set NUMA node for the added CPUs  */
-        node_id = numa_get_node_for_cpu(cs->cpu_index);
-        if (node_id != sc->node_id) {
-            error_setg(&local_err, "Invalid node-id=%d of thread[cpu-index: %d]"
-                " on CPU[core-id: %d, node-id: %d], node-id must be the same",
-                 node_id, cs->cpu_index, cc->core_id, sc->node_id);
-            goto err;
-        }
-        if (node_id < nb_numa_nodes) {
-            cs->numa_node = node_id;
-        }
+        /* Set NUMA node for the threads belonged to core  */
+        cs->numa_node = sc->node_id;
 
         snprintf(id, sizeof(id), "thread[%d]", i);
         object_property_add_child(OBJECT(sc), id, obj, &local_err);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 10/18] virt-arm: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (8 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 09/18] spapr: " Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 11/18] QMP: include CpuInstanceProperties into query_cpus output output Igor Mammedov
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
v3:
   drop numa_[has_]node_id() wrappers (ehabkost, dwg)
v2:
   use numa_[has_]node_id() wrappers (Drew)
---
 hw/arm/virt-acpi-build.c | 19 +++++++------------
 hw/arm/virt.c            | 13 +++++++------
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0835e59..ce7499c 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -486,30 +486,25 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     AcpiSystemResourceAffinityTable *srat;
     AcpiSratProcessorGiccAffinity *core;
     AcpiSratMemoryAffinity *numamem;
-    int i, j, srat_start;
+    int i, srat_start;
     uint64_t mem_base;
-    uint32_t *cpu_node = g_malloc0(vms->smp_cpus * sizeof(uint32_t));
-
-    for (i = 0; i < vms->smp_cpus; i++) {
-        j = numa_get_node_for_cpu(i);
-        if (j < nb_numa_nodes) {
-                cpu_node[i] = j;
-        }
-    }
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
+    const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
 
     srat_start = table_data->len;
     srat = acpi_data_push(table_data, sizeof(*srat));
     srat->reserved1 = cpu_to_le32(1);
 
-    for (i = 0; i < vms->smp_cpus; ++i) {
+    for (i = 0; i < cpu_list->len; ++i) {
+        int node_id = cpu_list->cpus[i].props.has_node_id ?
+            cpu_list->cpus[i].props.node_id : 0;
         core = acpi_data_push(table_data, sizeof(*core));
         core->type = ACPI_SRAT_PROCESSOR_GICC;
         core->length = sizeof(*core);
-        core->proximity = cpu_to_le32(cpu_node[i]);
+        core->proximity = cpu_to_le32(node_id);
         core->acpi_processor_uid = cpu_to_le32(i);
         core->flags = cpu_to_le32(1);
     }
-    g_free(cpu_node);
 
     mem_base = vms->memmap[VIRT_MEM].base;
     for (i = 0; i < nb_numa_nodes; ++i) {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 653b4d7..c7c8159 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -338,7 +338,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
 {
     int cpu;
     int addr_cells = 1;
-    unsigned int i;
+    const MachineState *ms = MACHINE(vms);
 
     /*
      * From Documentation/devicetree/bindings/arm/cpus.txt
@@ -369,6 +369,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
     for (cpu = vms->smp_cpus - 1; cpu >= 0; cpu--) {
         char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
+        CPUState *cs = CPU(armcpu);
 
         qemu_fdt_add_subnode(vms->fdt, nodename);
         qemu_fdt_setprop_string(vms->fdt, nodename, "device_type", "cpu");
@@ -389,9 +390,9 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
                                   armcpu->mp_affinity);
         }
 
-        i = numa_get_node_for_cpu(cpu);
-        if (i < nb_numa_nodes) {
-            qemu_fdt_setprop_cell(vms->fdt, nodename, "numa-node-id", i);
+        if (ms->possible_cpus->cpus[cs->cpu_index].props.has_node_id) {
+            qemu_fdt_setprop_cell(vms->fdt, nodename, "numa-node-id",
+                ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
         }
 
         g_free(nodename);
@@ -1363,8 +1364,8 @@ static void machvirt_init(MachineState *machine)
         cs = CPU(cpuobj);
         cs->cpu_index = n;
 
-        node_id = numa_get_node_for_cpu(cs->cpu_index);
-        if (node_id == nb_numa_nodes) {
+        node_id = possible_cpus->cpus[cs->cpu_index].props.node_id;
+        if (!possible_cpus->cpus[cs->cpu_index].props.has_node_id) {
             /* by default CPUState::numa_node was 0 if it's not set via CLI
              * keep it this way for now but in future we probably should
              * refuse to start up with incomplete numa mapping */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 11/18] QMP: include CpuInstanceProperties into query_cpus output output
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (9 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 10/18] virt-arm: " Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-11  4:39   ` David Gibson
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 12/18] tests: numa: add case for QMP command query-cpus Igor Mammedov
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

if board supports CpuInstanceProperties, report them for
each CPU thread listed. Main motivation for this is to
provide these properties introspection via QMP interface
for using in test cases to verify numa node to cpu mapping,
which includes not only boards that support cpu hotplug
and have this info in query-hotpluggable-cpus (pc/spapr)
but also for boards that don't not support hotpluggable-cpus
but support numa mapping (virt-arm).

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
v2:
  * fix checkpatch error and remove extra space after = (Eric)
---
 cpus.c           | 10 ++++++++++
 qapi-schema.json |  6 +++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 740b8dc..4f91d25 100644
--- a/cpus.c
+++ b/cpus.c
@@ -50,6 +50,7 @@
 #include "qapi-event.h"
 #include "hw/nmi.h"
 #include "sysemu/replay.h"
+#include "hw/boards.h"
 
 #ifdef CONFIG_LINUX
 
@@ -1859,6 +1860,8 @@ void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
 
 CpuInfoList *qmp_query_cpus(Error **errp)
 {
+    MachineState *ms = MACHINE(qdev_get_machine());
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
     CpuInfoList *head = NULL, *cur_item = NULL;
     CPUState *cpu;
 
@@ -1909,6 +1912,13 @@ CpuInfoList *qmp_query_cpus(Error **errp)
 #else
         info->value->arch = CPU_INFO_ARCH_OTHER;
 #endif
+        info->value->has_props = !!mc->cpu_index_to_instance_props;
+        if (info->value->has_props) {
+            CpuInstanceProperties *props;
+            props = g_malloc0(sizeof(*props));
+            *props = mc->cpu_index_to_instance_props(ms, cpu->cpu_index);
+            info->value->props = props;
+        }
 
         /* XXX: waiting for the qapi to support GSList */
         if (!cur_item) {
diff --git a/qapi-schema.json b/qapi-schema.json
index bf48873..f1bcebe 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1325,6 +1325,9 @@
 #
 # @thread_id: ID of the underlying host thread
 #
+# @props: properties describing to which node/socket/core/thread
+#         virtual CPU belongs to, provided if supported by board (since 2.10)
+#
 # @arch: architecture of the cpu, which determines which additional fields
 #        will be listed (since 2.6)
 #
@@ -1335,7 +1338,8 @@
 ##
 { 'union': 'CpuInfo',
   'base': {'CPU': 'int', 'current': 'bool', 'halted': 'bool',
-           'qom_path': 'str', 'thread_id': 'int', 'arch': 'CpuInfoArch' },
+           'qom_path': 'str', 'thread_id': 'int',
+           '*props': 'CpuInstanceProperties', 'arch': 'CpuInfoArch' },
   'discriminator': 'arch',
   'data': { 'x86': 'CpuInfoX86',
             'sparc': 'CpuInfoSPARC',
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 12/18] tests: numa: add case for QMP command query-cpus
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (10 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 11/18] QMP: include CpuInstanceProperties into query_cpus output output Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 13/18] numa: remove no longer need numa_post_machine_init() Igor Mammedov
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
v2:
 * fix checkpatch error, move { to newline
---
 numa.c            | 14 --------------
 tests/numa-test.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/numa.c b/numa.c
index c89fc2d..f16a6a8 100644
--- a/numa.c
+++ b/numa.c
@@ -737,20 +737,6 @@ MemdevList *qmp_query_memdev(Error **errp)
     return list;
 }
 
-int numa_get_node_for_cpu(int idx)
-{
-    int i;
-
-    assert(idx < max_cpus);
-
-    for (i = 0; i < nb_numa_nodes; i++) {
-        if (test_bit(idx, numa_info[i].node_cpu)) {
-            break;
-        }
-    }
-    return i;
-}
-
 void ram_block_notifier_add(RAMBlockNotifier *n)
 {
     QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next);
diff --git a/tests/numa-test.c b/tests/numa-test.c
index f5da0c8..2722687 100644
--- a/tests/numa-test.c
+++ b/tests/numa-test.c
@@ -87,6 +87,50 @@ static void test_mon_partial(const void *data)
     g_free(cli);
 }
 
+static QList *get_cpus(QDict **resp)
+{
+    *resp = qmp("{ 'execute': 'query-cpus' }");
+    g_assert(*resp);
+    g_assert(qdict_haskey(*resp, "return"));
+    return  qdict_get_qlist(*resp, "return");
+}
+
+static void test_query_cpus(const void *data)
+{
+    char *cli;
+    QDict *resp;
+    QList *cpus;
+    const QObject *e;
+
+    cli = make_cli(data, "-smp 8 -numa node,cpus=0-3 -numa node,cpus=4-7");
+    qtest_start(cli);
+    cpus = get_cpus(&resp);
+    g_assert(cpus);
+
+    while ((e = qlist_pop(cpus))) {
+        QDict *cpu, *props;
+        int64_t cpu_idx, node;
+
+        cpu = qobject_to_qdict(e);
+        g_assert(qdict_haskey(cpu, "CPU"));
+        g_assert(qdict_haskey(cpu, "props"));
+
+        cpu_idx = qdict_get_int(cpu, "CPU");
+        props = qdict_get_qdict(cpu, "props");
+        g_assert(qdict_haskey(props, "node-id"));
+        node = qdict_get_int(props, "node-id");
+        if (cpu_idx >= 0 && cpu_idx < 4) {
+            g_assert_cmpint(node, ==, 0);
+        } else {
+            g_assert_cmpint(node, ==, 1);
+        }
+    }
+
+    QDECREF(resp);
+    qtest_end();
+    g_free(cli);
+}
+
 int main(int argc, char **argv)
 {
     const char *args = NULL;
@@ -101,6 +145,7 @@ int main(int argc, char **argv)
     qtest_add_data_func("/numa/mon/default", args, test_mon_default);
     qtest_add_data_func("/numa/mon/cpus/explicit", args, test_mon_explicit);
     qtest_add_data_func("/numa/mon/cpus/partial", args, test_mon_partial);
+    qtest_add_data_func("/numa/qmp/cpus/query-cpus", args, test_query_cpus);
 
     return g_test_run();
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 13/18] numa: remove no longer need numa_post_machine_init()
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (11 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 12/18] tests: numa: add case for QMP command query-cpus Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 14/18] machine: call machine init from wrapper Igor Mammedov
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

CPUState::numa_node is still in use but now it's set by
board when it creates CPU objects. So there isn't any
need to set it again after all CPU's are created,
since it's been already set.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 include/sysemu/numa.h |  6 ------
 numa.c                | 15 ---------------
 vl.c                  |  2 --
 3 files changed, 23 deletions(-)

diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 027830c..8cb3ebc 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -27,7 +27,6 @@ struct node_info {
 
 extern NodeInfo numa_info[MAX_NODES];
 void parse_numa_opts(MachineState *ms);
-void numa_post_machine_init(void);
 void query_numa_node_mem(uint64_t node_mem[]);
 extern QemuOptsList qemu_numa_opts;
 void numa_set_mem_node_id(ram_addr_t addr, uint64_t size, uint32_t node);
@@ -37,9 +36,4 @@ void numa_legacy_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,
                                  int nb_nodes, ram_addr_t size);
 void numa_default_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,
                                   int nb_nodes, ram_addr_t size);
-
-
-/* on success returns node index in numa_info,
- * on failure returns nb_numa_nodes */
-int numa_get_node_for_cpu(int idx);
 #endif
diff --git a/numa.c b/numa.c
index f16a6a8..dc739ea 100644
--- a/numa.c
+++ b/numa.c
@@ -572,21 +572,6 @@ void parse_numa_opts(MachineState *ms)
     }
 }
 
-void numa_post_machine_init(void)
-{
-    CPUState *cpu;
-    int i;
-
-    CPU_FOREACH(cpu) {
-        for (i = 0; i < nb_numa_nodes; i++) {
-            assert(cpu->cpu_index < max_cpus);
-            if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
-                cpu->numa_node = i;
-            }
-        }
-    }
-}
-
 static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
                                            const char *name,
                                            uint64_t ram_size)
diff --git a/vl.c b/vl.c
index c63f4d5..fe4741d 100644
--- a/vl.c
+++ b/vl.c
@@ -4595,8 +4595,6 @@ int main(int argc, char **argv, char **envp)
 
     cpu_synchronize_all_post_init();
 
-    numa_post_machine_init();
-
     rom_reset_order_override();
 
     /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 14/18] machine: call machine init from wrapper
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (12 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 13/18] numa: remove no longer need numa_post_machine_init() Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 15/18] numa: use possible_cpus for not mapped CPUs check Igor Mammedov
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

add machine_run_board_init() wrapper that calls machine
init for now but in follow up patches it will be used
to run generic machine code that should run before
machine init.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
---
 include/hw/boards.h | 1 +
 hw/core/machine.c   | 6 ++++++
 vl.c                | 2 +-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 4e14ff0..76ce021 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -32,6 +32,7 @@ void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,
 MachineClass *find_default_machine(void);
 extern MachineState *current_machine;
 
+void machine_run_board_init(MachineState *machine);
 bool machine_usb(MachineState *machine);
 bool machine_kernel_irqchip_allowed(MachineState *machine);
 bool machine_kernel_irqchip_required(MachineState *machine);
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 420c8c4..64e2a4f 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -678,6 +678,12 @@ bool machine_mem_merge(MachineState *machine)
     return machine->mem_merge;
 }
 
+void machine_run_board_init(MachineState *machine)
+{
+    MachineClass *machine_class = MACHINE_GET_CLASS(machine);
+    machine_class->init(machine);
+}
+
 static void machine_class_finalize(ObjectClass *klass, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(klass);
diff --git a/vl.c b/vl.c
index fe4741d..ac46d6e 100644
--- a/vl.c
+++ b/vl.c
@@ -4562,7 +4562,7 @@ int main(int argc, char **argv, char **envp)
     current_machine->boot_order = boot_order;
     current_machine->cpu_model = cpu_model;
 
-    machine_class->init(current_machine);
+    machine_run_board_init(current_machine);
 
     realtime_init();
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 15/18] numa: use possible_cpus for not mapped CPUs check
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (13 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 14/18] machine: call machine init from wrapper Igor Mammedov
@ 2017-05-10 11:29 ` Igor Mammedov
  2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 16/18] numa: remove node_cpu bitmaps as they are no longer used Igor Mammedov
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

and remove corresponding part in numa.c that uses
node_cpu bitmaps.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
It's one more less user of node_cpu bitmpas, following
commit will remove the last user along with
node_cpu itself.
---
 hw/core/machine.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 numa.c            | 10 ----------
 2 files changed, 58 insertions(+), 10 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 64e2a4f..fd6a436 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -20,6 +20,7 @@
 #include "sysemu/numa.h"
 #include "qemu/error-report.h"
 #include "qemu/cutils.h"
+#include "sysemu/numa.h"
 
 static char *machine_get_accel(Object *obj, Error **errp)
 {
@@ -678,9 +679,66 @@ bool machine_mem_merge(MachineState *machine)
     return machine->mem_merge;
 }
 
+static char *cpu_slot_to_string(const CPUArchId *cpu)
+{
+    GString *s = g_string_new(NULL);
+    if (cpu->props.has_socket_id) {
+        g_string_append_printf(s, "socket-id: %"PRId64, cpu->props.socket_id);
+    }
+    if (cpu->props.has_core_id) {
+        if (s->len) {
+            g_string_append_printf(s, ", ");
+        }
+        g_string_append_printf(s, "core-id: %"PRId64, cpu->props.core_id);
+    }
+    if (cpu->props.has_thread_id) {
+        if (s->len) {
+            g_string_append_printf(s, ", ");
+        }
+        g_string_append_printf(s, "thread-id: %"PRId64, cpu->props.thread_id);
+    }
+    return g_string_free(s, false);
+}
+
+static void machine_numa_validate(MachineState *machine)
+{
+    int i;
+    GString *s = g_string_new(NULL);
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(machine);
+
+    assert(nb_numa_nodes);
+    for (i = 0; i < possible_cpus->len; i++) {
+        const CPUArchId *cpu_slot = &possible_cpus->cpus[i];
+
+        /* at this point numa mappings are initilized by CLI options
+         * or with default mappings so it's sufficient to list
+         * all not yet mapped CPUs here */
+        /* TODO: make it hard error in future */
+        if (!cpu_slot->props.has_node_id) {
+            char *cpu_str = cpu_slot_to_string(cpu_slot);
+            g_string_append_printf(s, "%sCPU %d [%s]", s->len ? ", " : "", i,
+                                   cpu_str);
+            g_free(cpu_str);
+        }
+    }
+    if (s->len) {
+        error_report("warning: CPU(s) not present in any NUMA nodes: %s",
+                     s->str);
+        error_report("warning: All CPU(s) up to maxcpus should be described "
+                     "in NUMA config, ability to start up with partial NUMA "
+                     "mappings is obsoleted and will be removed in future");
+    }
+    g_string_free(s, true);
+}
+
 void machine_run_board_init(MachineState *machine)
 {
     MachineClass *machine_class = MACHINE_GET_CLASS(machine);
+
+    if (nb_numa_nodes) {
+        machine_numa_validate(machine);
+    }
     machine_class->init(machine);
 }
 
diff --git a/numa.c b/numa.c
index dc739ea..63bff5a 100644
--- a/numa.c
+++ b/numa.c
@@ -337,16 +337,6 @@ static void validate_numa_cpus(void)
         bitmap_or(seen_cpus, seen_cpus,
                   numa_info[i].node_cpu, max_cpus);
     }
-
-    if (!bitmap_full(seen_cpus, max_cpus)) {
-        char *msg;
-        bitmap_complement(seen_cpus, seen_cpus, max_cpus);
-        msg = enumerate_cpus(seen_cpus, max_cpus);
-        error_report("warning: CPU(s) not present in any NUMA nodes: %s", msg);
-        error_report("warning: All CPU(s) up to maxcpus should be described "
-                     "in NUMA config");
-        g_free(msg);
-    }
     g_free(seen_cpus);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 16/18] numa: remove node_cpu bitmaps as they are no longer used
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (14 preceding siblings ...)
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 15/18] numa: use possible_cpus for not mapped CPUs check Igor Mammedov
@ 2017-05-10 11:30 ` Igor Mammedov
  2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 17/18] numa: add '-numa cpu, ...' option for property based node mapping Igor Mammedov
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Postfactum "CPU(s) present in multiple NUMA nodes" check
was the last user of node_cpu bitmaps, but it's not need
as machine_set_cpu_numa_node() does the similar check at
the time mapping is set for cpus (i.e. when -numa cpus=
is parsed) and ensures that cpu can be mapped only to
one node.

Remove duplicate check based on node_cpu bitmaps and
since the last user is gone remove node_cpu as well,
which completes internal transition from legacy bitmap
based mapping storage to possible_cpus storage.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 include/sysemu/numa.h |  1 -
 numa.c                | 43 -------------------------------------------
 2 files changed, 44 deletions(-)

diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 8cb3ebc..7ffde5b 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -18,7 +18,6 @@ struct numa_addr_range {
 
 struct node_info {
     uint64_t node_mem;
-    unsigned long *node_cpu;
     struct HostMemoryBackend *node_memdev;
     bool present;
     QLIST_HEAD(, numa_addr_range) addr; /* List to store address ranges */
diff --git a/numa.c b/numa.c
index 63bff5a..ca122cc 100644
--- a/numa.c
+++ b/numa.c
@@ -178,7 +178,6 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
                        cpus->value, max_cpus);
             return;
         }
-        bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
         props = mc->cpu_index_to_instance_props(ms, cpus->value);
         props.node_id = nodenr;
         props.has_node_id = true;
@@ -305,41 +304,6 @@ end:
     return 0;
 }
 
-static char *enumerate_cpus(unsigned long *cpus, int max_cpus)
-{
-    int cpu;
-    bool first = true;
-    GString *s = g_string_new(NULL);
-
-    for (cpu = find_first_bit(cpus, max_cpus);
-        cpu < max_cpus;
-        cpu = find_next_bit(cpus, max_cpus, cpu + 1)) {
-        g_string_append_printf(s, "%s%d", first ? "" : " ", cpu);
-        first = false;
-    }
-    return g_string_free(s, FALSE);
-}
-
-static void validate_numa_cpus(void)
-{
-    int i;
-    unsigned long *seen_cpus = bitmap_new(max_cpus);
-
-    for (i = 0; i < nb_numa_nodes; i++) {
-        if (bitmap_intersects(seen_cpus, numa_info[i].node_cpu, max_cpus)) {
-            bitmap_and(seen_cpus, seen_cpus,
-                       numa_info[i].node_cpu, max_cpus);
-            error_report("CPU(s) present in multiple NUMA nodes: %s",
-                         enumerate_cpus(seen_cpus, max_cpus));
-            g_free(seen_cpus);
-            exit(EXIT_FAILURE);
-        }
-        bitmap_or(seen_cpus, seen_cpus,
-                  numa_info[i].node_cpu, max_cpus);
-    }
-    g_free(seen_cpus);
-}
-
 /* If all node pair distances are symmetric, then only distances
  * in one direction are enough. If there is even one asymmetric
  * pair, though, then all distances must be provided. The
@@ -451,10 +415,6 @@ void parse_numa_opts(MachineState *ms)
     const CPUArchIdList *possible_cpus;
     MachineClass *mc = MACHINE_GET_CLASS(ms);
 
-    for (i = 0; i < MAX_NODES; i++) {
-        numa_info[i].node_cpu = bitmap_new(max_cpus);
-    }
-
     if (qemu_opts_foreach(qemu_find_opts("numa"), parse_numa, ms, NULL)) {
         exit(1);
     }
@@ -531,13 +491,10 @@ void parse_numa_opts(MachineState *ms)
                 props = mc->cpu_index_to_instance_props(ms, i);
                 props.has_node_id = true;
 
-                set_bit(i, numa_info[props.node_id].node_cpu);
                 machine_set_cpu_numa_node(ms, &props, &error_fatal);
             }
         }
 
-        validate_numa_cpus();
-
         /* QEMU needs at least all unique node pair distances to build
          * the whole NUMA distance table. QEMU treats the distance table
          * as symmetric by default, i.e. distance A->B == distance B->A.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 17/18] numa: add '-numa cpu, ...' option for property based node mapping
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (15 preceding siblings ...)
  2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 16/18] numa: remove node_cpu bitmaps as they are no longer used Igor Mammedov
@ 2017-05-10 11:30 ` Igor Mammedov
  2017-05-11  5:23   ` David Gibson
  2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 18/18] tests: check -numa node, cpu=props_list usecase Igor Mammedov
  2017-05-11 13:38 ` [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Eduardo Habkost
  18 siblings, 1 reply; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

legacy cpu to node mapping is using cpu index values to map
VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
option. However cpu index is internal concept and QEMU users
have to guess /reimplement qemu's logic/ to map it to
a concrete cpu socket/core/thread to make sane CPUs
placement across numa nodes.

This patch allows to map cpu objects to numa nodes using
the same properties as used for cpus with -device/device_add
(socket-id/core-id/thread-id/node-id).

At present valid properties/values to address CPUs could be
fetched using hotpluggable-cpus monitor/qmp command, it will
require user to start qemu twice when creating domain to fetch
possible CPUs for a machine type/-smp layout first and
then the second time with numa explicit mapping for actual
usage. The first step results could be saved and reused to
set/change mapping later as far as machine type/-smp stays
the same.

Proposed impl. supports exact and wildcard matching to
simplify CLI and allow to set mapping for a specific cpu
or group of cpu objects specified by matched properties.

For example:

   # exact mapping x86
   -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n

   # exact mapping SPAPR
   -numa cpu,node-id=x,core-id=y

   # wildcard mapping, all cpu objects that match socket-id=y
   # are mapped to node-id=x
   -numa cpu,node-id=x,socket-id=y

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
v3:
  - use qapi_NumaCpuOptions_base() instead of copying fields manually
    (Eduardo)
v2:
  - use new NumaCpuOptions instead of CpuInstanceProperties in
    NumaOptions, so that in future we could decouple both
    if needed. (Eduardo Habkost <ehabkost@redhat.com>)
  - clarify effect of NumaCpuOptions.node-id in qapi-schema.json
---
 numa.c           | 15 +++++++++++++++
 qapi-schema.json | 21 +++++++++++++++++++--
 qemu-options.hx  | 20 ++++++++++++++++++++
 3 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/numa.c b/numa.c
index ca122cc..84ce2af 100644
--- a/numa.c
+++ b/numa.c
@@ -290,6 +290,21 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
             goto end;
         }
         break;
+    case NUMA_OPTIONS_TYPE_CPU:
+        if (!object->u.cpu.has_node_id) {
+            error_setg(&err, "Missing mandatory node-id property");
+            goto end;
+        }
+        if (!numa_info[object->u.cpu.node_id].present) {
+            error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
+                "defined with -numa node,nodeid=ID before it's used with "
+                "-numa cpu,node-id=ID", object->u.cpu.node_id);
+            goto end;
+        }
+
+        machine_set_cpu_numa_node(ms, qapi_NumaCpuOptions_base(&object->u.cpu),
+                                  &err);
+        break;
     default:
         abort();
     }
diff --git a/qapi-schema.json b/qapi-schema.json
index f1bcebe..c9fdbc3 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -5684,10 +5684,12 @@
 #
 # @dist: NUMA distance configuration (since 2.10)
 #
+# @cpu: property based CPU(s) to node mapping (Since: 2.10)
+#
 # Since: 2.1
 ##
 { 'enum': 'NumaOptionsType',
-  'data': [ 'node', 'dist' ] }
+  'data': [ 'node', 'dist', 'cpu' ] }
 
 ##
 # @NumaOptions:
@@ -5701,7 +5703,8 @@
   'discriminator': 'type',
   'data': {
     'node': 'NumaNodeOptions',
-    'dist': 'NumaDistOptions' }}
+    'dist': 'NumaDistOptions',
+    'cpu': 'NumaCpuOptions' }}
 
 ##
 # @NumaNodeOptions:
@@ -5751,6 +5754,20 @@
    'val': 'uint8' }}
 
 ##
+# @NumaCpuOptions:
+#
+# Option "-numa cpu" overrides default cpu to node mapping.
+# It accepts the same set of cpu properties as returned by
+# query-hotpluggable-cpus[].props, where node-id could be used to
+# override default node mapping.
+#
+# Since: 2.10
+##
+{ 'struct': 'NumaCpuOptions',
+   'base': 'CpuInstanceProperties',
+   'data' : {} }
+
+##
 # @HostMemPolicy:
 #
 # Host memory policy types
diff --git a/qemu-options.hx b/qemu-options.hx
index cfe4dc3..731f1bd 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -145,10 +145,12 @@ STEXI
 @item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
 @itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
 @itemx -numa dist,src=@var{source},dst=@var{destination},val=@var{distance}
+@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
 @findex -numa
 Define a NUMA node and assign RAM and VCPUs to it.
 Set the NUMA distance from a source node to a destination node.
 
+Legacy VCPU assignment uses @samp{cpus} option where
 @var{firstcpu} and @var{lastcpu} are CPU indexes. Each
 @samp{cpus} option represent a contiguous range of CPU indexes
 (or a single VCPU if @var{lastcpu} is omitted). A non-contiguous
@@ -162,6 +164,24 @@ a NUMA node:
 -numa node,cpus=0-2,cpus=5
 @end example
 
+@samp{cpu} option is a new alternative to @samp{cpus} option
+which uses @samp{socket-id|core-id|thread-id} properties to assign
+CPU objects to a @var{node} using topology layout properties of CPU.
+The set of properties is machine specific, and depends on used
+machine type/@samp{smp} options. It could be queried with
+@samp{hotpluggable-cpus} monitor command.
+@samp{node-id} property specifies @var{node} to which CPU object
+will be assigned, it's required for @var{node} to be declared
+with @samp{node} option before it's used with @samp{cpu} option.
+
+For example:
+@example
+-M pc \
+-smp 1,sockets=2,maxcpus=2 \
+-numa node,nodeid=0 -numa node,nodeid=1 \
+-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
+@end example
+
 @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
 assigns RAM from a given memory backend device to a node. If
 @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v3 18/18] tests: check -numa node, cpu=props_list usecase
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (16 preceding siblings ...)
  2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 17/18] numa: add '-numa cpu, ...' option for property based node mapping Igor Mammedov
@ 2017-05-10 11:30 ` Igor Mammedov
  2017-05-11 13:38 ` [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Eduardo Habkost
  18 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-10 11:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 tests/numa-test.c | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 151 insertions(+)

diff --git a/tests/numa-test.c b/tests/numa-test.c
index 2722687..c3475d6 100644
--- a/tests/numa-test.c
+++ b/tests/numa-test.c
@@ -131,6 +131,144 @@ static void test_query_cpus(const void *data)
     g_free(cli);
 }
 
+static void pc_numa_cpu(const void *data)
+{
+    char *cli;
+    QDict *resp;
+    QList *cpus;
+    const QObject *e;
+
+    cli = make_cli(data, "-cpu pentium -smp 8,sockets=2,cores=2,threads=2 "
+        "-numa node,nodeid=0 -numa node,nodeid=1 "
+        "-numa cpu,node-id=1,socket-id=0 "
+        "-numa cpu,node-id=0,socket-id=1,core-id=0 "
+        "-numa cpu,node-id=0,socket-id=1,core-id=1,thread-id=0 "
+        "-numa cpu,node-id=1,socket-id=1,core-id=1,thread-id=1");
+    qtest_start(cli);
+    cpus = get_cpus(&resp);
+    g_assert(cpus);
+
+    while ((e = qlist_pop(cpus))) {
+        QDict *cpu, *props;
+        int64_t socket, core, thread, node;
+
+        cpu = qobject_to_qdict(e);
+        g_assert(qdict_haskey(cpu, "props"));
+        props = qdict_get_qdict(cpu, "props");
+
+        g_assert(qdict_haskey(props, "node-id"));
+        node = qdict_get_int(props, "node-id");
+        g_assert(qdict_haskey(props, "socket-id"));
+        socket = qdict_get_int(props, "socket-id");
+        g_assert(qdict_haskey(props, "core-id"));
+        core = qdict_get_int(props, "core-id");
+        g_assert(qdict_haskey(props, "thread-id"));
+        thread = qdict_get_int(props, "thread-id");
+
+        if (socket == 0) {
+            g_assert_cmpint(node, ==, 1);
+        } else if (socket == 1 && core == 0) {
+            g_assert_cmpint(node, ==, 0);
+        } else if (socket == 1 && core == 1 && thread == 0) {
+            g_assert_cmpint(node, ==, 0);
+        } else if (socket == 1 && core == 1 && thread == 1) {
+            g_assert_cmpint(node, ==, 1);
+        } else {
+            g_assert(false);
+        }
+    }
+
+    QDECREF(resp);
+    qtest_end();
+    g_free(cli);
+}
+
+static void spapr_numa_cpu(const void *data)
+{
+    char *cli;
+    QDict *resp;
+    QList *cpus;
+    const QObject *e;
+
+    cli = make_cli(data, "-smp 4,cores=4 "
+        "-numa node,nodeid=0 -numa node,nodeid=1 "
+        "-numa cpu,node-id=0,core-id=0 "
+        "-numa cpu,node-id=0,core-id=1 "
+        "-numa cpu,node-id=0,core-id=2 "
+        "-numa cpu,node-id=1,core-id=3");
+    qtest_start(cli);
+    cpus = get_cpus(&resp);
+    g_assert(cpus);
+
+    while ((e = qlist_pop(cpus))) {
+        QDict *cpu, *props;
+        int64_t core, node;
+
+        cpu = qobject_to_qdict(e);
+        g_assert(qdict_haskey(cpu, "props"));
+        props = qdict_get_qdict(cpu, "props");
+
+        g_assert(qdict_haskey(props, "node-id"));
+        node = qdict_get_int(props, "node-id");
+        g_assert(qdict_haskey(props, "core-id"));
+        core = qdict_get_int(props, "core-id");
+
+        if (core >= 0 && core < 3) {
+            g_assert_cmpint(node, ==, 0);
+        } else if (core == 3) {
+            g_assert_cmpint(node, ==, 1);
+        } else {
+            g_assert(false);
+        }
+    }
+
+    QDECREF(resp);
+    qtest_end();
+    g_free(cli);
+}
+
+static void aarch64_numa_cpu(const void *data)
+{
+    char *cli;
+    QDict *resp;
+    QList *cpus;
+    const QObject *e;
+
+    cli = make_cli(data, "-smp 2 "
+        "-numa node,nodeid=0 -numa node,nodeid=1 "
+        "-numa cpu,node-id=1,thread-id=0 "
+        "-numa cpu,node-id=0,thread-id=1");
+    qtest_start(cli);
+    cpus = get_cpus(&resp);
+    g_assert(cpus);
+
+    while ((e = qlist_pop(cpus))) {
+        QDict *cpu, *props;
+        int64_t thread, node;
+
+        cpu = qobject_to_qdict(e);
+        g_assert(qdict_haskey(cpu, "props"));
+        props = qdict_get_qdict(cpu, "props");
+
+        g_assert(qdict_haskey(props, "node-id"));
+        node = qdict_get_int(props, "node-id");
+        g_assert(qdict_haskey(props, "thread-id"));
+        thread = qdict_get_int(props, "thread-id");
+
+        if (thread == 0) {
+            g_assert_cmpint(node, ==, 1);
+        } else if (thread == 1) {
+            g_assert_cmpint(node, ==, 0);
+        } else {
+            g_assert(false);
+        }
+    }
+
+    QDECREF(resp);
+    qtest_end();
+    g_free(cli);
+}
+
 int main(int argc, char **argv)
 {
     const char *args = NULL;
@@ -147,5 +285,18 @@ int main(int argc, char **argv)
     qtest_add_data_func("/numa/mon/cpus/partial", args, test_mon_partial);
     qtest_add_data_func("/numa/qmp/cpus/query-cpus", args, test_query_cpus);
 
+    if (!strcmp(arch, "i386") || !strcmp(arch, "x86_64")) {
+        qtest_add_data_func("/numa/pc/cpu/explicit", args, pc_numa_cpu);
+    }
+
+    if (!strcmp(arch, "ppc64")) {
+        qtest_add_data_func("/numa/spapr/cpu/explicit", args, spapr_numa_cpu);
+    }
+
+    if (!strcmp(arch, "aarch64")) {
+        qtest_add_data_func("/numa/aarch64/cpu/explicit", args,
+                            aarch64_numa_cpu);
+    }
+
     return g_test_run();
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 01/18] numa: move source of default CPUs to NUMA node mapping into boards
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 01/18] numa: move source of default CPUs to NUMA node mapping into boards Igor Mammedov
@ 2017-05-11  4:36   ` David Gibson
  0 siblings, 0 replies; 29+ messages in thread
From: David Gibson @ 2017-05-11  4:36 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Eduardo Habkost, Peter Maydell, Andrew Jones,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 2043 bytes --]

On Wed, May 10, 2017 at 01:29:45PM +0200, Igor Mammedov wrote:
> Originally CPU threads were by default assigned in
> round-robin fashion. However it was causing issues in
> guest since CPU threads from the same socket/core could
> be placed on different NUMA nodes.
> Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
> fixed it by grouping threads within a socket on the same node
> introducing cpu_index_to_socket_id() callback and commit
> 20bb648d (spapr: Fix default NUMA node allocation for threads)
> reused callback to fix similar issues for SPAPR machine
> even though socket doesn't make much sense there.
> 
> As result QEMU ended up having 3 default distribution rules
> used by 3 targets /virt-arm, spapr, pc/.
> 
> In effort of moving NUMA mapping for CPUs into possible_cpus,
> generalize default mapping in numa.c by making boards decide
> on default mapping and let them explicitly tell generic
> numa code to which node a CPU thread belongs to by replacing
> cpu_index_to_socket_id() with @cpu_index_to_instance_props()
> which provides default node_id assigned by board to specified
> cpu_index.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

[snip]
> +static CpuInstanceProperties
> +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> +
> +    assert(cpu_index < possible_cpus->len);
> +    return possible_cpus->cpus[cpu_index].props;
> +}

Suggested follow up patch: Since this function is identical for x86
and ARM, we could have a "thread_granularity_cpu_index_to_props"
helper they could both reference.  M

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 11/18] QMP: include CpuInstanceProperties into query_cpus output output
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 11/18] QMP: include CpuInstanceProperties into query_cpus output output Igor Mammedov
@ 2017-05-11  4:39   ` David Gibson
  0 siblings, 0 replies; 29+ messages in thread
From: David Gibson @ 2017-05-11  4:39 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Eduardo Habkost, Peter Maydell, Andrew Jones,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 3241 bytes --]

On Wed, May 10, 2017 at 01:29:55PM +0200, Igor Mammedov wrote:
> if board supports CpuInstanceProperties, report them for
> each CPU thread listed. Main motivation for this is to
> provide these properties introspection via QMP interface
> for using in test cases to verify numa node to cpu mapping,
> which includes not only boards that support cpu hotplug
> and have this info in query-hotpluggable-cpus (pc/spapr)
> but also for boards that don't not support hotpluggable-cpus
> but support numa mapping (virt-arm).
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> v2:
>   * fix checkpatch error and remove extra space after = (Eric)
> ---
>  cpus.c           | 10 ++++++++++
>  qapi-schema.json |  6 +++++-
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 740b8dc..4f91d25 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -50,6 +50,7 @@
>  #include "qapi-event.h"
>  #include "hw/nmi.h"
>  #include "sysemu/replay.h"
> +#include "hw/boards.h"
>  
>  #ifdef CONFIG_LINUX
>  
> @@ -1859,6 +1860,8 @@ void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
>  
>  CpuInfoList *qmp_query_cpus(Error **errp)
>  {
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>      CpuInfoList *head = NULL, *cur_item = NULL;
>      CPUState *cpu;
>  
> @@ -1909,6 +1912,13 @@ CpuInfoList *qmp_query_cpus(Error **errp)
>  #else
>          info->value->arch = CPU_INFO_ARCH_OTHER;
>  #endif
> +        info->value->has_props = !!mc->cpu_index_to_instance_props;
> +        if (info->value->has_props) {
> +            CpuInstanceProperties *props;
> +            props = g_malloc0(sizeof(*props));
> +            *props = mc->cpu_index_to_instance_props(ms, cpu->cpu_index);
> +            info->value->props = props;
> +        }
>  
>          /* XXX: waiting for the qapi to support GSList */
>          if (!cur_item) {
> diff --git a/qapi-schema.json b/qapi-schema.json
> index bf48873..f1bcebe 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1325,6 +1325,9 @@
>  #
>  # @thread_id: ID of the underlying host thread
>  #
> +# @props: properties describing to which node/socket/core/thread
> +#         virtual CPU belongs to, provided if supported by board (since 2.10)
> +#
>  # @arch: architecture of the cpu, which determines which additional fields
>  #        will be listed (since 2.6)
>  #
> @@ -1335,7 +1338,8 @@
>  ##
>  { 'union': 'CpuInfo',
>    'base': {'CPU': 'int', 'current': 'bool', 'halted': 'bool',
> -           'qom_path': 'str', 'thread_id': 'int', 'arch': 'CpuInfoArch' },
> +           'qom_path': 'str', 'thread_id': 'int',
> +           '*props': 'CpuInstanceProperties', 'arch': 'CpuInfoArch' },
>    'discriminator': 'arch',
>    'data': { 'x86': 'CpuInfoX86',
>              'sparc': 'CpuInfoSPARC',

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 17/18] numa: add '-numa cpu, ...' option for property based node mapping
  2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 17/18] numa: add '-numa cpu, ...' option for property based node mapping Igor Mammedov
@ 2017-05-11  5:23   ` David Gibson
  0 siblings, 0 replies; 29+ messages in thread
From: David Gibson @ 2017-05-11  5:23 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Eduardo Habkost, Peter Maydell, Andrew Jones,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 6641 bytes --]

On Wed, May 10, 2017 at 01:30:01PM +0200, Igor Mammedov wrote:
> legacy cpu to node mapping is using cpu index values to map
> VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> option. However cpu index is internal concept and QEMU users
> have to guess /reimplement qemu's logic/ to map it to
> a concrete cpu socket/core/thread to make sane CPUs
> placement across numa nodes.
> 
> This patch allows to map cpu objects to numa nodes using
> the same properties as used for cpus with -device/device_add
> (socket-id/core-id/thread-id/node-id).
> 
> At present valid properties/values to address CPUs could be
> fetched using hotpluggable-cpus monitor/qmp command, it will
> require user to start qemu twice when creating domain to fetch
> possible CPUs for a machine type/-smp layout first and
> then the second time with numa explicit mapping for actual
> usage. The first step results could be saved and reused to
> set/change mapping later as far as machine type/-smp stays
> the same.
> 
> Proposed impl. supports exact and wildcard matching to
> simplify CLI and allow to set mapping for a specific cpu
> or group of cpu objects specified by matched properties.
> 
> For example:
> 
>    # exact mapping x86
>    -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> 
>    # exact mapping SPAPR
>    -numa cpu,node-id=x,core-id=y
> 
>    # wildcard mapping, all cpu objects that match socket-id=y
>    # are mapped to node-id=x
>    -numa cpu,node-id=x,socket-id=y
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> v3:
>   - use qapi_NumaCpuOptions_base() instead of copying fields manually
>     (Eduardo)
> v2:
>   - use new NumaCpuOptions instead of CpuInstanceProperties in
>     NumaOptions, so that in future we could decouple both
>     if needed. (Eduardo Habkost <ehabkost@redhat.com>)
>   - clarify effect of NumaCpuOptions.node-id in qapi-schema.json
> ---
>  numa.c           | 15 +++++++++++++++
>  qapi-schema.json | 21 +++++++++++++++++++--
>  qemu-options.hx  | 20 ++++++++++++++++++++
>  3 files changed, 54 insertions(+), 2 deletions(-)
> 
> diff --git a/numa.c b/numa.c
> index ca122cc..84ce2af 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -290,6 +290,21 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
>              goto end;
>          }
>          break;
> +    case NUMA_OPTIONS_TYPE_CPU:
> +        if (!object->u.cpu.has_node_id) {
> +            error_setg(&err, "Missing mandatory node-id property");
> +            goto end;
> +        }
> +        if (!numa_info[object->u.cpu.node_id].present) {
> +            error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
> +                "defined with -numa node,nodeid=ID before it's used with "
> +                "-numa cpu,node-id=ID", object->u.cpu.node_id);
> +            goto end;
> +        }
> +
> +        machine_set_cpu_numa_node(ms, qapi_NumaCpuOptions_base(&object->u.cpu),
> +                                  &err);
> +        break;
>      default:
>          abort();
>      }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index f1bcebe..c9fdbc3 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -5684,10 +5684,12 @@
>  #
>  # @dist: NUMA distance configuration (since 2.10)
>  #
> +# @cpu: property based CPU(s) to node mapping (Since: 2.10)
> +#
>  # Since: 2.1
>  ##
>  { 'enum': 'NumaOptionsType',
> -  'data': [ 'node', 'dist' ] }
> +  'data': [ 'node', 'dist', 'cpu' ] }
>  
>  ##
>  # @NumaOptions:
> @@ -5701,7 +5703,8 @@
>    'discriminator': 'type',
>    'data': {
>      'node': 'NumaNodeOptions',
> -    'dist': 'NumaDistOptions' }}
> +    'dist': 'NumaDistOptions',
> +    'cpu': 'NumaCpuOptions' }}
>  
>  ##
>  # @NumaNodeOptions:
> @@ -5751,6 +5754,20 @@
>     'val': 'uint8' }}
>  
>  ##
> +# @NumaCpuOptions:
> +#
> +# Option "-numa cpu" overrides default cpu to node mapping.
> +# It accepts the same set of cpu properties as returned by
> +# query-hotpluggable-cpus[].props, where node-id could be used to
> +# override default node mapping.
> +#
> +# Since: 2.10
> +##
> +{ 'struct': 'NumaCpuOptions',
> +   'base': 'CpuInstanceProperties',
> +   'data' : {} }
> +
> +##
>  # @HostMemPolicy:
>  #
>  # Host memory policy types
> diff --git a/qemu-options.hx b/qemu-options.hx
> index cfe4dc3..731f1bd 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -145,10 +145,12 @@ STEXI
>  @item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
>  @itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
>  @itemx -numa dist,src=@var{source},dst=@var{destination},val=@var{distance}
> +@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
>  @findex -numa
>  Define a NUMA node and assign RAM and VCPUs to it.
>  Set the NUMA distance from a source node to a destination node.
>  
> +Legacy VCPU assignment uses @samp{cpus} option where
>  @var{firstcpu} and @var{lastcpu} are CPU indexes. Each
>  @samp{cpus} option represent a contiguous range of CPU indexes
>  (or a single VCPU if @var{lastcpu} is omitted). A non-contiguous
> @@ -162,6 +164,24 @@ a NUMA node:
>  -numa node,cpus=0-2,cpus=5
>  @end example
>  
> +@samp{cpu} option is a new alternative to @samp{cpus} option
> +which uses @samp{socket-id|core-id|thread-id} properties to assign
> +CPU objects to a @var{node} using topology layout properties of CPU.
> +The set of properties is machine specific, and depends on used
> +machine type/@samp{smp} options. It could be queried with
> +@samp{hotpluggable-cpus} monitor command.
> +@samp{node-id} property specifies @var{node} to which CPU object
> +will be assigned, it's required for @var{node} to be declared
> +with @samp{node} option before it's used with @samp{cpu} option.
> +
> +For example:
> +@example
> +-M pc \
> +-smp 1,sockets=2,maxcpus=2 \
> +-numa node,nodeid=0 -numa node,nodeid=1 \
> +-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
> +@end example
> +
>  @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
>  assigns RAM from a given memory backend device to a node. If
>  @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option
  2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
                   ` (17 preceding siblings ...)
  2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 18/18] tests: check -numa node, cpu=props_list usecase Igor Mammedov
@ 2017-05-11 13:38 ` Eduardo Habkost
  18 siblings, 0 replies; 29+ messages in thread
From: Eduardo Habkost @ 2017-05-11 13:38 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Peter Maydell, Andrew Jones, David Gibson,
	Eric Blake, Paolo Bonzini, Shannon Zhao, qemu-arm, qemu-ppc

On Wed, May 10, 2017 at 01:29:44PM +0200, Igor Mammedov wrote:
[...]
> Igor Mammedov (18):
>   numa: move source of default CPUs to NUMA node mapping into boards
>   spapr: add node-id property to sPAPR core
>   pc: add node-id property to CPU
>   virt-arm: add node-id property to CPU
>   numa: add check that board supports cpu_index to node mapping
>   numa: mirror cpu to node mapping in MachineState::possible_cpus
>   numa: do default mapping based on possible_cpus instead of node_cpu
>     bitmaps
>   pc: get numa node mapping from possible_cpus instead of
>     numa_get_node_for_cpu()
>   spapr: get numa node mapping from possible_cpus instead of
>     numa_get_node_for_cpu()
>   virt-arm: get numa node mapping from possible_cpus instead of
>     numa_get_node_for_cpu()
>   QMP: include CpuInstanceProperties into query_cpus output output
>   tests: numa: add case for QMP command query-cpus
>   numa: remove no longer need numa_post_machine_init()
>   machine: call machine init from wrapper
>   numa: use possible_cpus for not mapped CPUs check
>   numa: remove node_cpu bitmaps as they are no longer used
>   numa: add '-numa cpu,...' option for property based node mapping
>   tests: check -numa node,cpu=props_list usecase

Queued on numa-next. Thanks!

-- 
Eduardo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus
  2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus Igor Mammedov
@ 2017-05-26 15:46   ` Eduardo Habkost
  2017-05-29 13:12     ` Igor Mammedov
  2017-05-30 12:13     ` Igor Mammedov
  0 siblings, 2 replies; 29+ messages in thread
From: Eduardo Habkost @ 2017-05-26 15:46 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Peter Maydell, Andrew Jones, qemu-arm, qemu-ppc,
	Shannon Zhao, Paolo Bonzini, David Gibson

On Wed, May 10, 2017 at 01:29:50PM +0200, Igor Mammedov wrote:
[...]
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 2482c63..420c8c4 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -389,6 +389,102 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)
[...]
> +void machine_set_cpu_numa_node(MachineState *machine,
> +                               const CpuInstanceProperties *props, Error **errp)
> +{
[...]
> +    /* force board to initialize possible_cpus if it hasn't been done yet */
> +    mc->possible_cpu_arch_ids(machine);
[...]
> diff --git a/numa.c b/numa.c
> index 7182481..7db5dde 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -170,6 +170,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
>          exit(1);
>      }
>      for (cpus = node->cpus; cpus; cpus = cpus->next) {
> +        CpuInstanceProperties props;
>          if (cpus->value >= max_cpus) {
>              error_setg(errp,
>                         "CPU index (%" PRIu16 ")"
> @@ -178,6 +179,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
>              return;
>          }
>          bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
> +        props = mc->cpu_index_to_instance_props(ms, cpus->value);
> +        props.node_id = nodenr;
> +        props.has_node_id = true;
> +        machine_set_cpu_numa_node(ms, &props, &error_fatal);

This triggers a call to possible_cpu_arch_ids() before
nb_numa_nodes is set to the actual number of NUMA nodes in the
machine, breaking the "node_id = ... % nb_numa_nodes"
initialization logic in pc, virt, and spapr.

The initialization ordering between possible_cpus and NUMA data
structures looks very subtle and fragile.  I still don't see an
obvious way to untangle that.

I suggest moving the default-NUMA-mapping code to a separate
machine class method, instead of relying on
possible_cpu_arch_ids() to initialize node_id.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus
  2017-05-26 15:46   ` Eduardo Habkost
@ 2017-05-29 13:12     ` Igor Mammedov
  2017-05-29 13:36       ` Eduardo Habkost
  2017-05-30 12:13     ` Igor Mammedov
  1 sibling, 1 reply; 29+ messages in thread
From: Igor Mammedov @ 2017-05-29 13:12 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Maydell, Andrew Jones, qemu-devel, qemu-arm, qemu-ppc,
	Shannon Zhao, Paolo Bonzini, David Gibson

On Fri, 26 May 2017 12:46:25 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Wed, May 10, 2017 at 01:29:50PM +0200, Igor Mammedov wrote:
> [...]
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 2482c63..420c8c4 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -389,6 +389,102 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)  
> [...]
> > +void machine_set_cpu_numa_node(MachineState *machine,
> > +                               const CpuInstanceProperties *props, Error **errp)
> > +{  
> [...]
> > +    /* force board to initialize possible_cpus if it hasn't been done yet */
> > +    mc->possible_cpu_arch_ids(machine);  
> [...]
> > diff --git a/numa.c b/numa.c
> > index 7182481..7db5dde 100644
> > --- a/numa.c
> > +++ b/numa.c
> > @@ -170,6 +170,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> >          exit(1);
> >      }
> >      for (cpus = node->cpus; cpus; cpus = cpus->next) {
> > +        CpuInstanceProperties props;
> >          if (cpus->value >= max_cpus) {
> >              error_setg(errp,
> >                         "CPU index (%" PRIu16 ")"
> > @@ -178,6 +179,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> >              return;
> >          }
> >          bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
> > +        props = mc->cpu_index_to_instance_props(ms, cpus->value);
> > +        props.node_id = nodenr;
> > +        props.has_node_id = true;
> > +        machine_set_cpu_numa_node(ms, &props, &error_fatal);  
> 
> This triggers a call to possible_cpu_arch_ids() before
> nb_numa_nodes is set to the actual number of NUMA nodes in the
> machine, breaking the "node_id = ... % nb_numa_nodes"
> initialization logic in pc, virt, and spapr.
> 
> The initialization ordering between possible_cpus and NUMA data
> structures looks very subtle and fragile.  I still don't see an
> obvious way to untangle that.
It's unfixable unless we require specific ordering on CLI,
i.e. first go all '-numa node,nodeid=[...]' options and only then
the rest of [-numa node,cpus|cpu].
We can do that for '-numa cpu' (probably should do enforce it for
this new option anyway for saner CLI)

but not for '-numa node,cpus' as it will break existing users
that do not declare nodes first.

> I suggest moving the default-NUMA-mapping code to a separate
> machine class method, instead of relying on
> possible_cpu_arch_ids() to initialize node_id.
So as you suggest we have to postpone default values initialization
till all the options are parsed:

1: strait-forward additional machine callback called from
   machine_run_board_init()

or:

2: save extra callback and recalculate not yet set props.node_id-s
   in possible_cpu_arch_ids() if nb_numa_nodes is changed since
   the last invocation of possible_cpu_arch_ids()

which one would you prefer?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus
  2017-05-29 13:12     ` Igor Mammedov
@ 2017-05-29 13:36       ` Eduardo Habkost
  2017-05-29 13:49         ` Igor Mammedov
  0 siblings, 1 reply; 29+ messages in thread
From: Eduardo Habkost @ 2017-05-29 13:36 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Peter Maydell, Andrew Jones, qemu-devel, qemu-arm, qemu-ppc,
	Shannon Zhao, Paolo Bonzini, David Gibson

On Mon, May 29, 2017 at 03:12:45PM +0200, Igor Mammedov wrote:
> On Fri, 26 May 2017 12:46:25 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Wed, May 10, 2017 at 01:29:50PM +0200, Igor Mammedov wrote:
> > [...]
> > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > index 2482c63..420c8c4 100644
> > > --- a/hw/core/machine.c
> > > +++ b/hw/core/machine.c
> > > @@ -389,6 +389,102 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)  
> > [...]
> > > +void machine_set_cpu_numa_node(MachineState *machine,
> > > +                               const CpuInstanceProperties *props, Error **errp)
> > > +{  
> > [...]
> > > +    /* force board to initialize possible_cpus if it hasn't been done yet */
> > > +    mc->possible_cpu_arch_ids(machine);  
> > [...]
> > > diff --git a/numa.c b/numa.c
> > > index 7182481..7db5dde 100644
> > > --- a/numa.c
> > > +++ b/numa.c
> > > @@ -170,6 +170,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > >          exit(1);
> > >      }
> > >      for (cpus = node->cpus; cpus; cpus = cpus->next) {
> > > +        CpuInstanceProperties props;
> > >          if (cpus->value >= max_cpus) {
> > >              error_setg(errp,
> > >                         "CPU index (%" PRIu16 ")"
> > > @@ -178,6 +179,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > >              return;
> > >          }
> > >          bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
> > > +        props = mc->cpu_index_to_instance_props(ms, cpus->value);
> > > +        props.node_id = nodenr;
> > > +        props.has_node_id = true;
> > > +        machine_set_cpu_numa_node(ms, &props, &error_fatal);  
> > 
> > This triggers a call to possible_cpu_arch_ids() before
> > nb_numa_nodes is set to the actual number of NUMA nodes in the
> > machine, breaking the "node_id = ... % nb_numa_nodes"
> > initialization logic in pc, virt, and spapr.
> > 
> > The initialization ordering between possible_cpus and NUMA data
> > structures looks very subtle and fragile.  I still don't see an
> > obvious way to untangle that.
> It's unfixable unless we require specific ordering on CLI,
> i.e. first go all '-numa node,nodeid=[...]' options and only then
> the rest of [-numa node,cpus|cpu].
> We can do that for '-numa cpu' (probably should do enforce it for
> this new option anyway for saner CLI)
> 
> but not for '-numa node,cpus' as it will break existing users
> that do not declare nodes first.
> 
> > I suggest moving the default-NUMA-mapping code to a separate
> > machine class method, instead of relying on
> > possible_cpu_arch_ids() to initialize node_id.
> So as you suggest we have to postpone default values initialization
> till all the options are parsed:
> 
> 1: strait-forward additional machine callback called from
>    machine_run_board_init()
> 
> or:
> 
> 2: save extra callback and recalculate not yet set props.node_id-s
>    in possible_cpu_arch_ids() if nb_numa_nodes is changed since
>    the last invocation of possible_cpu_arch_ids()
> 
> which one would you prefer?

Option 1 sounds simpler to me.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus
  2017-05-29 13:36       ` Eduardo Habkost
@ 2017-05-29 13:49         ` Igor Mammedov
  0 siblings, 0 replies; 29+ messages in thread
From: Igor Mammedov @ 2017-05-29 13:49 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Maydell, Andrew Jones, qemu-devel, qemu-arm, qemu-ppc,
	Shannon Zhao, Paolo Bonzini, David Gibson

On Mon, 29 May 2017 10:36:47 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Mon, May 29, 2017 at 03:12:45PM +0200, Igor Mammedov wrote:
> > On Fri, 26 May 2017 12:46:25 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> >   
> > > On Wed, May 10, 2017 at 01:29:50PM +0200, Igor Mammedov wrote:
> > > [...]  
> > > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > > index 2482c63..420c8c4 100644
> > > > --- a/hw/core/machine.c
> > > > +++ b/hw/core/machine.c
> > > > @@ -389,6 +389,102 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)    
> > > [...]  
> > > > +void machine_set_cpu_numa_node(MachineState *machine,
> > > > +                               const CpuInstanceProperties *props, Error **errp)
> > > > +{    
> > > [...]  
> > > > +    /* force board to initialize possible_cpus if it hasn't been done yet */
> > > > +    mc->possible_cpu_arch_ids(machine);    
> > > [...]  
> > > > diff --git a/numa.c b/numa.c
> > > > index 7182481..7db5dde 100644
> > > > --- a/numa.c
> > > > +++ b/numa.c
> > > > @@ -170,6 +170,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > > >          exit(1);
> > > >      }
> > > >      for (cpus = node->cpus; cpus; cpus = cpus->next) {
> > > > +        CpuInstanceProperties props;
> > > >          if (cpus->value >= max_cpus) {
> > > >              error_setg(errp,
> > > >                         "CPU index (%" PRIu16 ")"
> > > > @@ -178,6 +179,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > > >              return;
> > > >          }
> > > >          bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
> > > > +        props = mc->cpu_index_to_instance_props(ms, cpus->value);
> > > > +        props.node_id = nodenr;
> > > > +        props.has_node_id = true;
> > > > +        machine_set_cpu_numa_node(ms, &props, &error_fatal);    
> > > 
> > > This triggers a call to possible_cpu_arch_ids() before
> > > nb_numa_nodes is set to the actual number of NUMA nodes in the
> > > machine, breaking the "node_id = ... % nb_numa_nodes"
> > > initialization logic in pc, virt, and spapr.
> > > 
> > > The initialization ordering between possible_cpus and NUMA data
> > > structures looks very subtle and fragile.  I still don't see an
> > > obvious way to untangle that.  
> > It's unfixable unless we require specific ordering on CLI,
> > i.e. first go all '-numa node,nodeid=[...]' options and only then
> > the rest of [-numa node,cpus|cpu].
> > We can do that for '-numa cpu' (probably should do enforce it for
> > this new option anyway for saner CLI)
> > 
> > but not for '-numa node,cpus' as it will break existing users
> > that do not declare nodes first.
> >   
> > > I suggest moving the default-NUMA-mapping code to a separate
> > > machine class method, instead of relying on
> > > possible_cpu_arch_ids() to initialize node_id.  
> > So as you suggest we have to postpone default values initialization
> > till all the options are parsed:
> > 
> > 1: strait-forward additional machine callback called from
> >    machine_run_board_init()
> > 
> > or:
> > 
> > 2: save extra callback and recalculate not yet set props.node_id-s
> >    in possible_cpu_arch_ids() if nb_numa_nodes is changed since
> >    the last invocation of possible_cpu_arch_ids()
> > 
> > which one would you prefer?  
> 
> Option 1 sounds simpler to me.
ok, I'll include fix on cleanups series respin.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus
  2017-05-26 15:46   ` Eduardo Habkost
  2017-05-29 13:12     ` Igor Mammedov
@ 2017-05-30 12:13     ` Igor Mammedov
  2017-05-30 14:04       ` Eduardo Habkost
  1 sibling, 1 reply; 29+ messages in thread
From: Igor Mammedov @ 2017-05-30 12:13 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Maydell, Andrew Jones, qemu-devel, qemu-arm, qemu-ppc,
	Shannon Zhao, Paolo Bonzini, David Gibson

On Fri, 26 May 2017 12:46:25 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Wed, May 10, 2017 at 01:29:50PM +0200, Igor Mammedov wrote:
> [...]
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 2482c63..420c8c4 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -389,6 +389,102 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)  
> [...]
> > +void machine_set_cpu_numa_node(MachineState *machine,
> > +                               const CpuInstanceProperties *props, Error **errp)
> > +{  
> [...]
> > +    /* force board to initialize possible_cpus if it hasn't been done yet */
> > +    mc->possible_cpu_arch_ids(machine);  
> [...]
> > diff --git a/numa.c b/numa.c
> > index 7182481..7db5dde 100644
> > --- a/numa.c
> > +++ b/numa.c
> > @@ -170,6 +170,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> >          exit(1);
> >      }
> >      for (cpus = node->cpus; cpus; cpus = cpus->next) {
> > +        CpuInstanceProperties props;
> >          if (cpus->value >= max_cpus) {
> >              error_setg(errp,
> >                         "CPU index (%" PRIu16 ")"
> > @@ -178,6 +179,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> >              return;
> >          }
> >          bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
> > +        props = mc->cpu_index_to_instance_props(ms, cpus->value);
> > +        props.node_id = nodenr;
> > +        props.has_node_id = true;
> > +        machine_set_cpu_numa_node(ms, &props, &error_fatal);  
> 
> This triggers a call to possible_cpu_arch_ids() before
> nb_numa_nodes is set to the actual number of NUMA nodes in the
> machine, breaking the "node_id = ... % nb_numa_nodes"
> initialization logic in pc, virt, and spapr.
Looking at it more, in current code this problem theoretically
affects only partial mapping case. And even there it's masked by 
fixup to 0 node for non explicitly mapped CPUs and the bug which
 *) "[PATCH v2 3/5] numa: make sure that all cpus in has  has_node_id set if numa is enabled"
fixes. So invalid default values for non mapped cpus are stored
in possible_cpus but not used nor visible elsewhere.

[*] as side effect fixes invalid defaults in possible_cpus as well
by fixup to 0 node that's moved from pre_plug handler.

In future we are going to remove partial mapping with fixup to 0
and fetch default mappings from possible_cpus where these wrong
defaults won't exists (or matter) as CPUs are going to either
 1) be all explicitly mapped (so all node_id values are overwritten)
or
 2) all default mapped (where defaults will be correct since no
    -numa cpu=/-numa node,cpus= were called), but that's fragile
    so I'd add calculate defaults at machine_run_board_init()
    time anyway.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus
  2017-05-30 12:13     ` Igor Mammedov
@ 2017-05-30 14:04       ` Eduardo Habkost
  0 siblings, 0 replies; 29+ messages in thread
From: Eduardo Habkost @ 2017-05-30 14:04 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Peter Maydell, Andrew Jones, qemu-devel, qemu-arm, qemu-ppc,
	Shannon Zhao, Paolo Bonzini, David Gibson

On Tue, May 30, 2017 at 02:13:29PM +0200, Igor Mammedov wrote:
> On Fri, 26 May 2017 12:46:25 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Wed, May 10, 2017 at 01:29:50PM +0200, Igor Mammedov wrote:
> > [...]
> > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > index 2482c63..420c8c4 100644
> > > --- a/hw/core/machine.c
> > > +++ b/hw/core/machine.c
> > > @@ -389,6 +389,102 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)  
> > [...]
> > > +void machine_set_cpu_numa_node(MachineState *machine,
> > > +                               const CpuInstanceProperties *props, Error **errp)
> > > +{  
> > [...]
> > > +    /* force board to initialize possible_cpus if it hasn't been done yet */
> > > +    mc->possible_cpu_arch_ids(machine);  
> > [...]
> > > diff --git a/numa.c b/numa.c
> > > index 7182481..7db5dde 100644
> > > --- a/numa.c
> > > +++ b/numa.c
> > > @@ -170,6 +170,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > >          exit(1);
> > >      }
> > >      for (cpus = node->cpus; cpus; cpus = cpus->next) {
> > > +        CpuInstanceProperties props;
> > >          if (cpus->value >= max_cpus) {
> > >              error_setg(errp,
> > >                         "CPU index (%" PRIu16 ")"
> > > @@ -178,6 +179,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > >              return;
> > >          }
> > >          bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
> > > +        props = mc->cpu_index_to_instance_props(ms, cpus->value);
> > > +        props.node_id = nodenr;
> > > +        props.has_node_id = true;
> > > +        machine_set_cpu_numa_node(ms, &props, &error_fatal);  
> > 
> > This triggers a call to possible_cpu_arch_ids() before
> > nb_numa_nodes is set to the actual number of NUMA nodes in the
> > machine, breaking the "node_id = ... % nb_numa_nodes"
> > initialization logic in pc, virt, and spapr.
> Looking at it more, in current code this problem theoretically
> affects only partial mapping case. And even there it's masked by 
> fixup to 0 node for non explicitly mapped CPUs and the bug which
>  *) "[PATCH v2 3/5] numa: make sure that all cpus in has  has_node_id set if numa is enabled"
> fixes. So invalid default values for non mapped cpus are stored
> in possible_cpus but not used nor visible elsewhere.
> 
> [*] as side effect fixes invalid defaults in possible_cpus as well
> by fixup to 0 node that's moved from pre_plug handler.
> 
> In future we are going to remove partial mapping with fixup to 0
> and fetch default mappings from possible_cpus where these wrong
> defaults won't exists (or matter) as CPUs are going to either
>  1) be all explicitly mapped (so all node_id values are overwritten)
> or
>  2) all default mapped (where defaults will be correct since no
>     -numa cpu=/-numa node,cpus= were called), but that's fragile
>     so I'd add calculate defaults at machine_run_board_init()
>     time anyway.

Your analysis seems correct: it looks like just a case of fragile
code, not a bug that can be triggered with the current code.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2017-05-30 14:04 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-10 11:29 [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 01/18] numa: move source of default CPUs to NUMA node mapping into boards Igor Mammedov
2017-05-11  4:36   ` David Gibson
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 02/18] spapr: add node-id property to sPAPR core Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 03/18] pc: add node-id property to CPU Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 04/18] virt-arm: " Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 05/18] numa: add check that board supports cpu_index to node mapping Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 06/18] numa: mirror cpu to node mapping in MachineState::possible_cpus Igor Mammedov
2017-05-26 15:46   ` Eduardo Habkost
2017-05-29 13:12     ` Igor Mammedov
2017-05-29 13:36       ` Eduardo Habkost
2017-05-29 13:49         ` Igor Mammedov
2017-05-30 12:13     ` Igor Mammedov
2017-05-30 14:04       ` Eduardo Habkost
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 07/18] numa: do default mapping based on possible_cpus instead of node_cpu bitmaps Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 08/18] pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 09/18] spapr: " Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 10/18] virt-arm: " Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 11/18] QMP: include CpuInstanceProperties into query_cpus output output Igor Mammedov
2017-05-11  4:39   ` David Gibson
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 12/18] tests: numa: add case for QMP command query-cpus Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 13/18] numa: remove no longer need numa_post_machine_init() Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 14/18] machine: call machine init from wrapper Igor Mammedov
2017-05-10 11:29 ` [Qemu-devel] [PATCH v3 15/18] numa: use possible_cpus for not mapped CPUs check Igor Mammedov
2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 16/18] numa: remove node_cpu bitmaps as they are no longer used Igor Mammedov
2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 17/18] numa: add '-numa cpu, ...' option for property based node mapping Igor Mammedov
2017-05-11  5:23   ` David Gibson
2017-05-10 11:30 ` [Qemu-devel] [PATCH v3 18/18] tests: check -numa node, cpu=props_list usecase Igor Mammedov
2017-05-11 13:38 ` [Qemu-devel] [PATCH v3 00/18] numa: add '-numa cpu' option Eduardo Habkost

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.