* [Qemu-devel] [RFC] spapr: Fix default NUMA node allocation for threads
From: David Gibson @ 2015-09-01  3:15 UTC
  To: aik, benh, ehabkost, agraf; +Cc: qemu-ppc, qemu-devel, David Gibson

At present, if guest NUMA nodes are requested, but the cpus in each node
are not specified, spapr just uses the default behaviour of assigning each
vcpu round-robin to nodes.

If smp_threads != 1, that will assign adjacent threads in a core to
different NUMA nodes.  As well as being just weird, that's a configuration
that can't be represented in the device tree we give to the guest, which
means the guest and qemu end up with different ideas of the NUMA topology.

This patch implements mc->cpu_index_to_socket_id in the spapr code to
make sure vcpus get assigned to nodes only at the socket granularity.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

The default NUMA allocation is pretty broken for any normal system,
but this at least fixes it for one more case.  This is already in my
spapr-next tree, but if I can get a Reviewed-by or two, it will be
ready for merge to mainline.
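
To illustrate the difference described in the commit message, here is a
minimal standalone sketch, not QEMU code. The struct and function names are
hypothetical, and it assumes the generic NUMA code maps the returned socket
id onto nodes with a simple modulo.

```c
#include <assert.h>

/* Hypothetical stand-in for QEMU's smp_threads and smp_cores values. */
struct topo {
    unsigned threads; /* threads per core */
    unsigned cores;   /* cores per socket */
};

/* Default behaviour: each vcpu goes round-robin to the next node. */
static unsigned node_round_robin(unsigned cpu_index, unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return cpu_index % nb_nodes;
}

/*
 * Socket granularity: same division as spapr_cpu_index_to_socket_id()
 * in the patch. ASSUMPTION: sockets are then spread over nodes modulo
 * nb_nodes.
 */
static unsigned node_by_socket(unsigned cpu_index, struct topo t,
                               unsigned nb_nodes)
{
    assert(nb_nodes > 0 && t.threads > 0 && t.cores > 0);
    unsigned socket_id = cpu_index / t.threads / t.cores;
    return socket_id % nb_nodes;
}
```

With threads=2, cores=4 and two nodes, the round-robin version puts cpu 0 on
node 0 and cpu 1 on node 1 even though they are threads of the same core,
while the socket-based version keeps cpus 0-7 together on node 0.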


diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bf0c64f..8c2b103 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1820,6 +1820,13 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
     }
 }
 
+static unsigned spapr_cpu_index_to_socket_id(unsigned cpu_index)
+{
+    /* Allocate to NUMA nodes on a "socket" basis (not that concept of
+     * socket means much for the paravirtualized PAPR platform) */
+    return cpu_index / smp_threads / smp_cores;
+}
+
 static void spapr_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -1836,6 +1843,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->kvm_type = spapr_kvm_type;
     mc->has_dynamic_sysbus = true;
     mc->pci_allow_0_address = true;
+    mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
 
     fwc->get_dev_path = spapr_get_fw_dev_path;
     nc->nmi_monitor_handler = spapr_nmi;
-- 
2.4.3


* Re: [Qemu-devel] [RFC] spapr: Fix default NUMA node allocation for threads
From: Alexey Kardashevskiy @ 2015-09-02  8:25 UTC
  To: David Gibson, benh, ehabkost, agraf; +Cc: qemu-ppc, qemu-devel

On 09/01/2015 01:15 PM, David Gibson wrote:
> At present, if guest numa nodes are requested, but the cpus in each node
> are not specified, spapr just uses the default behaviour of assigning each
> vcpu round-robin to nodes.
>
> If smp_threads != 1, that will assign adjacent threads in a core to
> different NUMA nodes.  As well as being just weird, that's a configuration
> that can't be represented in the device tree we give to the guest, which
> means the guest and qemu end up with different ideas of the NUMA topology.
>
> This patch implements mc->cpu_index_to_socket_id in the spapr code to
> make sure vcpus get assigned to nodes only at the socket granularity.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>   hw/ppc/spapr.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
>
> The default NUMA allocation is pretty broken for any normal system,
> but this at least fixes it for one more case.  This is already in my
> spapr-next tree, but if I can get a Reviewed-by or two, it will be
> ready for merge to mainline.
>
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index bf0c64f..8c2b103 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1820,6 +1820,13 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>       }
>   }
>
> +static unsigned spapr_cpu_index_to_socket_id(unsigned cpu_index)
> +{
> +    /* Allocate to NUMA nodes on a "socket" basis (not that concept of
> +     * socket means much for the paravirtualized PAPR platform) */
> +    return cpu_index / smp_threads / smp_cores;

This bothers me, as "ibm,chip-id" is calculated differently in
spapr_populate_cpu_dt() and your scheme gives different socket numbers for
weird cases like -smp 16,sockets=3,cores=4,threads=2.
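
A small sketch of the mismatch for that command line follows. It is not
taken from spapr.c: the function names are made up, and the chip id formula
(cpu_index divided by smp_cpus / sockets) is an assumption about how
spapr_populate_cpu_dt() derives "ibm,chip-id", so treat it as illustrative
only.

```c
#include <assert.h>

/* Socket id as computed by the patch under review. */
static unsigned socket_id_patch(unsigned cpu_index, unsigned threads,
                                unsigned cores)
{
    assert(threads > 0 && cores > 0);
    return cpu_index / threads / cores;
}

/*
 * ASSUMPTION: chip id is cpu_index / cpus_per_socket, where
 * cpus_per_socket = smp_cpus / sockets using integer division.
 */
static unsigned chip_id_assumed(unsigned cpu_index, unsigned smp_cpus,
                                unsigned sockets)
{
    assert(sockets > 0 && smp_cpus >= sockets);
    return cpu_index / (smp_cpus / sockets);
}
```

Under that assumption, with 16 cpus, 3 sockets, 4 cores and 2 threads, the
patch groups cpus in blocks of 8 while the chip id groups them in blocks of
5, so cpu 5 would land in socket 0 but on chip 1.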

In general, I do not really understand why there is a "sockets" parameter in
QEMU at all...

> +}
> +
>   static void spapr_machine_class_init(ObjectClass *oc, void *data)
>   {
>       MachineClass *mc = MACHINE_CLASS(oc);
> @@ -1836,6 +1843,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>       mc->kvm_type = spapr_kvm_type;
>       mc->has_dynamic_sysbus = true;
>       mc->pci_allow_0_address = true;
> +    mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
>
>       fwc->get_dev_path = spapr_get_fw_dev_path;
>       nc->nmi_monitor_handler = spapr_nmi;
>


-- 
Alexey
