From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Eduardo Habkost <ehabkost@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH 0/7] spapr: rework memory nodes
Date: Mon, 16 Jun 2014 17:25:00 -0700
Message-ID: <20140617002500.GL16644@linux.vnet.ibm.com>
In-Reply-To: <20140616205150.GD8629@otherpad.lan.raisama.net>

On 16.06.2014 [17:51:50 -0300], Eduardo Habkost wrote:
> On Mon, Jun 16, 2014 at 06:16:29PM +1000, Alexey Kardashevskiy wrote:
> > On 06/16/2014 05:53 PM, Alexey Kardashevskiy wrote:
> > > c4177479 "spapr: make sure RMA is in first mode of first memory node"
> > > introduced regression which prevents from running guests with memoryless
> > > NUMA node#0 which may happen on real POWER8 boxes and which would make
> > > sense to debug in QEMU.
> > > 
> > > This patchset aim is to fix that and also fix various code problems in
> > > memory nodes generation.
> > > 
> > > These 2 patches could be merged (the resulting patch looks rather ugly):
> > > spapr: Use DT memory node rendering helper for other nodes
> > > spapr: Move DT memory node rendering to a helper
> > > 
> > > Please comment. Thanks!
> > > 
> > 
> > Sure I forgot to add an example of what I am trying to run without errors
> > and warnings:
> > 
> > /home/aik/qemu-system-ppc64 \
> > -enable-kvm \
> > -machine pseries \
> > -nographic \
> > -vga none \
> > -drive id=id0,if=none,file=virtimg/fc20_24GB.qcow2,format=qcow2 \
> > -device scsi-disk,id=id1,drive=id0 \
> > -m 2080 \
> > -smp 8 \
> > -numa node,nodeid=0,cpus=0-7,memory=0 \
> > -numa node,nodeid=2,cpus=0-3,mem=1040 \
> > -numa node,nodeid=4,cpus=4-7,mem=1040
> 
> (Note: I will ignore the "cpus" argument for the discussion below.)
> 
> I understand now that the non-contiguous node IDs are guest-visible.
> 
> But I still would like to understand the motivations for your use case,
> to understand which solution makes more sense.
> 
> If you really want 5 nodes, you just need to write this:
>   -numa node,nodeid=0,cpus=0-7,memory=0 \
>   -numa node,nodeid=1 \
>   -numa node,nodeid=2,cpus=0-3,mem=1040 \
>   -numa node,nodeid=3 \
>   -numa node,nodeid=4,cpus=4-7,mem=1040
> 
> If you just want 3 nodes, you can just write this:
>   -numa node,nodeid=0,cpus=0-7,memory=0 \
>   -numa node,nodeid=1,cpus=0-3,mem=1040 \
>   -numa node,nodeid=4,cpus=4-7,mem=1040

No, this doesn't do what you think it would :)

nb_numa_nodes = 3

but node_mem[0] = 0
node_mem[1] = 1040
node_mem[2] = 0
node_mem[3] = 0
node_mem[4] = 1040

Because of the generic parsing of the numa options.
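
To make that concrete, here is a minimal Python sketch of the behaviour
described above. It is a simplified model under stated assumptions, not
QEMU's actual parsing code: parse_numa(), MAX_NODES and the dict-based
options are hypothetical names used only for illustration. The model
assumes the node count is bumped once per -numa option while the memory
array is indexed by nodeid.

```python
# Simplified model (assumption, not QEMU source): one counter increment
# per -numa option, memory stored at the index given by nodeid.
MAX_NODES = 8  # hypothetical small upper bound for the sketch


def parse_numa(options):
    node_mem = [0] * MAX_NODES
    nb_numa_nodes = 0
    for opt in options:
        # Memory lands at the requested nodeid, gaps stay at 0.
        node_mem[opt["nodeid"]] = opt.get("mem", 0)
        # The count only reflects how many options were given.
        nb_numa_nodes += 1
    return nb_numa_nodes, node_mem


# The "3 nodes" command line from the quoted text: nodeids 0, 1 and 4.
opts = [
    {"nodeid": 0, "mem": 0},
    {"nodeid": 1, "mem": 1040},
    {"nodeid": 4, "mem": 1040},
]

nb, mem = parse_numa(opts)
print(nb, mem[:5])  # 3 [0, 1040, 0, 0, 1040]

# Any code that walks nodes 0..nb-1 never reaches index 4.
visible = mem[:nb]
print(visible)  # [0, 1040, 0]
```

In this model, anything that iterates from 0 to nb_numa_nodes - 1 only
sees the first three slots, so the 1040 MB assigned to nodeid=4 falls
outside the range it looks at.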

I'd need to look at my test case again (and this is reproducible on
x86), but I believe it's actually worse if you skip node 0 altogether,
e.g.:

   -numa node,nodeid=1,cpus=0-7,memory=0 \
   -numa node,nodeid=2,cpus=0-3,mem=1040 \
   -numa node,nodeid=4,cpus=4-7,mem=1040

Node 0 will have node 4's memory (because we put the rest there, iirc)
and the cpus that should be on node 4 are on node 0 as well.

I'll try to get the exact test results later.

In any case, it's confusing when the topology you see in Linux differs
from what the command-line says.

> But you seem to claim you need 3 nodes with non-contiguous IDs. In that
> case, which exactly is the guest-visible difference you expect to get
> between:
>   -numa node,nodeid=0,cpus=0-7,memory=0 \
>   -numa node,nodeid=1 \
>   -numa node,nodeid=2,cpus=0-3,mem=1040 \
>   -numa node,nodeid=3 \
>   -numa node,nodeid=4,cpus=4-7,mem=1040

I guess here you'd see 5 NUMA nodes in Linux, with 0, 1 and 3 having no
memory.

> and
>   -numa node,nodeid=0,cpus=0-7,memory=0 \
>   -numa node,nodeid=2,cpus=0-3,mem=1040 \
>   -numa node,nodeid=4,cpus=4-7,mem=1040
> ?

And here you'd see 3 NUMA nodes in Linux, with 0 having no memory. I
would think the principle of least surprise means qemu doesn't change
the topology from the user-requested one without any indication that's
happening?

Thanks,
Nish
