From: Eduardo Habkost
Date: Tue, 17 Jun 2014 11:07:00 -0300
Message-ID: <20140617140700.GG3222@otherpad.lan.raisama.net>
In-Reply-To: <539FD767.2020905@ozlabs.ru>
References: <1402905233-26510-1-git-send-email-aik@ozlabs.ru> <539EA7DD.8040306@ozlabs.ru> <20140616205150.GD8629@otherpad.lan.raisama.net> <539FD767.2020905@ozlabs.ru>
Subject: Re: [Qemu-devel] [PATCH 0/7] spapr: rework memory nodes
To: Alexey Kardashevskiy
Cc: Nishanth Aravamudan, qemu-ppc@nongnu.org, qemu-devel@nongnu.org, Alexander Graf

On Tue, Jun 17, 2014 at 03:51:35PM +1000, Alexey Kardashevskiy wrote:
> On 06/17/2014 06:51 AM, Eduardo Habkost wrote:
> > On Mon, Jun 16, 2014 at 06:16:29PM +1000, Alexey Kardashevskiy wrote:
> >> On 06/16/2014 05:53 PM, Alexey Kardashevskiy wrote:
> >>> c4177479 "spapr: make sure RMA is in first mode of first memory node"
> >>> introduced a regression which prevents running guests with a memoryless
> >>> NUMA node#0, which may happen on real POWER8 boxes and which would make
> >>> sense to debug in QEMU.
> >>>
> >>> This patchset aims to fix that and also fix various code problems in
> >>> memory node generation.
> >>>
> >>> These 2 patches could be merged (the resulting patch looks rather ugly):
> >>>   spapr: Use DT memory node rendering helper for other nodes
> >>>   spapr: Move DT memory node rendering to a helper
> >>>
> >>> Please comment. Thanks!
> >>
> >> Sure enough, I forgot to add an example of what I am trying to run
> >> without errors and warnings:
> >>
> >> /home/aik/qemu-system-ppc64 \
> >>  -enable-kvm \
> >>  -machine pseries \
> >>  -nographic \
> >>  -vga none \
> >>  -drive id=id0,if=none,file=virtimg/fc20_24GB.qcow2,format=qcow2 \
> >>  -device scsi-disk,id=id1,drive=id0 \
> >>  -m 2080 \
> >>  -smp 8 \
> >>  -numa node,nodeid=0,cpus=0-7,mem=0 \
> >>  -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >>  -numa node,nodeid=4,cpus=4-7,mem=1040
> >
> > (Note: I will ignore the "cpus" argument for the discussion below.)
>
> The example is quite bad; I should not have used the same CPUs in two
> nodes. SPAPR allows this, but QEMU does not really support it, and I am
> not touching that now.
>
> > I understand now that the non-contiguous node IDs are guest-visible.
> >
> > But I still would like to understand the motivations for your use case,
> > to understand which solution makes more sense.
>
> One example is two CPUs on one die, where one of the CPUs is connected to
> the memory bus and the other is not; instead, it is connected to the
> first CPU (via a super fast bus), and the first CPU acts as a bridge.
>
> > If you really want 5 nodes, you just need to write this:
> >  -numa node,nodeid=0,cpus=0-7,mem=0 \
> >  -numa node,nodeid=1 \
> >  -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >  -numa node,nodeid=3 \
> >  -numa node,nodeid=4,cpus=4-7,mem=1040
> >
> > If you just want 3 nodes, you can just write this:
> >  -numa node,nodeid=0,cpus=0-7,mem=0 \
> >  -numa node,nodeid=1,cpus=0-3,mem=1040 \
> >  -numa node,nodeid=4,cpus=4-7,mem=1040
> >
> > But you seem to claim you need 3 nodes with non-contiguous IDs. In that
> > case, what exactly is the guest-visible difference you expect to get
> > between:
> >  -numa node,nodeid=0,cpus=0-7,mem=0 \
> >  -numa node,nodeid=1 \
> >  -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >  -numa node,nodeid=3 \
> >  -numa node,nodeid=4,cpus=4-7,mem=1040
> > and
> >  -numa node,nodeid=0,cpus=0-7,mem=0 \
> >  -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >  -numa node,nodeid=4,cpus=4-7,mem=1040
> > ?
> >
> > Because your patch makes both exactly the same, and I guess you don't
> > want that (otherwise you could simply use the 5-node command line above
> > and we wouldn't need patch 7/7).
>
> If that is the canonical and kosher way of using NUMA in QEMU, OK, we can
> use it. I just fail to see why we need a requirement for node IDs to be
> consecutive here. And it confuses me as a user a bit that I can add
> "-numa node,nodeid=22" (no memory, no CPUs) but do not get to see it in
> the guest.

I agree with you that it is confusing. But before we support that use
case, we need to make sure auto-allocation is handled properly, because
it would be hard to fix it later without breaking compatibility.

We probably just need a "present" field on struct NodeInfo, so
machine-specific code and auto-allocation code can differentiate nodes
that are not present on the command line from empty nodes that were
specified on the command line.

In the meantime, people can use the 5-node example above as a workaround.

> btw, how is it supposed to work with memory hotplug? The current "-numa"
> does not support gaps in memory, and I would expect that we will need
> them. Any plans here?

The DIMM device used for memory hotplug has a "node" property for the
NUMA node ID.

-- 
Eduardo
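
[For illustration, a minimal sketch of what such a hotplug could look
like from the QEMU monitor, assuming the memory-backend-ram object and
pc-dimm device from the memory hotplug series that was in flight at the
time; the exact names and options here are an assumption, not something
this thread confirms:

  (qemu) object_add memory-backend-ram,id=mem1,size=1G
  (qemu) device_add pc-dimm,id=dimm1,memdev=mem1,node=2

The "node" property on the DIMM device is what assigns the hotplugged
memory to a specific guest NUMA node.]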