Date: Wed, 18 Mar 2015 12:05:45 +0100
From: Igor Mammedov
To: Andreas Färber
Cc: Eduardo Habkost, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-2.3] numa: pc: fix default VCPU to node mapping
Message-ID: <20150318120545.245df0da@nial.brq.redhat.com>
In-Reply-To: <55085D84.7000701@suse.de>
References: <1426607318-22728-1-git-send-email-imammedo@redhat.com> <20150317164236.GM3513@thinpad.lan.raisama.net> <55085D84.7000701@suse.de>

On Tue, 17 Mar 2015 17:59:48 +0100
Andreas Färber wrote:

> On 17.03.2015 at 17:42, Eduardo Habkost wrote:
> > On Tue, Mar 17, 2015 at 03:48:38PM +0000, Igor Mammedov wrote:
> >> since commit
> >>    dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
> >> Linux kernel actually tries to use CPU to Node mapping from
> >> QEMU provided SRAT table instead of discarding it, and that
> >> in some cases breaks build_sched_domains() which expects
> >> sane mapping where cores/threads belonging to the same socket
> >> are on the same NUMA node.
> >>
> >> With current default round-robin mapping of VCPUs to nodes
> >> guest ends up with cores/threads belonging to the same socket
> >> being on different NUMA nodes.
> >>
> >> For example with following CLI:
> >>     qemu-kvm -m 4G -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 \
> >>              -numa node,nodeid=0 -numa node,nodeid=1
> >> 2.6.32 based kernels will hang on boot due to incorrectly built
> >> sched_group-s list in update_sd_lb_stats()
> >> so comment in QEMU justifying dumb default mapping:
> >> "
> >>  guest OSes must cope with this anyway, because there are BIOSes
> >>  out there in real machines which also use this scheme.
> >> "
> >> isn't really valid.
> >>
> >> Replacing default mapping with a manual one, where VCPUs belonging to
> >> the same socket are on the same NUMA node, fixes issue for
> >> guests which can't handle nonsense topology i.e. changing CLI to:
> >>     -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
> >>
> >> So instead of simply scattering VCPUs around nodes, map
> >> the same socket VCPUs to the same NUMA node, which is what
> >> guest would expect from a sane hardware/BIOS.
> >>
> >> Signed-off-by: Igor Mammedov
> >
> > I believe the proposed behavior is much better. But if we are going to
> > break compatibility, shouldn't we at least do that before the first -rc
> > so we get feedback in case it breaks existing configurations?
> >
> > About qemu_cpu_socket_id_from_index(): all qemu-system-* binaries have
> > smp_cores and smp_threads available (even if machines ignore it), but
> > the default stub can return values that are larger than the number of
> > sockets if smp_cores*smp_threads > 1, which would be obviously
> > incorrect. Isn't it easier to simply make
> > "cpu_index/(smp_cores*smp_sockets)" be the default cpu_index->socket
> > mapping function, and allow machine-specific (not arch-specific)
> > overrides if necessary?
>
> Agree that the proposed stub solution is not so nice. Can you propose a
> MachineClass based solution instead?
sure

>
> The example I keep bringing up for x86 is that the Galileo boards or
> even the Minnow boards don't really have sockets, being a SoC.
>
> Thanks,
> Andreas
>
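
For reference, the difference between the two default mappings discussed
above can be sketched as follows. This is an illustrative Python model only,
not QEMU code and not the patch itself: the function names are hypothetical,
and it assumes that VCPUs fill sockets in index order, so that the socket id
is cpu_index // (cores * threads), and that sockets are then spread over
nodes round-robin. The numbers are the ones from the CLI in the quoted
commit message (maxcpus=8, cores=4, threads=1, two NUMA nodes).

```python
def socket_of(cpu_index, cores, threads):
    """Socket holding a VCPU, assuming VCPUs fill sockets in index order."""
    return cpu_index // (cores * threads)


def round_robin_map(maxcpus, nb_nodes):
    """Per-VCPU round-robin default: VCPU i goes to node i % nb_nodes."""
    return [cpu % nb_nodes for cpu in range(maxcpus)]


def socket_aware_map(maxcpus, nb_nodes, cores, threads):
    """Keep all VCPUs of a socket on one node; spread sockets round-robin."""
    return [socket_of(cpu, cores, threads) % nb_nodes
            for cpu in range(maxcpus)]


def split_sockets(node_of_cpu, cores, threads):
    """Return the ids of sockets whose VCPUs land on more than one node."""
    nodes_per_socket = {}
    for cpu, node in enumerate(node_of_cpu):
        sock = socket_of(cpu, cores, threads)
        nodes_per_socket.setdefault(sock, set()).add(node)
    return sorted(s for s, nodes in nodes_per_socket.items()
                  if len(nodes) > 1)


if __name__ == "__main__":
    rr = round_robin_map(8, 2)
    sa = socket_aware_map(8, 2, cores=4, threads=1)
    print("round-robin :", rr, "split sockets:", split_sockets(rr, 4, 1))
    print("socket-aware:", sa, "split sockets:", split_sockets(sa, 4, 1))
```

Under these assumptions the per-VCPU round-robin scheme splits both sockets
across the two nodes, while the socket-based scheme gives the same layout as
the manual cpus=0-3 / cpus=4-7 CLI.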