From mboxrd@z Thu Jan 1 00:00:00 1970
From: Luiz Capitulino
Subject: Re: [RFC] kvm: x86: export vCPU halted state to sysfs
Date: Mon, 5 Feb 2018 11:36:26 -0500
Message-ID: <20180205113626.6c392936@redhat.com>
References: <86571633-ae6d-5678-7611-549ff41dccd8@linux.vnet.ibm.com>
 <20180202155415.GN15403@redhat.com>
 <20180202110137.2e2c1816@redhat.com>
 <597d7701-ea7b-524e-7632-10073284d060@linux.vnet.ibm.com>
 <20180202174249.GA22556@localhost.localdomain>
 <20180202135033.3ecfdd7b@redhat.com>
 <20180202200912.GP26425@localhost.localdomain>
 <20180202151945.52847f8e@redhat.com>
 <20180202204144.GQ26425@localhost.localdomain>
 <14fc370a-0d33-6790-6f14-8e224311e676@linux.vnet.ibm.com>
 <20180205134727.GH25338@redhat.com>
 <20180205103701.59b69703@redhat.com>
 <8b691f79-1ead-8528-e00f-55624056fd7e@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Cc: "Daniel P. Berrangé", Eduardo Habkost, Radim Krčmář,
 kvm@vger.kernel.org, pbonzini@redhat.com, Peter Krempa, John Ferlan,
 libvir-list@redhat.com, Christian Borntraeger
To: Viktor Mihajlovski
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:53954 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752925AbeBEQgb (ORCPT ); Mon, 5 Feb 2018 11:36:31 -0500
In-Reply-To: <8b691f79-1ead-8528-e00f-55624056fd7e@linux.vnet.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Mon, 5 Feb 2018 17:10:18 +0100
Viktor Mihajlovski wrote:

> On 05.02.2018 16:37, Luiz Capitulino wrote:
> > On Mon, 5 Feb 2018 13:47:27 +0000
> > Daniel P. Berrangé wrote:
> >
> >> On Mon, Feb 05, 2018 at 02:43:15PM +0100, Viktor Mihajlovski wrote:
> >>> On 02.02.2018 21:41, Eduardo Habkost wrote:
> >>>> On Fri, Feb 02, 2018 at 03:19:45PM -0500, Luiz Capitulino wrote:
> >>>>> On Fri, 2 Feb 2018 18:09:12 -0200
> >>>>> Eduardo Habkost wrote:
> >>>> [...]
> >>>>>> Your plan above covers what will happen when using newer QEMU
> >>>>>> versions, but libvirt still needs to work sanely if running QEMU
> >>>>>> 2.11. My suggestion is that libvirt do not run query-cpus to ask
> >>>>>> for the "halted" field on any architecture except s390.
> >>>>>
> >>>>> My current plan is to ask libvirt to completely remove query-cpus
> >>>>> usage, independent of the arch and use the new command instead.
> >>>>
> >>>> This would be a regression for people running QEMU 2.11 on s390.
> >>>>
> >>>> (But maybe it would be an acceptable regression? Viktor, what do
> >>>> you think? Are there production releases of management systems
> >>>> that already rely on vcpu..halted?)
> >>>>
> >>> Unfortunately, there's code out there looking at vcpu..halted. I've
> >>> informed the product team about the issue.
> >>>
> >>> If we drop/deprecate vcpu..halted from the domain statistics, this
> >>> should be done for all arches, if there's a replacement mechanism (i.e.
> >>> new VCPU states). As a stop-gap measure we can make the call
> >>> arch-dependent until the new stuff is in place.
> >>
> >> Yes, I think libvirt should just restrict this 'halted' feature reporting
> >> to s390 only, since the other archs have different semantics for this
> >> item, and the s390 semantics are the ones we want.
> >
> > From this whole discussion, there's only one thing that I still don't
> > understand (in a very honest way): what makes s390 halted semantics
> > different?
> One problem is that using the halted property to indicate that the CPU
> has assumed the architected disabled wait state may not have been the
> wisest decision (my fault). If the CPU enters disabled wait, it will
> stay inactive until it is explicitly restarted which is different on x86.

Ah, OK. So, s390 does indeed have different semantics.

> > By quickly looking at the code, it seems to be very like the x86 one
> > when in kernel irqchip is not used: if a guest vCPU executes HLT, the
> > vCPU exits to userspace and qemu will put the vCPU thread to sleep.
> > This is the semantics I'd expect for HLT, and maybe for all archs.
> >
> > What makes x86 different, is when the in kernel irqchip is used (which
> > should be the default with libvirt). In this case, the vCPU thread avoids
> > exiting to user-space. So, qemu doesn't know the vCPU halted.
> >
> > That's only one of the reasons why query-cpus forces vCPUs to user-space.
> > But there are other reasons, and that's why even on s390 query-cpus
> > will also force vCPUs to user-space, which means s390 has the same perf
> > issue but maybe this hasn't been detected yet.
> >
> > For the immediate term, I still think we should have a query-cpus
> > replacement that doesn't cause vCPUs to go to userspace. I'll work on
> > this this week.
> FWIW: I'm currently exploring an extension to query-cpus to report
> s390-specific information, allowing to ditch halted in the long run.
> Further, I'm considering a new QAPI event along the lines of "CPU info
> has changed" allowing QEMU to announce low-frequency changes of CPU
> state (as is the case for s390) and finally wire up a handler in libvirt
> to update a tbd. property (!= halted).

I very much prefer adding a replacement for query-cpus, which works for
all archs and which doesn't have any performance impact.

> >
> > However, IMHO, what we really want is to add an API to the guest agent
> > to export the CPU online bit from the guest userspace sysfs. This will
> > give the ultimate semantics and move us away from this halted mess.
> >
>
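
For illustration only, the guest-side lookup behind "export the CPU online
bit from the guest userspace sysfs" could look roughly like the sketch
below. It is not the guest agent API and not code from this thread: the
function name read_cpu_online and its cpu_dir parameter are made up for the
example. It assumes a Linux guest that exposes per-CPU "online" files under
/sys/devices/system/cpu, and it treats a CPU directory without an "online"
file as always online, which is an assumption about CPUs that cannot be
offlined.

```python
import os
import re

# Standard Linux sysfs location for per-CPU directories (cpu0, cpu1, ...).
SYSFS_CPU_DIR = "/sys/devices/system/cpu"


def read_cpu_online(cpu_dir=SYSFS_CPU_DIR):
    """Return {cpu_index: online?} as seen by the guest kernel.

    Hypothetical helper for illustration; cpu_dir is a parameter only so
    the function can be pointed at a fake directory tree in tests.
    """
    states = {}
    for entry in os.listdir(cpu_dir):
        # Only cpuN directories; skip cpufreq, cpuidle, etc.
        match = re.fullmatch(r"cpu(\d+)", entry)
        if not match:
            continue
        index = int(match.group(1))
        online_path = os.path.join(cpu_dir, entry, "online")
        try:
            with open(online_path) as f:
                # The file holds "1" for online and "0" for offline.
                states[index] = f.read().strip() == "1"
        except FileNotFoundError:
            # Assumption: no "online" file means the CPU cannot be
            # offlined, so report it as online.
            states[index] = True
    return states
```

Calling read_cpu_online() inside the guest returns a dict keyed by CPU
index. Because the value comes from the guest kernel, getting it does not
involve query-cpus on the host side.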