From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alexander Graf <agraf@suse.de>
Subject: Re: Semantics of "-cpu host" (was Re: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest)
Date: Wed, 9 May 2012 00:07:04 +0200
Message-ID: <6BF7428F-FDEF-4497-94F5-7A43BC9E1E67@suse.de>
References: <4F917E75.2080003@siemens.com> <20120420153656.GX3169@otherpad.lan.raisama.net> <4F926086.3020307@web.de> <20120423144818.GA3169@otherpad.lan.raisama.net> <4F9583DD.10807@siemens.com> <20120423200214.GG3169@otherpad.lan.raisama.net> <4F96CF9F.9060302@siemens.com> <20120424171925.GT3169@otherpad.lan.raisama.net> <20120507182142.GD16951@otherpad.lan.raisama.net> <B4321C29-D9D7-463C-8B0E-6A58649646DC@suse.de> <20120508201441.GN4373@otherpad.lan.raisama.net>
Mime-Version: 1.0 (Apple Message framework v1257)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8BIT
Cc: Andre Przywara <andre.przywara@amd.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Avi Kivity <avi@redhat.com>
To: Eduardo Habkost <ehabkost@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from cantor2.suse.de ([195.135.220.15]:51223 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756849Ab2EHWHI convert rfc822-to-8bit (ORCPT
	<rfc822;kvm@vger.kernel.org>); Tue, 8 May 2012 18:07:08 -0400
In-Reply-To: <20120508201441.GN4373@otherpad.lan.raisama.net>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>


On 08.05.2012, at 22:14, Eduardo Habkost wrote:

> On Tue, May 08, 2012 at 02:58:11AM +0200, Alexander Graf wrote:
>> On 07.05.2012, at 20:21, Eduardo Habkost wrote:
>> 
>>> 
>>> Andre? Are you able to help to answer the question below?
>>> 
>>> I would like to clarify what the expected behavior of "-cpu host" is, to
>>> be able to continue working on it. I believe the code will need to be
>>> fixed in either case, but first we need to figure out what the
>>> expectations/requirements are, to know _which_ changes will be needed.
>>> 
>>> 
>>> On Tue, Apr 24, 2012 at 02:19:25PM -0300, Eduardo Habkost wrote:
>>>> (CCing Andre Przywara, in case he can help to clarify what's the
>>>> expected meaning of "-cpu host")
>>>> 
>>> [...]
>>>> I am not sure I understand what you are proposing. Let me explain the
>>>> use case I am thinking about:
>>>> 
>>>> - Feature FOO is of type (A) (e.g. just a new instruction set that
>>>> doesn't require additional userspace support)
>>>> - User has a Qemu version that doesn't know anything about feature FOO
>>>> - User gets a new CPU that supports feature FOO
>>>> - User gets a new kernel that supports feature FOO (i.e. has FOO in
>>>> GET_SUPPORTED_CPUID)
>>>> - User does _not_ upgrade Qemu.
>>>> - User expects to get feature FOO enabled if using "-cpu host", without
>>>> upgrading Qemu.
>>>> 
>>>> The problem here is: to support the above use-case, userspace needs a
>>>> probing mechanism that can differentiate _new_ (previously unknown)
>>>> features that are in group (A) (safe to blindly enable) from features
>>>> that are in group (B) (that can't be enabled without a userspace
>>>> upgrade).
>>>> 
>>>> In short, it becomes a problem if we consider the following case:
>>>> 
>>>> - Feature BAR is of type (B) (it can't be enabled without extra
>>>> userspace support)
>>>> - User has a Qemu version that doesn't know anything about feature BAR
>>>> - User gets a new CPU that supports feature BAR
>>>> - User gets a new kernel that supports feature BAR (i.e. has BAR in
>>>> GET_SUPPORTED_CPUID)
>>>> - User does _not_ upgrade Qemu.
>>>> - User simply shouldn't get feature BAR enabled, even if using "-cpu
>>>> host", otherwise Qemu would break.
>>>> 
>>>> If userspace always limited itself to features it knows about, it would
>>>> be really easy to implement the feature without any new probing
>>>> mechanism from the kernel. But that's not how I think users expect "-cpu
>>>> host" to work. Maybe I am wrong, I don't know. I am CCing Andre, who
>>>> introduced the "-cpu host" feature, in case he can explain what the
>>>> expected semantics are in the cases above.
>> 
>> Can you think of any feature that'd go into category B?
> 
> - TSC-deadline: can't be enabled unless userspace takes care to enable
>  the in-kernel irqchip.

The kernel can check whether the in-kernel irqchip is enabled and mask the bit out otherwise, no?

> - x2apic: ditto.

Same here. For user space irqchip the kernel side doesn't care. If in-kernel APIC is enabled, check for its capabilities.
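
A minimal sketch of that masking idea, written as a plain Python model rather than actual KVM code. The function name, the string feature names, and the NEEDS_KERNEL_IRQCHIP set are assumptions made for illustration only; the real ioctl works on CPUID leaf entries, not on strings.

```python
# Hypothetical model, not the KVM implementation: features are plain strings.
# Features that are only usable when the in-kernel irqchip (APIC) is active.
NEEDS_KERNEL_IRQCHIP = frozenset({"tsc-deadline", "x2apic"})


def supported_cpuid(host_features, kernel_irqchip_enabled):
    """Return the feature set a GET_SUPPORTED_CPUID-like call could report.

    Features that depend on the in-kernel irqchip are masked out when
    userspace has not enabled it, so an old userspace never sees them.
    """
    features = frozenset(host_features)
    if kernel_irqchip_enabled:
        return features
    return features - NEEDS_KERNEL_IRQCHIP


host = {"sse4.2", "tsc-deadline", "x2apic"}
print(sorted(supported_cpuid(host, kernel_irqchip_enabled=False)))
print(sorted(supported_cpuid(host, kernel_irqchip_enabled=True)))
```

With the userspace irqchip, only "sse4.2" survives in this model; with the in-kernel irqchip, all three features are reported.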

> 
> I am not sure about XSAVE: an old qemu version would call kvm_put_fpu()
> instead of kvm_put_xsave() on kvm_arch_put_registers(), but I don't know
> if this would have unexpected side-effects or not.

Then XSAVE awareness should be manually enabled by user space. That's what we have ENABLE_CAP for :). Do ENABLE_CAP(XSAVE) -> get XSAVE as a bit in GET_SUPPORTED_CPUID.
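
As a rough illustration of that opt-in flow, here is a toy Python model. It is not the kernel interface: the ModelVM class, the "CAP_XSAVE" name, and the GATED_FEATURES table are all hypothetical. It only shows the ordering being proposed: a gated bit stays hidden from the supported-CPUID result until userspace has explicitly enabled the matching capability.

```python
class ModelVM:
    """Toy stand-in for a VM handle; every name here is hypothetical."""

    # Features that need explicit userspace awareness, mapped to the
    # capability that userspace must enable before the bit is exposed.
    GATED_FEATURES = {"xsave": "CAP_XSAVE"}

    def __init__(self, host_features):
        self.host_features = frozenset(host_features)
        self.enabled_caps = set()

    def enable_cap(self, cap):
        """Model of an ENABLE_CAP-style call: userspace declares awareness."""
        if cap not in self.GATED_FEATURES.values():
            raise ValueError(f"unknown capability: {cap}")
        self.enabled_caps.add(cap)

    def get_supported_cpuid(self):
        """Report ungated features plus gated ones whose cap was enabled."""
        return frozenset(
            f for f in self.host_features
            if f not in self.GATED_FEATURES
            or self.GATED_FEATURES[f] in self.enabled_caps
        )


vm = ModelVM({"sse4.2", "xsave"})
print(sorted(vm.get_supported_cpuid()))  # old userspace: xsave stays hidden
vm.enable_cap("CAP_XSAVE")
print(sorted(vm.get_supported_cpuid()))  # aware userspace: xsave is exposed
```

In this model an old userspace that never calls enable_cap() can pass the result through blindly, which is the property "-cpu host" would rely on.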

> 
> I wouldn't be surprised if we find many other cases, as even the
> GET_SUPPORTED_CPUID documentation is explicit about that: "Userspace can
> use the information returned by this ioctl [GET_SUPPORTED_CPUID] to
> construct cpuid information (for KVM_SET_CPUID2) that is consistent with
> hardware, kernel, and userspace capabilities, [...]"
>                      ^^^^^^^^^^^^^^^^^^^^^^

Yeah, so the intent is that kvm is aware of all the bits user space would care about.

> 
>> 
>> All features I'm aware of work fine (without migration, but that one
>> is moot for -cpu host anyway) as long as the host kvm implementation
>> is fine with it (GET_SUPPORTED_CPUID). So they'd be category A.
> 
> So, you would argue that GET_SUPPORTED_CPUID should include only
> features of type A? That's the opposite of what we have today.
> 
> Maybe we could change the semantics to type-A-only if we define "type A"
> as:
> 
> - Don't require any extra userspace support except for:
>  - Migration support.
>  - Enabling the in-kernel irqchip.
> 
> If we agree on those semantics, "-cpu host" could safely enable all the
> features returned by GET_SUPPORTED_CPUID blindly, after making sure that
> migration support is disabled and the in-kernel irqchip is enabled. Then
> all type-B features will require defining a KVM_CAP_* capability instead

Not instead; in addition. Define a KVM_CAP_* and do an ENABLE_CAP on it to have the feature exposed.

> of using GET_SUPPORTED_CPUID. It's the opposite of the direction I was
> proposing earlier in this thread, but it is starting to look like a
> better idea (otherwise "-cpu host" would never be reliable).
> 
> If we agree on those semantics, it won't require any code change to the
> current code, just a documentation update.

Life is simple, eh? :)


Alex

