On 2012-04-20 17:36, Eduardo Habkost wrote: > On Fri, Apr 20, 2012 at 05:19:17PM +0200, Jan Kiszka wrote: >> On 2012-04-20 17:00, Eduardo Habkost wrote: >>> On Fri, Apr 20, 2012 at 12:12:38PM +0200, Jan Kiszka wrote: >>>> On 2012-04-19 22:03, Eduardo Habkost wrote: >>>>> Jan/Avi: ping? >>>>> >>>>> I would like to get this ABI detail clarified so it can be implemented >>>>> the right way on Qemu and KVM. >>>>> >>>>> My proposal is to simply add tsc-deadline to the data returned by >>>>> GET_SUPPORTED_CPUID, making KVM_CAP_TSC_DEADLINE_TIMER unnecessary. >>>>> >>>> >>>> IIRC, there were problems with this model to exclude that the feature >>>> gets reported this way without ensuring that in-kernel irqchip is >>>> actually activated. Please browse the discussions, it should be >>>> documented there. >>> >>> The flag was never added to GET_SUPPORTED_CPUID on any of the git >>> repositories I have checked, and I don't see that being done anywhere on >>> my KVM mailing list archives, either. So I don't see how we could have >>> had issues with GET_SUPPORTED_CPUID, if it was never present on the >>> code. >>> >>> What was present on the code before the fix, was coded that >>> unconditinally enabled the flag even if Qemu never asked for it, and >>> that really was wrong. >>> >>> GET_SUPPORTED_CPUID doesn't enable any flag: it just tells userspace >>> that the hardware and KVM supports the feature (and, surprise, this is >>> exactly what KVM_CAP_TSC_DEADLINE_TIMER means too, isn't it?). It's up >>> to userspace to enable the CPUID bits according to user requirements and >>> userspace capabilities (in other words: only when userspace knows it is >>> safe because the in-kernel irqchip is enabled). >>> >>> If the above is not what GET_SUPPORTED_CPUID means, I would like to get >>> that clarified, so I can submit an updated to KVM API documentation. >> >> Does old userspace filter out flags from GET_SUPPORTED_CPUID that it >> does not understand? > > It's even more strict than that: it only enables what was explicitly > enabled on the CPU model asked by the user. > > But: > > The only exception is "-cpu host", that tries to enable everything, even > flags Qemu never heard of, and that is something that may require some > changes on the API design (or at least documentation clarifications). > > Today "-cpu host" can't differentiate (A) "a feature that KVM supports > and emulate by itself and can be enabled without any support from > userspace" and (B) "a feature that KVM supports but need support from > userspace to be enabled". I am sure we will be able to find multiple > examples of (B) inside the flags returned by GET_SUPPORTED_CPUID today. The problem remains that exposing TSC_DEADLINE via SUPPORTED_CPUID is wrong in case userspace doesn't select the in-kernel APIC. The kernel does _nothing_ about emulating the flag if userspace provides the APIC, so it must not expose this feature. Even if this had no practical impact (but it has: old qemu[-kvm] + userspace APIC + -cpu host), it would still be semantically broken. > > We could decide to never add any new flag to GET_SUPPORTED_CPUID if it > requires additional userspace support to work, from now on, and create > new KVM_CAP_* flags for them. But, we really want to do that for almost > every new CPUID feature bit in the future? Most CPU features do not depend on our in-kernel irqchips. But if there is something to simplify in retrieving information about such "conditional features", let's do it. > > We also have the problem of defining what "requires support from > userspace to work" means: I am not sure this has the same meaning for > everybody. Many new features require userspace support only for > migration, and nothing else, but some users don't need migration to > work. Do those features qualify as (A), or as (B)? "Works for most user" also means "breaks for some". And that is a bug, isn't it? Jan