From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
dmatlack@google.com, luto@kernel.org, peterhornyack@google.com,
x86@kernel.org
Subject: Re: [PATCH 2/2] x86, kvm: use kvmclock to compute TSC deadline value
Date: Fri, 16 Sep 2016 17:24:44 +0200
Message-ID: <20160916152443.GG17296@potion>
In-Reply-To: <cb0f91d4-7e18-afc2-a899-68357cbc0d49@redhat.com>
2016-09-16 17:06+0200, Paolo Bonzini:
> On 16/09/2016 16:59, Radim Krčmář wrote:
>> KVM_MSR_DEADLINE would be an interface in kvmclock nanosecond values and
>> MSR_IA32_TSCDEADLINE in TSC values. KVM_MSR_DEADLINE would follow
>> similar rules to MSR_IA32_TSCDEADLINE -- the interrupt fires when
>> kvmclock reaches the value, you read what you write, and 0 disarms it.
>>
>> If the TSC deadline timer was enabled, then the guest could write to
>> both MSR_IA32_TSCDEADLINE and KVM_MSR_DEADLINE, but only one could be
>> armed at any time (non-zero write to one will set the other to 0).
>>
>> The dual interface would allow unconditional addition of the PV feature
>> without regressing users that currently use MSR_IA32_TSCDEADLINE and
>> adapted their stack to handle KVM's TSC shortcomings ...
>
> So far so good. My question is: what happens if you write to
> KVM_MSR_DEADLINE and read from MSR_IA32_TSCDEADLINE, or vice versa?
(The second paragraph covered it ;])
> The possibilities are:
>
> a) you read a 0
This one.
> b) you read the value converted to the other unit
Too much hassle. :)
> c) you read another value such as -1
Having a common "disarmed" value is nicer, and MSR_IA32_TSCDEADLINE
already uses 0.
> (a) and (c) are the simplest of course. (c) may make sense when writing
> to MSR_IA32_TSCDEADLINE and reading from KVM_MSR_DEADLINE, since we can
> decide which values are valid or not; -1 is technically a valid TSC
> deadline.
>
> I'm not sure about whether to allow (b). In the end KVM is going to
> convert a nsec deadline to a TSC value internally, and vice versa.
It is not necessary to convert the nsec deadline to guest-TSC, only to
host-TSC in case the VMX_PREEMPTION_TIMER is used.
I would only have the host-TSC internal representation, which is not
exportable to the guest or migratable.
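To illustrate the nsec to host-TSC conversion mentioned above, here is a
rough sketch. It is not KVM code: the function name, its parameters and
the choice to pass the host TSC frequency in kHz are all assumptions made
for the example. It converts an absolute kvmclock deadline into an
absolute host-TSC deadline from one (now_ns, now_host_tsc) sample pair.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not an existing KVM function): turn an absolute
 * deadline in kvmclock nanoseconds into an absolute host-TSC deadline.
 * now_ns and now_host_tsc are assumed to be sampled at the same instant.
 */
static uint64_t deadline_ns_to_host_tsc(uint64_t deadline_ns,
                                        uint64_t now_ns,
                                        uint64_t now_host_tsc,
                                        uint64_t host_tsc_khz)
{
    uint64_t delta_ns, delta_tsc;

    /* A deadline in the past should fire right away. */
    if (deadline_ns <= now_ns)
        return now_host_tsc;

    delta_ns = deadline_ns - now_ns;

    /*
     * ticks = ns * kHz / 10^6.  Split delta_ns into whole milliseconds
     * and a remainder so the multiplication stays within 64 bits for
     * realistic frequencies and deltas.
     */
    delta_tsc = (delta_ns / 1000000u) * host_tsc_khz +
                (delta_ns % 1000000u) * host_tsc_khz / 1000000u;

    return now_host_tsc + delta_tsc;
}
```

The result depends on the host's TSC frequency and on a host-side time
sample, which is why such a value would stay internal: it means nothing
to the guest or to a migration destination.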
> On
> the other hand, if we do, userspace needs to figure out (on migration)
> whether the guest set up a TSC or a nanosecond deadline.
Yeah, I think the solution described below (writing 0 doesn't disarm the
other one) is not bad.
>>> this lets userspace decide whether to set a nsec-based
>>> deadline or a TSC-based deadline after migration.
>>
>> Hm, isn't switching to TSC-based deadline after migration pointless?
>
> Yes, but I didn't mean that. I meant preserving which MSR was written
> to arm the timer, and redoing the same on the destination.
Ah, I see. Both MSRs read back the deadline written to them (if they are
armed), and at most one can be non-zero.
KVM will add MSR_IA32_TSCDEADLINE to the list of emulated MSRs, so
userspace will save/restore both deadline MSRs and zero writes will not
disarm the other timer, so the correct timer will be armed.
No special logic to try to avoid TSC-related bugs.
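A small model of these rules, just as a sketch: the type and function
names below are made up for the example and do not exist in KVM. A
non-zero write arms that MSR and clears the other, a zero write disarms
only the MSR being written, and a blind save/restore of both MSRs then
re-arms the right timer regardless of the restore order.

```c
#include <stdint.h>

/* Hypothetical model of the two deadline MSRs (names are illustrative). */
enum deadline_msr { DEADLINE_TSC, DEADLINE_KVM_NS, DEADLINE_COUNT };

struct deadline_state {
    uint64_t value[DEADLINE_COUNT]; /* what a read of each MSR returns */
};

static void deadline_write(struct deadline_state *s, enum deadline_msr msr,
                           uint64_t v)
{
    enum deadline_msr other =
        (msr == DEADLINE_TSC) ? DEADLINE_KVM_NS : DEADLINE_TSC;

    s->value[msr] = v;

    /* Arming one timer disarms the other; writing 0 leaves it alone. */
    if (v != 0)
        s->value[other] = 0;
}

static uint64_t deadline_read(const struct deadline_state *s,
                              enum deadline_msr msr)
{
    return s->value[msr];
}

/* What userspace would do on migration: copy both MSRs, no special cases. */
static void deadline_migrate(const struct deadline_state *src,
                             struct deadline_state *dst)
{
    int m;

    for (m = 0; m < DEADLINE_COUNT; m++)
        deadline_write(dst, (enum deadline_msr)m,
                       deadline_read(src, (enum deadline_msr)m));
}
```

If a zero write also disarmed the other MSR, restoring the zero after the
armed value would lose the timer, so userspace would have to know which
MSR to write last.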
>>>>> This still wouldn't handle old hosts of course.
>>>>
>>>> The question is whether we want to carry around 150 LOC because of old
>>>> hosts. I'd just fix Linux to avoid deadline TSC without invariant TSC.
>>>> :)
>>>
>>> Yes, that would automatically blacklist it on KVM. You'd also need to
>>> update the recent optimization to the TSC deadline timer, to also work
>>> on other APIC timer modes or at least in your new PV mode.
>>
>> All modes shouldn't be much harder than just the PV mode.
>
> The PV mode would still be a bit easier since it's still the TSC
> deadline timer just with a nicer interface that is not based on the TSC.
> Depends on how you code it though, I guess.
Yeah, we'll see. I am planning to carry around the deadline value in
nanoseconds (to avoid needless conversions), so it would have similar
requirements as the APIC timer.