All of lore.kernel.org
 help / color / mirror / Atom feed
* Can't use kvm_exit for the host side time stamp
@ 2019-02-26 16:24 Steven Rostedt
  2019-02-26 18:02 ` Tzvetomir Stoyanov
  2019-02-28 16:22 ` Tzvetomir Stoyanov
  0 siblings, 2 replies; 6+ messages in thread
From: Steven Rostedt @ 2019-02-26 16:24 UTC (permalink / raw)
  To: Tzvetomir Stoyanov; +Cc: linux-trace-devel, Yordan Karadzhov, Slavomir Kaslev

Hi Ceco,

I was walking Yordan down the code in the kernel to show him the ideas
I have about using kvm_exit as the trace event on the host side, when I
noticed that it's called *after* interrupts and preemption are enabled!

This totally breaks the design, as it is not a reliable event, because
the process running the guest is likely to get an interrupt or
preempted before recording that event, which can cause a huge
difference.

kvm_enter is fine, but the reason that kvm_exit is not, is because, in
order to record the reason for the exit, it needs to call into either
the Intel or AMD code (as they are different), and that is done after
interrupts and preemption are enabled.

Instead, we can trace the function kvm_read_l1_tsc, which happens while
interrupts are still enabled.

Would you be able to test with that function instead of using kvm_exit?
Also, if you still see a drift, try setting both trace clocks to
x86-tsc, and see if you still get a drift.

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Can't use kvm_exit for the host side time stamp
  2019-02-26 16:24 Can't use kvm_exit for the host side time stamp Steven Rostedt
@ 2019-02-26 18:02 ` Tzvetomir Stoyanov
  2019-02-26 18:15   ` Steven Rostedt
  2019-02-28 16:22 ` Tzvetomir Stoyanov
  1 sibling, 1 reply; 6+ messages in thread
From: Tzvetomir Stoyanov @ 2019-02-26 18:02 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-trace-devel, Yordan Karadzhov, Slavomir Kaslev

Hi Steven,

On Tue, Feb 26, 2019 at 6:24 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Hi Ceco,
>
> I was walking Yordan down the code in the kernel to show him the ideas
> I have about using kvm_exit as the trace event on the host side, when I
> noticed that it's called *after* interrupts and preemption are enabled!
>
> This totally breaks the design, as it is not a reliable event, because
> the process running the guest is likely to get an interrupt or
> preempted before recording that event, which can cause a huge
> difference.
>
We performed a lot of tests with the current implementation (relying
on the kvm_exit event),
and the results are very accurate and consistent. Of course, there are
probes with huge
deviation, but these can be easily filtered.

> kvm_enter is fine, but the reason that kvm_exit is not, is because, in
> order to record the reason for the exit, it needs to call into either
> the Intel or AMD code (as they are different), and that is done after
> interrupts and preemption are enabled.
>
> Instead, we can trace the function kvm_read_l1_tsc, which happens while
> interrupts are still enabled.
>
> Would you be able to test with that function instead of using kvm_exit?
> Also, if you still see a drift, try setting both trace clocks to
> x86-tsc, and see if you still get a drift.

Yes, I'll test with the kvm_read_l1_tsc function tomorrow.  Is it
related to x86-tsc
clock only, or it should work with any clock ? I noticed that the
drift does not depend
on the clock, x86-tsc also drifts.

>
> Thanks!
>
> -- Steve



-- 

Tzvetomir (Ceco) Stoyanov
VMware Open Source Technology Center

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Can't use kvm_exit for the host side time stamp
  2019-02-26 18:02 ` Tzvetomir Stoyanov
@ 2019-02-26 18:15   ` Steven Rostedt
  2019-02-26 19:31     ` Tzvetomir Stoyanov
  0 siblings, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2019-02-26 18:15 UTC (permalink / raw)
  To: Tzvetomir Stoyanov; +Cc: linux-trace-devel, Yordan Karadzhov, Slavomir Kaslev

On Tue, 26 Feb 2019 18:02:48 +0000
Tzvetomir Stoyanov <tstoyanov@vmware.com> wrote:

> We performed a lot of tests with the current implementation (relying
> on the kvm_exit event),
> and the results are very accurate and consistent. Of course, there are
> probes with huge
> deviation, but these can be easily filtered.

At least we now know why there were huge deviations :-)

> 
> > kvm_enter is fine, but the reason that kvm_exit is not, is because, in
> > order to record the reason for the exit, it needs to call into either
> > the Intel or AMD code (as they are different), and that is done after
> > interrupts and preemption are enabled.
> >
> > Instead, we can trace the function kvm_read_l1_tsc, which happens while
> > interrupts are still enabled.
> >
> > Would you be able to test with that function instead of using kvm_exit?
> > Also, if you still see a drift, try setting both trace clocks to
> > x86-tsc, and see if you still get a drift.  
> 
> Yes, I'll test with the kvm_read_l1_tsc function tomorrow.  Is it
> related to x86-tsc
> clock only, or it should work with any clock ? I noticed that the
> drift does not depend
> on the clock, x86-tsc also drifts.
>

No, it shouldn't be related, you can do them differently.

Is it only your machine that has an drift, or do you see it on other
machines? We looked at Yordan's laptop, and it had shifts, but not a
drift. That is, it was off for a bit but then goes back. Not a drift,
but a shift between two deltas. Not sure why.

The fact that it drifts with x86-tsc clock bothers me, because the
guest and the host are using the same crystal. For there to be a drift,
the tsc of the guest needs to be using a different crystal than the
host. That is, two different clock sources for the TSC depending on if
the CPU is in the VM or in the host context.

BTW, is your box Intel or AMD?

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Can't use kvm_exit for the host side time stamp
  2019-02-26 18:15   ` Steven Rostedt
@ 2019-02-26 19:31     ` Tzvetomir Stoyanov
  0 siblings, 0 replies; 6+ messages in thread
From: Tzvetomir Stoyanov @ 2019-02-26 19:31 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-trace-devel, Yordan Karadzhov, Slavomir Kaslev

On Tue, Feb 26, 2019 at 8:15 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 26 Feb 2019 18:02:48 +0000
> Tzvetomir Stoyanov <tstoyanov@vmware.com> wrote:
>
> > We performed a lot of tests with the current implementation (relying
> > on the kvm_exit event),
> > and the results are very accurate and consistent. Of course, there are
> > probes with huge
> > deviation, but these can be easily filtered.
>
> At least we now know why there were huge deviations :-)
>
> >
> > > kvm_enter is fine, but the reason that kvm_exit is not, is because, in
> > > order to record the reason for the exit, it needs to call into either
> > > the Intel or AMD code (as they are different), and that is done after
> > > interrupts and preemption are enabled.
> > >
> > > Instead, we can trace the function kvm_read_l1_tsc, which happens while
> > > interrupts are still enabled.
> > >
> > > Would you be able to test with that function instead of using kvm_exit?
> > > Also, if you still see a drift, try setting both trace clocks to
> > > x86-tsc, and see if you still get a drift.
> >
> > Yes, I'll test with the kvm_read_l1_tsc function tomorrow.  Is it
> > related to x86-tsc
> > clock only, or it should work with any clock ? I noticed that the
> > drift does not depend
> > on the clock, x86-tsc also drifts.
> >
>
> No, it shouldn't be related, you can do them differently.
>
> Is it only your machine that has an drift, or do you see it on other
> machines? We looked at Yordan's laptop, and it had shifts, but not a
> drift. That is, it was off for a bit but then goes back. Not a drift,
> but a shift between two deltas. Not sure why.

I tested on my machine only (Intel), with different clocks. The drift
is easily visible
using KernelShark combo plots - at the trace start, both traces are
synced with accuracy
up to 1-2 us. At the trace end the desync is bigger, depending on the
trace duration.

> The fact that it drifts with x86-tsc clock bothers me, because the
> guest and the host are using the same crystal. For there to be a drift,
> the tsc of the guest needs to be using a different crystal than the
> host. That is, two different clock sources for the TSC depending on if
> the CPU is in the VM or in the host context.
>
> BTW, is your box Intel or AMD?
>
> Thanks,
>
> -- Steve



-- 

Tzvetomir (Ceco) Stoyanov
VMware Open Source Technology Center

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Can't use kvm_exit for the host side time stamp
  2019-02-26 16:24 Can't use kvm_exit for the host side time stamp Steven Rostedt
  2019-02-26 18:02 ` Tzvetomir Stoyanov
@ 2019-02-28 16:22 ` Tzvetomir Stoyanov
  2019-02-28 16:33   ` Steven Rostedt
  1 sibling, 1 reply; 6+ messages in thread
From: Tzvetomir Stoyanov @ 2019-02-28 16:22 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-trace-devel, Yordan Karadzhov, Slavomir Kaslev

Hi Steven,
On Tue, Feb 26, 2019 at 6:24 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Hi Ceco,
>
> I was walking Yordan down the code in the kernel to show him the ideas
> I have about using kvm_exit as the trace event on the host side, when I
> noticed that it's called *after* interrupts and preemption are enabled!
>
> This totally breaks the design, as it is not a reliable event, because
> the process running the guest is likely to get an interrupt or
> preempted before recording that event, which can cause a huge
> difference.
>
> kvm_enter is fine, but the reason that kvm_exit is not, is because, in
> order to record the reason for the exit, it needs to call into either
> the Intel or AMD code (as they are different), and that is done after
> interrupts and preemption are enabled.
>
> Instead, we can trace the function kvm_read_l1_tsc, which happens while
> interrupts are still enabled.

I changed the implementation to track kvm_read_l1_tsc() instead of
kvm_exit() and tested on my machine. The results were very odd, I
cannot see any traces of kvm_read_l1_tsc() called by
vcpu_enter_guest(), there are only bunch of calls like that:
       CPU 1/KVM-5555  [005] ....  4530.180230: kvm_read_l1_tsc
<-vmx_set_hv_timer
       CPU 1/KVM-5555  [005] d...  4530.180238: kvm_read_l1_tsc <-start_sw_timer
       CPU 0/KVM-5554  [007] d...  4530.183991: kvm_read_l1_tsc
<-wait_lapic_expire
With the help of Slavi, we found that kvm_read_l1_tsc() is compiled as
inline on my kernel (4.20.11, shipped  by Fedora),  in this specific
flow. Looks like a gcc optimization ?  We tested with
vmx_read_l1_tsc_offset(), instead of kvm_read_l1_tsc()  - it appears
as expected in the traces, and the results are much more accurate
than the case when  kvm_exit() is used. The problem is that
vmx_read_l1_tsc_offset() is Intel specific, most probably it is not
used on non-Intel CPUs.

>
> Would you be able to test with that function instead of using kvm_exit?
> Also, if you still see a drift, try setting both trace clocks to
> x86-tsc, and see if you still get a drift.
>
> Thanks!
>
> -- Steve



-- 

Tzvetomir (Ceco) Stoyanov
VMware Open Source Technology Center

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Can't use kvm_exit for the host side time stamp
  2019-02-28 16:22 ` Tzvetomir Stoyanov
@ 2019-02-28 16:33   ` Steven Rostedt
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2019-02-28 16:33 UTC (permalink / raw)
  To: Tzvetomir Stoyanov; +Cc: linux-trace-devel, Yordan Karadzhov, Slavomir Kaslev

On Thu, 28 Feb 2019 16:22:41 +0000
Tzvetomir Stoyanov <tstoyanov@vmware.com> wrote:

> The problem is that
> vmx_read_l1_tsc_offset() is Intel specific, most probably it is not
> used on non-Intel CPUs.

Correct, but we should be able to easily determine that by checking
which module is loaded (kvm_intel or kvm_amd). I'm sure we can just use
the equivalent for amd.

But again, this needs to be able to work with non-x86 too, which means
at the end, we need to make this configurable. Either by having plugins
loaded by checking what system it is on, and seeing what options are
available.

-- Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-02-28 16:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-26 16:24 Can't use kvm_exit for the host side time stamp Steven Rostedt
2019-02-26 18:02 ` Tzvetomir Stoyanov
2019-02-26 18:15   ` Steven Rostedt
2019-02-26 19:31     ` Tzvetomir Stoyanov
2019-02-28 16:22 ` Tzvetomir Stoyanov
2019-02-28 16:33   ` Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.