* VDSO pvclock may increase host cpu consumption, is this a problem?
@ 2014-03-29  8:47 Zhanghailiang
  2014-03-29 14:46 ` Marcelo Tosatti
  2014-03-31 17:52 ` Andy Lutomirski
  0 siblings, 2 replies; 13+ messages in thread
From: Zhanghailiang @ 2014-03-29  8:47 UTC (permalink / raw)
  To: mtosatti, johnstul, tglx, kvm; +Cc: linux-kernel, Zhouxiangjiu, zhang yanying

Hi,
I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
We can calculate it as follows; correct me if I am wrong:
      (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
Both Host and Guest run linux-3.13.6.
So, is the host CPU consumption a problem?

Thanks.
Zhang hailiang

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-29  8:47 VDSO pvclock may increase host cpu consumption, is this a problem? Zhanghailiang
@ 2014-03-29 14:46 ` Marcelo Tosatti
  2014-03-31  1:12   ` Zhanghailiang
  2014-03-31 17:52 ` Andy Lutomirski
  1 sibling, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-03-29 14:46 UTC (permalink / raw)
  To: Zhanghailiang
  Cc: johnstul, tglx, kvm, linux-kernel, Zhouxiangjiu, zhang yanying

On Sat, Mar 29, 2014 at 08:47:27AM +0000, Zhanghailiang wrote:
> Hi,
> I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> We can calculate it as follows; correct me if I am wrong:
>       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> Both Host and Guest run linux-3.13.6.
> So, is the host CPU consumption a problem?

Hi,

What percentage of your CPU's total cycles do 225,000 cycles represent?



* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-29 14:46 ` Marcelo Tosatti
@ 2014-03-31  1:12   ` Zhanghailiang
  0 siblings, 0 replies; 13+ messages in thread
From: Zhanghailiang @ 2014-03-31  1:12 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: johnstul, tglx, kvm, linux-kernel, Zhouxiangjiu

Hi Marcelo,
The CPU's info is:
processor       : 15
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
stepping        : 2
microcode       : 12
cpu MHz         : 2400.125
cache size      : 12288 KB
physical id     : 1
siblings        : 8
core id         : 10
cpu cores       : 4
apicid          : 53
initial apicid  : 53
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4800.18
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

Thanks
Zhang hailiang

> On Sat, Mar 29, 2014 at 08:47:27AM +0000, Zhanghailiang wrote:
> > Hi,
> > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> > We can calculate it as follows; correct me if I am wrong:
> >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> > Both Host and Guest run linux-3.13.6.
> > So, is the host CPU consumption a problem?
> 
> Hi,
> 
> What percentage of your CPU's total cycles do 225,000 cycles represent?



* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-29  8:47 VDSO pvclock may increase host cpu consumption, is this a problem? Zhanghailiang
  2014-03-29 14:46 ` Marcelo Tosatti
@ 2014-03-31 17:52 ` Andy Lutomirski
  2014-03-31 21:30   ` Marcelo Tosatti
  1 sibling, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-03-31 17:52 UTC (permalink / raw)
  To: Zhanghailiang, mtosatti, johnstul, tglx, kvm
  Cc: linux-kernel, Zhouxiangjiu, zhang yanying

On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> Hi,
> I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> We can calculate it as follows; correct me if I am wrong:
>       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> Both Host and Guest run linux-3.13.6.
> So, is the host CPU consumption a problem?

Does pvclock serve any real purpose on systems with fully-functional
TSCs?  The x86 guest implementation is awful, so it's about 2x slower
than TSC.  It could be improved a lot, but I'm not sure I understand why
it exists in the first place.

I certainly understand the goal of keeping the guest CLOCK_REALTIME in
sync with the host, but pvclock seems like overkill for that.

--Andy



* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-31 17:52 ` Andy Lutomirski
@ 2014-03-31 21:30   ` Marcelo Tosatti
  2014-04-01  5:33     ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-03-31 21:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Zhanghailiang, johnstul, tglx, kvm, linux-kernel, Zhouxiangjiu,
	zhang yanying

On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> > Hi,
> > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> > We can calculate it as follows; correct me if I am wrong:
> >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> > Both Host and Guest run linux-3.13.6.
> > So, is the host CPU consumption a problem?
> 
> Does pvclock serve any real purpose on systems with fully-functional
> TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> than TSC.  It could be improved a lot, but I'm not sure I understand why
> it exists in the first place.

VM migration.

Can you explain why you consider it so bad ? How you think it could be
improved ?

> I certainly understand the goal of keeping the guest CLOCK_REALTIME in
> sync with the host, but pvclock seems like overkill for that.

VM migration.



* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-31 21:30   ` Marcelo Tosatti
@ 2014-04-01  5:33     ` Andy Lutomirski
  2014-04-01 18:01       ` Marcelo Tosatti
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-01  5:33 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>
> On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> > > Hi,
> > > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> > > We can calculate it as follows; correct me if I am wrong:
> > >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> > > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> > > Both Host and Guest run linux-3.13.6.
> > > So, is the host CPU consumption a problem?
> >
> > Does pvclock serve any real purpose on systems with fully-functional
> > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> > than TSC.  It could be improved a lot, but I'm not sure I understand why
> > it exists in the first place.
>
> VM migration.

Why does that need percpu stuff?  Wouldn't it be sufficient to
interrupt all CPUs (or at least all cpus running in userspace) on
migration and update the normal timing data structures?

Even better: have the VM offer to invalidate the physical page
containing the kernel's clock data on migration and interrupt one CPU.
 If another CPU races, it'll fault and wait for the guest kernel to
update its timing.

Does the current kvmclock stuff track CLOCK_MONOTONIC and
CLOCK_REALTIME separately?

>
> Can you explain why you consider it so bad ? How you think it could be
> improved ?

The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
getcpu call.

It would also be nice to avoid having two sets of rescalings of the timing data.


>
> > I certainly understand the goal of keeping the guest CLOCK_REALTIME in
> > sync with the host, but pvclock seems like overkill for that.
>
> VM migration.
>


* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-01  5:33     ` Andy Lutomirski
@ 2014-04-01 18:01       ` Marcelo Tosatti
  2014-04-01 19:17         ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-04-01 18:01 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
> >
> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> > > > Hi,
> > > > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> > > > We can calculate it as follows; correct me if I am wrong:
> > > >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > > > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> > > > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> > > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> > > > Both Host and Guest run linux-3.13.6.
> > > > So, is the host CPU consumption a problem?
> > >
> > > Does pvclock serve any real purpose on systems with fully-functional
> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
> > > it exists in the first place.
> >
> > VM migration.
> 
> Why does that need percpu stuff?  Wouldn't it be sufficient to
> interrupt all CPUs (or at least all cpus running in userspace) on
> migration and update the normal timing data structures?

Are you suggesting allowing interruption of the timekeeping code
at any time to update frequency information?

Do you want to do that as a special tsc clocksource driver?

> Even better: have the VM offer to invalidate the physical page
> containing the kernel's clock data on migration and interrupt one CPU.
>  If another CPU races, it'll fault and wait for the guest kernel to
> update its timing.

Perhaps that is a good idea.

> Does the current kvmclock stuff track CLOCK_MONOTONIC and
> CLOCK_REALTIME separately?

No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
count during vm pause).

> > Can you explain why you consider it so bad ? How you think it could be
> > improved ?
> 
> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
> getcpu call.
>
> It would also be nice to avoid having two sets of rescalings of the timing data.

Yep, probably good improvements, patches are welcome :-)



* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-01 18:01       ` Marcelo Tosatti
@ 2014-04-01 19:17         ` Andy Lutomirski
  2014-04-02  0:12           ` Marcelo Tosatti
       [not found]           ` <20140402002926.GB31945@amt.cnet>
  0 siblings, 2 replies; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-01 19:17 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >
>> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> > > > Hi,
>> > > > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
>> > > > We can calculate it as follows; correct me if I am wrong:
>> > > >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
>> > > > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
>> > > > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
>> > > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
>> > > > Both Host and Guest run linux-3.13.6.
>> > > > So, is the host CPU consumption a problem?
>> > >
>> > > Does pvclock serve any real purpose on systems with fully-functional
>> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
>> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
>> > > it exists in the first place.
>> >
>> > VM migration.
>>
>> Why does that need percpu stuff?  Wouldn't it be sufficient to
>> interrupt all CPUs (or at least all cpus running in userspace) on
>> migration and update the normal timing data structures?
>
> Are you suggesting allowing interruption of the timekeeping code
> at any time to update frequency information?

I'm not sure what you mean by "interruption of the timekeeping code".
I'm suggesting sending an interrupt to the guest (via a virtio device,
presumably) to tell it that it has been paused and resumed.

This is probably worth getting John's input if you actually want to do
this.  I'm not about to :)

Is there any case in which the TSC is stable and the kvmclock data for
different cpus is actually different?

>
> Do you want to do that as a special tsc clocksource driver?
>
>> Even better: have the VM offer to invalidate the physical page
>> containing the kernel's clock data on migration and interrupt one CPU.
>>  If another CPU races, it'll fault and wait for the guest kernel to
>> update its timing.
>
> Perhaps that is a good idea.
>
>> Does the current kvmclock stuff track CLOCK_MONOTONIC and
>> CLOCK_REALTIME separately?
>
> No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
> count during vm pause).

Makes sense.

>
>> > Can you explain why you consider it so bad ? How you think it could be
>> > improved ?
>>
>> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
>> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
>> getcpu call.
>>
>> It would also be nice to avoid having two sets of rescalings of the timing data.
>
> Yep, probably good improvements, patches are welcome :-)
>

I may get to it at some point.  No guarantees.  I did just rewrite all
the mapping-related code for every other x86 vdso timesource, so maybe
I should try to add this to the pile.  The fact that the data is a
variable number of pages makes it messy, though, and since I don't
understand why there's a separate structure for each CPU, I'm hesitant
to change it too much.

--Andy


* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-01 19:17         ` Andy Lutomirski
@ 2014-04-02  0:12           ` Marcelo Tosatti
  2014-04-02  0:20             ` Andy Lutomirski
       [not found]           ` <20140402002926.GB31945@amt.cnet>
  1 sibling, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-04-02  0:12 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
> >> >
> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> >> > > > Hi,
> >> > > > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> >> > > > We can calculate it as follows; correct me if I am wrong:
> >> > > >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> >> > > > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> >> > > > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> >> > > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> >> > > > Both Host and Guest run linux-3.13.6.
> >> > > > So, is the host CPU consumption a problem?
> >> > >
> >> > > Does pvclock serve any real purpose on systems with fully-functional
> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
> >> > > it exists in the first place.
> >> >
> >> > VM migration.
> >>
> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
> >> interrupt all CPUs (or at least all cpus running in userspace) on
> >> migration and update the normal timing data structures?
> >
> > Are you suggesting allowing interruption of the timekeeping code
> > at any time to update frequency information?
> 
> I'm not sure what you mean by "interruption of the timekeeping code".
> I'm suggesting sending an interrupt to the guest (via a virtio device,
> presumably) to tell it that it has been paused and resumed.

code:

1) disable interrupts
2) A = RDTSC
3) B = SCALE(A, TSC.FREQ)

If migration happens between 2 and 3, you've got an incorrect value.



* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-02  0:12           ` Marcelo Tosatti
@ 2014-04-02  0:20             ` Andy Lutomirski
  0 siblings, 0 replies; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-02  0:20 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Tue, Apr 1, 2014 at 5:12 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
>> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >> >
>> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> >> > > > Hi,
>> >> > > > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
>> >> > > > We can calculate it as follows; correct me if I am wrong:
>> >> > > >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
>> >> > > > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
>> >> > > > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
>> >> > > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
>> >> > > > Both Host and Guest run linux-3.13.6.
>> >> > > > So, is the host CPU consumption a problem?
>> >> > >
>> >> > > Does pvclock serve any real purpose on systems with fully-functional
>> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
>> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
>> >> > > it exists in the first place.
>> >> >
>> >> > VM migration.
>> >>
>> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
>> >> interrupt all CPUs (or at least all cpus running in userspace) on
>> >> migration and update the normal timing data structures?
>> >
>> > Are you suggesting allowing interruption of the timekeeping code
>> > at any time to update frequency information?
>>
>> I'm not sure what you mean by "interruption of the timekeeping code".
>> I'm suggesting sending an interrupt to the guest (via a virtio device,
>> presumably) to tell it that it has been paused and resumed.
>
> code:
>
> 1) disable interrupts
> 2) A = RDTSC
> 3) B = SCALE(A, TSC.FREQ)
>
> If migration happens between 2 and 3, you've got an incorrect value.
>

Fair enough.

I guess

1) disable interrupts
2) A = RDTSC
3) B = SCALE(A, TSC.FREQ)

is also bad if (3) blocks due to magic invalidation of the physical page.

--Andy


* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
       [not found]           ` <20140402002926.GB31945@amt.cnet>
@ 2014-04-02  0:46             ` Andy Lutomirski
  2014-04-02 22:05               ` Marcelo Tosatti
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-02  0:46 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Tue, Apr 1, 2014 at 5:29 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
>> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >> >
>> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> >> > > > Hi,
>> >> > > > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
>> >> > > > We can calculate it as follows; correct me if I am wrong:
>> >> > > >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
>> >> > > > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
>> >> > > > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
>> >> > > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
>> >> > > > Both Host and Guest run linux-3.13.6.
>> >> > > > So, is the host CPU consumption a problem?
>> >> > >
>> >> > > Does pvclock serve any real purpose on systems with fully-functional
>> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
>> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
>> >> > > it exists in the first place.
>> >> >
>> >> > VM migration.
>> >>
>> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
>> >> interrupt all CPUs (or at least all cpus running in userspace) on
>> >> migration and update the normal timing data structures?
>> >
>> > Are you suggesting allowing interruption of the timekeeping code
>> > at any time to update frequency information?
>>
>> I'm not sure what you mean by "interruption of the timekeeping code".
>> I'm suggesting sending an interrupt to the guest (via a virtio device,
>> presumably) to tell it that it has been paused and resumed.
>>
>> This is probably worth getting John's input if you actually want to do
>> this.  I'm not about to :)
>
> Honestly, neither am I at the moment. But I'll think about it.
>
>> Is there any case in which the TSC is stable and the kvmclock data for
>> different cpus is actually different?
>
> No. However, the kvmclock_data.flags field is an interface for watchdog
> unpause.
>
>> > Do you want to do that as a special tsc clocksource driver?
>> >
>> >> Even better: have the VM offer to invalidate the physical page
>> >> containing the kernel's clock data on migration and interrupt one CPU.
>> >>  If another CPU races, it'll fault and wait for the guest kernel to
>> >> update its timing.
>> >
>> > Perhaps that is a good idea.
>> >
>> >> Does the current kvmclock stuff track CLOCK_MONOTONIC and
>> >> CLOCK_REALTIME separately?
>> >
>> > No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
>> > count during vm pause).
>>
>> Makes sense.
>>
>> >
>> >> > Can you explain why you consider it so bad ? How you think it could be
>> >> > improved ?
>> >>
>> >> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
>> >> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
>> >> getcpu call.
>> >>
>> >> It would also be nice to avoid having two sets of rescalings of the timing data.
>> >
>> > Yep, probably good improvements, patches are welcome :-)
>> >
>>
>> I may get to it at some point.  No guarantees.  I did just rewrite all
>> the mapping-related code for every other x86 vdso timesource, so maybe
>> I should try to add this to the pile.  The fact that the data is a
>> variable number of pages makes it messy, though, and since I don't
>> understand why there's a separate structure for each CPU, I'm hesitant
>> to change it too much.
>>
>> --Andy
>
> kvmclock.data? Because each VCPU can have different .flags fields for
> example.

It looks like the vdso kvmclock code only runs if
PVCLOCK_TSC_STABLE_BIT is set, which in turn is only the case if the
TSC is guaranteed to be monotonic across all CPUs.  If we can rely on
the fact that that bit will only be set if tsc_to_system_mul and
tsc_shift are the same on all CPUs and that (system_time -
(tsc_timestamp * mul) >> shift) is the same on all CPUs, then there
should be no reason for the vdso to read the pvclock data for anything
but CPU 0.  That will make it a lot faster and simpler.

Can we rely on that?

I wonder what happens if the guest runs ntpd or otherwise uses
adjtimex.  Presumably it starts drifting relative to the host.

--Andy


* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-02  0:46             ` Andy Lutomirski
@ 2014-04-02 22:05               ` Marcelo Tosatti
  2014-04-02 22:31                 ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-04-02 22:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Tue, Apr 01, 2014 at 05:46:34PM -0700, Andy Lutomirski wrote:
> On Tue, Apr 1, 2014 at 5:29 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
> >> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
> >> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
> >> >> >
> >> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> >> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> >> >> > > > Hi,
> >> >> > > > I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
> >> >> > > > We can calculate it as follows; correct me if I am wrong:
> >> >> > > >       (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> >> >> > > > On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain, in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
> >> >> > > > In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
> >> >> > > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host consumption.
> >> >> > > > Both Host and Guest run linux-3.13.6.
> >> >> > > > So, is the host CPU consumption a problem?
> >> >> > >
> >> >> > > Does pvclock serve any real purpose on systems with fully-functional
> >> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> >> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
> >> >> > > it exists in the first place.
> >> >> >
> >> >> > VM migration.
> >> >>
> >> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
> >> >> interrupt all CPUs (or at least all cpus running in userspace) on
> >> >> migration and update the normal timing data structures?
> >> >
> >> > Are you suggesting allowing the timekeeping code to be interrupted
> >> > at any time to update frequency information?
> >>
> >> I'm not sure what you mean by "interruption of the timekeeping code".
> >> I'm suggesting sending an interrupt to the guest (via a virtio device,
> >> presumably) to tell it that it has been paused and resumed.
> >>
> >> This is probably worth getting John's input if you actually want to do
> >> this.  I'm not about to :)
> >
> > Honestly, neither am I at the moment. But I'll think about it.
> >
> >> Is there any case in which the TSC is stable and the kvmclock data for
> >> different cpus is actually different?
> >
> > No. However, kvmclock_data.flags field is an interface for watchdog
> > unpause.
> >
> >> > Do you want to do that as a special tsc clocksource driver?
> >> >
> >> >> Even better: have the VM offer to invalidate the physical page
> >> >> containing the kernel's clock data on migration and interrupt one CPU.
> >> >>  If another CPU races, it'll fault and wait for the guest kernel to
> >> >> update its timing.
> >> >
> >> > Perhaps that is a good idea.
> >> >
> >> >> Does the current kvmclock stuff track CLOCK_MONOTONIC and
> >> >> CLOCK_REALTIME separately?
> >> >
> >> > No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
> >> > count during vm pause).
> >>
> >> Makes sense.
> >>
> >> >
> >> >> > Can you explain why you consider it so bad? How do you think it could be
> >> >> > improved?
> >> >>
> >> >> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
> >> >> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
> >> >> getcpu call.
> >> >>
> >> >> It would also be nice to avoid having two sets of rescalings of the timing data.
> >> >
> >> > Yep, probably good improvements, patches are welcome :-)
> >> >
> >>
> >> I may get to it at some point.  No guarantees.  I did just rewrite all
> >> the mapping-related code for every other x86 vdso timesource, so maybe
> >> I should try to add this to the pile.  The fact that the data is a
> >> variable number of pages makes it messy, though, and since I don't
> >> understand why there's a separate structure for each CPU, I'm hesitant
> >> to change it too much.
> >>
> >> --Andy
> >
> > kvmclock.data? Because each VCPU can have different .flags fields for
> > example.
> 
> It looks like the vdso kvmclock code only runs if
> PVCLOCK_TSC_STABLE_BIT is set, which in turn is only the case if the
> TSC is guaranteed to be monotonic across all CPUs.  If we can rely on
> the fact that that bit will only be set if tsc_to_system_mul and
> tsc_shift are the same on all CPUs and that (system_time -
> (tsc_timestamp * mul) >> shift) is the same on all CPUs, then there
> should be no reason for the vdso to read the pvclock data for anything
> but CPU 0.  That will make it a lot faster and simpler.
> 
> Can we rely on that?

In theory yes, but you would have to handle the

PVCLOCK_TSC_STABLE_BIT set -> PVCLOCK_TSC_STABLE_BIT not set

transition (and the other way around as well).

> I wonder what happens if the guest runs ntpd or otherwise uses
> adjtimex.  Presumably it starts drifting relative to the host.

It should use ntpd and adjtimex.  KVMCLOCK is the "hw" clock;
the values returned by CLOCK_REALTIME and CLOCK_MONOTONIC are built
by the Linux guest timekeeping subsystem on top of the "hw" clock.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-02 22:05               ` Marcelo Tosatti
@ 2014-04-02 22:31                 ` Andy Lutomirski
  0 siblings, 0 replies; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-02 22:31 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Thomas Gleixner, linux-kernel, zhang yanying, Zhouxiangjiu, kvm,
	johnstul, Zhanghailiang

On Wed, Apr 2, 2014 at 3:05 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 05:46:34PM -0700, Andy Lutomirski wrote:
>> On Tue, Apr 1, 2014 at 5:29 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
>> >> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> >> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> >> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >> >> >
>> >> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> >> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> >> >> > > > Hi,
>> >> >> > > > I found that when the guest is idle, VDSO pvclock may increase host CPU consumption.
>> >> >> > > > We can calculate it as follows; correct me if I am wrong.
>> >> >> > > >       (Host) 250 * update_pvclock_gtod = 1500 * gettimeofday (Guest)
>> >> >> > > > On the host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it consumes about 225,000 cycles per second, even if no VM is created.
>> >> >> > > > In the guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. With no-kvmclock-vsyscall configured, it consumes 370 cycles per call, so the feature saves 150 cycles per call.
>> >> >> > > > Calling gettimeofday 1500 times therefore saves 225,000 cycles, equal to the host-side consumption.
>> >> >> > > > Both host and guest are linux-3.13.6.
>> >> >> > > > So, is the host CPU consumption a problem?
>> >> >> > >
>> >> >> > > Does pvclock serve any real purpose on systems with fully-functional
>> >> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
>> >> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
>> >> >> > > it exists in the first place.
>> >> >> >
>> >> >> > VM migration.
>> >> >>
>> >> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
>> >> >> interrupt all CPUs (or at least all cpus running in userspace) on
>> >> >> migration and update the normal timing data structures?
>> >> >
>> >> > Are you suggesting allowing the timekeeping code to be interrupted
>> >> > at any time to update frequency information?
>> >>
>> >> I'm not sure what you mean by "interruption of the timekeeping code".
>> >> I'm suggesting sending an interrupt to the guest (via a virtio device,
>> >> presumably) to tell it that it has been paused and resumed.
>> >>
>> >> This is probably worth getting John's input if you actually want to do
>> >> this.  I'm not about to :)
>> >
>> > Honestly, neither am I at the moment. But I'll think about it.
>> >
>> >> Is there any case in which the TSC is stable and the kvmclock data for
>> >> different cpus is actually different?
>> >
>> > No. However, kvmclock_data.flags field is an interface for watchdog
>> > unpause.
>> >
>> >> > Do you want to do that as a special tsc clocksource driver?
>> >> >
>> >> >> Even better: have the VM offer to invalidate the physical page
>> >> >> containing the kernel's clock data on migration and interrupt one CPU.
>> >> >>  If another CPU races, it'll fault and wait for the guest kernel to
>> >> >> update its timing.
>> >> >
>> >> > Perhaps that is a good idea.
>> >> >
>> >> >> Does the current kvmclock stuff track CLOCK_MONOTONIC and
>> >> >> CLOCK_REALTIME separately?
>> >> >
>> >> > No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
>> >> > count during vm pause).
>> >>
>> >> Makes sense.
>> >>
>> >> >
>> >> >> > Can you explain why you consider it so bad? How do you think it could be
>> >> >> > improved?
>> >> >>
>> >> >> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
>> >> >> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
>> >> >> getcpu call.
>> >> >>
>> >> >> It would also be nice to avoid having two sets of rescalings of the timing data.
>> >> >
>> >> > Yep, probably good improvements, patches are welcome :-)
>> >> >
>> >>
>> >> I may get to it at some point.  No guarantees.  I did just rewrite all
>> >> the mapping-related code for every other x86 vdso timesource, so maybe
>> >> I should try to add this to the pile.  The fact that the data is a
>> >> variable number of pages makes it messy, though, and since I don't
>> >> understand why there's a separate structure for each CPU, I'm hesitant
>> >> to change it too much.
>> >>
>> >> --Andy
>> >
>> > kvmclock.data? Because each VCPU can have different .flags fields for
>> > example.
>>
>> It looks like the vdso kvmclock code only runs if
>> PVCLOCK_TSC_STABLE_BIT is set, which in turn is only the case if the
>> TSC is guaranteed to be monotonic across all CPUs.  If we can rely on
>> the fact that that bit will only be set if tsc_to_system_mul and
>> tsc_shift are the same on all CPUs and that (system_time -
>> (tsc_timestamp * mul) >> shift) is the same on all CPUs, then there
>> should be no reason for the vdso to read the pvclock data for anything
>> but CPU 0.  That will make it a lot faster and simpler.
>>
>> Can we rely on that?
>
> In theory yes, but you would have to handle the
>
> PVCLOCK_TSC_STABLE_BIT set -> PVCLOCK_TSC_STABLE_BIT not set
>
> transition (and the other way around as well).

Since !STABLE already results in a real syscall for clock_gettime and
gettimeofday, I don't think this is a real hardship for the vdso.

>
>> I wonder what happens if the guest runs ntpd or otherwise uses
>> adjtimex.  Presumably it starts drifting relative to the host.
>
> It should use ntpd and adjtimex.  KVMCLOCK is the "hw" clock;
> the values returned by CLOCK_REALTIME and CLOCK_MONOTONIC are built
> by the Linux guest timekeeping subsystem on top of the "hw" clock.
>

If the kernel can guarantee that, then the timing code gets faster,
since the cyc2ns scale will be unity.  Maybe this is worth a branch.

Anyway, I'll try to find some time to improve this if/when hpa picks
up my current series of vdso cleanups.  I suspect that the overall
effect will be a 30-40% speedup in clock_gettime along with a decent
reduction of code complexity.

--Andy

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-04-02 22:32 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-29  8:47 VDSO pvclock may increase host cpu consumption, is this a problem? Zhanghailiang
2014-03-29 14:46 ` Marcelo Tosatti
2014-03-31  1:12   ` Zhanghailiang
2014-03-31 17:52 ` Andy Lutomirski
2014-03-31 21:30   ` Marcelo Tosatti
2014-04-01  5:33     ` Andy Lutomirski
2014-04-01 18:01       ` Marcelo Tosatti
2014-04-01 19:17         ` Andy Lutomirski
2014-04-02  0:12           ` Marcelo Tosatti
2014-04-02  0:20             ` Andy Lutomirski
     [not found]           ` <20140402002926.GB31945@amt.cnet>
2014-04-02  0:46             ` Andy Lutomirski
2014-04-02 22:05               ` Marcelo Tosatti
2014-04-02 22:31                 ` Andy Lutomirski
