* smp guest questions
@ 2009-06-17  8:38 Michael Tokarev
  2009-06-17  9:15 ` Avi Kivity
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Tokarev @ 2009-06-17  8:38 UTC (permalink / raw)
  To: KVM list

After seeing Avi's comment that smp guests
are ok now, I decided to try.  And immediately
got a few questions.

Running on a Phenom 9750 machine (PhenomI), AMD780G
chipset.  Host is 2.6.29 x86-64, qemu-kvm 0.10.5,
guests are linux with kvm paravirt bits enabled, also
dynticks (on both host and guest).


When booting a 2-CPU guest, I see in dmesg:

PM-Timer running at invalid rate: 200% of normal - aborting.

and indeed, in available_clocksource there's no pmtimer.
Should I be concerned?  It does not look healthy.
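
For reference, a minimal sketch (not part of the original message) of checking from inside a guest which clocksources registered. It assumes the usual sysfs location and that the ACPI PM timer shows up under the name "acpi_pm"; the helper names are made up for illustration.

```python
from pathlib import Path

# Usual sysfs location of the clocksource files on Linux (assumed here).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def parse_clocksources(text):
    """Split the whitespace-separated contents of a clocksource sysfs file."""
    return text.split()


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (available, current), or ([], None) if the files are unreadable."""
    try:
        available = parse_clocksources((base / "available_clocksource").read_text())
        current = (base / "current_clocksource").read_text().strip()
    except OSError:
        return [], None
    return available, current


if __name__ == "__main__":
    available, current = read_clocksources()
    print("available:", available)
    print("current:", current)
    # The PM timer is expected to be listed as "acpi_pm" when it registered.
    print("acpi_pm present:", "acpi_pm" in available)
```

If "acpi_pm" is missing here while the dmesg line above is present, the two observations are consistent with each other: the rate check failed and the clocksource was never registered.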


Some time later, I see stuff like:

hrtimer: interrupt too slow, forcing clock min delta to 47210997 ns

Which reminds me of issues I had with a broken hpet (time going
back and forth, with similar messages in dmesg, but
about hpet rather than hrtimer).  Also does not look healthy.


I haven't seen either of the two messages above on any
single-processor guest so far, at least with recent kernels
and kvm userspace, only on smp (2 cpus for now).
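
A small sketch (added for illustration, not from the original message) of scanning saved dmesg output for the two symptoms above. The regular expressions are built only from the two message texts quoted in this mail; the function name and the result labels are hypothetical. Note that the reported min delta of 47210997 ns is roughly 47 ms.

```python
import re

# Patterns taken from the two log lines quoted in this thread.
PM_TIMER_RE = re.compile(r"PM-Timer running at invalid rate: (\d+)% of normal")
HRTIMER_RE = re.compile(
    r"hrtimer: interrupt too slow, forcing clock min delta to (\d+) ns"
)


def scan_dmesg(text):
    """Return a list of (label, value) pairs for each matching log line."""
    findings = []
    for line in text.splitlines():
        match = PM_TIMER_RE.search(line)
        if match:
            findings.append(("pm_timer_rate_percent", int(match.group(1))))
            continue
        match = HRTIMER_RE.search(line)
        if match:
            findings.append(("hrtimer_min_delta_ns", int(match.group(1))))
    return findings
```

Running this over the logs of every guest after a mass startup would at least show which guests hit which message, which is hard to tell by eye when several boot at once.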

Thanks!

/mjt


* Re: smp guest questions
  2009-06-17  8:38 smp guest questions Michael Tokarev
@ 2009-06-17  9:15 ` Avi Kivity
  2009-06-17 10:46   ` Michael Tokarev
  0 siblings, 1 reply; 5+ messages in thread
From: Avi Kivity @ 2009-06-17  9:15 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: KVM list, Marcelo Tosatti

On 06/17/2009 11:38 AM, Michael Tokarev wrote:
> After seeing words from Avi about that smp guests
> are ok now, I decided to try.  And immediately
> got a few questions.
>
> Running on a Phenom 9750 machine (PhenomI), AMD780G
> chipset.  Host is 2.6.29 x86-64, qemu-kvm 0.10.5,
> guests are linux with kvm paravirt bits enabled, also
> dynticks (on both host and guest).
>
>
> When booting a 2-CPU guest, I see in dmesg:
>
> PM-Timer running at invalid rate: 200% of normal - aborting.
>
> and indeed, in available_clocksource there's no pmtimer.
> Should I be concerned?  It does not look healthy.
>

It's a bug, please post guest details (kernel version, bitness).

Copying Marcelo.

>
> Some time later, I see stuff like:
>
> hrtimer: interrupt too slow, forcing clock min delta to 47210997 ns
>
> Which reminds me issues I had with broken hpet (time goes
> back-n-forth with similar messages shown in dmesg, but
> about hpet not hrtimer).  Also does not look healthy.
>
>
> I haven't seen either of the two messages above on any of
> single-processor guests so far, at least with recent kernels
> and kvm userspace, only on smp (2 cpu for now).

Please also post host /proc/cpuinfo.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: smp guest questions
  2009-06-17  9:15 ` Avi Kivity
@ 2009-06-17 10:46   ` Michael Tokarev
  2009-06-18  9:14     ` Michael Tokarev
  2009-06-18 15:17     ` Marcelo Tosatti
  0 siblings, 2 replies; 5+ messages in thread
From: Michael Tokarev @ 2009-06-17 10:46 UTC (permalink / raw)
  To: Avi Kivity; +Cc: KVM list, Marcelo Tosatti

Avi Kivity wrote:
> On 06/17/2009 11:38 AM, Michael Tokarev wrote:
>> After seeing words from Avi about that smp guests
>> are ok now, I decided to try.  And immediately
>> got a few questions.
>>
>> Running on a Phenom 9750 machine (PhenomI), AMD780G
>> chipset.  Host is 2.6.29 x86-64, qemu-kvm 0.10.5,
>> guests are linux with kvm paravirt bits enabled, also
>> dynticks (on both host and guest).
>>
>>
>> When booting a 2-CPU guest, I see in dmesg:
>>
>> PM-Timer running at invalid rate: 200% of normal - aborting.
>>
>> and indeed, in available_clocksource there's no pmtimer.
>> Should I be concerned?  It does not look healthy.
>>
> 
> It's a bug, please post guest details (kernel version, bitness).

The guest kernel is also 2.6.29[.5], but this time it's x86-32
(compiled for P4).  kvm userspace is also 32bits (historical) --
only host kernel is 64bit for now.  I'll try to do some more
experiments later today on a test machine (this is a production
box) -- "hopefully" that same issue will occur on another
machine :)

> Copying Marcelo.
> 
>>
>> Some time later, I see stuff like:
>>
>> hrtimer: interrupt too slow, forcing clock min delta to 47210997 ns
>>
>> Which reminds me issues I had with broken hpet (time goes
>> back-n-forth with similar messages shown in dmesg, but
>> about hpet not hrtimer).  Also does not look healthy.
>>
>>
>> I haven't seen either of the two messages above on any of
>> single-processor guests so far, at least with recent kernels
>> and kvm userspace, only on smp (2 cpu for now).
> 
> Please also post host /proc/cpuinfo.

HOST cpuinfo (only for 4th core, other cores are similar):
processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: AMD Phenom(tm) 9750 Quad-Core Processor
stepping	: 3
cpu MHz		: 1200.000
(yes ondemand cpufreq is in effect - nominal frequency is 2400.
I had no issues with cpufreq on this box so far, including all
the guests).
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
bogomips	: 4812.67
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate



cpuinfo on GUEST (also for only one CPU):

processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 2
model name	: QEMU Virtual CPU version 0.10.5
stepping	: 3
cpu MHz		: 2405.894
cache size	: 512 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall lm pni hypervisor
bogomips	: 4811.78
clflush size	: 64
power management:


Thanks!

/mjt


* Re: smp guest questions
  2009-06-17 10:46   ` Michael Tokarev
@ 2009-06-18  9:14     ` Michael Tokarev
  2009-06-18 15:17     ` Marcelo Tosatti
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Tokarev @ 2009-06-18  9:14 UTC (permalink / raw)
  To: Avi Kivity; +Cc: KVM list, Marcelo Tosatti

Replying to myself & top-posting for reference.

I can't reproduce either of the two timer
issues mentioned in my original email,
quoted below.

But there IS a race somewhere, that's for sure.

When I saw both "pm-timer running at 200% rate"
and "hrtimer: interrupt too slow" (and I saw them
more than once on this configuration), it was
during host system startup, when the host starts all
the guest machines (several of them) and they
continue their own startup in the background, all
at once.  I.e., it happened more than once when
several kvm guests got started all together.

Playing with it more I wasn't able to repeat the
issue, and can't trigger it with 4 guests on my
test machine at home either.  But it happened
again "when I wasn't watching", also during
massive guest startup.

Another issue happened during startup (or, rather,
AFTER such a massive startup, when one guest reported
the 200% rate of pm-timer, probably at the same time
the hrtimer message popped up): another guest
locked up hard.  Its kvm process was looping, using 100%
cpu time, and did not answer monitor socket requests
(it was supposed to listen on a unix socket for monitor
commands).  *Probably* at the time when that guest was
in the locked state, another guest reported the hrtimer
message, but I'm not 100% sure, since I can only tell
by the "--MARK--" messages in the syslog of the dead guest,
which are at 20-minute intervals.  Maybe some "random
glitch", I dunno ;)

In any case, since I can't provide more information
about all this despite all my attempts to reproduce
the situation, I consider this issue closed, for now
anyway.  But let it be archived for future reference :)

Thanks!

/mjt

Michael Tokarev wrote:
> Avi Kivity wrote:
>> On 06/17/2009 11:38 AM, Michael Tokarev wrote:
>>> After seeing words from Avi about that smp guests
>>> are ok now, I decided to try.  And immediately
>>> got a few questions.
>>>
>>> Running on a Phenom 9750 machine (PhenomI), AMD780G
>>> chipset.  Host is 2.6.29 x86-64, qemu-kvm 0.10.5,
>>> guests are linux with kvm paravirt bits enabled, also
>>> dynticks (on both host and guest).
>>>
>>>
>>> When booting a 2-CPU guest, I see in dmesg:
>>>
>>> PM-Timer running at invalid rate: 200% of normal - aborting.
>>>
>>> and indeed, in available_clocksource there's no pmtimer.
>>> Should I be concerned?  It does not look healthy.
>>>
>>
>> It's a bug, please post guest details (kernel version, bitness).
> 
> The guest kernel is also 2.6.29[.5], but this time it's x86-32
> (compiled for P4).  kvm userspace is also 32bits (historical) --
> only host kernel is 64bit for now.  I'll try to do some more
> experiments later today on a test machine (this is a production
> box) -- "hopefully" that same issue will occur on another
> machine :)
> 
>> Copying Marcelo.
>>
>>>
>>> Some time later, I see stuff like:
>>>
>>> hrtimer: interrupt too slow, forcing clock min delta to 47210997 ns
>>>
>>> Which reminds me issues I had with broken hpet (time goes
>>> back-n-forth with similar messages shown in dmesg, but
>>> about hpet not hrtimer).  Also does not look healthy.
>>>
>>>
>>> I haven't seen either of the two messages above on any of
>>> single-processor guests so far, at least with recent kernels
>>> and kvm userspace, only on smp (2 cpu for now).
>>
>> Please also post host /proc/cpuinfo.
> 
> HOST cpuinfo (only for 4th core, other cores are similar):
> processor    : 3
> vendor_id    : AuthenticAMD
> cpu family    : 16
> model        : 2
> model name    : AMD Phenom(tm) 9750 Quad-Core Processor
> stepping    : 3
> cpu MHz        : 1200.000
> (yes ondemand cpufreq is in effect - nominal frequency is 2400.
> I had no issues with cpufreq on this box so far, including all
> the guests).
> cache size    : 512 KB
> physical id    : 0
> siblings    : 4
> core id        : 3
> cpu cores    : 4
> apicid        : 3
> initial apicid    : 3
> fpu        : yes
> fpu_exception    : yes
> cpuid level    : 5
> wp        : yes
> flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
> cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt 
> pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc pni 
> monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a 
> misalignsse 3dnowprefetch osvw ibs
> bogomips    : 4812.67
> TLB size    : 1024 4K pages
> clflush size    : 64
> cache_alignment    : 64
> address sizes    : 48 bits physical, 48 bits virtual
> power management: ts ttp tm stc 100mhzsteps hwpstate
> 
> 
> 
> cpuinfo on GUEST (also for only one CPU):
> 
> processor    : 1
> vendor_id    : AuthenticAMD
> cpu family    : 6
> model        : 2
> model name    : QEMU Virtual CPU version 0.10.5
> stepping    : 3
> cpu MHz        : 2405.894
> cache size    : 512 KB
> fdiv_bug    : no
> hlt_bug        : no
> f00f_bug    : no
> coma_bug    : no
> fpu        : yes
> fpu_exception    : yes
> cpuid level    : 2
> wp        : yes
> flags        : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
> pat pse36 clflush mmx fxsr sse sse2 syscall lm pni hypervisor
> bogomips    : 4811.78
> clflush size    : 64
> power management:
> 
> 
> Thanks!
> 
> /mjt



* Re: smp guest questions
  2009-06-17 10:46   ` Michael Tokarev
  2009-06-18  9:14     ` Michael Tokarev
@ 2009-06-18 15:17     ` Marcelo Tosatti
  1 sibling, 0 replies; 5+ messages in thread
From: Marcelo Tosatti @ 2009-06-18 15:17 UTC (permalink / raw)
  To: Michael Tokarev, Zachary Amsden; +Cc: Avi Kivity, KVM list

On Wed, Jun 17, 2009 at 02:46:47PM +0400, Michael Tokarev wrote:
> Avi Kivity wrote:
>> On 06/17/2009 11:38 AM, Michael Tokarev wrote:
>>> After seeing words from Avi about that smp guests
>>> are ok now, I decided to try.  And immediately
>>> got a few questions.
>>>
>>> Running on a Phenom 9750 machine (PhenomI), AMD780G
>>> chipset.  Host is 2.6.29 x86-64, qemu-kvm 0.10.5,
>>> guests are linux with kvm paravirt bits enabled, also
>>> dynticks (on both host and guest).
>>>
>>>
>>> When booting a 2-CPU guest, I see in dmesg:
>>>
>>> PM-Timer running at invalid rate: 200% of normal - aborting.
>>>
>>> and indeed, in available_clocksource there's no pmtimer.
>>> Should I be concerned?  It does not look healthy.
>>>
>>
>> It's a bug, please post guest details (kernel version, bitness).
>
> The guest kernel is also 2.6.29[.5], but this time it's x86-32
> (compiled for P4).  kvm userspace is also 32bits (historical) --
> only host kernel is 64bit for now.  I'll try to do some more
> experiments later today on a test machine (this is a production
> box) -- "hopefully" that same issue will occur on another
> machine :)

The kernel tries to correlate the pm-timer with the PIT:

http://lxr.linux.no/linux+v2.6.17/arch/i386/kernel/timers/timer_pm.c#L95

http://lxr.linux.no/linux+v2.6.17/include/asm-i386/mach-default/mach_timer.h#L39

Should be fixed before the Brazil world cup.
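
To illustrate the correlation check mentioned above, here is a rough userspace sketch in Python (added for illustration, not the kernel code itself). The idea is that the kernel counts PM timer ticks across a fixed PIT calibration window and rejects the PM timer if the count is more than about 5% off the expected value. The constants below (PM timer at 3579545 Hz, PIT at 1193182 Hz, a 30 ms window) are assumptions recalled from mainline sources and should be checked against the files linked above; the function name is hypothetical.

```python
# Assumed constants; verify against the kernel version in question.
PMTMR_TICKS_PER_SEC = 3_579_545   # ACPI PM timer frequency in Hz
PIT_TICK_RATE = 1_193_182         # i8254 PIT input clock in Hz
CALIBRATE_TIME_MSEC = 30          # length of the PIT calibration window

# Number of PIT ticks in the calibration window, rounded to nearest.
CALIBRATE_LATCH = (PIT_TICK_RATE * CALIBRATE_TIME_MSEC + 500) // 1000

# PM timer ticks expected during that window (integer math, as in C).
PMTMR_EXPECTED = (CALIBRATE_LATCH * (PMTMR_TICKS_PER_SEC >> 10)) // (
    PIT_TICK_RATE >> 10
)


def check_pmtmr_rate(delta):
    """Return (ok, percent) for a measured PM timer delta over the window.

    ok is True when delta lies within roughly +/-5% of the expected count;
    percent is the measured rate relative to the expected one.
    """
    lower = PMTMR_EXPECTED * 19 // 20
    upper = PMTMR_EXPECTED * 21 // 20
    percent = 100 * delta // PMTMR_EXPECTED
    return lower <= delta <= upper, percent
```

With these assumed constants, a measured delta of twice the expected count yields a failed check at 200%, which matches the shape of the reported message. The sketch says nothing about why the guest saw a doubled count; the thread does not establish that.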

