* strange guest slowness after some time
@ 2009-03-07 15:47 Tomasz Chmielewski
  2009-03-07 16:41 ` Johannes Baumann
                   ` (2 more replies)
  0 siblings, 3 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-07 15:47 UTC (permalink / raw)
  To: kvm

I have a strange slowness which affects some guests after they have been 
running for some time. The "slowness" can appear a few hours after guest 
start, or a couple of days after guest start.

What do I mean by "slowness"?

This is how long it takes to log in via SSH to an unaffected guest - 
below a second:

$ time ssh backupuser@normal_guest exit
0.02user 0.01system 0:00.67elapsed 4%CPU (0avgtext+0avgdata 0maxresident)

Now, let's try to log in to the affected guest running on the same host 
- more than 12 seconds:

$ time ssh backupuser@slow_guest exit
0.02user 0.01system 0:12.56elapsed 0%CPU (0avgtext+0avgdata 0maxresident)

If I log in via SSH to the affected guest, any key presses lag a second 
or two.


This is actually weird - if I run something IO intensive on the guest, 
the login is much faster (running CPU-intensive tasks makes no difference):

guest# dd if=/dev/vda of=/dev/null

$ time ssh backupuser@slow_guest exit
0.02user 0.00system 0:00.70elapsed 2%CPU (0avgtext+0avgdata 0maxresident)

Also, running "ping -f <slow_guest>" helps a lot and SSH logins are fast.


Look at the difference here - 7470ms vs 139183ms (and packet losses):

# ping -f -c 10000 normal_guest

10000 packets transmitted, 10000 received, 0% packet loss, time 7470ms
rtt min/avg/max/mdev = 0.443/0.709/6.487/0.112 ms, ipg/ewma 0.747/0.716 ms

# ping -f -c 10000 slow_guest

10000 packets transmitted, 9934 received, 0% packet loss, time 139183ms
rtt min/avg/max/mdev = 0.470/14.337/50.455/5.409 ms, pipe 4, ipg/ewma 
13.919/14.788 ms
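
As a rough cross-check of the two flood-ping summaries above, dividing the total time by the number of packets sent should land close to the `ipg` figure ping prints, and the received count gives the actual loss. A minimal Python sketch using only the numbers from the output above; the helper names `mean_gap_ms` and `loss_percent` are made up for illustration, and the remark about rounding is an assumption about how ping formats its summary.

```python
def mean_gap_ms(total_ms, packets_sent):
    """Average time between consecutive packets, in milliseconds."""
    return total_ms / packets_sent


def loss_percent(sent, received):
    """Packet loss as a percentage of packets sent."""
    return 100.0 * (sent - received) / sent


# Numbers copied from the two "ping -f -c 10000" summaries above.
normal_gap = mean_gap_ms(7470, 10000)     # normal_guest
slow_gap = mean_gap_ms(139183, 10000)     # slow_guest
slow_loss = loss_percent(10000, 9934)     # slow_guest

print(f"normal guest: {normal_gap:.3f} ms per packet")
print(f"slow guest:   {slow_gap:.3f} ms per packet")
print(f"slowdown:     {slow_gap / normal_gap:.1f}x")
# ping shows "0% packet loss" for the slow guest, presumably because it
# rounds down; the counts themselves say a little under 1% went missing.
print(f"slow guest loss: {slow_loss:.2f}%")
```

So the affected guest is roughly 18 to 19 times slower per packet, and 66 of the 10000 packets were lost even though the summary line says 0%.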


CPU-intensive tasks are as fast as on unaffected guests.
Reading from /dev/vda is as fast as on unaffected guests.

So the only thing broken seems to be the network.


Rebooting the guest does not help - it is still slow.
The only thing that helps is stopping the guest and starting it again 
(i.e., stopping kvm process and starting a new one).


Is there an explanation for this phenomenon? It looks like a problem 
somewhere in the virtio drivers, doesn't it?



The host is running kvm-83.
Affected guests are running 2.6.27.14 kernels and use virtio drivers.
The problem happens only _sometimes_. Out of 9 guests I have running on 
this host, I saw this problem only on 3 guests. I never saw this 
happening on more than one guest at a time.
All three have 512 MB memory assigned, other guests have less memory.


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-07 15:47 strange guest slowness after some time Tomasz Chmielewski
@ 2009-03-07 16:41 ` Johannes Baumann
  2009-03-07 16:54   ` Tomasz Chmielewski
  2009-03-09  9:18 ` Tomasz Chmielewski
  2009-03-09  9:55 ` Avi Kivity
  2 siblings, 1 reply; 70+ messages in thread
From: Johannes Baumann @ 2009-03-07 16:41 UTC (permalink / raw)
  To: Tomasz Chmielewski, kvm

Are your nameservers OK?
ssh does a reverse lookup of your IP; if your nameserver is not
available, login may take some time.

johannes



* Re: strange guest slowness after some time
  2009-03-07 16:41 ` Johannes Baumann
@ 2009-03-07 16:54   ` Tomasz Chmielewski
  0 siblings, 0 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-07 16:54 UTC (permalink / raw)
  To: Johannes Baumann; +Cc: kvm

Johannes Baumann wrote:
> are your nameservers ok?
> ssh is reverse checking your ip, if your nameserver is not
> available login may take some time.

Nameservers were fine.
If they were broken, it would affect all the other guests too, wouldn't it?

Also, to my knowledge, nameservers normally do not affect ping losses 
and/or ping roundtrip times ;)

"dd if=/dev/vda of=/dev/null" curing the problem also excludes the 
nameserver idea.

-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-07 15:47 strange guest slowness after some time Tomasz Chmielewski
  2009-03-07 16:41 ` Johannes Baumann
@ 2009-03-09  9:18 ` Tomasz Chmielewski
  2009-03-09  9:28   ` Tomasz Chmielewski
  2009-03-09  9:55 ` Avi Kivity
  2 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-09  9:18 UTC (permalink / raw)
  To: kvm

Tomasz Chmielewski wrote:

> The host is running kvm-83.
> Affected guests are running 2.6.27.14 kernels and use virtio drivers.
> The problem happens only _sometimes_. Out of 9 guests I have running on 
> this host, I saw this problem only on 3 guests. I never saw this 
> happening on more than one guest at a time.
> All three have 512 MB memory assigned, other guests have less memory.

I upgraded to kvm-84 ~2 days ago and the same thing just happened to a guest with 256 MB memory.

Note how the _time_ differs (timings for the other unaffected guests are similar):

# ping -f -c 10000 <unaffected_guest>

10000 packets transmitted, 10000 received, 0% packet loss, time 12313ms
rtt min/avg/max/mdev = 0.432/1.164/96.163/1.934 ms, pipe 7, ipg/ewma 1.231/1.111 ms


# ping -f -c 10000 <affected_guest>

10000 packets transmitted, 10000 received, 0% packet loss, time 135625ms
rtt min/avg/max/mdev = 0.807/14.228/55.569/5.779 ms, pipe 4, ipg/ewma 13.563/8.601 ms


Running "dd if=/dev/vda of=/dev/null" on the affected guest reduces that a bit:

# ping -f -c 10000 <affected_guest>

10000 packets transmitted, 10000 received, 0% packet loss, time 50469ms
rtt min/avg/max/mdev = 0.616/4.881/54.357/3.847 ms, pipe 5, ipg/ewma 5.047/7.783 ms



Anyone? Is it a known bug?


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-09  9:18 ` Tomasz Chmielewski
@ 2009-03-09  9:28   ` Tomasz Chmielewski
  2009-03-19 13:03     ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-09  9:28 UTC (permalink / raw)
  To: kvm

Tomasz Chmielewski wrote:

> I upgraded ~2 days ago to kvm-84 and the same just happened for a guest 
> with 256 MB memory.
> 
> Note how _time_ is different (similar timings are to other unaffected 
> guests):

This is also pretty interesting:

# ping -c 10 <unaffected guest>
PING 192.168.4.4 (192.168.4.4) 56(84) bytes of data.
64 bytes from 192.168.4.4: icmp_seq=1 ttl=64 time=1.25 ms
64 bytes from 192.168.4.4: icmp_seq=2 ttl=64 time=1.58 ms
64 bytes from 192.168.4.4: icmp_seq=3 ttl=64 time=3.53 ms
64 bytes from 192.168.4.4: icmp_seq=4 ttl=64 time=1.43 ms
64 bytes from 192.168.4.4: icmp_seq=5 ttl=64 time=3.89 ms
64 bytes from 192.168.4.4: icmp_seq=6 ttl=64 time=3.43 ms
64 bytes from 192.168.4.4: icmp_seq=7 ttl=64 time=1.03 ms
64 bytes from 192.168.4.4: icmp_seq=8 ttl=64 time=1.36 ms
64 bytes from 192.168.4.4: icmp_seq=9 ttl=64 time=1.28 ms
64 bytes from 192.168.4.4: icmp_seq=10 ttl=64 time=1.78 ms

--- 192.168.4.4 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9091ms
rtt min/avg/max/mdev = 1.031/2.059/3.894/1.045 ms



How probable is it that so many pings return after almost exactly 1000 ms?

# ping -c 10 <affected_guest>
PING 192.168.4.5 (192.168.4.5) 56(84) bytes of data.
64 bytes from 192.168.4.5: icmp_seq=1 ttl=64 time=1009 ms
64 bytes from 192.168.4.5: icmp_seq=2 ttl=64 time=9.61 ms
64 bytes from 192.168.4.5: icmp_seq=3 ttl=64 time=1000 ms
64 bytes from 192.168.4.5: icmp_seq=4 ttl=64 time=1000 ms
64 bytes from 192.168.4.5: icmp_seq=5 ttl=64 time=1000 ms
64 bytes from 192.168.4.5: icmp_seq=6 ttl=64 time=992 ms
64 bytes from 192.168.4.5: icmp_seq=7 ttl=64 time=1000 ms
64 bytes from 192.168.4.5: icmp_seq=8 ttl=64 time=1001 ms
64 bytes from 192.168.4.5: icmp_seq=9 ttl=64 time=1000 ms
64 bytes from 192.168.4.5: icmp_seq=10 ttl=64 time=998 ms

--- 192.168.4.5 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 10025ms
rtt min/avg/max/mdev = 9.610/901.198/1009.161/297.222 ms, pipe 2
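
To put a number on "so many", the round-trip times from the output above can be counted against a window around one second. A minimal Python sketch; the 10 ms tolerance and the helper name `near` are arbitrary choices for illustration, not anything ping itself defines.

```python
# Round-trip times (ms) copied from the affected guest's output above.
rtts_ms = [1009, 9.61, 1000, 1000, 1000, 992, 1000, 1001, 1000, 998]


def near(value, target, tolerance):
    """True if value lies within +/- tolerance of target."""
    return abs(value - target) <= tolerance


close_to_1s = [r for r in rtts_ms if near(r, 1000.0, 10.0)]
print(f"{len(close_to_1s)} of {len(rtts_ms)} replies took about one second")
```

Nine of the ten replies fall inside that window. Since the default ping interval is also one second, this clustering looks less like ordinary network latency and more like replies being held until some once-per-second event releases them; that reading is an interpretation, not something the output proves.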


This one is with "dd if=/dev/vda of=/dev/null" running on the affected guest:

# ping -c 10 <affected_guest>
PING 192.168.4.5 (192.168.4.5) 56(84) bytes of data.
64 bytes from 192.168.4.5: icmp_seq=1 ttl=64 time=29.4 ms
64 bytes from 192.168.4.5: icmp_seq=2 ttl=64 time=4.56 ms
64 bytes from 192.168.4.5: icmp_seq=3 ttl=64 time=4.05 ms
64 bytes from 192.168.4.5: icmp_seq=4 ttl=64 time=4.20 ms
64 bytes from 192.168.4.5: icmp_seq=5 ttl=64 time=3.82 ms
64 bytes from 192.168.4.5: icmp_seq=6 ttl=64 time=2.47 ms
64 bytes from 192.168.4.5: icmp_seq=7 ttl=64 time=2.16 ms
64 bytes from 192.168.4.5: icmp_seq=8 ttl=64 time=3.89 ms
64 bytes from 192.168.4.5: icmp_seq=9 ttl=64 time=5.98 ms
64 bytes from 192.168.4.5: icmp_seq=10 ttl=64 time=9.16 ms

--- 192.168.4.5 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9107ms
rtt min/avg/max/mdev = 2.169/6.978/29.439/7.714 ms



-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-07 15:47 strange guest slowness after some time Tomasz Chmielewski
  2009-03-07 16:41 ` Johannes Baumann
  2009-03-09  9:18 ` Tomasz Chmielewski
@ 2009-03-09  9:55 ` Avi Kivity
  2009-03-09 10:22   ` Tomasz Chmielewski
  2 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-09  9:55 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm


I'm guessing there's a problem with timers or timer interrupts.

What is the host cpu?

Does the problem occur if you pin a guest to a cpu with taskset?

-- 
error compiling committee.c: too many arguments to function




* Re: strange guest slowness after some time
  2009-03-09  9:55 ` Avi Kivity
@ 2009-03-09 10:22   ` Tomasz Chmielewski
  2009-03-09 10:25     ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-09 10:22 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi Kivity wrote:

> I'm guessing there's a problem with timers or timer interrupts.
> 
> What is the host cpu?

4 entries like this in /proc/cpuinfo:

processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 2212
stepping        : 2
cpu MHz         : 2000.000
cache size      : 1024 KB
physical id     : 1
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext 
fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy 
svm extapic cr8_legacy
bogomips        : 3993.03
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc


> Does the problem occur if you pin a guest to a cpu with taskset?

Like this?

# taskset -p 01 22906

(doesn't help)


# taskset -p 02 22906

(doesn't help)


But if I do:

# taskset -p 03 22906

or

# taskset -p 04 22906

it fixes it, but only _rarely_ and only for the first few seconds; then 
it's broken again until I switch the CPUs again (look at pings 9 and 10; 
the other pings are also slow, as unaffected guests are around 1 ms):

# ping -c 10 192.168.113.85
PING 192.168.113.85 (192.168.113.85) 56(84) bytes of data.
64 bytes from 192.168.113.85: icmp_seq=1 ttl=64 time=22.0 ms
64 bytes from 192.168.113.85: icmp_seq=2 ttl=64 time=23.7 ms
64 bytes from 192.168.113.85: icmp_seq=3 ttl=64 time=2.96 ms
64 bytes from 192.168.113.85: icmp_seq=4 ttl=64 time=51.3 ms
64 bytes from 192.168.113.85: icmp_seq=5 ttl=64 time=22.2 ms
64 bytes from 192.168.113.85: icmp_seq=6 ttl=64 time=1.60 ms
64 bytes from 192.168.113.85: icmp_seq=7 ttl=64 time=49.8 ms
64 bytes from 192.168.113.85: icmp_seq=8 ttl=64 time=23.3 ms
64 bytes from 192.168.113.85: icmp_seq=9 ttl=64 time=999 ms
64 bytes from 192.168.113.85: icmp_seq=10 ttl=64 time=822 ms
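
A note on the masks used above: `taskset -p` takes a hexadecimal bitmask, not a CPU number, so `03` selects two CPUs rather than CPU 3. A minimal Python sketch that decodes the four masks from this message; the helper name `cpus_in_mask` is made up for illustration.

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers selected by a taskset-style hex bitmask."""
    mask = int(mask_hex, 16)
    # Bit N set in the mask means CPU N is allowed.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


for mask in ("01", "02", "03", "04"):
    print(f"taskset mask {mask} -> CPUs {cpus_in_mask(mask)}")
```

So `01` and `02` pin the process to CPU 0 and CPU 1 respectively, `03` allows both CPU 0 and CPU 1, and `04` pins it to CPU 2. Whether that matches what was intended in the test above is not clear from the thread.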


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-09 10:22   ` Tomasz Chmielewski
@ 2009-03-09 10:25     ` Avi Kivity
  2009-03-09 10:31       ` Tomasz Chmielewski
  2009-03-15 13:19       ` Tomasz Chmielewski
  0 siblings, 2 replies; 70+ messages in thread
From: Avi Kivity @ 2009-03-09 10:25 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm

Tomasz Chmielewski wrote:
> Avi Kivity wrote:
>
>> I'm guessing there's a problem with timers or timer interrupts.
>>
>> What is the host cpu?
>
> 4 entries like this in /proc/cpuinfo:
>
> processor       : 3
> vendor_id       : AuthenticAMD
> cpu family      : 15
> model           : 65
> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>

That's probably the kvmclock issue that hit older AMDs.  It was fixed in 
kvm-84, please try that.

>> Does the problem occur if you pin a guest to a cpu with taskset?
>
> Like this?
>
> # taskset -p 01 22906
>

I meant 'taskset 01 qemu ...' but it wouldn't have helped if it's kvmclock.

-- 
error compiling committee.c: too many arguments to function



* Re: strange guest slowness after some time
  2009-03-09 10:25     ` Avi Kivity
@ 2009-03-09 10:31       ` Tomasz Chmielewski
  2009-03-09 10:37         ` Avi Kivity
  2009-03-15 13:19       ` Tomasz Chmielewski
  1 sibling, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-09 10:31 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>> Avi Kivity wrote:
>>
>>> I'm guessing there's a problem with timers or timer interrupts.
>>>
>>> What is the host cpu?
>>
>> 4 entries like this in /proc/cpuinfo:
>>
>> processor       : 3
>> vendor_id       : AuthenticAMD
>> cpu family      : 15
>> model           : 65
>> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>>
> 
> That's probably the kvmclock issue that hit older AMDs.  It was fixed in 
> kvm-84, please try that.

It is kvm-84; I have been running it since Saturday (but I had this issue 
with kvm-83 as well).

# dmesg | grep kvm
(...)
loaded kvm module (kvm-84)

# modinfo kvm
filename:       /lib/modules/2.6.24-2-pve/kernel/arch/x86/kvm/kvm.ko
version:        kvm-84

# kvm -h
QEMU PC emulator version 0.9.1 (kvm-84), Copyright (c) 2003-2008 Fabrice 
Bellard


>>> Does the problem occur if you pin a guest to a cpu with taskset?
>>
>> Like this?
>>
>> # taskset -p 01 22906
>>
> 
> I meant 'taskset 01 qemu ...' but it wouldn't have helped if it's kvmclock.

It can be done on a running process as well (22906 is the PID of the 
affected guest).
And the issue is hard to reproduce (shows up after 1-7 days on a random 
guest).


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-09 10:31       ` Tomasz Chmielewski
@ 2009-03-09 10:37         ` Avi Kivity
  2009-03-09 10:54           ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-09 10:37 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm

Tomasz Chmielewski wrote:
> Avi Kivity wrote:
>> Tomasz Chmielewski wrote:
>>> Avi Kivity wrote:
>>>
>>>> I'm guessing there's a problem with timers or timer interrupts.
>>>>
>>>> What is the host cpu?
>>>
>>> 4 entries like this in /proc/cpuinfo:
>>>
>>> processor       : 3
>>> vendor_id       : AuthenticAMD
>>> cpu family      : 15
>>> model           : 65
>>> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>>>
>>
>> That's probably the kvmclock issue that hit older AMDs.  It was fixed 
>> in kvm-84, please try that.
>
> It is kvm-84, I have it running since Saturday (but I had this issue 
> with kvm-83 as well).
>

And the problem continues?

What's your current clocksource (in the guest)?  Does changing it help?

See /sys/devices/system/clocksource/clocksource0/*.
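
For reference, that directory holds two files of interest, `current_clocksource` and `available_clocksource`. A minimal Python sketch that reads both inside the guest; the function name `read_clocksources` and its `base` parameter are made up for illustration (the parameter only exists so the function can be pointed at another directory).

```python
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current clocksource, list of available clocksources)."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    # available_clocksource is a single space-separated line.
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__":
    if CLOCKSOURCE_DIR.is_dir():
        current, available = read_clocksources()
        print("current:  ", current)
        print("available:", ", ".join(available))
    else:
        print("no clocksource sysfs directory found on this system")
```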

>>
>> I meant 'taskset 01 qemu ...' but it wouldn't have helped if it's 
>> kvmclock.
>
> It can be done on a running process as well (22906 is the PID of the 
> affected gue

Right, but if the guest is poisoned somehow, this won't help.

-- 
error compiling committee.c: too many arguments to function



* Re: strange guest slowness after some time
  2009-03-09 10:37         ` Avi Kivity
@ 2009-03-09 10:54           ` Tomasz Chmielewski
  2009-03-09 11:37             ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-09 10:54 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi Kivity wrote:

>>>>> I'm guessing there's a problem with timers or timer interrupts.
>>>>>
>>>>> What is the host cpu?
>>>>
>>>> 4 entries like this in /proc/cpuinfo:
>>>>
>>>> processor       : 3
>>>> vendor_id       : AuthenticAMD
>>>> cpu family      : 15
>>>> model           : 65
>>>> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>>>>
>>>
>>> That's probably the kvmclock issue that hit older AMDs.  It was fixed 
>>> in kvm-84, please try that.
>>
>> It is kvm-84, I have it running since Saturday (but I had this issue 
>> with kvm-83 as well).
>>
> 
> And the problem continues?
> 
> What's your current clocksource (in the guest)?  Does changing it help?
> 
> See /sys/devices/system/clocksource/clocksource0/*.

It was kvm-clock.
I tried changing it to acpi_pm, jiffies, tsc, but it made no difference.


>>> I meant 'taskset 01 qemu ...' but it wouldn't have helped if it's 
>>> kvmclock.
>>
>> It can be done on a running process as well (22906 is the PID of the 
>> affected gue
> 
> Right, but if the guest is poisoned somehow, this won't help.

Yep, it seems poisoned.
I'll start the guest again in the evening and add an e1000 card to it.

If the problem reappears, it would be good to see whether it affects only 
the virtio card (I've never seen this issue on a guest which doesn't 
use virtio drivers - so far at least).


-- 
Tomasz Chmielewski
http://wpkg.org



* Re: strange guest slowness after some time
  2009-03-09 10:54           ` Tomasz Chmielewski
@ 2009-03-09 11:37             ` Tomasz Chmielewski
  2009-03-09 12:14               ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-09 11:37 UTC (permalink / raw)
  To: Avi Kivity, kvm

Tomasz Chmielewski wrote:
> Avi Kivity wrote:
> 
>>>>>> I'm guessing there's a problem with timers or timer interrupts.
>>>>>>
>>>>>> What is the host cpu?
>>>>>
>>>>> 4 entries like this in /proc/cpuinfo:
>>>>>
>>>>> processor       : 3
>>>>> vendor_id       : AuthenticAMD
>>>>> cpu family      : 15
>>>>> model           : 65
>>>>> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>>>>>
>>>>
>>>> That's probably the kvmclock issue that hit older AMDs.  It was 
>>>> fixed in kvm-84, please try that.
>>>
>>> It is kvm-84, I have it running since Saturday (but I had this issue 
>>> with kvm-83 as well).
>>>
>>
>> And the problem continues?
>>
>> What's your current clocksource (in the guest)?  Does changing it help?
>>
>> See /sys/devices/system/clocksource/clocksource0/*.
> 
> It was kvm-clock.
> I tried changing it to acpi_pm, jiffies, tsc, but it made no difference.

Actually, I don't think I checked tsc, because when I changed to jiffies, time stopped:

# echo jiffies > /sys/devices/system/clocksource/clocksource0/current_clocksource
# date
Mon Mar  9 12:29:00 CET 2009
# date
Mon Mar  9 12:29:00 CET 2009
# date
Mon Mar  9 12:29:00 CET 2009
# date
Mon Mar  9 12:29:00 CET 2009
# date
Mon Mar  9 12:29:00 CET 2009

And I couldn't change to anything else any more:

# echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
jiffies
# echo kvm-clock > /sys/devices/system/clocksource/clocksource0/current_clocksource
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
jiffies
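
The transcript above shows why the write has to be read back: `echo` reports no error, yet the clocksource stays at `jiffies`. A minimal Python sketch of the same write-then-verify check, assuming it runs as root inside the guest; the function name `switch_clocksource` and its `base` parameter are made up for illustration.

```python
from pathlib import Path


def switch_clocksource(name, base="/sys/devices/system/clocksource/clocksource0"):
    """Request a clocksource, then read back what the kernel reports.

    Returns (switched, actual): switched is True only if the read-back
    value matches the requested name.
    """
    node = Path(base) / "current_clocksource"
    node.write_text(name + "\n")
    actual = node.read_text().strip()
    return actual == name, actual
```

A call such as `switch_clocksource("tsc")` would return `(False, "jiffies")` in the situation shown above, which makes the silent failure visible in a script.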


So I had to kill the guest and start it again (the above was reproduced on another,
"non-poisoned" guest).


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-09 11:37             ` Tomasz Chmielewski
@ 2009-03-09 12:14               ` Avi Kivity
  2009-03-09 12:52                 ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-09 12:14 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm

Tomasz Chmielewski wrote:
>>
>> It was kvm-clock.
>> I tried changing it to acpi_pm, jiffies, tsc, but it made no difference.
>
> Actually, I don't think that I checked tsc, because when I changed to 
> jiffies, the time has stopped:
>
> # echo jiffies > 
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> # date
> Mon Mar  9 12:29:00 CET 2009
> # date
> Mon Mar  9 12:29:00 CET 2009
> # date
> Mon Mar  9 12:29:00 CET 2009

Can you post some /proc/interrupts dumps from the guest?  I guess the 
timer interrupt isn't working.

Does -no-kvm-irqchip help?

-- 
error compiling committee.c: too many arguments to function



* Re: strange guest slowness after some time
  2009-03-09 12:14               ` Avi Kivity
@ 2009-03-09 12:52                 ` Tomasz Chmielewski
  2009-03-15 15:41                   ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-09 12:52 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>>>
>>> It was kvm-clock.
>>> I tried changing it to acpi_pm, jiffies, tsc, but it made no difference.
>>
>> Actually, I don't think that I checked tsc, because when I changed to 
>> jiffies, the time has stopped:
>>
>> # echo jiffies > 
>> /sys/devices/system/clocksource/clocksource0/current_clocksource
>> # date
>> Mon Mar  9 12:29:00 CET 2009
>> # date
>> Mon Mar  9 12:29:00 CET 2009
>> # date
>> Mon Mar  9 12:29:00 CET 2009
> 
> can you post some /proc/interrupt dumps from the guest?  I guess the 
> timer interrupt isn't working.

I suppose we're touching on a different issue here than my original one 
("guest slowness").

But interrupts are still arriving after I set the clocksource to 
"jiffies" (setting it to "jiffies" also kills my serial console connection 
- no key presses go through to the guest any more):

# cat /proc/interrupts
            CPU0
   0:        104   IO-APIC-edge      timer
   1:          6   IO-APIC-edge      i8042
   4:        480   IO-APIC-edge      serial
   6:          2   IO-APIC-edge      floppy
   7:          0   IO-APIC-edge      parport0
   8:          2   IO-APIC-edge      rtc0
   9:          0   IO-APIC-fasteoi   acpi
  10:       4400   IO-APIC-fasteoi   virtio0, virtio2, virtio4
  11:       1550   IO-APIC-fasteoi   uhci_hcd:usb1, virtio1, virtio3
  12:         89   IO-APIC-edge      i8042
  14:          0   IO-APIC-edge      ide0
  15:         30   IO-APIC-edge      ide1
NMI:          0   Non-maskable interrupts
LOC:      85231   Local timer interrupts
RES:          0   Rescheduling interrupts
CAL:          0   function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0
MIS:          0

# cat /proc/interrupts
            CPU0
   0:        104   IO-APIC-edge      timer
   1:          6   IO-APIC-edge      i8042
   4:        486   IO-APIC-edge      serial
   6:          2   IO-APIC-edge      floppy
   7:          0   IO-APIC-edge      parport0
   8:          2   IO-APIC-edge      rtc0
   9:          0   IO-APIC-fasteoi   acpi
  10:       4461   IO-APIC-fasteoi   virtio0, virtio2, virtio4
  11:       1590   IO-APIC-fasteoi   uhci_hcd:usb1, virtio1, virtio3
  12:         89   IO-APIC-edge      i8042
  14:          0   IO-APIC-edge      ide0
  15:         30   IO-APIC-edge      ide1
NMI:          0   Non-maskable interrupts
LOC:     108361   Local timer interrupts
RES:          0   Rescheduling interrupts
CAL:          0   function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0
MIS:          0
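
The two dumps above are easier to compare as a difference. A minimal Python sketch that parses a few rows copied from each dump and prints how much every counter grew; the parser name `parse_interrupts` is made up for illustration, and it assumes a single-CPU guest (only the first count column is read).

```python
def parse_interrupts(text):
    """Map each IRQ label to its count in the first CPU column."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip the "CPU0" header (no colon) and anything without a count.
        if sep and fields and fields[0].isdigit():
            counts[label.strip()] = int(fields[0])
    return counts


# A subset of rows copied from the first and second dump above.
FIRST = """\
           CPU0
  0:        104   IO-APIC-edge      timer
  4:        480   IO-APIC-edge      serial
 10:       4400   IO-APIC-fasteoi   virtio0, virtio2, virtio4
 11:       1550   IO-APIC-fasteoi   uhci_hcd:usb1, virtio1, virtio3
LOC:      85231   Local timer interrupts
"""

SECOND = """\
           CPU0
  0:        104   IO-APIC-edge      timer
  4:        486   IO-APIC-edge      serial
 10:       4461   IO-APIC-fasteoi   virtio0, virtio2, virtio4
 11:       1590   IO-APIC-fasteoi   uhci_hcd:usb1, virtio1, virtio3
LOC:     108361   Local timer interrupts
"""

before = parse_interrupts(FIRST)
after = parse_interrupts(SECOND)
delta = {irq: after[irq] - before[irq] for irq in before}

for irq, grew in delta.items():
    print(f"{irq:>3}: +{grew}")
```

IRQ 0 (the `timer` line) does not move between the two dumps, while the local APIC timer (`LOC`) and the virtio interrupts keep counting. That is consistent with the guess that the timer interrupt on IRQ 0 is not firing, though two samples alone do not prove it.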



> Does -no-kvm-irqchip help?

Nope, it doesn't - with jiffies, time always stops.


-- 
Tomasz Chmielewski
http://wpkg.org



* Re: strange guest slowness after some time
  2009-03-09 10:25     ` Avi Kivity
  2009-03-09 10:31       ` Tomasz Chmielewski
@ 2009-03-15 13:19       ` Tomasz Chmielewski
  2009-03-17 10:47         ` Tomasz Chmielewski
  1 sibling, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-15 13:19 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi Kivity wrote:

>>> I'm guessing there's a problem with timers or timer interrupts.
>>>
>>> What is the host cpu?
>>
>> 4 entries like this in /proc/cpuinfo:
>>
>> processor       : 3
>> vendor_id       : AuthenticAMD
>> cpu family      : 15
>> model           : 65
>> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>>
> 
> That's probably the kvmclock issue that hit older AMDs.  It was fixed in 
> kvm-84, please try that.

I've been running it for about a week now with kvm-84 and no guest got slow.

Can it be related to using cpufreq and ondemand governor?

1) with kvm-83 and cpufreq/ondemand, guests go totally crazy (see 
"Houston, we have May 15, 1953" thread)

2) with kvm-83 without cpufreq, "slowness" affects guests sometimes

3) with kvm-84 and cpufreq/ondemand, "slowness" affects guests sometimes

4) with kvm-84 without cpufreq, everything runs correctly (at least it 
has for a week now)


Does any of this make sense? I would really like to use 
cpufreq/ondemand on the host with KVM, as my tests show it would save me 
about 50 EUR per server per year in electricity.


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-03-09 12:52                 ` Tomasz Chmielewski
@ 2009-03-15 15:41                   ` Avi Kivity
  2009-03-15 16:14                     ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-15 15:41 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm

Tomasz Chmielewski wrote:
>
>> Does -no-kvm-irqchip help?
>
> Nope, it doesn't - with jiffies, time always stops.
>
>


Here, too.  This is strange.  On bare metal it works as expected.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-15 15:41                   ` Avi Kivity
@ 2009-03-15 16:14                     ` Avi Kivity
  0 siblings, 0 replies; 70+ messages in thread
From: Avi Kivity @ 2009-03-15 16:14 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm

Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>>
>>> Does -no-kvm-irqchip help?
>>
>> Nope, it doesn't - with jiffies, time always stops.
>>
>>
>
>
> Here, too.  This is strange.  On bare metal it works as expected.
>

I think it's unrelated.  The PIT is programmed in one-shot mode (likely 
for the scheduler) and doesn't return to periodic mode.  Pity the kernel 
doesn't warn about this.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-15 13:19       ` Tomasz Chmielewski
@ 2009-03-17 10:47         ` Tomasz Chmielewski
  2009-03-17 11:16           ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 10:47 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Tomasz Chmielewski schrieb:
> Avi Kivity schrieb:
> 
>>>> I'm guessing there's a problem with timers or timer interrupts.
>>>>
>>>> What is the host cpu?
>>>
>>> 4 entries like this in /proc/cpuinfo:
>>>
>>> processor       : 3
>>> vendor_id       : AuthenticAMD
>>> cpu family      : 15
>>> model           : 65
>>> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>>>
>>
>> That's probably the kvmclock issue that hit older AMDs.  It was fixed 
>> in kvm-84, please try that.
> 
> I've been running it for about a week now with kvm-84 and no guest got 
> slow.
> 
> Can it be related to using cpufreq and ondemand governor?

Something fishy here :(

After a week or so, network in one guest got slow with kvm-84 and no 
cpufreq.


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 10:47         ` Tomasz Chmielewski
@ 2009-03-17 11:16           ` Avi Kivity
  2009-03-17 11:25             ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-17 11:16 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm

Tomasz Chmielewski wrote:
>
> After a week or so, network in one guest got slow with kvm-84 and no 
> cpufreq.
>

This is virtio, right?  What about e1000?

(I realize it takes a week to reproduce, but maybe you have some more 
experience)


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 11:16           ` Avi Kivity
@ 2009-03-17 11:25             ` Tomasz Chmielewski
  2009-03-17 15:32               ` Felix Leimbach
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 11:25 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi Kivity schrieb:
> Tomasz Chmielewski wrote:
>>
>> After a week or so, network in one guest got slow with kvm-84 and no 
>> cpufreq.
>>
> 
> This is virtio, right?  What about e1000?
> 
> (I realize it takes a week to reproduce, but maybe you have some more 
> experience)

Yes, all affected had virtio. Probably because I didn't have many guests 
with e1000 interface.

After a guest gets slow, I stop it and add another interface, e1000.


If it gets slow again, I'll check if e1000 interface is slow as well.

Will keep you updated.


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 11:25             ` Tomasz Chmielewski
@ 2009-03-17 15:32               ` Felix Leimbach
  2009-03-17 15:43                 ` Tomasz Chmielewski
                                   ` (2 more replies)
  0 siblings, 3 replies; 70+ messages in thread
From: Felix Leimbach @ 2009-03-17 15:32 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: avi, kvm

Tomasz Chmielewski wrote:
> Avi Kivity schrieb:
>> Tomasz Chmielewski wrote:
>>> After a week or so, network in one guest got slow with kvm-84 and no 
>>> cpufreq.
>> This is virtio, right?  What about e1000?
>>
>> (I realize it takes a week to reproduce, but maybe you have some more 
>> experience)
>
> Yes, all affected had virtio. Probably because I didn't have many 
> guests with e1000 interface.
>
> After a guest gets slow, I stop it and add another interface, e1000.
>
>
> If it gets slow again, I'll check if e1000 interface is slow as well.
>
> Will keep you updated.

I see similar behavior: after a week, the network of one of my guests 
stops responding completely. Only guests using virtio networking get hit. 
Both Windows and Linux guests are affected.
My guests in production use e1000 and have never been hit.
While that could be a coincidence, it seems very unlikely: out of 3 
virtio guests, 2 have been hit, one of them repeatedly.
Out of 3 e1000 guests, none has ever been hit.

Observed with kvm-83 and kvm-84 with the host running in-kernel KVM code 
(linux 2.6.25.7)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 15:32               ` Felix Leimbach
@ 2009-03-17 15:43                 ` Tomasz Chmielewski
  2009-03-17 17:01                   ` Felix Leimbach
  2009-03-17 15:52                 ` Avi Kivity
  2009-03-17 16:27                 ` Tomasz Chmielewski
  2 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 15:43 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: avi, kvm

Felix Leimbach schrieb:

>> If it gets slow again, I'll check if e1000 interface is slow as well.
>>
>> Will keep you updated.
> I see similar behavior: After a week one of my guests' network totally 
> stops to respond. Only guests using virtio networking get hit. Both 
> windows and linux guests are affected.
> My guests in production use e1000 and have never been hit.
> While that can be a coincidence it seems very unlikely: Out of 3 virtio 
> guests 2 have been hit, one repeatedly.
> Out of 3 e1000 guests none has ever been hit.
> 
> Observed with kvm-83 and kvm-84 with the host running in-kernel KVM code 
> (linux 2.6.25.7)

Could you add an (unused) e1000 interface to your virtio guests?
As this issue happens rarely for me, maybe you could help to reproduce 
it as well (i.e. if the network gets slow on the virtio interface, give 
the e1000 interface an IP address and check whether the network is also 
slow on e1000 on the very same guest).

BTW, what CPU do you have?


-- 
Tomasz Chmielewski
http://wpkg.org


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 15:32               ` Felix Leimbach
  2009-03-17 15:43                 ` Tomasz Chmielewski
@ 2009-03-17 15:52                 ` Avi Kivity
  2009-03-17 16:12                   ` Tomasz Chmielewski
  2009-03-17 16:27                 ` Tomasz Chmielewski
  2 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-17 15:52 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Tomasz Chmielewski, kvm

Felix Leimbach wrote:
> I see similar behavior: After a week one of my guests' network totally 
> stops to respond. Only guests using virtio networking get hit. Both 
> windows and linux guests are affected.
> My guests in production use e1000 and have never been hit.
> While that can be a coincidence it seems very unlikely: Out of 3 
> virtio guests 2 have been hit, one repeatedly.
> Out of 3 e1000 guests none has ever been hit.
>
> Observed with kvm-83 and kvm-84 with the host running in-kernel KVM 
> code (linux 2.6.25.7)

Might it be that some counter overflowed?  What are the packet counts on 
long running guests?

(output of ifconfig, even on an unaffected e1000 guest, might help)

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 15:52                 ` Avi Kivity
@ 2009-03-17 16:12                   ` Tomasz Chmielewski
  2009-03-17 17:05                     ` Felix Leimbach
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 16:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Felix Leimbach, kvm

Avi Kivity schrieb:
> Felix Leimbach wrote:
>> I see similar behavior: After a week one of my guests' network totally 
>> stops to respond. Only guests using virtio networking get hit. Both 
>> windows and linux guests are affected.
>> My guests in production use e1000 and have never been hit.
>> While that can be a coincidence it seems very unlikely: Out of 3 
>> virtio guests 2 have been hit, one repeatedly.
>> Out of 3 e1000 guests none has ever been hit.
>>
>> Observed with kvm-83 and kvm-84 with the host running in-kernel KVM 
>> code (linux 2.6.25.7)
> 
> Might it be that some counter overflowed?  What are the packet counts on 
> long running guests?

I don't think so.

I just made both counters (TX, RX) of ifconfig for virtio interfaces 
overflow several times and everything is still as fast as it should be.

> (output of ifconfig, even on an unaffected e1000 guest, might help)


-- 
Tomasz Chmielewski
http://wpkg.org


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 15:32               ` Felix Leimbach
  2009-03-17 15:43                 ` Tomasz Chmielewski
  2009-03-17 15:52                 ` Avi Kivity
@ 2009-03-17 16:27                 ` Tomasz Chmielewski
  2009-03-17 17:14                   ` Felix Leimbach
  2 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 16:27 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: avi, kvm

Felix Leimbach schrieb:

>> Yes, all affected had virtio. Probably because I didn't have many 
>> guests with e1000 interface.
>>
>> After a guest gets slow, I stop it and add another interface, e1000.
>>
>>
>> If it gets slow again, I'll check if e1000 interface is slow as well.
>>
>> Will keep you updated.
> I see similar behavior: After a week one of my guests' network totally 
> stops to respond. Only guests using virtio networking get hit. Both 
> windows and linux guests are affected.

Also, does rebooting the guest help for you (for me, it doesn't)?

Or do you have to halt the guest and start it again (i.e. stop the 
kvm/qemu process and start a new one) to make the network work properly 
again?


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 15:43                 ` Tomasz Chmielewski
@ 2009-03-17 17:01                   ` Felix Leimbach
  2009-03-17 17:05                     ` Avi Kivity
                                       ` (2 more replies)
  0 siblings, 3 replies; 70+ messages in thread
From: Felix Leimbach @ 2009-03-17 17:01 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: avi, kvm

Tomasz Chmielewski wrote:
> Felix Leimbach schrieb:
>> Out of 3 e1000 guests none has ever been hit.
>>
>> Observed with kvm-83 and kvm-84 with the host running in-kernel KVM 
>> code (linux 2.6.25.7)
> Could you add a (unused) e1000 interface to your virtio guests?
> As this issue happens rarely for me, maybe you could help to reproduce 
> it as well (i.e. if network gets slow on virtio interface, give e1000 
> a IP address, and try if network is also slow on e1000 on the very 
> same guest).
Will do and report
>
> BTW, what CPU do you have?
One dual core Opteron 2212
Note: I will upgrade to two Shanghai Quad-Cores in 2 weeks and test with 
those as well.

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 2212
stepping        : 2
cpu MHz         : 1994.996
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good nopl pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips        : 3990.06
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:01                   ` Felix Leimbach
@ 2009-03-17 17:05                     ` Avi Kivity
  2009-03-17 18:49                       ` Felix Leimbach
  2009-03-17 17:38                     ` Tomasz Chmielewski
  2009-03-31  8:50                     ` Tomasz Chmielewski
  2 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-17 17:05 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Tomasz Chmielewski, kvm

Felix Leimbach wrote:
>>
>> BTW, what CPU do you have?
> One dual core Opteron 2212

Does idle=poll help things?  It can cause tsc breakage similar to cpufreq.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 16:12                   ` Tomasz Chmielewski
@ 2009-03-17 17:05                     ` Felix Leimbach
  2009-03-17 17:10                       ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Leimbach @ 2009-03-17 17:05 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: Avi Kivity, kvm

Tomasz Chmielewski wrote:
> Avi Kivity schrieb:
>> Might it be that some counter overflowed?  What are the packet counts 
>> on long running guests?
> I don't think so.
>
> I just made both counters (TX, RX) of ifconfig for virtio interfaces 
> overflow several times and everything is still as fast as it should be.
I had overflows on the counters as well (32 bit guests) without a problem.
Here is the current ifconfig output of a machine which suffered the 
problem before:

eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
          inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
          TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:682285568 (650.6 MiB)  TX bytes:2907586796 (2.7 GiB)

>> (output of ifconfig, even on an unaffected e1000 guest, might help)
Currently I have e1000 only on Windows guests. Is there a way to gather 
relevant statistics there too?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:05                     ` Felix Leimbach
@ 2009-03-17 17:10                       ` Avi Kivity
  2009-03-17 17:43                         ` Tomasz Chmielewski
                                           ` (2 more replies)
  0 siblings, 3 replies; 70+ messages in thread
From: Avi Kivity @ 2009-03-17 17:10 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Tomasz Chmielewski, kvm

Felix Leimbach wrote:
> Tomasz Chmielewski wrote:
>> Avi Kivity schrieb:
>>> Might it be that some counter overflowed?  What are the packet 
>>> counts on long running guests?
>> I don't think so.
>>
>> I just made both counters (TX, RX) of ifconfig for virtio interfaces 
>> overflow several times and everything is still as fast as it should be.
> I had overflows on the counters as well (32 bit guests) without a 
> problem.
> Here is the current ifconfig output of a machine which suffered the 
> problem before:
>
> eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
>          inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:682285568 (650.6 MiB)  TX bytes:2907586796 (2.7 GiB)

packet counters are well within 32-bit limits.  byte counters are not so 
interesting.
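
As a rough check of the headroom, here is a minimal Python sketch that pulls the packet counts out of the ifconfig text quoted above and compares them with the 32-bit wrap point. It assumes the counters are unsigned 32-bit values, which is a guess based on the guests being 32-bit; the helper name packet_counts is made up for illustration.

```python
import re

# Packet lines copied from the ifconfig output quoted above.
IFCONFIG = """\
RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
"""

WRAP = 2 ** 32  # assumption: unsigned 32-bit packet counters


def packet_counts(text):
    """Return {'RX': n, 'TX': n} parsed from old-style ifconfig output."""
    return {d: int(n) for d, n in re.findall(r"(RX|TX) packets:(\d+)", text)}


counts = packet_counts(IFCONFIG)
for direction, n in counts.items():
    # Share of the 32-bit range already consumed by this counter.
    print(f"{direction}: {n} packets, {100 * n / WRAP:.3f}% of the 32-bit range")
```

Both counters come out at well under one percent of the range, so a packet-counter wrap cannot have happened on this guest since it was started.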

> currently I have e1000 only on windows guests. Is there a way to 
> gather relevant statistics there too?

Sure, right-click on the adapter icon, it's there somewhere.

Do you experience the slowdown on Windows guests?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 16:27                 ` Tomasz Chmielewski
@ 2009-03-17 17:14                   ` Felix Leimbach
  2009-03-17 17:19                     ` Avi Kivity
  2009-03-17 17:34                     ` Tomasz Chmielewski
  0 siblings, 2 replies; 70+ messages in thread
From: Felix Leimbach @ 2009-03-17 17:14 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: avi, kvm

Tomasz Chmielewski wrote:
>
> Felix Leimbach schrieb:
>> I see similar behavior: After a week one of my guests' network 
>> totally stops to respond. Only guests using virtio networking get 
>> hit. Both windows and linux guests are affected.
>
> Also, does guest reboot help for you (for me, it doesn't)?
>
> Or, you have to halt the guest and start it again (i.e. stop kvm/qemu 
> process and start a new one) to make the network working properly again?
I have not tried rebooting; always stopped and restarted the qemu 
instance. Will try on the next occasion.

Earlier I wrote that I had observed this on both kvm-83 and kvm-84, but 
the kvm-84 part was wrong: since the upgrade 4 days ago I have not yet 
had a hang.
I noticed that you, Tomasz, are also running kvm-83. Maybe kvm-84 has 
already fixed the issue?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:14                   ` Felix Leimbach
@ 2009-03-17 17:19                     ` Avi Kivity
  2009-03-17 17:34                     ` Tomasz Chmielewski
  1 sibling, 0 replies; 70+ messages in thread
From: Avi Kivity @ 2009-03-17 17:19 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Tomasz Chmielewski, kvm

Felix Leimbach wrote:
> I noticed that you Tomasz are also running kvm-83. Maybe kvm-84 fixed 
> the issue already?

kvm-84 fixes a serious problem with kvmclock on AMDs, but does not fix 
the problem with c1e, so it may not have fixed the problem completely.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:14                   ` Felix Leimbach
  2009-03-17 17:19                     ` Avi Kivity
@ 2009-03-17 17:34                     ` Tomasz Chmielewski
  1 sibling, 0 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 17:34 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: avi, kvm

Felix Leimbach schrieb:

> I have not tried rebooting; always stopped and restarted the qemu 
> instance. Will try on the next occasion.
> 
> Before I wrote that I tested on kvm-83 and 84 but it turns out the 
> kvm-84 part was wrong: Since the upgrade 4 days ago I have not yet had a 
> hang.
> I noticed that you Tomasz are also running kvm-83. Maybe kvm-84 fixed 
> the issue already?

No, I run kvm-84.
With kvm-83 I had this issue much more frequently. With kvm-84, it seems 
less frequent. Or maybe that's just what I'd like to believe ;)


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:01                   ` Felix Leimbach
  2009-03-17 17:05                     ` Avi Kivity
@ 2009-03-17 17:38                     ` Tomasz Chmielewski
  2009-06-08 11:02                       ` Felix Leimbach
  2009-03-31  8:50                     ` Tomasz Chmielewski
  2 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 17:38 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: avi, kvm

Felix Leimbach schrieb:

>> BTW, what CPU do you have?
> One dual core Opteron 2212
> Note: I will upgrade to two Shanghai Quad-Cores in 2 weeks and test with 
> those as well.
> 
> processor       : 1
> vendor_id       : AuthenticAMD
> cpu family      : 15
> model           : 65
> model name      : Dual-Core AMD Opteron(tm) Processor 2212
> stepping        : 2
> cpu MHz         : 1994.996
> cache size      : 1024 KB

It's exactly the same CPU I have.

Almost. Mine is 5.004 MHz faster ;)

model name      : Dual-Core AMD Opteron(tm) Processor 2212
stepping        : 2
cpu MHz         : 2000.000



-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:10                       ` Avi Kivity
@ 2009-03-17 17:43                         ` Tomasz Chmielewski
  2009-03-17 18:55                           ` Tomasz Chmielewski
  2009-03-17 18:57                         ` Felix Leimbach
  2009-03-18  5:54                         ` Felix Leimbach
  2 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 17:43 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Felix Leimbach, kvm

Avi Kivity schrieb:
> Felix Leimbach wrote:
>> Tomasz Chmielewski wrote:
>>> Avi Kivity schrieb:
>>>> Might it be that some counter overflowed?  What are the packet 
>>>> counts on long running guests?

>> Here is the current ifconfig output of a machine which suffered the 
>> problem before:
>>
>> eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
>>          inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:682285568 (650.6 MiB)  TX bytes:2907586796 (2.7 GiB)
> 
> packet counters are well within 32-bit limits.  byte counters not so 
> interesting.

Ah OK.
I only overflowed the byte counters.

Packet overflow will take much longer. It's one of those very rare cases 
where setting a very small MTU is useful...
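
A back-of-the-envelope sketch of why a small MTU helps here. It assumes an unsigned 32-bit packet counter, packets that are all MTU-sized, and a purely hypothetical sustained rate of 100 Mbit/s; none of these figures are measurements from this thread, and real throughput with a tiny MTU will probably be lower.

```python
WRAP = 2 ** 32  # assumption: unsigned 32-bit packet counter


def hours_to_wrap(mtu_bytes, rate_bits_per_s):
    """Hours until WRAP packets have been sent, if every packet is MTU-sized."""
    packets_per_s = rate_bits_per_s / (mtu_bytes * 8)
    return WRAP / packets_per_s / 3600


RATE = 100e6  # hypothetical sustained throughput, in bits per second
for mtu in (1500, 100):
    print(f"MTU {mtu}: about {hours_to_wrap(mtu, RATE):.1f} hours to wrap")
```

At a fixed bit rate the time to wrap scales linearly with the MTU, so going from 1500 to 100 shortens the wait by a factor of 15.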


-- 
Tomasz Chmielewski
http://wpkg.org


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:05                     ` Avi Kivity
@ 2009-03-17 18:49                       ` Felix Leimbach
  2009-03-18  6:36                         ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Leimbach @ 2009-03-17 18:49 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm

Avi Kivity wrote:
> Does idle=poll help things?  It can cause tsc breakage similar to 
> cpufreq.
On the host, right? I can't test that, as I cannot reboot the server.
Is tsc breakage still something to watch out for after I've upgraded to 
the Shanghai quad-cores?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:43                         ` Tomasz Chmielewski
@ 2009-03-17 18:55                           ` Tomasz Chmielewski
  2009-03-17 19:04                             ` Felix Leimbach
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 18:55 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Felix Leimbach, kvm

Tomasz Chmielewski schrieb:
> Avi Kivity schrieb:
>> Felix Leimbach wrote:
>>> Tomasz Chmielewski wrote:
>>>> Avi Kivity schrieb:
>>>>> Might it be that some counter overflowed?  What are the packet 
>>>>> counts on long running guests?
> 
>>> Here is the current ifconfig output of a machine which suffered the 
>>> problem before:
>>>
>>> eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
>>>          inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
>>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>          RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
>>>          TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
>>>          collisions:0 txqueuelen:1000
>>>          RX bytes:682285568 (650.6 MiB)  TX bytes:2907586796 (2.7 GiB)
>>
>> packet counters are well within 32-bit limits.  byte counters not so 
>> interesting.
> 
> Ah OK.
> I did only byte overflow.
> 
> Packet overflow will take much longer. It's one of these very rare cases 
> where setting very small MTU is useful...

OK, another bug found.

Set your MTU to 100.

On two hosts, do:

HOST1_MTU1500# dd if=/dev/zero | ssh manager@HOST2 dd of=/dev/null
HOST2_MTU100# dd if=/dev/zero | ssh manager@HOST1 dd of=/dev/null

HOST2 with MTU 100 will crash after 10-15 minutes (with the packet count 
still not overflowed).


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:10                       ` Avi Kivity
  2009-03-17 17:43                         ` Tomasz Chmielewski
@ 2009-03-17 18:57                         ` Felix Leimbach
  2009-03-18  5:54                         ` Felix Leimbach
  2 siblings, 0 replies; 70+ messages in thread
From: Felix Leimbach @ 2009-03-17 18:57 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm

Avi Kivity wrote:
> Felix Leimbach wrote:
>> eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
>>          inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:682285568 (650.6 MiB)  TX bytes:2907586796 (2.7 GiB)
>
> packet counters are well within 32-bit limits.  byte counters not so 
> interesting.
ah right, I checked the byte counters only.
Testing packet counter overflow now (takes a while).

> Do you experience the slowdown on Windows guests?
Both Linux and Windows Server 2003. All 32-bit.
But for me it is not a slowdown but a complete loss of network in the 
guest: it can't be pinged anymore. There might be a slowdown period 
before that, though; I've heard hints in that direction from users.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 18:55                           ` Tomasz Chmielewski
@ 2009-03-17 19:04                             ` Felix Leimbach
  2009-03-17 19:24                               ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Leimbach @ 2009-03-17 19:04 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: Avi Kivity, kvm

Tomasz Chmielewski wrote:
> Tomasz Chmielewski schrieb:
>> Avi Kivity schrieb:
>>> packet counters are well within 32-bit limits.  byte counters not so 
>>> interesting.
>>
>> Ah OK.
>> I did only byte overflow.
>>
>> Packet overflow will take much longer. It's one of these very rare 
>> cases where setting very small MTU is useful...
>
> OK, another bug found.
>
> Set your MTU to 100.
>
> On two hosts, do:
>
> HOST1_MTU1500# dd if=/dev/zero | ssh manager@HOST2 dd of=/dev/null
> HOST2_MTU100# dd if=/dev/zero | ssh manager@HOST1 dd of=/dev/null
>
> HOST2 with MTU 100 will crash after 10-15 minutes (with packet count 
> still not overflown).
>
Interesting. What are the packet counters at crash time (roughly)?

My - currently running - test is:

Guest 1 (Linux):
MTU 150
# cat /dev/zero | nc <guest2ip> 7777

Guest 2 (Windows 2003 Server):
MTU: 1500
# nc -l -p 7777 > NUL

My packet counters are currently at 63 million without a problem - yet.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 19:04                             ` Felix Leimbach
@ 2009-03-17 19:24                               ` Tomasz Chmielewski
  2009-03-17 20:14                                 ` Tomasz Chmielewski
  2009-03-18  6:29                                 ` Avi Kivity
  0 siblings, 2 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 19:24 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Avi Kivity, kvm

Felix Leimbach schrieb:

>> OK, another bug found.
>>
>> Set your MTU to 100.
>>
>> On two hosts, do:
>>
>> HOST1_MTU1500# dd if=/dev/zero | ssh manager@HOST2 dd of=/dev/null
>> HOST2_MTU100# dd if=/dev/zero | ssh manager@HOST1 dd of=/dev/null
>>
>> HOST2 with MTU 100 will crash after 10-15 minutes (with packet count 
>> still not overflown).
>>
> Interesting. What are the packet counters at crash time (roughly)?
> 
> My - currently running - test is:
> 
> Guest 1 (Linux):
> MTU 150
> # cat /dev/zero | nc <guest2ip> 7777
> 
> Guest 2 (Windows 2003 Server):
> MTU: 1500
> # nc -l -p 7777 > NUL
> 
> My packet counters are currently at 63 million without a problem - yet.

I have it running with MTU 1500. And one of the guests (the one which 
was crashing with MTU=100) froze.

On a VNC console I can see:

virtio_net virtio0: id 64 is not a head!
BUG: soft lockup - CPU#0 stuck for 61s! [ssh:2265]

And "soft lockup" is being printed periodically. VNC and serial console 
do not react to any key press. The guest does not react to ACPI events 
(shutdown).
The kvm/qemu process is using 100% CPU.

See this screenshot:

http://www1.wpkg.org/lockup.png


The guest that locks up is running Debian Lenny with a 2.6.26 kernel.
The guest that does not lock up runs Mandriva 2009.0 with a 2.6.27.x kernel.
(Data is being transferred in both directions between these two machines.)



-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 19:24                               ` Tomasz Chmielewski
@ 2009-03-17 20:14                                 ` Tomasz Chmielewski
  2009-03-17 22:34                                   ` Tomasz Chmielewski
  2009-03-18  6:29                                 ` Avi Kivity
  1 sibling, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 20:14 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Avi Kivity, kvm

Tomasz Chmielewski schrieb:

> See this screenshot:
> 
> http://www1.wpkg.org/lockup.png
> 
> 
> Guest that locks up is running Debian Lenny with 2.6.26 kernel.
> Guest that does not lock up runs Mandriva 2009.0 with 2.6.27.x kernel.
> (data being transferred both side to/from each of these hosts).

Sorry, both machines run Debian Lenny and 2.6.26 kernel.
The only difference is that the machine which crashes (with MTU=100) or 
locks up (with MTU=1500) runs a "2.6.26-1-686" kernel and the one which 
doesn't lock up runs a "2.6.26-1-486" kernel (both are Debian's kernels).


-- 
Tomasz Chmielewski
http://wpkg.org




^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 20:14                                 ` Tomasz Chmielewski
@ 2009-03-17 22:34                                   ` Tomasz Chmielewski
  2009-03-17 23:02                                     ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 22:34 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Avi Kivity, kvm

Tomasz Chmielewski schrieb:

> Sorry, both machines run Debian Lenny and 2.6.26 kernel.
> The only difference is that machine which crashes (with MTU=100) or 
> locks up (with MTU=1500) runs a "2.6.26-1-686" kernel and the one which 
> doesn't lock up runs "2.6.26-1-486" kernel (both are Debian's kernels).

Some more tries and I got this one. Serial console died, but SSH is still working.

Note the "S" tainted flag.
According to Documentation/oops-tracing.txt, it means:

  3: 'S' if the oops occurred on an SMP kernel running on hardware that
     hasn't been certified as safe to run multiprocessor.
     Currently this occurs only on various Athlons that are not
     SMP capable.


And this is a difference between "2.6.26-1-686" and "2.6.26-1-486" kernels.

# grep -i smp /boot/config-2.6.26-1-686
CONFIG_X86_SMP=y
CONFIG_X86_32_SMP=y
CONFIG_SMP=y


# grep -i smp /boot/config-2.6.26-1-486
CONFIG_BROKEN_ON_SMP=y
# CONFIG_SMP is not set



[10942.216450] BUG: soft lockup - CPU#0 stuck for 760s! [postgres:1802]
[10942.216450] Modules linked in: ipv6 loop joydev virtio_balloon virtio_net parport_pc parport snd_pcsp serio_raw snd_pcm snd_timer psmouse snd soundcore snd_page_alloc i2c_piix4 i2c_core button usbhid hid ff_memless evdev ext3 jbd mbcache virtio_blk ide_cd_mod cdrom ide_pci_generic floppy virtio_pci uhci_hcd usbcore piix ide_core ata_generic libata scsi_mod dock thermal processor fan thermal_sys
[10942.216450]
[10942.216450] Pid: 1802, comm: postgres Tainted: G S        (2.6.26-1-686 #1)
[10942.216450] EIP: 0060:[<c011d5a0>] EFLAGS: 00000206 CPU: 0
[10942.216450] EIP is at finish_task_switch+0x25/0x99
[10942.216450] EAX: c1208fa0 EBX: c03bafa0 ECX: c1208fa0 EDX: ce0be4a0
[10942.216450] ESI: 00000000 EDI: ce0be4a0 EBP: 00000001 ESP: ce7f9afc
[10942.216450]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[10942.216450] CR0: 8005003b CR2: 080f3a10 CR3: 0eaeb000 CR4: 000006d0
[10942.216450] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[10942.216450] DR6: ffff0ff0 DR7: 00000400
[10942.216450]  [<c02b82ee>] ? schedule+0x60c/0x66f
[10942.216450]  [<c0129ab0>] ? lock_timer_base+0x19/0x35
[10942.216450]  [<c0129bc3>] ? __mod_timer+0x99/0xa3
[10942.216450]  [<c02b8549>] ? schedule_timeout+0x6b/0x86
[10942.216450]  [<c01297ec>] ? process_timeout+0x0/0x5
[10942.216450]  [<c02b8544>] ? schedule_timeout+0x66/0x86
[10942.216450]  [<c017f2c6>] ? do_select+0x364/0x3bd
[10942.216450]  [<c017f7ca>] ? __pollwait+0x0/0xac
[10942.216450]  [<d08e74c4>] ? start_xmit+0x9f/0xa5 [virtio_net]
[10942.216450]  [<c025895c>] ? dev_hard_start_xmit+0x1eb/0x24f
[10942.216450]  [<c02669f2>] ? __qdisc_run+0xcc/0x17c
[10942.216450]  [<c025abbf>] ? dev_queue_xmit+0x287/0x2bc
[10942.216450]  [<c02762cd>] ? ip_finish_output+0x1c5/0x1fc
[10942.216450]  [<c0115403>] ? pvclock_clocksource_read+0x4b/0xd0
[10942.216450]  [<c0275e5b>] ? ip_local_out+0x15/0x17
[10942.216450]  [<c013604c>] ? getnstimeofday+0x37/0xbc
[10942.216450]  [<c01344c2>] ? ktime_get_ts+0x22/0x49
[10942.216450]  [<c01344f6>] ? ktime_get+0xd/0x21
[10942.216450]  [<c01190e6>] ? hrtick_start_fair+0xeb/0x12c
[10942.216450]  [<c011b39f>] ? task_rq_lock+0x3b/0x5e
[10942.216450]  [<c02531ab>] ? skb_checksum+0x52/0x272
[10942.216450]  [<c017f5a1>] ? core_sys_select+0x282/0x29f
[10942.216450]  [<c0129ccb>] ? mod_timer+0x19/0x36
[10942.216450]  [<c0252345>] ? sock_def_readable+0xf/0x58
[10942.216450]  [<c0283cf4>] ? tcp_rcv_established+0x51d/0x7b1
[10942.216450]  [<c0288d9f>] ? tcp_v4_do_rcv+0x262/0x3e8
[10942.216450]  [<c028ab5d>] ? tcp_v4_rcv+0x5b6/0x609
[10942.216450]  [<c0272ec3>] ? ip_local_deliver_finish+0xe8/0x183
[10942.216450]  [<c0272dbe>] ? ip_rcv_finish+0x286/0x2a3
[10942.216450]  [<c025837a>] ? netif_receive_skb+0x2d6/0x343
[10942.216450]  [<d08e7aa9>] ? virtnet_poll+0x21d/0x258 [virtio_net]
[10942.216450]  [<c017f915>] ? sys_select+0x9f/0x180
[10942.216450]  [<c0103853>] ? sysenter_past_esp+0x78/0xb1
[10942.216450]  =======================
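
The taint field in the "Pid:" line of the trace above can be decoded mechanically. Below is a minimal Python sketch, assuming the flag letters as listed in Documentation/oops-tracing.txt for kernels of this era. Only a subset of the flags is included, and the helper name decode_taint is made up for illustration.

```python
import re

# Subset of the taint letters described in Documentation/oops-tracing.txt.
# This table is an assumption for illustration and is not complete.
TAINT_FLAGS = {
    "G": "all loaded modules have a GPL-compatible license",
    "P": "a proprietary module has been loaded",
    "F": "a module was force-loaded",
    "S": "SMP kernel running on hardware not certified as SMP-safe",
    "R": "a module was force-unloaded",
    "M": "a machine check exception occurred",
    "B": "a page-release function found a bad page reference",
}


def decode_taint(line):
    """Return (letter, meaning) pairs for the taint field of an oops line."""
    match = re.search(r"Tainted:\s+(.*?)\s*\(", line)
    if match is None:
        return []  # e.g. a "Not tainted" line
    letters = [c for c in match.group(1) if not c.isspace()]
    return [(c, TAINT_FLAGS.get(c, "unknown flag")) for c in letters]


oops_line = "Pid: 1802, comm: postgres Tainted: G S        (2.6.26-1-686 #1)"
for letter, meaning in decode_taint(oops_line):
    print(letter, "-", meaning)
```

For the line above this reports the "G" and "S" flags, which matches the reading given in the message.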



-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 22:34                                   ` Tomasz Chmielewski
@ 2009-03-17 23:02                                     ` Tomasz Chmielewski
  0 siblings, 0 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-17 23:02 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Avi Kivity, kvm

Tomasz Chmielewski wrote:
> Tomasz Chmielewski wrote:
> 
>> Sorry, both machines run Debian Lenny and 2.6.26 kernel.
>> The only difference is that machine which crashes (with MTU=100) or 
>> locks up (with MTU=1500) runs a "2.6.26-1-686" kernel and the one 
>> which doesn't lock up runs "2.6.26-1-486" kernel (both are Debian's 
>> kernels).
> 
> Some more tries and I got this one. Serial console died, but SSH is 
> still working.
> 
> Note the "S" tainted flag.
> According to Documentation/oops-tracing.txt, it means:
> 
>  3: 'S' if the oops occurred on an SMP kernel running on hardware that
>     hasn't been certified as safe to run multiprocessor.
>     Currently this occurs only on various Athlons that are not
>     SMP capable.
> 
> 
> And this is a difference between "2.6.26-1-686" and "2.6.26-1-486" kernels.
> 
> # grep -i smp /boot/config-2.6.26-1-686
> CONFIG_X86_SMP=y
> CONFIG_X86_32_SMP=y
> CONFIG_SMP=y
> 
> 
> # grep -i smp /boot/config-2.6.26-1-486
> CONFIG_BROKEN_ON_SMP=y
> # CONFIG_SMP is not set

BTW, it was the machine running the 2.6.26-1-486 (non-SMP) kernel
that got slow for me today.


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:10                       ` Avi Kivity
  2009-03-17 17:43                         ` Tomasz Chmielewski
  2009-03-17 18:57                         ` Felix Leimbach
@ 2009-03-18  5:54                         ` Felix Leimbach
  2 siblings, 0 replies; 70+ messages in thread
From: Felix Leimbach @ 2009-03-18  5:54 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm

Avi,

Avi Kivity wrote:
> packet counters are well within 32-bit limits.  byte counters not so
> interesting.
Last night my test overflowed the *packet* counters twice without any
slowness or loss of connectivity.

Snippet from my log file of the sending VM (linux 2.6.27):

Wed Mar 18 05:14:18 CET 2009: TX packet counter = 4292944043
Wed Mar 18 05:15:18 CET 2009: TX packet counter = 6259211
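
For reference, the number of packets sent between those two log lines can be recovered despite the wrap. A small sketch, assuming the TX counter is an unsigned 32-bit value that wrapped at most once between the two samples (packets_between is an illustrative name, not part of any tool):

```python
COUNTER_BITS = 32
MODULUS = 1 << COUNTER_BITS  # 4294967296 for a 32-bit counter


def packets_between(before, after, modulus=MODULUS):
    """Packets counted between two samples, assuming at most one wrap."""
    return (after - before) % modulus


sent = packets_between(4292944043, 6259211)
print(sent)  # 8282464
```

That is roughly 138,000 packets per second, if the two samples are taken as exactly 60 seconds apart.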

ifconfig after stress test:
# ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
          inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:150  Metric:1
          RX packets:48950340 errors:0 dropped:0 overruns:0 frame:0
          TX packets:727911367 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3686511201 (3.4 GiB)  TX bytes:4207842269 (3.9 GiB)

I didn't create a log file on the receiving Windows VM but they must 
have overflowed as well. Its packet counters are currently:
Sent: 47.112.780
Received: 1.515.275.693

No problem on the Windows guest either.

So the problem must lie elsewhere.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 19:24                               ` Tomasz Chmielewski
  2009-03-17 20:14                                 ` Tomasz Chmielewski
@ 2009-03-18  6:29                                 ` Avi Kivity
  2009-03-19  4:59                                   ` Rusty Russell
  1 sibling, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-18  6:29 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: Felix Leimbach, kvm, Rusty Russell, Anthony Liguori

Tomasz Chmielewski wrote:
> Felix Leimbach wrote:
>
>>> OK, another bug found.
>>>
>>> Set your MTU to 100.
>>>
>>> On two hosts, do:
>>>
>>> HOST1_MTU1500# dd if=/dev/zero | ssh manager@HOST2 dd of=/dev/null
>>> HOST2_MTU100# dd if=/dev/zero | ssh manager@HOST1 dd of=/dev/null
>>>
>>> HOST2 with MTU 100 will crash after 10-15 minutes (with packet count 
>>> still not overflown).
>>>
>> Interesting. What are the packet counters at crash time (roughly)?
>>
>> My - currently running - test is:
>>
>> Guest 1 (Linux):
>> MTU 150
>> # cat /dev/zero | nc <guest2ip> 7777
>>
>> Guest 2 (Windows 2003 Server):
>> MTU: 1500
>> # nc -l -p 7777 > NUL
>>
>> My packet counters are currently at 63 million without a problem - yet.
>
> I have it running with MTU 1500. And one of the guests (the one which 
> was crashing with MTU=100) froze.
>
> On a VNC console I can see:
>
> virtio_net virtio0: id 64 is not a head!
> BUG: soft lockup - CPU#0 stuck for 61s! [ssh:2265]
>
> And "soft lockup" is being printed periodically. VNC and serial 
> console do not react to any key press. The guest does not react to ACPI
> events (shutdown).
> kvm/qemu process is using 100% CPU.
>
> See this screenshot:
>
> http://www1.wpkg.org/lockup.png
>
>
> Guest that locks up is running Debian Lenny with 2.6.26 kernel.
> Guest that does not lock up runs Mandriva 2009.0 with 2.6.27.x kernel.
> (data being transferred both side to/from each of these hosts).

Copying the virtio folks... something is wrong.

You can obtain a stack trace of the locked up guest by doing

  (qemu) gdbserver 1234

  $ gdb /path/to/guest/vmlinux
  (gdb) target remote localhost:1234
  (gdb) backtrace

I don't know how you obtain the guest vmlinux on Debian; on Fedora it
is contained in kernel-debuginfo.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 18:49                       ` Felix Leimbach
@ 2009-03-18  6:36                         ` Avi Kivity
  2009-03-18  7:57                           ` Felix Leimbach
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-18  6:36 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: Tomasz Chmielewski, kvm

Felix Leimbach wrote:
> Avi Kivity wrote:
>> Does idle=poll help things?  It can cause tsc breakage similar to 
>> cpufreq.
> On the host, right? Can't test that as I cannot reboot the server.
> Is tsc breakage still something to watch out for after I've upgraded
> to the Shanghai quadcores?

No, should be gone.

Will you have the old server around so we can test things?

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-18  6:36                         ` Avi Kivity
@ 2009-03-18  7:57                           ` Felix Leimbach
  2009-03-18  8:48                             ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Leimbach @ 2009-03-18  7:57 UTC (permalink / raw)
  To: kvm, Tomasz Chmielewski

Avi Kivity wrote:
> Felix Leimbach wrote:
>> Is tsc breakage still s.th. to watch out after I've upgraded to the 
>> Shanghai quadcores?
>
> No, should be gone.
>
> Will you have the old server around so we can test things?

No, I'll be upgrading the existing server. If you have specific tests in
mind I can perform them in the next two weeks before the upgrade. But I
cannot restart the server because a few VMs are in production use.

If a developer is interested in the old CPU (Opteron 2212) then I can
have it mailed to him/her.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-18  7:57                           ` Felix Leimbach
@ 2009-03-18  8:48                             ` Avi Kivity
  2009-03-18  9:08                               ` Felix Leimbach
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-03-18  8:48 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: kvm, Tomasz Chmielewski

Felix Leimbach wrote:
> Avi Kivity wrote:
>> Felix Leimbach wrote:
>>> Is tsc breakage still s.th. to watch out after I've upgraded to the 
>>> Shanghai quadcores?
>>
>> No, should be gone.
>>
>> Will you have the old server around so we can test things?
>
> No, I'll be upgrading the existing server. If you have specific tests in
> mind I can perform them in the next two weeks before the upgrade. But I
> cannot restart the server because a few VMs are in production use.
>
> If a developer is interested in the old CPU (Opteron 2212) then I can
> have it mailed to him/her.
>

Thanks for the offer; I can probably find a similar cpu, the main 
difficulty is replicating the problem.

Since there are now at least two reports, maybe it won't be that 
difficult.  If you can figure out a way to reliably reproduce this, that 
would be most helpful.


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-18  8:48                             ` Avi Kivity
@ 2009-03-18  9:08                               ` Felix Leimbach
  0 siblings, 0 replies; 70+ messages in thread
From: Felix Leimbach @ 2009-03-18  9:08 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Tomasz Chmielewski

Avi Kivity wrote:
> Since there are now at least two reports, maybe it won't be that 
> difficult.  If you can figure out a way to reliably reproduce this, 
> that would be most helpful.

I'll see what I can do. Although I'm not too optimistic because I have 
not experienced the problem after upgrading to kvm-84. But hey, that's a 
good thing.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-18  6:29                                 ` Avi Kivity
@ 2009-03-19  4:59                                   ` Rusty Russell
  2009-03-19  5:22                                     ` David S. Ahern
  0 siblings, 1 reply; 70+ messages in thread
From: Rusty Russell @ 2009-03-19  4:59 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Tomasz Chmielewski, Felix Leimbach, kvm, Anthony Liguori

On Wednesday 18 March 2009 16:59:36 Avi Kivity wrote:
> Tomasz Chmielewski wrote:
> > virtio_net virtio0: id 64 is not a head!

This means that qemu said "I've finished with buffer 64" and the guest didn't
know anything about buffer 64.
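
To illustrate the check that produces this message, here is a toy Python model of the guest side of a virtio ring. It is a sketch of the idea only, not the kernel's virtio_ring code; the class and method names are invented for illustration.

```python
class GuestRing:
    """Toy model of the guest's bookkeeping for one virtio ring."""

    def __init__(self, size):
        # One slot per descriptor. A non-None token marks the head of a
        # buffer that the guest handed to the host and has not got back yet.
        self.tokens = [None] * size

    def add_buf(self, head, token):
        """Guest posts a buffer whose descriptor chain starts at 'head'."""
        self.tokens[head] = token

    def get_buf(self, used_id):
        """Handle a completion the host reported for descriptor 'used_id'."""
        if not 0 <= used_id < len(self.tokens):
            raise RuntimeError(f"id {used_id} out of range")
        token = self.tokens[used_id]
        if token is None:
            # The host claims to have finished a buffer the guest never
            # posted (or already reclaimed): the ring state is corrupt.
            raise RuntimeError(f"id {used_id} is not a head!")
        self.tokens[used_id] = None
        return token


ring = GuestRing(256)
ring.add_buf(3, "packet-a")
print(ring.get_buf(3))       # a completion the guest expects
try:
    ring.get_buf(64)         # a completion for a buffer never posted
except RuntimeError as err:
    print(err)
```

In this model a second completion for an already reclaimed id fails the same way, which is one way a confused host side could trigger the message.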

We should not lock up, though networking is toast: I think that qemu got upset,
and that caused both this message and qemu chewing 100% cpu.

I'll see if I can reproduce with kvm-84 userspace and 2.6.27 guests, 32-bit
guests on a 64-bit AMD host.  What's your kvm/qemu command line?

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-19  4:59                                   ` Rusty Russell
@ 2009-03-19  5:22                                     ` David S. Ahern
  2009-03-19  6:08                                       ` David S. Ahern
  0 siblings, 1 reply; 70+ messages in thread
From: David S. Ahern @ 2009-03-19  5:22 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Avi Kivity, Tomasz Chmielewski, Felix Leimbach, kvm, Anthony Liguori



Rusty Russell wrote:
> On Wednesday 18 March 2009 16:59:36 Avi Kivity wrote:
>> Tomasz Chmielewski wrote:
>>> virtio_net virtio0: id 64 is not a head!
> 
> This means that qemu said "I've finished with buffer 64" and the guest didn't
> know anything about buffer 64.
> 
> We should not lock up, tho networking is toast: I think that qemu got upset
> and that caused this as well as it to chew 100% cpu.
> 
> I'll see if I can reproduce with kvm-84 userspace and 2.6.27 guests, 32-bit
> guests on a 64-bit AMD host.  What's your kvm/qemu command line?
> 

I've hit this as well.

Intel host, running RHEL5.3, x86_64 with KVM-81.

Guest is RHEL4.7, 32-bit, with the virtio drivers from RHEL4.8 beta.

Happens pretty darn quickly for me.

david


> Thanks,
> Rusty.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-19  5:22                                     ` David S. Ahern
@ 2009-03-19  6:08                                       ` David S. Ahern
  2009-03-19  8:03                                         ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: David S. Ahern @ 2009-03-19  6:08 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Avi Kivity, Tomasz Chmielewski, Felix Leimbach, kvm, Anthony Liguori



David S. Ahern wrote:
> 
> Rusty Russell wrote:
>> On Wednesday 18 March 2009 16:59:36 Avi Kivity wrote:
>>> Tomasz Chmielewski wrote:
>>>> virtio_net virtio0: id 64 is not a head!
>> This means that qemu said "I've finished with buffer 64" and the guest didn't
>> know anything about buffer 64.
>>
>> We should not lock up, tho networking is toast: I think that qemu got upset
>> and that caused this as well as it to chew 100% cpu.
>>
>> I'll see if I can reproduce with kvm-84 userspace and 2.6.27 guests, 32-bit
>> guests on a 64-bit AMD host.  What's your kvm/qemu command line?
>>
> 
> I've hit this as well.
> 
> Intel host, running RHEL5.3, x86_64 with KVM-81.
> 
> Guest is RHEL4.7, 32-bit, with the virtio drivers from RHEL4.8 beta.
> 
> Happens pretty darn quickly for me.
> 
> david
> 

Like I said, pretty darn quickly.

More information for you. Command line (a few elements blurred) for this
run (started about 15 minutes ago):

kvm -localtime -no-reboot -m 3584 -smp 4 \
-drive file=/dev/cciss/c0d0,if=scsi,cache=off,boot=on \
-drive file=/dev/cciss/c0d1,if=scsi,cache=off,boot=off \
-net nic,vlan=0,macaddr=00:11:22:33:44:55,model=virtio \
-net tap,vlan=0,ifname=tap0,script=no,downscript=no \
-net nic,vlan=1,macaddr=00:12:34:56:78:1,model=virtio \
-net tap,vlan=1,ifname=tap1,script=no,downscript=no \
-usb -usbdevice tablet -mem-path /hugepages \
-pidfile /tmp/1.pid \
-monitor unix:/tmp/1,server,nowait  \
-vnc :1

It does not take much network traffic for the network to lock up. In
this case, the host shows 2 kvm threads spinning -- for vcpus 2,3. I
have vcpus pinned to pcpus (vcpu0:pcpu0, etc).

Backtrace for kvm, though nothing interesting:

Thread 5 (Thread 0x43344940 (LWP 3153)):
#0  0x00002b8af5088c77 in ioctl () from /lib64/libc.so.6
#1  0x0000000000530ece in kvm_run ()
#2  0x0000000000506529 in kvm_cpu_exec ()
#3  0x00000000005067c0 in ap_main_loop ()
#4  0x00002b8af46aa367 in start_thread () from /lib64/libpthread.so.0
#5  0x00002b8af50900ad in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x43d45940 (LWP 3154)):
#0  0x00002b8af5088c77 in ioctl () from /lib64/libc.so.6
#1  0x0000000000530ece in kvm_run ()
#2  0x0000000000506529 in kvm_cpu_exec ()
#3  0x00000000005067c0 in ap_main_loop ()
#4  0x00002b8af46aa367 in start_thread () from /lib64/libpthread.so.0
#5  0x00002b8af50900ad in clone () from /lib64/libc.so.6



In the guest, I see 2 threads of 2 separate processes spinning on cpus
2,3. They appear to be spinning kernel side.

Attempts to restart the network froze the guest console, and at this
point the host shows 3 threads spinning away, though not at 100% cpu.
The qemu monitor was able to push a system_powerdown event to the guest,
and it showed signs of receiving it though it did not powerdown on its own.

david



> 
>> Thanks,
>> Rusty.
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-19  6:08                                       ` David S. Ahern
@ 2009-03-19  8:03                                         ` Tomasz Chmielewski
  2009-03-19 14:11                                           ` David S. Ahern
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-19  8:03 UTC (permalink / raw)
  To: David S. Ahern
  Cc: Rusty Russell, Avi Kivity, Felix Leimbach, kvm, Anthony Liguori

David S. Ahern wrote:
> 
> David S. Ahern wrote:
>> Rusty Russell wrote:
>>> On Wednesday 18 March 2009 16:59:36 Avi Kivity wrote:
>>>> Tomasz Chmielewski wrote:
>>>>> virtio_net virtio0: id 64 is not a head!
>>> This means that qemu said "I've finished with buffer 64" and the guest didn't
>>> know anything about buffer 64.
>>>
>>> We should not lock up, tho networking is toast: I think that qemu got upset
>>> and that caused this as well as it to chew 100% cpu.
>>>
>>> I'll see if I can reproduce with kvm-84 userspace and 2.6.27 guests, 32-bit
>>> guests on a 64-bit AMD host.  What's your kvm/qemu command line?
>>>
>> I've hit this as well.
>>
>> Intel host, running RHEL5.3, x86_64 with KVM-81.
>>
>> Guest is RHEL4.7, 32-bit, with the virtio drivers from RHEL4.8 beta.
>>
>> Happens pretty darn quickly for me.
>>
>> david
>>
> 
> Like I said, pretty darn quickly.

Can you reproduce it also with e1000 instead of virtio?


-- 
Tomasz Chmielewski
http://wpkg.org


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-09  9:28   ` Tomasz Chmielewski
@ 2009-03-19 13:03     ` Tomasz Chmielewski
  0 siblings, 0 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-19 13:03 UTC (permalink / raw)
  To: kvm, Anthony Liguori, Rusty Russell, Avi Kivity

Tomasz Chmielewski wrote:

>> Note how _time_ is different (similar timings are to other unaffected 
>> guests):
> 
> This is also pretty interesting:
> 
> # ping -c 10 <unaffected guest>
> PING 192.168.4.4 (192.168.4.4) 56(84) bytes of data.
> 64 bytes from 192.168.4.4: icmp_seq=1 ttl=64 time=1.25 ms
> 64 bytes from 192.168.4.4: icmp_seq=2 ttl=64 time=1.58 ms

(...)

> --- 192.168.4.4 ping statistics ---
> 10 packets transmitted, 10 received, 0% packet loss, time 9091ms
> rtt min/avg/max/mdev = 1.031/2.059/3.894/1.045 ms
> 
> 
> 
> How probable it is so many pings returned with exactly 1000 ms?
> 
> # ping -c 10 <affected_guest>
> PING 192.168.4.5 (192.168.4.5) 56(84) bytes of data.
> 64 bytes from 192.168.4.5: icmp_seq=1 ttl=64 time=1009 ms
> 64 bytes from 192.168.4.5: icmp_seq=2 ttl=64 time=9.61 ms
> 64 bytes from 192.168.4.5: icmp_seq=3 ttl=64 time=1000 ms
> 64 bytes from 192.168.4.5: icmp_seq=4 ttl=64 time=1000 ms
 
(...)

The same thing happened to me again.
This time, I equipped the guest with one virtio card and one e1000 card.

00:03.0 Ethernet controller: Qumranet, Inc. Device 1000
00:04.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)

Pinging the e1000 card on the affected guest - replies are fast:

# ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=5.86 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=3.40 ms
64 bytes from 10.1.1.1: icmp_seq=3 ttl=64 time=0.791 ms

Pinging the virtio card on the affected guest - slow:

# ping 192.168.113.83
PING 192.168.113.83 (192.168.113.83) 56(84) bytes of data.
64 bytes from 192.168.113.83: icmp_seq=1 ttl=64 time=21.6 ms
64 bytes from 192.168.113.83: icmp_seq=2 ttl=64 time=1000 ms
64 bytes from 192.168.113.83: icmp_seq=3 ttl=64 time=2.73 ms
64 bytes from 192.168.113.83: icmp_seq=4 ttl=64 time=243 ms


(this is the same network, with guests on the same host, so the latencies are
not caused by packets travelling around the globe).
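
The comparison above can be automated when watching for the problem. Below is a minimal Python sketch that extracts the round-trip times from ping output and lists the slow replies. The 100 ms threshold and the function names are arbitrary choices for illustration; the sample data is copied from the output above.

```python
import re

RTT_PATTERN = re.compile(r"time=([\d.]+) ms")


def rtts_ms(ping_output):
    """Extract all round-trip times (in ms) from ping output."""
    return [float(m.group(1)) for m in RTT_PATTERN.finditer(ping_output)]


def slow_replies(ping_output, threshold_ms=100.0):
    """Return the round-trip times at or above the threshold."""
    return [t for t in rtts_ms(ping_output) if t >= threshold_ms]


E1000_SAMPLE = """\
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=5.86 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=3.40 ms
64 bytes from 10.1.1.1: icmp_seq=3 ttl=64 time=0.791 ms
"""

VIRTIO_SAMPLE = """\
64 bytes from 192.168.113.83: icmp_seq=1 ttl=64 time=21.6 ms
64 bytes from 192.168.113.83: icmp_seq=2 ttl=64 time=1000 ms
64 bytes from 192.168.113.83: icmp_seq=3 ttl=64 time=2.73 ms
64 bytes from 192.168.113.83: icmp_seq=4 ttl=64 time=243 ms
"""

print("e1000 slow replies: ", slow_replies(E1000_SAMPLE))
print("virtio slow replies:", slow_replies(VIRTIO_SAMPLE))
```

With the samples above, only the virtio interface shows replies over the threshold.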


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-19  8:03                                         ` Tomasz Chmielewski
@ 2009-03-19 14:11                                           ` David S. Ahern
  0 siblings, 0 replies; 70+ messages in thread
From: David S. Ahern @ 2009-03-19 14:11 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: Rusty Russell, Avi Kivity, Felix Leimbach, kvm, Anthony Liguori



Tomasz Chmielewski wrote:
> David S. Ahern wrote:
>>
>> David S. Ahern wrote:
>>> Rusty Russell wrote:
>>>> On Wednesday 18 March 2009 16:59:36 Avi Kivity wrote:
>>>>> Tomasz Chmielewski wrote:
>>>>>> virtio_net virtio0: id 64 is not a head!
>>>> This means that qemu said "I've finished with buffer 64" and the
>>>> guest didn't
>>>> know anything about buffer 64.
>>>>
>>>> We should not lock up, tho networking is toast: I think that qemu
>>>> got upset
>>>> and that caused this as well as it to chew 100% cpu.
>>>>
>>>> I'll see if I can reproduce with kvm-84 userspace and 2.6.27 guests,
>>>> 32-bit
>>>> guests on a 64-bit AMD host.  What's your kvm/qemu command line?
>>>>
>>> I've hit this as well.
>>>
>>> Intel host, running RHEL5.3, x86_64 with KVM-81.
>>>
>>> Guest is RHEL4.7, 32-bit, with the virtio drivers from RHEL4.8 beta.
>>>
>>> Happens pretty darn quickly for me.
>>>
>>> david
>>>
>>
>> Like I said, pretty darn quickly.
> 
> Can you reproduce it also with e1000 instead of virtio?
> 
> 

I have not had a problem with the e1000 nic. This seems to be strictly a
virtio bug; I get the same messages. These are 2 separate runs, one from
March 11:

kernel: virtio_net virtio0: id 98 is not a head!

and the other last night:

kernel: virtio_net virtio0: id 6 is not a head!


david


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-17 17:01                   ` Felix Leimbach
  2009-03-17 17:05                     ` Avi Kivity
  2009-03-17 17:38                     ` Tomasz Chmielewski
@ 2009-03-31  8:50                     ` Tomasz Chmielewski
  2009-04-01  4:22                       ` David S. Ahern
  2 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-03-31  8:50 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: avi, kvm, Rusty Russell, Anthony Liguori, David S. Ahern

Felix Leimbach wrote:
> Tomasz Chmielewski wrote:
>> Felix Leimbach wrote:
>>> Out of 3 e1000 guests none has ever been hit.
>>>
>>> Observed with kvm-83 and kvm-84 with the host running in-kernel KVM 
>>> code (linux 2.6.25.7)
>> Could you add a (unused) e1000 interface to your virtio guests?
>> As this issue happens rarely for me, maybe you could help to reproduce 
>> it as well (i.e. if network gets slow on virtio interface, give e1000 
>> a IP address, and try if network is also slow on e1000 on the very 
>> same guest).
> Will do and report
>>
>> BTW, what CPU do you have?
> One dual core Opteron 2212
> Note: I will upgrade to two Shanghai Quad-Cores in 2 weeks and test with 
> those as well.

I have this "slowness" on an Intel CPU as well, after about 10 days of 
guest uptime (using virtio net):

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            3050  @ 2.13GHz
stepping        : 6
cpu MHz         : 2133.410
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe 
syscall lm constant_tsc arch_perfmon pebs bts rep_good pni monitor 
ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 4266.87
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:


-- 
Tomasz Chmielewski
http://wpkg.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-03-31  8:50                     ` Tomasz Chmielewski
@ 2009-04-01  4:22                       ` David S. Ahern
  2009-04-01  6:21                         ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: David S. Ahern @ 2009-04-01  4:22 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: Felix Leimbach, avi, kvm, Rusty Russell, Anthony Liguori


Tomasz Chmielewski wrote:
> Felix Leimbach wrote:
>> Tomasz Chmielewski wrote:
>>> Felix Leimbach wrote:
>>>> Out of 3 e1000 guests none has ever been hit.
>>>>
>>>> Observed with kvm-83 and kvm-84 with the host running in-kernel KVM
>>>> code (linux 2.6.25.7)
>>> Could you add a (unused) e1000 interface to your virtio guests?
>>> As this issue happens rarely for me, maybe you could help to
>>> reproduce it as well (i.e. if network gets slow on virtio interface,
>>> give e1000 a IP address, and try if network is also slow on e1000 on
>>> the very same guest).
>> Will do and report
>>>
>>> BTW, what CPU do you have?
>> One dual core Opteron 2212
>> Note: I will upgrade to two Shanghai Quad-Cores in 2 weeks and test
>> with those as well.
> 
> I have this "slowness" on an Intel CPU as well, after about 10 days of
> guest uptime (using virtio net):
> 
> processor       : 1
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 15
> model name      : Intel(R) Xeon(R) CPU            3050  @ 2.13GHz
> stepping        : 6
> cpu MHz         : 2133.410
> cache size      : 2048 KB
> physical id     : 0
> siblings        : 2
> core id         : 1
> cpu cores       : 2
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 10
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
> syscall lm constant_tsc arch_perfmon pebs bts rep_good pni monitor
> ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
> bogomips        : 4266.87
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 36 bits physical, 48 bits virtual
> power management:
> 
> 

On the Intel server, is the guest using the e1000 NIC, virtio, or
something else? I have a few DL320G5s with this processor; I have not hit
this problem running rhel3 and rhel4 guests using e1000/scsi devices.

david


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-04-01  4:22                       ` David S. Ahern
@ 2009-04-01  6:21                         ` Tomasz Chmielewski
  2009-04-06 15:19                           ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-04-01  6:21 UTC (permalink / raw)
  To: David S. Ahern; +Cc: Felix Leimbach, avi, kvm, Rusty Russell, Anthony Liguori

David S. Ahern wrote:

>>>> Could you add a (unused) e1000 interface to your virtio guests?
>>>> As this issue happens rarely for me, maybe you could help to
>>>> reproduce it as well (i.e. if network gets slow on virtio interface,
>>>> give e1000 a IP address, and try if network is also slow on e1000 on
>>>> the very same guest).
>>> Will do and report
>>>> BTW, what CPU do you have?
>>> One dual core Opteron 2212
>>> Note: I will upgrade to two Shanghai Quad-Cores in 2 weeks and test
>>> with those as well.
>> I have this "slowness" on an Intel CPU as well, after about 10 days of
>> guest uptime (using virtio net):
>>
>> processor       : 1
>> vendor_id       : GenuineIntel
>> cpu family      : 6
>> model           : 15
>> model name      : Intel(R) Xeon(R) CPU            3050  @ 2.13GHz

> For the Intel server, the guest is using the e1000 NIC or virtio or
> other? I have a few DL320G5s with this processor; I have not hit this
> problem running rhel3 and rhel4 guests using e1000/scsi devices.

As I mentioned, it was using virtio net.

Guests running with e1000 (and virtio_blk) don't have this problem.


-- 
Tomasz Chmielewski
http://wpkg.org


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: strange guest slowness after some time
  2009-04-01  6:21                         ` Tomasz Chmielewski
@ 2009-04-06 15:19                           ` Tomasz Chmielewski
  2009-04-08  0:49                             ` Rusty Russell
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-04-06 15:19 UTC (permalink / raw)
  To: David S. Ahern; +Cc: Felix Leimbach, avi, kvm, Rusty Russell, Anthony Liguori

Tomasz Chmielewski wrote:

> As I mentioned, it was using virtio net.
> 
> Guests running with e1000 (and virtio_blk) don't have this problem.

Also, virtio_console seems to be affected by this "slowness" issue.

Am I correct to think that if:

* on guest "lsmod" outputs:

     virtio_console          6828  0 [permanent]

* on guest, /etc/inittab contains:

6:2345:respawn:/sbin/mingetty ttyS0

* on host, I start the guest with a parameter:

-serial unix:/var/run/qemu-server/103.serial,server,nowait


then the guest's ttyS0 console is "virtio_console"?



If my thinking is correct, then I have a "slow serial console" on some 
of the guests using the virtio_pci and virtio_console drivers.


By "slow serial console" I mean any character typed shows up after a 
second or so.

It can also be "cured" the same way as with virtio_net - just run:

     dd if=/dev/vda of=/dev/null

And the console reacts normally. Stop dd, console is slow again.


I have this issue on two guests with e1000 network, which use virtio_blk 
(and virtio_console...).
I never saw this issue with guests which don't use virtio.



-- 
Tomasz Chmielewski
http://wpkg.org



* Re: strange guest slowness after some time
  2009-04-06 15:19                           ` Tomasz Chmielewski
@ 2009-04-08  0:49                             ` Rusty Russell
  2009-04-08  5:45                               ` Tomasz Chmielewski
  2009-05-26 11:49                               ` Tomasz Chmielewski
  0 siblings, 2 replies; 70+ messages in thread
From: Rusty Russell @ 2009-04-08  0:49 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: David S. Ahern, Felix Leimbach, avi, kvm, Anthony Liguori

On Tuesday 07 April 2009 00:49:17 Tomasz Chmielewski wrote:
> Tomasz Chmielewski wrote:
> 
> > As I mentioned, it was using virtio net.
> > 
> > Guests running with e1000 (and virtio_blk) don't have this problem.
> 
> Also, virtio_console seems to be affected by this "slowness" issue.

I'm pretty sure this is different.  Older virtio_console code ignored
interrupts and polled, and used a heuristic to back off on polling (this was
because we used the generic "hvc" infrastructure which hacked support).

You'll find a delay on the first keystroke after idle, but none on the
second.
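
As a rough illustration of that back-off (a toy model only, not the actual
hvc code; the function name and the millisecond values are invented for
this sketch): an empty poll doubles the sleep interval up to a cap, and a
poll that finds input resets it.

```python
def poll_latencies(arrivals_ms, min_interval=10, max_interval=1000):
    """Return, per input, the delay until the poll that notices it.

    Toy model with integer milliseconds: the poller sleeps `interval`
    between polls. An empty poll doubles the interval (capped at
    max_interval); a poll that finds input resets it to min_interval.
    """
    pending = sorted(arrivals_ms)
    latencies = []
    now = 0
    interval = min_interval
    i = 0
    while i < len(pending):
        now += interval                      # sleep until the next poll
        found = False
        while i < len(pending) and pending[i] <= now:
            latencies.append(now - pending[i])
            i += 1
            found = True
        # back off when idle, snap back to fast polling after input
        interval = min_interval if found else min(interval * 2, max_interval)
    return latencies


# One keystroke after 10 s of idle, a second one 300 ms later.
print(poll_latencies([10000, 10300]))  # -> [270, 0]
```

In this model the first keystroke waits for the next slow poll, while the
second one lands while polling is still fast.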

Hope that helps,
Rusty.


* Re: strange guest slowness after some time
  2009-04-08  0:49                             ` Rusty Russell
@ 2009-04-08  5:45                               ` Tomasz Chmielewski
  2009-05-26 11:49                               ` Tomasz Chmielewski
  1 sibling, 0 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-04-08  5:45 UTC (permalink / raw)
  To: Rusty Russell; +Cc: David S. Ahern, Felix Leimbach, avi, kvm, Anthony Liguori

Rusty Russell wrote:
> On Tuesday 07 April 2009 00:49:17 Tomasz Chmielewski wrote:
>> Tomasz Chmielewski wrote:
>>
>>> As I mentioned, it was using virtio net.
>>>
>>> Guests running with e1000 (and virtio_blk) don't have this problem.
>> Also, virtio_console seems to be affected by this "slowness" issue.
> 
> I'm pretty sure this is different.  Older virtio_console code ignored
> interrupts and polled, and used a heuristic to back off on polling (this was
> because we used the generic "hvc" infrastructure which hacked support).

By "older" you mean guest drivers?
I have 2.6.27.x on guests and see this issue.
If you meant host, I use kvm-84.


> You'll find a delay on the first keystroke after idle, but none on the
> second.

I'm not sure.
Press "a" seven times fast, and 7 characters will be printed a second later.

But: wait one second more, it will be unresponsive again. You won't see 
the characters "as you type".


Also, these symptoms are very similar to the virtio_net issue:
- it happens only on some guests (even if they have the same kernel and 
userspace) after a random period of time
- it _always_ happened for me whenever the network got slow with the 
virtio_net driver
- it doesn't go away with a guest restart initiated from within the guest
- it goes away with a kvm process stop/start (i.e. a new kvm process), but 
can appear again later with no apparent cause



-- 
Tomasz Chmielewski
http://wpkg.org



* Re: strange guest slowness after some time
  2009-04-08  0:49                             ` Rusty Russell
  2009-04-08  5:45                               ` Tomasz Chmielewski
@ 2009-05-26 11:49                               ` Tomasz Chmielewski
  2009-05-26 11:55                                 ` Avi Kivity
  1 sibling, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-05-26 11:49 UTC (permalink / raw)
  To: Rusty Russell; +Cc: David S. Ahern, Felix Leimbach, avi, kvm, Anthony Liguori

Rusty Russell wrote:
> On Tuesday 07 April 2009 00:49:17 Tomasz Chmielewski wrote:
>> Tomasz Chmielewski wrote:
>>
>>> As I mentioned, it was using virtio net.
>>>
>>> Guests running with e1000 (and virtio_blk) don't have this problem.
>> Also, virtio_console seems to be affected by this "slowness" issue.
> 
> I'm pretty sure this is different.  Older virtio_console code ignored
> interrupts and polled, and used a heuristic to back off on polling (this was
> because we used the generic "hvc" infrastructure which hacked support).
> 
> You'll find a delay on the first keystroke after idle, but none on the
> second.

I still observe this "slowness" with kvm-86 after the guest is running 
for some time (virtio_net and virtio_console seem to be affected; guest 
restart doesn't fix it).

-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-05-26 11:49                               ` Tomasz Chmielewski
@ 2009-05-26 11:55                                 ` Avi Kivity
  2009-05-26 12:05                                   ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-05-26 11:55 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: Rusty Russell, David S. Ahern, Felix Leimbach, kvm, Anthony Liguori

Tomasz Chmielewski wrote:
>
> I still observe this "slowness" with kvm-86 after the guest is running 
> for some time (virtio_net and virtio_console seem to be affected; 
> guest restart doesn't fix it).
>

Anything in guest dmesg?

-- 
error compiling committee.c: too many arguments to function



* Re: strange guest slowness after some time
  2009-05-26 11:55                                 ` Avi Kivity
@ 2009-05-26 12:05                                   ` Tomasz Chmielewski
  2009-05-26 12:10                                     ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-05-26 12:05 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Rusty Russell, David S. Ahern, Felix Leimbach, kvm, Anthony Liguori

Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>>
>> I still observe this "slowness" with kvm-86 after the guest is running 
>> for some time (virtio_net and virtio_console seem to be affected; 
>> guest restart doesn't fix it).
>>
> 
> Anything in guest dmesg?

No.
No hints in syslog, dmesg...


Can it be that this is more likely to happen on "busy" hosts?

It happens for me on a host where I have 16 guests running.


Also, when I booted the host almost 2 days ago, 2 or 3 guests didn't start 
properly (16 guests were starting at the same time), with their kernel 
saying:

     Kernel panic - not syncing: IO-APIC + timer doesn't work!

Can it be related?

After I restarted these failed guests, they started properly.


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-05-26 12:05                                   ` Tomasz Chmielewski
@ 2009-05-26 12:10                                     ` Avi Kivity
  2009-05-26 14:07                                       ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-05-26 12:10 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: Rusty Russell, David S. Ahern, Felix Leimbach, kvm, Anthony Liguori

Tomasz Chmielewski wrote:
> Avi Kivity wrote:
>> Tomasz Chmielewski wrote:
>>>
>>> I still observe this "slowness" with kvm-86 after the guest is 
>>> running for some time (virtio_net and virtio_console seem to be 
>>> affected; guest restart doesn't fix it).
>>>
>>
>> Anything in guest dmesg?
>
> No.
> No hints in syslog, dmesg...
>
>
> Can it be that this is more likely to happen on "busy" hosts?
>

We'll only know once we fix it...

> It happens for me on a host where I have 16 guests running.
>
>
> Also, when I booted the host almost 2 days ago, 2 or 3 guests didn't 
> start properly (16 guests were starting at the same time), with their 
> kernel saying:
>
>     Kernel panic - not syncing: IO-APIC + timer doesn't work!
>
> Can it be related?
>
> After I restarted these failed guests, they started properly.
>

This is timing related.  On a busy host you can get timeouts and thus 
the panics.  It's unrelated.

Maybe virtio is racy and a loaded host exposes the race.

-- 
error compiling committee.c: too many arguments to function



* Re: strange guest slowness after some time
  2009-05-26 12:10                                     ` Avi Kivity
@ 2009-05-26 14:07                                       ` Tomasz Chmielewski
  2009-05-26 14:35                                         ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-05-26 14:07 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Rusty Russell, David S. Ahern, Felix Leimbach, kvm, Anthony Liguori

Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>> Avi Kivity wrote:
>>> Tomasz Chmielewski wrote:
>>>>
>>>> I still observe this "slowness" with kvm-86 after the guest is 
>>>> running for some time (virtio_net and virtio_console seem to be 
>>>> affected; guest restart doesn't fix it).
>>>>
>>>
>>> Anything in guest dmesg?
>>
>> No.
>> No hints in syslog, dmesg...
>>
>>
>> Can it be that this is more likely to happen on "busy" hosts?
>>
> 
> We'll only know once we fix it...

(...)

> Maybe virtio is racy and a loaded host exposes the race.

I see it happening with virtio on 2.6.29.x guests as well.

So, what would you do if you saw it on your systems as well? ;)

Add some debug routines into virtio_* modules?


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-05-26 14:07                                       ` Tomasz Chmielewski
@ 2009-05-26 14:35                                         ` Avi Kivity
  2009-05-28 14:58                                           ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Avi Kivity @ 2009-05-26 14:35 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: Rusty Russell, David S. Ahern, Felix Leimbach, kvm, Anthony Liguori

Tomasz Chmielewski wrote:
>> Maybe virtio is racy and a loaded host exposes the race.
>
> I see it happening with virtio on 2.6.29.x guests as well.
>
> So, what would you do if you saw it on your systems as well? ;)
>
> Add some debug routines into virtio_* modules?
>

I'm no virtio expert.  Maybe I'd insert tracepoints to record interrupts 
and kicks.
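
Once such events are recorded, something along these lines could pick out
the suspicious spots. This is only a sketch: the (timestamp, kind) record
layout and the function name are hypothetical, not an existing trace
format, and it assumes every kick should eventually be answered by an
interrupt, which need not hold when interrupts are suppressed.

```python
from bisect import bisect_right


def slow_kicks(events, threshold_us):
    """Return (kick_time, wait) for kicks whose next interrupt is late.

    `events` is a list of (timestamp_us, kind) tuples, kind being "kick"
    or "irq" (hypothetical layout). `wait` is None when no interrupt
    follows the kick at all.
    """
    irqs = sorted(t for t, kind in events if kind == "irq")
    result = []
    for t, kind in sorted(events):
        if kind != "kick":
            continue
        idx = bisect_right(irqs, t)          # first irq strictly after the kick
        wait = irqs[idx] - t if idx < len(irqs) else None
        if wait is None or wait > threshold_us:
            result.append((t, wait))
    return result


trace = [(0, "kick"), (50, "irq"), (1000, "kick"),
         (1000900, "irq"), (2000000, "kick")]
print(slow_kicks(trace, threshold_us=1000))
# -> [(1000, 999900), (2000000, None)]
```

A long run of late or unanswered kicks on a slow guest, but not on a
healthy one, would point at lost or delayed notifications.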


-- 
error compiling committee.c: too many arguments to function



* Re: strange guest slowness after some time
  2009-05-26 14:35                                         ` Avi Kivity
@ 2009-05-28 14:58                                           ` Tomasz Chmielewski
  2009-05-31  8:43                                             ` Avi Kivity
  0 siblings, 1 reply; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-05-28 14:58 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Rusty Russell, David S. Ahern, Felix Leimbach, kvm, Anthony Liguori

Avi Kivity wrote:
> Tomasz Chmielewski wrote:
>>> Maybe virtio is racy and a loaded host exposes the race.
>>
>> I see it happening with virtio on 2.6.29.x guests as well.
>>
>> So, what would you do if you saw it on your systems as well? ;)
>>
>> Add some debug routines into virtio_* modules?
>>
> 
> I'm no virtio expert.  Maybe I'd insert tracepoints to record interrupts 
> and kicks.

By accident, I made an "interesting" discovery.

This ~2 MB video shows a kvm-86 guest being rebooted and GRUB started:

http://syneticon.net/kvm/kvm-slowness.ogg


GRUB has its timeout set to 50 seconds, and is supposed to show it on 
the screen by decreasing the number of seconds shown, every second.

Here, GRUB decreases the seconds counter very fast by 2 seconds, then 
waits 2 seconds, then again decreases the number of seconds by 2 
very fast, and so on.

Perhaps my wording does not describe it very well though, so just try to 
download the video and open it, e.g. in mplayer.


Comments?


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: strange guest slowness after some time
  2009-05-28 14:58                                           ` Tomasz Chmielewski
@ 2009-05-31  8:43                                             ` Avi Kivity
  0 siblings, 0 replies; 70+ messages in thread
From: Avi Kivity @ 2009-05-31  8:43 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: Rusty Russell, David S. Ahern, Felix Leimbach, kvm,
	Anthony Liguori, Marcelo Tosatti

Tomasz Chmielewski wrote:
> By accident, I made an "interesting" discovery.
>
> This ~2 MB video shows a kvm-86 guest being rebooted and GRUB started:
>
> http://syneticon.net/kvm/kvm-slowness.ogg
>
>
> GRUB has its timeout set to 50 seconds, and is supposed to show it on 
> the screen by decreasing the number of seconds shown, every second.
>
> Here, GRUB decreases the seconds counter very fast by 2 seconds, then 
> waits 2 seconds, then again decreases the number of seconds by 2 
> very fast, and so on.
>
> Perhaps my wording does not describe it very well though, so just try 
> to download the video and open it, e.g. in mplayer.

Weird, weird.

Can you run kvmtrace on this guest while this is happening and post the 
results somewhere?

-- 
error compiling committee.c: too many arguments to function



* Re: strange guest slowness after some time
  2009-03-17 17:38                     ` Tomasz Chmielewski
@ 2009-06-08 11:02                       ` Felix Leimbach
  2009-06-16 14:26                         ` Tomasz Chmielewski
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Leimbach @ 2009-06-08 11:02 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: avi, kvm

Tomasz Chmielewski wrote:
> Felix Leimbach wrote:
>
>>> BTW, what CPU do you have?
>> One dual core Opteron 2212
>> Note: I will upgrade to two Shanghai Quad-Cores in 2 weeks and test 
>> with those as well.
>>
>> processor       : 1
>> vendor_id       : AuthenticAMD
>> cpu family      : 15
>> model           : 65
>> model name      : Dual-Core AMD Opteron(tm) Processor 2212
>> stepping        : 2
>> cpu MHz         : 1994.996
>> cache size      : 1024 KB
>
> It's exactly the same CPU I have.
Interesting: for two months now I have been running on 2 Shanghai 
Quad-Cores instead, and the problem is definitely gone.
The rest of the hardware, as well as the whole software stack, remained 
unchanged.

That should confirm what we assumed already.


* Re: strange guest slowness after some time
  2009-06-08 11:02                       ` Felix Leimbach
@ 2009-06-16 14:26                         ` Tomasz Chmielewski
  0 siblings, 0 replies; 70+ messages in thread
From: Tomasz Chmielewski @ 2009-06-16 14:26 UTC (permalink / raw)
  To: Felix Leimbach; +Cc: avi, kvm

Felix Leimbach wrote:

>> It's exactly the same CPU I have.
> Interesting: for two months now I have been running on 2 Shanghai 
> Quad-Cores instead, and the problem is definitely gone.
> The rest of the hardware, as well as the whole software stack, remained 
> unchanged.
> 
> That should confirm what we assumed already.

For me, it turned out that the KVM I was running (shipped with Proxmox VE) 
had a "fairsched" patch (OpenVZ-related) which caused this broken behaviour.


-- 
Tomasz Chmielewski
http://wpkg.org

