* Strange CPU usage pattern in SMP guest
@ 2010-03-21 0:13 Sebastian Hetze
2010-03-21 10:09 ` Avi Kivity
0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 0:13 UTC (permalink / raw)
To: kvm
Hi *,
in an 6 CPU SMP guest running on an host with 2 quad core
Intel Xeon E5520 with hyperthrading enabled
we see one or more guest CPUs working in a very strange
pattern. It looks like all or nothing. We can easily identify
the effected CPU with xosview. Here is the mpstat output
compared to one regular working CPU:
mpstat -P 4 1
Linux 2.6.31-16-generic-pae (guest) 21.03.2010 _i686_ (6 CPU)
00:45:19 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
00:45:20 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:21 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:22 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:23 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:24 4 0,00 66,67 0,00 0,00 0,00 33,33 0,00 0,00 0,00
00:45:25 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:26 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:27 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:28 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:29 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
00:45:30 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
00:45:31 4 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
00:45:32 4 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
00:45:33 4 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
00:45:34 4 0,00 0,00 25,00 25,00 0,00 25,00 0,00 0,00 25,00
00:45:35 4 0,00 0,00 0,00 50,00 0,00 0,00 0,00 0,00 50,00
00:45:36 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
00:45:37 4 0,00 0,00 0,00 66,67 0,00 0,00 0,00 0,00 33,33
00:45:38 4 0,00 0,00 0,00 0,00 0,00 50,00 0,00 0,00 50,00
00:45:39 4 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
00:45:40 4 0,00 0,00 0,00 50,00 0,00 25,00 0,00 0,00 25,00
00:45:41 4 0,00 0,00 50,00 0,00 0,00 50,00 0,00 0,00 0,00
00:45:42 4 0,00 33,33 0,00 0,00 0,00 0,00 0,00 0,00 66,67
00:45:43 4 33,33 0,00 33,33 0,00 0,00 33,33 0,00 0,00 0,00
00:45:44 4 0,00 0,00 0,00 33,33 0,00 33,33 0,00 0,00 33,33
00:45:45 4 0,00 0,00 0,00 100,00 0,00 0,00 0,00 0,00 0,00
mpstat -P 5 1
Linux 2.6.31-16-generic-pae (guest) 21.03.2010 _i686_ (6 CPU)
00:46:05 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
00:46:06 5 0,00 50,51 4,04 10,10 1,01 4,04 0,00 0,00 30,30
00:46:07 5 0,92 1,83 0,92 55,05 0,00 1,83 0,00 0,00 39,45
00:46:08 5 0,00 0,96 3,85 64,42 0,00 1,92 0,00 0,00 28,85
00:46:09 5 1,71 1,71 7,69 61,54 0,85 7,69 0,00 0,00 18,80
00:46:10 5 1,71 1,71 5,13 64,10 0,85 9,40 0,00 0,00 17,09
00:46:12 5 0,85 50,00 25,42 6,78 0,00 5,08 0,00 0,00 11,86
00:46:13 5 0,00 28,70 3,48 21,74 0,00 5,22 0,00 0,00 40,87
00:46:14 5 0,00 3,45 15,52 18,10 0,86 14,66 0,00 0,00 47,41
00:46:15 5 0,98 0,00 3,92 87,25 0,00 3,92 0,00 0,00 3,92
00:46:16 5 1,72 0,00 4,31 17,24 0,00 9,48 0,00 0,00 67,24
00:46:17 5 0,85 0,00 3,42 12,82 0,85 27,35 0,00 0,00 54,70
00:46:18 5 0,00 0,86 7,76 12,93 0,00 27,59 0,00 0,00 50,86
00:46:20 5 0,00 0,00 6,90 36,21 0,00 6,03 0,00 0,00 50,86
00:46:21 5 0,86 0,86 5,17 22,41 0,00 35,34 0,00 0,00 35,34
00:46:22 5 0,00 0,85 9,40 18,80 0,00 23,93 0,00 0,00 47,01
00:46:23 5 0,00 4,04 11,11 9,09 1,01 20,20 0,00 0,00 54,55
00:46:24 5 0,00 5,00 11,00 10,00 0,00 29,00 0,00 0,00 45,00
00:46:25 5 1,00 9,00 10,00 3,00 0,00 28,00 0,00 0,00 49,00
00:46:26 5 0,00 1,00 9,00 13,00 1,00 26,00 0,00 0,00 50,00
00:46:27 5 3,00 4,00 12,00 20,00 0,00 25,00 0,00 0,00 36,00
00:46:28 5 0,00 0,00 8,08 16,16 0,00 8,08 0,00 0,00 67,68
00:46:29 5 0,98 0,98 3,92 4,90 0,00 5,88 0,00 0,00 83,33
00:46:30 5 1,01 2,02 8,08 24,24 0,00 7,07 0,00 0,00 57,58
00:46:31 5 0,00 0,00 7,00 4,00 1,00 20,00 0,00 0,00 68,00
00:46:32 5 1,01 1,01 11,11 14,14 0,00 15,15 0,00 0,00 57,58
00:46:33 5 0,99 2,97 9,90 32,67 0,99 9,90 0,00 0,00 42,57
In CPU4, all values are simple fractions while CPU5 shows what I would
expect as an real live workload. Any idea whats happening here?
The host CPUs do not show any such pattern.
Best regards,
Sebastian
--
Sebastian Hetze
Mitglied des Vorstands
Linux Information Systems AG
Bundesallee 93, D-12161 Berlin
Fon: +49 30 726238-0, Fax: +49 30 726238-99
s.hetze@linux-ag.com, http://www.linux-ag.com
----------------------------------------------------------
Sitz der Gesellschaft: Putzbrunner Str. 71, 81739 München
Amtsgericht München: HRB 128 019
Vorstand: Rudolf Strobl, Sebastian Hetze
Aufsichtsrat: Michael Tarabochia (Vorsitzender)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
2010-03-21 0:13 Strange CPU usage pattern in SMP guest Sebastian Hetze
@ 2010-03-21 10:09 ` Avi Kivity
2010-03-21 12:02 ` Sebastian Hetze
[not found] ` <20100321120236.55228A0015@mail.linux-ag.de>
0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 10:09 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: kvm
[-- Attachment #1: Type: text/plain, Size: 1602 bytes --]
On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
> Hi *,
>
> in an 6 CPU SMP guest running on an host with 2 quad core
> Intel Xeon E5520 with hyperthrading enabled
> we see one or more guest CPUs working in a very strange
> pattern. It looks like all or nothing. We can easily identify
> the effected CPU with xosview. Here is the mpstat output
> compared to one regular working CPU:
>
>
> mpstat -P 4 1
> Linux 2.6.31-16-generic-pae (guest) 21.03.2010 _i686_ (6 CPU)
> 00:45:19 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
> 00:45:20 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
> 00:45:21 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
> 00:45:22 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
> 00:45:23 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
> 00:45:24 4 0,00 66,67 0,00 0,00 0,00 33,33 0,00 0,00 0,00
> 00:45:25 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
> 00:45:26 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
>
Looks like the guest is only receiving 3-4 timer interrupts per second,
so time becomes quantized.
Please run the attached irqtop in the affected guest and report the results.
Is the host overly busy? What host kernel, kvm, and qemu are you
running? Is the guest running an I/O workload? if so, how are the disks
configured?
--
error compiling committee.c: too many arguments to function
[-- Attachment #2: irqtop --]
[-- Type: text/plain, Size: 1756 bytes --]
#!/usr/bin/python
import curses
import sys, os, time, optparse
def read_interrupts():
irq = {}
proc = file('/proc/interrupts')
nrcpu = len(proc.readline().split())
for line in proc.readlines():
vec, data = line.strip().split(':', 1)
if vec in ('ERR', 'MIS'):
continue
counts = data.split(None, nrcpu)
counts, rest = (counts[:-1], counts[-1])
count = sum([int(x) for x in counts])
try:
v = int(vec)
name = rest.split(None, 1)[1]
except:
name = rest
irq[name] = count
return irq
def delta_interrupts():
old = read_interrupts()
while True:
irq = read_interrupts()
delta = {}
for key in irq.keys():
delta[key] = irq[key] - old[key]
yield delta
old = irq
label_width = 30
number_width = 10
def tui(screen):
curses.use_default_colors()
curses.noecho()
def getcount(x):
return x[1]
def refresh(irq):
screen.erase()
screen.addstr(0, 0, 'irqtop')
row = 2
for name, count in sorted(irq.items(), key = getcount, reverse = True):
if row >= screen.getmaxyx()[0]:
break
col = 1
screen.addstr(row, col, name)
col += label_width
screen.addstr(row, col, '%10d' % (count,))
row += 1
screen.refresh()
for irqs in delta_interrupts():
refresh(irqs)
curses.halfdelay(10)
try:
c = screen.getkey()
if c == 'q':
break
except KeyboardInterrupt:
break
except curses.error:
continue
import curses.wrapper
curses.wrapper(tui)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
2010-03-21 10:09 ` Avi Kivity
@ 2010-03-21 12:02 ` Sebastian Hetze
[not found] ` <20100321120236.55228A0015@mail.linux-ag.de>
1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 12:02 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm
On Sun, Mar 21, 2010 at 12:09:00PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
>> Hi *,
>>
>> in an 6 CPU SMP guest running on an host with 2 quad core
>> Intel Xeon E5520 with hyperthrading enabled
>> we see one or more guest CPUs working in a very strange
>> pattern. It looks like all or nothing. We can easily identify
>> the effected CPU with xosview. Here is the mpstat output
>> compared to one regular working CPU:
>>
>>
>> mpstat -P 4 1
>> Linux 2.6.31-16-generic-pae (guest) 21.03.2010 _i686_ (6 CPU)
>> 00:45:19 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
>> 00:45:20 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
>> 00:45:21 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
>> 00:45:22 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
>> 00:45:23 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
>> 00:45:24 4 0,00 66,67 0,00 0,00 0,00 33,33 0,00 0,00 0,00
>> 00:45:25 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
>> 00:45:26 4 0,00 100,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
>>
>
> Looks like the guest is only receiving 3-4 timer interrupts per second,
> so time becomes quantized.
>
> Please run the attached irqtop in the affected guest and report the results.
>
> Is the host overly busy? What host kernel, kvm, and qemu are you
> running? Is the guest running an I/O workload? if so, how are the disks
The host is not busy at all. In fact, currently it is running only one
guest. The host is running an ubuntu 2.6.31-14-server kernel. qemu-kvm
is 0.12.2-0ubuntu6. The kvm module has srcversion: 82D6B673524596F9CF3E84C
as stated by modinfo.
The guest occasionally is running IO workload. However, the effect is
visible all the time. And it is only one out of 6 CPUs the very same guest
is running. This is the output on the guest for all CPUs:
mpstat -P ALL 1
12:45:59 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:46:00 all 0,40 9,74 2,39 5,37 0,80 3,98 0,00 0,00 77,34
12:46:00 0 1,00 5,00 6,00 3,00 1,00 9,00 0,00 0,00 75,00
12:46:00 1 0,00 23,00 2,00 10,00 0,00 0,00 0,00 0,00 65,00
12:46:00 2 0,00 5,94 0,99 6,93 0,00 1,98 0,00 0,00 84,16
12:46:00 3 0,00 8,00 2,00 5,00 2,00 9,00 0,00 0,00 74,00
12:46:00 4 0,00 33,33 0,00 0,00 0,00 0,00 0,00 0,00 66,67
12:46:00 5 0,00 5,94 0,00 3,96 0,00 0,99 0,00 0,00 89,11
12:46:00 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:46:01 all 0,60 5,81 3,21 24,45 0,40 3,61 0,00 0,00 61,92
12:46:01 0 1,01 4,04 7,07 31,31 1,01 6,06 0,00 0,00 49,49
12:46:01 1 0,00 5,00 2,00 19,00 0,00 2,00 0,00 0,00 72,00
12:46:01 2 0,99 7,92 1,98 35,64 0,00 2,97 0,00 0,00 50,50
12:46:01 3 1,98 4,95 2,97 13,86 0,00 6,93 0,00 0,00 69,31
12:46:01 4 0,00 33,33 0,00 0,00 0,00 0,00 0,00 0,00 66,67
12:46:01 5 0,00 8,08 3,03 22,22 0,00 1,01 0,00 0,00 65,66
12:46:01 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:46:02 all 2,38 12,70 17,06 14,68 0,60 1,98 0,00 0,00 50,60
12:46:02 0 3,96 15,84 9,90 13,86 0,00 2,97 0,00 0,00 53,47
12:46:02 1 2,97 6,93 5,94 19,80 2,97 2,97 0,00 0,00 58,42
12:46:02 2 2,02 17,17 8,08 18,18 2,02 1,01 0,00 0,00 51,52
12:46:02 3 2,02 10,10 8,08 14,14 0,00 2,02 0,00 0,00 63,64
12:46:02 4 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:02 5 0,00 13,00 55,00 6,00 0,00 1,00 0,00 0,00 25,00
12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
So it is only CPU4 that is showing this strange behaviour.
The same time on the host:
mpstat -P ALL 1
12:46:00 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:46:01 all 0,16 0,00 1,92 1,37 0,00 0,05 0,00 5,97 90,53
12:46:01 0 1,00 0,00 9,00 25,00 0,00 1,00 0,00 17,00 47,00
12:46:01 1 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:01 2 0,00 0,00 13,08 0,00 0,00 0,00 0,00 17,69 69,23
12:46:01 3 0,00 0,00 2,06 0,00 0,00 0,00 0,00 8,25 89,69
12:46:01 4 0,95 0,00 0,95 0,00 0,00 0,00 0,00 22,86 75,24
12:46:01 5 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:01 6 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:01 7 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:01 8 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:01 9 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:01 10 0,87 0,00 5,22 0,00 0,00 0,00 0,00 23,48 70,43
12:46:01 11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:01 12 0,00 0,00 0,00 0,00 0,00 0,00 0,00 10,20 89,80
12:46:01 13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:01 14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:01 15 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:01 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:46:02 all 0,20 0,00 0,53 4,92 0,00 0,13 0,00 6,25 87,97
12:46:02 0 0,00 0,00 2,00 74,00 0,00 2,00 0,00 1,00 21,00
12:46:02 1 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:02 2 0,00 0,00 3,06 0,00 0,00 0,00 0,00 28,57 68,37
12:46:02 3 1,11 0,00 1,11 0,00 0,00 0,00 0,00 22,22 75,56
12:46:02 4 0,00 0,00 0,00 0,00 0,00 0,00 0,00 7,87 92,13
12:46:02 5 0,00 0,00 0,95 0,00 0,00 0,00 0,00 0,00 99,05
12:46:02 6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:02 7 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:02 8 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:02 9 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:02 10 0,92 0,00 0,92 0,00 0,00 0,00 0,00 22,02 76,15
12:46:02 11 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:02 12 1,08 0,00 0,00 0,00 0,00 0,00 0,00 15,05 83,87
12:46:02 13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:02 14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:02 15 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
12:46:03 all 0,26 0,00 1,12 1,97 0,00 0,07 0,00 13,93 82,65
12:46:03 0 1,00 0,00 6,00 30,00 0,00 1,00 0,00 0,00 62,00
12:46:03 1 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:03 2 0,00 0,00 0,00 0,00 0,00 0,00 0,00 82,18 17,82
12:46:03 3 0,00 0,00 4,76 0,00 0,00 0,00 0,00 38,10 57,14
12:46:03 4 0,00 0,00 0,00 0,00 0,00 0,00 0,00 16,49 83,51
12:46:03 5 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:03 6 0,34 0,00 1,01 0,00 0,00 0,00 0,00 11,11 87,54
12:46:03 7 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:03 8 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:03 9 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
12:46:03 10 2,25 0,00 0,00 0,00 0,00 0,00 0,00 26,97 70,79
12:46:03 11 0,00 0,00 1,67 0,00 0,00 0,00 0,00 0,00 98,33
12:46:03 12 0,00 0,00 1,98 0,00 0,00 0,00 0,00 15,84 82,18
12:46:03 13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:03 14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:46:03 15 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00
your irqtop on the guest shows something like:
Local timer interrupts 1257
Rescheduling interrupts 849
eth0 715
ata_piix 24
virtio0-input 1
Disk configuration of the guest is a single LVM volume running as raw image.
Here is the qemu-kvm command line:
/usr/bin/qemu-system-x86_64 -S -M pc-0.12 -cpu qemu32 -m 8192 -smp 6 -name guest -uuid dac8c0df-5848-48e5-904e-5548fcb5e614 -monitor unix:/var/lib/libvirt/qemu/guest.monitor,server,nowait -boot c -drive file=/dev/drbd/by-res/guest,if=ide,index=0,boot=on,format=host_device,cache=none -net nic,macaddr=24:09:61:01:32:00,vlan=0,model=virtio,name=virtio.0 -net tap,fd=20,vlan=0,name=tap.0 -serial pty -parallel none -usb -vnc 127.0.0.1:0 -k de -vga cirrus -soundhw es1370 -pcidevice host=01:00.0 -pcidevice host=06:00.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
[not found] ` <20100321120236.55228A0015@mail.linux-ag.de>
@ 2010-03-21 12:19 ` Avi Kivity
2010-03-21 14:55 ` Sebastian Hetze
[not found] ` <20100321145548.CC027A0015@mail.linux-ag.de>
0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 12:19 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: kvm
On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>
> 12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
> 12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
> 12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
> 12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
> 12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
> 12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
> 12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
> 12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
>
> So it is only CPU4 that is showing this strange behaviour.
>
Can you adjust irqtop to only count cpu4? or even just post a few 'cat
/proc/interrupts' from that guest.
Most likely the timer interrupt for cpu4 died.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
2010-03-21 12:19 ` Avi Kivity
@ 2010-03-21 14:55 ` Sebastian Hetze
[not found] ` <20100321145548.CC027A0015@mail.linux-ag.de>
1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 14:55 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm
[-- Attachment #1: Type: text/plain, Size: 2049 bytes --]
On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>
>> 12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
>> 12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
>> 12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
>> 12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
>> 12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
>> 12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
>> 12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
>> 12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
>>
>> So it is only CPU4 that is showing this strange behaviour.
>>
>
> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
> /proc/interrupts' from that guest.
>
> Most likely the timer interrupt for cpu4 died.
I've added two keys +/- to your irqtop to focus up and down
in the row of available CPUs.
The irqtop for CPU4 shows a constant number of 6 local timer interrupts
per update, while the other CPUs show various higher values:
irqtop for cpu 4
eth0 188
Rescheduling interrupts 162
Local timer interrupts 6
ata_piix 3
TLB shootdowns 1
Spurious interrupts 0
Machine check exceptions 0
irqtop for cpu 5
eth0 257
Local timer interrupts 251
Rescheduling interrupts 237
Spurious interrupts 0
Machine check exceptions 0
So the timer interrupt for cpu4 is not completely dead but somehow
broken. What can cause this problem? Any way to speed it up again?
[-- Attachment #2: irqtop --]
[-- Type: text/plain, Size: 2201 bytes --]
#!/usr/bin/python
import curses
import sys, os, time, optparse
def read_interrupts():
global target
irq = {}
proc = file('/proc/interrupts')
nrcpu = len(proc.readline().split())
if target < 0:
target = 0;
if target > nrcpu:
target = nrcpu
for line in proc.readlines():
vec, data = line.strip().split(':', 1)
if vec in ('ERR', 'MIS'):
continue
counts = data.split(None, nrcpu)
counts, rest = (counts[:-1], counts[-1])
if target == 0:
count = sum([int(x) for x in counts])
else:
count = int(counts[target-1])
try:
v = int(vec)
name = rest.split(None, 1)[1]
except:
name = rest
irq[name] = count
return irq
def delta_interrupts():
old = read_interrupts()
while True:
irq = read_interrupts()
delta = {}
for key in irq.keys():
delta[key] = irq[key] - old[key]
yield delta
old = irq
target = 0
label_width = 35
number_width = 10
def tui(screen):
curses.use_default_colors()
global target
curses.noecho()
def getcount(x):
return x[1]
def refresh(irq):
screen.erase()
if target > 0:
title = "irqtop for cpu %d"%(target-1)
else:
title = "irqtop sum for all cpu's"
screen.addstr(0, 0, title)
row = 2
for name, count in sorted(irq.items(), key = getcount, reverse = True):
if row >= screen.getmaxyx()[0]:
break
col = 1
screen.addstr(row, col, name)
col += label_width
screen.addstr(row, col, '%10d' % (count,))
row += 1
screen.refresh()
for irqs in delta_interrupts():
refresh(irqs)
curses.halfdelay(10)
try:
c = screen.getkey()
if c == 'q':
break
if c == '+':
target = target+1
if c == '-':
target = target-1
except KeyboardInterrupt:
break
except curses.error:
continue
import curses.wrapper
curses.wrapper(tui)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
[not found] ` <20100321145548.CC027A0015@mail.linux-ag.de>
@ 2010-03-21 15:17 ` Avi Kivity
2010-03-21 15:47 ` Sebastian Hetze
2010-03-22 12:51 ` Sebastian Hetze
0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 15:17 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: kvm, Marcelo Tosatti
On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>
>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>
>>> 12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
>>> 12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
>>> 12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
>>> 12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
>>> 12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
>>> 12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
>>> 12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
>>> 12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
>>>
>>> So it is only CPU4 that is showing this strange behaviour.
>>>
>>>
>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
>> /proc/interrupts' from that guest.
>>
>> Most likely the timer interrupt for cpu4 died.
>>
> I've added two keys +/- to your irqtop to focus up and down
> in the row of available CPUs.
> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> per update, while the other CPUs show various higher values:
>
> irqtop for cpu 4
>
> eth0 188
> Rescheduling interrupts 162
> Local timer interrupts 6
> ata_piix 3
> TLB shootdowns 1
> Spurious interrupts 0
> Machine check exceptions 0
>
>
> irqtop for cpu 5
>
> eth0 257
> Local timer interrupts 251
> Rescheduling interrupts 237
> Spurious interrupts 0
> Machine check exceptions 0
>
> So the timer interrupt for cpu4 is not completely dead but somehow
> broken.
That is incredibly weird.
> What can cause this problem? Any way to speed it up again?
>
The host has 8 cpus and is only running this 6 vcpu guest, yes?
Can you confirm the other vcpus are ticking at 250 Hz?
What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
last-used-cpu field in the display.
Marcelo, any ideas?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
2010-03-21 15:17 ` Avi Kivity
@ 2010-03-21 15:47 ` Sebastian Hetze
2010-03-22 12:51 ` Sebastian Hetze
1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 15:47 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm, Marcelo Tosatti
On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>>
>>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>>
>>>> 12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
>>>> 12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
>>>> 12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
>>>> 12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
>>>> 12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
>>>> 12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
>>>> 12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
>>>> 12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
>>>>
>>>> So it is only CPU4 that is showing this strange behaviour.
>>>>
>>>>
>>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
>>> /proc/interrupts' from that guest.
>>>
>>> Most likely the timer interrupt for cpu4 died.
>>>
>> I've added two keys +/- to your irqtop to focus up and down
>> in the row of available CPUs.
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> irqtop for cpu 4
>>
>> eth0 188
>> Rescheduling interrupts 162
>> Local timer interrupts 6
>> ata_piix 3
>> TLB shootdowns 1
>> Spurious interrupts 0
>> Machine check exceptions 0
>>
>>
>> irqtop for cpu 5
>>
>> eth0 257
>> Local timer interrupts 251
>> Rescheduling interrupts 237
>> Spurious interrupts 0
>> Machine check exceptions 0
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>>
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?
The host is an dual quad core E5520 with hyperthrading enabled, so we
see 2x4x2=16 CPUs on the host. The guest is started with 6 CPUs.
> Can you confirm the other vcpus are ticking at 250 Hz?
The irqtop shows different numbers for local timer interrupts on the
other CPUs. The total number (summed up over all CPUs) varies between
something like 700 and 1400. Any CPU can be down to 10 and next update
up to 260. Only CPU4 stays at the 6 local timer interrupts.
>
> What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> last-used-cpu field in the display.
The processes are not bound to a particular CPU, so the picture varies.
Here are two shots:
take1:
15 root RT -5 0 0 0 S 0 0.0 0:01.70 4 migration/4
16 root 15 -5 0 0 0 S 0 0.0 0:00.08 4 ksoftirqd/4
17 root RT -5 0 0 0 S 0 0.0 0:00.00 4 watchdog/4
25 root 15 -5 0 0 0 S 0 0.0 0:00.01 4 events/4
35 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kintegrityd/4
41 root 15 -5 0 0 0 S 0 0.0 0:00.03 4 kblockd/4
50 root 15 -5 0 0 0 S 0 0.0 0:00.90 4 ata/4
55 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kseriod
66 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 aio/4
73 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 crypto/4
80 root 15 -5 0 0 0 S 0 0.0 2:11.71 4 scsi_eh_1
87 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kmpathd/4
95 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kondemand/4
101 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kconservative/4
103 root 10 -10 0 0 0 S 0 0.0 0:00.00 4 krfcommd
681 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kdmflush
686 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kdmflush
691 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kdmflush
737 root 15 -5 0 0 0 S 0 0.0 0:00.71 4 kjournald
826 root 16 -4 2100 452 312 S 0 0.0 0:00.14 4 udevd
1350 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kpsmoused
1444 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kgameportd
1718 root 15 -5 0 0 0 S 0 0.0 0:14.62 4 kjournald
2108 statd 20 0 2252 1152 760 S 0 0.0 0:02.66 4 rpc.statd
2117 root 15 -5 0 0 0 S 0 0.0 0:00.36 4 rpciod/4
2123 root 15 -5 0 0 0 S 0 0.0 0:06.61 4 nfsiod
2259 root 20 0 1696 444 440 S 0 0.0 0:00.00 4 getty
2265 root 20 0 1696 444 440 S 0 0.0 0:00.00 4 getty
2584 root 20 0 1860 656 604 S 0 0.0 0:01.92 4 automount
2712 mysql 20 0 146m 4580 3464 S 0 0.1 0:59.71 4 mysqld
2999 root 20 0 28944 12m 1704 S 0 0.2 1:23.04 4 console-kit-dae
3834 root 20 0 13476 2640 1572 S 0 0.0 0:00.03 4 smbd
4015 root 15 -5 0 0 0 S 0 0.0 7:22.33 4 nfsd
4017 root 15 -5 0 0 0 S 0 0.0 10:46.87 4 nfsd
4030 root 20 0 2644 1180 876 S 0 0.0 0:04.32 4 rpc.mountd
4134 root 20 0 6164 1672 1344 S 0 0.0 0:00.37 4 master
4311 root 20 0 5068 1796 1580 S 0 0.0 1:25.80 4 hald-addon-stor
4560 root 20 0 2092 728 572 S 0 0.0 0:01.78 4 cron
5053 www-data 20 0 33312 4092 1004 S 0 0.0 0:00.03 4 apache2
take 2:
15 root RT -5 0 0 0 S 0 0.0 0:01.70 4 migration/4
16 root 15 -5 0 0 0 S 0 0.0 0:00.08 4 ksoftirqd/4
17 root RT -5 0 0 0 S 0 0.0 0:00.00 4 watchdog/4
25 root 15 -5 0 0 0 S 0 0.0 0:00.01 4 events/4
35 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kintegrityd/4
41 root 15 -5 0 0 0 S 0 0.0 0:00.03 4 kblockd/4
50 root 15 -5 0 0 0 S 0 0.0 0:00.90 4 ata/4
55 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kseriod
66 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 aio/4
73 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 crypto/4
87 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kmpathd/4
95 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kondemand/4
101 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kconservative/4
103 root 10 -10 0 0 0 S 0 0.0 0:00.00 4 krfcommd
681 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kdmflush
686 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kdmflush
691 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kdmflush
737 root 15 -5 0 0 0 S 0 0.0 0:00.71 4 kjournald
826 root 16 -4 2100 452 312 S 0 0.0 0:00.14 4 udevd
1350 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kpsmoused
1444 root 15 -5 0 0 0 S 0 0.0 0:00.00 4 kgameportd
2091 daemon 20 0 1820 552 464 S 0 0.0 0:44.31 4 portmap
2108 statd 20 0 2252 1152 760 S 0 0.0 0:02.66 4 rpc.statd
2117 root 15 -5 0 0 0 S 0 0.0 0:00.36 4 rpciod/4
2123 root 15 -5 0 0 0 S 0 0.0 0:06.61 4 nfsiod
2259 root 20 0 1696 444 440 S 0 0.0 0:00.00 4 getty
2265 root 20 0 1696 444 440 S 0 0.0 0:00.00 4 getty
2712 mysql 20 0 146m 4580 3464 S 0 0.1 0:59.72 4 mysqld
3834 root 20 0 13476 2640 1572 S 0 0.0 0:00.03 4 smbd
4017 root 15 -5 0 0 0 S 0 0.0 10:46.87 4 nfsd
4134 root 20 0 6164 1672 1344 S 0 0.0 0:00.37 4 master
4311 root 20 0 5068 1796 1580 S 0 0.0 1:25.80 4 hald-addon-stor
4560 root 20 0 2092 728 572 S 0 0.0 0:01.78 4 cron
5053 www-data 20 0 33312 4092 1004 S 0 0.0 0:00.03 4 apache2
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
2010-03-21 15:17 ` Avi Kivity
2010-03-21 15:47 ` Sebastian Hetze
@ 2010-03-22 12:51 ` Sebastian Hetze
2010-03-23 21:18 ` Marcelo Tosatti
1 sibling, 1 reply; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-22 12:51 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm, Marcelo Tosatti
On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>>
>>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>>
>>>> 12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
>>>> 12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
>>>> 12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
>>>> 12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
>>>> 12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
>>>> 12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
>>>> 12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
>>>> 12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
>>>>
>>>> So it is only CPU4 that is showing this strange behaviour.
>>>>
>>>>
>>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
>>> /proc/interrupts' from that guest.
>>>
>>> Most likely the timer interrupt for cpu4 died.
>>>
>> I've added two keys +/- to your irqtop to focus up and down
>> in the row of available CPUs.
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> irqtop for cpu 4
>>
>> eth0 188
>> Rescheduling interrupts 162
>> Local timer interrupts 6
>> ata_piix 3
>> TLB shootdowns 1
>> Spurious interrupts 0
>> Machine check exceptions 0
>>
>>
>> irqtop for cpu 5
>>
>> eth0 257
>> Local timer interrupts 251
>> Rescheduling interrupts 237
>> Spurious interrupts 0
>> Machine check exceptions 0
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>>
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?
>
> Can you confirm the other vcpus are ticking at 250 Hz?
>
> What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> last-used-cpu field in the display.
>
> Marcelo, any ideas?
Just to let you know, right after startup, all vcpus work fine.
The following message might be related to the problem:
hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
The guest is an 32bit system running on an 64bit host.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
2010-03-22 12:51 ` Sebastian Hetze
@ 2010-03-23 21:18 ` Marcelo Tosatti
2010-03-30 8:27 ` Sebastian Hetze
[not found] ` <20100330082743.49A113030135@mail.linux-ag.de>
0 siblings, 2 replies; 11+ messages in thread
From: Marcelo Tosatti @ 2010-03-23 21:18 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: Avi Kivity, kvm
[-- Attachment #1: Type: text/plain, Size: 3128 bytes --]
On Mon, Mar 22, 2010 at 01:51:20PM +0100, Sebastian Hetze wrote:
> On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> > On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> >> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> >>
> >>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
> >>>
> >>>> 12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
> >>>> 12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
> >>>> 12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
> >>>> 12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
> >>>> 12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
> >>>> 12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
> >>>> 12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
> >>>> 12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
> >>>>
> >>>> So it is only CPU4 that is showing this strange behaviour.
> >>>>
> >>>>
> >>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
> >>> /proc/interrupts' from that guest.
> >>>
> >>> Most likely the timer interrupt for cpu4 died.
> >>>
> >> I've added two keys +/- to your irqtop to focus up and down
> >> in the row of available CPUs.
> >> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> >> per update, while the other CPUs show various higher values:
> >>
> >> irqtop for cpu 4
> >>
> >> eth0 188
> >> Rescheduling interrupts 162
> >> Local timer interrupts 6
> >> ata_piix 3
> >> TLB shootdowns 1
> >> Spurious interrupts 0
> >> Machine check exceptions 0
> >>
> >>
> >> irqtop for cpu 5
> >>
> >> eth0 257
> >> Local timer interrupts 251
> >> Rescheduling interrupts 237
> >> Spurious interrupts 0
> >> Machine check exceptions 0
> >>
> >> So the timer interrupt for cpu4 is not completely dead but somehow
> >> broken.
> >
> > That is incredibly weird.
> >
> >> What can cause this problem? Any way to speed it up again?
> >>
> >
> > The host has 8 cpus and is only running this 6 vcpu guest, yes?
> >
> > Can you confirm the other vcpus are ticking at 250 Hz?
> >
> > What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> > last-used-cpu field in the display.
> >
> > Marcelo, any ideas?
>
> Just to let you know, right after startup, all vcpus work fine.
>
> The following message might be related to the problem:
> hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
>
> The guest is an 32bit system running on an 64bit host.
Sebastian,
Please apply the attached patch to your guest kernel.
[-- Attachment #2: hrtimer-tune-hrtimer_interrupt-hang-logic.patch --]
[-- Type: text/plain, Size: 7973 bytes --]
commit ff28aa973164f8e5f4b2633e178162d121c4e5eb
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Fri Nov 13 17:05:44 2009 +0100
hrtimer: Tune hrtimer_interrupt hang logic
The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
execution time of the hrtimer callbacks.
This is error-prone for virtual machines, where a guest vcpu can be
scheduled out during the execution of the callbacks (and the callbacks
themselves can do operations that translate to blocking operations in
the hypervisor), which in can lead to large min_delta_ns rendering the
system unusable.
Replace the current heuristics with something more reliable. Allow the
interrupt code to try 3 times to catch up with the lost time. If that
fails use the total time spent in the interrupt handler to defer the
next timer interrupt so the system can catch up with other things
which got delayed. Limit that deferment to 100ms.
The retry events and the maximum time spent in the interrupt handler
are recorded and exposed via /proc/timer_list
Inspired by a patch from Marcelo.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 4759917..2e1064f 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -164,10 +164,11 @@ struct hrtimer_clock_base {
* @expires_next: absolute time of the next event which was scheduled
* via clock_set_next_event()
* @hres_active: State of high resolution mode
- * @check_clocks: Indictator, when set evaluate time source and clock
- * event devices whether high resolution mode can be
- * activated.
- * @nr_events: Total number of timer interrupt events
+ * @hang_detected: The last hrtimer interrupt detected a hang
+ * @nr_events: Total number of hrtimer interrupt events
+ * @nr_retries: Total number of hrtimer interrupt retries
+ * @nr_hangs: Total number of hrtimer interrupt hangs
+ * @max_hang_time: Maximum time spent in hrtimer_interrupt
*/
struct hrtimer_cpu_base {
spinlock_t lock;
@@ -175,7 +176,11 @@ struct hrtimer_cpu_base {
#ifdef CONFIG_HIGH_RES_TIMERS
ktime_t expires_next;
int hres_active;
+ int hang_detected;
unsigned long nr_events;
+ unsigned long nr_retries;
+ unsigned long nr_hangs;
+ ktime_t max_hang_time;
#endif
};
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 49da79a..4b0638b 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -584,7 +584,7 @@ static void hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base)
static int hrtimer_reprogram(struct hrtimer *timer,
struct hrtimer_clock_base *base)
{
- ktime_t *expires_next = &__get_cpu_var(hrtimer_bases).expires_next;
+ struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
int res;
@@ -609,7 +609,16 @@ static int hrtimer_reprogram(struct hrtimer *timer,
if (expires.tv64 < 0)
return -ETIME;
- if (expires.tv64 >= expires_next->tv64)
+ if (expires.tv64 >= cpu_base->expires_next.tv64)
+ return 0;
+
+ /*
+ * If a hang was detected in the last timer interrupt then we
+ * do not schedule a timer which is earlier than the expiry
+ * which we enforced in the hang detection. We want the system
+ * to make progress.
+ */
+ if (cpu_base->hang_detected)
return 0;
/*
@@ -617,7 +626,7 @@ static int hrtimer_reprogram(struct hrtimer *timer,
*/
res = tick_program_event(expires, 0);
if (!IS_ERR_VALUE(res))
- *expires_next = expires;
+ cpu_base->expires_next = expires;
return res;
}
@@ -1234,29 +1243,6 @@ static void __run_hrtimer(struct hrtimer *timer)
#ifdef CONFIG_HIGH_RES_TIMERS
-static int force_clock_reprogram;
-
-/*
- * After 5 iteration's attempts, we consider that hrtimer_interrupt()
- * is hanging, which could happen with something that slows the interrupt
- * such as the tracing. Then we force the clock reprogramming for each future
- * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
- * threshold that we will overwrite.
- * The next tick event will be scheduled to 3 times we currently spend on
- * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
- * 1/4 of their time to process the hrtimer interrupts. This is enough to
- * let it running without serious starvation.
- */
-
-static inline void
-hrtimer_interrupt_hanging(struct clock_event_device *dev,
- ktime_t try_time)
-{
- force_clock_reprogram = 1;
- dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
- printk(KERN_WARNING "hrtimer: interrupt too slow, "
- "forcing clock min delta to %lu ns\n", dev->min_delta_ns);
-}
/*
* High resolution timer interrupt
* Called with interrupts disabled
@@ -1265,21 +1251,15 @@ void hrtimer_interrupt(struct clock_event_device *dev)
{
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
struct hrtimer_clock_base *base;
- ktime_t expires_next, now;
- int nr_retries = 0;
- int i;
+ ktime_t expires_next, now, entry_time, delta;
+ int i, retries = 0;
BUG_ON(!cpu_base->hres_active);
cpu_base->nr_events++;
dev->next_event.tv64 = KTIME_MAX;
- retry:
- /* 5 retries is enough to notice a hang */
- if (!(++nr_retries % 5))
- hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
-
- now = ktime_get();
-
+ entry_time = now = ktime_get();
+retry:
expires_next.tv64 = KTIME_MAX;
spin_lock(&cpu_base->lock);
@@ -1341,10 +1321,48 @@ void hrtimer_interrupt(struct clock_event_device *dev)
spin_unlock(&cpu_base->lock);
/* Reprogramming necessary ? */
- if (expires_next.tv64 != KTIME_MAX) {
- if (tick_program_event(expires_next, force_clock_reprogram))
- goto retry;
+ if (expires_next.tv64 == KTIME_MAX ||
+ !tick_program_event(expires_next, 0)) {
+ cpu_base->hang_detected = 0;
+ return;
}
+
+ /*
+ * The next timer was already expired due to:
+ * - tracing
+ * - long lasting callbacks
+ * - being scheduled away when running in a VM
+ *
+ * We need to prevent that we loop forever in the hrtimer
+ * interrupt routine. We give it 3 attempts to avoid
+ * overreacting on some spurious event.
+ */
+ now = ktime_get();
+ cpu_base->nr_retries++;
+ if (++retries < 3)
+ goto retry;
+ /*
+ * Give the system a chance to do something else than looping
+ * here. We stored the entry time, so we know exactly how long
+ * we spent here. We schedule the next event this amount of
+ * time away.
+ */
+ cpu_base->nr_hangs++;
+ cpu_base->hang_detected = 1;
+ delta = ktime_sub(now, entry_time);
+ if (delta.tv64 > cpu_base->max_hang_time.tv64)
+ cpu_base->max_hang_time = delta;
+ /*
+ * Limit it to a sensible value as we enforce a longer
+ * delay. Give the CPU at least 100ms to catch up.
+ */
+ if (delta.tv64 > 100 * NSEC_PER_MSEC)
+ expires_next = ktime_add_ns(now, 100 * NSEC_PER_MSEC);
+ else
+ expires_next = ktime_add(now, delta);
+ tick_program_event(expires_next, 1);
+ printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
+ ktime_to_ns(delta));
}
/*
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index fddd69d..7cd905c 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -150,6 +150,9 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
P_ns(expires_next);
P(hres_active);
P(nr_events);
+ P(nr_retries);
+ P(nr_hangs);
+ P_ns(max_hang_time);
#endif
#undef P
#undef P_ns
@@ -252,7 +255,7 @@ static int timer_list_show(struct seq_file *m, void *v)
u64 now = ktime_to_ns(ktime_get());
int cpu;
- SEQ_printf(m, "Timer List Version: v0.4\n");
+ SEQ_printf(m, "Timer List Version: v0.5\n");
SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES);
SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now);
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
2010-03-23 21:18 ` Marcelo Tosatti
@ 2010-03-30 8:27 ` Sebastian Hetze
[not found] ` <20100330082743.49A113030135@mail.linux-ag.de>
1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-30 8:27 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Sebastian Hetze, Avi Kivity, kvm
On Tue, Mar 23, 2010 at 06:18:08PM -0300, Marcelo Tosatti wrote:
> On Mon, Mar 22, 2010 at 01:51:20PM +0100, Sebastian Hetze wrote:
> > On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> > > On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> > >> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> > >>
> > >>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
> > >>>
> > >>>> 12:46:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
> > >>>> 12:46:03 all 0,20 11,35 10,96 8,96 0,40 2,99 0,00 0,00 65,14
> > >>>> 12:46:03 0 1,00 11,00 7,00 15,00 0,00 1,00 0,00 0,00 65,00
> > >>>> 12:46:03 1 0,00 7,14 2,04 6,12 1,02 11,22 0,00 0,00 72,45
> > >>>> 12:46:03 2 0,00 15,00 1,00 12,00 0,00 1,00 0,00 0,00 71,00
> > >>>> 12:46:03 3 0,00 11,00 23,00 8,00 0,00 0,00 0,00 0,00 58,00
> > >>>> 12:46:03 4 0,00 0,00 50,00 0,00 0,00 0,00 0,00 0,00 50,00
> > >>>> 12:46:03 5 0,00 13,00 20,00 4,00 0,00 1,00 0,00 0,00 62,00
> > >>>>
> > >>>> So it is only CPU4 that is showing this strange behaviour.
> > >>>>
> > >>>>
> > >>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
> > >>> /proc/interrupts' from that guest.
> > >>>
> > >>> Most likely the timer interrupt for cpu4 died.
> > >>>
> > >> I've added two keys +/- to your irqtop to focus up and down
> > >> in the row of available CPUs.
> > >> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> > >> per update, while the other CPUs show various higher values:
> > >>
> > >> irqtop for cpu 4
> > >>
> > >> eth0 188
> > >> Rescheduling interrupts 162
> > >> Local timer interrupts 6
> > >> ata_piix 3
> > >> TLB shootdowns 1
> > >> Spurious interrupts 0
> > >> Machine check exceptions 0
> > >>
> > >>
> > >> irqtop for cpu 5
> > >>
> > >> eth0 257
> > >> Local timer interrupts 251
> > >> Rescheduling interrupts 237
> > >> Spurious interrupts 0
> > >> Machine check exceptions 0
> > >>
> > >> So the timer interrupt for cpu4 is not completely dead but somehow
> > >> broken.
> > >
> > > That is incredibly weird.
> > >
> > >> What can cause this problem? Any way to speed it up again?
> > >>
> > >
> > > The host has 8 cpus and is only running this 6 vcpu guest, yes?
> > >
> > > Can you confirm the other vcpus are ticking at 250 Hz?
> > >
> > > What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> > > last-used-cpu field in the display.
> > >
> > > Marcelo, any ideas?
> >
> > Just to let you know, right after startup, all vcpus work fine.
> >
> > The following message might be related to the problem:
> > hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
> >
> > The guest is an 32bit system running on an 64bit host.
>
> Sebastian,
>
> Please apply the attached patch to your guest kernel.
>
With this patch applied, the system runs without hrtimer messages since
5 days and the timer iterrupts look fine. However, I had this
Clocksource tsc unstable (delta = -4398046474878 ns) message that I
reported on Sunday.
Actually, when restarting the system with the hrtimer patch applied,
we also changed the BIOS setting to disable Intel SmartStep on the host.
Since there are no hrtimer messages at all, it might be that the SmartStep
CPU frequency adjustment is the real cause for the slow interrupts in
the KVM guest. Anyone else experienced these problems?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
[not found] ` <20100330082743.49A113030135@mail.linux-ag.de>
@ 2010-04-05 21:24 ` Sebastian Hetze
0 siblings, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-04-05 21:24 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: Marcelo Tosatti, Avi Kivity, kvm
On Tue, Mar 30, 2010 at 10:27:43AM +0200, Sebastian Hetze wrote:
> With this patch applied, the system runs without hrtimer messages since
> 5 days and the timer iterrupts look fine.
>
> Actually, when restarting the system with the hrtimer patch applied,
> we also changed the BIOS setting to disable Intel SmartStep on the host.
> Since there are no hrtimer messages at all, it might be that the SmartStep
> CPU frequency adjustment is the real cause for the slow interrupts in
> the KVM guest. Anyone else experienced these problems?
Back with SmartStep enabled, we get "hrtimer: interrupt took 119986041 ns"
so it really looks like this SmartStep feature is the actual cause for
the interrupts to be delayed in the first place.
The system continues to run correctly so far, so the patch does fix the
problems caused by these delayed interrupts.
Thanx alot & best regards,
Sebastian
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-04-05 21:24 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-21 0:13 Strange CPU usage pattern in SMP guest Sebastian Hetze
2010-03-21 10:09 ` Avi Kivity
2010-03-21 12:02 ` Sebastian Hetze
[not found] ` <20100321120236.55228A0015@mail.linux-ag.de>
2010-03-21 12:19 ` Avi Kivity
2010-03-21 14:55 ` Sebastian Hetze
[not found] ` <20100321145548.CC027A0015@mail.linux-ag.de>
2010-03-21 15:17 ` Avi Kivity
2010-03-21 15:47 ` Sebastian Hetze
2010-03-22 12:51 ` Sebastian Hetze
2010-03-23 21:18 ` Marcelo Tosatti
2010-03-30 8:27 ` Sebastian Hetze
[not found] ` <20100330082743.49A113030135@mail.linux-ag.de>
2010-04-05 21:24 ` Sebastian Hetze
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.