* Strange CPU usage pattern in SMP guest
@ 2010-03-21  0:13 Sebastian Hetze
  2010-03-21 10:09 ` Avi Kivity
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21  0:13 UTC (permalink / raw)
  To: kvm

Hi *,

in a 6-CPU SMP guest running on a host with two quad-core
Intel Xeon E5520 CPUs with hyperthreading enabled,
we see one or more guest CPUs working in a very strange
pattern. It looks like all or nothing. We can easily identify
the affected CPU with xosview. Here is the mpstat output
compared to one regularly working CPU:


mpstat -P 4 1
Linux 2.6.31-16-generic-pae (guest) 	21.03.2010 	_i686_	(6 CPU)
00:45:19     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
00:45:20       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:21       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:22       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:23       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:24       4    0,00   66,67    0,00    0,00    0,00   33,33    0,00    0,00    0,00
00:45:25       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:26       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:27       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:28       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:29       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
00:45:30       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
00:45:31       4    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
00:45:32       4    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
00:45:33       4    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
00:45:34       4    0,00    0,00   25,00   25,00    0,00   25,00    0,00    0,00   25,00
00:45:35       4    0,00    0,00    0,00   50,00    0,00    0,00    0,00    0,00   50,00
00:45:36       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
00:45:37       4    0,00    0,00    0,00   66,67    0,00    0,00    0,00    0,00   33,33
00:45:38       4    0,00    0,00    0,00    0,00    0,00   50,00    0,00    0,00   50,00
00:45:39       4    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
00:45:40       4    0,00    0,00    0,00   50,00    0,00   25,00    0,00    0,00   25,00
00:45:41       4    0,00    0,00   50,00    0,00    0,00   50,00    0,00    0,00    0,00
00:45:42       4    0,00   33,33    0,00    0,00    0,00    0,00    0,00    0,00   66,67
00:45:43       4   33,33    0,00   33,33    0,00    0,00   33,33    0,00    0,00    0,00
00:45:44       4    0,00    0,00    0,00   33,33    0,00   33,33    0,00    0,00   33,33
00:45:45       4    0,00    0,00    0,00  100,00    0,00    0,00    0,00    0,00    0,00


mpstat -P 5 1
Linux 2.6.31-16-generic-pae (guest) 	21.03.2010 	_i686_	(6 CPU)
00:46:05     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
00:46:06       5    0,00   50,51    4,04   10,10    1,01    4,04    0,00    0,00   30,30
00:46:07       5    0,92    1,83    0,92   55,05    0,00    1,83    0,00    0,00   39,45
00:46:08       5    0,00    0,96    3,85   64,42    0,00    1,92    0,00    0,00   28,85
00:46:09       5    1,71    1,71    7,69   61,54    0,85    7,69    0,00    0,00   18,80
00:46:10       5    1,71    1,71    5,13   64,10    0,85    9,40    0,00    0,00   17,09
00:46:12       5    0,85   50,00   25,42    6,78    0,00    5,08    0,00    0,00   11,86
00:46:13       5    0,00   28,70    3,48   21,74    0,00    5,22    0,00    0,00   40,87
00:46:14       5    0,00    3,45   15,52   18,10    0,86   14,66    0,00    0,00   47,41
00:46:15       5    0,98    0,00    3,92   87,25    0,00    3,92    0,00    0,00    3,92
00:46:16       5    1,72    0,00    4,31   17,24    0,00    9,48    0,00    0,00   67,24
00:46:17       5    0,85    0,00    3,42   12,82    0,85   27,35    0,00    0,00   54,70
00:46:18       5    0,00    0,86    7,76   12,93    0,00   27,59    0,00    0,00   50,86
00:46:20       5    0,00    0,00    6,90   36,21    0,00    6,03    0,00    0,00   50,86
00:46:21       5    0,86    0,86    5,17   22,41    0,00   35,34    0,00    0,00   35,34
00:46:22       5    0,00    0,85    9,40   18,80    0,00   23,93    0,00    0,00   47,01
00:46:23       5    0,00    4,04   11,11    9,09    1,01   20,20    0,00    0,00   54,55
00:46:24       5    0,00    5,00   11,00   10,00    0,00   29,00    0,00    0,00   45,00
00:46:25       5    1,00    9,00   10,00    3,00    0,00   28,00    0,00    0,00   49,00
00:46:26       5    0,00    1,00    9,00   13,00    1,00   26,00    0,00    0,00   50,00
00:46:27       5    3,00    4,00   12,00   20,00    0,00   25,00    0,00    0,00   36,00
00:46:28       5    0,00    0,00    8,08   16,16    0,00    8,08    0,00    0,00   67,68
00:46:29       5    0,98    0,98    3,92    4,90    0,00    5,88    0,00    0,00   83,33
00:46:30       5    1,01    2,02    8,08   24,24    0,00    7,07    0,00    0,00   57,58
00:46:31       5    0,00    0,00    7,00    4,00    1,00   20,00    0,00    0,00   68,00
00:46:32       5    1,01    1,01   11,11   14,14    0,00   15,15    0,00    0,00   57,58
00:46:33       5    0,99    2,97    9,90   32,67    0,99    9,90    0,00    0,00   42,57

On CPU4, all values are simple fractions, while CPU5 shows what I would
expect from a real-life workload. Any idea what's happening here?
The host CPUs do not show any such pattern.

Best regards,

  Sebastian
-- 
Sebastian Hetze
Mitglied des Vorstands
Linux Information Systems AG
Bundesallee 93, D-12161 Berlin

Fon: +49 30 726238-0, Fax: +49 30 726238-99
s.hetze@linux-ag.com, http://www.linux-ag.com
----------------------------------------------------------
Sitz der Gesellschaft: Putzbrunner Str. 71, 81739 München
Amtsgericht München: HRB 128 019
Vorstand: Rudolf Strobl, Sebastian Hetze
Aufsichtsrat: Michael Tarabochia (Vorsitzender)

* Re: Strange CPU usage pattern in SMP guest
  2010-03-21  0:13 Strange CPU usage pattern in SMP guest Sebastian Hetze
@ 2010-03-21 10:09 ` Avi Kivity
  2010-03-21 12:02   ` Sebastian Hetze
       [not found]   ` <20100321120236.55228A0015@mail.linux-ag.de>
  0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 10:09 UTC (permalink / raw)
  To: Sebastian Hetze; +Cc: kvm

[-- Attachment #1: Type: text/plain, Size: 1602 bytes --]

On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
> Hi *,
>
> in a 6-CPU SMP guest running on a host with two quad-core
> Intel Xeon E5520 CPUs with hyperthreading enabled,
> we see one or more guest CPUs working in a very strange
> pattern. It looks like all or nothing. We can easily identify
> the affected CPU with xosview. Here is the mpstat output
> compared to one regularly working CPU:
>
>
> mpstat -P 4 1
> Linux 2.6.31-16-generic-pae (guest) 	21.03.2010 	_i686_	(6 CPU)
> 00:45:19     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 00:45:20       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
> 00:45:21       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
> 00:45:22       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
> 00:45:23       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
> 00:45:24       4    0,00   66,67    0,00    0,00    0,00   33,33    0,00    0,00    0,00
> 00:45:25       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
> 00:45:26       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
>    

Looks like the guest is only receiving 3-4 timer interrupts per second, 
so time becomes quantized.
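
For illustration, here is a minimal sketch (assuming the usual tick-based
accounting behind /proc/stat, which is what mpstat reads) of why a sample
containing only 2-4 ticks can only report a few discrete percentages:

# Minimal sketch: percentage values possible when a one-second sample
# contains only a few timer ticks, versus a full 250 Hz tick rate.
for ticks in (2, 3, 4, 250):
    step = 100.0 / ticks
    if ticks <= 4:
        values = ", ".join("%.2f" % (i * step) for i in range(ticks + 1))
        print("%3d ticks/sample -> only possible values: %s" % (ticks, values))
    else:
        print("%3d ticks/sample -> %.2f%% granularity, looks continuous" % (ticks, step))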

Please run the attached irqtop in the affected guest and report the results.

Is the host overly busy?  What host kernel, kvm, and qemu are you
running?  Is the guest running an I/O workload? If so, how are the disks
configured?

-- 
error compiling committee.c: too many arguments to function


[-- Attachment #2: irqtop --]
[-- Type: text/plain, Size: 1756 bytes --]

#!/usr/bin/python

import curses
import sys, os, time, optparse

def read_interrupts():
    # Parse /proc/interrupts into a {source name: total count} dict.
    irq = {}
    proc = file('/proc/interrupts')
    # The header row has one column per CPU.
    nrcpu = len(proc.readline().split())
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        # The first nrcpu fields are per-CPU counts, the rest is the description.
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        count = sum([int(x) for x in counts])
        try:
            # Numbered vectors read "<chip> <device>"; keep only the device name.
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

label_width = 30
number_width = 10

def tui(screen):
    curses.use_default_colors()
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        screen.addstr(0, 0, 'irqtop')
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()

    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)

* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 10:09 ` Avi Kivity
@ 2010-03-21 12:02   ` Sebastian Hetze
       [not found]   ` <20100321120236.55228A0015@mail.linux-ag.de>
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 12:02 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Sebastian Hetze, kvm

On Sun, Mar 21, 2010 at 12:09:00PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
>> Hi *,
>>
>> in a 6-CPU SMP guest running on a host with two quad-core
>> Intel Xeon E5520 CPUs with hyperthreading enabled,
>> we see one or more guest CPUs working in a very strange
>> pattern. It looks like all or nothing. We can easily identify
>> the affected CPU with xosview. Here is the mpstat output
>> compared to one regularly working CPU:
>>
>>
>> mpstat -P 4 1
>> Linux 2.6.31-16-generic-pae (guest) 	21.03.2010 	_i686_	(6 CPU)
>> 00:45:19     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
>> 00:45:20       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
>> 00:45:21       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
>> 00:45:22       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
>> 00:45:23       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
>> 00:45:24       4    0,00   66,67    0,00    0,00    0,00   33,33    0,00    0,00    0,00
>> 00:45:25       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
>> 00:45:26       4    0,00  100,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00
>>    
>
> Looks like the guest is only receiving 3-4 timer interrupts per second,  
> so time becomes quantized.
>
> Please run the attached irqtop in the affected guest and report the results.
>
> Is the host overly busy?  What host kernel, kvm, and qemu are you  
> running?  Is the guest running an I/O workload? if so, how are the disks  

The host is not busy at all. In fact, it is currently running only one
guest. The host is running an Ubuntu 2.6.31-14-server kernel; qemu-kvm
is 0.12.2-0ubuntu6. The kvm module has srcversion 82D6B673524596F9CF3E84C,
as reported by modinfo.

The guest occasionally runs an I/O workload. However, the effect is
visible all the time, and it affects only one out of the 6 CPUs of the
very same guest. This is the output on the guest for all CPUs:

mpstat -P ALL 1
12:45:59     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:00     all    0,40    9,74    2,39    5,37    0,80    3,98    0,00    0,00   77,34
12:46:00       0    1,00    5,00    6,00    3,00    1,00    9,00    0,00    0,00   75,00
12:46:00       1    0,00   23,00    2,00   10,00    0,00    0,00    0,00    0,00   65,00
12:46:00       2    0,00    5,94    0,99    6,93    0,00    1,98    0,00    0,00   84,16
12:46:00       3    0,00    8,00    2,00    5,00    2,00    9,00    0,00    0,00   74,00
12:46:00       4    0,00   33,33    0,00    0,00    0,00    0,00    0,00    0,00   66,67
12:46:00       5    0,00    5,94    0,00    3,96    0,00    0,99    0,00    0,00   89,11

12:46:00     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:01     all    0,60    5,81    3,21   24,45    0,40    3,61    0,00    0,00   61,92
12:46:01       0    1,01    4,04    7,07   31,31    1,01    6,06    0,00    0,00   49,49
12:46:01       1    0,00    5,00    2,00   19,00    0,00    2,00    0,00    0,00   72,00
12:46:01       2    0,99    7,92    1,98   35,64    0,00    2,97    0,00    0,00   50,50
12:46:01       3    1,98    4,95    2,97   13,86    0,00    6,93    0,00    0,00   69,31
12:46:01       4    0,00   33,33    0,00    0,00    0,00    0,00    0,00    0,00   66,67
12:46:01       5    0,00    8,08    3,03   22,22    0,00    1,01    0,00    0,00   65,66

12:46:01     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:02     all    2,38   12,70   17,06   14,68    0,60    1,98    0,00    0,00   50,60
12:46:02       0    3,96   15,84    9,90   13,86    0,00    2,97    0,00    0,00   53,47
12:46:02       1    2,97    6,93    5,94   19,80    2,97    2,97    0,00    0,00   58,42
12:46:02       2    2,02   17,17    8,08   18,18    2,02    1,01    0,00    0,00   51,52
12:46:02       3    2,02   10,10    8,08   14,14    0,00    2,02    0,00    0,00   63,64
12:46:02       4    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:02       5    0,00   13,00   55,00    6,00    0,00    1,00    0,00    0,00   25,00

12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00

So it is only CPU4 that is showing this strange behaviour.
At the same time, the host shows:

mpstat -P ALL 1
12:46:00     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:01     all    0,16    0,00    1,92    1,37    0,00    0,05    0,00    5,97   90,53
12:46:01       0    1,00    0,00    9,00   25,00    0,00    1,00    0,00   17,00   47,00
12:46:01       1    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:01       2    0,00    0,00   13,08    0,00    0,00    0,00    0,00   17,69   69,23
12:46:01       3    0,00    0,00    2,06    0,00    0,00    0,00    0,00    8,25   89,69
12:46:01       4    0,95    0,00    0,95    0,00    0,00    0,00    0,00   22,86   75,24
12:46:01       5    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:01       6    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:01       7    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:01       8    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:01       9    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:01      10    0,87    0,00    5,22    0,00    0,00    0,00    0,00   23,48   70,43
12:46:01      11    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:01      12    0,00    0,00    0,00    0,00    0,00    0,00    0,00   10,20   89,80
12:46:01      13    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:01      14    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:01      15    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00

12:46:01     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:02     all    0,20    0,00    0,53    4,92    0,00    0,13    0,00    6,25   87,97
12:46:02       0    0,00    0,00    2,00   74,00    0,00    2,00    0,00    1,00   21,00
12:46:02       1    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:02       2    0,00    0,00    3,06    0,00    0,00    0,00    0,00   28,57   68,37
12:46:02       3    1,11    0,00    1,11    0,00    0,00    0,00    0,00   22,22   75,56
12:46:02       4    0,00    0,00    0,00    0,00    0,00    0,00    0,00    7,87   92,13
12:46:02       5    0,00    0,00    0,95    0,00    0,00    0,00    0,00    0,00   99,05
12:46:02       6    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:02       7    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:02       8    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:02       9    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:02      10    0,92    0,00    0,92    0,00    0,00    0,00    0,00   22,02   76,15
12:46:02      11    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:02      12    1,08    0,00    0,00    0,00    0,00    0,00    0,00   15,05   83,87
12:46:02      13    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:02      14    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:02      15    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00

12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:03     all    0,26    0,00    1,12    1,97    0,00    0,07    0,00   13,93   82,65
12:46:03       0    1,00    0,00    6,00   30,00    0,00    1,00    0,00    0,00   62,00
12:46:03       1    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:03       2    0,00    0,00    0,00    0,00    0,00    0,00    0,00   82,18   17,82
12:46:03       3    0,00    0,00    4,76    0,00    0,00    0,00    0,00   38,10   57,14
12:46:03       4    0,00    0,00    0,00    0,00    0,00    0,00    0,00   16,49   83,51
12:46:03       5    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:03       6    0,34    0,00    1,01    0,00    0,00    0,00    0,00   11,11   87,54
12:46:03       7    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:03       8    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:03       9    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00
12:46:03      10    2,25    0,00    0,00    0,00    0,00    0,00    0,00   26,97   70,79
12:46:03      11    0,00    0,00    1,67    0,00    0,00    0,00    0,00    0,00   98,33
12:46:03      12    0,00    0,00    1,98    0,00    0,00    0,00    0,00   15,84   82,18
12:46:03      13    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:03      14    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:03      15    0,00    0,00    0,00    0,00    0,00    0,00    0,00    0,00  100,00


Your irqtop on the guest shows something like:

 Local timer interrupts              1257
 Rescheduling interrupts              849
 eth0                                 715
 ata_piix                              24
 virtio0-input                          1


The disk configuration of the guest is a single LVM volume used as a raw image.

Here is the qemu-kvm command line:
/usr/bin/qemu-system-x86_64 -S -M pc-0.12 -cpu qemu32 -m 8192 -smp 6 -name guest -uuid dac8c0df-5848-48e5-904e-5548fcb5e614 -monitor unix:/var/lib/libvirt/qemu/guest.monitor,server,nowait -boot c -drive file=/dev/drbd/by-res/guest,if=ide,index=0,boot=on,format=host_device,cache=none -net nic,macaddr=24:09:61:01:32:00,vlan=0,model=virtio,name=virtio.0 -net tap,fd=20,vlan=0,name=tap.0 -serial pty -parallel none -usb -vnc 127.0.0.1:0 -k de -vga cirrus -soundhw es1370 -pcidevice host=01:00.0 -pcidevice host=06:00.1


* Re: Strange CPU usage pattern in SMP guest
       [not found]   ` <20100321120236.55228A0015@mail.linux-ag.de>
@ 2010-03-21 12:19     ` Avi Kivity
  2010-03-21 14:55       ` Sebastian Hetze
       [not found]       ` <20100321145548.CC027A0015@mail.linux-ag.de>
  0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 12:19 UTC (permalink / raw)
  To: Sebastian Hetze; +Cc: kvm

On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>
> 12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
> 12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
> 12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
> 12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
> 12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
> 12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
> 12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00
>
> So it is only CPU4 that is showing this strange behaviour.
>    

Can you adjust irqtop to only count cpu4?  Or even just post a few 'cat
/proc/interrupts' snapshots from that guest.

Most likely the timer interrupt for cpu4 died.
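
If adjusting irqtop is inconvenient, something along these lines would do
as well -- a minimal sketch that assumes the standard /proc/interrupts
layout, takes the CPU number as an argument (default 4) and prints the
per-source interrupt counts that CPU accumulated over one second:

#!/usr/bin/python
# Minimal sketch: per-CPU interrupt deltas over a one-second window.
# Assumes the standard /proc/interrupts layout and that the requested
# CPU number actually exists in the guest.
import sys, time

def snapshot(cpu):
    counts = {}
    proc = open('/proc/interrupts')
    nrcpu = len(proc.readline().split())      # header row: CPU0 CPU1 ...
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        fields = data.split(None, nrcpu)
        name = fields[nrcpu] if len(fields) > nrcpu else vec
        counts[name] = int(fields[cpu])
    proc.close()
    return counts

cpu = int(sys.argv[1]) if len(sys.argv) > 1 else 4
before = snapshot(cpu)
time.sleep(1)
after = snapshot(cpu)
for name in sorted(after, key=lambda n: after[n] - before.get(n, 0), reverse=True):
    print('%-40s %10d' % (name, after[name] - before.get(name, 0)))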

-- 
error compiling committee.c: too many arguments to function


* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 12:19     ` Avi Kivity
@ 2010-03-21 14:55       ` Sebastian Hetze
       [not found]       ` <20100321145548.CC027A0015@mail.linux-ag.de>
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 14:55 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Sebastian Hetze, kvm

[-- Attachment #1: Type: text/plain, Size: 2049 bytes --]

On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>
>> 12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
>> 12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
>> 12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
>> 12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
>> 12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
>> 12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
>> 12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
>> 12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00
>>
>> So it is only CPU4 that is showing this strange behaviour.
>>    
>
> Can you adjust irqtop to only count cpu4?  or even just post a few 'cat  
> /proc/interrupts' from that guest.
>
> Most likely the timer interrupt for cpu4 died.

I've added two keys, + and -, to your irqtop to step up and down
through the available CPUs.
The irqtop for CPU4 shows a constant 6 local timer interrupts
per update, while the other CPUs show varying, much higher values:

irqtop for cpu 4

 eth0                                      188
 Rescheduling interrupts                   162
 Local timer interrupts                      6
 ata_piix                                    3
 TLB shootdowns                              1
 Spurious interrupts                         0
 Machine check exceptions                    0


irqtop for cpu 5

 eth0                                      257
 Local timer interrupts                    251
 Rescheduling interrupts                   237
 Spurious interrupts                         0
 Machine check exceptions                    0

So the timer interrupt for cpu4 is not completely dead, but it is somehow
broken. What could cause this problem? Is there any way to speed it up again?


[-- Attachment #2: irqtop --]
[-- Type: text/plain, Size: 2201 bytes --]

#!/usr/bin/python

import curses
import sys, os, time, optparse

def read_interrupts():
    global target
    irq = {}
    proc = file('/proc/interrupts')
    nrcpu = len(proc.readline().split())
    if target < 0:
        target = 0
    if target > nrcpu:
        target = nrcpu
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        # target == 0 means "sum over all CPUs", otherwise show CPU target-1.
        if target == 0:
            count = sum([int(x) for x in counts])
        else:
            count = int(counts[target-1])
        try:
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

target = 0
label_width = 35
number_width = 10

def tui(screen):
    curses.use_default_colors()
    global target
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        if target > 0:
            title = "irqtop for cpu %d" % (target - 1)
        else:
            title = "irqtop sum for all cpus"
        screen.addstr(0, 0, title)
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()

    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
            if c == '+':
                target = target+1
            if c == '-':
                target = target-1
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)

* Re: Strange CPU usage pattern in SMP guest
       [not found]       ` <20100321145548.CC027A0015@mail.linux-ag.de>
@ 2010-03-21 15:17         ` Avi Kivity
  2010-03-21 15:47           ` Sebastian Hetze
  2010-03-22 12:51           ` Sebastian Hetze
  0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 15:17 UTC (permalink / raw)
  To: Sebastian Hetze; +Cc: kvm, Marcelo Tosatti

On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>    
>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>      
>>> 12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
>>> 12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
>>> 12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
>>> 12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
>>> 12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
>>> 12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
>>> 12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
>>> 12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00
>>>
>>> So it is only CPU4 that is showing this strange behaviour.
>>>
>>>        
>> Can you adjust irqtop to only count cpu4?  or even just post a few 'cat
>> /proc/interrupts' from that guest.
>>
>> Most likely the timer interrupt for cpu4 died.
>>      
> I've added two keys +/- to your irqtop to focus up and down
> in the row of available CPUs.
> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> per update, while the other CPUs show various higher values:
>
> irqtop for cpu 4
>
>   eth0                                      188
>   Rescheduling interrupts                   162
>   Local timer interrupts                      6
>   ata_piix                                    3
>   TLB shootdowns                              1
>   Spurious interrupts                         0
>   Machine check exceptions                    0
>
>
> irqtop for cpu 5
>
>   eth0                                      257
>   Local timer interrupts                    251
>   Rescheduling interrupts                   237
>   Spurious interrupts                         0
>   Machine check exceptions                    0
>
> So the timer interrupt for cpu4 is not completely dead but somehow
> broken.

That is incredibly weird.

> What can cause this problem? Any way to speed it up again?
>    

The host has 8 cpus and is only running this 6 vcpu guest, yes?

Can you confirm the other vcpus are ticking at 250 Hz?

What does 'top' show running on cpu 4?  Pressing 'f' 'j' will add a 
last-used-cpu field in the display.

Marcelo, any ideas?

-- 
error compiling committee.c: too many arguments to function


* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 15:17         ` Avi Kivity
@ 2010-03-21 15:47           ` Sebastian Hetze
  2010-03-22 12:51           ` Sebastian Hetze
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 15:47 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Sebastian Hetze, kvm, Marcelo Tosatti

On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>>    
>>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>>      
>>>> 12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
>>>> 12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
>>>> 12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
>>>> 12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
>>>> 12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
>>>> 12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
>>>> 12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
>>>> 12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00
>>>>
>>>> So it is only CPU4 that is showing this strange behaviour.
>>>>
>>>>        
>>> Can you adjust irqtop to only count cpu4?  or even just post a few 'cat
>>> /proc/interrupts' from that guest.
>>>
>>> Most likely the timer interrupt for cpu4 died.
>>>      
>> I've added two keys +/- to your irqtop to focus up and down
>> in the row of available CPUs.
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> irqtop for cpu 4
>>
>>   eth0                                      188
>>   Rescheduling interrupts                   162
>>   Local timer interrupts                      6
>>   ata_piix                                    3
>>   TLB shootdowns                              1
>>   Spurious interrupts                         0
>>   Machine check exceptions                    0
>>
>>
>> irqtop for cpu 5
>>
>>   eth0                                      257
>>   Local timer interrupts                    251
>>   Rescheduling interrupts                   237
>>   Spurious interrupts                         0
>>   Machine check exceptions                    0
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>>    
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?

The host is a dual quad-core E5520 with hyperthreading enabled, so we
see 2x4x2 = 16 CPUs on the host. The guest is started with 6 CPUs.

> Can you confirm the other vcpus are ticking at 250 Hz?

The irqtop shows different numbers of local timer interrupts on the
other CPUs. The total number (summed over all CPUs) varies between
roughly 700 and 1400. Any one CPU can be down at 10 in one update and
up at 260 in the next. Only CPU4 stays at 6 local timer interrupts.

>
> What does 'top' show running on cpu 4?  Pressing 'f' 'j' will add a  
> last-used-cpu field in the display.

The processes are not bound to a particular CPU, so the picture varies.
Here are two snapshots:

Snapshot 1:

   15 root      RT  -5     0    0    0 S    0  0.0   0:01.70 4 migration/4
   16 root      15  -5     0    0    0 S    0  0.0   0:00.08 4 ksoftirqd/4
   17 root      RT  -5     0    0    0 S    0  0.0   0:00.00 4 watchdog/4
   25 root      15  -5     0    0    0 S    0  0.0   0:00.01 4 events/4
   35 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kintegrityd/4
   41 root      15  -5     0    0    0 S    0  0.0   0:00.03 4 kblockd/4
   50 root      15  -5     0    0    0 S    0  0.0   0:00.90 4 ata/4
   55 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kseriod
   66 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 aio/4
   73 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 crypto/4
   80 root      15  -5     0    0    0 S    0  0.0   2:11.71 4 scsi_eh_1
   87 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kmpathd/4
   95 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kondemand/4
  101 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kconservative/4
  103 root      10 -10     0    0    0 S    0  0.0   0:00.00 4 krfcommd
  681 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kdmflush
  686 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kdmflush
  691 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kdmflush
  737 root      15  -5     0    0    0 S    0  0.0   0:00.71 4 kjournald
  826 root      16  -4  2100  452  312 S    0  0.0   0:00.14 4 udevd
 1350 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kpsmoused
 1444 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kgameportd
 1718 root      15  -5     0    0    0 S    0  0.0   0:14.62 4 kjournald
 2108 statd     20   0  2252 1152  760 S    0  0.0   0:02.66 4 rpc.statd
 2117 root      15  -5     0    0    0 S    0  0.0   0:00.36 4 rpciod/4
 2123 root      15  -5     0    0    0 S    0  0.0   0:06.61 4 nfsiod
 2259 root      20   0  1696  444  440 S    0  0.0   0:00.00 4 getty
 2265 root      20   0  1696  444  440 S    0  0.0   0:00.00 4 getty
 2584 root      20   0  1860  656  604 S    0  0.0   0:01.92 4 automount
 2712 mysql     20   0  146m 4580 3464 S    0  0.1   0:59.71 4 mysqld
 2999 root      20   0 28944  12m 1704 S    0  0.2   1:23.04 4 console-kit-dae
 3834 root      20   0 13476 2640 1572 S    0  0.0   0:00.03 4 smbd
 4015 root      15  -5     0    0    0 S    0  0.0   7:22.33 4 nfsd
 4017 root      15  -5     0    0    0 S    0  0.0  10:46.87 4 nfsd
 4030 root      20   0  2644 1180  876 S    0  0.0   0:04.32 4 rpc.mountd
 4134 root      20   0  6164 1672 1344 S    0  0.0   0:00.37 4 master
 4311 root      20   0  5068 1796 1580 S    0  0.0   1:25.80 4 hald-addon-stor
 4560 root      20   0  2092  728  572 S    0  0.0   0:01.78 4 cron
 5053 www-data  20   0 33312 4092 1004 S    0  0.0   0:00.03 4 apache2

Snapshot 2:

   15 root      RT  -5     0    0    0 S    0  0.0   0:01.70 4 migration/4
   16 root      15  -5     0    0    0 S    0  0.0   0:00.08 4 ksoftirqd/4
   17 root      RT  -5     0    0    0 S    0  0.0   0:00.00 4 watchdog/4
   25 root      15  -5     0    0    0 S    0  0.0   0:00.01 4 events/4
   35 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kintegrityd/4
   41 root      15  -5     0    0    0 S    0  0.0   0:00.03 4 kblockd/4
   50 root      15  -5     0    0    0 S    0  0.0   0:00.90 4 ata/4
   55 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kseriod
   66 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 aio/4
   73 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 crypto/4
   87 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kmpathd/4
   95 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kondemand/4
  101 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kconservative/4
  103 root      10 -10     0    0    0 S    0  0.0   0:00.00 4 krfcommd
  681 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kdmflush
  686 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kdmflush
  691 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kdmflush
  737 root      15  -5     0    0    0 S    0  0.0   0:00.71 4 kjournald
  826 root      16  -4  2100  452  312 S    0  0.0   0:00.14 4 udevd
 1350 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kpsmoused
 1444 root      15  -5     0    0    0 S    0  0.0   0:00.00 4 kgameportd
 2091 daemon    20   0  1820  552  464 S    0  0.0   0:44.31 4 portmap
 2108 statd     20   0  2252 1152  760 S    0  0.0   0:02.66 4 rpc.statd
 2117 root      15  -5     0    0    0 S    0  0.0   0:00.36 4 rpciod/4
 2123 root      15  -5     0    0    0 S    0  0.0   0:06.61 4 nfsiod
 2259 root      20   0  1696  444  440 S    0  0.0   0:00.00 4 getty
 2265 root      20   0  1696  444  440 S    0  0.0   0:00.00 4 getty
 2712 mysql     20   0  146m 4580 3464 S    0  0.1   0:59.72 4 mysqld
 3834 root      20   0 13476 2640 1572 S    0  0.0   0:00.03 4 smbd
 4017 root      15  -5     0    0    0 S    0  0.0  10:46.87 4 nfsd
 4134 root      20   0  6164 1672 1344 S    0  0.0   0:00.37 4 master
 4311 root      20   0  5068 1796 1580 S    0  0.0   1:25.80 4 hald-addon-stor
 4560 root      20   0  2092  728  572 S    0  0.0   0:01.78 4 cron
 5053 www-data  20   0 33312 4092 1004 S    0  0.0   0:00.03 4 apache2


* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 15:17         ` Avi Kivity
  2010-03-21 15:47           ` Sebastian Hetze
@ 2010-03-22 12:51           ` Sebastian Hetze
  2010-03-23 21:18             ` Marcelo Tosatti
  1 sibling, 1 reply; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-22 12:51 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Sebastian Hetze, kvm, Marcelo Tosatti

On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>>    
>>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>>      
>>>> 12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
>>>> 12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
>>>> 12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
>>>> 12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
>>>> 12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
>>>> 12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
>>>> 12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
>>>> 12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00
>>>>
>>>> So it is only CPU4 that is showing this strange behaviour.
>>>>
>>>>        
>>> Can you adjust irqtop to only count cpu4?  or even just post a few 'cat
>>> /proc/interrupts' from that guest.
>>>
>>> Most likely the timer interrupt for cpu4 died.
>>>      
>> I've added two keys +/- to your irqtop to focus up and down
>> in the row of available CPUs.
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> irqtop for cpu 4
>>
>>   eth0                                      188
>>   Rescheduling interrupts                   162
>>   Local timer interrupts                      6
>>   ata_piix                                    3
>>   TLB shootdowns                              1
>>   Spurious interrupts                         0
>>   Machine check exceptions                    0
>>
>>
>> irqtop for cpu 5
>>
>>   eth0                                      257
>>   Local timer interrupts                    251
>>   Rescheduling interrupts                   237
>>   Spurious interrupts                         0
>>   Machine check exceptions                    0
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>>    
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?
>
> Can you confirm the other vcpus are ticking at 250 Hz?
>
> What does 'top' show running on cpu 4?  Pressing 'f' 'j' will add a  
> last-used-cpu field in the display.
>
> Marcelo, any ideas?

Just to let you know: right after startup, all vcpus work fine.

The following message might be related to the problem:
hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
(That min delta is roughly 166 ms, i.e. at most about 6 timer interrupts
per second, which would match the constant 6 local timer interrupts per
update seen in irqtop.)

The guest is a 32-bit system running on a 64-bit host.


* Re: Strange CPU usage pattern in SMP guest
  2010-03-22 12:51           ` Sebastian Hetze
@ 2010-03-23 21:18             ` Marcelo Tosatti
  2010-03-30  8:27               ` Sebastian Hetze
       [not found]               ` <20100330082743.49A113030135@mail.linux-ag.de>
  0 siblings, 2 replies; 11+ messages in thread
From: Marcelo Tosatti @ 2010-03-23 21:18 UTC (permalink / raw)
  To: Sebastian Hetze; +Cc: Avi Kivity, kvm

[-- Attachment #1: Type: text/plain, Size: 3128 bytes --]

On Mon, Mar 22, 2010 at 01:51:20PM +0100, Sebastian Hetze wrote:
> On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> > On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> >> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> >>    
> >>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
> >>>      
> >>>> 12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> >>>> 12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
> >>>> 12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
> >>>> 12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
> >>>> 12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
> >>>> 12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
> >>>> 12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
> >>>> 12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00
> >>>>
> >>>> So it is only CPU4 that is showing this strange behaviour.
> >>>>
> >>>>        
> >>> Can you adjust irqtop to only count cpu4?  or even just post a few 'cat
> >>> /proc/interrupts' from that guest.
> >>>
> >>> Most likely the timer interrupt for cpu4 died.
> >>>      
> >> I've added two keys +/- to your irqtop to focus up and down
> >> in the row of available CPUs.
> >> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> >> per update, while the other CPUs show various higher values:
> >>
> >> irqtop for cpu 4
> >>
> >>   eth0                                      188
> >>   Rescheduling interrupts                   162
> >>   Local timer interrupts                      6
> >>   ata_piix                                    3
> >>   TLB shootdowns                              1
> >>   Spurious interrupts                         0
> >>   Machine check exceptions                    0
> >>
> >>
> >> irqtop for cpu 5
> >>
> >>   eth0                                      257
> >>   Local timer interrupts                    251
> >>   Rescheduling interrupts                   237
> >>   Spurious interrupts                         0
> >>   Machine check exceptions                    0
> >>
> >> So the timer interrupt for cpu4 is not completely dead but somehow
> >> broken.
> >
> > That is incredibly weird.
> >
> >> What can cause this problem? Any way to speed it up again?
> >>    
> >
> > The host has 8 cpus and is only running this 6 vcpu guest, yes?
> >
> > Can you confirm the other vcpus are ticking at 250 Hz?
> >
> > What does 'top' show running on cpu 4?  Pressing 'f' 'j' will add a  
> > last-used-cpu field in the display.
> >
> > Marcelo, any ideas?
> 
> Just to let you know, right after startup, all vcpus work fine.
> 
> The following message might be related to the problem:
> hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
> 
> The guest is an 32bit system running on an 64bit host.

Sebastian,

Please apply the attached patch to your guest kernel.


[-- Attachment #2: hrtimer-tune-hrtimer_interrupt-hang-logic.patch --]
[-- Type: text/plain, Size: 7973 bytes --]

commit ff28aa973164f8e5f4b2633e178162d121c4e5eb
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Fri Nov 13 17:05:44 2009 +0100

    hrtimer: Tune hrtimer_interrupt hang logic
    
    The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
    execution time of the hrtimer callbacks.
    
    This is error-prone for virtual machines, where a guest vcpu can be
    scheduled out during the execution of the callbacks (and the callbacks
    themselves can do operations that translate to blocking operations in
    the hypervisor), which in can lead to large min_delta_ns rendering the
    system unusable.
    
    Replace the current heuristics with something more reliable. Allow the
    interrupt code to try 3 times to catch up with the lost time. If that
    fails use the total time spent in the interrupt handler to defer the
    next timer interrupt so the system can catch up with other things
    which got delayed. Limit that deferment to 100ms.
    
    The retry events and the maximum time spent in the interrupt handler
    are recorded and exposed via /proc/timer_list
    
    Inspired by a patch from Marcelo.
    
    Reported-by: Michael Tokarev <mjt@tls.msk.ru>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: kvm@vger.kernel.org

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 4759917..2e1064f 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -164,10 +164,11 @@ struct hrtimer_clock_base {
  * @expires_next:	absolute time of the next event which was scheduled
  *			via clock_set_next_event()
  * @hres_active:	State of high resolution mode
- * @check_clocks:	Indictator, when set evaluate time source and clock
- *			event devices whether high resolution mode can be
- *			activated.
- * @nr_events:		Total number of timer interrupt events
+ * @hang_detected:	The last hrtimer interrupt detected a hang
+ * @nr_events:		Total number of hrtimer interrupt events
+ * @nr_retries:		Total number of hrtimer interrupt retries
+ * @nr_hangs:		Total number of hrtimer interrupt hangs
+ * @max_hang_time:	Maximum time spent in hrtimer_interrupt
  */
 struct hrtimer_cpu_base {
 	spinlock_t			lock;
@@ -175,7 +176,11 @@ struct hrtimer_cpu_base {
 #ifdef CONFIG_HIGH_RES_TIMERS
 	ktime_t				expires_next;
 	int				hres_active;
+	int				hang_detected;
 	unsigned long			nr_events;
+	unsigned long			nr_retries;
+	unsigned long			nr_hangs;
+	ktime_t				max_hang_time;
 #endif
 };
 
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 49da79a..4b0638b 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -584,7 +584,7 @@ static void hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base)
 static int hrtimer_reprogram(struct hrtimer *timer,
 			     struct hrtimer_clock_base *base)
 {
-	ktime_t *expires_next = &__get_cpu_var(hrtimer_bases).expires_next;
+	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
 	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 	int res;
 
@@ -609,7 +609,16 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 	if (expires.tv64 < 0)
 		return -ETIME;
 
-	if (expires.tv64 >= expires_next->tv64)
+	if (expires.tv64 >= cpu_base->expires_next.tv64)
+		return 0;
+
+	/*
+	 * If a hang was detected in the last timer interrupt then we
+	 * do not schedule a timer which is earlier than the expiry
+	 * which we enforced in the hang detection. We want the system
+	 * to make progress.
+	 */
+	if (cpu_base->hang_detected)
 		return 0;
 
 	/*
@@ -617,7 +626,7 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 	 */
 	res = tick_program_event(expires, 0);
 	if (!IS_ERR_VALUE(res))
-		*expires_next = expires;
+		cpu_base->expires_next = expires;
 	return res;
 }
 
@@ -1234,29 +1243,6 @@ static void __run_hrtimer(struct hrtimer *timer)
 
 #ifdef CONFIG_HIGH_RES_TIMERS
 
-static int force_clock_reprogram;
-
-/*
- * After 5 iteration's attempts, we consider that hrtimer_interrupt()
- * is hanging, which could happen with something that slows the interrupt
- * such as the tracing. Then we force the clock reprogramming for each future
- * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
- * threshold that we will overwrite.
- * The next tick event will be scheduled to 3 times we currently spend on
- * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
- * 1/4 of their time to process the hrtimer interrupts. This is enough to
- * let it running without serious starvation.
- */
-
-static inline void
-hrtimer_interrupt_hanging(struct clock_event_device *dev,
-			ktime_t try_time)
-{
-	force_clock_reprogram = 1;
-	dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
-	printk(KERN_WARNING "hrtimer: interrupt too slow, "
-		"forcing clock min delta to %lu ns\n", dev->min_delta_ns);
-}
 /*
  * High resolution timer interrupt
  * Called with interrupts disabled
@@ -1265,21 +1251,15 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 {
 	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
 	struct hrtimer_clock_base *base;
-	ktime_t expires_next, now;
-	int nr_retries = 0;
-	int i;
+	ktime_t expires_next, now, entry_time, delta;
+	int i, retries = 0;
 
 	BUG_ON(!cpu_base->hres_active);
 	cpu_base->nr_events++;
 	dev->next_event.tv64 = KTIME_MAX;
 
- retry:
-	/* 5 retries is enough to notice a hang */
-	if (!(++nr_retries % 5))
-		hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
-
-	now = ktime_get();
-
+	entry_time = now = ktime_get();
+retry:
 	expires_next.tv64 = KTIME_MAX;
 
 	spin_lock(&cpu_base->lock);
@@ -1341,10 +1321,48 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	spin_unlock(&cpu_base->lock);
 
 	/* Reprogramming necessary ? */
-	if (expires_next.tv64 != KTIME_MAX) {
-		if (tick_program_event(expires_next, force_clock_reprogram))
-			goto retry;
+	if (expires_next.tv64 == KTIME_MAX ||
+	    !tick_program_event(expires_next, 0)) {
+		cpu_base->hang_detected = 0;
+		return;
 	}
+
+	/*
+	 * The next timer was already expired due to:
+	 * - tracing
+	 * - long lasting callbacks
+	 * - being scheduled away when running in a VM
+	 *
+	 * We need to prevent that we loop forever in the hrtimer
+	 * interrupt routine. We give it 3 attempts to avoid
+	 * overreacting on some spurious event.
+	 */
+	now = ktime_get();
+	cpu_base->nr_retries++;
+	if (++retries < 3)
+		goto retry;
+	/*
+	 * Give the system a chance to do something else than looping
+	 * here. We stored the entry time, so we know exactly how long
+	 * we spent here. We schedule the next event this amount of
+	 * time away.
+	 */
+	cpu_base->nr_hangs++;
+	cpu_base->hang_detected = 1;
+	delta = ktime_sub(now, entry_time);
+	if (delta.tv64 > cpu_base->max_hang_time.tv64)
+		cpu_base->max_hang_time = delta;
+	/*
+	 * Limit it to a sensible value as we enforce a longer
+	 * delay. Give the CPU at least 100ms to catch up.
+	 */
+	if (delta.tv64 > 100 * NSEC_PER_MSEC)
+		expires_next = ktime_add_ns(now, 100 * NSEC_PER_MSEC);
+	else
+		expires_next = ktime_add(now, delta);
+	tick_program_event(expires_next, 1);
+	printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
+		    ktime_to_ns(delta));
 }
 
 /*
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index fddd69d..7cd905c 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -150,6 +150,9 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
 	P_ns(expires_next);
 	P(hres_active);
 	P(nr_events);
+	P(nr_retries);
+	P(nr_hangs);
+	P_ns(max_hang_time);
 #endif
 #undef P
 #undef P_ns
@@ -252,7 +255,7 @@ static int timer_list_show(struct seq_file *m, void *v)
 	u64 now = ktime_to_ns(ktime_get());
 	int cpu;
 
-	SEQ_printf(m, "Timer List Version: v0.4\n");
+	SEQ_printf(m, "Timer List Version: v0.5\n");
 	SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES);
 	SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now);
 

* Re: Strange CPU usage pattern in SMP guest
  2010-03-23 21:18             ` Marcelo Tosatti
@ 2010-03-30  8:27               ` Sebastian Hetze
       [not found]               ` <20100330082743.49A113030135@mail.linux-ag.de>
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-30  8:27 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Sebastian Hetze, Avi Kivity, kvm

On Tue, Mar 23, 2010 at 06:18:08PM -0300, Marcelo Tosatti wrote:
> On Mon, Mar 22, 2010 at 01:51:20PM +0100, Sebastian Hetze wrote:
> > On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> > > On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> > >> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> > >>    
> > >>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
> > >>>      
> > >>>> 12:46:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> > >>>> 12:46:03     all    0,20   11,35   10,96    8,96    0,40    2,99    0,00    0,00   65,14
> > >>>> 12:46:03       0    1,00   11,00    7,00   15,00    0,00    1,00    0,00    0,00   65,00
> > >>>> 12:46:03       1    0,00    7,14    2,04    6,12    1,02   11,22    0,00    0,00   72,45
> > >>>> 12:46:03       2    0,00   15,00    1,00   12,00    0,00    1,00    0,00    0,00   71,00
> > >>>> 12:46:03       3    0,00   11,00   23,00    8,00    0,00    0,00    0,00    0,00   58,00
> > >>>> 12:46:03       4    0,00    0,00   50,00    0,00    0,00    0,00    0,00    0,00   50,00
> > >>>> 12:46:03       5    0,00   13,00   20,00    4,00    0,00    1,00    0,00    0,00   62,00
> > >>>>
> > >>>> So it is only CPU4 that is showing this strange behaviour.
> > >>>>
> > >>>>        
> > >>> Can you adjust irqtop to only count cpu4?  or even just post a few 'cat
> > >>> /proc/interrupts' from that guest.
> > >>>
> > >>> Most likely the timer interrupt for cpu4 died.
> > >>>      
> > >> I've added two keys +/- to your irqtop to focus up and down
> > >> in the row of available CPUs.
> > >> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> > >> per update, while the other CPUs show various higher values:
> > >>
> > >> irqtop for cpu 4
> > >>
> > >>   eth0                                      188
> > >>   Rescheduling interrupts                   162
> > >>   Local timer interrupts                      6
> > >>   ata_piix                                    3
> > >>   TLB shootdowns                              1
> > >>   Spurious interrupts                         0
> > >>   Machine check exceptions                    0
> > >>
> > >>
> > >> irqtop for cpu 5
> > >>
> > >>   eth0                                      257
> > >>   Local timer interrupts                    251
> > >>   Rescheduling interrupts                   237
> > >>   Spurious interrupts                         0
> > >>   Machine check exceptions                    0
> > >>
> > >> So the timer interrupt for cpu4 is not completely dead but somehow
> > >> broken.
> > >
> > > That is incredibly weird.
> > >
> > >> What can cause this problem? Any way to speed it up again?
> > >>    
> > >
> > > The host has 8 cpus and is only running this 6 vcpu guest, yes?
> > >
> > > Can you confirm the other vcpus are ticking at 250 Hz?
> > >
> > > What does 'top' show running on cpu 4?  Pressing 'f' 'j' will add a  
> > > last-used-cpu field in the display.
> > >
> > > Marcelo, any ideas?
> > 
> > Just to let you know, right after startup, all vcpus work fine.
> > 
> > The following message might be related to the problem:
> > hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
> > 
> > The guest is an 32bit system running on an 64bit host.
> 
> Sebastian,
> 
> Please apply the attached patch to your guest kernel.
> 

With this patch applied, the system has been running without hrtimer
messages for 5 days and the timer interrupts look fine. However, I did
get the "Clocksource tsc unstable (delta = -4398046474878 ns)" message
that I reported on Sunday.

Actually, when restarting the system with the hrtimer patch applied,
we also changed the BIOS setting to disable Intel SmartStep on the host.
Since there are no hrtimer messages at all, it might be that the SmartStep
CPU frequency adjustment is the real cause of the slow interrupts in
the KVM guest (a quick sysfs check for that is sketched below). Has anyone
else experienced these problems?
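
A minimal sketch of that check (it assumes the host exposes the cpufreq
sysfs interface; if the cpufreq directory is missing for a CPU, frequency
scaling is not active on it):

# Minimal sketch: report the cpufreq state of every host CPU.
import glob, os

for cpudir in sorted(glob.glob('/sys/devices/system/cpu/cpu[0-9]*')):
    freqdir = os.path.join(cpudir, 'cpufreq')
    if not os.path.isdir(freqdir):
        print('%s: no cpufreq directory (frequency scaling inactive)'
              % os.path.basename(cpudir))
        continue
    def read(name):
        return open(os.path.join(freqdir, name)).read().strip()
    print('%s: governor=%s cur=%s kHz max=%s kHz'
          % (os.path.basename(cpudir), read('scaling_governor'),
             read('scaling_cur_freq'), read('cpuinfo_max_freq')))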

* Re: Strange CPU usage pattern in SMP guest
       [not found]               ` <20100330082743.49A113030135@mail.linux-ag.de>
@ 2010-04-05 21:24                 ` Sebastian Hetze
  0 siblings, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-04-05 21:24 UTC (permalink / raw)
  To: Sebastian Hetze; +Cc: Marcelo Tosatti, Avi Kivity, kvm

On Tue, Mar 30, 2010 at 10:27:43AM +0200, Sebastian Hetze wrote:
> With this patch applied, the system runs without hrtimer messages since
> 5 days and the timer iterrupts look fine.
> 
> Actually, when restarting the system with the hrtimer patch applied,
> we also changed the BIOS setting to disable Intel SmartStep on the host.
> Since there are no hrtimer messages at all, it might be that the SmartStep
> CPU frequency adjustment is the real cause for the slow interrupts in
> the KVM guest. Anyone else experienced these problems?

Back with SmartStep enabled, we get "hrtimer: interrupt took 119986041 ns",
so it really looks like this SmartStep feature is the actual cause of the
delayed interrupts in the first place.

The system continues to run correctly so far, so the patch does fix the
problems caused by these delayed interrupts.

Thanks a lot & best regards,

  Sebastian
