* Strange CPU usage pattern in SMP guest
@ 2010-03-21  0:13 Sebastian Hetze
  2010-03-21 10:09 ` Avi Kivity
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21  0:13 UTC (permalink / raw)
To: kvm

Hi *,

in a 6 CPU SMP guest running on a host with 2 quad-core
Intel Xeon E5520 with hyperthreading enabled,
we see one or more guest CPUs working in a very strange
pattern. It looks like all or nothing. We can easily identify
the affected CPU with xosview. Here is the mpstat output
compared to one regularly working CPU:


mpstat -P 4 1
Linux 2.6.31-16-generic-pae (guest)  21.03.2010  _i686_  (6 CPU)

00:45:19  CPU   %usr   %nice    %sys  %iowait  %irq  %soft  %steal  %guest   %idle
00:45:20    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:21    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:22    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:23    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:24    4   0,00   66,67    0,00    0,00   0,00  33,33    0,00    0,00    0,00
00:45:25    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:26    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:27    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:28    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:29    4   0,00  100,00    0,00    0,00   0,00   0,00    0,00    0,00    0,00
00:45:30    4   0,00    0,00   50,00    0,00   0,00   0,00    0,00    0,00   50,00
00:45:31    4   0,00    0,00    0,00    0,00   0,00   0,00    0,00    0,00  100,00
00:45:32    4   0,00    0,00    0,00    0,00   0,00   0,00    0,00    0,00  100,00
00:45:33    4   0,00    0,00    0,00    0,00   0,00   0,00    0,00    0,00  100,00
00:45:34    4   0,00    0,00   25,00   25,00   0,00  25,00    0,00    0,00   25,00
00:45:35    4   0,00    0,00    0,00   50,00   0,00   0,00    0,00    0,00   50,00
00:45:36    4   0,00    0,00   50,00    0,00   0,00   0,00    0,00    0,00   50,00
00:45:37    4   0,00    0,00    0,00   66,67   0,00   0,00    0,00    0,00   33,33
00:45:38    4   0,00    0,00    0,00    0,00   0,00  50,00    0,00    0,00   50,00
00:45:39    4   0,00    0,00    0,00    0,00   0,00   0,00    0,00    0,00  100,00
00:45:40    4   0,00    0,00    0,00   50,00   0,00  25,00    0,00    0,00   25,00
00:45:41    4   0,00    0,00   50,00    0,00   0,00  50,00    0,00    0,00    0,00
00:45:42    4   0,00   33,33    0,00    0,00   0,00   0,00    0,00    0,00   66,67
00:45:43    4  33,33    0,00   33,33    0,00   0,00  33,33    0,00    0,00    0,00
00:45:44    4   0,00    0,00    0,00   33,33   0,00  33,33    0,00    0,00   33,33
00:45:45    4   0,00    0,00    0,00  100,00   0,00   0,00    0,00    0,00    0,00


mpstat -P 5 1
Linux 2.6.31-16-generic-pae (guest)  21.03.2010  _i686_  (6 CPU)

00:46:05  CPU   %usr   %nice    %sys  %iowait  %irq  %soft  %steal  %guest   %idle
00:46:06    5   0,00   50,51    4,04   10,10   1,01   4,04    0,00    0,00   30,30
00:46:07    5   0,92    1,83    0,92   55,05   0,00   1,83    0,00    0,00   39,45
00:46:08    5   0,00    0,96    3,85   64,42   0,00   1,92    0,00    0,00   28,85
00:46:09    5   1,71    1,71    7,69   61,54   0,85   7,69    0,00    0,00   18,80
00:46:10    5   1,71    1,71    5,13   64,10   0,85   9,40    0,00    0,00   17,09
00:46:12    5   0,85   50,00   25,42    6,78   0,00   5,08    0,00    0,00   11,86
00:46:13    5   0,00   28,70    3,48   21,74   0,00   5,22    0,00    0,00   40,87
00:46:14    5   0,00    3,45   15,52   18,10   0,86  14,66    0,00    0,00   47,41
00:46:15    5   0,98    0,00    3,92   87,25   0,00   3,92    0,00    0,00    3,92
00:46:16    5   1,72    0,00    4,31   17,24   0,00   9,48    0,00    0,00   67,24
00:46:17    5   0,85    0,00    3,42   12,82   0,85  27,35    0,00    0,00   54,70
00:46:18    5   0,00    0,86    7,76   12,93   0,00  27,59    0,00    0,00   50,86
00:46:20    5   0,00    0,00    6,90   36,21   0,00   6,03    0,00    0,00   50,86
00:46:21    5   0,86    0,86    5,17   22,41   0,00  35,34    0,00    0,00   35,34
00:46:22    5   0,00    0,85    9,40   18,80   0,00  23,93    0,00    0,00   47,01
00:46:23    5   0,00    4,04   11,11    9,09   1,01  20,20    0,00    0,00   54,55
00:46:24    5   0,00    5,00   11,00   10,00   0,00  29,00    0,00    0,00   45,00
00:46:25    5   1,00    9,00   10,00    3,00   0,00  28,00    0,00    0,00   49,00
00:46:26    5   0,00    1,00    9,00   13,00   1,00  26,00    0,00    0,00   50,00
00:46:27    5   3,00    4,00   12,00   20,00   0,00  25,00    0,00    0,00   36,00
00:46:28    5   0,00    0,00    8,08   16,16   0,00   8,08    0,00    0,00   67,68
00:46:29    5   0,98    0,98    3,92    4,90   0,00   5,88    0,00    0,00   83,33
00:46:30    5   1,01    2,02    8,08   24,24   0,00   7,07    0,00    0,00   57,58
00:46:31    5   0,00    0,00    7,00    4,00   1,00  20,00    0,00    0,00   68,00
00:46:32    5   1,01    1,01   11,11   14,14   0,00  15,15    0,00    0,00   57,58
00:46:33    5   0,99    2,97    9,90   32,67   0,99   9,90    0,00    0,00   42,57

On CPU4, all values are simple fractions, while CPU5 shows what I would
expect from a real-life workload. Any idea what's happening here?
The host CPUs do not show any such pattern.

Best regards,

  Sebastian

--
Sebastian Hetze
Member of the Executive Board, Linux Information Systems AG
Bundesallee 93, D-12161 Berlin
Fon: +49 30 726238-0, Fax: +49 30 726238-99
s.hetze@linux-ag.com, http://www.linux-ag.com
----------------------------------------------------------
Registered office: Putzbrunner Str. 71, 81739 München
Commercial register: Amtsgericht München, HRB 128 019
Executive board: Rudolf Strobl, Sebastian Hetze
Supervisory board: Michael Tarabochia (chairman)

^ permalink raw reply  [flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
  2010-03-21  0:13 Strange CPU usage pattern in SMP guest Sebastian Hetze
@ 2010-03-21 10:09 ` Avi Kivity
  2010-03-21 12:02   ` Sebastian Hetze
       [not found]   ` <20100321120236.55228A0015@mail.linux-ag.de>
  0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 10:09 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: kvm

[-- Attachment #1: Type: text/plain, Size: 1602 bytes --]

On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
> Hi *,
>
> in a 6 CPU SMP guest running on a host with 2 quad-core
> Intel Xeon E5520 with hyperthreading enabled,
> we see one or more guest CPUs working in a very strange
> pattern. It looks like all or nothing. We can easily identify
> the affected CPU with xosview. Here is the mpstat output
> compared to one regularly working CPU:
>
>
> mpstat -P 4 1
> Linux 2.6.31-16-generic-pae (guest)  21.03.2010  _i686_  (6 CPU)
> 00:45:19  CPU   %usr   %nice  %sys  %iowait  %irq  %soft  %steal  %guest  %idle
> 00:45:20    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
> 00:45:21    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
> 00:45:22    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
> 00:45:23    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
> 00:45:24    4   0,00   66,67  0,00    0,00   0,00  33,33    0,00    0,00   0,00
> 00:45:25    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
> 00:45:26    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
>

Looks like the guest is only receiving 3-4 timer interrupts per second,
so time becomes quantized.

Please run the attached irqtop in the affected guest and report the results.

Is the host overly busy? What host kernel, kvm, and qemu are you
running? Is the guest running an I/O workload? If so, how are the disks
configured?
--
error compiling committee.c: too many arguments to function

[-- Attachment #2: irqtop --]
[-- Type: text/plain, Size: 1756 bytes --]

#!/usr/bin/python

import curses
import sys, os, time, optparse

def read_interrupts():
    irq = {}
    proc = file('/proc/interrupts')
    nrcpu = len(proc.readline().split())
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        count = sum([int(x) for x in counts])
        try:
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

label_width = 30
number_width = 10

def tui(screen):
    curses.use_default_colors()
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        screen.addstr(0, 0, 'irqtop')
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()
    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)
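[Editorial note: the following sketch is not part of the original thread. It illustrates Avi's point above about quantized time: with tick-based CPU accounting, each local timer interrupt charges a whole tick to one state, so an interval that contains only a handful of ticks can only report percentages that are multiples of 100/N -- which is exactly why CPU4's mpstat columns above show nothing but 0, 25, 33,33, 50, 66,67 and 100. The function name is hypothetical, chosen for the example.]

```python
def mpstat_percentages(ticks_per_interval):
    """Possible %-values a tick-based accounting scheme can report
    when one sampling interval contains the given number of ticks."""
    return [round(100.0 * n / ticks_per_interval, 2)
            for n in range(ticks_per_interval + 1)]

# A healthy vCPU at HZ=250 gets 250 ticks/s: percentages in 0.4% steps.
print(mpstat_percentages(250)[:4])
# A vCPU that only receives ~3 ticks/s can only report
# 0 / 33.33 / 66.67 / 100 -- matching the CPU4 columns above.
print(mpstat_percentages(3))
```

The fractions reported for CPU4 (halves, thirds, quarters) correspond to intervals that happened to contain 2, 3 or 4 ticks.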
* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 10:09 ` Avi Kivity
@ 2010-03-21 12:02   ` Sebastian Hetze
       [not found]   ` <20100321120236.55228A0015@mail.linux-ag.de>
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 12:02 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm

On Sun, Mar 21, 2010 at 12:09:00PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
>> Hi *,
>>
>> in a 6 CPU SMP guest running on a host with 2 quad-core
>> Intel Xeon E5520 with hyperthreading enabled,
>> we see one or more guest CPUs working in a very strange
>> pattern. It looks like all or nothing. We can easily identify
>> the affected CPU with xosview. Here is the mpstat output
>> compared to one regularly working CPU:
>>
>>
>> mpstat -P 4 1
>> Linux 2.6.31-16-generic-pae (guest)  21.03.2010  _i686_  (6 CPU)
>> 00:45:19  CPU   %usr   %nice  %sys  %iowait  %irq  %soft  %steal  %guest  %idle
>> 00:45:20    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
>> 00:45:21    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
>> 00:45:22    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
>> 00:45:23    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
>> 00:45:24    4   0,00   66,67  0,00    0,00   0,00  33,33    0,00    0,00   0,00
>> 00:45:25    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
>> 00:45:26    4   0,00  100,00  0,00    0,00   0,00   0,00    0,00    0,00   0,00
>>
>
> Looks like the guest is only receiving 3-4 timer interrupts per second,
> so time becomes quantized.
>
> Please run the attached irqtop in the affected guest and report the results.
>
> Is the host overly busy? What host kernel, kvm, and qemu are you
> running? Is the guest running an I/O workload? If so, how are the disks

The host is not busy at all. In fact, it is currently running only one
guest. The host is running an Ubuntu 2.6.31-14-server kernel. qemu-kvm
is 0.12.2-0ubuntu6. The kvm module has srcversion 82D6B673524596F9CF3E84C
as stated by modinfo.

The guest occasionally runs an I/O workload.
However, the effect is visible all the time. And it affects only one out
of the 6 CPUs the very same guest is running. This is the output on the
guest for all CPUs:

mpstat -P ALL 1

12:45:59  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:46:00  all  0,40   9,74   2,39    5,37   0,80   3,98    0,00    0,00  77,34
12:46:00    0  1,00   5,00   6,00    3,00   1,00   9,00    0,00    0,00  75,00
12:46:00    1  0,00  23,00   2,00   10,00   0,00   0,00    0,00    0,00  65,00
12:46:00    2  0,00   5,94   0,99    6,93   0,00   1,98    0,00    0,00  84,16
12:46:00    3  0,00   8,00   2,00    5,00   2,00   9,00    0,00    0,00  74,00
12:46:00    4  0,00  33,33   0,00    0,00   0,00   0,00    0,00    0,00  66,67
12:46:00    5  0,00   5,94   0,00    3,96   0,00   0,99    0,00    0,00  89,11

12:46:00  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:46:01  all  0,60   5,81   3,21   24,45   0,40   3,61    0,00    0,00  61,92
12:46:01    0  1,01   4,04   7,07   31,31   1,01   6,06    0,00    0,00  49,49
12:46:01    1  0,00   5,00   2,00   19,00   0,00   2,00    0,00    0,00  72,00
12:46:01    2  0,99   7,92   1,98   35,64   0,00   2,97    0,00    0,00  50,50
12:46:01    3  1,98   4,95   2,97   13,86   0,00   6,93    0,00    0,00  69,31
12:46:01    4  0,00  33,33   0,00    0,00   0,00   0,00    0,00    0,00  66,67
12:46:01    5  0,00   8,08   3,03   22,22   0,00   1,01    0,00    0,00  65,66

12:46:01  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:46:02  all  2,38  12,70  17,06   14,68   0,60   1,98    0,00    0,00  50,60
12:46:02    0  3,96  15,84   9,90   13,86   0,00   2,97    0,00    0,00  53,47
12:46:02    1  2,97   6,93   5,94   19,80   2,97   2,97    0,00    0,00  58,42
12:46:02    2  2,02  17,17   8,08   18,18   2,02   1,01    0,00    0,00  51,52
12:46:02    3  2,02  10,10   8,08   14,14   0,00   2,02    0,00    0,00  63,64
12:46:02    4  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:02    5  0,00  13,00  55,00    6,00   0,00   1,00    0,00    0,00  25,00

12:46:02  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:46:03  all  0,20  11,35  10,96    8,96   0,40   2,99    0,00    0,00  65,14
12:46:03    0  1,00  11,00   7,00   15,00   0,00   1,00    0,00    0,00  65,00
12:46:03    1  0,00   7,14   2,04    6,12   1,02  11,22    0,00    0,00  72,45
12:46:03    2  0,00  15,00   1,00   12,00   0,00   1,00    0,00    0,00  71,00
12:46:03    3  0,00  11,00  23,00    8,00   0,00   0,00    0,00    0,00  58,00
12:46:03    4  0,00   0,00  50,00    0,00   0,00   0,00    0,00    0,00  50,00
12:46:03    5  0,00  13,00  20,00    4,00   0,00   1,00    0,00    0,00  62,00

So it is only CPU4 that is showing this strange behaviour.

The same time on the host:

mpstat -P ALL 1

12:46:00  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:46:01  all  0,16   0,00   1,92    1,37   0,00   0,05    0,00    5,97  90,53
12:46:01    0  1,00   0,00   9,00   25,00   0,00   1,00    0,00   17,00  47,00
12:46:01    1  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:01    2  0,00   0,00  13,08    0,00   0,00   0,00    0,00   17,69  69,23
12:46:01    3  0,00   0,00   2,06    0,00   0,00   0,00    0,00    8,25  89,69
12:46:01    4  0,95   0,00   0,95    0,00   0,00   0,00    0,00   22,86  75,24
12:46:01    5  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:01    6  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:01    7  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:01    8  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:01    9  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:01   10  0,87   0,00   5,22    0,00   0,00   0,00    0,00   23,48  70,43
12:46:01   11  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:01   12  0,00   0,00   0,00    0,00   0,00   0,00    0,00   10,20  89,80
12:46:01   13  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:01   14  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:01   15  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00

12:46:01  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:46:02  all  0,20   0,00   0,53    4,92   0,00   0,13    0,00    6,25  87,97
12:46:02    0  0,00   0,00   2,00   74,00   0,00   2,00    0,00    1,00  21,00
12:46:02    1  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:02    2  0,00   0,00   3,06    0,00   0,00   0,00    0,00   28,57  68,37
12:46:02    3  1,11   0,00   1,11    0,00   0,00   0,00    0,00   22,22  75,56
12:46:02    4  0,00   0,00   0,00    0,00   0,00   0,00    0,00    7,87  92,13
12:46:02    5  0,00   0,00   0,95    0,00   0,00   0,00    0,00    0,00  99,05
12:46:02    6  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:02    7  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:02    8  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:02    9  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:02   10  0,92   0,00   0,92    0,00   0,00   0,00    0,00   22,02  76,15
12:46:02   11  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:02   12  1,08   0,00   0,00    0,00   0,00   0,00    0,00   15,05  83,87
12:46:02   13  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:02   14  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:02   15  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00

12:46:02  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:46:03  all  0,26   0,00   1,12    1,97   0,00   0,07    0,00   13,93  82,65
12:46:03    0  1,00   0,00   6,00   30,00   0,00   1,00    0,00    0,00  62,00
12:46:03    1  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:03    2  0,00   0,00   0,00    0,00   0,00   0,00    0,00   82,18  17,82
12:46:03    3  0,00   0,00   4,76    0,00   0,00   0,00    0,00   38,10  57,14
12:46:03    4  0,00   0,00   0,00    0,00   0,00   0,00    0,00   16,49  83,51
12:46:03    5  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:03    6  0,34   0,00   1,01    0,00   0,00   0,00    0,00   11,11  87,54
12:46:03    7  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:03    8  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:03    9  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00
12:46:03   10  2,25   0,00   0,00    0,00   0,00   0,00    0,00   26,97  70,79
12:46:03   11  0,00   0,00   1,67    0,00   0,00   0,00    0,00    0,00  98,33
12:46:03   12  0,00   0,00   1,98    0,00   0,00   0,00    0,00   15,84  82,18
12:46:03   13  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:03   14  0.00   0.00   0.00    0.00   0.00   0.00    0.00    0.00   0.00
12:46:03   15  0,00   0,00   0,00    0,00   0,00   0,00    0,00    0,00 100,00

Your irqtop on the guest shows something like:

Local timer interrupts        1257
Rescheduling interrupts        849
eth0                           715
ata_piix                        24
virtio0-input                    1

Disk configuration of the guest is a single LVM volume running as a raw
image.
Here is the qemu-kvm command line:

/usr/bin/qemu-system-x86_64 -S -M pc-0.12 -cpu qemu32 -m 8192 -smp 6 \
  -name guest -uuid dac8c0df-5848-48e5-904e-5548fcb5e614 \
  -monitor unix:/var/lib/libvirt/qemu/guest.monitor,server,nowait -boot c \
  -drive file=/dev/drbd/by-res/guest,if=ide,index=0,boot=on,format=host_device,cache=none \
  -net nic,macaddr=24:09:61:01:32:00,vlan=0,model=virtio,name=virtio.0 \
  -net tap,fd=20,vlan=0,name=tap.0 -serial pty -parallel none -usb \
  -vnc 127.0.0.1:0 -k de -vga cirrus -soundhw es1370 \
  -pcidevice host=01:00.0 -pcidevice host=06:00.1
[parent not found: <20100321120236.55228A0015@mail.linux-ag.de>]
* Re: Strange CPU usage pattern in SMP guest
       [not found] ` <20100321120236.55228A0015@mail.linux-ag.de>
@ 2010-03-21 12:19   ` Avi Kivity
  2010-03-21 14:55     ` Sebastian Hetze
       [not found]     ` <20100321145548.CC027A0015@mail.linux-ag.de>
  0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 12:19 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: kvm

On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>
> 12:46:02  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
> 12:46:03  all  0,20  11,35  10,96    8,96   0,40   2,99    0,00    0,00  65,14
> 12:46:03    0  1,00  11,00   7,00   15,00   0,00   1,00    0,00    0,00  65,00
> 12:46:03    1  0,00   7,14   2,04    6,12   1,02  11,22    0,00    0,00  72,45
> 12:46:03    2  0,00  15,00   1,00   12,00   0,00   1,00    0,00    0,00  71,00
> 12:46:03    3  0,00  11,00  23,00    8,00   0,00   0,00    0,00    0,00  58,00
> 12:46:03    4  0,00   0,00  50,00    0,00   0,00   0,00    0,00    0,00  50,00
> 12:46:03    5  0,00  13,00  20,00    4,00   0,00   1,00    0,00    0,00  62,00
>
> So it is only CPU4 that is showing this strange behaviour.
>

Can you adjust irqtop to only count cpu4? Or even just post a few 'cat
/proc/interrupts' from that guest.

Most likely the timer interrupt for cpu4 died.

--
error compiling committee.c: too many arguments to function
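[Editorial note: the following sketch is not part of the original thread. It shows a minimal way to do what Avi asks for without the full curses tool: pull the per-CPU local timer ("LOC") counters out of /proc/interrupts text so two snapshots a second apart can be diffed. The function name and the sample dump are illustrative; the parser can equally be fed `open('/proc/interrupts').read()` on a live system.]

```python
def local_timer_counts(text):
    """Return the per-CPU 'LOC' (local timer) counters from a
    /proc/interrupts dump as a list of ints, one per CPU column."""
    lines = text.strip().splitlines()
    nrcpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        vec, data = line.split(':', 1)
        if vec.strip() == 'LOC':
            return [int(x) for x in data.split()[:nrcpu]]
    raise ValueError('no LOC line found')

# Hypothetical two-CPU dump; on a real guest the CPU whose counter
# barely moves between snapshots is the one with the broken tick.
sample = """\
           CPU0       CPU1
  0:        120          0   IO-APIC-edge      timer
LOC:      25001         30   Local timer interrupts
ERR:          0
"""
print(local_timer_counts(sample))   # prints [25001, 30]
```

Diffing the returned lists across a one-second sleep gives the per-CPU tick rate directly.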
* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 12:19 ` Avi Kivity
@ 2010-03-21 14:55   ` Sebastian Hetze
       [not found]   ` <20100321145548.CC027A0015@mail.linux-ag.de>
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 14:55 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm

[-- Attachment #1: Type: text/plain, Size: 2049 bytes --]

On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>
>> 12:46:03    4  0,00  0,00  50,00  0,00  0,00  0,00  0,00  0,00  50,00
>>
>> So it is only CPU4 that is showing this strange behaviour.
>>
>
> Can you adjust irqtop to only count cpu4? Or even just post a few 'cat
> /proc/interrupts' from that guest.
>
> Most likely the timer interrupt for cpu4 died.

I've added two keys +/- to your irqtop to move up and down
through the available CPUs.
The irqtop for CPU4 shows a constant number of 6 local timer interrupts
per update, while the other CPUs show various higher values:

irqtop for cpu 4

eth0                           188
Rescheduling interrupts        162
Local timer interrupts           6
ata_piix                         3
TLB shootdowns                   1
Spurious interrupts              0
Machine check exceptions         0


irqtop for cpu 5

eth0                           257
Local timer interrupts         251
Rescheduling interrupts        237
Spurious interrupts              0
Machine check exceptions         0

So the timer interrupt for cpu4 is not completely dead but somehow
broken. What can cause this problem? Any way to speed it up again?
[-- Attachment #2: irqtop --]
[-- Type: text/plain, Size: 2201 bytes --]

#!/usr/bin/python

import curses
import sys, os, time, optparse

def read_interrupts():
    global target
    irq = {}
    proc = file('/proc/interrupts')
    nrcpu = len(proc.readline().split())
    if target < 0:
        target = 0
    if target > nrcpu:
        target = nrcpu
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        if target == 0:
            count = sum([int(x) for x in counts])
        else:
            count = int(counts[target-1])
        try:
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

target = 0
label_width = 35
number_width = 10

def tui(screen):
    curses.use_default_colors()
    global target
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        if target > 0:
            title = "irqtop for cpu %d" % (target - 1)
        else:
            title = "irqtop sum for all cpu's"
        screen.addstr(0, 0, title)
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()
    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
            if c == '+':
                target = target + 1
            if c == '-':
                target = target - 1
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)
[parent not found: <20100321145548.CC027A0015@mail.linux-ag.de>]
* Re: Strange CPU usage pattern in SMP guest
       [not found] ` <20100321145548.CC027A0015@mail.linux-ag.de>
@ 2010-03-21 15:17   ` Avi Kivity
  2010-03-21 15:47     ` Sebastian Hetze
  2010-03-22 12:51     ` Sebastian Hetze
  0 siblings, 2 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-21 15:17 UTC (permalink / raw)
To: Sebastian Hetze; +Cc: kvm, Marcelo Tosatti

On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>
>> Can you adjust irqtop to only count cpu4? Or even just post a few 'cat
>> /proc/interrupts' from that guest.
>>
>> Most likely the timer interrupt for cpu4 died.
>>
> I've added two keys +/- to your irqtop to move up and down
> through the available CPUs.
> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> per update, while the other CPUs show various higher values:
>
> irqtop for cpu 4
>
> eth0                           188
> Rescheduling interrupts        162
> Local timer interrupts           6
> ata_piix                         3
> TLB shootdowns                   1
> Spurious interrupts              0
> Machine check exceptions         0
>
>
> irqtop for cpu 5
>
> eth0                           257
> Local timer interrupts         251
> Rescheduling interrupts        237
> Spurious interrupts              0
> Machine check exceptions         0
>
> So the timer interrupt for cpu4 is not completely dead but somehow
> broken.

That is incredibly weird.

> What can cause this problem? Any way to speed it up again?
>

The host has 8 cpus and is only running this 6 vcpu guest, yes?

Can you confirm the other vcpus are ticking at 250 Hz?

What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
last-used-cpu field in the display.

Marcelo, any ideas?

--
error compiling committee.c: too many arguments to function
* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 15:17 ` Avi Kivity
@ 2010-03-21 15:47   ` Sebastian Hetze
  2010-03-22 12:51   ` Sebastian Hetze
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-21 15:47 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm, Marcelo Tosatti

On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>>
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?

The host is a dual quad-core E5520 with hyperthreading enabled, so we
see 2x4x2 = 16 CPUs on the host. The guest is started with 6 CPUs.

> Can you confirm the other vcpus are ticking at 250 Hz?

The irqtop shows different numbers for local timer interrupts on the
other CPUs. The total number (summed up over all CPUs) varies between
something like 700 and 1400. Any CPU can be down to 10 on one update and
up to 260 on the next. Only CPU4 stays at 6 local timer interrupts.

> What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> last-used-cpu field in the display.

The processes are not bound to a particular CPU, so the picture varies.
Here are two shots:

take 1:

  PID USER      PR  NI   VIRT   RES   SHR S %CPU %MEM    TIME+  P COMMAND
   15 root      RT  -5      0     0     0 S    0  0.0   0:01.70 4 migration/4
   16 root      15  -5      0     0     0 S    0  0.0   0:00.08 4 ksoftirqd/4
   17 root      RT  -5      0     0     0 S    0  0.0   0:00.00 4 watchdog/4
   25 root      15  -5      0     0     0 S    0  0.0   0:00.01 4 events/4
   35 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kintegrityd/4
   41 root      15  -5      0     0     0 S    0  0.0   0:00.03 4 kblockd/4
   50 root      15  -5      0     0     0 S    0  0.0   0:00.90 4 ata/4
   55 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kseriod
   66 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 aio/4
   73 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 crypto/4
   80 root      15  -5      0     0     0 S    0  0.0   2:11.71 4 scsi_eh_1
   87 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kmpathd/4
   95 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kondemand/4
  101 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kconservative/4
  103 root      10 -10      0     0     0 S    0  0.0   0:00.00 4 krfcommd
  681 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kdmflush
  686 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kdmflush
  691 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kdmflush
  737 root      15  -5      0     0     0 S    0  0.0   0:00.71 4 kjournald
  826 root      16  -4   2100   452   312 S    0  0.0   0:00.14 4 udevd
 1350 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kpsmoused
 1444 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kgameportd
 1718 root      15  -5      0     0     0 S    0  0.0   0:14.62 4 kjournald
 2108 statd     20   0   2252  1152   760 S    0  0.0   0:02.66 4 rpc.statd
 2117 root      15  -5      0     0     0 S    0  0.0   0:00.36 4 rpciod/4
 2123 root      15  -5      0     0     0 S    0  0.0   0:06.61 4 nfsiod
 2259 root      20   0   1696   444   440 S    0  0.0   0:00.00 4 getty
 2265 root      20   0   1696   444   440 S    0  0.0   0:00.00 4 getty
 2584 root      20   0   1860   656   604 S    0  0.0   0:01.92 4 automount
 2712 mysql     20   0   146m  4580  3464 S    0  0.1   0:59.71 4 mysqld
 2999 root      20   0  28944   12m  1704 S    0  0.2   1:23.04 4 console-kit-dae
 3834 root      20   0  13476  2640  1572 S    0  0.0   0:00.03 4 smbd
 4015 root      15  -5      0     0     0 S    0  0.0   7:22.33 4 nfsd
 4017 root      15  -5      0     0     0 S    0  0.0  10:46.87 4 nfsd
 4030 root      20   0   2644  1180   876 S    0  0.0   0:04.32 4 rpc.mountd
 4134 root      20   0   6164  1672  1344 S    0  0.0   0:00.37 4 master
 4311 root      20   0   5068  1796  1580 S    0  0.0   1:25.80 4 hald-addon-stor
 4560 root      20   0   2092   728   572 S    0  0.0   0:01.78 4 cron
 5053 www-data  20   0  33312  4092  1004 S    0  0.0   0:00.03 4 apache2

take 2:

  PID USER      PR  NI   VIRT   RES   SHR S %CPU %MEM    TIME+  P COMMAND
   15 root      RT  -5      0     0     0 S    0  0.0   0:01.70 4 migration/4
   16 root      15  -5      0     0     0 S    0  0.0   0:00.08 4 ksoftirqd/4
   17 root      RT  -5      0     0     0 S    0  0.0   0:00.00 4 watchdog/4
   25 root      15  -5      0     0     0 S    0  0.0   0:00.01 4 events/4
   35 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kintegrityd/4
   41 root      15  -5      0     0     0 S    0  0.0   0:00.03 4 kblockd/4
   50 root      15  -5      0     0     0 S    0  0.0   0:00.90 4 ata/4
   55 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kseriod
   66 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 aio/4
   73 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 crypto/4
   87 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kmpathd/4
   95 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kondemand/4
  101 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kconservative/4
  103 root      10 -10      0     0     0 S    0  0.0   0:00.00 4 krfcommd
  681 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kdmflush
  686 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kdmflush
  691 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kdmflush
  737 root      15  -5      0     0     0 S    0  0.0   0:00.71 4 kjournald
  826 root      16  -4   2100   452   312 S    0  0.0   0:00.14 4 udevd
 1350 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kpsmoused
 1444 root      15  -5      0     0     0 S    0  0.0   0:00.00 4 kgameportd
 2091 daemon    20   0   1820   552   464 S    0  0.0   0:44.31 4 portmap
 2108 statd     20   0   2252  1152   760 S    0  0.0   0:02.66 4 rpc.statd
 2117 root      15  -5      0     0     0 S    0  0.0   0:00.36 4 rpciod/4
 2123 root      15  -5      0     0     0 S    0  0.0   0:06.61 4 nfsiod
 2259 root      20   0   1696   444   440 S    0  0.0   0:00.00 4 getty
 2265 root      20   0   1696   444   440 S    0  0.0   0:00.00 4 getty
 2712 mysql     20   0   146m  4580  3464 S    0  0.1   0:59.72 4 mysqld
 3834 root      20   0  13476  2640  1572 S    0  0.0   0:00.03 4 smbd
 4017 root      15  -5      0     0     0 S    0  0.0  10:46.87 4 nfsd
 4134 root      20   0   6164  1672  1344 S    0  0.0   0:00.37 4 master
 4311 root      20   0   5068  1796  1580 S    0  0.0   1:25.80 4 hald-addon-stor
 4560 root      20   0   2092   728   572 S    0  0.0   0:01.78 4 cron
 5053 www-data  20   0  33312  4092  1004 S    0  0.0   0:00.03 4 apache2
* Re: Strange CPU usage pattern in SMP guest
  2010-03-21 15:17 ` Avi Kivity
  2010-03-21 15:47   ` Sebastian Hetze
@ 2010-03-22 12:51   ` Sebastian Hetze
  2010-03-23 21:18     ` Marcelo Tosatti
  1 sibling, 1 reply; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-22 12:51 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sebastian Hetze, kvm, Marcelo Tosatti

On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>>
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?
>
> Can you confirm the other vcpus are ticking at 250 Hz?
>
> What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> last-used-cpu field in the display.
>
> Marcelo, any ideas?

Just to let you know, right after startup, all vcpus work fine.

The following message might be related to the problem:

  hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns

The guest is a 32-bit system running on a 64-bit host.
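[Editorial note: the following back-of-the-envelope check is not part of the original thread; it connects the dmesg line above to the earlier irqtop observation, assuming a guest tick rate of HZ=250. Once the hrtimer hang logic has forced min_delta_ns up to ~166 ms, the clockevent on that CPU cannot be armed any sooner, so the nominal 4 ms periodic tick degrades to roughly 6 ticks per second -- matching the constant "Local timer interrupts 6" irqtop showed for CPU4.]

```python
HZ = 250
tick_ns = 1_000_000_000 // HZ          # nominal tick period: 4 ms
forced_min_delta_ns = 165_954_639      # from the guest dmesg line above

# The timer cannot fire more often than the larger of the two periods.
effective_ticks_per_sec = 1_000_000_000 // max(tick_ns, forced_min_delta_ns)
print(effective_ticks_per_sec)         # prints 6
```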
* Re: Strange CPU usage pattern in SMP guest
  2010-03-22 12:51           ` Sebastian Hetze
@ 2010-03-23 21:18             ` Marcelo Tosatti
  2010-03-30  8:27               ` Sebastian Hetze
       [not found]               ` <20100330082743.49A113030135@mail.linux-ag.de>
  0 siblings, 2 replies; 11+ messages in thread
From: Marcelo Tosatti @ 2010-03-23 21:18 UTC (permalink / raw)
  To: Sebastian Hetze; +Cc: Avi Kivity, kvm

[-- Attachment #1: Type: text/plain, Size: 3128 bytes --]

On Mon, Mar 22, 2010 at 01:51:20PM +0100, Sebastian Hetze wrote:
> On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> > On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> > >> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> > >>
> > >>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
> > >>>
> > >>>> 12:46:02  CPU   %usr  %nice   %sys %iowait  %irq  %soft %steal %guest  %idle
> > >>>> 12:46:03  all   0,20  11,35  10,96    8,96  0,40   2,99   0,00   0,00  65,14
> > >>>> 12:46:03    0   1,00  11,00   7,00   15,00  0,00   1,00   0,00   0,00  65,00
> > >>>> 12:46:03    1   0,00   7,14   2,04    6,12  1,02  11,22   0,00   0,00  72,45
> > >>>> 12:46:03    2   0,00  15,00   1,00   12,00  0,00   1,00   0,00   0,00  71,00
> > >>>> 12:46:03    3   0,00  11,00  23,00    8,00  0,00   0,00   0,00   0,00  58,00
> > >>>> 12:46:03    4   0,00   0,00  50,00    0,00  0,00   0,00   0,00   0,00  50,00
> > >>>> 12:46:03    5   0,00  13,00  20,00    4,00  0,00   1,00   0,00   0,00  62,00
> > >>>>
> > >>>> So it is only CPU4 that is showing this strange behaviour.
> > >>>>
> > >>>>
> > >>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
> > >>> /proc/interrupts' from that guest.
> > >>>
> > >>> Most likely the timer interrupt for cpu4 died.
> > >>>
> > >> I've added two keys +/- to your irqtop to focus up and down
> > >> in the row of available CPUs.
> > >> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> > >> per update, while the other CPUs show various higher values:
> > >>
> > >> irqtop for cpu 4
> > >>
> > >> eth0                        188
> > >> Rescheduling interrupts     162
> > >> Local timer interrupts        6
> > >> ata_piix                      3
> > >> TLB shootdowns                1
> > >> Spurious interrupts           0
> > >> Machine check exceptions      0
> > >>
> > >>
> > >> irqtop for cpu 5
> > >>
> > >> eth0                        257
> > >> Local timer interrupts      251
> > >> Rescheduling interrupts     237
> > >> Spurious interrupts           0
> > >> Machine check exceptions      0
> > >>
> > >> So the timer interrupt for cpu4 is not completely dead but somehow
> > >> broken.
> > >
> > > That is incredibly weird.
> > >
> > >> What can cause this problem? Any way to speed it up again?
> > >>
> > >
> > > The host has 8 cpus and is only running this 6 vcpu guest, yes?
> > >
> > > Can you confirm the other vcpus are ticking at 250 Hz?
> > >
> > > What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> > > last-used-cpu field in the display.
> > >
> > > Marcelo, any ideas?
>
> Just to let you know, right after startup, all vcpus work fine.
>
> The following message might be related to the problem:
> hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
>
> The guest is a 32-bit system running on a 64-bit host.

Sebastian,

Please apply the attached patch to your guest kernel.

[-- Attachment #2: hrtimer-tune-hrtimer_interrupt-hang-logic.patch --]
[-- Type: text/plain, Size: 7973 bytes --]

commit ff28aa973164f8e5f4b2633e178162d121c4e5eb
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Fri Nov 13 17:05:44 2009 +0100

    hrtimer: Tune hrtimer_interrupt hang logic
    
    The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
    execution time of the hrtimer callbacks.
    
    This is error-prone for virtual machines, where a guest vcpu can be
    scheduled out during the execution of the callbacks (and the callbacks
    themselves can do operations that translate to blocking operations in
    the hypervisor), which in turn can lead to large min_delta_ns rendering
    the system unusable.
    
    Replace the current heuristics with something more reliable. Allow the
    interrupt code to try 3 times to catch up with the lost time. If that
    fails use the total time spent in the interrupt handler to defer the
    next timer interrupt so the system can catch up with other things
    which got delayed. Limit that deferment to 100ms.
    
    The retry events and the maximum time spent in the interrupt handler
    are recorded and exposed via /proc/timer_list
    
    Inspired by a patch from Marcelo.
    
    Reported-by: Michael Tokarev <mjt@tls.msk.ru>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: kvm@vger.kernel.org

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 4759917..2e1064f 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -164,10 +164,11 @@ struct hrtimer_clock_base {
  * @expires_next:	absolute time of the next event which was scheduled
  *			via clock_set_next_event()
  * @hres_active:	State of high resolution mode
- * @check_clocks:	Indictator, when set evaluate time source and clock
- *			event devices whether high resolution mode can be
- *			activated.
- * @nr_events:		Total number of timer interrupt events
+ * @hang_detected:	The last hrtimer interrupt detected a hang
+ * @nr_events:		Total number of hrtimer interrupt events
+ * @nr_retries:		Total number of hrtimer interrupt retries
+ * @nr_hangs:		Total number of hrtimer interrupt hangs
+ * @max_hang_time:	Maximum time spent in hrtimer_interrupt
  */
 struct hrtimer_cpu_base {
 	spinlock_t			lock;
@@ -175,7 +176,11 @@ struct hrtimer_cpu_base {
 #ifdef CONFIG_HIGH_RES_TIMERS
 	ktime_t				expires_next;
 	int				hres_active;
+	int				hang_detected;
 	unsigned long			nr_events;
+	unsigned long			nr_retries;
+	unsigned long			nr_hangs;
+	ktime_t				max_hang_time;
 #endif
 };
 
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 49da79a..4b0638b 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -584,7 +584,7 @@ static void hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base)
 static int hrtimer_reprogram(struct hrtimer *timer,
 			     struct hrtimer_clock_base *base)
 {
-	ktime_t *expires_next = &__get_cpu_var(hrtimer_bases).expires_next;
+	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
 	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 	int res;
 
@@ -609,7 +609,16 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 	if (expires.tv64 < 0)
 		return -ETIME;
 
-	if (expires.tv64 >= expires_next->tv64)
+	if (expires.tv64 >= cpu_base->expires_next.tv64)
+		return 0;
+
+	/*
+	 * If a hang was detected in the last timer interrupt then we
+	 * do not schedule a timer which is earlier than the expiry
+	 * which we enforced in the hang detection. We want the system
+	 * to make progress.
+	 */
+	if (cpu_base->hang_detected)
 		return 0;
 
 	/*
@@ -617,7 +626,7 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 	 */
 	res = tick_program_event(expires, 0);
 	if (!IS_ERR_VALUE(res))
-		*expires_next = expires;
+		cpu_base->expires_next = expires;
 	return res;
 }
 
@@ -1234,29 +1243,6 @@ static void __run_hrtimer(struct hrtimer *timer)
 
 #ifdef CONFIG_HIGH_RES_TIMERS
 
-static int force_clock_reprogram;
-
-/*
- * After 5 iteration's attempts, we consider that hrtimer_interrupt()
- * is hanging, which could happen with something that slows the interrupt
- * such as the tracing. Then we force the clock reprogramming for each future
- * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
- * threshold that we will overwrite. The next tick event will be scheduled
- * to 3 times we currently spend on hrtimer_interrupt(). This gives a good
- * compromise, the cpus will spend 1/4 of their time to process the hrtimer
- * interrupts. This is enough to let it running without serious starvation.
- */
-
-static inline void
-hrtimer_interrupt_hanging(struct clock_event_device *dev,
-			  ktime_t try_time)
-{
-	force_clock_reprogram = 1;
-	dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
-	printk(KERN_WARNING "hrtimer: interrupt too slow, "
-		"forcing clock min delta to %lu ns\n", dev->min_delta_ns);
-}
 /*
  * High resolution timer interrupt
  * Called with interrupts disabled
@@ -1265,21 +1251,15 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 {
 	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
 	struct hrtimer_clock_base *base;
-	ktime_t expires_next, now;
-	int nr_retries = 0;
-	int i;
+	ktime_t expires_next, now, entry_time, delta;
+	int i, retries = 0;
 
 	BUG_ON(!cpu_base->hres_active);
 	cpu_base->nr_events++;
 	dev->next_event.tv64 = KTIME_MAX;
 
- retry:
-	/* 5 retries is enough to notice a hang */
-	if (!(++nr_retries % 5))
-		hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
-
-	now = ktime_get();
-
+	entry_time = now = ktime_get();
+retry:
 	expires_next.tv64 = KTIME_MAX;
 
 	spin_lock(&cpu_base->lock);
@@ -1341,10 +1321,48 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	spin_unlock(&cpu_base->lock);
 
 	/* Reprogramming necessary ? */
-	if (expires_next.tv64 != KTIME_MAX) {
-		if (tick_program_event(expires_next, force_clock_reprogram))
-			goto retry;
+	if (expires_next.tv64 == KTIME_MAX ||
+	    !tick_program_event(expires_next, 0)) {
+		cpu_base->hang_detected = 0;
+		return;
 	}
+
+	/*
+	 * The next timer was already expired due to:
+	 * - tracing
+	 * - long lasting callbacks
+	 * - being scheduled away when running in a VM
+	 *
+	 * We need to prevent that we loop forever in the hrtimer
+	 * interrupt routine. We give it 3 attempts to avoid
+	 * overreacting on some spurious event.
+	 */
+	now = ktime_get();
+	cpu_base->nr_retries++;
+	if (++retries < 3)
+		goto retry;
+	/*
+	 * Give the system a chance to do something else than looping
+	 * here. We stored the entry time, so we know exactly how long
+	 * we spent here. We schedule the next event this amount of
+	 * time away.
+	 */
+	cpu_base->nr_hangs++;
+	cpu_base->hang_detected = 1;
+	delta = ktime_sub(now, entry_time);
+	if (delta.tv64 > cpu_base->max_hang_time.tv64)
+		cpu_base->max_hang_time = delta;
+	/*
+	 * Limit it to a sensible value as we enforce a longer
+	 * delay. Give the CPU at least 100ms to catch up.
+	 */
+	if (delta.tv64 > 100 * NSEC_PER_MSEC)
+		expires_next = ktime_add_ns(now, 100 * NSEC_PER_MSEC);
+	else
+		expires_next = ktime_add(now, delta);
+	tick_program_event(expires_next, 1);
+	printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
+		    ktime_to_ns(delta));
 }
 
 /*
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index fddd69d..7cd905c 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -150,6 +150,9 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
 	P_ns(expires_next);
 	P(hres_active);
 	P(nr_events);
+	P(nr_retries);
+	P(nr_hangs);
+	P_ns(max_hang_time);
 #endif
 #undef P
 #undef P_ns
@@ -252,7 +255,7 @@ static int timer_list_show(struct seq_file *m, void *v)
 	u64 now = ktime_to_ns(ktime_get());
 	int cpu;
 
-	SEQ_printf(m, "Timer List Version: v0.4\n");
+	SEQ_printf(m, "Timer List Version: v0.5\n");
 	SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES);
 	SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now);

^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: Strange CPU usage pattern in SMP guest
  2010-03-23 21:18             ` Marcelo Tosatti
@ 2010-03-30  8:27               ` Sebastian Hetze
       [not found]               ` <20100330082743.49A113030135@mail.linux-ag.de>
  1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-03-30 8:27 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Sebastian Hetze, Avi Kivity, kvm

On Tue, Mar 23, 2010 at 06:18:08PM -0300, Marcelo Tosatti wrote:
> On Mon, Mar 22, 2010 at 01:51:20PM +0100, Sebastian Hetze wrote:
> > On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> > > On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
> > >> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> > >>
> > >>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
> > >>>
> > >>>> 12:46:02  CPU   %usr  %nice   %sys %iowait  %irq  %soft %steal %guest  %idle
> > >>>> 12:46:03  all   0,20  11,35  10,96    8,96  0,40   2,99   0,00   0,00  65,14
> > >>>> 12:46:03    0   1,00  11,00   7,00   15,00  0,00   1,00   0,00   0,00  65,00
> > >>>> 12:46:03    1   0,00   7,14   2,04    6,12  1,02  11,22   0,00   0,00  72,45
> > >>>> 12:46:03    2   0,00  15,00   1,00   12,00  0,00   1,00   0,00   0,00  71,00
> > >>>> 12:46:03    3   0,00  11,00  23,00    8,00  0,00   0,00   0,00   0,00  58,00
> > >>>> 12:46:03    4   0,00   0,00  50,00    0,00  0,00   0,00   0,00   0,00  50,00
> > >>>> 12:46:03    5   0,00  13,00  20,00    4,00  0,00   1,00   0,00   0,00  62,00
> > >>>>
> > >>>> So it is only CPU4 that is showing this strange behaviour.
> > >>>>
> > >>>>
> > >>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
> > >>> /proc/interrupts' from that guest.
> > >>>
> > >>> Most likely the timer interrupt for cpu4 died.
> > >>>
> > >> I've added two keys +/- to your irqtop to focus up and down
> > >> in the row of available CPUs.
> > >> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
> > >> per update, while the other CPUs show various higher values:
> > >>
> > >> irqtop for cpu 4
> > >>
> > >> eth0                        188
> > >> Rescheduling interrupts     162
> > >> Local timer interrupts        6
> > >> ata_piix                      3
> > >> TLB shootdowns                1
> > >> Spurious interrupts           0
> > >> Machine check exceptions      0
> > >>
> > >>
> > >> irqtop for cpu 5
> > >>
> > >> eth0                        257
> > >> Local timer interrupts      251
> > >> Rescheduling interrupts     237
> > >> Spurious interrupts           0
> > >> Machine check exceptions      0
> > >>
> > >> So the timer interrupt for cpu4 is not completely dead but somehow
> > >> broken.
> > >
> > > That is incredibly weird.
> > >
> > >> What can cause this problem? Any way to speed it up again?
> > >>
> > >
> > > The host has 8 cpus and is only running this 6 vcpu guest, yes?
> > >
> > > Can you confirm the other vcpus are ticking at 250 Hz?
> > >
> > > What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a
> > > last-used-cpu field in the display.
> > >
> > > Marcelo, any ideas?
> >
> > Just to let you know, right after startup, all vcpus work fine.
> >
> > The following message might be related to the problem:
> > hrtimer: interrupt too slow, forcing clock min delta to 165954639 ns
> >
> > The guest is a 32-bit system running on a 64-bit host.
>
> Sebastian,
>
> Please apply the attached patch to your guest kernel.
>

With this patch applied, the system has been running without hrtimer
messages for five days and the timer interrupts look fine. However,
I did get the
Clocksource tsc unstable (delta = -4398046474878 ns)
message that I reported on Sunday.

Actually, when restarting the system with the hrtimer patch applied,
we also changed the BIOS setting to disable Intel SmartStep on the host.
Since there are no hrtimer messages at all, it might be that the SmartStep
CPU frequency adjustment is the real cause for the slow interrupts in
the KVM guest. Has anyone else experienced these problems?
^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <20100330082743.49A113030135@mail.linux-ag.de>]
* Re: Strange CPU usage pattern in SMP guest
       [not found] ` <20100330082743.49A113030135@mail.linux-ag.de>
@ 2010-04-05 21:24   ` Sebastian Hetze
  0 siblings, 0 replies; 11+ messages in thread
From: Sebastian Hetze @ 2010-04-05 21:24 UTC (permalink / raw)
  To: Sebastian Hetze; +Cc: Marcelo Tosatti, Avi Kivity, kvm

On Tue, Mar 30, 2010 at 10:27:43AM +0200, Sebastian Hetze wrote:
> With this patch applied, the system has been running without hrtimer
> messages for five days and the timer interrupts look fine.
>
> Actually, when restarting the system with the hrtimer patch applied,
> we also changed the BIOS setting to disable Intel SmartStep on the host.
> Since there are no hrtimer messages at all, it might be that the SmartStep
> CPU frequency adjustment is the real cause for the slow interrupts in
> the KVM guest. Has anyone else experienced these problems?

Back with SmartStep enabled, we get
"hrtimer: interrupt took 119986041 ns"
so it really looks like this SmartStep feature is the actual cause for
the interrupts being delayed in the first place. The system continues
to run correctly so far, so the patch does fix the problems caused by
these delayed interrupts.

Thanks a lot & best regards,

  Sebastian

^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-04-05 21:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-21  0:13 Strange CPU usage pattern in SMP guest Sebastian Hetze
2010-03-21 10:09 ` Avi Kivity
2010-03-21 12:02   ` Sebastian Hetze
     [not found]   ` <20100321120236.55228A0015@mail.linux-ag.de>
2010-03-21 12:19     ` Avi Kivity
2010-03-21 14:55       ` Sebastian Hetze
     [not found]       ` <20100321145548.CC027A0015@mail.linux-ag.de>
2010-03-21 15:17         ` Avi Kivity
2010-03-21 15:47           ` Sebastian Hetze
2010-03-22 12:51           ` Sebastian Hetze
2010-03-23 21:18             ` Marcelo Tosatti
2010-03-30  8:27               ` Sebastian Hetze
     [not found]               ` <20100330082743.49A113030135@mail.linux-ag.de>
2010-04-05 21:24                 ` Sebastian Hetze