linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Shanahan <kmshanah@ucwb.org.au>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Avi Kivity" <avi@redhat.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Kernel Testers List" <kernel-testers@vger.kernel.org>,
	"Mike Galbraith" <efault@gmx.de>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	bugme-daemon@bugzilla.kernel.org
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)
Date: Mon, 26 Jan 2009 20:25:30 +1030	[thread overview]
Message-ID: <1232963730.4921.12.camel@kulgan.wumi.org.au> (raw)
In-Reply-To: <20090121151820.GA23813@elte.hu>

On Wed, 2009-01-21 at 16:18 +0100, Ingo Molnar wrote:
> * Avi Kivity <avi@redhat.com> wrote:
> > It means, a scheduling problem.  Can you run the latency tracer (which 
> > only works with realtime priority), so we can tell if it is (a) kvm 
> > failing to wake up the vcpu properly or (b) the scheduler delaying the 
> > vcpu from running.
> 
> Could we please get an ftrace capture of the incident?
> 
> Firstly, it makes sense to simplify the tracing environment as much as 
> possible: for example single-CPU traces are much easier to interpret.
> 
> Can you reproduce it with just one CPU online? I.e. if you offline all the 
> other cores via:
> 
>   echo 0 > /sys/devices/system/cpu/cpu1/online
> 
>   [etc.]
> 
> and keep CPU#0 only, do the latencies still occur?
> 
> If they do still occur, then please do the traces that way.
> 
> [ If they do not occur then switch back on all CPUs - we'll sort out the
>   traces ;-) ]
> 
> Then please build a function tracer kernel, by enabling:
> 
>   CONFIG_FUNCTION_TRACER=y
>   CONFIG_FUNCTION_GRAPH_TRACER=y
>   CONFIG_DYNAMIC_FTRACE=y
> 
> Once you boot into such a kernel, you can switch on function tracing via:
> 
>   cd /debug/tracing/
> 
>   echo 0 > tracing_enabled
>   echo function_graph > current_tracer
>   echo funcgraph-proc > trace_options 
> 
> It does not run yet, first find a suitable set of functions to trace. For 
> example this will be a pretty good starting point for scheduler+KVM 
> problems:
> 
>   echo ''         > set_ftrace_filter  # clear filter functions
>   echo '*sched*' >> set_ftrace_filter 
>   echo '*wake*'  >> set_ftrace_filter
>   echo '*kvm*'   >> set_ftrace_filter
>   echo 1 > tracing_enabled             # let the tracer go
> 
> You can see your current selection of functions to trace via 'cat 
> set_ftrace_filter', and you can see all functions via 'cat 
> available_filter_functions'.
> 
> You can also trace all functions via:
> 
>   echo '*' > set_ftrace_filter
> 
> Tracer output can be captured from the 'trace' file. It should look like 
> this:
> 
>  15)   cc1-28106    |   0.263 us    |    page_evictable();
>  15)   cc1-28106    |               |    lru_cache_add_lru() {
>  15)   cc1-28106    |   0.252 us    |      __lru_cache_add();
>  15)   cc1-28106    |   0.738 us    |    }
>  15)   cc1-28106    | + 74.026 us   |  }
>  15)   cc1-28106    |               |  up_read() {
>  15)   cc1-28106    |   0.257 us    |    _spin_lock_irqsave();
>  15)   cc1-28106    |   0.253 us    |    _spin_unlock_irqrestore();
>  15)   cc1-28106    |   1.329 us    |  }
> 
> To capture a continuous stream of all trace data you can do:
> 
>   cat trace_pipe > /tmp/trace.txt
> 
> (this will also drain the trace ringbuffers.)
> 
> Note that this can be quite expensive if there are a lot of functions that 
> are traced - so it makes sense to trim down the set of traced functions to 
> only the interesting ones. Which are the interesting ones can be 
> determined from looking at the traces. You should see your KVM threads 
> getting active every second as the ping happens.
> 
> If you get lost events you can increase the trace buffer size via the 
> buffer_size_kb control - the default is around 1.4 MB.
> 
> Let me know if any of these steps is causing problems or if interpreting 
> the traces is difficult.

Just carrying out the steps was okay, but I don't really know what I'm
looking at. I've uploaded the trace here (about 10 seconds worth, I
think):

  http://disenchant.net/tmp/bug-12465/trace-1/

The guest being pinged is process 4353:

kmshanah@flexo:~$ pstree -p 4353
qemu-system-x86(4353)─┬─{qemu-system-x86}(4354)
                      ├─{qemu-system-x86}(4355)
                      └─{qemu-system-x86}(4772)

I guess the larger overhead/duration values are what we are looking for,
e.g.:

kmshanah@flexo:~$ bzgrep -E '[[:digit:]]{6,}' trace.txt.bz2 
 0)   ksoftir-4    | ! 3010470 us |  }
 0)  qemu-sy-4354  | ! 250406.2 us |    }
 0)  qemu-sy-4354  | ! 250407.0 us |  }
 0)  qemu-sy-4354  | ! 362946.3 us |    }
 0)  qemu-sy-4354  | ! 362947.0 us |  }
 0)  qemu-sy-4177  | ! 780480.3 us |  }
 0)  qemu-sy-4354  | ! 117685.7 us |    }
 0)  qemu-sy-4354  | ! 117686.5 us |  }

That ksoftirqd value is a bit strange (> 3 seconds, or is the formatting
wrong?). I guess I still need some guidance to know what I'm looking at
with this trace and/or what to do next.

Cheers,
Kevin.



  parent reply	other threads:[~2009-01-26  9:56 UTC|newest]

Thread overview: 133+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-19 21:41 2.6.29-rc2-git1: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-01-19 21:41 ` [Bug #11849] default IRQ affinity change in v2.6.27 (breaking several SMP PPC based systems) Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12061] snd_hda_intel: power_save: sound cracks on powerdown Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12208] uml is very slow on 2.6.28 host Rafael J. Wysocki
2009-01-26 11:35   ` Miklos Szeredi
2009-01-19 21:45 ` [Bug #12160] networking oops after resume from s2ram (2.6.28-rc6) Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12159] 2.6.28-rc6-git1 -- No sound produced from Intel HDA ALSA driver Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12263] Sata soft reset filling log Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12260] Regression due to commit 2b80848e3818fb1c (p54usb: support LM87 firmwares) Rafael J. Wysocki
2009-01-20 22:11   ` [PATCH -stable] p54usb: fix traffic stalls / packet drop Christian Lamparter
2009-01-20 22:36     ` Rafael J. Wysocki
2009-01-20 22:39     ` Greg KH
2009-01-20 23:56       ` John W. Linville
2009-01-21 14:03         ` Christian Lamparter
2009-01-19 21:45 ` [Bug #12224] journal activity on inactive partition causes inactive harddrive spinup Rafael J. Wysocki
2009-01-20 13:03   ` Theodore Tso
2009-01-19 21:45 ` [Bug #12209] oldish top core dumps (in its meminfo() function) Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12265] FPU emulation broken in 2.6.28-rc8 ? Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12264] i915: switching from kwin in opengl mode to a VT then back to x11, x11 freezes Rafael J. Wysocki
2009-01-20 18:13   ` Caleb Cushing
2009-01-19 21:45 ` [Bug #12337] ~100 extra wakeups reported by powertop Rafael J. Wysocki
2009-01-20  9:38   ` Alberto Gonzalez
2009-01-19 21:45 ` [Bug #12391] Processor does not go below C2 state until usb.autosuspend is enabled Rafael J. Wysocki
2009-01-27 10:27   ` Pavel Machek
2009-01-19 21:45 ` [Bug #12395] 2.6.28-rc9: oprofile regression Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12393] debugging in dosemu causes lots of 'scheduling while atomic' Rafael J. Wysocki
2009-01-20  9:58   ` Michal Suchanek
2009-01-19 21:45 ` [Bug #12396] hwinfo problem since 2.6.28 Rafael J. Wysocki
2009-01-26 14:00   ` Beschorner Daniel
2009-01-19 21:45 ` [Bug #12404] Oops in 2.6.28-rc9 and -rc8 -- mtrr issues / e1000e Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12403] TTY problem on linux-2.6.28-rc7 Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12401] 2.6.28 regression: xbacklight broken on ThinkPad X61s Rafael J. Wysocki
2009-01-20  7:30   ` Tino Keitel
2009-01-19 21:45 ` [Bug #12406] 2.6.28 thinks that my PS/2 mouse is a touchpad Rafael J. Wysocki
2009-01-20  1:45   ` Arjan Opmeer
2009-01-20  9:19     ` Dmitry Torokhov
2009-01-22  6:29       ` Alexander E. Patrakov
2009-01-19 21:45 ` [Bug #12407] Kernel 2.6.28 regression: Hang after hibernate Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12408] Funny problem with 2.6.28: Kernel stalls Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12405] oops in __bounce_end_io_read under kvm Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12409] NULL pointer dereference at get_stats() Rafael J. Wysocki
2009-01-21 16:18   ` Frederik Deweerdt
2009-01-24  0:39     ` Tetsuo Handa
2009-02-07  2:34     ` Tetsuo Handa
2009-02-09 11:19       ` Tetsuo Handa
2009-02-11 22:54         ` Alok Kataria
2009-02-11 23:02           ` Alok Kataria
2009-02-13 11:54       ` Tetsuo Handa
2009-01-19 21:45 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-01-20  0:12   ` Kevin Shanahan
2009-01-20 11:35     ` Ingo Molnar
2009-01-20 12:37       ` Avi Kivity
2009-01-20 12:42       ` Kevin Shanahan
2009-01-20 12:56         ` Ingo Molnar
2009-01-20 13:07           ` Ingo Molnar
2009-01-20 14:59             ` Steven Rostedt
2009-01-20 15:04               ` Ingo Molnar
2009-01-20 17:53                 ` Steven Rostedt
2009-01-20 18:39                   ` Ingo Molnar
2009-01-20 17:47               ` Avi Kivity
2009-01-21 14:25                 ` Kevin Shanahan
2009-01-21 14:34                   ` Avi Kivity
2009-01-21 14:51                     ` Kevin Shanahan
2009-01-21 14:59                       ` Avi Kivity
2009-01-21 15:13                         ` Steven Rostedt
2009-01-22  1:48                         ` Steven Rostedt
2009-01-21 15:10                     ` Steven Rostedt
2009-01-21 15:18                     ` Ingo Molnar
2009-01-22 19:57                       ` Kevin Shanahan
2009-01-22 20:31                         ` Ingo Molnar
2009-01-26  9:55                       ` Kevin Shanahan [this message]
2009-01-26 11:35                         ` Peter Zijlstra
2009-01-26 14:38                           ` [RFC][PATCH] ftrace: function graph trace context switches Peter Zijlstra
2009-01-26 15:39                             ` Frédéric Weisbecker
2009-01-26 15:41                             ` Steven Rostedt
2009-03-16 17:57                             ` Frederic Weisbecker
2009-01-26 15:00                           ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Ingo Molnar
2009-01-20 14:23           ` Kevin Shanahan
2009-01-20 14:25             ` Ingo Molnar
2009-01-20 15:51               ` Kevin Shanahan
2009-01-20 16:06                 ` Ingo Molnar
2009-01-20 16:19                   ` Peter Zijlstra
2009-01-20 14:46             ` Frédéric Weisbecker
2009-01-20 13:04         ` Avi Kivity
2009-01-20 17:54           ` Kevin Shanahan
2009-01-20 18:42             ` Ingo Molnar
2009-01-19 21:45 ` [Bug #12411] 2.6.28: BUG in r8169 Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12426] TMDC Joystick no longer works in kernel 2.6.28 Rafael J. Wysocki
2009-01-21  0:48   ` Andrew S. Johnson
2009-01-22 13:34     ` Jiri Kosina
2009-01-23  2:06       ` Andrew S. Johnson
2009-01-26 11:49         ` Jiri Kosina
2009-01-19 21:45 ` [Bug #12483] Reference to inexistent struct dmi_device_id breaks the build Rafael J. Wysocki
2009-01-20  8:15   ` Jean Delvare
2009-01-19 21:45 ` [Bug #12500] r8169: NETDEV WATCHDOG: eth0 (r8169): transmit timed out Rafael J. Wysocki
2009-01-22 16:43 ` 2.6.29-rc2-git1: Reported regressions 2.6.27 -> 2.6.28 Jörg-Volker Peetz
2009-01-24 13:25 ` Rolf Eike Beer
2009-02-04 10:55 2.6.29-rc3-git6: " Rafael J. Wysocki
2009-02-04 10:58 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-02-05 19:35   ` Kevin Shanahan
2009-02-05 22:37     ` Rafael J. Wysocki
2009-02-14 20:48 2.6.29-rc5: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-02-14 20:50 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-02-23 22:00 2.6.29-rc6: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-02-23 22:03 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-02-24  0:59   ` Kevin Shanahan
2009-02-24  1:37     ` Rafael J. Wysocki
2009-02-24 12:09     ` Avi Kivity
2009-02-24 22:11       ` Kevin Shanahan
2009-03-03 19:34 2.6.29-rc6-git7: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-03-03 19:41 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-03-04  3:08   ` Kevin Shanahan
2009-03-08 10:04     ` Avi Kivity
2009-03-14 19:11 2.6.29-rc8: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-03-15  9:03   ` Kevin Shanahan
2009-03-15  9:18     ` Avi Kivity
2009-03-15  9:48       ` Ingo Molnar
2009-03-15  9:56         ` Avi Kivity
2009-03-15 10:03           ` Ingo Molnar
2009-03-15 10:13             ` Avi Kivity
2009-03-16  9:49     ` Avi Kivity
2009-03-16 12:46       ` Kevin Shanahan
2009-03-16 20:07         ` Frederic Weisbecker
2009-03-16 22:55           ` Kevin Shanahan
2009-03-18  0:20             ` Frederic Weisbecker
2009-03-18  1:16               ` Kevin Shanahan
2009-03-18  2:24                 ` Frederic Weisbecker
2009-03-18 21:24                 ` Kevin Shanahan
2009-03-21  5:00                   ` Kevin Shanahan
2009-03-21 14:08                     ` Frederic Weisbecker
2009-03-24 11:44                     ` Frederic Weisbecker
2009-03-24 11:47                       ` Frederic Weisbecker
2009-03-25 23:40                       ` Kevin Shanahan
2009-03-25 23:48                         ` Frederic Weisbecker
2009-03-26 20:22                       ` Kevin Shanahan
2009-03-21 17:01 2.6.29-rc8-git5: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-03-21 17:07 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-03-21 19:50   ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1232963730.4921.12.camel@kulgan.wumi.org.au \
    --to=kmshanah@ucwb.org.au \
    --cc=a.p.zijlstra@chello.nl \
    --cc=avi@redhat.com \
    --cc=bugme-daemon@bugzilla.kernel.org \
    --cc=efault@gmx.de \
    --cc=fweisbec@gmail.com \
    --cc=kernel-testers@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rjw@sisk.pl \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).