* kvm deadlock
@ 2011-12-05 22:48 Nate Custer
  2011-12-14 12:25 ` Marcelo Tosatti
  0 siblings, 1 reply; 29+ messages in thread
From: Nate Custer @ 2011-12-05 22:48 UTC
  To: kvm

Hello,

I am struggling with repeatable full hardware lockups when running 8-12 KVM VMs. At some point before the hard lock I get an inconsistent lock state warning. An example of this can be found here:

http://pastebin.com/8wKhgE2C

After that the server continues to run for a while and then starts its death spiral. When it reaches that point it fails to log anything further to the disk, but by attaching a console I have been able to get a stack trace documenting the final implosion:

http://pastebin.com/PbcN76bd

All of the cores end up hung and the server stops responding to all input, including SysRq commands. 

I have seen this behavior on two machines (dual E5606 running Fedora 16). Both passed cpuburnin testing and memtest86 scans without error.

I have reproduced the crash and stack traces with a Fedora debugging kernel (3.1.2-1) and with a vanilla 3.1.4 kernel.

Nate Custer
QA Analyst
cPanel Inc

* Re: [RFT PATCH] blkio: alloc per cpu data from worker thread context( Re: kvm deadlock)
@ 2011-12-16 18:47 Nate Custer
  0 siblings, 0 replies; 29+ messages in thread
From: Nate Custer @ 2011-12-16 18:47 UTC
  To: Vivek Goyal; +Cc: Jens Axboe, Avi Kivity, Marcelo Tosatti, kvm, linux-kernel


On Dec 15, 2011, at 1:47 PM, Vivek Goyal wrote:
> Ok, I continued to develop on the patch which tries to allocate per cpu
> stats from worker thread context. Here is the patch.
> 
> Can the reporter please try out the patch and see if it helps. I am not
> sure if deadlock was because of mutex issue or not, but it should help
> get rid of lockdep warning.
> 
> This patch is on top of for-3.3/core branch of jens's linux-block tree.
> If it does not apply on your kernel version, do let me know the version 
> you are testing with and I will generate another version of patch.
> 
> If testing results are good, I will break down the patch in small pieces
> and post as a series separately.
> 
> Thanks
> Vivek

Running on a Fedora 16 box with the patch applied to the linux-block tree, I still had deadlocks. In fact, they seemed to happen much faster and with lighter workloads.

I was able to get netconsole set up, and a full stack trace is posted here:

http://pastebin.com/9Rq68exU

Nate



end of thread, other threads:[~2011-12-20 20:45 UTC | newest]

Thread overview: 29+ messages
2011-12-05 22:48 kvm deadlock Nate Custer
2011-12-14 12:25 ` Marcelo Tosatti
2011-12-14 13:43   ` Avi Kivity
2011-12-14 14:00     ` Marcelo Tosatti
2011-12-14 14:02       ` Avi Kivity
2011-12-14 14:06         ` Marcelo Tosatti
2011-12-14 14:17           ` Nate Custer
2011-12-14 14:20             ` Marcelo Tosatti
2011-12-14 14:28             ` Avi Kivity
2011-12-14 14:27           ` Avi Kivity
2011-12-14 16:03     ` Jens Axboe
2011-12-14 17:03       ` Vivek Goyal
2011-12-14 17:09         ` Jens Axboe
2011-12-14 17:22           ` Vivek Goyal
2011-12-14 18:16             ` Tejun Heo
2011-12-14 18:41               ` Vivek Goyal
2011-12-14 23:06                 ` Vivek Goyal
2011-12-15 19:47       ` [RFT PATCH] blkio: alloc per cpu data from worker thread context( Re: kvm deadlock) Vivek Goyal
     [not found]         ` <E73DB38E-AFC5-445D-9E76-DE599B36A814@cpanel.net>
2011-12-16 20:29           ` Vivek Goyal
2011-12-18 21:25             ` Nate Custer
2011-12-19 13:40               ` Vivek Goyal
2011-12-19 17:27               ` Vivek Goyal
2011-12-19 17:35                 ` Tejun Heo
2011-12-19 18:27                   ` Vivek Goyal
2011-12-19 22:56                     ` Tejun Heo
2011-12-20 14:50                       ` Vivek Goyal
2011-12-20 20:45                         ` Tejun Heo
2011-12-20 12:49                     ` Jens Axboe
2011-12-16 18:47 Nate Custer
