From: Jens Axboe <axboe@kernel.dk>
To: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	Nate Custer <nate@cpanel.net>,
	kvm@vger.kernel.org, linux-kernel <linux-kernel@vger.kernel.org>,
	Vivek Goyal <vgoyal@redhat.com>
Subject: Re: kvm deadlock
Date: Wed, 14 Dec 2011 17:03:54 +0100
Message-ID: <4EE8C8EA.9070207@kernel.dk>
In-Reply-To: <4EE8A7ED.7060703@redhat.com>

On 2011-12-14 14:43, Avi Kivity wrote:
> On 12/14/2011 02:25 PM, Marcelo Tosatti wrote:
>> On Mon, Dec 05, 2011 at 04:48:16PM -0600, Nate Custer wrote:
>>> Hello,
>>>
>>> I am struggling with repeatable full hardware locks when running 8-12 KVM VMs. At some point before the hard lock I get an inconsistent lock state warning. An example of this can be found here:
>>>
>>> http://pastebin.com/8wKhgE2C
>>>
>>> After that the server continues to run for a while and then starts its death spiral. When it reaches that point it fails to log anything further to the disk, but by attaching a console I have been able to get a stack trace documenting the final implosion:
>>>
>>> http://pastebin.com/PbcN76bd
>>>
>>> All of the cores end up hung and the server stops responding to all input, including SysRq commands.
>>>
>>> I have seen this behavior on two machines (dual E5606 running Fedora 16); both passed cpuburnin testing and memtest86 scans without error.
>>>
>>> I have reproduced the crash and stack traces with a Fedora debugging kernel (3.1.2-1) and with a vanilla 3.1.4 kernel.
>>
>> Busted hardware, apparently. Can you reproduce these issues with the
>> same workload on different hardware?
>
> I don't think it's hardware related. The second trace (in the first
> paste) is called during swap, so GFP_FS is set. The first one is not,
> so GFP_FS is clear. Lockdep is worried about the following scenario:
>
> acpi_early_init() is called
>   calls pcpu_alloc(), which takes pcpu_alloc_mutex
>   eventually, calls kmalloc(), or some other allocation function
>     no memory, so swap
>       call try_to_free_pages()
>         submit_bio()
>           blk_throtl_bio()
>             blkio_alloc_blkg_stats()
>               alloc_percpu()
>                 pcpu_alloc(), which takes pcpu_alloc_mutex
>                   deadlock
>
> It's a little unlikely that acpi_early_init() will OOM, but lockdep
> doesn't know that. Other callers of pcpu_alloc() could trigger the same
> thing.
>
> When lockdep says
>
> [ 5839.924953] other info that might help us debug this:
> [ 5839.925396] Possible unsafe locking scenario:
> [ 5839.925397]
> [ 5839.925840] CPU0
> [ 5839.926063] ----
> [ 5839.926287] lock(pcpu_alloc_mutex);
> [ 5839.926533] <Interrupt>
> [ 5839.926756] lock(pcpu_alloc_mutex);
> [ 5839.926986]
>
> It really means
>
> <swap, set GFP_FS>
>
> GFP_FS simply marks the beginning of a nested, unrelated context that
> uses the same thread, just like an interrupt. Kudos to lockdep for
> catching that.
>
> I think the allocation in blkio_alloc_blkg_stats() should be moved out
> of the I/O path into some init function. Copying Jens.

That's completely buggy. Basically, you end up with a GFP_KERNEL
allocation from the IO submit path. Vivek, per_cpu data needs to be set
up at init time. You can't allocate it dynamically off the IO path.
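
To make the re-entry concrete, here is a rough userspace sketch of the pattern
discussed above. It is not kernel code: the mutex is modeled as a plain flag,
and every name in it (model_pcpu_alloc, submit_io_lazy, submit_io_prealloc,
struct blkg_stats_model) is a hypothetical stand-in for illustration only. It
assumes an allocator that holds a non-recursive lock and may trigger reclaim,
and reclaim that ends up in the IO submit path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-group stats discussed in the thread. */
struct blkg_stats_model { long bios; };

static bool alloc_mutex_held;   /* models a non-recursive pcpu_alloc_mutex */
static bool memory_pressure;    /* models "no memory, so swap" */
static int reentries;           /* times the lock was requested while held */
static struct blkg_stats_model *stats;

/* Model allocator: takes the lock, may reclaim (which submits IO), allocates. */
static void *model_pcpu_alloc(size_t size, void (*submit_io)(void))
{
    void *p;

    if (alloc_mutex_held) {     /* a real mutex_lock() would block forever here */
        reentries++;
        return NULL;
    }
    alloc_mutex_held = true;

    if (memory_pressure) {      /* reclaim writes pages out via the IO path */
        memory_pressure = false;
        submit_io();
    }
    p = calloc(1, size);

    alloc_mutex_held = false;
    return p;
}

/* Buggy pattern: the IO path allocates its stats on first use. */
static void submit_io_lazy(void)
{
    if (!stats)
        stats = model_pcpu_alloc(sizeof(*stats), submit_io_lazy);
    if (stats)
        stats->bios++;
}

/* Fixed pattern: the IO path only touches stats that already exist. */
static void submit_io_prealloc(void)
{
    if (stats)
        stats->bios++;
}

static void reset_model(void)
{
    free(stats);
    stats = NULL;
    reentries = 0;
    memory_pressure = false;
    alloc_mutex_held = false;
}

/* Returns the number of lock re-entries seen with lazy allocation. */
int run_lazy_scenario(void)
{
    reset_model();
    memory_pressure = true;
    free(model_pcpu_alloc(64, submit_io_lazy));
    return reentries;
}

/* Returns the number of lock re-entries seen with init-time allocation. */
int run_prealloc_scenario(void)
{
    reset_model();
    stats = model_pcpu_alloc(sizeof(*stats), submit_io_prealloc); /* "init time" */
    assert(stats != NULL);
    memory_pressure = true;
    free(model_pcpu_alloc(64, submit_io_prealloc));
    return reentries;
}
```

In this model run_lazy_scenario() reports one re-entry, which is the point
where the real mutex would deadlock, and run_prealloc_scenario() reports none
because the IO path never calls back into the allocator.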
--
Jens Axboe
Thread overview: 28+ messages
2011-12-05 22:48 kvm deadlock Nate Custer
2011-12-14 12:25 ` Marcelo Tosatti
2011-12-14 13:43 ` Avi Kivity
2011-12-14 14:00 ` Marcelo Tosatti
2011-12-14 14:02 ` Avi Kivity
2011-12-14 14:06 ` Marcelo Tosatti
2011-12-14 14:17 ` Nate Custer
2011-12-14 14:20 ` Marcelo Tosatti
2011-12-14 14:28 ` Avi Kivity
2011-12-14 14:27 ` Avi Kivity
2011-12-14 16:03 ` Jens Axboe [this message]
2011-12-14 17:03 ` Vivek Goyal
2011-12-14 17:09 ` Jens Axboe
2011-12-14 17:22 ` Vivek Goyal
2011-12-14 18:16 ` Tejun Heo
2011-12-14 18:41 ` Vivek Goyal
2011-12-14 23:06 ` Vivek Goyal
2011-12-15 19:47 ` [RFT PATCH] blkio: alloc per cpu data from worker thread context( Re: kvm deadlock) Vivek Goyal
[not found] ` <E73DB38E-AFC5-445D-9E76-DE599B36A814@cpanel.net>
2011-12-16 20:29 ` Vivek Goyal
2011-12-18 21:25 ` Nate Custer
2011-12-19 13:40 ` Vivek Goyal
2011-12-19 17:27 ` Vivek Goyal
2011-12-19 17:35 ` Tejun Heo
2011-12-19 18:27 ` Vivek Goyal
2011-12-19 22:56 ` Tejun Heo
2011-12-20 14:50 ` Vivek Goyal
2011-12-20 20:45 ` Tejun Heo
2011-12-20 12:49 ` Jens Axboe