All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Nate Custer <nate@cpanel.net>,
	kvm@vger.kernel.org, linux-kernel <linux-kernel@vger.kernel.org>,
	Tejun Heo <tj@kernel.org>
Subject: Re: kvm deadlock
Date: Wed, 14 Dec 2011 18:09:55 +0100	[thread overview]
Message-ID: <4EE8D863.5000701@kernel.dk> (raw)
In-Reply-To: <20111214170347.GA25484@redhat.com>

On 2011-12-14 18:03, Vivek Goyal wrote:
>>> I think the allocation in blkio_alloc_blkg_stats() should be moved out
>>> of the I/O path into some init function. Copying Jens.
>>
>> That's completely buggy, basically you end up with a GFP_KERNEL
>> allocation from the IO submit path. Vivek, per_cpu data needs to be set
>> up at init time. You can't allocate it dynamically off the IO path.
> 
> Hi Jens,
> 
> I am wondering how does CFQ get away with blocking cfqq allocation in
> IO submit path. I see that blk_queue_bio() will do following.
> 
> blk_queue_bio()
>  get_request_wait()
>   get_request(..,..,GFP_NOIO)
>    blk_alloc_request()
>     elv_set_request()
>      cfq_set_request()
>       ---> Can sleep and do memory allocation in IO submit path as
>            GFP_NOIO has __GFP_WAIT.
> 
> So that means sleeping allocation from IO submit path is not necessarily
> a problem?

__GFP_WAIT isn't the problem, you can block in the IO path. You cannot,
however, recurse back into IO submission. That's why CFQ is using
GFP_NOIO, implying that waiting is OK, but submitting new IO to satisfy
the allocation is not.

> But in case of per cpu data allocation, we might be holding
> pcpu_alloc_mutex() already at the time of calling pcpu allocation
> again and that might lead to deadlock.  (As Avi mentioned). If yes,
> then it is a problem.

It is both a problem and somewhat of a mess...

> Right now allocation of root group and associated stats happens at
> queue initialization time. For non-root cgroups, group allocation and
> associated per cpu stats allocation happens dynamically when the IO is
> submitted. So in this case may be we are creating a new blkio cgroup
> and then doing IO which leads to this warning.
> 
> I am not sure how to move this allocation to init path. These stats
> are per group and groups are created dynamically as IO happens in
> them. Only init path seems to be cgroup creation time. blkg is an
> object which is contained in a parent object and at that time parent
> object is not available. It is created dynamically at the IO time
> (cfq_group, blkio_group etc).
> 
> Though it is little hackish but can we just delay the allocation of
> stats if pcpu_alloc_mutex is held. We shall have to make
> pcpu_alloc_mutex non static though. Delaying will just not capture the
> stats for some time and sooner or later we will get regular IO with
> pcpu_alloc_mutex not held and we can do per cpu allocation at that
> time. I will write a a test patch.
> 
> Or may be there is a safer version of pcpu alloc which will return
> without allocation if pcpu_alloc_mutex is already locked.

Both of these proposed solutions aren't really pretty. It would be
cleaner and have better runtime for the fast path if you could alloc all
of these when the non-root groups are setup. Why isn't it done that way
right now?

-- 
Jens Axboe


  reply	other threads:[~2011-12-14 17:10 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-05 22:48 kvm deadlock Nate Custer
2011-12-14 12:25 ` Marcelo Tosatti
2011-12-14 13:43   ` Avi Kivity
2011-12-14 14:00     ` Marcelo Tosatti
2011-12-14 14:02       ` Avi Kivity
2011-12-14 14:06         ` Marcelo Tosatti
2011-12-14 14:17           ` Nate Custer
2011-12-14 14:20             ` Marcelo Tosatti
2011-12-14 14:28             ` Avi Kivity
2011-12-14 14:27           ` Avi Kivity
2011-12-14 16:03     ` Jens Axboe
2011-12-14 17:03       ` Vivek Goyal
2011-12-14 17:09         ` Jens Axboe [this message]
2011-12-14 17:22           ` Vivek Goyal
2011-12-14 18:16             ` Tejun Heo
2011-12-14 18:41               ` Vivek Goyal
2011-12-14 23:06                 ` Vivek Goyal
2011-12-15 19:47       ` [RFT PATCH] blkio: alloc per cpu data from worker thread context( Re: kvm deadlock) Vivek Goyal
     [not found]         ` <E73DB38E-AFC5-445D-9E76-DE599B36A814@cpanel.net>
2011-12-16 20:29           ` Vivek Goyal
2011-12-18 21:25             ` Nate Custer
2011-12-19 13:40               ` Vivek Goyal
2011-12-19 17:27               ` Vivek Goyal
2011-12-19 17:35                 ` Tejun Heo
2011-12-19 18:27                   ` Vivek Goyal
2011-12-19 22:56                     ` Tejun Heo
2011-12-20 14:50                       ` Vivek Goyal
2011-12-20 20:45                         ` Tejun Heo
2011-12-20 12:49                     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4EE8D863.5000701@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=nate@cpanel.net \
    --cc=tj@kernel.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.