All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, vgoyal@redhat.com, mpatocka@redhat.com,
	bharrosh@panasas.com, Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
Date: Tue, 28 Aug 2012 13:49:10 -0700	[thread overview]
Message-ID: <20120828204910.GG24608@dhcp-172-17-108-109.mtv.corp.google.com> (raw)
In-Reply-To: <1346175456-1572-10-git-send-email-koverstreet@google.com>

Hello,

On Tue, Aug 28, 2012 at 10:37:36AM -0700, Kent Overstreet wrote:
> @@ -324,13 +342,37 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
>  		front_pad = 0;
>  		inline_vecs = nr_iovecs;
>  	} else {
> +		/*
> +		 * generic_make_request() converts recursion to iteration; this
> +		 * means if we're running beneath it, any bios we allocate and
> +		 * submit will not be submitted (and thus freed) until after we
> +		 * return.
> +		 *
> +		 * This exposes us to a potential deadlock if we allocate
> +		 * multiple bios from the same bio_set() while running
> +		 * underneath generic_make_request(). If we were to allocate
> +		 * multiple bios (say a stacking block driver that was splitting
> +		 * bios), we would deadlock if we exhausted the mempool's
> +		 * reserve.
> +		 *
> +		 * We solve this, and guarantee forward progress, with a rescuer
> +		 * workqueue per bio_set. If we go to allocate and there are
> +		 * bios on current->bio_list, we first try the allocation
> +		 * without __GFP_WAIT; if that fails, we punt those bios we
> +		 * would be blocking to the rescuer workqueue before we retry
> +		 * with the original gfp_flags.
> +		 */
> +
> +		if (current->bio_list && !bio_list_empty(current->bio_list))
> +			gfp_mask &= ~__GFP_WAIT;
> +retry:
>  		p = mempool_alloc(bs->bio_pool, gfp_mask);
>  		front_pad = bs->front_pad;
>  		inline_vecs = BIO_INLINE_VECS;
>  	}
>  
>  	if (unlikely(!p))
> -		return NULL;
> +		goto err;
>  
>  	bio = p + front_pad;
>  	bio_init(bio);
> @@ -351,6 +393,19 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
>  
>  err_free:
>  	mempool_free(p, bs->bio_pool);
> +err:
> +	if (gfp_mask != saved_gfp) {
> +		gfp_mask = saved_gfp;
> +
> +		spin_lock(&bs->rescue_lock);
> +		bio_list_merge(&bs->rescue_list, current->bio_list);
> +		bio_list_init(current->bio_list);
> +		spin_unlock(&bs->rescue_lock);
> +
> +		queue_work(bs->rescue_workqueue, &bs->rescue_work);
> +		goto retry;
> +	}

I wonder whether it would be easier to follow if this part is in-line
where retry: is.  All that needs to be duplicated is single
mempool_alloc() call, right?

Overall, I *think* this is correct but need to think more about it to
be sure.

Thanks!

-- 
tejun

WARNING: multiple messages have this Message-ID (diff)
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	bharrosh-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org,
	Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
Subject: Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
Date: Tue, 28 Aug 2012 13:49:10 -0700	[thread overview]
Message-ID: <20120828204910.GG24608@dhcp-172-17-108-109.mtv.corp.google.com> (raw)
In-Reply-To: <1346175456-1572-10-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Hello,

On Tue, Aug 28, 2012 at 10:37:36AM -0700, Kent Overstreet wrote:
> @@ -324,13 +342,37 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
>  		front_pad = 0;
>  		inline_vecs = nr_iovecs;
>  	} else {
> +		/*
> +		 * generic_make_request() converts recursion to iteration; this
> +		 * means if we're running beneath it, any bios we allocate and
> +		 * submit will not be submitted (and thus freed) until after we
> +		 * return.
> +		 *
> +		 * This exposes us to a potential deadlock if we allocate
> +		 * multiple bios from the same bio_set() while running
> +		 * underneath generic_make_request(). If we were to allocate
> +		 * multiple bios (say a stacking block driver that was splitting
> +		 * bios), we would deadlock if we exhausted the mempool's
> +		 * reserve.
> +		 *
> +		 * We solve this, and guarantee forward progress, with a rescuer
> +		 * workqueue per bio_set. If we go to allocate and there are
> +		 * bios on current->bio_list, we first try the allocation
> +		 * without __GFP_WAIT; if that fails, we punt those bios we
> +		 * would be blocking to the rescuer workqueue before we retry
> +		 * with the original gfp_flags.
> +		 */
> +
> +		if (current->bio_list && !bio_list_empty(current->bio_list))
> +			gfp_mask &= ~__GFP_WAIT;
> +retry:
>  		p = mempool_alloc(bs->bio_pool, gfp_mask);
>  		front_pad = bs->front_pad;
>  		inline_vecs = BIO_INLINE_VECS;
>  	}
>  
>  	if (unlikely(!p))
> -		return NULL;
> +		goto err;
>  
>  	bio = p + front_pad;
>  	bio_init(bio);
> @@ -351,6 +393,19 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
>  
>  err_free:
>  	mempool_free(p, bs->bio_pool);
> +err:
> +	if (gfp_mask != saved_gfp) {
> +		gfp_mask = saved_gfp;
> +
> +		spin_lock(&bs->rescue_lock);
> +		bio_list_merge(&bs->rescue_list, current->bio_list);
> +		bio_list_init(current->bio_list);
> +		spin_unlock(&bs->rescue_lock);
> +
> +		queue_work(bs->rescue_workqueue, &bs->rescue_work);
> +		goto retry;
> +	}

I wonder whether it would be easier to follow if this part is in-line
where retry: is.  All that needs to be duplicated is single
mempool_alloc() call, right?

Overall, I *think* this is correct but need to think more about it to
be sure.

Thanks!

-- 
tejun

  reply	other threads:[~2012-08-28 20:49 UTC|newest]

Thread overview: 128+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-28 17:37 [PATCH v7 0/9] Block cleanups, deadlock fix Kent Overstreet
2012-08-28 17:37 ` [PATCH v7 1/9] block: Generalized bio pool freeing Kent Overstreet
2012-08-28 17:37   ` Kent Overstreet
2012-08-28 17:37 ` [PATCH v7 2/9] dm: Use bioset's front_pad for dm_rq_clone_bio_info Kent Overstreet
2012-08-28 17:37 ` [PATCH v7 3/9] block: Add bio_reset() Kent Overstreet
2012-08-28 17:37   ` Kent Overstreet
2012-08-28 20:31   ` Tejun Heo
2012-08-28 20:31     ` Tejun Heo
2012-08-28 22:17     ` Kent Overstreet
2012-08-28 22:53       ` Kent Overstreet
2012-08-28 22:53         ` Kent Overstreet
2012-09-01  2:23       ` Tejun Heo
2012-09-01  2:23         ` Tejun Heo
2012-09-05 20:13         ` Kent Overstreet
2012-08-28 17:37 ` [PATCH v7 4/9] pktcdvd: Switch to bio_kmalloc() Kent Overstreet
2012-08-28 20:32   ` Tejun Heo
2012-08-28 20:32     ` Tejun Heo
2012-08-28 22:24     ` Kent Overstreet
2012-09-04  9:05     ` Jiri Kosina
2012-09-04  9:05       ` Jiri Kosina
2012-09-05 19:44       ` Kent Overstreet
2012-09-05 19:44         ` Kent Overstreet
2012-08-28 17:37 ` [PATCH v7 5/9] block: Kill bi_destructor Kent Overstreet
2012-08-28 20:36   ` Tejun Heo
2012-08-28 20:36     ` Tejun Heo
2012-08-28 22:07     ` Kent Overstreet
2012-08-28 17:37 ` [PATCH v7 6/9] block: Consolidate bio_alloc_bioset(), bio_kmalloc() Kent Overstreet
2012-08-28 17:37   ` Kent Overstreet
2012-08-28 20:41   ` Tejun Heo
2012-08-28 20:41     ` Tejun Heo
2012-08-28 22:03     ` Kent Overstreet
2012-08-28 22:03       ` Kent Overstreet
2012-09-01  2:17       ` Tejun Heo
2012-08-28 17:37 ` [PATCH v7 7/9] block: Add bio_clone_bioset(), bio_clone_kmalloc() Kent Overstreet
2012-08-28 20:44   ` Tejun Heo
2012-08-28 20:44     ` Tejun Heo
2012-08-28 22:05     ` Kent Overstreet
2012-09-01  2:19       ` Tejun Heo
2012-09-01  2:19         ` Tejun Heo
2012-08-28 17:37 ` [PATCH v7 8/9] block: Reorder struct bio_set Kent Overstreet
2012-08-28 17:37 ` [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers Kent Overstreet
2012-08-28 20:49   ` Tejun Heo [this message]
2012-08-28 20:49     ` Tejun Heo
2012-08-28 22:28     ` Kent Overstreet
2012-08-28 23:01       ` Kent Overstreet
2012-08-28 23:01         ` Kent Overstreet
2012-08-29  1:31         ` Vivek Goyal
2012-08-29  1:31           ` Vivek Goyal
2012-08-29  3:25           ` Kent Overstreet
2012-08-29 12:57             ` Vivek Goyal
2012-08-29 12:57               ` Vivek Goyal
2012-08-29 14:39               ` [dm-devel] " Alasdair G Kergon
2012-08-29 14:39                 ` Alasdair G Kergon
2012-08-29 16:26                 ` Kent Overstreet
2012-08-29 16:26                   ` Kent Overstreet
2012-08-29 21:01                   ` John Stoffel
2012-08-29 21:08                     ` Kent Overstreet
2012-08-29 21:08                       ` Kent Overstreet
2012-08-28 22:06   ` Vivek Goyal
2012-08-28 22:06     ` Vivek Goyal
2012-08-28 22:23     ` Kent Overstreet
2012-08-28 22:23       ` Kent Overstreet
2012-08-29 16:24   ` Mikulas Patocka
2012-08-29 16:50     ` Kent Overstreet
2012-08-29 16:57       ` [dm-devel] " Alasdair G Kergon
2012-08-29 16:57         ` Alasdair G Kergon
2012-08-29 17:07       ` Vivek Goyal
2012-08-29 17:07         ` Vivek Goyal
2012-08-29 17:13         ` Kent Overstreet
2012-08-29 17:13           ` Kent Overstreet
2012-08-29 17:23           ` [dm-devel] " Alasdair G Kergon
2012-08-29 17:23             ` Alasdair G Kergon
2012-08-29 17:32             ` Kent Overstreet
2012-08-30 22:07           ` Vivek Goyal
2012-08-30 22:07             ` Vivek Goyal
2012-08-31  1:43             ` Kent Overstreet
2012-08-31  1:43               ` Kent Overstreet
2012-08-31  1:55               ` Kent Overstreet
2012-08-31  1:55                 ` Kent Overstreet
2012-08-31 15:01               ` Vivek Goyal
2012-09-03  1:26                 ` Kent Overstreet
2012-09-03  1:26                   ` Kent Overstreet
2012-09-03 20:41               ` Mikulas Patocka
2012-09-03 20:41                 ` Mikulas Patocka
2012-09-04  3:41                 ` Kent Overstreet
2012-09-04  3:41                   ` Kent Overstreet
2012-09-04 18:55                   ` Tejun Heo
2012-09-04 18:55                     ` Tejun Heo
2012-09-04 19:01                     ` Tejun Heo
2012-09-04 19:43                       ` Kent Overstreet
2012-09-04 19:43                         ` Kent Overstreet
2012-09-04 19:42                     ` Kent Overstreet
2012-09-04 21:03                       ` Tejun Heo
2012-09-04 21:03                         ` Tejun Heo
2012-09-04 19:26                   ` Mikulas Patocka
2012-09-04 19:26                     ` Mikulas Patocka
2012-09-04 19:39                     ` Vivek Goyal
2012-09-04 19:39                       ` Vivek Goyal
2012-09-04 19:51                     ` [PATCH] dm: Use bioset's front_pad for dm_target_io Kent Overstreet
2012-09-04 19:51                       ` Kent Overstreet
2012-09-04 21:20                       ` Tejun Heo
2012-09-04 21:20                         ` Tejun Heo
2012-09-11 19:28                       ` [PATCH 2] " Mikulas Patocka
2012-09-11 19:28                         ` Mikulas Patocka
2012-09-11 19:50                         ` Kent Overstreet
2012-09-11 19:50                           ` Kent Overstreet
2012-09-12 22:31                           ` Mikulas Patocka
2012-09-12 22:31                             ` Mikulas Patocka
2012-09-14 23:09                             ` [dm-devel] " Alasdair G Kergon
2012-09-14 23:09                               ` Alasdair G Kergon
2012-09-01  2:13             ` [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers Tejun Heo
2012-09-01  2:13               ` Tejun Heo
2012-09-03  1:34               ` [PATCH v2] " Kent Overstreet
2012-09-03  1:34                 ` Kent Overstreet
2012-09-04 15:00               ` [PATCH v7 9/9] " Vivek Goyal
2012-09-04 15:00                 ` Vivek Goyal
2012-09-03  0:49             ` Dave Chinner
2012-09-03  0:49               ` Dave Chinner
2012-09-03  1:17               ` Kent Overstreet
2012-09-03  1:17                 ` Kent Overstreet
2012-09-04 13:54               ` Vivek Goyal
2012-09-04 13:54                 ` Vivek Goyal
2012-09-04 18:26                 ` Tejun Heo
2012-09-04 18:26                   ` Tejun Heo
2012-09-05  3:57                   ` Dave Chinner
2012-09-05  3:57                     ` Dave Chinner
2012-09-05  4:37                     ` Tejun Heo
2012-09-05  4:37                       ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120828204910.GG24608@dhcp-172-17-108-109.mtv.corp.google.com \
    --to=tj@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=bharrosh@panasas.com \
    --cc=dm-devel@redhat.com \
    --cc=koverstreet@google.com \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.