All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dm: don't reuse bio for flushes
@ 2018-12-19 15:50 Jens Axboe
  2018-12-19 16:06 ` Ming Lei
  2018-12-19 16:11 ` Mike Snitzer
  0 siblings, 2 replies; 7+ messages in thread
From: Jens Axboe @ 2018-12-19 15:50 UTC (permalink / raw)
  To: linux-block; +Cc: Mike Snitzer, Ming Lei, Dennis Zhou

DM currently has a statically allocated bio that it uses to issue empty
flushes. It doesn't submit this bio, it just uses it for maintaining
state while setting up clones. Multiple users can access this bio at the
same time. This wasn't previously an issue, even if it was a bit iffy,
but with the blkg associations it can become one.

We setup the blkg association, then clone bio's and submit, then remove
the blkg assocation again. But since we can have multiple tasks doing
this at the same time, against multiple blkg's, then we can either lose
references to a blkg, or put it twice. The latter causes complaints on
the percpu ref being <= 0 when released, and can cause use-after-free as
well. Ming reports that xfstest generic/475 triggers this:

------------[ cut here ]------------
percpu ref (blkg_release) <= 0 (0) after switching to atomic
WARNING: CPU: 13 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x2c9/0x4a0

Switch to just using an on-stack bio for this, and get rid of the
embedded bio.

Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

Only other use case like this I could find is raid5-cache, which
serializes access to its embedded bio (and actually submits it, too).
That one looks fine.

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 6fe883fac471..95c6d86ab5e8 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -106,9 +106,6 @@ struct mapped_device {
 
 	struct block_device *bdev;
 
-	/* zero-length flush that will be cloned and submitted to targets */
-	struct bio flush_bio;
-
 	struct dm_stats stats;
 
 	/* for blk-mq request-based DM support */
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index dddbca63e140..f588a6a83d80 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1420,11 +1420,11 @@ static int __send_empty_flush(struct clone_info *ci)
 	struct dm_target *ti;
 
 	/*
-	 * Empty flush uses a statically initialized bio, &md->flush_bio, as
-	 * the base for cloning.  However, blkg association requires that a
-	 * bdev is associated with a gendisk, which doesn't happen until the
-	 * bdev is opened.  So, blkg association is done at issue time of the
-	 * flush rather than when the device is created in alloc_dev().
+	 * Empty flush uses a statically initialized bio, as the base for
+	 * cloning.  However, blkg association requires that a bdev is
+	 * associated with a gendisk, which doesn't happen until the bdev is
+	 * opened.  So, blkg association is done at issue time of the flush
+	 * rather than when the device is created in alloc_dev().
 	 */
 	bio_set_dev(ci->bio, ci->io->md->bdev);
 
@@ -1609,7 +1609,16 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 	init_clone_info(&ci, md, map, bio);
 
 	if (bio->bi_opf & REQ_PREFLUSH) {
-		ci.bio = &ci.io->md->flush_bio;
+		struct bio flush_bio;
+
+		/*
+		 * Use an on-stack bio for this, it's safe since we don't
+		 * need to reference it after submit. It's just used as
+		 * the basis for the clone(s).
+		 */
+		bio_init(&flush_bio, NULL, 0);
+		flush_bio.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
+		ci.bio = &flush_bio;
 		ci.sector_count = 0;
 		error = __send_empty_flush(&ci);
 		/* dec_pending submits any data associated with flush */
@@ -1665,7 +1674,16 @@ static blk_qc_t __process_bio(struct mapped_device *md,
 	init_clone_info(&ci, md, map, bio);
 
 	if (bio->bi_opf & REQ_PREFLUSH) {
-		ci.bio = &ci.io->md->flush_bio;
+		struct bio flush_bio;
+
+		/*
+		 * Use an on-stack bio for this, it's safe since we don't
+		 * need to reference it after submit. It's just used as
+		 * the basis for the clone(s).
+		 */
+		bio_init(&flush_bio, NULL, 0);
+		flush_bio.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
+		ci.bio = &flush_bio;
 		ci.sector_count = 0;
 		error = __send_empty_flush(&ci);
 		/* dec_pending submits any data associated with flush */
@@ -1949,9 +1967,6 @@ static struct mapped_device *alloc_dev(int minor)
 	if (!md->bdev)
 		goto bad;
 
-	bio_init(&md->flush_bio, NULL, 0);
-	md->flush_bio.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
-
 	dm_stats_init(&md->stats);
 
 	/* Populate the mapping, nobody knows we exist yet */

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] dm: don't reuse bio for flushes
  2018-12-19 15:50 [PATCH] dm: don't reuse bio for flushes Jens Axboe
@ 2018-12-19 16:06 ` Ming Lei
  2018-12-19 16:11   ` Jens Axboe
  2018-12-19 16:11 ` Mike Snitzer
  1 sibling, 1 reply; 7+ messages in thread
From: Ming Lei @ 2018-12-19 16:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Mike Snitzer, Dennis Zhou

On Wed, Dec 19, 2018 at 08:50:09AM -0700, Jens Axboe wrote:
> DM currently has a statically allocated bio that it uses to issue empty
> flushes. It doesn't submit this bio, it just uses it for maintaining
> state while setting up clones. Multiple users can access this bio at the
> same time. This wasn't previously an issue, even if it was a bit iffy,
> but with the blkg associations it can become one.
> 
> We setup the blkg association, then clone bio's and submit, then remove
> the blkg assocation again. But since we can have multiple tasks doing
> this at the same time, against multiple blkg's, then we can either lose
> references to a blkg, or put it twice. The latter causes complaints on
> the percpu ref being <= 0 when released, and can cause use-after-free as
> well. Ming reports that xfstest generic/475 triggers this:
> 
> ------------[ cut here ]------------
> percpu ref (blkg_release) <= 0 (0) after switching to atomic
> WARNING: CPU: 13 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x2c9/0x4a0
> 
> Switch to just using an on-stack bio for this, and get rid of the
> embedded bio.
> 
> Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> ---
> 
> Only other use case like this I could find is raid5-cache, which
> serializes access to its embedded bio (and actually submits it, too).
> That one looks fine.

With this patch, xfstests(generic/475) won't trigger the warning of
'percpu ref (blkg_release) <= 0 (0) after switching to atomic' any more.

Tested-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] dm: don't reuse bio for flushes
  2018-12-19 16:06 ` Ming Lei
@ 2018-12-19 16:11   ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2018-12-19 16:11 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, Mike Snitzer, Dennis Zhou

On 12/19/18 9:06 AM, Ming Lei wrote:
> On Wed, Dec 19, 2018 at 08:50:09AM -0700, Jens Axboe wrote:
>> DM currently has a statically allocated bio that it uses to issue empty
>> flushes. It doesn't submit this bio, it just uses it for maintaining
>> state while setting up clones. Multiple users can access this bio at the
>> same time. This wasn't previously an issue, even if it was a bit iffy,
>> but with the blkg associations it can become one.
>>
>> We setup the blkg association, then clone bio's and submit, then remove
>> the blkg assocation again. But since we can have multiple tasks doing
>> this at the same time, against multiple blkg's, then we can either lose
>> references to a blkg, or put it twice. The latter causes complaints on
>> the percpu ref being <= 0 when released, and can cause use-after-free as
>> well. Ming reports that xfstest generic/475 triggers this:
>>
>> ------------[ cut here ]------------
>> percpu ref (blkg_release) <= 0 (0) after switching to atomic
>> WARNING: CPU: 13 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x2c9/0x4a0
>>
>> Switch to just using an on-stack bio for this, and get rid of the
>> embedded bio.
>>
>> Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
>> Reported-by: Ming Lei <ming.lei@redhat.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> Only other use case like this I could find is raid5-cache, which
>> serializes access to its embedded bio (and actually submits it, too).
>> That one looks fine.
> 
> With this patch, xfstests(generic/475) won't trigger the warning of
> 'percpu ref (blkg_release) <= 0 (0) after switching to atomic' any more.
> 
> Tested-by: Ming Lei <ming.lei@redhat.com>

Great, thanks Ming. I think this was somewhat of a ticking time bomb,
so good to have it fixed even if we could have done without a bug. And
thanks for all the testing!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dm: don't reuse bio for flushes
  2018-12-19 15:50 [PATCH] dm: don't reuse bio for flushes Jens Axboe
  2018-12-19 16:06 ` Ming Lei
@ 2018-12-19 16:11 ` Mike Snitzer
  2018-12-19 16:13   ` Mike Snitzer
  1 sibling, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2018-12-19 16:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Dennis Zhou

On Wed, Dec 19 2018 at 10:50am -0500,
Jens Axboe <axboe@kernel.dk> wrote:

> DM currently has a statically allocated bio that it uses to issue empty
> flushes. It doesn't submit this bio, it just uses it for maintaining
> state while setting up clones. Multiple users can access this bio at the
> same time. This wasn't previously an issue, even if it was a bit iffy,
> but with the blkg associations it can become one.
> 
> We setup the blkg association, then clone bio's and submit, then remove
> the blkg assocation again. But since we can have multiple tasks doing
> this at the same time, against multiple blkg's, then we can either lose
> references to a blkg, or put it twice. The latter causes complaints on
> the percpu ref being <= 0 when released, and can cause use-after-free as
> well. Ming reports that xfstest generic/475 triggers this:
> 
> ------------[ cut here ]------------
> percpu ref (blkg_release) <= 0 (0) after switching to atomic
> WARNING: CPU: 13 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x2c9/0x4a0
> 
> Switch to just using an on-stack bio for this, and get rid of the
> embedded bio.
> 
> Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Thanks for sorting this one out, definitely wasn't happy with how
exposed DM was left with the recent blkg changes.  This is clearly
better.

Acked-by: Mike Snitzer <snitzer@redhat.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dm: don't reuse bio for flushes
  2018-12-19 16:11 ` Mike Snitzer
@ 2018-12-19 16:13   ` Mike Snitzer
  2018-12-19 16:14     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2018-12-19 16:13 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Dennis Zhou

On Wed, Dec 19 2018 at 11:11am -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Wed, Dec 19 2018 at 10:50am -0500,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
> > DM currently has a statically allocated bio that it uses to issue empty
> > flushes. It doesn't submit this bio, it just uses it for maintaining
> > state while setting up clones. Multiple users can access this bio at the
> > same time. This wasn't previously an issue, even if it was a bit iffy,
> > but with the blkg associations it can become one.
> > 
> > We setup the blkg association, then clone bio's and submit, then remove
> > the blkg assocation again. But since we can have multiple tasks doing
> > this at the same time, against multiple blkg's, then we can either lose
> > references to a blkg, or put it twice. The latter causes complaints on
> > the percpu ref being <= 0 when released, and can cause use-after-free as
> > well. Ming reports that xfstest generic/475 triggers this:
> > 
> > ------------[ cut here ]------------
> > percpu ref (blkg_release) <= 0 (0) after switching to atomic
> > WARNING: CPU: 13 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x2c9/0x4a0
> > 
> > Switch to just using an on-stack bio for this, and get rid of the
> > embedded bio.
> > 
> > Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
> > Reported-by: Ming Lei <ming.lei@redhat.com>
> > Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Thanks for sorting this one out, definitely wasn't happy with how
> exposed DM was left with the recent blkg changes.  This is clearly
> better.
> 
> Acked-by: Mike Snitzer <snitzer@redhat.com>

Please upgrade this to:

Reviewed-by: Mike Snitzer <snitzer@redhat.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dm: don't reuse bio for flushes
  2018-12-19 16:13   ` Mike Snitzer
@ 2018-12-19 16:14     ` Jens Axboe
  2018-12-19 19:43       ` Dennis Zhou
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2018-12-19 16:14 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: linux-block, Ming Lei, Dennis Zhou

On 12/19/18 9:13 AM, Mike Snitzer wrote:
> On Wed, Dec 19 2018 at 11:11am -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
>> On Wed, Dec 19 2018 at 10:50am -0500,
>> Jens Axboe <axboe@kernel.dk> wrote:
>>
>>> DM currently has a statically allocated bio that it uses to issue empty
>>> flushes. It doesn't submit this bio, it just uses it for maintaining
>>> state while setting up clones. Multiple users can access this bio at the
>>> same time. This wasn't previously an issue, even if it was a bit iffy,
>>> but with the blkg associations it can become one.
>>>
>>> We setup the blkg association, then clone bio's and submit, then remove
>>> the blkg assocation again. But since we can have multiple tasks doing
>>> this at the same time, against multiple blkg's, then we can either lose
>>> references to a blkg, or put it twice. The latter causes complaints on
>>> the percpu ref being <= 0 when released, and can cause use-after-free as
>>> well. Ming reports that xfstest generic/475 triggers this:
>>>
>>> ------------[ cut here ]------------
>>> percpu ref (blkg_release) <= 0 (0) after switching to atomic
>>> WARNING: CPU: 13 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x2c9/0x4a0
>>>
>>> Switch to just using an on-stack bio for this, and get rid of the
>>> embedded bio.
>>>
>>> Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
>>> Reported-by: Ming Lei <ming.lei@redhat.com>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> Thanks for sorting this one out, definitely wasn't happy with how
>> exposed DM was left with the recent blkg changes.  This is clearly
>> better.
>>
>> Acked-by: Mike Snitzer <snitzer@redhat.com>
> 
> Please upgrade this to:
> 
> Reviewed-by: Mike Snitzer <snitzer@redhat.com>

Done, thanks Mike.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dm: don't reuse bio for flushes
  2018-12-19 16:14     ` Jens Axboe
@ 2018-12-19 19:43       ` Dennis Zhou
  0 siblings, 0 replies; 7+ messages in thread
From: Dennis Zhou @ 2018-12-19 19:43 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mike Snitzer, linux-block, Ming Lei, Dennis Zhou

On Wed, Dec 19, 2018 at 09:14:25AM -0700, Jens Axboe wrote:
> On 12/19/18 9:13 AM, Mike Snitzer wrote:
> > On Wed, Dec 19 2018 at 11:11am -0500,
> > Mike Snitzer <snitzer@redhat.com> wrote:
> > 
> >> On Wed, Dec 19 2018 at 10:50am -0500,
> >> Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >>> DM currently has a statically allocated bio that it uses to issue empty
> >>> flushes. It doesn't submit this bio, it just uses it for maintaining
> >>> state while setting up clones. Multiple users can access this bio at the
> >>> same time. This wasn't previously an issue, even if it was a bit iffy,
> >>> but with the blkg associations it can become one.
> >>>
> >>> We setup the blkg association, then clone bio's and submit, then remove
> >>> the blkg assocation again. But since we can have multiple tasks doing
> >>> this at the same time, against multiple blkg's, then we can either lose
> >>> references to a blkg, or put it twice. The latter causes complaints on
> >>> the percpu ref being <= 0 when released, and can cause use-after-free as
> >>> well. Ming reports that xfstest generic/475 triggers this:
> >>>
> >>> ------------[ cut here ]------------
> >>> percpu ref (blkg_release) <= 0 (0) after switching to atomic
> >>> WARNING: CPU: 13 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x2c9/0x4a0
> >>>
> >>> Switch to just using an on-stack bio for this, and get rid of the
> >>> embedded bio.
> >>>
> >>> Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
> >>> Reported-by: Ming Lei <ming.lei@redhat.com>
> >>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> >>
> >> Thanks for sorting this one out, definitely wasn't happy with how
> >> exposed DM was left with the recent blkg changes.  This is clearly
> >> better.
> >>
> >> Acked-by: Mike Snitzer <snitzer@redhat.com>
> > 
> > Please upgrade this to:
> > 
> > Reviewed-by: Mike Snitzer <snitzer@redhat.com>
> 
> Done, thanks Mike.
> 
> -- 
> Jens Axboe
> 

Thanks all of you for helping fix this issue! Apologies for introducing
with the initial change to dm.

Thanks,
Dennis

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-12-19 19:43 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-19 15:50 [PATCH] dm: don't reuse bio for flushes Jens Axboe
2018-12-19 16:06 ` Ming Lei
2018-12-19 16:11   ` Jens Axboe
2018-12-19 16:11 ` Mike Snitzer
2018-12-19 16:13   ` Mike Snitzer
2018-12-19 16:14     ` Jens Axboe
2018-12-19 19:43       ` Dennis Zhou

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.