linux-block.vger.kernel.org archive mirror
* [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests
@ 2021-05-11 15:22 Ming Lei
  2021-05-11 15:22 ` [PATCH V7 1/4] block: avoid double io accounting for flush request Ming Lei
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Ming Lei @ 2021-05-11 15:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki, Ming Lei

Hi Jens,

This patchset fixes the request UAF issue with one simple approach,
without clearing ->rqs[] in the fast path. Please consider it for 5.13.

1) grab the request's ref before calling ->fn in blk_mq_tagset_busy_iter(),
and release it after calling ->fn, so ->fn won't be called for a
request if its queue is frozen; done in the 2nd patch

2) clear any stale request referenced in ->rqs[] before freeing the
request pool; a per-tags spinlock is added for protecting grabbing a
request ref vs. clearing ->rqs[tag], so the UAF via refcount_inc_not_zero()
in bt_tags_iter() is avoided; done in the 3rd patch

V7:
	- fix one null-ptr-deref during updating nr_hw_queues, because
	blk_mq_clear_flush_rq_mapping() may touch a non-mapped hw queue's
	tags; only patch 4/4 is modified, reported and verified by
	Shinichiro Kawasaki
	- ran blktests and saw no regression

V6:
	- hold the spin lock when reading rq via ->rqs[tag_bit]; the issue
	  was introduced in V5
	- make blk_mq_find_and_get_req() cleaner, as suggested by Bart
	- don't hold tags->lock when clearing ->rqs[], to avoid disabling
	  interrupts for too long, as suggested by Bart
	- code style change, as suggested by Christoph

V5:
	- cover bt_iter() with the same approach taken in bt_tags_iter(),
	  and pass John's invasive test
	- add tested-by/reviewed-by tags

V4:
	- remove hctx->fq->flush_rq from tags->rqs[] before freeing the hw
	  queue; patch 4/4 is added, which is based on David's patch.

V3:
	- drop the patches for completing requests started in the iterator
	  ->fn, because blk-mq guarantees that a valid request is passed to
	  ->fn, and it is the driver's responsibility to avoid double
	  completion. Existing drivers already avoid completing a request
	  twice.
	- add one patch for avoiding double accounting of the flush rq

V2:
	- take Bart's suggestion to not add a blk-mq helper for completing
	  requests while they are being iterated
	- don't grab rq->ref if the iterator is over static rqs, because
	  that use case does require iterating over all requests, whether
	  or not a request is initialized


Ming Lei (4):
  block: avoid double io accounting for flush request
  blk-mq: grab rq->refcount before calling ->fn in
    blk_mq_tagset_busy_iter
  blk-mq: clear stale request in tags->rq[] before freeing one request
    pool
  blk-mq: clearing flush request reference in tags->rqs[]

 block/blk-flush.c  |  3 +-
 block/blk-mq-tag.c | 49 ++++++++++++++++++------
 block/blk-mq-tag.h |  6 +++
 block/blk-mq.c     | 95 ++++++++++++++++++++++++++++++++++++++++------
 block/blk-mq.h     |  1 +
 5 files changed, 130 insertions(+), 24 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH V7 1/4] block: avoid double io accounting for flush request
  2021-05-11 15:22 [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
@ 2021-05-11 15:22 ` Ming Lei
  2021-05-11 15:22 ` [PATCH V7 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Ming Lei
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2021-05-11 15:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki, Ming Lei

For a flush request, rq->end_io() may be called twice: once from
timeout handling (blk_mq_check_expired()) and once from normal
completion (__blk_mq_end_request()).

Move blk_account_io_flush() to after flush_rq->ref drops to zero, so
I/O accounting is done just once for the flush request.

Fixes: b68663186577 ("block: add iostat counters for flush requests")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-flush.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 7942ca6ed321..1002f6c58181 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -219,8 +219,6 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 	unsigned long flags = 0;
 	struct blk_flush_queue *fq = blk_get_flush_queue(q, flush_rq->mq_ctx);
 
-	blk_account_io_flush(flush_rq);
-
 	/* release the tag's ownership to the req cloned from */
 	spin_lock_irqsave(&fq->mq_flush_lock, flags);
 
@@ -230,6 +228,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 		return;
 	}
 
+	blk_account_io_flush(flush_rq);
 	/*
 	 * Flush request has to be marked as IDLE when it is really ended
 	 * because its .end_io() is called from timeout code path too for
-- 
2.29.2



* [PATCH V7 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter
  2021-05-11 15:22 [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
  2021-05-11 15:22 ` [PATCH V7 1/4] block: avoid double io accounting for flush request Ming Lei
@ 2021-05-11 15:22 ` Ming Lei
  2021-05-11 15:22 ` [PATCH V7 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool Ming Lei
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2021-05-11 15:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki, Ming Lei

Grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter();
this prevents the request from being reused while ->fn is running. The
approach is the same as the one used during timeout handling.

This fixes request use-after-free (UAF) related to completion races or
queue releasing:

- If a request is referenced before rq->q is frozen, then the queue
won't be frozen before the request reference is released during
iteration.

- If a request is referenced after rq->q is frozen,
refcount_inc_not_zero() will return false, and we won't iterate over
this request.

However, one request UAF is still not covered: refcount_inc_not_zero()
may read a freed request; that is handled in the next patch.

Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-tag.c | 44 +++++++++++++++++++++++++++++++++-----------
 block/blk-mq.c     | 14 +++++++++-----
 block/blk-mq.h     |  1 +
 3 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2a37731e8244..544edf2c56a5 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -199,6 +199,16 @@ struct bt_iter_data {
 	bool reserved;
 };
 
+static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
+		unsigned int bitnr)
+{
+	struct request *rq = tags->rqs[bitnr];
+
+	if (!rq || !refcount_inc_not_zero(&rq->ref))
+		return NULL;
+	return rq;
+}
+
 static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 {
 	struct bt_iter_data *iter_data = data;
@@ -206,18 +216,22 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	struct blk_mq_tags *tags = hctx->tags;
 	bool reserved = iter_data->reserved;
 	struct request *rq;
+	bool ret = true;
 
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
-	rq = tags->rqs[bitnr];
-
 	/*
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assigning ->rqs[].
 	 */
-	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
-		return iter_data->fn(hctx, rq, iter_data->data, reserved);
-	return true;
+	rq = blk_mq_find_and_get_req(tags, bitnr);
+	if (!rq)
+		return true;
+
+	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
+		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
+	blk_mq_put_rq_ref(rq);
+	return ret;
 }
 
 /**
@@ -264,6 +278,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	struct blk_mq_tags *tags = iter_data->tags;
 	bool reserved = iter_data->flags & BT_TAG_ITER_RESERVED;
 	struct request *rq;
+	bool ret = true;
+	bool iter_static_rqs = !!(iter_data->flags & BT_TAG_ITER_STATIC_RQS);
 
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
@@ -272,16 +288,19 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assigning ->rqs[].
 	 */
-	if (iter_data->flags & BT_TAG_ITER_STATIC_RQS)
+	if (iter_static_rqs)
 		rq = tags->static_rqs[bitnr];
 	else
-		rq = tags->rqs[bitnr];
+		rq = blk_mq_find_and_get_req(tags, bitnr);
 	if (!rq)
 		return true;
-	if ((iter_data->flags & BT_TAG_ITER_STARTED) &&
-	    !blk_mq_request_started(rq))
-		return true;
-	return iter_data->fn(rq, iter_data->data, reserved);
+
+	if (!(iter_data->flags & BT_TAG_ITER_STARTED) ||
+	    blk_mq_request_started(rq))
+		ret = iter_data->fn(rq, iter_data->data, reserved);
+	if (!iter_static_rqs)
+		blk_mq_put_rq_ref(rq);
+	return ret;
 }
 
 /**
@@ -348,6 +367,9 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
  *		indicates whether or not @rq is a reserved request. Return
  *		true to continue iterating tags, false to stop.
  * @priv:	Will be passed as second argument to @fn.
+ *
+ * We grab one request reference before calling @fn and release it after
+ * @fn returns.
  */
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 466676bc2f0b..d053e80fa6d7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -909,6 +909,14 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
 	return false;
 }
 
+void blk_mq_put_rq_ref(struct request *rq)
+{
+	if (is_flush_rq(rq, rq->mq_hctx))
+		rq->end_io(rq, 0);
+	else if (refcount_dec_and_test(&rq->ref))
+		__blk_mq_free_request(rq);
+}
+
 static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
@@ -942,11 +950,7 @@ static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 	if (blk_mq_req_expired(rq, next))
 		blk_mq_rq_timed_out(rq, reserved);
 
-	if (is_flush_rq(rq, hctx))
-		rq->end_io(rq, 0);
-	else if (refcount_dec_and_test(&rq->ref))
-		__blk_mq_free_request(rq);
-
+	blk_mq_put_rq_ref(rq);
 	return true;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9ce64bc4a6c8..556368d2c5b6 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -47,6 +47,7 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list);
 struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx,
 					struct blk_mq_ctx *start);
+void blk_mq_put_rq_ref(struct request *rq);
 
 /*
  * Internal helpers for allocating/freeing the request map
-- 
2.29.2



* [PATCH V7 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool
  2021-05-11 15:22 [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
  2021-05-11 15:22 ` [PATCH V7 1/4] block: avoid double io accounting for flush request Ming Lei
  2021-05-11 15:22 ` [PATCH V7 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Ming Lei
@ 2021-05-11 15:22 ` Ming Lei
  2021-05-11 15:22 ` [PATCH V7 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2021-05-11 15:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki, Ming Lei

refcount_inc_not_zero() in bt_tags_iter() may still read a freed
request.

Fix the issue with the following approach:

1) hold the per-tags spinlock when reading ->rqs[tag] and calling
refcount_inc_not_zero() in bt_tags_iter()

2) clear any stale request referenced via ->rqs[tag] before freeing the
request pool; the per-tags spinlock is held while clearing the stale
->rqs[tag]

After the stale requests are cleared, bt_tags_iter() won't observe a
freed request any more, and the clearing also waits for any pending
request reference to be dropped.

The idea of clearing ->rqs[] is borrowed from John Garry's previous
patch and one recent patch from David.

Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-tag.c |  9 +++++++--
 block/blk-mq-tag.h |  6 ++++++
 block/blk-mq.c     | 46 +++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 544edf2c56a5..1671dae43030 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -202,10 +202,14 @@ struct bt_iter_data {
 static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
 		unsigned int bitnr)
 {
-	struct request *rq = tags->rqs[bitnr];
+	struct request *rq;
+	unsigned long flags;
 
+	spin_lock_irqsave(&tags->lock, flags);
+	rq = tags->rqs[bitnr];
 	if (!rq || !refcount_inc_not_zero(&rq->ref))
-		return NULL;
+		rq = NULL;
+	spin_unlock_irqrestore(&tags->lock, flags);
 	return rq;
 }
 
@@ -538,6 +542,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 
 	tags->nr_tags = total_tags;
 	tags->nr_reserved_tags = reserved_tags;
+	spin_lock_init(&tags->lock);
 
 	if (blk_mq_is_sbitmap_shared(flags))
 		return tags;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 7d3e6b333a4a..f887988e5ef6 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -20,6 +20,12 @@ struct blk_mq_tags {
 	struct request **rqs;
 	struct request **static_rqs;
 	struct list_head page_list;
+
+	/*
+	 * used to clear request reference in rqs[] before freeing one
+	 * request pool
+	 */
+	spinlock_t lock;
 };
 
 extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d053e80fa6d7..626b4483e006 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2306,6 +2306,45 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
 	return BLK_QC_T_NONE;
 }
 
+static size_t order_to_size(unsigned int order)
+{
+	return (size_t)PAGE_SIZE << order;
+}
+
+/* called before freeing request pool in @tags */
+static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
+		struct blk_mq_tags *tags, unsigned int hctx_idx)
+{
+	struct blk_mq_tags *drv_tags = set->tags[hctx_idx];
+	struct page *page;
+	unsigned long flags;
+
+	list_for_each_entry(page, &tags->page_list, lru) {
+		unsigned long start = (unsigned long)page_address(page);
+		unsigned long end = start + order_to_size(page->private);
+		int i;
+
+		for (i = 0; i < set->queue_depth; i++) {
+			struct request *rq = drv_tags->rqs[i];
+			unsigned long rq_addr = (unsigned long)rq;
+
+			if (rq_addr >= start && rq_addr < end) {
+				WARN_ON_ONCE(refcount_read(&rq->ref) != 0);
+				cmpxchg(&drv_tags->rqs[i], rq, NULL);
+			}
+		}
+	}
+
+	/*
+	 * Wait until all pending iteration is done.
+	 *
+	 * Request reference is cleared and it is guaranteed to be observed
+	 * after the ->lock is released.
+	 */
+	spin_lock_irqsave(&drv_tags->lock, flags);
+	spin_unlock_irqrestore(&drv_tags->lock, flags);
+}
+
 void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		     unsigned int hctx_idx)
 {
@@ -2324,6 +2363,8 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		}
 	}
 
+	blk_mq_clear_rq_mapping(set, tags, hctx_idx);
+
 	while (!list_empty(&tags->page_list)) {
 		page = list_first_entry(&tags->page_list, struct page, lru);
 		list_del_init(&page->lru);
@@ -2383,11 +2424,6 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 	return tags;
 }
 
-static size_t order_to_size(unsigned int order)
-{
-	return (size_t)PAGE_SIZE << order;
-}
-
 static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 			       unsigned int hctx_idx, int node)
 {
-- 
2.29.2



* [PATCH V7 4/4] blk-mq: clearing flush request reference in tags->rqs[]
  2021-05-11 15:22 [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
                   ` (2 preceding siblings ...)
  2021-05-11 15:22 ` [PATCH V7 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool Ming Lei
@ 2021-05-11 15:22 ` Ming Lei
  2021-05-14  0:43 ` [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
  2021-08-06  3:40 ` yukuai (C)
  5 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2021-05-11 15:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki, Ming Lei

Before freeing the hw queue, clear the flush request reference in
tags->rqs[] so that a potential UAF can be avoided.

Based on a patch written by David Jeffery.

Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 626b4483e006..41b64bbf2154 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2642,16 +2642,49 @@ static void blk_mq_remove_cpuhp(struct blk_mq_hw_ctx *hctx)
 					    &hctx->cpuhp_dead);
 }
 
+/*
+ * Before freeing hw queue, clearing the flush request reference in
+ * tags->rqs[] for avoiding potential UAF.
+ */
+static void blk_mq_clear_flush_rq_mapping(struct blk_mq_tags *tags,
+		unsigned int queue_depth, struct request *flush_rq)
+{
+	int i;
+	unsigned long flags;
+
+	/* The hw queue may not be mapped yet */
+	if (!tags)
+		return;
+
+	WARN_ON_ONCE(refcount_read(&flush_rq->ref) != 0);
+
+	for (i = 0; i < queue_depth; i++)
+		cmpxchg(&tags->rqs[i], flush_rq, NULL);
+
+	/*
+	 * Wait until all pending iteration is done.
+	 *
+	 * Request reference is cleared and it is guaranteed to be observed
+	 * after the ->lock is released.
+	 */
+	spin_lock_irqsave(&tags->lock, flags);
+	spin_unlock_irqrestore(&tags->lock, flags);
+}
+
 /* hctx->ctxs will be freed in queue's release handler */
 static void blk_mq_exit_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
 {
+	struct request *flush_rq = hctx->fq->flush_rq;
+
 	if (blk_mq_hw_queue_mapped(hctx))
 		blk_mq_tag_idle(hctx);
 
+	blk_mq_clear_flush_rq_mapping(set->tags[hctx_idx],
+			set->queue_depth, flush_rq);
 	if (set->ops->exit_request)
-		set->ops->exit_request(set, hctx->fq->flush_rq, hctx_idx);
+		set->ops->exit_request(set, flush_rq, hctx_idx);
 
 	if (set->ops->exit_hctx)
 		set->ops->exit_hctx(hctx, hctx_idx);
-- 
2.29.2



* Re: [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-05-11 15:22 [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
                   ` (3 preceding siblings ...)
  2021-05-11 15:22 ` [PATCH V7 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
@ 2021-05-14  0:43 ` Ming Lei
  2021-05-14 15:34   ` Jens Axboe
  2021-08-06  3:40 ` yukuai (C)
  5 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2021-05-14  0:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki

On Tue, May 11, 2021 at 11:22:32PM +0800, Ming Lei wrote:
> Hi Jens,
> 
> [...]

Hi Jens,

We have been working on this issue for quite a while, so is there any
chance to get these fixes merged? Either 5.13 or 5.14 is fine.


Thanks,
Ming



* Re: [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-05-14  0:43 ` [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
@ 2021-05-14 15:34   ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2021-05-14 15:34 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki

On 5/13/21 6:43 PM, Ming Lei wrote:
> On Tue, May 11, 2021 at 11:22:32PM +0800, Ming Lei wrote:
>> Hi Jens,
>>
>> This patchset fixes the request UAF issue by one simple approach,
>> without clearing ->rqs[] in fast path, please consider it for 5.13.
>>
>> 1) grab request's ref before calling ->fn in blk_mq_tagset_busy_iter,
>> and release it after calling ->fn, so ->fn won't be called for one
>> request if its queue is frozen, done in 2st patch
>>
>> 2) clearing any stale request referred in ->rqs[] before freeing the
>> request pool, one per-tags spinlock is added for protecting
>> grabbing request ref vs. clearing ->rqs[tag], so UAF by refcount_inc_not_zero
>> in bt_tags_iter() is avoided, done in 3rd patch.
>>
>> V7:
>> 	- fix one null-ptr-deref during updating nr_hw_queues, because
>> 	blk_mq_clear_flush_rq_mapping() may touch non-mapped hw queue's
>> 	tags, only patch 4/4 is modified, reported and verified by
>> 	Shinichiro Kawasaki
>> 	- run blktests and not see regression
> 
> Hi Jens,
> 
> We have been working on this issue for a bit long, so any chance to get
> the fixes merged? Either 5.13 or 5.14 is fine.

Let's get it queued up for 5.14 - we can backport to stable as needed.
I'll do that now.

-- 
Jens Axboe



* Re: [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-05-11 15:22 [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
                   ` (4 preceding siblings ...)
  2021-05-14  0:43 ` [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
@ 2021-08-06  3:40 ` yukuai (C)
  2021-08-06  4:12   ` Ming Lei
  5 siblings, 1 reply; 10+ messages in thread
From: yukuai (C) @ 2021-08-06  3:40 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe
  Cc: linux-block, Bart Van Assche, Hannes Reinecke, John Garry,
	Christoph Hellwig, David Jeffery, Shinichiro Kawasaki

On 2021/05/11 23:22, Ming Lei wrote:
> Hi Jens,
> 
> [...]

Hi, Ming

I see this patchset was applied to fix the problem while iterating over
a tagset; however, blk_mq_tag_to_rq() is still untouched, and this
interface might still return a freed request.

Any reason why this interface didn't use the same solution?
(hold tags->lock and return NULL if the ref is 0)

Thanks
Kuai


* Re: [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-08-06  3:40 ` yukuai (C)
@ 2021-08-06  4:12   ` Ming Lei
  2021-08-06  7:50     ` yukuai (C)
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2021-08-06  4:12 UTC (permalink / raw)
  To: yukuai (C)
  Cc: Jens Axboe, linux-block, Bart Van Assche, Hannes Reinecke,
	John Garry, Christoph Hellwig, David Jeffery,
	Shinichiro Kawasaki

On Fri, Aug 06, 2021 at 11:40:25AM +0800, yukuai (C) wrote:
> On 2021/05/11 23:22, Ming Lei wrote:
> > Hi Jens,
> > 
> > [...]
> 
> Hi, Ming
> 
> I see this patchset was applied to fix the problem while iterating over
> a tagset; however, blk_mq_tag_to_rq() is still untouched, and this
> interface might still return a freed request.
> 
> Any reason why this interface didn't use the same solution?
> (hold tags->lock and return NULL if the ref is 0)

It is the driver's responsibility to cover the race between normal
completion and timeout/error handling, so the driver knows whether a
tag is valid or not. In short, it is the driver's responsibility to
guarantee that a valid 'tag' is passed to blk_mq_tag_to_rq().


Thanks,
Ming



* Re: [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-08-06  4:12   ` Ming Lei
@ 2021-08-06  7:50     ` yukuai (C)
  0 siblings, 0 replies; 10+ messages in thread
From: yukuai (C) @ 2021-08-06  7:50 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Bart Van Assche, Hannes Reinecke,
	John Garry, Christoph Hellwig, David Jeffery,
	Shinichiro Kawasaki

On 2021/08/06 12:12, Ming Lei wrote:
> On Fri, Aug 06, 2021 at 11:40:25AM +0800, yukuai (C) wrote:
>> On 2021/05/11 23:22, Ming Lei wrote:
>>> Hi Jens,
>>>
>>> This patchset fixes the request UAF issue by one simple approach,
>>> without clearing ->rqs[] in fast path, please consider it for 5.13.
>>>
>>> 1) grab the request's ref before calling ->fn in blk_mq_tagset_busy_iter,
>>> and release it after calling ->fn, so ->fn won't be called for a
>>> request if its queue is frozen, done in the 2nd patch
>>>
>>> 2) clear any stale request referenced in ->rqs[] before freeing the
>>> request pool; a per-tags spinlock is added to protect grabbing a
>>> request ref vs. clearing ->rqs[tag], so a UAF via refcount_inc_not_zero
>>> in bt_tags_iter() is avoided, done in the 3rd patch.
>>>
>>> V7:
>>> 	- fix a null-ptr-deref during updating nr_hw_queues, because
>>> 	blk_mq_clear_flush_rq_mapping() may touch a non-mapped hw queue's
>>> 	tags; only patch 4/4 is modified, reported and verified by
>>> 	Shinichiro Kawasaki
>>> 	- ran blktests and saw no regression
>>>
>>> V6:
>>> 	- hold the spin lock when reading rq via ->rq[tag_bit]; the issue
>>> 	  was introduced in V5
>>> 	- make blk_mq_find_and_get_req() cleaner, as suggested by Bart
>>> 	- don't hold tags->lock when clearing ->rqs[] to avoid disabling
>>> 	  interrupts for too long, as suggested by Bart
>>> 	- code style changes, as suggested by Christoph
>>>
>>> V5:
>>> 	- cover bt_iter() with the same approach taken in bt_tags_iter(),
>>> 	  and pass John's invasive test
>>> 	- add tested-by/reviewed-by tags
>>>
>>> V4:
>>> 	- remove hctx->fq->flush_rq from tags->rqs[] before freeing a hw
>>> 	queue; patch 4/4 is added, which is based on David's patch.
>>>
>>> V3:
>>> 	- drop patches for completing requests started in the iterator ->fn,
>>> 	  because blk-mq guarantees that a valid request is passed to ->fn,
>>> 	  and it is the driver's responsibility to avoid double completion;
>>> 	  existing drivers already avoid completing an rq twice
>>> 	- add one patch to avoid double accounting of the flush rq
>>>
>>> V2:
>>> 	- take Bart's suggestion not to add a blk-mq helper for completing
>>> 	  requests while they are being iterated
>>> 	- don't grab rq->ref if the iterator is over static rqs, because
>>> 	  that use case does require iterating over all requests, whether
>>> 	  or not the request is initialized
>>>
>>>
>>> Ming Lei (4):
>>>     block: avoid double io accounting for flush request
>>>     blk-mq: grab rq->refcount before calling ->fn in
>>>       blk_mq_tagset_busy_iter
>>>     blk-mq: clear stale request in tags->rq[] before freeing one request
>>>       pool
>>>     blk-mq: clearing flush request reference in tags->rqs[]
>>>
>>>    block/blk-flush.c  |  3 +-
>>>    block/blk-mq-tag.c | 49 ++++++++++++++++++------
>>>    block/blk-mq-tag.h |  6 +++
>>>    block/blk-mq.c     | 95 ++++++++++++++++++++++++++++++++++++++++------
>>>    block/blk-mq.h     |  1 +
>>>    5 files changed, 130 insertions(+), 24 deletions(-)
>>>
>>
>> Hi, Ming
>> 
>> I see this patchset was applied to fix the problem of iterating over a
>> tagset; however, blk_mq_tag_to_rq() is still untouched, and this
>> interface might still return a freed request.
>> 
>> Any reason why this interface didn't use the same solution?
>> (hold tags->lock and return NULL if the ref is 0)
> 
> It is the driver's responsibility to cover the race between normal completion
> and timeout/error handling, so the driver knows whether a given tag is valid
> or not. In short, it is the driver's responsibility to guarantee that a valid
> 'tag' is passed to blk_mq_tag_to_rq().
> 

Thanks for the explanation.

Best regards,
Kuai


end of thread, other threads:[~2021-08-06  7:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-11 15:22 [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
2021-05-11 15:22 ` [PATCH V7 1/4] block: avoid double io accounting for flush request Ming Lei
2021-05-11 15:22 ` [PATCH V7 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Ming Lei
2021-05-11 15:22 ` [PATCH V7 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool Ming Lei
2021-05-11 15:22 ` [PATCH V7 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
2021-05-14  0:43 ` [PATCH V7 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
2021-05-14 15:34   ` Jens Axboe
2021-08-06  3:40 ` yukuai (C)
2021-08-06  4:12   ` Ming Lei
2021-08-06  7:50     ` yukuai (C)
