linux-block.vger.kernel.org archive mirror
* [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests
@ 2021-04-29  2:34 Ming Lei
  2021-04-29  2:34 ` [PATCH V4 1/4] block: avoid double io accounting for flush request Ming Lei
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Ming Lei @ 2021-04-29  2:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry,
	David Jeffery, Ming Lei

Hi Jens,

This patchset fixes the request UAF issue with one simple approach,
without clearing ->rqs[] in the fast path.

1) grab the request's reference before calling ->fn in blk_mq_tagset_busy_iter(),
and release it after calling ->fn, so ->fn won't be called for a
request whose queue is frozen; done in the 2nd patch

2) clear any stale request referenced in ->rqs[] before freeing the
request pool; one per-tags spinlock is added for protecting
grabbing a request reference vs. clearing ->rqs[tag], so the UAF caused by
refcount_inc_not_zero() in bt_tags_iter() is avoided; done in the 3rd patch.
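The two steps above can be sketched as a small user-space model; everything below (struct tags, rq_get_not_zero(), iter_get(), clear_rq_mapping()) is a hypothetical stand-in for the kernel's refcount_t and per-tags spinlock, modeled with C11 atomics and a pthread mutex:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct rq {
	atomic_int ref;			/* models rq->ref (refcount_t) */
};

struct tags {
	pthread_mutex_t lock;		/* models the per-tags spinlock */
	struct rq *rqs[4];		/* models tags->rqs[] */
};

/* Take a reference unless the count has already dropped to zero
 * (models refcount_inc_not_zero()). */
bool rq_get_not_zero(struct rq *rq)
{
	int old = atomic_load(&rq->ref);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&rq->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Iterator side: read the slot and grab the reference under the lock,
 * so a concurrent free cannot hand us a stale pointer. */
struct rq *iter_get(struct tags *tags, unsigned int tag)
{
	struct rq *rq;

	pthread_mutex_lock(&tags->lock);
	rq = tags->rqs[tag];
	if (rq && !rq_get_not_zero(rq))
		rq = NULL;
	pthread_mutex_unlock(&tags->lock);
	return rq;
}

/* Free side: under the same lock, clear every slot still pointing at
 * @rq before the request memory is released. */
void clear_rq_mapping(struct tags *tags, struct rq *rq)
{
	pthread_mutex_lock(&tags->lock);
	for (size_t i = 0; i < 4; i++) {
		if (tags->rqs[i] == rq)
			tags->rqs[i] = NULL;
	}
	pthread_mutex_unlock(&tags->lock);
}
```

An iterator that gets a non-NULL result from iter_get() holds a reference, so the request cannot be freed underneath it; once clear_rq_mapping() has run under the same lock, no later iterator can observe the stale pointer.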


V4:
	- remove hctx->fq->flush_rq from tags->rqs[] before freeing the hw queue;
	patch 4/4 is added, which is based on David's patch.

V3:
	- drop patches for completing requests started in the iterator ->fn,
	  because blk-mq guarantees that a valid request is passed to ->fn,
	  and it is the driver's responsibility to avoid double completion.
	  Existing drivers already avoid completing a request twice.
	- add one patch for avoiding double accounting of the flush rq

V2:
	- take Bart's suggestion to not add a blk-mq helper for completing
	  requests while they are being iterated
	- don't grab rq->ref if the iterator is over static rqs, because
	that use case requires iterating over all requests, no matter
	whether the request is initialized or not



Ming Lei (4):
  block: avoid double io accounting for flush request
  blk-mq: grab rq->refcount before calling ->fn in
    blk_mq_tagset_busy_iter
  blk-mq: clear stale request in tags->rq[] before freeing one request
    pool
  blk-mq: clearing flush request reference in tags->rqs[]

 block/blk-flush.c  |  3 +-
 block/blk-mq-tag.c | 29 +++++++++++++----
 block/blk-mq-tag.h |  3 ++
 block/blk-mq.c     | 77 +++++++++++++++++++++++++++++++++++++++-------
 block/blk-mq.h     |  1 +
 5 files changed, 94 insertions(+), 19 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH V4 1/4] block: avoid double io accounting for flush request
  2021-04-29  2:34 [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
@ 2021-04-29  2:34 ` Ming Lei
  2021-04-30  2:51   ` Bart Van Assche
  2021-04-29  2:34 ` [PATCH V4 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Ming Lei
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2021-04-29  2:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry,
	David Jeffery, Ming Lei

For a flush request, rq->end_io() may be called twice: once from
timeout handling (blk_mq_check_expired()) and once from normal
completion (__blk_mq_end_request()).

Move blk_account_io_flush() to after flush_rq->ref drops to zero, so
I/O accounting is done just once for the flush request.
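As a rough user-space model of why this ordering fixes the double accounting (flush_end_io_model() and the plain-int fields below are illustrative stand-ins, not the kernel types): the accounting hook sits after the early return taken while a reference is still held, so it runs exactly once no matter how many times end_io is invoked.

```c
struct flush_rq {
	int ref;	/* models flush_rq->ref: one count per completion path */
	int accounted;	/* number of times the accounting hook actually ran */
};

/* Models the reordered flush_end_io(): end_io may be invoked from both
 * the timeout path and normal completion, but accounting only happens
 * once the final reference is dropped. */
void flush_end_io_model(struct flush_rq *rq)
{
	if (--rq->ref > 0)
		return;		/* another end_io invocation still pending */
	rq->accounted++;	/* stands in for blk_account_io_flush() */
}
```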

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-flush.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 7942ca6ed321..1002f6c58181 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -219,8 +219,6 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 	unsigned long flags = 0;
 	struct blk_flush_queue *fq = blk_get_flush_queue(q, flush_rq->mq_ctx);
 
-	blk_account_io_flush(flush_rq);
-
 	/* release the tag's ownership to the req cloned from */
 	spin_lock_irqsave(&fq->mq_flush_lock, flags);
 
@@ -230,6 +228,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 		return;
 	}
 
+	blk_account_io_flush(flush_rq);
 	/*
 	 * Flush request has to be marked as IDLE when it is really ended
 	 * because its .end_io() is called from timeout code path too for
-- 
2.29.2



* [PATCH V4 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter
  2021-04-29  2:34 [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
  2021-04-29  2:34 ` [PATCH V4 1/4] block: avoid double io accounting for flush request Ming Lei
@ 2021-04-29  2:34 ` Ming Lei
  2021-04-30  3:06   ` Bart Van Assche
  2021-04-29  2:34 ` [PATCH V4 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool Ming Lei
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2021-04-29  2:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry,
	David Jeffery, Ming Lei

Grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter(); this
prevents the request from being re-used while ->fn is running. The
approach is the same as the one used for handling timeouts.

This fixes request UAF related to completion races or queue releasing:

- If a request's reference is grabbed before rq->q is frozen, the queue
won't finish freezing before the reference is released during iteration.

- If a request's reference is grabbed after rq->q is frozen, refcount_inc_not_zero()
will return false, and we won't iterate over this request.

However, one request UAF is still not covered: refcount_inc_not_zero() may
read a freed request; that is handled in the next patch.
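A user-space sketch of the put helper introduced here (struct rq_model and put_rq_ref_model() are illustrative stand-ins): flush requests route the drop through their end_io handler, while any other request is freed when the final reference goes away.

```c
#include <stdbool.h>

struct rq_model {
	int ref;		/* models rq->ref */
	bool is_flush;		/* models is_flush_rq() */
	int end_io_calls;	/* times the flush end_io path ran */
	bool freed;		/* whether the request was finally freed */
};

/* Sketch of blk_mq_put_rq_ref(): a flush request drops its reference
 * via end_io (which owns the final accounting), anything else is freed
 * when the last reference is dropped. */
void put_rq_ref_model(struct rq_model *rq)
{
	if (rq->is_flush)
		rq->end_io_calls++;	/* models rq->end_io(rq, 0) */
	else if (--rq->ref == 0)
		rq->freed = true;	/* models __blk_mq_free_request(rq) */
}
```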

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-tag.c | 22 ++++++++++++++++------
 block/blk-mq.c     | 14 +++++++++-----
 block/blk-mq.h     |  1 +
 3 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2a37731e8244..9329b94a9743 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -264,6 +264,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	struct blk_mq_tags *tags = iter_data->tags;
 	bool reserved = iter_data->flags & BT_TAG_ITER_RESERVED;
 	struct request *rq;
+	bool ret;
+	bool iter_static_rqs = !!(iter_data->flags & BT_TAG_ITER_STATIC_RQS);
 
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
@@ -272,16 +274,21 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assigning ->rqs[].
 	 */
-	if (iter_data->flags & BT_TAG_ITER_STATIC_RQS)
+	if (iter_static_rqs)
 		rq = tags->static_rqs[bitnr];
-	else
+	else {
 		rq = tags->rqs[bitnr];
-	if (!rq)
-		return true;
+		if (!rq || !refcount_inc_not_zero(&rq->ref))
+			return true;
+	}
 	if ((iter_data->flags & BT_TAG_ITER_STARTED) &&
 	    !blk_mq_request_started(rq))
-		return true;
-	return iter_data->fn(rq, iter_data->data, reserved);
+		ret = true;
+	else
+		ret = iter_data->fn(rq, iter_data->data, reserved);
+	if (!iter_static_rqs)
+		blk_mq_put_rq_ref(rq);
+	return ret;
 }
 
 /**
@@ -348,6 +355,9 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
  *		indicates whether or not @rq is a reserved request. Return
  *		true to continue iterating tags, false to stop.
  * @priv:	Will be passed as second argument to @fn.
+ *
+ * We grab one request reference before calling @fn and release it after
+ * @fn returns.
  */
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 927189a55575..4bd6c11bd8bc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -909,6 +909,14 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
 	return false;
 }
 
+void blk_mq_put_rq_ref(struct request *rq)
+{
+	if (is_flush_rq(rq, rq->mq_hctx))
+		rq->end_io(rq, 0);
+	else if (refcount_dec_and_test(&rq->ref))
+		__blk_mq_free_request(rq);
+}
+
 static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
@@ -942,11 +950,7 @@ static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 	if (blk_mq_req_expired(rq, next))
 		blk_mq_rq_timed_out(rq, reserved);
 
-	if (is_flush_rq(rq, hctx))
-		rq->end_io(rq, 0);
-	else if (refcount_dec_and_test(&rq->ref))
-		__blk_mq_free_request(rq);
-
+	blk_mq_put_rq_ref(rq);
 	return true;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 3616453ca28c..143afe42c63a 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -47,6 +47,7 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list);
 struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx,
 					struct blk_mq_ctx *start);
+void blk_mq_put_rq_ref(struct request *rq);
 
 /*
  * Internal helpers for allocating/freeing the request map
-- 
2.29.2



* [PATCH V4 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool
  2021-04-29  2:34 [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
  2021-04-29  2:34 ` [PATCH V4 1/4] block: avoid double io accounting for flush request Ming Lei
  2021-04-29  2:34 ` [PATCH V4 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Ming Lei
@ 2021-04-29  2:34 ` Ming Lei
  2021-04-29 14:02   ` David Jeffery
  2021-04-29  2:34 ` [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2021-04-29  2:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry,
	David Jeffery, Ming Lei

refcount_inc_not_zero() in bt_tags_iter() may still read a freed
request.

Fix the issue with the following approach:

1) hold a per-tags spinlock when reading ->rqs[tag] and calling
refcount_inc_not_zero() in bt_tags_iter()

2) clear any stale request referenced via ->rqs[tag] before freeing the
request pool; the per-tags spinlock is held while clearing the stale
->rqs[tag]

So after the stale requests are cleared, bt_tags_iter() won't observe a
freed request any more; the clearing also waits for any pending request
reference to be dropped.

The idea of clearing ->rqs[] is borrowed from John Garry's previous
patch and a recent patch from David.
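The clearing pass added by this patch scans the pages backing the request pool and nulls any ->rqs[] slot pointing into them. A minimal user-space sketch of that address-range scan follows; clear_stale_rqs() is a hypothetical name, and the per-tags locking shown in the diff is elided here:

```c
#include <stdint.h>
#include <stddef.h>

/* Null out any rqs[] slot whose pointer falls inside the memory range
 * [start, start + len) that is about to be freed; this mirrors the
 * per-page address comparison done in the patch's
 * blk_mq_clear_rq_mapping(), minus the locking. */
void clear_stale_rqs(void **rqs, size_t nr, void *start, size_t len)
{
	uintptr_t s = (uintptr_t)start;
	uintptr_t e = s + len;

	for (size_t i = 0; i < nr; i++) {
		uintptr_t addr = (uintptr_t)rqs[i];

		if (addr >= s && addr < e)
			rqs[i] = NULL;	/* stale reference into the freed pool */
	}
}
```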

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-tag.c |  9 ++++++++-
 block/blk-mq-tag.h |  3 +++
 block/blk-mq.c     | 39 ++++++++++++++++++++++++++++++++++-----
 3 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 9329b94a9743..a3be267212b9 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -277,9 +277,15 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	if (iter_static_rqs)
 		rq = tags->static_rqs[bitnr];
 	else {
+		unsigned long flags;
+
+		spin_lock_irqsave(&tags->lock, flags);
 		rq = tags->rqs[bitnr];
-		if (!rq || !refcount_inc_not_zero(&rq->ref))
+		if (!rq || !refcount_inc_not_zero(&rq->ref)) {
+			spin_unlock_irqrestore(&tags->lock, flags);
 			return true;
+		}
+		spin_unlock_irqrestore(&tags->lock, flags);
 	}
 	if ((iter_data->flags & BT_TAG_ITER_STARTED) &&
 	    !blk_mq_request_started(rq))
@@ -526,6 +532,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 
 	tags->nr_tags = total_tags;
 	tags->nr_reserved_tags = reserved_tags;
+	spin_lock_init(&tags->lock);
 
 	if (blk_mq_is_sbitmap_shared(flags))
 		return tags;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 7d3e6b333a4a..f942a601b5ef 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -20,6 +20,9 @@ struct blk_mq_tags {
 	struct request **rqs;
 	struct request **static_rqs;
 	struct list_head page_list;
+
+	/* used to clear rqs[] before one request pool is freed */
+	spinlock_t lock;
 };
 
 extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4bd6c11bd8bc..abd0f7a9d052 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2291,6 +2291,38 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
 	return BLK_QC_T_NONE;
 }
 
+static size_t order_to_size(unsigned int order)
+{
+	return (size_t)PAGE_SIZE << order;
+}
+
+/* called before freeing request pool in @tags */
+static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
+		struct blk_mq_tags *tags, unsigned int hctx_idx)
+{
+	struct blk_mq_tags *drv_tags = set->tags[hctx_idx];
+	struct page *page;
+	unsigned long flags;
+
+	spin_lock_irqsave(&drv_tags->lock, flags);
+	list_for_each_entry(page, &tags->page_list, lru) {
+		unsigned long start = (unsigned long)page_address(page);
+		unsigned long end = start + order_to_size(page->private);
+		int i;
+
+		for (i = 0; i < set->queue_depth; i++) {
+			struct request *rq = drv_tags->rqs[i];
+			unsigned long rq_addr = (unsigned long)rq;
+
+			if (rq_addr >= start && rq_addr < end) {
+				WARN_ON_ONCE(refcount_read(&rq->ref) != 0);
+				cmpxchg(&drv_tags->rqs[i], rq, NULL);
+			}
+		}
+	}
+	spin_unlock_irqrestore(&drv_tags->lock, flags);
+}
+
 void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		     unsigned int hctx_idx)
 {
@@ -2309,6 +2341,8 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		}
 	}
 
+	blk_mq_clear_rq_mapping(set, tags, hctx_idx);
+
 	while (!list_empty(&tags->page_list)) {
 		page = list_first_entry(&tags->page_list, struct page, lru);
 		list_del_init(&page->lru);
@@ -2368,11 +2402,6 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 	return tags;
 }
 
-static size_t order_to_size(unsigned int order)
-{
-	return (size_t)PAGE_SIZE << order;
-}
-
 static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 			       unsigned int hctx_idx, int node)
 {
-- 
2.29.2



* [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[]
  2021-04-29  2:34 [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
                   ` (2 preceding siblings ...)
  2021-04-29  2:34 ` [PATCH V4 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool Ming Lei
@ 2021-04-29  2:34 ` Ming Lei
  2021-04-29 14:13   ` David Jeffery
  2021-04-30  3:05   ` Bart Van Assche
  2021-05-04  7:29 ` [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
  2021-05-04 10:15 ` John Garry
  5 siblings, 2 replies; 16+ messages in thread
From: Ming Lei @ 2021-04-29  2:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry,
	David Jeffery, Ming Lei

Before freeing the request queue, clear the flush request reference in
tags->rqs[] so that a potential UAF is avoided.

Based on a patch written by David Jeffery.

Cc: John Garry <john.garry@huawei.com>
Cc: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index abd0f7a9d052..e320d9798715 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2620,16 +2620,38 @@ static void blk_mq_remove_cpuhp(struct blk_mq_hw_ctx *hctx)
 					    &hctx->cpuhp_dead);
 }
 
+/*
+ * Before freeing hw queue, clearing the flush request reference in
+ * tags->rqs[] for avoiding potential UAF.
+ */
+static void blk_mq_clear_flush_rq_mapping(struct blk_mq_tags *tags,
+		unsigned int queue_depth, struct request *flush_rq)
+{
+	int i;
+	unsigned long flags;
+
+	WARN_ON_ONCE(refcount_read(&flush_rq->ref) != 0);
+
+	spin_lock_irqsave(&tags->lock, flags);
+	for (i = 0; i < queue_depth; i++)
+		cmpxchg(&tags->rqs[i], flush_rq, NULL);
+	spin_unlock_irqrestore(&tags->lock, flags);
+}
+
 /* hctx->ctxs will be freed in queue's release handler */
 static void blk_mq_exit_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
 {
+	struct request *flush_rq = hctx->fq->flush_rq;
+
 	if (blk_mq_hw_queue_mapped(hctx))
 		blk_mq_tag_idle(hctx);
 
+	blk_mq_clear_flush_rq_mapping(set->tags[hctx_idx],
+			set->queue_depth, flush_rq);
 	if (set->ops->exit_request)
-		set->ops->exit_request(set, hctx->fq->flush_rq, hctx_idx);
+		set->ops->exit_request(set, flush_rq, hctx_idx);
 
 	if (set->ops->exit_hctx)
 		set->ops->exit_hctx(hctx, hctx_idx);
-- 
2.29.2



* Re: [PATCH V4 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool
  2021-04-29  2:34 ` [PATCH V4 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool Ming Lei
@ 2021-04-29 14:02   ` David Jeffery
  0 siblings, 0 replies; 16+ messages in thread
From: David Jeffery @ 2021-04-29 14:02 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry

On Thu, Apr 29, 2021 at 10:34:57AM +0800, Ming Lei wrote:
> 
> refcount_inc_not_zero() in bt_tags_iter() may still read a freed
> request.
> 
> Fix the issue with the following approach:
> 
> 1) hold a per-tags spinlock when reading ->rqs[tag] and calling
> refcount_inc_not_zero() in bt_tags_iter()
> 
> 2) clear any stale request referenced via ->rqs[tag] before freeing the
> request pool; the per-tags spinlock is held while clearing the stale
> ->rqs[tag]
> 
> So after the stale requests are cleared, bt_tags_iter() won't observe a
> freed request any more; the clearing also waits for any pending request
> reference to be dropped.
> 
> The idea of clearing ->rqs[] is borrowed from John Garry's previous
> patch and a recent patch from David.
 
With the flush request handled in a separate patch, this looks good to me.

Reviewed-by: David Jeffery <djeffery@redhat.com>



* Re: [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[]
  2021-04-29  2:34 ` [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
@ 2021-04-29 14:13   ` David Jeffery
  2021-04-30  3:05   ` Bart Van Assche
  1 sibling, 0 replies; 16+ messages in thread
From: David Jeffery @ 2021-04-29 14:13 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry

On Thu, Apr 29, 2021 at 10:34:58AM +0800, Ming Lei wrote:
> 
> Before freeing the request queue, clear the flush request reference in
> tags->rqs[] so that a potential UAF is avoided.
> 
> Based on a patch written by David Jeffery.
> 

With this fixing the issue I saw with the previous patchset, it looks good to me.

Reviewed-by: David Jeffery <djeffery@redhat.com>



* Re: [PATCH V4 1/4] block: avoid double io accounting for flush request
  2021-04-29  2:34 ` [PATCH V4 1/4] block: avoid double io accounting for flush request Ming Lei
@ 2021-04-30  2:51   ` Bart Van Assche
  2021-04-30  3:12     ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: Bart Van Assche @ 2021-04-30  2:51 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe
  Cc: linux-block, Khazhy Kumykov, Shin'ichiro Kawasaki,
	Hannes Reinecke, John Garry, David Jeffery

On 4/28/21 7:34 PM, Ming Lei wrote:
> For a flush request, rq->end_io() may be called twice: once from
> timeout handling (blk_mq_check_expired()) and once from normal
> completion (__blk_mq_end_request()).
> 
> Move blk_account_io_flush() to after flush_rq->ref drops to zero, so
> I/O accounting is done just once for the flush request.
> 
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-flush.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index 7942ca6ed321..1002f6c58181 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -219,8 +219,6 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
>  	unsigned long flags = 0;
>  	struct blk_flush_queue *fq = blk_get_flush_queue(q, flush_rq->mq_ctx);
>  
> -	blk_account_io_flush(flush_rq);
> -
>  	/* release the tag's ownership to the req cloned from */
>  	spin_lock_irqsave(&fq->mq_flush_lock, flags);
>  
> @@ -230,6 +228,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
>  		return;
>  	}
>  
> +	blk_account_io_flush(flush_rq);
>  	/*
>  	 * Flush request has to be marked as IDLE when it is really ended
>  	 * because its .end_io() is called from timeout code path too for

How about adding the following:

Fixes: b68663186577 ("block: add iostat counters for flush requests")

Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


* Re: [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[]
  2021-04-29  2:34 ` [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
  2021-04-29 14:13   ` David Jeffery
@ 2021-04-30  3:05   ` Bart Van Assche
  1 sibling, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2021-04-30  3:05 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe
  Cc: linux-block, Khazhy Kumykov, Shin'ichiro Kawasaki,
	Hannes Reinecke, John Garry, David Jeffery

On 4/28/21 7:34 PM, Ming Lei wrote:
> Before freeing the request queue, clear the flush request reference in
> tags->rqs[] so that a potential UAF is avoided.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


* Re: [PATCH V4 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter
  2021-04-29  2:34 ` [PATCH V4 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Ming Lei
@ 2021-04-30  3:06   ` Bart Van Assche
  0 siblings, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2021-04-30  3:06 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe
  Cc: linux-block, Khazhy Kumykov, Shin'ichiro Kawasaki,
	Hannes Reinecke, John Garry, David Jeffery

On 4/28/21 7:34 PM, Ming Lei wrote:
> Grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter(); this
> prevents the request from being re-used while ->fn is running. The
> approach is the same as the one used for handling timeouts.
> 
> This fixes request UAF related to completion races or queue releasing:
> 
> - If a request's reference is grabbed before rq->q is frozen, the queue
> won't finish freezing before the reference is released during iteration.
> 
> - If a request's reference is grabbed after rq->q is frozen, refcount_inc_not_zero()
> will return false, and we won't iterate over this request.
> 
> However, one request UAF is still not covered: refcount_inc_not_zero() may
> read a freed request; that is handled in the next patch.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


* Re: [PATCH V4 1/4] block: avoid double io accounting for flush request
  2021-04-30  2:51   ` Bart Van Assche
@ 2021-04-30  3:12     ` Ming Lei
  0 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2021-04-30  3:12 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry,
	David Jeffery

On Thu, Apr 29, 2021 at 07:51:37PM -0700, Bart Van Assche wrote:
> On 4/28/21 7:34 PM, Ming Lei wrote:
> > For a flush request, rq->end_io() may be called twice: once from
> > timeout handling (blk_mq_check_expired()) and once from normal
> > completion (__blk_mq_end_request()).
> > 
> > Move blk_account_io_flush() to after flush_rq->ref drops to zero, so
> > I/O accounting is done just once for the flush request.
> > 
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/blk-flush.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/block/blk-flush.c b/block/blk-flush.c
> > index 7942ca6ed321..1002f6c58181 100644
> > --- a/block/blk-flush.c
> > +++ b/block/blk-flush.c
> > @@ -219,8 +219,6 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
> >  	unsigned long flags = 0;
> >  	struct blk_flush_queue *fq = blk_get_flush_queue(q, flush_rq->mq_ctx);
> >  
> > -	blk_account_io_flush(flush_rq);
> > -
> >  	/* release the tag's ownership to the req cloned from */
> >  	spin_lock_irqsave(&fq->mq_flush_lock, flags);
> >  
> > @@ -230,6 +228,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
> >  		return;
> >  	}
> >  
> > +	blk_account_io_flush(flush_rq);
> >  	/*
> >  	 * Flush request has to be marked as IDLE when it is really ended
> >  	 * because its .end_io() is called from timeout code path too for
> 
> How about adding the following:
> 
> Fixes: b68663186577 ("block: add iostat counters for flush requests")

Yeah, good point, thanks!

Jens, I guess you can cover that when applying; please let me know if a V5 is needed.

Thanks, 
Ming



* Re: [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-04-29  2:34 [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
                   ` (3 preceding siblings ...)
  2021-04-29  2:34 ` [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
@ 2021-05-04  7:29 ` Ming Lei
  2021-05-04 10:15 ` John Garry
  5 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2021-05-04  7:29 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, John Garry,
	David Jeffery

On Thu, Apr 29, 2021 at 10:34:54AM +0800, Ming Lei wrote:
> Hi Jens,
> 
> This patchset fixes the request UAF issue with one simple approach,
> without clearing ->rqs[] in the fast path.
> 
> 1) grab the request's reference before calling ->fn in blk_mq_tagset_busy_iter(),
> and release it after calling ->fn, so ->fn won't be called for a
> request whose queue is frozen; done in the 2nd patch
> 
> 2) clear any stale request referenced in ->rqs[] before freeing the
> request pool; one per-tags spinlock is added for protecting
> grabbing a request reference vs. clearing ->rqs[tag], so the UAF caused by
> refcount_inc_not_zero() in bt_tags_iter() is avoided; done in the 3rd patch.
> 
> 
> V4:
> 	- remove hctx->fq->flush_rq from tags->rqs[] before freeing the hw queue;
> 	patch 4/4 is added, which is based on David's patch.

Hello Jens,

Any chance to merge this patchset into 5.13, if you are fine with it?


Thanks,
Ming



* Re: [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-04-29  2:34 [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
                   ` (4 preceding siblings ...)
  2021-05-04  7:29 ` [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
@ 2021-05-04 10:15 ` John Garry
  2021-05-04 11:43   ` Ming Lei
  5 siblings, 1 reply; 16+ messages in thread
From: John Garry @ 2021-05-04 10:15 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe
  Cc: linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, David Jeffery

On 29/04/2021 03:34, Ming Lei wrote:
> Hi Jens,
> 
> This patchset fixes the request UAF issue with one simple approach,
> without clearing ->rqs[] in the fast path.
> 
> 1) grab the request's reference before calling ->fn in blk_mq_tagset_busy_iter(),
> and release it after calling ->fn, so ->fn won't be called for a
> request whose queue is frozen; done in the 2nd patch
> 
> 2) clear any stale request referenced in ->rqs[] before freeing the
> request pool; one per-tags spinlock is added for protecting
> grabbing a request reference vs. clearing ->rqs[tag], so the UAF caused by
> refcount_inc_not_zero() in bt_tags_iter() is avoided; done in the 3rd patch.
> 

I had a go at testing this. Without any modifications for testing, it
looks OK.

However, I also tested by adding an artificial delay in bt_iter();
otherwise it may not be realistic to trigger some UAF issues in a sane
timeframe.

So I made this change:

--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -215,8 +215,11 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assigning ->rqs[].
 	 */
-	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
-		return iter_data->fn(hctx, rq, iter_data->data, reserved);
+	if (rq) {
+		mdelay(50);
+		if (rq->q == hctx->queue && rq->mq_hctx == hctx)
+			return iter_data->fn(hctx, rq, iter_data->data, reserved);
+	}
 	return true;
 }


And I could trigger this pretty quickly:

[  129.318623] ==================================================================
[  129.325870] BUG: KASAN: use-after-free in bt_iter+0xa0/0x120
[  129.331548] Read of size 8 at addr ffff0010cc62a200 by task fio/1610

[  129.339384] CPU: 33 PID: 1610 Comm: fio Not tainted 5.12.0-rc4-47861-g2edb78df2141 #1140
[  129.339395] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[  129.339403] Call trace:
[  129.339408]  dump_backtrace+0x0/0x2d0
[  129.339430]  show_stack+0x18/0x68
[  129.339441]  dump_stack+0x100/0x16c
[  129.339455]  print_address_description.constprop.13+0x68/0x30c
[  129.339468]  kasan_report+0x1d8/0x240
[  129.339479]  __asan_load8+0x9c/0xd8
[  129.339488]  bt_iter+0xa0/0x120
[  129.339496]  blk_mq_queue_tag_busy_iter+0x2d8/0x540
[  129.339506]  blk_mq_in_flight+0x80/0xb8
[  129.339516]  part_stat_show+0xcc/0x210
[  129.339528]  dev_attr_show+0x44/0x90
[  129.339542]  sysfs_kf_seq_show+0x128/0x1c8
[  129.339554]  kernfs_seq_show+0xa0/0xb8
[  129.339563]  seq_read_iter+0x210/0x660
[  129.339575]  kernfs_fop_read_iter+0x208/0x2b0
[  129.339585]  new_sync_read+0x1ec/0x2d0
[  129.339596]  vfs_read+0x188/0x248
[  129.339605]  ksys_read+0xc8/0x178
[  129.339615]  __arm64_sys_read+0x44/0x58
[  129.339625]  el0_svc_common.constprop.1+0xc4/0x190
[  129.339639]  do_el0_svc+0x90/0xa0
[  129.339648]  el0_svc+0x24/0x38
[  129.339659]  el0_sync_handler+0x90/0xb8
[  129.339670]  el0_sync+0x154/0x180

[  129.341162] Allocated by task 1613:
[  129.344644]  stack_trace_save+0x94/0xc8
[  129.344662]  kasan_save_stack+0x28/0x58
[  129.344670]  __kasan_slab_alloc+0x88/0xa8
[  129.344679]  slab_post_alloc_hook+0x58/0x2a0
[  129.344691]  kmem_cache_alloc+0x19c/0x2c0
[  129.344699]  io_submit_one+0x138/0x1580
[  129.344710]  __arm64_sys_io_submit+0x120/0x310
[  129.344721]  el0_svc_common.constprop.1+0xc4/0x190
[  129.344731]  do_el0_svc+0x90/0xa0
[  129.344741]  el0_svc+0x24/0x38
[  129.344750]  el0_sync_handler+0x90/0xb8
[  129.344760]  el0_sync+0x154/0x180

[  129.346250] Freed by task 496:
[  129.349297]  stack_trace_save+0x94/0xc8
[  129.349307]  kasan_save_stack+0x28/0x58
[  129.349315]  kasan_set_track+0x28/0x40
[  129.349323]  kasan_set_free_info+0x28/0x50
[  129.349332]  __kasan_slab_free+0xd0/0x130
[  129.349340]  slab_free_freelist_hook+0x70/0x178
[  129.349352]  kmem_cache_free+0x94/0x400
[  129.349359]  aio_complete_rw+0x4f8/0x7f8
[  129.349369]  blkdev_bio_end_io+0xe0/0x1e8
[  129.349380]  bio_endio+0x1d4/0x1f0
[  129.349392]  blk_update_request+0x388/0x6b8
[  129.349401]  scsi_end_request+0x54/0x320
[  129.349411]  scsi_io_completion+0xec/0x700
[  129.349420]  scsi_finish_command+0x168/0x1e8
[  129.349432]  scsi_softirq_done+0x140/0x1b8
[  129.349441]  blk_mq_complete_request+0x4c/0x60
[  129.349449]  scsi_mq_done+0x68/0x88
[  129.349457]  sas_scsi_task_done+0xe0/0x118
[  129.349470]  slot_complete_v2_hw+0x23c/0x620
[  129.349483]  cq_thread_v2_hw+0x10c/0x388
[  129.349493]  irq_thread_fn+0x48/0xd8
[  129.349503]  irq_thread+0x1d4/0x2f8
[  129.349511]  kthread+0x224/0x230
[  129.349521]  ret_from_fork+0x10/0x30

[  129.351012] The buggy address belongs to the object at ffff0010cc62a200
  which belongs to the cache aio_kiocb of size 176
[  129.363351] The buggy address is located 0 bytes inside of
  176-byte region [ffff0010cc62a200, ffff0010cc62a2b0)
[  129.374909] The buggy address belongs to the page:
[  129.379693] page:0000000026f8d4c3 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10cc62a
[  129.379704] head:0000000026f8d4c3 order:1 compound_mapcount:0
[  129.379710] flags: 0xbfffc0000010200(slab|head)
[  129.379727] raw: 0bfffc0000010200 dead000000000100 dead000000000122 ffff04100630f780
[  129.379736] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[  129.379742] page dumped because: kasan: bad access detected

[  129.381229] Memory state around the buggy address:
[  129.386012]  ffff0010cc62a100: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  129.393228]  ffff0010cc62a180: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
[  129.400443] >ffff0010cc62a200: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  129.407657]                    ^
[  129.410878]  ffff0010cc62a280: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
[  129.418092]  ffff0010cc62a300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  129.425306] 
==================================================================
[  129.432520] Disabling lock debugging due to kernel taint
root@ubuntu:/home/john#

My script simply kicks off fio running against the disk on which / is 
mounted, and then continually changes the IO scheduler from 
none->mq-deadline->none for that same disk.

I have not had a chance to try to identify the issue.

I would also have added a delay in bt_tags_iter() for testing, but 
since I already hit the UAF above, I haven't tried that yet.

Thanks,
John


* Re: [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-05-04 10:15 ` John Garry
@ 2021-05-04 11:43   ` Ming Lei
  2021-05-05 11:19     ` John Garry
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2021-05-04 11:43 UTC (permalink / raw)
  To: John Garry
  Cc: Jens Axboe, linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, David Jeffery

On Tue, May 04, 2021 at 11:15:37AM +0100, John Garry wrote:
> On 29/04/2021 03:34, Ming Lei wrote:
> > Hi Jens,
> > 
> > This patchset fixes the request UAF issue by one simple approach,
> > without clearing ->rqs[] in fast path.
> > 
> > 1) grab request's ref before calling ->fn in blk_mq_tagset_busy_iter,
> > and release it after calling ->fn, so ->fn won't be called for one
> > request if its queue is frozen, done in the 2nd patch
> > 
> > 2) clearing any stale request referred in ->rqs[] before freeing the
> > request pool, one per-tags spinlock is added for protecting
> > grabbing request ref vs. clearing ->rqs[tag], so UAF by refcount_inc_not_zero
> > in bt_tags_iter() is avoided, done in 3rd patch.
> > 
> 
> I had a go at testing this. Without any modifications for testing, it looks
> ok.
> 
> However I also tested by adding an artificial delay in bt_iter() - otherwise
> it may not be realistic to trigger some UAF issues in sane timeframes.
> 
> So I made this change:
> 
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -215,8 +215,11 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned
> int bitnr, void *data)
>          * We can hit rq == NULL here, because the tagging functions
>          * test and set the bit before assigning ->rqs[].
>          */
> -       if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
> -               return iter_data->fn(hctx, rq, iter_data->data, reserved);
> +       if (rq) {
> +               mdelay(50);
> +               if (rq->q == hctx->queue && rq->mq_hctx == hctx)
> +   		  return iter_data->fn(hctx, rq, iter_data->data, reserved);
> +       }
>         return true;
>  }

Hmm, I forgot to cover bt_iter(); please test the following delta
patch:


diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index a3be267212b9..27815114ee3f 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -206,18 +206,28 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	struct blk_mq_tags *tags = hctx->tags;
 	bool reserved = iter_data->reserved;
 	struct request *rq;
+	unsigned long flags;
+	bool ret = true;
 
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
-	rq = tags->rqs[bitnr];
 
+	spin_lock_irqsave(&tags->lock, flags);
+	rq = tags->rqs[bitnr];
 	/*
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assigning ->rqs[].
 	 */
-	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
-		return iter_data->fn(hctx, rq, iter_data->data, reserved);
-	return true;
+	if (!rq || !refcount_inc_not_zero(&rq->ref)) {
+		spin_unlock_irqrestore(&tags->lock, flags);
+		return true;
+	}
+	spin_unlock_irqrestore(&tags->lock, flags);
+
+	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
+		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
+	blk_mq_put_rq_ref(rq);
+	return ret;
 }
 
 /**
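The locking pattern in the delta patch (take the per-tags lock, read the slot, and only dereference the request after refcount_inc_not_zero() succeeds, while the freeing side clears the slot under the same lock) can be modeled outside the kernel as a toy sketch. Python is used purely for illustration, and every name below (`Request`, `Tags`, `bt_iter_slot`, `free_slot`) is invented, not a kernel API:

```python
import threading

class Request:
    """Toy stand-in for struct request: the reference count starts at 1,
    and the owner drops that reference when the tag is released."""
    def __init__(self, tag):
        self.tag = tag
        self.ref = 1
        self._lock = threading.Lock()

    def inc_not_zero(self):
        # Models refcount_inc_not_zero(): succeeds only while ref > 0,
        # so a request whose last reference is gone cannot be revived.
        with self._lock:
            if self.ref == 0:
                return False
            self.ref += 1
            return True

    def put(self):
        with self._lock:
            self.ref -= 1
            return self.ref == 0

class Tags:
    """Toy tag table carrying the per-tags spinlock from the patch."""
    def __init__(self, nr_tags):
        self.lock = threading.Lock()
        self.rqs = [None] * nr_tags

def bt_iter_slot(tags, bitnr, fn):
    # Mirrors the patched bt_iter(): read the slot and take a reference
    # under tags->lock; call fn() only after inc_not_zero() succeeded.
    with tags.lock:
        rq = tags.rqs[bitnr]
        if rq is None or not rq.inc_not_zero():
            return True
    ret = fn(rq)
    rq.put()
    return ret

def free_slot(tags, bitnr):
    # Freeing side: clear the stale slot under the same lock, then drop
    # the table's reference. After this, the iterator can no longer see
    # (or resurrect) the request, so the UAF window is closed.
    with tags.lock:
        rq, tags.rqs[bitnr] = tags.rqs[bitnr], None
    if rq is not None:
        rq.put()
```

The point of the model is the ordering: clearing ->rqs[tag] and the inc-not-zero are serialized by the same lock, so an iterator either sees the slot already cleared or holds a counted reference that keeps the request alive across ->fn.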

Thanks, 
Ming



* Re: [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-05-04 11:43   ` Ming Lei
@ 2021-05-05 11:19     ` John Garry
  2021-05-05 14:28       ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: John Garry @ 2021-05-05 11:19 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, David Jeffery

On 04/05/2021 12:43, Ming Lei wrote:
>>           */
>> -       if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
>> -               return iter_data->fn(hctx, rq, iter_data->data, reserved);
>> +       if (rq) {
>> +               mdelay(50);
>> +               if (rq->q == hctx->queue && rq->mq_hctx == hctx)
>> +   		  return iter_data->fn(hctx, rq, iter_data->data, reserved);
>> +       }
>>          return true;
>>   }
> hammmm, forget to cover bt_iter(), please test the following delta
> patch:
> 
> 
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index a3be267212b9..27815114ee3f 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -206,18 +206,28 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>   	struct blk_mq_tags *tags = hctx->tags;
>   	bool reserved = iter_data->reserved;
>   	struct request *rq;
> +	unsigned long flags;
> +	bool ret = true;
>   
>   	if (!reserved)
>   		bitnr += tags->nr_reserved_tags;
> -	rq = tags->rqs[bitnr];
>   
> +	spin_lock_irqsave(&tags->lock, flags);
> +	rq = tags->rqs[bitnr];
>   	/*
>   	 * We can hit rq == NULL here, because the tagging functions
>   	 * test and set the bit before assigning ->rqs[].
>   	 */
> -	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
> -		return iter_data->fn(hctx, rq, iter_data->data, reserved);
> -	return true;
> +	if (!rq || !refcount_inc_not_zero(&rq->ref)) {
> +		spin_unlock_irqrestore(&tags->lock, flags);
> +		return true;
> +	}
> +	spin_unlock_irqrestore(&tags->lock, flags);
> +
> +	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
> +		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
> +	blk_mq_put_rq_ref(rq);
> +	return ret;
>   }


This looks to work ok from my testing.

Thanks,
John


* Re: [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests
  2021-05-05 11:19     ` John Garry
@ 2021-05-05 14:28       ` Ming Lei
  0 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2021-05-05 14:28 UTC (permalink / raw)
  To: John Garry
  Cc: Jens Axboe, linux-block, Bart Van Assche, Khazhy Kumykov,
	Shin'ichiro Kawasaki, Hannes Reinecke, David Jeffery

On Wed, May 05, 2021 at 12:19:21PM +0100, John Garry wrote:
> On 04/05/2021 12:43, Ming Lei wrote:
> > >           */
> > > -       if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
> > > -               return iter_data->fn(hctx, rq, iter_data->data, reserved);
> > > +       if (rq) {
> > > +               mdelay(50);
> > > +               if (rq->q == hctx->queue && rq->mq_hctx == hctx)
> > > +   		  return iter_data->fn(hctx, rq, iter_data->data, reserved);
> > > +       }
> > >          return true;
> > >   }
> > hammmm, forget to cover bt_iter(), please test the following delta
> > patch:
> > 
> > 
> > diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> > index a3be267212b9..27815114ee3f 100644
> > --- a/block/blk-mq-tag.c
> > +++ b/block/blk-mq-tag.c
> > @@ -206,18 +206,28 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
> >   	struct blk_mq_tags *tags = hctx->tags;
> >   	bool reserved = iter_data->reserved;
> >   	struct request *rq;
> > +	unsigned long flags;
> > +	bool ret = true;
> >   	if (!reserved)
> >   		bitnr += tags->nr_reserved_tags;
> > -	rq = tags->rqs[bitnr];
> > +	spin_lock_irqsave(&tags->lock, flags);
> > +	rq = tags->rqs[bitnr];
> >   	/*
> >   	 * We can hit rq == NULL here, because the tagging functions
> >   	 * test and set the bit before assigning ->rqs[].
> >   	 */
> > -	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
> > -		return iter_data->fn(hctx, rq, iter_data->data, reserved);
> > -	return true;
> > +	if (!rq || !refcount_inc_not_zero(&rq->ref)) {
> > +		spin_unlock_irqrestore(&tags->lock, flags);
> > +		return true;
> > +	}
> > +	spin_unlock_irqrestore(&tags->lock, flags);
> > +
> > +	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
> > +		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
> > +	blk_mq_put_rq_ref(rq);
> > +	return ret;
> >   }
> 
> 
> This looks to work ok from my testing.

OK, thanks for your test, and will add your tested-by in V4.

Thanks,
Ming




Thread overview: 16+ messages
2021-04-29  2:34 [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
2021-04-29  2:34 ` [PATCH V4 1/4] block: avoid double io accounting for flush request Ming Lei
2021-04-30  2:51   ` Bart Van Assche
2021-04-30  3:12     ` Ming Lei
2021-04-29  2:34 ` [PATCH V4 2/4] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Ming Lei
2021-04-30  3:06   ` Bart Van Assche
2021-04-29  2:34 ` [PATCH V4 3/4] blk-mq: clear stale request in tags->rq[] before freeing one request pool Ming Lei
2021-04-29 14:02   ` David Jeffery
2021-04-29  2:34 ` [PATCH V4 4/4] blk-mq: clearing flush request reference in tags->rqs[] Ming Lei
2021-04-29 14:13   ` David Jeffery
2021-04-30  3:05   ` Bart Van Assche
2021-05-04  7:29 ` [PATCH V4 0/4] blk-mq: fix request UAF related with iterating over tagset requests Ming Lei
2021-05-04 10:15 ` John Garry
2021-05-04 11:43   ` Ming Lei
2021-05-05 11:19     ` John Garry
2021-05-05 14:28       ` Ming Lei
