* [PATCH] blk-mq: fix is_flush_rq
@ 2021-08-18 1:09 Ming Lei
2021-08-18 2:18 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Ming Lei @ 2021-08-18 1:09 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, John Garry,
Ming Lei, Blank-Burian, Markus, Dr.,
Yufen Yu
is_flush_rq() is called from bt_iter()/bt_tags_iter(), and runs the
following check:
hctx->fq->flush_rq == req
but the passed hctx from bt_iter()/bt_tags_iter() may be NULL because:
1) memory re-order in blk_mq_rq_ctx_init():
rq->mq_hctx = data->hctx;
...
refcount_set(&rq->ref, 1);
OR
2) tag re-use and ->rqs[] isn't updated with new request.
Fix the issue by re-writing is_flush_rq() as:
return rq->end_io == flush_end_io;
which turns out simpler to follow and immune to data race since we have
ordered WRITE rq->end_io and refcount_set(&rq->ref, 1).
Fixes: 2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in
blk_mq_tagset_busy_iter")
Cc: "Blank-Burian, Markus, Dr." <blankburian@uni-muenster.de>
Cc: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/blk-flush.c | 5 +++++
block/blk-mq.c | 2 +-
block/blk.h | 6 +-----
3 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 4912c8dbb1d8..4201728bf3a5 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -262,6 +262,11 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
}
+bool is_flush_rq(struct request *rq)
+{
+ return rq->end_io == flush_end_io;
+}
+
/**
* blk_kick_flush - consider issuing flush request
* @q: request_queue being kicked
diff --git a/block/blk-mq.c b/block/blk-mq.c
index b5237211ccb7..6d7395ad6d94 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -911,7 +911,7 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
void blk_mq_put_rq_ref(struct request *rq)
{
- if (is_flush_rq(rq, rq->mq_hctx))
+ if (is_flush_rq(rq))
rq->end_io(rq, 0);
else if (refcount_dec_and_test(&rq->ref))
__blk_mq_free_request(rq);
diff --git a/block/blk.h b/block/blk.h
index 56f33fbcde59..d4fb5b72d12b 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -44,11 +44,7 @@ static inline void __blk_get_queue(struct request_queue *q)
kobject_get(&q->kobj);
}
-static inline bool
-is_flush_rq(struct request *req, struct blk_mq_hw_ctx *hctx)
-{
- return hctx->fq->flush_rq == req;
-}
+bool is_flush_rq(struct request *req);
struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
gfp_t flags);
--
2.31.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] blk-mq: fix is_flush_rq
2021-08-18 1:09 [PATCH] blk-mq: fix is_flush_rq Ming Lei
@ 2021-08-18 2:18 ` Jens Axboe
2021-08-18 3:02 ` Bart Van Assche
2021-08-23 12:42 ` Joseph Qi
2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2021-08-18 2:18 UTC (permalink / raw)
To: Ming Lei
Cc: linux-block, Christoph Hellwig, Bart Van Assche, John Garry,
Blank-Burian, Markus, Dr.,
Yufen Yu
On 8/17/21 7:09 PM, Ming Lei wrote:
> is_flush_rq() is called from bt_iter()/bt_tags_iter(), and runs the
> following check:
>
> hctx->fq->flush_rq == req
>
> but the passed hctx from bt_iter()/bt_tags_iter() may be NULL because:
>
> 1) memory re-order in blk_mq_rq_ctx_init():
>
> rq->mq_hctx = data->hctx;
> ...
> refcount_set(&rq->ref, 1);
>
> OR
>
> 2) tag re-use and ->rqs[] isn't updated with new request.
>
> Fix the issue by re-writing is_flush_rq() as:
>
> return rq->end_io == flush_end_io;
>
> which turns out simpler to follow and immune to data race since we have
> ordered WRITE rq->end_io and refcount_set(&rq->ref, 1).
That is way better, applied thanks.
> Fixes: 2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in
> blk_mq_tagset_busy_iter")
I think your mailer was a bit too eager to split lines here. I fixed it up.
--
Jens Axboe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] blk-mq: fix is_flush_rq
2021-08-18 1:09 [PATCH] blk-mq: fix is_flush_rq Ming Lei
2021-08-18 2:18 ` Jens Axboe
@ 2021-08-18 3:02 ` Bart Van Assche
2021-08-18 3:31 ` Ming Lei
2021-08-23 12:42 ` Joseph Qi
2 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2021-08-18 3:02 UTC (permalink / raw)
To: Ming Lei, Jens Axboe
Cc: linux-block, Christoph Hellwig, John Garry, Blank-Burian, Markus,
Dr.,
Yufen Yu
On 8/17/21 6:09 PM, Ming Lei wrote:
> +bool is_flush_rq(struct request *rq)
> +{
> + return rq->end_io == flush_end_io;
> +}
My understanding is that calling is_flush_rq() is only safe if the
caller guarantees that the request refcount >= 1. How about adding a
comment that explains this?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] blk-mq: fix is_flush_rq
2021-08-18 3:02 ` Bart Van Assche
@ 2021-08-18 3:31 ` Ming Lei
0 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2021-08-18 3:31 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, John Garry,
Blank-Burian, Markus, Dr.,
Yufen Yu
On Tue, Aug 17, 2021 at 08:02:58PM -0700, Bart Van Assche wrote:
> On 8/17/21 6:09 PM, Ming Lei wrote:
> > +bool is_flush_rq(struct request *rq)
> > +{
> > + return rq->end_io == flush_end_io;
> > +}
>
> My understanding is that calling is_flush_rq() is only safe if the
> caller guarantees that the request refcount >= 1. How about adding a
> comment that explains this?
Yeah, we can add the following words, but it isn't something urgent
since is_flush_rq() is one block layer internal helper.
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 4201728bf3a5..babf6262120e 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -262,6 +262,11 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
}
+/*
+ * Caller has to grab refcount of this request, otherwise the flush request
+ * may be re-cycled, then rq->end_io is cleared and kernel panic is caused,
+ * see blk_mq_put_rq_ref().
+ */
bool is_flush_rq(struct request *rq)
{
return rq->end_io == flush_end_io;
Thanks,
Ming
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] blk-mq: fix is_flush_rq
2021-08-18 1:09 [PATCH] blk-mq: fix is_flush_rq Ming Lei
2021-08-18 2:18 ` Jens Axboe
2021-08-18 3:02 ` Bart Van Assche
@ 2021-08-23 12:42 ` Joseph Qi
2021-08-23 12:50 ` Ming Lei
2 siblings, 1 reply; 6+ messages in thread
From: Joseph Qi @ 2021-08-23 12:42 UTC (permalink / raw)
To: Ming Lei, Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, John Garry,
Blank-Burian, Markus, Dr.,
Yufen Yu
Hi Ming,
On 8/18/21 9:09 AM, Ming Lei wrote:
> is_flush_rq() is called from bt_iter()/bt_tags_iter(), and runs the
> following check:
>
> hctx->fq->flush_rq == req
>
> but the passed hctx from bt_iter()/bt_tags_iter() may be NULL because:
>
> 1) memory re-order in blk_mq_rq_ctx_init():
>
> rq->mq_hctx = data->hctx;
> ...
> refcount_set(&rq->ref, 1);
>
> OR
>
> 2) tag re-use and ->rqs[] isn't updated with new request.
>
> Fix the issue by re-writing is_flush_rq() as:
>
> return rq->end_io == flush_end_io;
>
> which turns out simpler to follow and immune to data race since we have
> ordered WRITE rq->end_io and refcount_set(&rq->ref, 1).
>
Recently we've run into a similar crash due to NULL rq->mq_hctx in
blk_mq_put_rq_ref() on ARM, and it is a normal write request.
Since memory reorder truly exists, we may also risk other uninitialized
member accessing after this commit, at least we have to be more careful
in busy_iter_fn...
So here you don't use memory barrier before refcount_set() is for
performance consideration?
Thanks,
Joseph
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] blk-mq: fix is_flush_rq
2021-08-23 12:42 ` Joseph Qi
@ 2021-08-23 12:50 ` Ming Lei
0 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2021-08-23 12:50 UTC (permalink / raw)
To: Joseph Qi
Cc: Jens Axboe, linux-block, Christoph Hellwig, Bart Van Assche,
John Garry, Blank-Burian, Markus, Dr.,
Yufen Yu
On Mon, Aug 23, 2021 at 08:42:45PM +0800, Joseph Qi wrote:
> Hi Ming,
>
> On 8/18/21 9:09 AM, Ming Lei wrote:
> > is_flush_rq() is called from bt_iter()/bt_tags_iter(), and runs the
> > following check:
> >
> > hctx->fq->flush_rq == req
> >
> > but the passed hctx from bt_iter()/bt_tags_iter() may be NULL because:
> >
> > 1) memory re-order in blk_mq_rq_ctx_init():
> >
> > rq->mq_hctx = data->hctx;
> > ...
> > refcount_set(&rq->ref, 1);
> >
> > OR
> >
> > 2) tag re-use and ->rqs[] isn't updated with new request.
> >
> > Fix the issue by re-writing is_flush_rq() as:
> >
> > return rq->end_io == flush_end_io;
> >
> > which turns out simpler to follow and immune to data race since we have
> > ordered WRITE rq->end_io and refcount_set(&rq->ref, 1).
> >
> Recently we've run into a similar crash due to NULL rq->mq_hctx in
> blk_mq_put_rq_ref() on ARM, and it is a normal write request.
> Since memory reorder truly exists, we may also risk other uninitialized
> member accessing after this commit, at least we have to be more careful
> in busy_iter_fn...
> So here you don't use memory barrier before refcount_set() is for
> performance consideration?
Yes, also it is much simpler to check ->end_io in concept.
Thanks,
Ming
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-08-23 12:50 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-18 1:09 [PATCH] blk-mq: fix is_flush_rq Ming Lei
2021-08-18 2:18 ` Jens Axboe
2021-08-18 3:02 ` Bart Van Assche
2021-08-18 3:31 ` Ming Lei
2021-08-23 12:42 ` Joseph Qi
2021-08-23 12:50 ` Ming Lei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).