From: Ming Lei <ming.lei@redhat.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Subject: Re: [PATCH] blk-mq: put driver tag when this request is completed
Date: Thu, 2 Jul 2020 19:48:48 +0800
Message-ID: <20200702114848.GE2452799@T590>
In-Reply-To: <5acf69fb-04b2-8649-1fc4-2cfe8aa8b9c7@samsung.com>
On Thu, Jul 02, 2020 at 12:19:08PM +0200, Marek Szyprowski wrote:
> On 02.07.2020 11:23, Ming Lei wrote:
> > On Thu, Jul 02, 2020 at 10:04:38AM +0200, Marek Szyprowski wrote:
> >> On 02.07.2020 03:22, Ming Lei wrote:
> >>> On Wed, Jul 01, 2020 at 04:16:32PM +0200, Marek Szyprowski wrote:
> >>>> On 01.07.2020 15:45, Ming Lei wrote:
> >>>>> On Wed, Jul 01, 2020 at 03:01:03PM +0200, Marek Szyprowski wrote:
> >>>>>> On 29.06.2020 11:47, Ming Lei wrote:
> >>>>>>> It is natural to release driver tag when this request is completed by
> >>>>>>> LLD or device since its purpose is for LLD use.
> >>>>>>>
> >>>>>>> One big benefit is that the released tag can be re-used quicker since
> >>>>>>> bio_endio() may take too long.
> >>>>>>>
> >>>>>>> Meantime we don't need to release driver tag for flush request.
> >>>>>>>
> >>>>>>> Cc: Christoph Hellwig <hch@lst.de>
> >>>>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> >>>>>> This patch landed recently in linux-next as commit 36a3df5a4574. Sadly
> >>>>>> it causes a regression on one of my test systems (ARM 32bit, Samsung
> >>>>>> Exynos5422 SoC based Odroid XU3 board with eMMC). The system boots fine
> >>>>>> and then after a few seconds every executed command hangs. There is no
> >>>>>> panic, oops, or any other message. I will provide more information as
> >>>>>> soon as I find something to share. Simply reverting it in linux-next is
> >>>>>> not possible due to dependencies.
> >>>>> Which driver exactly does this eMMC use (including the host driver)?
> >>>> dwmmc-exynos (drivers/mmc/host/dw_mmc-exynos.c)
> >>> Hi,
> >>>
> >>> I just took a quick look at the mmc code; there are only two req->tag
> >>> consumers:
> >>>
> >>> 1) cqhci_tag
> >>>     cqhci_tag
> >>>         cqhci_request
> >>>             host->cqe_ops->cqe_request
> >>>                 mmc_cqe_start_req
> >>>         cqhci_timeout
> >>>
> >>> 2) mmc_hsq_request
> >>>     mmc_hsq_request
> >>>         host->cqe_ops->cqe_request
> >>>             mmc_cqe_start_req
> >>>
> >>> mmc_cqe_start_req() is called before the request is issued to hardware,
> >>> so the request cannot complete while mmc_cqe_start_req() is using the tag.
> >>>
> >>> cqhci_timeout() may race with normal completion; however, it looks like
> >>> the following code handles the race correctly:
> >>>
> >>> 	spin_lock_irqsave(&cq_host->lock, flags);
> >>> 	timed_out = slot->mrq == mrq;
> >>>
> >>> So I still have no idea why the commit causes trouble for mmc.
> >>>
> >>> Do you know whether cqhci or mmc_hsq is used with dw_mmc-exynos?
> >>> And can you apply the following patch and see whether the warning is
> >>> triggered?
> >>>
> >>> diff --git a/drivers/mmc/host/cqhci.c b/drivers/mmc/host/cqhci.c
> >>> index 75934f3c117e..2cb49ecfbf34 100644
> >>> --- a/drivers/mmc/host/cqhci.c
> >>> +++ b/drivers/mmc/host/cqhci.c
> >>> @@ -612,6 +612,7 @@ static int cqhci_request(struct mmc_host *mmc, struct mmc_request *mrq)
> >>>  		goto out_unlock;
> >>>  	}
> >>>
> >>> +	WARN_ON_ONCE(cq_host->slot[tag].mrq);
> >>>  	cq_host->slot[tag].mrq = mrq;
> >>>  	cq_host->slot[tag].flags = 0;
> >>>
> >>> diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
> >>> index a5e05ed0fda3..11a4c1f3a970 100644
> >>> --- a/drivers/mmc/host/mmc_hsq.c
> >>> +++ b/drivers/mmc/host/mmc_hsq.c
> >>> @@ -227,6 +227,7 @@ static int mmc_hsq_request(struct mmc_host *mmc, struct mmc_request *mrq)
> >>>  		return -EBUSY;
> >>>  	}
> >>>
> >>> +	WARN_ON_ONCE(hsq->slot[tag].mrq);
> >>>  	hsq->slot[tag].mrq = mrq;
> >>>
> >>>  	/*
> >> None of the above is even compiled for my system (I'm using
> >> arm/exynos_defconfig), so this must be something else.
> > Hello Marek,
> >
> > Could you boot the system from another working disk (USB, NAND, ...),
> > run some I/O tests on this eMMC, and collect the debugfs log with the
> > following command after the hang is triggered?
> >
> > (cd /sys/kernel/debug/block/$MMC && find . -type f -exec grep -aH . {} \;)
> >
> > $MMC is the name of the mmc disk.
>
>
> I hope it helps.

It does help, :-)

Thanks for collecting the log. Now I understand the reason: the flush
request's driver tag is leaked when the request isn't completed via
blk_mq_complete_request(), for example when it is ended directly via
blk_mq_end_request().
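
The leak can be illustrated with a small userspace model. This is only a
sketch, not kernel code: tag_pool, toy_request, flush_end_io(),
end_request() and run_rounds() are hypothetical names made up for the
illustration, and NR_TAGS = 4 is an arbitrary depth. It assumes a
flush-like request whose end_io callback never returns the driver tag, and
shows that ending such requests directly drains the pool unless the tag is
put before end_io is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_TAG  (-1)	/* stands in for BLK_MQ_NO_TAG */
#define NR_TAGS 4	/* hypothetical driver tag depth */

/* Toy driver-tag pool: in_use[i] is true while tag i is allocated. */
struct tag_pool {
	bool in_use[NR_TAGS];
};

/* Hypothetical stand-in for struct request, reduced to what the model needs. */
struct toy_request {
	int tag;		/* driver tag, or NO_TAG */
	int internal_tag;	/* second tag field checked by the guard, or NO_TAG */
	struct tag_pool *pool;
	void (*end_io)(struct toy_request *rq);
};

/* Allocate the lowest free tag, or return NO_TAG if the pool is exhausted. */
static int pool_get(struct tag_pool *p)
{
	for (int i = 0; i < NR_TAGS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = true;
			return i;
		}
	}
	return NO_TAG;
}

static int pool_free_count(const struct tag_pool *p)
{
	int n = 0;

	for (int i = 0; i < NR_TAGS; i++)
		n += !p->in_use[i];
	return n;
}

/* Same guard as blk_mq_put_driver_tag() in the patch: no tag, nothing to do. */
static void put_driver_tag(struct toy_request *rq)
{
	if (rq->tag == NO_TAG || rq->internal_tag == NO_TAG)
		return;

	rq->pool->in_use[rq->tag] = false;
	rq->tag = NO_TAG;
}

/* Flush-style end_io (assumption): it does not release the driver tag. */
static void flush_end_io(struct toy_request *rq)
{
	(void)rq;
}

/* End a request directly; put_tag_first models the added put call. */
static void end_request(struct toy_request *rq, bool put_tag_first)
{
	if (rq->end_io) {
		if (put_tag_first)
			put_driver_tag(rq);
		rq->end_io(rq);
	}
}

/* Issue and end up to 'rounds' flush-like requests; return free tags left. */
static int run_rounds(int rounds, bool put_tag_first)
{
	struct tag_pool pool = { { false } };

	for (int i = 0; i < rounds; i++) {
		struct toy_request rq = {
			.tag = pool_get(&pool),
			.internal_tag = 0,
			.pool = &pool,
			.end_io = flush_end_io,
		};

		if (rq.tag == NO_TAG)
			break;	/* pool exhausted: no further request can start */
		end_request(&rq, put_tag_first);
	}
	return pool_free_count(&pool);
}
```

Under these assumptions, run_rounds(NR_TAGS, false) leaves no free tag,
which would be consistent with every later command hanging, while
run_rounds(NR_TAGS, true) leaves the pool full.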

Please try the following patch. It would have been a two-line change if
the driver tag cleanup patch had not been merged.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ebab8f1044cb..7d62e9e5972e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -532,6 +532,26 @@ void blk_mq_free_request(struct request *rq)
 }
 EXPORT_SYMBOL_GPL(blk_mq_free_request);

+static void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
+		struct request *rq)
+{
+	blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
+	rq->tag = BLK_MQ_NO_TAG;
+
+	if (rq->rq_flags & RQF_MQ_INFLIGHT) {
+		rq->rq_flags &= ~RQF_MQ_INFLIGHT;
+		atomic_dec(&hctx->nr_active);
+	}
+}
+
+static inline void blk_mq_put_driver_tag(struct request *rq)
+{
+	if (rq->tag == BLK_MQ_NO_TAG || rq->internal_tag == BLK_MQ_NO_TAG)
+		return;
+
+	__blk_mq_put_driver_tag(rq->mq_hctx, rq);
+}
+
 inline void __blk_mq_end_request(struct request *rq, blk_status_t error)
 {
 	u64 now = 0;
@@ -551,6 +571,7 @@ inline void __blk_mq_end_request(struct request *rq, blk_status_t error)

 	if (rq->end_io) {
 		rq_qos_done(rq->q, rq);
+		blk_mq_put_driver_tag(rq);
 		rq->end_io(rq, error);
 	} else {
 		blk_mq_free_request(rq);
@@ -862,26 +883,6 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
 	return cpu_online(rq->mq_ctx->cpu);
 }

-static void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
-		struct request *rq)
-{
-	blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
-	rq->tag = BLK_MQ_NO_TAG;
-
-	if (rq->rq_flags & RQF_MQ_INFLIGHT) {
-		rq->rq_flags &= ~RQF_MQ_INFLIGHT;
-		atomic_dec(&hctx->nr_active);
-	}
-}
-
-static inline void blk_mq_put_driver_tag(struct request *rq)
-{
-	if (rq->tag == BLK_MQ_NO_TAG || rq->internal_tag == BLK_MQ_NO_TAG)
-		return;
-
-	__blk_mq_put_driver_tag(rq->mq_hctx, rq);
-}
-
 bool blk_mq_complete_request_remote(struct request *rq)
 {
 	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
@@ -1185,9 +1186,10 @@ static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 	if (blk_mq_req_expired(rq, next))
 		blk_mq_rq_timed_out(rq, reserved);

-	if (is_flush_rq(rq, hctx))
+	if (is_flush_rq(rq, hctx)) {
+		blk_mq_put_driver_tag(rq);
 		rq->end_io(rq, 0);
-	else if (refcount_dec_and_test(&rq->ref))
+	} else if (refcount_dec_and_test(&rq->ref))
 		__blk_mq_free_request(rq);

 	return true;

Thanks,
Ming