* [PATCH 1/5] blk-mq: Rename BLK_EH_DONE into BLK_EH_DONT_RESET_TIMER
2018-07-27 16:20 [PATCH 0/5] Improve blk-mq performance Bart Van Assche
@ 2018-07-27 16:20 ` Bart Van Assche
2018-07-27 16:20 ` [PATCH 2/5] block: Remove a superfluous #include directive Bart Van Assche
` (4 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:20 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Ming Lei,
Hannes Reinecke
If a block driver timeout handler returns BLK_EH_DONE, that means
either that the request has been completed or that the block driver
still owns the request. Since the name BLK_EH_DONE is misleading,
change it to BLK_EH_DONT_RESET_TIMER.
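The semantics being renamed here can be sketched in a small userspace model
(this is illustrative C, not kernel code; `fake_request` and
`handle_timeout_verdict` are made-up names standing in for `struct request`
and the core's reaction in `blk_mq_rq_timed_out()`):

```c
#include <assert.h>

/* The two timeout-handler verdicts after the rename proposed in this
 * patch (values mirror the enum in <linux/blkdev.h>). */
enum blk_eh_timer_return {
	BLK_EH_DONT_RESET_TIMER,	/* driver completed or owns the command */
	BLK_EH_RESET_TIMER,		/* reset timer and try again */
};

struct fake_request {
	int timer_armed;	/* 1 while a timeout is pending */
};

/* Models the core's reaction: only BLK_EH_RESET_TIMER re-arms the
 * timer; BLK_EH_DONT_RESET_TIMER means the driver has completed the
 * request or still owns it, so the timer is left alone. */
void handle_timeout_verdict(struct fake_request *rq,
			    enum blk_eh_timer_return ret)
{
	if (ret == BLK_EH_DONT_RESET_TIMER)
		return;		/* nothing more for the core to do */
	assert(ret == BLK_EH_RESET_TIMER);
	rq->timer_armed = 1;	/* stand-in for blk_add_timer() */
}
```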
Fixes: 88b0cfad2888 ("block: document the blk_eh_timer_return values")
Fixes: 6600593cbd93 ("block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
---
Documentation/scsi/scsi_eh.txt | 4 ++--
block/blk-mq.c | 2 +-
block/blk-timeout.c | 2 +-
drivers/block/mtip32xx/mtip32xx.c | 2 +-
drivers/block/nbd.c | 4 ++--
drivers/block/null_blk_main.c | 4 ++--
drivers/message/fusion/mptsas.c | 2 +-
drivers/mmc/core/queue.c | 2 +-
drivers/nvme/host/pci.c | 10 +++++-----
drivers/nvme/host/rdma.c | 2 +-
drivers/nvme/target/loop.c | 2 +-
drivers/s390/block/dasd.c | 6 +++---
drivers/scsi/gdth.c | 2 +-
drivers/scsi/libiscsi.c | 6 +++---
drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
drivers/scsi/mvumi.c | 2 +-
drivers/scsi/qla4xxx/ql4_os.c | 2 +-
drivers/scsi/scsi_error.c | 4 ++--
drivers/scsi/scsi_transport_fc.c | 4 ++--
drivers/scsi/scsi_transport_srp.c | 4 ++--
drivers/scsi/ufs/ufshcd.c | 6 +++---
include/linux/blkdev.h | 2 +-
22 files changed, 38 insertions(+), 38 deletions(-)
diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 1b7436932a2b..59e085b28b31 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -86,9 +86,9 @@ function
This indicates that more time is required to finish the
command. Timer is restarted. This action is counted as a
retry and only allowed scmd->allowed + 1(!) times. Once the
- limit is reached, action for BLK_EH_DONE is taken instead.
+ limit is reached, action for BLK_EH_DONT_RESET_TIMER is taken instead.
- - BLK_EH_DONE
+ - BLK_EH_DONT_RESET_TIMER
eh_timed_out() callback did not handle the command.
Step #2 is taken.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index c92ce06fd565..96e1a7f25875 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -774,7 +774,7 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
enum blk_eh_timer_return ret;
ret = req->q->mq_ops->timeout(req, reserved);
- if (ret == BLK_EH_DONE)
+ if (ret == BLK_EH_DONT_RESET_TIMER)
return;
WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
}
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index f2cfd56e1606..37df7f8f8516 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -90,7 +90,7 @@ static void blk_rq_timed_out(struct request *req)
blk_add_timer(req);
blk_clear_rq_complete(req);
break;
- case BLK_EH_DONE:
+ case BLK_EH_DONT_RESET_TIMER:
/*
* LLD handles this for now but in the future
* we can send a request msg to abort the command
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index c73626decb46..e28be7821d0c 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3712,7 +3712,7 @@ static enum blk_eh_timer_return mtip_cmd_timeout(struct request *req,
cmd->status = BLK_STS_TIMEOUT;
blk_mq_complete_request(req);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
if (test_bit(req->tag, dd->port->cmds_to_issue))
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 3fb95c8d9fd8..67e17965e462 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -382,7 +382,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
mutex_unlock(&cmd->lock);
nbd_requeue_cmd(cmd);
nbd_config_put(nbd);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
} else {
dev_err_ratelimited(nbd_to_dev(nbd),
@@ -395,7 +395,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
nbd_config_put(nbd);
done:
blk_mq_complete_request(req);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
/*
diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
index 86cafa6d3b41..3617c9ac251a 100644
--- a/drivers/block/null_blk_main.c
+++ b/drivers/block/null_blk_main.c
@@ -1320,7 +1320,7 @@ static enum blk_eh_timer_return null_rq_timed_out_fn(struct request *rq)
{
pr_info("null: rq %p timed out\n", rq);
__blk_complete_request(rq);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
static int null_rq_prep_fn(struct request_queue *q, struct request *req)
@@ -1383,7 +1383,7 @@ static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res)
{
pr_info("null: rq %p timed out\n", rq);
blk_mq_complete_request(rq);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 76a66da33996..b703ad68f426 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1929,7 +1929,7 @@ static enum blk_eh_timer_return mptsas_eh_timed_out(struct scsi_cmnd *sc)
MPT_SCSI_HOST *hd;
MPT_ADAPTER *ioc;
VirtDevice *vdevice;
- enum blk_eh_timer_return rc = BLK_EH_DONE;
+ enum blk_eh_timer_return rc = BLK_EH_DONT_RESET_TIMER;
hd = shost_priv(sc->device->host);
if (hd == NULL) {
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 648eb6743ed5..ea32dcd6bfcf 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -113,7 +113,7 @@ static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
}
/* No timeout (XXX: huh? comment doesn't make much sense) */
blk_mq_complete_request(req);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
default:
/* Timeout is handled by mmc core */
return BLK_EH_RESET_TIMER;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8dcae11bbf3a..efb034b0b98e 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1211,7 +1211,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
nvme_warn_reset(dev, csts);
nvme_dev_disable(dev, false);
nvme_reset_ctrl(&dev->ctrl);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
/*
@@ -1221,14 +1221,14 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
dev_warn(dev->ctrl.device,
"I/O %d QID %d timeout, completion polled\n",
req->tag, nvmeq->qid);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
/*
* Shutdown immediately if controller times out while starting. The
* reset work will see the pci device disabled when it gets the forced
* cancellation error. All outstanding requests are completed on
- * shutdown, so we return BLK_EH_DONE.
+ * shutdown, so we return BLK_EH_DONT_RESET_TIMER.
*/
switch (dev->ctrl.state) {
case NVME_CTRL_CONNECTING:
@@ -1238,7 +1238,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
req->tag, nvmeq->qid);
nvme_dev_disable(dev, false);
nvme_req(req)->flags |= NVME_REQ_CANCELLED;
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
default:
break;
}
@@ -1256,7 +1256,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
nvme_reset_ctrl(&dev->ctrl);
nvme_req(req)->flags |= NVME_REQ_CANCELLED;
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
if (atomic_dec_return(&dev->ctrl.abort_limit) < 0) {
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 0805fa6215ee..d18511427437 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1682,7 +1682,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
/* fail with DNR on cmd timeout */
nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 9908082b32c4..419107cd952a 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -148,7 +148,7 @@ nvme_loop_timeout(struct request *rq, bool reserved)
/* fail with DNR on admin cmd timeout */
nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
static blk_status_t nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index a9f60d0ee02e..d9bfebcc6aa7 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3010,7 +3010,7 @@ static blk_status_t do_dasd_request(struct blk_mq_hw_ctx *hctx,
*
* Return values:
* BLK_EH_RESET_TIMER if the request should be left running
- * BLK_EH_DONE if the request is handled or terminated
+ * BLK_EH_DONT_RESET_TIMER if the request is handled or terminated
* by the driver.
*/
enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
@@ -3023,7 +3023,7 @@ enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
cqr = blk_mq_rq_to_pdu(req);
if (!cqr)
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
spin_lock_irqsave(&cqr->dq->lock, flags);
device = cqr->startdev ? cqr->startdev : block->base;
@@ -3078,7 +3078,7 @@ enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
spin_unlock(&block->queue_lock);
spin_unlock_irqrestore(&cqr->dq->lock, flags);
- return rc ? BLK_EH_RESET_TIMER : BLK_EH_DONE;
+ return rc ? BLK_EH_RESET_TIMER : BLK_EH_DONT_RESET_TIMER;
}
static int dasd_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 85604795d8ee..fa18d3d1dc9c 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -3882,7 +3882,7 @@ static enum blk_eh_timer_return gdth_timed_out(struct scsi_cmnd *scp)
struct gdth_cmndinfo *cmndinfo = gdth_cmnd_priv(scp);
u8 b, t;
unsigned long flags;
- enum blk_eh_timer_return retval = BLK_EH_DONE;
+ enum blk_eh_timer_return retval = BLK_EH_DONT_RESET_TIMER;
TRACE(("%s() cmd 0x%x\n", scp->cmnd[0], __func__));
b = scp->device->channel;
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index d6093838f5f2..6ede7e01df30 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1963,7 +1963,7 @@ static int iscsi_has_ping_timed_out(struct iscsi_conn *conn)
enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
{
- enum blk_eh_timer_return rc = BLK_EH_DONE;
+ enum blk_eh_timer_return rc = BLK_EH_DONT_RESET_TIMER;
struct iscsi_task *task = NULL, *running_task;
struct iscsi_cls_session *cls_session;
struct iscsi_session *session;
@@ -1982,7 +1982,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
* Raced with completion. Blk layer has taken ownership
* so let timeout code complete it now.
*/
- rc = BLK_EH_DONE;
+ rc = BLK_EH_DONT_RESET_TIMER;
goto done;
}
@@ -1997,7 +1997,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
if (unlikely(system_state != SYSTEM_RUNNING)) {
sc->result = DID_NO_CONNECT << 16;
ISCSI_DBG_EH(session, "sc on shutdown, handled\n");
- rc = BLK_EH_DONE;
+ rc = BLK_EH_DONT_RESET_TIMER;
goto done;
}
/*
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 71d97573a667..bd5bc1da9be1 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -2768,7 +2768,7 @@ blk_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
if (time_after(jiffies, scmd->jiffies_at_alloc +
(scmd_timeout * 2) * HZ)) {
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
instance = (struct megasas_instance *)scmd->device->host->hostdata;
diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c
index b3cd9a6b1d30..84cdadd4bd78 100644
--- a/drivers/scsi/mvumi.c
+++ b/drivers/scsi/mvumi.c
@@ -2155,7 +2155,7 @@ static enum blk_eh_timer_return mvumi_timed_out(struct scsi_cmnd *scmd)
mvumi_return_cmd(mhba, cmd);
spin_unlock_irqrestore(mhba->shost->host_lock, flags);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
static int
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 0e13349dce57..ecabca8d815b 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -1848,7 +1848,7 @@ static enum blk_eh_timer_return qla4xxx_eh_cmd_timed_out(struct scsi_cmnd *sc)
struct iscsi_cls_session *session;
struct iscsi_session *sess;
unsigned long flags;
- enum blk_eh_timer_return ret = BLK_EH_DONE;
+ enum blk_eh_timer_return ret = BLK_EH_DONT_RESET_TIMER;
session = starget_to_session(scsi_target(sc->device));
sess = session->dd_data;
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 2715cdaa669c..b38b8e62e618 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -283,7 +283,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
enum blk_eh_timer_return scsi_times_out(struct request *req)
{
struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
- enum blk_eh_timer_return rtn = BLK_EH_DONE;
+ enum blk_eh_timer_return rtn = BLK_EH_DONT_RESET_TIMER;
struct Scsi_Host *host = scmd->device->host;
trace_scsi_dispatch_cmd_timeout(scmd);
@@ -295,7 +295,7 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
if (host->hostt->eh_timed_out)
rtn = host->hostt->eh_timed_out(scmd);
- if (rtn == BLK_EH_DONE) {
+ if (rtn == BLK_EH_DONT_RESET_TIMER) {
/*
* For blk-mq, we must set the request state to complete now
* before sending the request to the scsi error handler. This
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 13948102ca29..63d283f298f3 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -2087,7 +2087,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)
if (rport->port_state == FC_PORTSTATE_BLOCKED)
return BLK_EH_RESET_TIMER;
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
EXPORT_SYMBOL(fc_eh_timed_out);
@@ -3593,7 +3593,7 @@ fc_bsg_job_timeout(struct request *req)
/* the blk_end_sync_io() doesn't check the error */
if (inflight)
__blk_complete_request(req);
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
}
/**
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 4e46fdb2d7c9..03571512dca5 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -604,7 +604,7 @@ EXPORT_SYMBOL(srp_reconnect_rport);
*
* If a timeout occurs while an rport is in the blocked state, ask the SCSI
* EH to continue waiting (BLK_EH_RESET_TIMER). Otherwise let the SCSI core
- * handle the timeout (BLK_EH_DONE).
+ * handle the timeout (BLK_EH_DONT_RESET_TIMER).
*
* Note: This function is called from soft-IRQ context and with the request
* queue lock held.
@@ -620,7 +620,7 @@ enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
return rport && rport->fast_io_fail_tmo < 0 &&
rport->dev_loss_tmo < 0 &&
i->f->reset_timer_if_blocked && scsi_device_blocked(sdev) ?
- BLK_EH_RESET_TIMER : BLK_EH_DONE;
+ BLK_EH_RESET_TIMER : BLK_EH_DONT_RESET_TIMER;
}
EXPORT_SYMBOL(srp_timed_out);
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 397081d320b1..2184222953b2 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6682,12 +6682,12 @@ static enum blk_eh_timer_return ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
bool found = false;
if (!scmd || !scmd->device || !scmd->device->host)
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
host = scmd->device->host;
hba = shost_priv(host);
if (!hba)
- return BLK_EH_DONE;
+ return BLK_EH_DONT_RESET_TIMER;
spin_lock_irqsave(host->host_lock, flags);
@@ -6705,7 +6705,7 @@ static enum blk_eh_timer_return ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
* SCSI command was not actually dispatched to UFS driver, otherwise
* let SCSI layer handle the error as usual.
*/
- return found ? BLK_EH_DONE : BLK_EH_RESET_TIMER;
+ return found ? BLK_EH_DONT_RESET_TIMER : BLK_EH_RESET_TIMER;
}
static const struct attribute_group *ufshcd_driver_groups[] = {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 050d599f5ea9..d5d31b7b6ba8 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -328,7 +328,7 @@ typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
typedef void (exit_rq_fn)(struct request_queue *, struct request *);
enum blk_eh_timer_return {
- BLK_EH_DONE, /* drivers has completed the command */
+ BLK_EH_DONT_RESET_TIMER,/* driver completed or owns the command */
BLK_EH_RESET_TIMER, /* reset timer and try again */
};
--
2.18.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/5] block: Remove a superfluous #include directive
2018-07-27 16:20 [PATCH 0/5] Improve blk-mq performance Bart Van Assche
2018-07-27 16:20 ` [PATCH 1/5] blk-mq: Rename BLK_EH_DONE into BLK_EH_DONT_RESET_TIMER Bart Van Assche
@ 2018-07-27 16:20 ` Bart Van Assche
2018-07-27 16:20 ` [PATCH 3/5] block: Split blk_add_timer() Bart Van Assche
` (3 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:20 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Keith Busch,
Jianchao Wang, Ming Lei
Commit 12f5b9314545 ("blk-mq: Remove generation seqeunce") removed the
only u64_stats_sync instance from <linux/blkdev.h> but did not remove
the corresponding #include directive. Since the #include directive is
no longer needed, remove it.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
---
include/linux/blkdev.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d5d31b7b6ba8..8244a5a1aa5b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -28,7 +28,6 @@
#include <linux/scatterlist.h>
#include <linux/blkzoned.h>
#include <linux/seqlock.h>
-#include <linux/u64_stats_sync.h>
struct module;
struct scsi_ioctl_command;
--
2.18.0
* [PATCH 3/5] block: Split blk_add_timer()
2018-07-27 16:20 [PATCH 0/5] Improve blk-mq performance Bart Van Assche
2018-07-27 16:20 ` [PATCH 1/5] blk-mq: Rename BLK_EH_DONE into BLK_EH_DONT_RESET_TIMER Bart Van Assche
2018-07-27 16:20 ` [PATCH 2/5] block: Remove a superfluous #include directive Bart Van Assche
@ 2018-07-27 16:20 ` Bart Van Assche
2018-07-27 16:29 ` Keith Busch
2018-07-27 16:20 ` [PATCH 4/5] block: Simplify blk_add_timer() and blk_mq_add_timer() Bart Van Assche
` (2 subsequent siblings)
5 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:20 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Keith Busch,
Jianchao Wang, Ming Lei
Split blk_add_timer() as follows:
- Introduce the helper function __blk_add_timer() that performs the
tasks that apply both to the legacy block layer core and to blk-mq.
- Duplicate the remaining blk_add_timer() function into the new
function blk_mq_add_timer().
- Change the blk_add_timer() calls in blk-mq code into
  blk_mq_add_timer() calls.
This patch does not change any functionality.
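The split follows a common refactoring shape: hoist the timer-arming tail
shared by both paths into one static helper, then give each path its own
wrapper. A userspace sketch under illustrative names (`__add_timer`,
`add_timer_legacy` and `mq_add_timer` model `__blk_add_timer()`,
`blk_add_timer()` and `blk_mq_add_timer()`; this is not the kernel code):

```c
#include <assert.h>

unsigned long timer_expiry;	/* models q->timeout.expires */
int legacy_list_len;		/* models q->timeout_list */

/* Shared tail factored out by this patch: arm the queue-wide timer,
 * only ever moving it earlier (the real helper uses mod_timer()). */
void __add_timer(unsigned long deadline)
{
	if (!timer_expiry || deadline < timer_expiry)
		timer_expiry = deadline;
}

/* Legacy wrapper: keeps its per-queue timeout list, then shares the tail. */
void add_timer_legacy(unsigned long now, unsigned long timeout)
{
	legacy_list_len++;		/* only the legacy path tracks a list */
	__add_timer(now + timeout);
}

/* blk-mq wrapper: no timeout list (the tag map is scanned instead),
 * same shared tail. */
void mq_add_timer(unsigned long now, unsigned long timeout)
{
	__add_timer(now + timeout);
}
```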
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
---
block/blk-mq.c | 4 +--
block/blk-timeout.c | 81 +++++++++++++++++++++++++++++++++++----------
block/blk.h | 1 +
3 files changed, 66 insertions(+), 20 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 96e1a7f25875..1b49973629f6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -644,7 +644,7 @@ void blk_mq_start_request(struct request *rq)
WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
- blk_add_timer(rq);
+ blk_mq_add_timer(rq);
WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
if (q->dma_drain_size && blk_rq_bytes(rq)) {
@@ -779,7 +779,7 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
}
- blk_add_timer(req);
+ blk_mq_add_timer(req);
}
static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 37df7f8f8516..527670f2f985 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -181,6 +181,33 @@ unsigned long blk_rq_timeout(unsigned long timeout)
return timeout;
}
+static void __blk_add_timer(struct request *req, unsigned long deadline)
+{
+ struct request_queue *q = req->q;
+ unsigned long expiry;
+
+ /*
+ * If the timer isn't already pending or this timeout is earlier
+ * than an existing one, modify the timer. Round up to next nearest
+ * second.
+ */
+ expiry = blk_rq_timeout(round_jiffies_up(deadline));
+ if (!timer_pending(&q->timeout) ||
+ time_before(expiry, q->timeout.expires)) {
+ unsigned long diff = q->timeout.expires - expiry;
+
+ /*
+ * Due to added timer slack to group timers, the timer
+ * will often be a little in front of what we asked for.
+ * So apply some tolerance here too, otherwise we keep
+ * modifying the timer because expires for value X
+ * will be X + something.
+ */
+ if (!timer_pending(&q->timeout) || (diff >= HZ / 2))
+ mod_timer(&q->timeout, expiry);
+ }
+}
+
/**
* blk_add_timer - Start timeout timer for a single request
* @req: request that is about to start running.
@@ -192,7 +219,6 @@ unsigned long blk_rq_timeout(unsigned long timeout)
void blk_add_timer(struct request *req)
{
struct request_queue *q = req->q;
- unsigned long expiry;
if (!q->mq_ops)
lockdep_assert_held(q->queue_lock);
@@ -220,26 +246,45 @@ void blk_add_timer(struct request *req)
if (!q->mq_ops)
list_add_tail(&req->timeout_list, &req->q->timeout_list);
+ __blk_add_timer(req, blk_rq_deadline(req));
+}
+
+/**
+ * blk_mq_add_timer - set the deadline for a single request
+ * @req: request for which to set the deadline.
+ *
+ * Sets the deadline of a request. The caller must guarantee that the request
+ * state won't be modified while this function is in progress.
+ */
+void blk_mq_add_timer(struct request *req)
+{
+ struct request_queue *q = req->q;
+
+ if (!q->mq_ops)
+ lockdep_assert_held(q->queue_lock);
+
+ /* blk-mq has its own handler, so we don't need ->rq_timed_out_fn */
+ if (!q->mq_ops && !q->rq_timed_out_fn)
+ return;
+
+ BUG_ON(!list_empty(&req->timeout_list));
+
/*
- * If the timer isn't already pending or this timeout is earlier
- * than an existing one, modify the timer. Round up to next nearest
- * second.
+ * Some LLDs, like scsi, peek at the timeout to prevent a
+ * command from being retried forever.
*/
- expiry = blk_rq_timeout(round_jiffies_up(blk_rq_deadline(req)));
+ if (!req->timeout)
+ req->timeout = q->rq_timeout;
- if (!timer_pending(&q->timeout) ||
- time_before(expiry, q->timeout.expires)) {
- unsigned long diff = q->timeout.expires - expiry;
+ req->rq_flags &= ~RQF_TIMED_OUT;
+ blk_rq_set_deadline(req, jiffies + req->timeout);
- /*
- * Due to added timer slack to group timers, the timer
- * will often be a little in front of what we asked for.
- * So apply some tolerance here too, otherwise we keep
- * modifying the timer because expires for value X
- * will be X + something.
- */
- if (!timer_pending(&q->timeout) || (diff >= HZ / 2))
- mod_timer(&q->timeout, expiry);
- }
+ /*
+ * Only the non-mq case needs to add the request to a protected list.
+ * For the mq case we simply scan the tag map.
+ */
+ if (!q->mq_ops)
+ list_add_tail(&req->timeout_list, &req->q->timeout_list);
+ __blk_add_timer(req, blk_rq_deadline(req));
}
diff --git a/block/blk.h b/block/blk.h
index 69b14cd2bb22..6adae8f94279 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -170,6 +170,7 @@ static inline bool bio_integrity_endio(struct bio *bio)
void blk_timeout_work(struct work_struct *work);
unsigned long blk_rq_timeout(unsigned long timeout);
void blk_add_timer(struct request *req);
+void blk_mq_add_timer(struct request *req);
void blk_delete_timer(struct request *);
--
2.18.0
* Re: [PATCH 3/5] block: Split blk_add_timer()
2018-07-27 16:20 ` [PATCH 3/5] block: Split blk_add_timer() Bart Van Assche
@ 2018-07-27 16:29 ` Keith Busch
2018-07-27 16:31 ` Bart Van Assche
0 siblings, 1 reply; 17+ messages in thread
From: Keith Busch @ 2018-07-27 16:29 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Jianchao Wang, Ming Lei
On Fri, Jul 27, 2018 at 09:20:40AM -0700, Bart Van Assche wrote:
> +void blk_mq_add_timer(struct request *req)
> +{
> + struct request_queue *q = req->q;
> +
> + if (!q->mq_ops)
> + lockdep_assert_held(q->queue_lock);
> +
> + /* blk-mq has its own handler, so we don't need ->rq_timed_out_fn */
> + if (!q->mq_ops && !q->rq_timed_out_fn)
> + return;
<snip>
> + if (!q->mq_ops)
> + list_add_tail(&req->timeout_list, &req->q->timeout_list);
Do you really need these checks for !q->mq_ops?
* Re: [PATCH 3/5] block: Split blk_add_timer()
2018-07-27 16:29 ` Keith Busch
@ 2018-07-27 16:31 ` Bart Van Assche
0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:31 UTC (permalink / raw)
To: keith.busch; +Cc: hch, ming.lei, linux-block, axboe, jianchao.w.wang
On Fri, 2018-07-27 at 10:29 -0600, Keith Busch wrote:
> On Fri, Jul 27, 2018 at 09:20:40AM -0700, Bart Van Assche wrote:
> > + if (!q->mq_ops)
> > + list_add_tail(&req->timeout_list, &req->q->timeout_list);
>
> Do you really need these checks for !q->mq_ops?
Hello Keith,
Have you noticed that patch 4/5 removes that check?
Thanks,
Bart.
* [PATCH 4/5] block: Simplify blk_add_timer() and blk_mq_add_timer()
2018-07-27 16:20 [PATCH 0/5] Improve blk-mq performance Bart Van Assche
` (2 preceding siblings ...)
2018-07-27 16:20 ` [PATCH 3/5] block: Split blk_add_timer() Bart Van Assche
@ 2018-07-27 16:20 ` Bart Van Assche
2018-07-27 16:20 ` [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again Bart Van Assche
2018-07-27 16:22 ` [PATCH 0/5] Improve blk-mq performance Bart Van Assche
5 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:20 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Keith Busch,
Jianchao Wang, Ming Lei
Use the fact that q->mq_ops == NULL for the legacy block layer and
that q->mq_ops != NULL for blk-mq to simplify blk_add_timer() and
blk_mq_add_timer(). This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
---
block/blk-timeout.c | 54 +++++++++------------------------------------
1 file changed, 11 insertions(+), 43 deletions(-)
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 527670f2f985..02d7e3427369 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -219,34 +219,21 @@ static void __blk_add_timer(struct request *req, unsigned long deadline)
void blk_add_timer(struct request *req)
{
struct request_queue *q = req->q;
+ unsigned long deadline;
- if (!q->mq_ops)
- lockdep_assert_held(q->queue_lock);
+ lockdep_assert_held(q->queue_lock);
- /* blk-mq has its own handler, so we don't need ->rq_timed_out_fn */
- if (!q->mq_ops && !q->rq_timed_out_fn)
+ if (!q->rq_timed_out_fn)
return;
-
- BUG_ON(!list_empty(&req->timeout_list));
-
- /*
- * Some LLDs, like scsi, peek at the timeout to prevent a
- * command from being retried forever.
- */
if (!req->timeout)
req->timeout = q->rq_timeout;
req->rq_flags &= ~RQF_TIMED_OUT;
- blk_rq_set_deadline(req, jiffies + req->timeout);
+ deadline = jiffies + req->timeout;
+ blk_rq_set_deadline(req, deadline);
+ list_add_tail(&req->timeout_list, &req->q->timeout_list);
- /*
- * Only the non-mq case needs to add the request to a protected list.
- * For the mq case we simply scan the tag map.
- */
- if (!q->mq_ops)
- list_add_tail(&req->timeout_list, &req->q->timeout_list);
-
- __blk_add_timer(req, blk_rq_deadline(req));
+ __blk_add_timer(req, deadline);
}
/**
@@ -259,32 +246,13 @@ void blk_add_timer(struct request *req)
void blk_mq_add_timer(struct request *req)
{
struct request_queue *q = req->q;
+ unsigned long deadline;
- if (!q->mq_ops)
- lockdep_assert_held(q->queue_lock);
-
- /* blk-mq has its own handler, so we don't need ->rq_timed_out_fn */
- if (!q->mq_ops && !q->rq_timed_out_fn)
- return;
-
- BUG_ON(!list_empty(&req->timeout_list));
-
- /*
- * Some LLDs, like scsi, peek at the timeout to prevent a
- * command from being retried forever.
- */
if (!req->timeout)
req->timeout = q->rq_timeout;
req->rq_flags &= ~RQF_TIMED_OUT;
- blk_rq_set_deadline(req, jiffies + req->timeout);
-
- /*
- * Only the non-mq case needs to add the request to a protected list.
- * For the mq case we simply scan the tag map.
- */
- if (!q->mq_ops)
- list_add_tail(&req->timeout_list, &req->q->timeout_list);
-
- __blk_add_timer(req, blk_rq_deadline(req));
+ deadline = jiffies + req->timeout;
+ blk_rq_set_deadline(req, deadline);
+ __blk_add_timer(req, deadline);
}
--
2.18.0
* [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 16:20 [PATCH 0/5] Improve blk-mq performance Bart Van Assche
` (3 preceding siblings ...)
2018-07-27 16:20 ` [PATCH 4/5] block: Simplify blk_add_timer() and blk_mq_add_timer() Bart Van Assche
@ 2018-07-27 16:20 ` Bart Van Assche
2018-07-27 16:46 ` Keith Busch
2018-07-27 16:22 ` [PATCH 0/5] Improve blk-mq performance Bart Van Assche
5 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:20 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche,
Martin K . Petersen, Keith Busch, Jianchao Wang, Ming Lei
Recently the blk-mq timeout handling code was reworked. However, that
rework removed the protection against completions that occur while a
timeout handler is in progress: such a completion can now be ignored.
Fix this by introducing a new request state, namely MQ_RQ_TIMED_OUT.
This state means that the timeout handler is in progress.
Other changes in this patch are:
- Reintroduce the request generation number to avoid that request
timeout handling happens after a request has already completed.
- Reintroduce deadline_seq to make reads and updates of the request
deadline possible without having to use atomic instructions.
- Remove the request state information that became superfluous due to
this patch, namely the RQF_TIMED_OUT flag and the ref member.
- Atomic instructions are only used to update the request state if
a concurrent request state change could be in progress.
- Move the code for managing the request state back from the SCSI
core into the block layer core.
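The deadline_seq point matters on 32-bit targets, where a 64-bit deadline
cannot be loaded in one instruction. A minimal userspace sketch of the
sequence-counter pattern (single writer assumed, memory barriers omitted
for brevity; this models the idea, not the kernel's seqcount_t):

```c
#include <assert.h>
#include <stdint.h>

/* An even/odd sequence counter lets a reader detect a concurrent
 * writer and retry, so no 64-bit atomic load is needed. */
struct deadline {
	unsigned int seq;	/* odd while a write is in progress */
	uint32_t lo, hi;	/* the two halves of a 64-bit deadline */
};

void deadline_write(struct deadline *d, uint64_t val)
{
	d->seq++;			/* odd: write in progress */
	d->lo = (uint32_t)val;
	d->hi = (uint32_t)(val >> 32);
	d->seq++;			/* even again: write done */
}

uint64_t deadline_read(const struct deadline *d)
{
	unsigned int start;
	uint64_t val;

	do {
		start = d->seq;
		val = ((uint64_t)d->hi << 32) | d->lo;
	} while ((start & 1) || start != d->seq);	/* retry torn reads */
	return val;
}
```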
This patch reverts the following two commits:
* 0fc09f920983 ("blk-mq: export setting request completion state")
* 065990bd198e ("scsi: set timed out out mq requests to complete")
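Because a completion can race with the timeout handler, a state transition
such as IN_FLIGHT -> TIMED_OUT must be a compare-and-swap that fails if the
request is no longer in the expected state. A hedged userspace sketch using
C11 atomics (the enum values are illustrative; `change_rq_state` is a
made-up name, not the kernel helper):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model of the request state machine this patch extends;
 * MQ_RQ_TIMED_OUT is the state the patch introduces. */
enum mq_rq_state {
	MQ_RQ_IDLE,
	MQ_RQ_IN_FLIGHT,
	MQ_RQ_COMPLETE,
	MQ_RQ_TIMED_OUT,
};

/* Atomically move a request from @old to @new. Returns false if some
 * other context (e.g. a racing completion) changed the state first,
 * which is exactly the race the cmpxchg()-style update guards against. */
bool change_rq_state(_Atomic int *state, enum mq_rq_state old,
		     enum mq_rq_state new)
{
	int expected = old;

	return atomic_compare_exchange_strong(state, &expected, new);
}
```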
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
---
block/blk-core.c | 3 +
block/blk-mq-debugfs.c | 3 +-
block/blk-mq.c | 280 ++++++++++++++++++++++++++++----------
block/blk-mq.h | 10 +-
block/blk-timeout.c | 28 ++--
drivers/scsi/scsi_error.c | 14 --
include/linux/blk-mq.h | 14 --
include/linux/blkdev.h | 49 +++++--
8 files changed, 281 insertions(+), 120 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 03a4ea93a5f3..3c63e410ede4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -198,6 +198,9 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
rq->internal_tag = -1;
rq->start_time_ns = ktime_get_ns();
rq->part = NULL;
+#ifndef CONFIG_64BIT
+ seqcount_init(&rq->deadline_seq);
+#endif
}
EXPORT_SYMBOL(blk_rq_init);
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index cb1e6cf7ac48..72abc9a87c53 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -335,8 +335,9 @@ static const char *const rqf_name[] = {
static const char *const blk_mq_rq_state_name_array[] = {
[MQ_RQ_IDLE] = "idle",
- [MQ_RQ_IN_FLIGHT] = "in_flight",
+ [MQ_RQ_IN_FLIGHT] = "in flight",
[MQ_RQ_COMPLETE] = "complete",
+ [MQ_RQ_TIMED_OUT] = "timed out",
};
static const char *blk_mq_rq_state_name(enum mq_rq_state rq_state)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1b49973629f6..a97ab5ba9d18 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -6,6 +6,7 @@
*/
#include <linux/kernel.h>
#include <linux/module.h>
+#include <linux/atomic.h> /* cmpxchg() */
#include <linux/backing-dev.h>
#include <linux/bio.h>
#include <linux/blkdev.h>
@@ -322,6 +323,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
INIT_LIST_HEAD(&rq->timeout_list);
rq->timeout = 0;
+ WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
rq->end_io = NULL;
rq->end_io_data = NULL;
@@ -332,7 +334,6 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
#endif
data->ctx->rq_dispatched[op_is_sync(op)]++;
- refcount_set(&rq->ref, 1);
return rq;
}
@@ -466,19 +467,42 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
}
EXPORT_SYMBOL_GPL(blk_mq_alloc_request_hctx);
-static void __blk_mq_free_request(struct request *rq)
+/**
+ * blk_mq_change_rq_state - atomically test and set request state
+ * @rq: Request pointer.
+ * @old_state: Old request state.
+ * @new_state: New request state.
+ *
+ * Returns %true if and only if the old state was @old_state and if the
+ * state has been changed into @new_state.
+ */
+static bool blk_mq_change_rq_state(struct request *rq,
+ enum mq_rq_state old_state,
+ enum mq_rq_state new_state)
{
- struct request_queue *q = rq->q;
- struct blk_mq_ctx *ctx = rq->mq_ctx;
- struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
- const int sched_tag = rq->internal_tag;
+ union blk_generation_and_state old_val = READ_ONCE(rq->gstate);
+ union blk_generation_and_state new_val = old_val;
- if (rq->tag != -1)
- blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
- if (sched_tag != -1)
- blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
- blk_mq_sched_restart(hctx);
- blk_queue_exit(q);
+ old_val.state = old_state;
+ new_val.state = new_state;
+ if (new_state == MQ_RQ_IN_FLIGHT)
+ new_val.generation++;
+ /*
+ * For transitions from the idle or complete state cmpxchg() must be
+ * used. For transitions from the in-flight or timed-out state it is
+ * safe to use WRITE_ONCE().
+ */
+ switch (old_state) {
+ case MQ_RQ_IN_FLIGHT:
+ case MQ_RQ_TIMED_OUT:
+ WRITE_ONCE(rq->gstate.val, new_val.val);
+ return true;
+ case MQ_RQ_IDLE:
+ case MQ_RQ_COMPLETE:
+ return blk_mq_set_rq_state(rq, old_val, new_val);
+ }
+ WARN(true, "Invalid request state %d\n", old_state);
+ return false;
}
void blk_mq_free_request(struct request *rq)
@@ -487,6 +511,7 @@ void blk_mq_free_request(struct request *rq)
struct elevator_queue *e = q->elevator;
struct blk_mq_ctx *ctx = rq->mq_ctx;
struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+ const int sched_tag = rq->internal_tag;
if (rq->rq_flags & RQF_ELVPRIV) {
if (e && e->type->ops.mq.finish_request)
@@ -509,9 +534,14 @@ void blk_mq_free_request(struct request *rq)
if (blk_rq_rl(rq))
blk_put_rl(blk_rq_rl(rq));
- WRITE_ONCE(rq->state, MQ_RQ_IDLE);
- if (refcount_dec_and_test(&rq->ref))
- __blk_mq_free_request(rq);
+ if (!blk_mq_change_rq_state(rq, blk_mq_rq_state(rq), MQ_RQ_IDLE))
+ WARN_ON_ONCE(true);
+ if (rq->tag != -1)
+ blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
+ if (sched_tag != -1)
+ blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
+ blk_mq_sched_restart(hctx);
+ blk_queue_exit(q);
}
EXPORT_SYMBOL_GPL(blk_mq_free_request);
@@ -558,8 +588,8 @@ static void __blk_mq_complete_request(struct request *rq)
bool shared = false;
int cpu;
- if (!blk_mq_mark_complete(rq))
- return;
+ WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_COMPLETE);
+
if (rq->internal_tag != -1)
blk_mq_sched_completed_request(rq);
@@ -615,7 +645,18 @@ void blk_mq_complete_request(struct request *rq)
{
if (unlikely(blk_should_fake_timeout(rq->q)))
return;
- __blk_mq_complete_request(rq);
+again:
+ if (blk_mq_change_rq_state(rq, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE)) {
+ __blk_mq_complete_request(rq);
+ return;
+ }
+ if (blk_mq_change_rq_state(rq, MQ_RQ_TIMED_OUT, MQ_RQ_COMPLETE))
+ return;
+ if (WARN_ON_ONCE(blk_mq_rq_state(rq) == MQ_RQ_IDLE ||
+ blk_mq_rq_state(rq) == MQ_RQ_COMPLETE))
+ return;
+ /* In the unlikely case of a race with the timeout code, retry. */
+ goto again;
}
EXPORT_SYMBOL(blk_mq_complete_request);
@@ -645,7 +686,7 @@ void blk_mq_start_request(struct request *rq)
WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
blk_mq_add_timer(rq);
- WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
+ WARN_ON_ONCE(!blk_mq_change_rq_state(rq, MQ_RQ_IDLE, MQ_RQ_IN_FLIGHT));
if (q->dma_drain_size && blk_rq_bytes(rq)) {
/*
@@ -658,23 +699,46 @@ void blk_mq_start_request(struct request *rq)
}
EXPORT_SYMBOL(blk_mq_start_request);
+/**
+ * __blk_mq_requeue_request - requeue a request
+ * @rq: request to be requeued
+ *
+ * This function is either called by blk_mq_requeue_request() or by the block
+ * layer core if .queue_rq() returned BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE.
+ * If the request state is MQ_RQ_IN_FLIGHT and if this function is called from
+ * inside .queue_rq() then it is guaranteed that the timeout code won't try to
+ * modify the request state while this function is in progress because an RCU
+ * read lock is held around .queue_rq() and because the timeout code calls
+ * synchronize_rcu() after having marked requests and before processing
+ * time-outs.
+ */
static void __blk_mq_requeue_request(struct request *rq)
{
struct request_queue *q = rq->q;
+ enum mq_rq_state old_state = blk_mq_rq_state(rq);
blk_mq_put_driver_tag(rq);
trace_block_rq_requeue(q, rq);
rq_qos_requeue(q, rq);
- if (blk_mq_request_started(rq)) {
- WRITE_ONCE(rq->state, MQ_RQ_IDLE);
- rq->rq_flags &= ~RQF_TIMED_OUT;
+ if (old_state != MQ_RQ_IDLE) {
+ if (!blk_mq_change_rq_state(rq, old_state, MQ_RQ_IDLE))
+ WARN_ON_ONCE(true);
if (q->dma_drain_size && blk_rq_bytes(rq))
rq->nr_phys_segments--;
}
}
+/**
+ * blk_mq_requeue_request - requeue a request
+ * @rq: request to be requeued
+ * @kick_requeue_list: whether or not to kick the requeue_list
+ *
+ * This function is called after a request has completed, after a request has
+ * timed out or from inside .queue_rq(). In the latter case the request may
+ * already have been started.
+ */
void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
{
__blk_mq_requeue_request(rq);
@@ -767,82 +831,117 @@ struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
}
EXPORT_SYMBOL(blk_mq_tag_to_rq);
+struct blk_mq_timeout_data {
+ unsigned long next;
+ unsigned int next_set;
+ unsigned int nr_expired;
+};
+
static void blk_mq_rq_timed_out(struct request *req, bool reserved)
{
- req->rq_flags |= RQF_TIMED_OUT;
- if (req->q->mq_ops->timeout) {
- enum blk_eh_timer_return ret;
+ enum blk_eh_timer_return ret;
- ret = req->q->mq_ops->timeout(req, reserved);
- if (ret == BLK_EH_DONT_RESET_TIMER)
- return;
- WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
+ if (!req->q->mq_ops->timeout) {
+ blk_mq_add_timer(req);
+ return;
}
+ ret = req->q->mq_ops->timeout(req, reserved);
+ /*
+ * BLK_EH_DONT_RESET_TIMER means that the block driver either
+ * completed the request or still owns the request and will
+ * continue processing the timeout asynchronously. In the
+ * latter case, if blk_mq_complete_request() was called while
+ * the timeout handler was in progress, ignore that call.
+ */
+ if (ret == BLK_EH_DONT_RESET_TIMER)
+ return;
+ WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
blk_mq_add_timer(req);
+again:
+ if (blk_mq_change_rq_state(req, MQ_RQ_TIMED_OUT, MQ_RQ_IN_FLIGHT))
+ return;
+ if (blk_mq_rq_state(req) == MQ_RQ_COMPLETE) {
+ __blk_mq_complete_request(req);
+ return;
+ }
+ if (WARN_ON_ONCE(blk_mq_rq_state(req) == MQ_RQ_IDLE ||
+ blk_mq_rq_state(req) == MQ_RQ_IN_FLIGHT))
+ return;
+ /*
+ * In the unlikely case of a race with the request completion code,
+ * retry.
+ */
+ goto again;
}
-static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
+static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx, struct request *rq,
+ void *priv, bool reserved)
{
+ struct blk_mq_timeout_data *data = priv;
+ union blk_generation_and_state gstate = READ_ONCE(rq->gstate);
unsigned long deadline;
- if (blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT)
- return false;
- if (rq->rq_flags & RQF_TIMED_OUT)
- return false;
+ might_sleep();
- deadline = blk_rq_deadline(rq);
- if (time_after_eq(jiffies, deadline))
- return true;
+#ifdef CONFIG_64BIT
+ deadline = rq->__deadline;
+#else
+ while (true) {
+ int start;
- if (*next == 0)
- *next = deadline;
- else if (time_after(*next, deadline))
- *next = deadline;
- return false;
+ start = read_seqcount_begin(&rq->deadline_seq);
+ deadline = rq->__deadline;
+ if (!read_seqcount_retry(&rq->deadline_seq, start))
+ break;
+ cond_resched();
+ }
+#endif
+
+ /*
+ * Make sure that rq->aborted_gstate != rq->gstate if a request gets
+ * recycled before blk_mq_terminate_expired() is called.
+ */
+ rq->aborted_gstate = gstate;
+ rq->aborted_gstate.generation ^= (1UL << 29);
+ if (gstate.state == MQ_RQ_IN_FLIGHT &&
+ time_after_eq(jiffies, deadline)) {
+ /* request timed out */
+ rq->aborted_gstate = gstate;
+ data->nr_expired++;
+ hctx->nr_expired++;
+ } else if (!data->next_set || time_after(data->next, deadline)) {
+ data->next = deadline;
+ data->next_set = 1;
+ }
}
-static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
+static void blk_mq_terminate_expired(struct blk_mq_hw_ctx *hctx,
struct request *rq, void *priv, bool reserved)
{
- unsigned long *next = priv;
+ union blk_generation_and_state old_val = rq->aborted_gstate;
+ union blk_generation_and_state new_val = old_val;
- /*
- * Just do a quick check if it is expired before locking the request in
- * so we're not unnecessarilly synchronizing across CPUs.
- */
- if (!blk_mq_req_expired(rq, next))
- return;
-
- /*
- * We have reason to believe the request may be expired. Take a
- * reference on the request to lock this request lifetime into its
- * currently allocated context to prevent it from being reallocated in
- * the event the completion by-passes this timeout handler.
- *
- * If the reference was already released, then the driver beat the
- * timeout handler to posting a natural completion.
- */
- if (!refcount_inc_not_zero(&rq->ref))
- return;
+ new_val.state = MQ_RQ_TIMED_OUT;
/*
- * The request is now locked and cannot be reallocated underneath the
- * timeout handler's processing. Re-verify this exact request is truly
- * expired; if it is not expired, then the request was completed and
- * reallocated as a new request.
+ * We marked @rq->aborted_gstate and waited for ongoing .queue_rq()
+ * calls. If rq->gstate has not changed that means that it
+ * is now safe to change the request state and to handle the timeout.
*/
- if (blk_mq_req_expired(rq, next))
+ if (blk_mq_set_rq_state(rq, old_val, new_val))
blk_mq_rq_timed_out(rq, reserved);
- if (refcount_dec_and_test(&rq->ref))
- __blk_mq_free_request(rq);
}
static void blk_mq_timeout_work(struct work_struct *work)
{
struct request_queue *q =
container_of(work, struct request_queue, timeout_work);
- unsigned long next = 0;
+ struct blk_mq_timeout_data data = {
+ .next = 0,
+ .next_set = 0,
+ .nr_expired = 0,
+ };
struct blk_mq_hw_ctx *hctx;
int i;
@@ -862,10 +961,41 @@ static void blk_mq_timeout_work(struct work_struct *work)
if (!percpu_ref_tryget(&q->q_usage_counter))
return;
- blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next);
+ /* scan for the expired ones and set their ->aborted_gstate */
+ blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &data);
+
+ if (data.nr_expired) {
+ bool has_rcu = false;
+
+ /*
+ * For very short timeouts it can happen that
+ * blk_mq_check_expired() modifies the state of a request
+ * while .queue_rq() is still in progress. Hence wait until
+ * these .queue_rq() calls have finished. This is also
+ * necessary to avoid races with blk_mq_requeue_request() for
+ * requests that have already been started.
+ */
+ queue_for_each_hw_ctx(q, hctx, i) {
+ if (!hctx->nr_expired)
+ continue;
+
+ if (!(hctx->flags & BLK_MQ_F_BLOCKING))
+ has_rcu = true;
+ else
+ synchronize_srcu(hctx->srcu);
- if (next != 0) {
- mod_timer(&q->timeout, next);
+ hctx->nr_expired = 0;
+ }
+ if (has_rcu)
+ synchronize_rcu();
+
+ /* terminate the ones we won */
+ blk_mq_queue_tag_busy_iter(q, blk_mq_terminate_expired, NULL);
+ }
+
+ if (data.next_set) {
+ data.next = blk_rq_timeout(round_jiffies_up(data.next));
+ mod_timer(&q->timeout, data.next);
} else {
/*
* Request timeouts are handled as a forward rolling timer. If
@@ -2009,7 +2139,9 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
return ret;
}
- WRITE_ONCE(rq->state, MQ_RQ_IDLE);
+#ifndef CONFIG_64BIT
+ seqcount_init(&rq->deadline_seq);
+#endif
return 0;
}
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9497b47e2526..de92cddc147f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -90,13 +90,21 @@ extern void blk_mq_hctx_kobj_init(struct blk_mq_hw_ctx *hctx);
void blk_mq_release(struct request_queue *q);
+static inline bool blk_mq_set_rq_state(struct request *rq,
+ union blk_generation_and_state old_val,
+ union blk_generation_and_state new_val)
+{
+ return cmpxchg(&rq->gstate.val, old_val.val, new_val.val) ==
+ old_val.val;
+}
+
/**
* blk_mq_rq_state() - read the current MQ_RQ_* state of a request
* @rq: target request.
*/
static inline enum mq_rq_state blk_mq_rq_state(struct request *rq)
{
- return READ_ONCE(rq->state);
+ return READ_ONCE(rq->gstate).state;
}
static inline struct blk_mq_ctx *__blk_mq_get_ctx(struct request_queue *q,
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 02d7e3427369..b37b8090f3ae 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -142,6 +142,22 @@ void blk_timeout_work(struct work_struct *work)
spin_unlock_irqrestore(q->queue_lock, flags);
}
+/* Update deadline to @time. May be called from interrupt context. */
+static void blk_mq_rq_set_deadline(struct request *rq, unsigned long new_time)
+{
+#ifdef CONFIG_64BIT
+ rq->__deadline = new_time;
+#else
+ unsigned long flags;
+
+ local_irq_save(flags);
+ write_seqcount_begin(&rq->deadline_seq);
+ rq->__deadline = new_time;
+ write_seqcount_end(&rq->deadline_seq);
+ local_irq_restore(flags);
+#endif
+}
+
/**
* blk_abort_request -- Request request recovery for the specified command
* @req: pointer to the request of interest
@@ -155,11 +171,10 @@ void blk_abort_request(struct request *req)
{
if (req->q->mq_ops) {
/*
- * All we need to ensure is that timeout scan takes place
- * immediately and that scan sees the new timeout value.
- * No need for fancy synchronizations.
+ * Ensure that a timeout scan takes place immediately and that
+ * that scan sees the new timeout value.
*/
- blk_rq_set_deadline(req, jiffies);
+ blk_mq_rq_set_deadline(req, jiffies);
kblockd_schedule_work(&req->q->timeout_work);
} else {
if (blk_mark_rq_complete(req))
@@ -228,7 +243,6 @@ void blk_add_timer(struct request *req)
if (!req->timeout)
req->timeout = q->rq_timeout;
- req->rq_flags &= ~RQF_TIMED_OUT;
deadline = jiffies + req->timeout;
blk_rq_set_deadline(req, deadline);
list_add_tail(&req->timeout_list, &req->q->timeout_list);
@@ -250,9 +264,7 @@ void blk_mq_add_timer(struct request *req)
if (!req->timeout)
req->timeout = q->rq_timeout;
-
- req->rq_flags &= ~RQF_TIMED_OUT;
deadline = jiffies + req->timeout;
- blk_rq_set_deadline(req, deadline);
+ blk_mq_rq_set_deadline(req, deadline);
__blk_add_timer(req, deadline);
}
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b38b8e62e618..fa18f9d78170 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -296,20 +296,6 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
rtn = host->hostt->eh_timed_out(scmd);
if (rtn == BLK_EH_DONT_RESET_TIMER) {
- /*
- * For blk-mq, we must set the request state to complete now
- * before sending the request to the scsi error handler. This
- * will prevent a use-after-free in the event the LLD manages
- * to complete the request before the error handler finishes
- * processing this timed out request.
- *
- * If the request was already completed, then the LLD beat the
- * time out handler from transferring the request to the scsi
- * error handler. In that case we can return immediately as no
- * further action is required.
- */
- if (req->q->mq_ops && !blk_mq_mark_complete(req))
- return rtn;
if (scsi_abort_command(scmd) != SUCCESS) {
set_host_byte(scmd, DID_TIME_OUT);
scsi_eh_scmd_add(scmd);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 1da59c16f637..d710e92874cc 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -289,20 +289,6 @@ void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
void blk_mq_quiesce_queue_nowait(struct request_queue *q);
-/**
- * blk_mq_mark_complete() - Set request state to complete
- * @rq: request to set to complete state
- *
- * Returns true if request state was successfully set to complete. If
- * successful, the caller is responsibile for seeing this request is ended, as
- * blk_mq_complete_request will not work again.
- */
-static inline bool blk_mq_mark_complete(struct request *rq)
-{
- return cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) ==
- MQ_RQ_IN_FLIGHT;
-}
-
/*
* Driver command data is immediately after the request. So subtract request
* size to get back to the original request, add request size to get the PDU.
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8244a5a1aa5b..58e5b22123a2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -126,20 +126,39 @@ typedef __u32 __bitwise req_flags_t;
#define RQF_ZONE_WRITE_LOCKED ((__force req_flags_t)(1 << 19))
/* already slept for hybrid poll */
#define RQF_MQ_POLL_SLEPT ((__force req_flags_t)(1 << 20))
-/* ->timeout has been called, don't expire again */
-#define RQF_TIMED_OUT ((__force req_flags_t)(1 << 21))
/* flags that prevent us from merging requests: */
#define RQF_NOMERGE_FLAGS \
(RQF_STARTED | RQF_SOFTBARRIER | RQF_FLUSH_SEQ | RQF_SPECIAL_PAYLOAD)
-/*
- * Request state for blk-mq.
+union blk_generation_and_state {
+ struct {
+ uint32_t generation:30;
+ uint32_t state:2;
+ };
+ uint32_t val;
+};
+
+/**
+ * enum mq_rq_state - blk-mq request state
+ *
+ * The legal state transitions are:
+ * - idle -> in flight: blk_mq_start_request()
+ * - in flight -> complete: blk_mq_complete_request()
+ * - timed out -> complete: blk_mq_complete_request()
+ * - in flight -> timed out: request times out
+ * - complete -> idle: blk_mq_requeue_request() or blk_mq_free_request()
+ * - in flight -> idle: blk_mq_requeue_request() or blk_mq_free_request()
+ * - timed out -> idle: blk_mq_requeue_request() or blk_mq_free_request()
+ * - timed out -> in flight: request restart due to BLK_EH_RESET_TIMER
+ *
+ * See also blk_generation_and_state.state.
*/
enum mq_rq_state {
MQ_RQ_IDLE = 0,
MQ_RQ_IN_FLIGHT = 1,
MQ_RQ_COMPLETE = 2,
+ MQ_RQ_TIMED_OUT = 3,
};
/*
@@ -242,12 +261,26 @@ struct request {
unsigned int extra_len; /* length of alignment and padding */
- enum mq_rq_state state;
- refcount_t ref;
-
unsigned int timeout;
- /* access through blk_rq_set_deadline, blk_rq_deadline */
+ /*
+ * ->aborted_gstate is used by the timeout to claim a specific
+ * recycle instance of this request. See blk_mq_timeout_work().
+ */
+ union blk_generation_and_state aborted_gstate;
+
+ /*
+ * Access through blk_rq_deadline() and blk_rq_set_deadline(),
+ * blk_mark_rq_complete(), blk_clear_rq_complete() and
+ * blk_rq_is_complete() for legacy queues or blk_mq_rq_state(),
+ * blk_mq_change_rq_state() and blk_mq_rq_set_deadline() for
+ * blk-mq queues. deadline_seq protects __deadline in blk-mq mode
+ * only.
+ */
+ union blk_generation_and_state gstate;
+#ifndef CONFIG_64BIT
+ seqcount_t deadline_seq;
+#endif
unsigned long __deadline;
struct list_head timeout_list;
--
2.18.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 16:20 ` [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again Bart Van Assche
@ 2018-07-27 16:46 ` Keith Busch
2018-07-27 16:51 ` Bart Van Assche
0 siblings, 1 reply; 17+ messages in thread
From: Keith Busch @ 2018-07-27 16:46 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
Jianchao Wang, Ming Lei
On Fri, Jul 27, 2018 at 09:20:42AM -0700, Bart Van Assche wrote:
> + ret = req->q->mq_ops->timeout(req, reserved);
> + /*
> + * BLK_EH_DONT_RESET_TIMER means that the block driver either
> + * completed the request or still owns the request and will
> + * continue processing the timeout asynchronously. In the
> + * latter case, if blk_mq_complete_request() was called while
> + * the timeout handler was in progress, ignore that call.
> + */
> + if (ret == BLK_EH_DONT_RESET_TIMER)
> + return;
This is how completions get lost.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 16:46 ` Keith Busch
@ 2018-07-27 16:51 ` Bart Van Assche
2018-07-27 16:57 ` Keith Busch
0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:51 UTC (permalink / raw)
To: keith.busch
Cc: hch, ming.lei, linux-block, martin.petersen, axboe, jianchao.w.wang
On Fri, 2018-07-27 at 10:46 -0600, Keith Busch wrote:
> On Fri, Jul 27, 2018 at 09:20:42AM -0700, Bart Van Assche wrote:
> > +	ret = req->q->mq_ops->timeout(req, reserved);
> > +	/*
> > +	 * BLK_EH_DONT_RESET_TIMER means that the block driver either
> > +	 * completed the request or still owns the request and will
> > +	 * continue processing the timeout asynchronously. In the
> > +	 * latter case, if blk_mq_complete_request() was called while
> > +	 * the timeout handler was in progress, ignore that call.
> > +	 */
> > +	if (ret == BLK_EH_DONT_RESET_TIMER)
> > +		return;
>
> This is how completions get lost.

The new approach for handling completions that occur while the .timeout()
callback in progress is as follows:
* blk_mq_complete_request() executes the following code:
	if (blk_mq_change_rq_state(rq, MQ_RQ_TIMED_OUT, MQ_RQ_COMPLETE))
		return;
* blk_mq_rq_timed_out() executes the following code:
	if (blk_mq_rq_state(req) == MQ_RQ_COMPLETE) {
		__blk_mq_complete_request(req);
		return;
	}

As one can see __blk_mq_complete_request() gets called if this race occurs.
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 16:51 ` Bart Van Assche
@ 2018-07-27 16:57 ` Keith Busch
2018-07-27 16:59 ` Bart Van Assche
0 siblings, 1 reply; 17+ messages in thread
From: Keith Busch @ 2018-07-27 16:57 UTC (permalink / raw)
To: Bart Van Assche
Cc: hch, ming.lei, linux-block, martin.petersen, axboe, jianchao.w.wang
On Fri, Jul 27, 2018 at 04:51:05PM +0000, Bart Van Assche wrote:
> On Fri, 2018-07-27 at 10:46 -0600, Keith Busch wrote:
> > On Fri, Jul 27, 2018 at 09:20:42AM -0700, Bart Van Assche wrote:
> > > + ret = req->q->mq_ops->timeout(req, reserved);
> > > + /*
> > > + * BLK_EH_DONT_RESET_TIMER means that the block driver either
> > > + * completed the request or still owns the request and will
> > > + * continue processing the timeout asynchronously. In the
> > > + * latter case, if blk_mq_complete_request() was called while
> > > + * the timeout handler was in progress, ignore that call.
> > > + */
> > > + if (ret == BLK_EH_DONT_RESET_TIMER)
> > > + return;
> >
> > This is how completions get lost.
>
> The new approach for handling completions that occur while the .timeout()
> callback in progress is as follows:
> * blk_mq_complete_request() executes the following code:
> if (blk_mq_change_rq_state(rq, MQ_RQ_TIMED_OUT, MQ_RQ_COMPLETE))
> return;
> * blk_mq_rq_timed_out() executes the following code:
> if (blk_mq_rq_state(req) == MQ_RQ_COMPLETE) {
> __blk_mq_complete_request(req);
> return;
> }
>
> As one can see __blk_mq_complete_request() gets called if this race occurs.
You skip that code if the driver returns BLK_EH_DONT_RESET_TIMER.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 16:57 ` Keith Busch
@ 2018-07-27 16:59 ` Bart Van Assche
2018-07-27 17:04 ` Keith Busch
0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:59 UTC (permalink / raw)
To: keith.busch
Cc: hch, jianchao.w.wang, linux-block, martin.petersen, ming.lei, axboe
On Fri, 2018-07-27 at 10:57 -0600, Keith Busch wrote:
> You skip that code if the driver returns BLK_EH_DONT_RESET_TIMER.

How about applying the following patch on top of this series?

diff --git a/block/blk-mq.c b/block/blk-mq.c
index a97ab5ba9d18..aa66535604fd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -854,10 +854,10 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 	 * latter case, if blk_mq_complete_request() was called while
 	 * the timeout handler was in progress, ignore that call.
 	 */
-	if (ret == BLK_EH_DONT_RESET_TIMER)
-		return;
-	WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
-	blk_mq_add_timer(req);
+	if (ret == BLK_EH_RESET_TIMER)
+		blk_mq_add_timer(req);
+	else
+		WARN_ON_ONCE(ret != BLK_EH_DONT_RESET_TIMER);
 again:
 	if (blk_mq_change_rq_state(req, MQ_RQ_TIMED_OUT, MQ_RQ_IN_FLIGHT))
 		return;
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 16:59 ` Bart Van Assche
@ 2018-07-27 17:04 ` Keith Busch
2018-07-27 17:14 ` Bart Van Assche
0 siblings, 1 reply; 17+ messages in thread
From: Keith Busch @ 2018-07-27 17:04 UTC (permalink / raw)
To: Bart Van Assche
Cc: hch, jianchao.w.wang, linux-block, martin.petersen, ming.lei, axboe
On Fri, Jul 27, 2018 at 04:59:34PM +0000, Bart Van Assche wrote:
> On Fri, 2018-07-27 at 10:57 -0600, Keith Busch wrote:
> > You skip that code if the driver returns BLK_EH_DONT_RESET_TIMER.
>
> How about applying the following patch on top of this series?
That works for me if you, but it breaks scsi again when
scmd_eh_abort_handler completes the command a second time.
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index a97ab5ba9d18..aa66535604fd 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -854,10 +854,10 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
> * latter case, if blk_mq_complete_request() was called while
> * the timeout handler was in progress, ignore that call.
> */
> - if (ret == BLK_EH_DONT_RESET_TIMER)
> - return;
> - WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
> - blk_mq_add_timer(req);
> + if (ret == BLK_EH_RESET_TIMER)
> + blk_mq_add_timer(req);
> + else
> + WARN_ON_ONCE(ret != BLK_EH_DONT_RESET_TIMER);
> again:
> if (blk_mq_change_rq_state(req, MQ_RQ_TIMED_OUT, MQ_RQ_IN_FLIGHT))
> return;
>
> Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 17:04 ` Keith Busch
@ 2018-07-27 17:14 ` Bart Van Assche
2018-07-27 17:58 ` Keith Busch
0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 17:14 UTC (permalink / raw)
To: keith.busch
Cc: hch, axboe, linux-block, martin.petersen, jianchao.w.wang, ming.lei
On Fri, 2018-07-27 at 11:04 -0600, Keith Busch wrote:
> On Fri, Jul 27, 2018 at 04:59:34PM +0000, Bart Van Assche wrote:
> > On Fri, 2018-07-27 at 10:57 -0600, Keith Busch wrote:
> > > You skip that code if the driver returns BLK_EH_DONT_RESET_TIMER.
> >
> > How about applying the following patch on top of this series?
>
> That works for me if you, but it breaks scsi again when
> scmd_eh_abort_handler completes the command a second time.

How about introducing a new request queue flag that chooses between the
behavior with or without the patch in my previous e-mail? I don't think that
it is possible to come up with a single implementation that covers the needs
of NVMe and SCSI without introducing such a flag. If a SCSI request times out
then request ownership is transferred from the LLD to the error handler. For
the NVMe driver however there is no such transfer of ownership.
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 17:14 ` Bart Van Assche
@ 2018-07-27 17:58 ` Keith Busch
2018-07-27 18:09 ` Bart Van Assche
0 siblings, 1 reply; 17+ messages in thread
From: Keith Busch @ 2018-07-27 17:58 UTC (permalink / raw)
To: Bart Van Assche
Cc: hch, axboe, linux-block, martin.petersen, jianchao.w.wang, ming.lei
On Fri, Jul 27, 2018 at 05:14:18PM +0000, Bart Van Assche wrote:
> On Fri, 2018-07-27 at 11:04 -0600, Keith Busch wrote:
> > On Fri, Jul 27, 2018 at 04:59:34PM +0000, Bart Van Assche wrote:
> > > On Fri, 2018-07-27 at 10:57 -0600, Keith Busch wrote:
> > > > You skip that code if the driver returns BLK_EH_DONT_RESET_TIMER.
> > >
> > > How about applying the following patch on top of this series?
> >
> > That works for me if you, but it breaks scsi again when
> > scmd_eh_abort_handler completes the command a second time.
>
> How about introducing a new request queue flag that chooses between the
> behavior with or without the patch in my previous e-mail? I don't think that
> it is possible to come up with a single implementation that covers the needs
> of NVMe and SCSI without introducing such a flag. If a SCSI request times out
> then request ownership is transferred from the LLD to the error handler. For
> the NVMe driver however there is no such transfer of ownership.
Instead of PATCH 1/5, how about creating a new timeout return code like
"BLK_EH_DONT_COMPLETE"?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again
2018-07-27 17:58 ` Keith Busch
@ 2018-07-27 18:09 ` Bart Van Assche
0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 18:09 UTC (permalink / raw)
To: keith.busch
Cc: hch, ming.lei, linux-block, martin.petersen, axboe, jianchao.w.wang
On Fri, 2018-07-27 at 11:58 -0600, Keith Busch wrote:
> Instead of PATCH 1/5, how about creating a new timeout return code like
> "BLK_EH_DONT_COMPLETE"?

That sounds like a good idea to me. I think this approach will avoid that we
have to introduce a request queue flag that chooses between the behavior
needed by the NVMe driver or the behavior needed by the SCSI core.
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 0/5] Improve blk-mq performance
2018-07-27 16:20 [PATCH 0/5] Improve blk-mq performance Bart Van Assche
` (4 preceding siblings ...)
2018-07-27 16:20 ` [PATCH 5/5] blk-mq: Rework blk-mq timeout handling again Bart Van Assche
@ 2018-07-27 16:22 ` Bart Van Assche
5 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2018-07-27 16:22 UTC (permalink / raw)
To: axboe; +Cc: hch, linux-block
On Fri, 2018-07-27 at 09:20 -0700, Bart Van Assche wrote:
> hot path. This patch series introduces the number of IOPS reported by the
^^^^^^^^^^
This is a typo - I meant "improves".
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread