From: Tejun Heo <tj@kernel.org>
To: axboe@kernel.dk, newella@fb.com, clm@fb.com,
josef@toxicpanda.com, dennisz@fb.com, lizefan@huawei.com,
hannes@cmpxchg.org
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
kernel-team@fb.com, cgroups@vger.kernel.org, ast@kernel.org,
daniel@iogearbox.net, kafai@fb.com, songliubraving@fb.com,
yhs@fb.com, bpf@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: [PATCH 07/10] blk-mq: add optional request->pre_start_time_ns
Date: Thu, 13 Jun 2019 18:56:17 -0700
Message-ID: <20190614015620.1587672-8-tj@kernel.org>
In-Reply-To: <20190614015620.1587672-1-tj@kernel.org>
There are currently two start timestamps: start_time_ns and
io_start_time_ns. The former marks request allocation and the
latter the issue-to-device time. The planned io.weight controller needs
to measure the total time bios take to execute after they leave rq_qos,
including the time spent waiting for a request to become available,
which can easily dominate on saturated devices.

This patch adds request->pre_start_time_ns, which records when the
request allocation attempt started. As it isn't used for the usual
stats, make it optional behind QUEUE_FLAG_REC_PRESTART.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
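As a rough illustration of how a consumer might use the new field, here
is a minimal userspace sketch. It is not kernel code: struct rq_times
and both helpers are hypothetical names that only model the three
timestamps. It assumes pre_start_time_ns stays 0 when
QUEUE_FLAG_REC_PRESTART is unset, as in this patch, and treats a
start_time_ns earlier than pre_start_time_ns (for example, left at 0
because time stamping was not needed) as "no wait recorded".

```c
#include <stdint.h>

/* Userspace model of the request timestamps; not the kernel's struct request. */
struct rq_times {
	uint64_t pre_start_time_ns;	/* allocation attempt started; 0 if not recorded */
	uint64_t start_time_ns;		/* request allocated */
	uint64_t io_start_time_ns;	/* request issued to the device */
};

/* Hypothetical helper: time spent waiting for a request (depth and tag waits). */
static uint64_t rq_alloc_wait_ns(const struct rq_times *t)
{
	if (!t->pre_start_time_ns || t->start_time_ns < t->pre_start_time_ns)
		return 0;
	return t->start_time_ns - t->pre_start_time_ns;
}

/*
 * Hypothetical helper: total time up to @now_ns, measured from the start of
 * the allocation attempt when it was recorded, else from start_time_ns.
 */
static uint64_t rq_total_ns(const struct rq_times *t, uint64_t now_ns)
{
	uint64_t begin = t->pre_start_time_ns ?
			 t->pre_start_time_ns : t->start_time_ns;

	return now_ns > begin ? now_ns - begin : 0;
}
```

With the flag unset, rq_total_ns() degrades to the existing
start_time_ns based measurement, which is why the field can be optional.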
 block/blk-mq.c         | 11 +++++++++--
 include/linux/blkdev.h |  7 ++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ce0f5f4ede70..25ce27434c63 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -291,7 +291,7 @@ static inline bool blk_mq_need_time_stamp(struct request *rq)
 }
 
 static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
-		unsigned int tag, unsigned int op)
+		unsigned int tag, unsigned int op, u64 pre_start_time_ns)
 {
 	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
 	struct request *rq = tags->static_rqs[tag];
@@ -325,6 +325,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	RB_CLEAR_NODE(&rq->rb_node);
 	rq->rq_disk = NULL;
 	rq->part = NULL;
+	rq->pre_start_time_ns = pre_start_time_ns;
 	if (blk_mq_need_time_stamp(rq))
 		rq->start_time_ns = ktime_get_ns();
 	else
@@ -356,8 +357,14 @@ static struct request *blk_mq_get_request(struct request_queue *q,
 	struct request *rq;
 	unsigned int tag;
 	bool put_ctx_on_error = false;
+	u64 pre_start_time_ns = 0;
 
 	blk_queue_enter_live(q);
+
+	/* pre_start_time includes depth and tag waits */
+	if (blk_queue_rec_prestart(q))
+		pre_start_time_ns = ktime_get_ns();
+
 	data->q = q;
 	if (likely(!data->ctx)) {
 		data->ctx = blk_mq_get_ctx(q);
@@ -395,7 +402,7 @@ static struct request *blk_mq_get_request(struct request_queue *q,
 		return NULL;
 	}
 
-	rq = blk_mq_rq_ctx_init(data, tag, data->cmd_flags);
+	rq = blk_mq_rq_ctx_init(data, tag, data->cmd_flags, pre_start_time_ns);
 	if (!op_is_flush(data->cmd_flags)) {
 		rq->elv.icq = NULL;
 		if (e && e->type->ops.prepare_request) {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 592669bcc536..ff72eb940d4c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -194,7 +194,9 @@ struct request {
 
 	struct gendisk *rq_disk;
 	struct hd_struct *part;
-	/* Time that I/O was submitted to the kernel. */
+	/* Time that the first bio started allocating this request. */
+	u64 pre_start_time_ns;
+	/* Time that this request was allocated for this IO. */
 	u64 start_time_ns;
 	/* Time that I/O was submitted to the device. */
 	u64 io_start_time_ns;
@@ -606,6 +608,7 @@ struct request_queue {
 #define QUEUE_FLAG_SCSI_PASSTHROUGH 23	/* queue supports SCSI commands */
 #define QUEUE_FLAG_QUIESCED	24	/* queue has been quiesced */
 #define QUEUE_FLAG_PCI_P2PDMA	25	/* device supports PCI p2p requests */
+#define QUEUE_FLAG_REC_PRESTART	26	/* record pre_start_time_ns */
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP))
@@ -632,6 +635,8 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 	test_bit(QUEUE_FLAG_SCSI_PASSTHROUGH, &(q)->queue_flags)
 #define blk_queue_pci_p2pdma(q)	\
 	test_bit(QUEUE_FLAG_PCI_P2PDMA, &(q)->queue_flags)
+#define blk_queue_rec_prestart(q)	\
+	test_bit(QUEUE_FLAG_REC_PRESTART, &(q)->queue_flags)
 
 #define blk_noretry_request(rq) \
 	((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
--
2.17.1