* [RFC PATCH 0/4] block: add two statistic tables
@ 2020-07-08  7:58 Guoqing Jiang
  2020-07-08  7:58 ` [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct Guoqing Jiang
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08  7:58 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, Guoqing Jiang

Hi,

This patchset introduces some additional io stats for latency and sector;
with those tables, we can get a better picture of the io pattern. We also
want disk_start_io_acct to return ns precision instead of converting from
jiffies, so some code in drbd is changed accordingly.

In the tables below, the first row counts the IOs which are less than or
equal to its value, and the last row counts the IOs which are greater than
or equal to its value. Each row in between counts the IOs that fall into the
half-open range starting at its value. The four columns are the per-op stat
groups: read, write, discard and flush.
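
As an illustration, below is a standalone userspace sketch (not the kernel
code itself) of how a value is mapped to a row: the row index is ilog2 of
the value, clamped to the last row. The io_size table applies the same
mapping to KB = sectors / 2.

#include <stdio.h>

#define ADD_STAT_NUM 12	/* rows 1, 2, 4, ..., 1024, 2048 with HZ=1000 */

static unsigned int table_row(unsigned long val)
{
	unsigned int idx = 0;

	/* open-coded ilog2(), clamped to the last row */
	while (val > 1 && idx < ADD_STAT_NUM - 1) {
		val >>= 1;
		idx++;
	}
	return idx;
}

int main(void)
{
	/* a 3 ms IO lands in row 1, i.e. the "2 ms" line: [2 ms, 4 ms) */
	printf("3 ms -> row %u\n", table_row(3));
	return 0;
}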

With HZ=1000.
$ cat /sys/block/md127/io_latency
     1 ms: 3 0 0 0     - means 3 read IOs finished in 1 ms or less
     2 ms: 1 0 0 0     - means 1 read IO finished in the range [2 ms, 4 ms)
     4 ms: 0 0 0 0
     8 ms: 0 0 0 0
    16 ms: 1 0 0 0
    32 ms: 0 0 0 0
    64 ms: 0 0 0 0
   128 ms: 0 0 0 0
   256 ms: 0 0 0 0
   512 ms: 0 0 0 0
  1024 ms: 1 0 0 0
  2048 ms: 1 0 0 0     - means 1 read IO finished in 2048 ms or more

And with HZ=100.
$ cat /sys/block/md127/io_latency
     10 ms: 3 0 0 0
     20 ms: 1 0 0 0
     40 ms: 0 0 0 0
     80 ms: 0 0 0 0
    160 ms: 1 0 0 0
    320 ms: 0 0 0 0
    640 ms: 0 0 0 0
   1280 ms: 0 0 0 0
   2560 ms: 0 0 0 0
   5120 ms: 0 0 0 0
  10240 ms: 1 0 0 0
  20480 ms: 1 0 0 0


$ cat /sys/block/md127/io_size
     1 KB: 0 0 0 0
     2 KB: 0 0 0 0
     4 KB: 0 0 0 0
     8 KB: 5 0 0 0
    16 KB: 0 0 0 0
    32 KB: 0 0 0 0
    64 KB: 0 0 0 0
   128 KB: 0 0 0 0
   256 KB: 0 0 0 0
   512 KB: 0 0 0 0
  1024 KB: 0 0 0 0
  2048 KB: 0 0 0 0

I will add documentation if the two tables are considered worth adding;
review and comments are welcome.

Thanks,
Guoqing

Guoqing Jiang (5):
  block: return ns precision from disk_start_io_acct
  drbd: remove unused argument from drbd_request_prepare and
    __drbd_make_request
  drbd: rename start_jif to start_ns
  block: add a statistic table for io latency
  block: add a statistic table for io sector

 block/Kconfig                     |  9 +++++
 block/blk-core.c                  | 61 +++++++++++++++++++++++++++++--
 block/genhd.c                     | 47 ++++++++++++++++++++++++
 drivers/block/drbd/drbd_debugfs.c |  8 ++--
 drivers/block/drbd/drbd_int.h     |  4 +-
 drivers/block/drbd/drbd_main.c    |  3 +-
 drivers/block/drbd/drbd_req.c     | 15 +++-----
 include/linux/part_stat.h         |  8 ++++
 8 files changed, 135 insertions(+), 20 deletions(-)


* [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct
  2020-07-08  7:58 [RFC PATCH 0/4] block: add two statistic tables Guoqing Jiang
@ 2020-07-08  7:58 ` Guoqing Jiang
  2020-07-08 13:27   ` Ming Lei
  2020-07-08  7:58 ` [PATCH RFC 2/5] drbd: remove unused argument from drbd_request_prepare and __drbd_make_request Guoqing Jiang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08  7:58 UTC (permalink / raw)
  To: axboe
  Cc: linux-block, Guoqing Jiang, Philipp Reisner, Lars Ellenberg, drbd-dev

Currently the duration accounting of bio based drivers is converted from
jiffies to ns, which means it can be less accurate than that of request
based drivers.

So let disk_start_io_acct return ns precision, instead of converting
jiffies to ns in disk_end_io_acct.

Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: drbd-dev@lists.linbit.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
 block/blk-core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d9d632639bd1..0e806a8c62fb 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1466,6 +1466,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
 	struct hd_struct *part = &disk->part0;
 	const int sgrp = op_stat_group(op);
 	unsigned long now = READ_ONCE(jiffies);
+	unsigned long start_ns = ktime_get_ns();
 
 	part_stat_lock();
 	update_io_ticks(part, now, false);
@@ -1474,7 +1475,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
 	part_stat_local_inc(part, in_flight[op_is_write(op)]);
 	part_stat_unlock();
 
-	return now;
+	return start_ns;
 }
 EXPORT_SYMBOL(disk_start_io_acct);
 
@@ -1484,11 +1485,11 @@ void disk_end_io_acct(struct gendisk *disk, unsigned int op,
 	struct hd_struct *part = &disk->part0;
 	const int sgrp = op_stat_group(op);
 	unsigned long now = READ_ONCE(jiffies);
-	unsigned long duration = now - start_time;
+	unsigned long duration = ktime_get_ns() - start_time;
 
 	part_stat_lock();
 	update_io_ticks(part, now, true);
-	part_stat_add(part, nsecs[sgrp], jiffies_to_nsecs(duration));
+	part_stat_add(part, nsecs[sgrp], duration);
 	part_stat_local_dec(part, in_flight[op_is_write(op)]);
 	part_stat_unlock();
 }
-- 
2.17.1



* [PATCH RFC 2/5] drbd: remove unused argument from drbd_request_prepare and __drbd_make_request
  2020-07-08  7:58 [RFC PATCH 0/4] block: add two statistic tables Guoqing Jiang
  2020-07-08  7:58 ` [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct Guoqing Jiang
@ 2020-07-08  7:58 ` Guoqing Jiang
  2020-07-08  7:58 ` [PATCH RFC 3/5] drbd: rename start_jif to start_ns Guoqing Jiang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08  7:58 UTC (permalink / raw)
  To: axboe
  Cc: linux-block, Guoqing Jiang, Philipp Reisner, Lars Ellenberg, drbd-dev

We can remove start_jif since it is not used by drbd_request_prepare,
and then remove it from __drbd_make_request as well.

Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: drbd-dev@lists.linbit.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
This was sent before; now it seems better to include it in this thread.

https://marc.info/?l=linux-block&m=159369644305325&w=2


 drivers/block/drbd/drbd_int.h  |  2 +-
 drivers/block/drbd/drbd_main.c |  3 +--
 drivers/block/drbd/drbd_req.c  | 11 ++++-------
 3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index fe6cb99eb917..aacd2010b555 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1450,7 +1450,7 @@ extern void conn_free_crypto(struct drbd_connection *connection);
 
 /* drbd_req */
 extern void do_submit(struct work_struct *ws);
-extern void __drbd_make_request(struct drbd_device *, struct bio *, unsigned long);
+extern void __drbd_make_request(struct drbd_device *, struct bio *);
 extern blk_qc_t drbd_submit_bio(struct bio *bio);
 extern int drbd_read_remote(struct drbd_device *device, struct drbd_request *req);
 extern int is_valid_ar_handle(struct drbd_request *, sector_t);
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 7c34cc0ad8cc..42f2a235417c 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2293,7 +2293,6 @@ static void do_retry(struct work_struct *ws)
 	list_for_each_entry_safe(req, tmp, &writes, tl_requests) {
 		struct drbd_device *device = req->device;
 		struct bio *bio = req->master_bio;
-		unsigned long start_jif = req->start_jif;
 		bool expected;
 
 		expected =
@@ -2328,7 +2327,7 @@ static void do_retry(struct work_struct *ws)
 		/* We are not just doing submit_bio_noacct(),
 		 * as we want to keep the start_time information. */
 		inc_ap_bio(device);
-		__drbd_make_request(device, bio, start_jif);
+		__drbd_make_request(device, bio);
 	}
 }
 
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 674be09b2da9..f705128b4f27 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1188,7 +1188,7 @@ static void drbd_queue_write(struct drbd_device *device, struct drbd_request *re
  * Returns ERR_PTR(-ENOMEM) if we cannot allocate a drbd_request.
  */
 static struct drbd_request *
-drbd_request_prepare(struct drbd_device *device, struct bio *bio, unsigned long start_jif)
+drbd_request_prepare(struct drbd_device *device, struct bio *bio)
 {
 	const int rw = bio_data_dir(bio);
 	struct drbd_request *req;
@@ -1416,9 +1416,9 @@ static void drbd_send_and_submit(struct drbd_device *device, struct drbd_request
 		complete_master_bio(device, &m);
 }
 
-void __drbd_make_request(struct drbd_device *device, struct bio *bio, unsigned long start_jif)
+void __drbd_make_request(struct drbd_device *device, struct bio *bio)
 {
-	struct drbd_request *req = drbd_request_prepare(device, bio, start_jif);
+	struct drbd_request *req = drbd_request_prepare(device, bio);
 	if (IS_ERR_OR_NULL(req))
 		return;
 	drbd_send_and_submit(device, req);
@@ -1596,19 +1596,16 @@ void do_submit(struct work_struct *ws)
 blk_qc_t drbd_submit_bio(struct bio *bio)
 {
 	struct drbd_device *device = bio->bi_disk->private_data;
-	unsigned long start_jif;
 
 	blk_queue_split(&bio);
 
-	start_jif = jiffies;
-
 	/*
 	 * what we "blindly" assume:
 	 */
 	D_ASSERT(device, IS_ALIGNED(bio->bi_iter.bi_size, 512));
 
 	inc_ap_bio(device);
-	__drbd_make_request(device, bio, start_jif);
+	__drbd_make_request(device, bio);
 	return BLK_QC_T_NONE;
 }
 
-- 
2.17.1



* [PATCH RFC 3/5] drbd: rename start_jif to start_ns
  2020-07-08  7:58 [RFC PATCH 0/4] block: add two statistic tables Guoqing Jiang
  2020-07-08  7:58 ` [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct Guoqing Jiang
  2020-07-08  7:58 ` [PATCH RFC 2/5] drbd: remove unused argument from drbd_request_prepare and __drbd_make_request Guoqing Jiang
@ 2020-07-08  7:58 ` Guoqing Jiang
  2020-07-08  7:58 ` [PATCH RFC 4/5] block: add a statistic table for io latency Guoqing Jiang
  2020-07-08  7:58 ` [PATCH RFC 5/5] block: add a statistic table for io sector Guoqing Jiang
  4 siblings, 0 replies; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08  7:58 UTC (permalink / raw)
  To: axboe
  Cc: linux-block, Guoqing Jiang, Philipp Reisner, Lars Ellenberg, drbd-dev

Let's rename start_jif to start_ns to reflect that bio_start_io_acct
returns ns precision now.

Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: drbd-dev@lists.linbit.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
 drivers/block/drbd/drbd_debugfs.c | 8 ++++----
 drivers/block/drbd/drbd_int.h     | 2 +-
 drivers/block/drbd/drbd_req.c     | 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/block/drbd/drbd_debugfs.c b/drivers/block/drbd/drbd_debugfs.c
index b3b9cd5628fd..672fd7e6587d 100644
--- a/drivers/block/drbd/drbd_debugfs.c
+++ b/drivers/block/drbd/drbd_debugfs.c
@@ -105,7 +105,7 @@ static void seq_print_one_request(struct seq_file *m, struct drbd_request *req,
 		(s & RQ_WRITE) ? "W" : "R");
 
 #define RQ_HDR_2 "\tstart\tin AL\tsubmit"
-	seq_printf(m, "\t%d", jiffies_to_msecs(now - req->start_jif));
+	seq_printf(m, "\t%llu", (ktime_get_ns() - req->start_ns) / NSEC_PER_MSEC);
 	seq_print_age_or_dash(m, s & RQ_IN_ACT_LOG, now - req->in_actlog_jif);
 	seq_print_age_or_dash(m, s & RQ_LOCAL_PENDING, now - req->pre_submit_jif);
 
@@ -161,7 +161,7 @@ static void seq_print_waiting_for_AL(struct seq_file *m, struct drbd_resource *r
 	seq_puts(m, "minor\tvnr\tage\t#waiting\n");
 	rcu_read_lock();
 	idr_for_each_entry(&resource->devices, device, i) {
-		unsigned long jif;
+		unsigned long ns;
 		struct drbd_request *req;
 		int n = atomic_read(&device->ap_actlog_cnt);
 		if (n) {
@@ -171,7 +171,7 @@ static void seq_print_waiting_for_AL(struct seq_file *m, struct drbd_resource *r
 			/* if the oldest request does not wait for the activity log
 			 * it is not interesting for us here */
 			if (req && !(req->rq_state & RQ_IN_ACT_LOG))
-				jif = req->start_jif;
+				ns = req->start_ns;
 			else
 				req = NULL;
 			spin_unlock_irq(&device->resource->req_lock);
@@ -179,7 +179,7 @@ static void seq_print_waiting_for_AL(struct seq_file *m, struct drbd_resource *r
 		if (n) {
 			seq_printf(m, "%u\t%u\t", device->minor, device->vnr);
 			if (req)
-				seq_printf(m, "%u\t", jiffies_to_msecs(now - jif));
+				seq_printf(m, "%llu\t", (ktime_get_ns() - ns) / NSEC_PER_MSEC);
 			else
 				seq_puts(m, "-\t");
 			seq_printf(m, "%u\n", n);
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index aacd2010b555..467d96316230 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -301,7 +301,7 @@ struct drbd_request {
 	struct list_head req_pending_local;
 
 	/* for generic IO accounting */
-	unsigned long start_jif;
+	unsigned long start_ns;
 
 	/* for DRBD internal statistics */
 
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index f705128b4f27..6ad6b4470ebd 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -245,7 +245,7 @@ void drbd_req_complete(struct drbd_request *req, struct bio_and_error *m)
 		start_new_tl_epoch(first_peer_device(device)->connection);
 
 	/* Update disk stats */
-	bio_end_io_acct(req->master_bio, req->start_jif);
+	bio_end_io_acct(req->master_bio, req->start_ns);
 
 	/* If READ failed,
 	 * have it be pushed back to the retry work queue,
@@ -1206,7 +1206,7 @@ drbd_request_prepare(struct drbd_device *device, struct bio *bio)
 	}
 
 	/* Update disk stats */
-	req->start_jif = bio_start_io_acct(req->master_bio);
+	req->start_ns = bio_start_io_acct(req->master_bio);
 
 	if (!get_ldev(device)) {
 		bio_put(req->private_bio);
-- 
2.17.1



* [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-08  7:58 [RFC PATCH 0/4] block: add two statistic tables Guoqing Jiang
                   ` (2 preceding siblings ...)
  2020-07-08  7:58 ` [PATCH RFC 3/5] drbd: rename start_jif to start_ns Guoqing Jiang
@ 2020-07-08  7:58 ` Guoqing Jiang
  2020-07-08 13:29   ` Ming Lei
  2020-07-08  7:58 ` [PATCH RFC 5/5] block: add a statistic table for io sector Guoqing Jiang
  4 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08  7:58 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, Guoqing Jiang, Florian-Ewald Mueller

Usually, we get the status of a block device by reading its stat file,
but that file only provides the total time. We would like to have more
fine-grained statistics, such as the count per latency range, which
helps people diagnose whether there is a hardware issue.

Also, a new config option is introduced to control whether people want
the additional statistics or not; the option is also used for the io
sector table in the next patch.

Signed-off-by: Florian-Ewald Mueller <florian-ewald.mueller@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
 block/Kconfig             |  8 ++++++++
 block/blk-core.c          | 35 +++++++++++++++++++++++++++++++++++
 block/genhd.c             | 26 ++++++++++++++++++++++++++
 include/linux/part_stat.h |  7 +++++++
 4 files changed, 76 insertions(+)

diff --git a/block/Kconfig b/block/Kconfig
index 9357d7302398..dba71feaa85b 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -175,6 +175,14 @@ config BLK_DEBUG_FS
 	Unless you are building a kernel for a tiny system, you should
 	say Y here.
 
+config BLK_ADDITIONAL_DISKSTAT
+	bool "Block layer additional diskstat"
+	default n
+	help
+	Enabling this option adds io latency statistics for each block device.
+
+	If unsure, say N.
+
 config BLK_DEBUG_FS_ZONED
        bool
        default BLK_DEBUG_FS && BLK_DEV_ZONED
diff --git a/block/blk-core.c b/block/blk-core.c
index 0e806a8c62fb..7a129c8f1b23 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1411,6 +1411,39 @@ static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
 	}
 }
 
+#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
+/*
+ * Either account additional stat for request if req is not NULL or account for bio.
+ */
+static void blk_additional_latency(struct hd_struct *part, const int sgrp,
+				   struct request *req, unsigned long start_ns)
+{
+	unsigned int idx;
+	unsigned long duration, now = ktime_get_ns();
+
+	if (req)
+		duration = (now - req->start_time_ns) / NSEC_PER_MSEC;
+	else
+		duration = (now - start_ns) / NSEC_PER_MSEC;
+
+	duration /= HZ_TO_MSEC_NUM;
+	if (likely(duration > 0)) {
+		idx = ilog2(duration);
+		if (idx > ADD_STAT_NUM - 1)
+			idx = ADD_STAT_NUM - 1;
+	} else
+		idx = 0;
+	part_stat_inc(part, latency_table[idx][sgrp]);
+
+}
+#else
+static void blk_additional_latency(struct hd_struct *part, const int sgrp,
+				   struct request *req, unsigned long start_jiffies)
+
+{
+}
+#endif
+
 static void blk_account_io_completion(struct request *req, unsigned int bytes)
 {
 	if (req->part && blk_do_io_stat(req)) {
@@ -1440,6 +1473,7 @@ void blk_account_io_done(struct request *req, u64 now)
 		part = req->part;
 
 		update_io_ticks(part, jiffies, true);
+		blk_additional_latency(part, sgrp, req, 0);
 		part_stat_inc(part, ios[sgrp]);
 		part_stat_add(part, nsecs[sgrp], now - req->start_time_ns);
 		part_stat_unlock();
@@ -1489,6 +1523,7 @@ void disk_end_io_acct(struct gendisk *disk, unsigned int op,
 
 	part_stat_lock();
 	update_io_ticks(part, now, true);
+	blk_additional_latency(part, sgrp, NULL, start_time);
 	part_stat_add(part, nsecs[sgrp], duration);
 	part_stat_local_dec(part, in_flight[op_is_write(op)]);
 	part_stat_unlock();
diff --git a/block/genhd.c b/block/genhd.c
index 60ae4e1b4d38..a33937a74fb1 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1420,6 +1420,29 @@ static struct device_attribute dev_attr_fail_timeout =
 	__ATTR(io-timeout-fail, 0644, part_timeout_show, part_timeout_store);
 #endif
 
+#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
+static ssize_t io_latency_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct hd_struct *p = dev_to_part(dev);
+	size_t count = 0;
+	int i, sgrp;
+
+	for (i = 0; i < ADD_STAT_NUM; i++) {
+		count += scnprintf(buf + count, PAGE_SIZE - count, "%5d ms: ",
+				   (1 << i) * HZ_TO_MSEC_NUM);
+		for (sgrp = 0; sgrp < NR_STAT_GROUPS; sgrp++)
+			count += scnprintf(buf + count, PAGE_SIZE - count, "%lu ",
+					   part_stat_read(p, latency_table[i][sgrp]));
+		count += scnprintf(buf + count, PAGE_SIZE - count, "\n");
+	}
+
+	return count;
+}
+
+static struct device_attribute dev_attr_io_latency =
+	__ATTR(io_latency, 0444, io_latency_show, NULL);
+#endif
+
 static struct attribute *disk_attrs[] = {
 	&dev_attr_range.attr,
 	&dev_attr_ext_range.attr,
@@ -1438,6 +1461,9 @@ static struct attribute *disk_attrs[] = {
 #endif
 #ifdef CONFIG_FAIL_IO_TIMEOUT
 	&dev_attr_fail_timeout.attr,
+#endif
+#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
+	&dev_attr_io_latency.attr,
 #endif
 	NULL
 };
diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
index 24125778ef3e..fe3def8c69d7 100644
--- a/include/linux/part_stat.h
+++ b/include/linux/part_stat.h
@@ -9,6 +9,13 @@ struct disk_stats {
 	unsigned long sectors[NR_STAT_GROUPS];
 	unsigned long ios[NR_STAT_GROUPS];
 	unsigned long merges[NR_STAT_GROUPS];
+#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
+/*
+ * We measure latency (ms) for 1, 2, ..., 1024 and >=1024.
+ */
+#define ADD_STAT_NUM	12
+	unsigned long latency_table[ADD_STAT_NUM][NR_STAT_GROUPS];
+#endif
 	unsigned long io_ticks;
 	local_t in_flight[2];
 };
-- 
2.17.1



* [PATCH RFC 5/5] block: add a statistic table for io sector
  2020-07-08  7:58 [RFC PATCH 0/4] block: add two statistic tables Guoqing Jiang
                   ` (3 preceding siblings ...)
  2020-07-08  7:58 ` [PATCH RFC 4/5] block: add a statistic table for io latency Guoqing Jiang
@ 2020-07-08  7:58 ` Guoqing Jiang
  4 siblings, 0 replies; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08  7:58 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, Guoqing Jiang, Florian-Ewald Mueller

With the sector table, we can see the distribution of IO sizes issued
from the upper layer, which gives us the opportunity to tune performance
based on the most commonly issued IOs.

Signed-off-by: Florian-Ewald Mueller <florian-ewald.mueller@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
 block/Kconfig             |  3 ++-
 block/blk-core.c          | 19 +++++++++++++++++++
 block/genhd.c             | 21 +++++++++++++++++++++
 include/linux/part_stat.h |  3 ++-
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/block/Kconfig b/block/Kconfig
index dba71feaa85b..f1c4800969ef 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -179,7 +179,8 @@ config BLK_ADDITIONAL_DISKSTAT
 	bool "Block layer additional diskstat"
 	default n
 	help
-	Enabling this option adds io latency statistics for each block device.
+	Enabling this option adds io latency and io size statistics for each
+	block device.
 
 	If unsure, say N.
 
diff --git a/block/blk-core.c b/block/blk-core.c
index 7a129c8f1b23..d3c9f727926f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1436,12 +1436,29 @@ static void blk_additional_latency(struct hd_struct *part, const int sgrp,
 	part_stat_inc(part, latency_table[idx][sgrp]);
 
 }
+
+static void blk_additional_sector(struct hd_struct *part, const int sgrp,
+				  unsigned int sectors)
+{
+	unsigned int KB, idx;
+
+	KB = sectors / 2;
+	idx = (KB > 0) ? ilog2(KB) : 0;
+
+	idx = (idx > (ADD_STAT_NUM - 1)) ? (ADD_STAT_NUM - 1) : idx;
+	part_stat_inc(part, size_table[idx][sgrp]);
+}
 #else
 static void blk_additional_latency(struct hd_struct *part, const int sgrp,
 				   struct request *req, unsigned long start_jiffies)
 
 {
 }
+
+static void blk_additional_sector(struct hd_struct *part, const int sgrp,
+				  unsigned int sectors)
+{
+}
 #endif
 
 static void blk_account_io_completion(struct request *req, unsigned int bytes)
@@ -1452,6 +1469,7 @@ static void blk_account_io_completion(struct request *req, unsigned int bytes)
 
 		part_stat_lock();
 		part = req->part;
+		blk_additional_sector(part, sgrp, bytes >> SECTOR_SHIFT);
 		part_stat_add(part, sectors[sgrp], bytes >> 9);
 		part_stat_unlock();
 	}
@@ -1506,6 +1524,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
 	update_io_ticks(part, now, false);
 	part_stat_inc(part, ios[sgrp]);
 	part_stat_add(part, sectors[sgrp], sectors);
+	blk_additional_sector(part, sgrp, sectors);
 	part_stat_local_inc(part, in_flight[op_is_write(op)]);
 	part_stat_unlock();
 
diff --git a/block/genhd.c b/block/genhd.c
index a33937a74fb1..68e4735662f3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1441,6 +1441,26 @@ static ssize_t io_latency_show(struct device *dev, struct device_attribute *attr
 
 static struct device_attribute dev_attr_io_latency =
 	__ATTR(io_latency, 0444, io_latency_show, NULL);
+
+static ssize_t io_size_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct hd_struct *p = dev_to_part(dev);
+	size_t count = 0;
+	int i, sgrp;
+
+	for (i = 0; i < ADD_STAT_NUM; i++) {
+		count += scnprintf(buf + count, PAGE_SIZE - count, "%5d KB: ", 1 << i);
+		for (sgrp = 0; sgrp < NR_STAT_GROUPS; sgrp++)
+			count += scnprintf(buf + count, PAGE_SIZE - count, "%lu ",
+					   part_stat_read(p, size_table[i][sgrp]));
+		count += scnprintf(buf + count, PAGE_SIZE - count, "\n");
+	}
+
+	return count;
+}
+
+static struct device_attribute dev_attr_io_size =
+	__ATTR(io_size, 0444, io_size_show, NULL);
 #endif
 
 static struct attribute *disk_attrs[] = {
@@ -1464,6 +1484,7 @@ static struct attribute *disk_attrs[] = {
 #endif
 #ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
 	&dev_attr_io_latency.attr,
+	&dev_attr_io_size.attr,
 #endif
 	NULL
 };
diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
index fe3def8c69d7..2b056cd70d1f 100644
--- a/include/linux/part_stat.h
+++ b/include/linux/part_stat.h
@@ -11,10 +11,11 @@ struct disk_stats {
 	unsigned long merges[NR_STAT_GROUPS];
 #ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
 /*
- * We measure latency (ms) for 1, 2, ..., 1024 and >=1024.
+ * We measure latency (ms) and size (sector) for 1, 2, ..., 1024 and >=1024.
  */
 #define ADD_STAT_NUM	12
 	unsigned long latency_table[ADD_STAT_NUM][NR_STAT_GROUPS];
+	unsigned long size_table[ADD_STAT_NUM][NR_STAT_GROUPS];
 #endif
 	unsigned long io_ticks;
 	local_t in_flight[2];
-- 
2.17.1



* Re: [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct
  2020-07-08  7:58 ` [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct Guoqing Jiang
@ 2020-07-08 13:27   ` Ming Lei
  2020-07-08 13:53     ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2020-07-08 13:27 UTC (permalink / raw)
  To: Guoqing Jiang
  Cc: axboe, linux-block, Philipp Reisner, Lars Ellenberg, drbd-dev

On Wed, Jul 08, 2020 at 09:58:15AM +0200, Guoqing Jiang wrote:
> Currently the duration accounting of bio based drivers is converted from
> jiffies to ns, which means it can be less accurate than that of request
> based drivers.
> 
> So let disk_start_io_acct return ns precision, instead of converting
> jiffies to ns in disk_end_io_acct.
> 
> Cc: Philipp Reisner <philipp.reisner@linbit.com>
> Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
> Cc: drbd-dev@lists.linbit.com
> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> ---
>  block/blk-core.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index d9d632639bd1..0e806a8c62fb 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1466,6 +1466,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
>  	struct hd_struct *part = &disk->part0;
>  	const int sgrp = op_stat_group(op);
>  	unsigned long now = READ_ONCE(jiffies);
> +	unsigned long start_ns = ktime_get_ns();
>  
>  	part_stat_lock();
>  	update_io_ticks(part, now, false);
> @@ -1474,7 +1475,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
>  	part_stat_local_inc(part, in_flight[op_is_write(op)]);
>  	part_stat_unlock();
>  
> -	return now;
> +	return start_ns;
>  }
>  EXPORT_SYMBOL(disk_start_io_acct);
>  
> @@ -1484,11 +1485,11 @@ void disk_end_io_acct(struct gendisk *disk, unsigned int op,
>  	struct hd_struct *part = &disk->part0;
>  	const int sgrp = op_stat_group(op);
>  	unsigned long now = READ_ONCE(jiffies);
> -	unsigned long duration = now - start_time;
> +	unsigned long duration = ktime_get_ns() - start_time;
>  
>  	part_stat_lock();
>  	update_io_ticks(part, now, true);
> -	part_stat_add(part, nsecs[sgrp], jiffies_to_nsecs(duration));
> +	part_stat_add(part, nsecs[sgrp], duration);
>  	part_stat_local_dec(part, in_flight[op_is_write(op)]);
>  	part_stat_unlock();

Hi Guoqing,

The cost of ktime_get_ns() can be observed to be not cheap on high IOPS
devices, so I am not sure the conversion is good. Also, could you share
what benefit we get from this change?

Thanks,
Ming



* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-08  7:58 ` [PATCH RFC 4/5] block: add a statistic table for io latency Guoqing Jiang
@ 2020-07-08 13:29   ` Ming Lei
  2020-07-08 14:02     ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2020-07-08 13:29 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: axboe, linux-block, Florian-Ewald Mueller

On Wed, Jul 08, 2020 at 09:58:18AM +0200, Guoqing Jiang wrote:
> Usually, we get the status of a block device by reading its stat file,
> but that file only provides the total time. We would like to have more
> fine-grained statistics, such as the count per latency range, which
> helps people diagnose whether there is a hardware issue.
> 
> Also, a new config option is introduced to control whether people want
> the additional statistics or not; the option is also used for the io
> sector table in the next patch.
> 
> Signed-off-by: Florian-Ewald Mueller <florian-ewald.mueller@cloud.ionos.com>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> ---
>  block/Kconfig             |  8 ++++++++
>  block/blk-core.c          | 35 +++++++++++++++++++++++++++++++++++
>  block/genhd.c             | 26 ++++++++++++++++++++++++++
>  include/linux/part_stat.h |  7 +++++++
>  4 files changed, 76 insertions(+)
> 
> diff --git a/block/Kconfig b/block/Kconfig
> index 9357d7302398..dba71feaa85b 100644
> --- a/block/Kconfig
> +++ b/block/Kconfig
> @@ -175,6 +175,14 @@ config BLK_DEBUG_FS
>  	Unless you are building a kernel for a tiny system, you should
>  	say Y here.
>  
> +config BLK_ADDITIONAL_DISKSTAT
> +	bool "Block layer additional diskstat"
> +	default n
> +	help
> +	Enabling this option adds io latency statistics for each block device.
> +
> +	If unsure, say N.
> +
>  config BLK_DEBUG_FS_ZONED
>         bool
>         default BLK_DEBUG_FS && BLK_DEV_ZONED
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 0e806a8c62fb..7a129c8f1b23 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1411,6 +1411,39 @@ static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
>  	}
>  }
>  
> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
> +/*
> + * Either account additional stat for request if req is not NULL or account for bio.
> + */
> +static void blk_additional_latency(struct hd_struct *part, const int sgrp,
> +				   struct request *req, unsigned long start_ns)
> +{
> +	unsigned int idx;
> +	unsigned long duration, now = ktime_get_ns();
> +
> +	if (req)
> +		duration = (now - req->start_time_ns) / NSEC_PER_MSEC;
> +	else
> +		duration = (now - start_ns) / NSEC_PER_MSEC;
> +
> +	duration /= HZ_TO_MSEC_NUM;
> +	if (likely(duration > 0)) {
> +		idx = ilog2(duration);
> +		if (idx > ADD_STAT_NUM - 1)
> +			idx = ADD_STAT_NUM - 1;
> +	} else
> +		idx = 0;
> +	part_stat_inc(part, latency_table[idx][sgrp]);
> +
> +}
> +#else
> +static void blk_additional_latency(struct hd_struct *part, const int sgrp,
> +				   struct request *req, unsigned long start_jiffies)
> +
> +{
> +}
> +#endif
> +
>  static void blk_account_io_completion(struct request *req, unsigned int bytes)
>  {
>  	if (req->part && blk_do_io_stat(req)) {
> @@ -1440,6 +1473,7 @@ void blk_account_io_done(struct request *req, u64 now)
>  		part = req->part;
>  
>  		update_io_ticks(part, jiffies, true);
> +		blk_additional_latency(part, sgrp, req, 0);
>  		part_stat_inc(part, ios[sgrp]);
>  		part_stat_add(part, nsecs[sgrp], now - req->start_time_ns);
>  		part_stat_unlock();
> @@ -1489,6 +1523,7 @@ void disk_end_io_acct(struct gendisk *disk, unsigned int op,
>  
>  	part_stat_lock();
>  	update_io_ticks(part, now, true);
> +	blk_additional_latency(part, sgrp, NULL, start_time);
>  	part_stat_add(part, nsecs[sgrp], duration);
>  	part_stat_local_dec(part, in_flight[op_is_write(op)]);
>  	part_stat_unlock();
> diff --git a/block/genhd.c b/block/genhd.c
> index 60ae4e1b4d38..a33937a74fb1 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -1420,6 +1420,29 @@ static struct device_attribute dev_attr_fail_timeout =
>  	__ATTR(io-timeout-fail, 0644, part_timeout_show, part_timeout_store);
>  #endif
>  
> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
> +static ssize_t io_latency_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> +	struct hd_struct *p = dev_to_part(dev);
> +	size_t count = 0;
> +	int i, sgrp;
> +
> +	for (i = 0; i < ADD_STAT_NUM; i++) {
> +		count += scnprintf(buf + count, PAGE_SIZE - count, "%5d ms: ",
> +				   (1 << i) * HZ_TO_MSEC_NUM);
> +		for (sgrp = 0; sgrp < NR_STAT_GROUPS; sgrp++)
> +			count += scnprintf(buf + count, PAGE_SIZE - count, "%lu ",
> +					   part_stat_read(p, latency_table[i][sgrp]));
> +		count += scnprintf(buf + count, PAGE_SIZE - count, "\n");
> +	}
> +
> +	return count;
> +}
> +
> +static struct device_attribute dev_attr_io_latency =
> +	__ATTR(io_latency, 0444, io_latency_show, NULL);
> +#endif
> +
>  static struct attribute *disk_attrs[] = {
>  	&dev_attr_range.attr,
>  	&dev_attr_ext_range.attr,
> @@ -1438,6 +1461,9 @@ static struct attribute *disk_attrs[] = {
>  #endif
>  #ifdef CONFIG_FAIL_IO_TIMEOUT
>  	&dev_attr_fail_timeout.attr,
> +#endif
> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
> +	&dev_attr_io_latency.attr,
>  #endif
>  	NULL
>  };
> diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
> index 24125778ef3e..fe3def8c69d7 100644
> --- a/include/linux/part_stat.h
> +++ b/include/linux/part_stat.h
> @@ -9,6 +9,13 @@ struct disk_stats {
>  	unsigned long sectors[NR_STAT_GROUPS];
>  	unsigned long ios[NR_STAT_GROUPS];
>  	unsigned long merges[NR_STAT_GROUPS];
> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
> +/*
> + * We measure latency (ms) for 1, 2, ..., 1024 and >=1024.
> + */
> +#define ADD_STAT_NUM	12
> +	unsigned long latency_table[ADD_STAT_NUM][NR_STAT_GROUPS];
> +#endif
>  	unsigned long io_ticks;
>  	local_t in_flight[2];
>  };
> -- 
> 2.17.1
> 

Hi Guoqing,

I believe it isn't hard to write an ebpf based script (bcc or bpftrace)
to collect this kind of performance data, so it looks unnecessary to do
it in the kernel.

Thanks,
Ming



* Re: [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct
  2020-07-08 13:27   ` Ming Lei
@ 2020-07-08 13:53     ` Guoqing Jiang
  2020-07-08 17:46       ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08 13:53 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Philipp Reisner, Lars Ellenberg, drbd-dev

Hi Ming,

On 7/8/20 3:27 PM, Ming Lei wrote:
> On Wed, Jul 08, 2020 at 09:58:15AM +0200, Guoqing Jiang wrote:
>> Currently the duration accounting of bio based drivers is converted from
>> jiffies to ns, which means it can be less accurate than that of request
>> based drivers.
>>
>> So let disk_start_io_acct return ns precision, instead of converting
>> jiffies to ns in disk_end_io_acct.
>>
>> Cc: Philipp Reisner <philipp.reisner@linbit.com>
>> Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
>> Cc: drbd-dev@lists.linbit.com
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
>> ---
>>   block/blk-core.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index d9d632639bd1..0e806a8c62fb 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1466,6 +1466,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
>>   	struct hd_struct *part = &disk->part0;
>>   	const int sgrp = op_stat_group(op);
>>   	unsigned long now = READ_ONCE(jiffies);
>> +	unsigned long start_ns = ktime_get_ns();
>>   
>>   	part_stat_lock();
>>   	update_io_ticks(part, now, false);
>> @@ -1474,7 +1475,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
>>   	part_stat_local_inc(part, in_flight[op_is_write(op)]);
>>   	part_stat_unlock();
>>   
>> -	return now;
>> +	return start_ns;
>>   }
>>   EXPORT_SYMBOL(disk_start_io_acct);
>>   
>> @@ -1484,11 +1485,11 @@ void disk_end_io_acct(struct gendisk *disk, unsigned int op,
>>   	struct hd_struct *part = &disk->part0;
>>   	const int sgrp = op_stat_group(op);
>>   	unsigned long now = READ_ONCE(jiffies);
>> -	unsigned long duration = now - start_time;
>> +	unsigned long duration = ktime_get_ns() - start_time;
>>   
>>   	part_stat_lock();
>>   	update_io_ticks(part, now, true);
>> -	part_stat_add(part, nsecs[sgrp], jiffies_to_nsecs(duration));
>> +	part_stat_add(part, nsecs[sgrp], duration);
>>   	part_stat_local_dec(part, in_flight[op_is_write(op)]);
>>   	part_stat_unlock();
> Hi Guoqing,
>
> The cost of ktime_get_ns() can be observed to be not cheap on high IOPS
> devices,

Could you share some links about it? Thanks.

> so I am not sure the conversion is good. Also, could you share what
> benefit we get from this change?

Without the conversion, we have to track io latency in jiffies in the 4th
patch. Then with HZ=100, where one jiffy is 10 ms, some rows (such as 1 ms,
2 ms and 4 ms) in that table don't make sense.

Thanks,
Guoqing


* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-08 13:29   ` Ming Lei
@ 2020-07-08 14:02     ` Guoqing Jiang
  2020-07-08 14:06       ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08 14:02 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Florian-Ewald Mueller

On 7/8/20 3:29 PM, Ming Lei wrote:
> On Wed, Jul 08, 2020 at 09:58:18AM +0200, Guoqing Jiang wrote:
>> Usually, we get the status of a block device by reading its stat file,
>> but that file only provides the total time. We would like to have more
>> fine-grained statistics, such as the count per latency range, which
>> helps people diagnose whether there is a hardware issue.
>>
>> Also, a new config option is introduced to control whether people want
>> the additional statistics or not; the option is also used for the io
>> sector table in the next patch.
>>
>> Signed-off-by: Florian-Ewald Mueller <florian-ewald.mueller@cloud.ionos.com>
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
>> ---
>>   block/Kconfig             |  8 ++++++++
>>   block/blk-core.c          | 35 +++++++++++++++++++++++++++++++++++
>>   block/genhd.c             | 26 ++++++++++++++++++++++++++
>>   include/linux/part_stat.h |  7 +++++++
>>   4 files changed, 76 insertions(+)
>>
>> diff --git a/block/Kconfig b/block/Kconfig
>> index 9357d7302398..dba71feaa85b 100644
>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -175,6 +175,14 @@ config BLK_DEBUG_FS
>>   	Unless you are building a kernel for a tiny system, you should
>>   	say Y here.
>>   
>> +config BLK_ADDITIONAL_DISKSTAT
>> +	bool "Block layer additional diskstat"
>> +	default n
>> +	help
>> +	Enabling this option adds io latency statistics for each block device.
>> +
>> +	If unsure, say N.
>> +
>>   config BLK_DEBUG_FS_ZONED
>>          bool
>>          default BLK_DEBUG_FS && BLK_DEV_ZONED
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 0e806a8c62fb..7a129c8f1b23 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1411,6 +1411,39 @@ static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
>>   	}
>>   }
>>   
>> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
>> +/*
>> + * Either account additional stat for request if req is not NULL or account for bio.
>> + */
>> +static void blk_additional_latency(struct hd_struct *part, const int sgrp,
>> +				   struct request *req, unsigned long start_ns)
>> +{
>> +	unsigned int idx;
>> +	unsigned long duration, now = ktime_get_ns();
>> +
>> +	if (req)
>> +		duration = (now - req->start_time_ns) / NSEC_PER_MSEC;
>> +	else
>> +		duration = (now - start_ns) / NSEC_PER_MSEC;
>> +
>> +	duration /= HZ_TO_MSEC_NUM;
>> +	if (likely(duration > 0)) {
>> +		idx = ilog2(duration);
>> +		if (idx > ADD_STAT_NUM - 1)
>> +			idx = ADD_STAT_NUM - 1;
>> +	} else
>> +		idx = 0;
>> +	part_stat_inc(part, latency_table[idx][sgrp]);
>> +
>> +}
>> +#else
>> +static void blk_additional_latency(struct hd_struct *part, const int sgrp,
>> +				   struct request *req, unsigned long start_jiffies)
>> +
>> +{
>> +}
>> +#endif
>> +
>>   static void blk_account_io_completion(struct request *req, unsigned int bytes)
>>   {
>>   	if (req->part && blk_do_io_stat(req)) {
>> @@ -1440,6 +1473,7 @@ void blk_account_io_done(struct request *req, u64 now)
>>   		part = req->part;
>>   
>>   		update_io_ticks(part, jiffies, true);
>> +		blk_additional_latency(part, sgrp, req, 0);
>>   		part_stat_inc(part, ios[sgrp]);
>>   		part_stat_add(part, nsecs[sgrp], now - req->start_time_ns);
>>   		part_stat_unlock();
>> @@ -1489,6 +1523,7 @@ void disk_end_io_acct(struct gendisk *disk, unsigned int op,
>>   
>>   	part_stat_lock();
>>   	update_io_ticks(part, now, true);
>> +	blk_additional_latency(part, sgrp, NULL, start_time);
>>   	part_stat_add(part, nsecs[sgrp], duration);
>>   	part_stat_local_dec(part, in_flight[op_is_write(op)]);
>>   	part_stat_unlock();
>> diff --git a/block/genhd.c b/block/genhd.c
>> index 60ae4e1b4d38..a33937a74fb1 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -1420,6 +1420,29 @@ static struct device_attribute dev_attr_fail_timeout =
>>   	__ATTR(io-timeout-fail, 0644, part_timeout_show, part_timeout_store);
>>   #endif
>>   
>> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
>> +static ssize_t io_latency_show(struct device *dev, struct device_attribute *attr, char *buf)
>> +{
>> +	struct hd_struct *p = dev_to_part(dev);
>> +	size_t count = 0;
>> +	int i, sgrp;
>> +
>> +	for (i = 0; i < ADD_STAT_NUM; i++) {
>> +		count += scnprintf(buf + count, PAGE_SIZE - count, "%5d ms: ",
>> +				   (1 << i) * HZ_TO_MSEC_NUM);
>> +		for (sgrp = 0; sgrp < NR_STAT_GROUPS; sgrp++)
>> +			count += scnprintf(buf + count, PAGE_SIZE - count, "%lu ",
>> +					   part_stat_read(p, latency_table[i][sgrp]));
>> +		count += scnprintf(buf + count, PAGE_SIZE - count, "\n");
>> +	}
>> +
>> +	return count;
>> +}
>> +
>> +static struct device_attribute dev_attr_io_latency =
>> +	__ATTR(io_latency, 0444, io_latency_show, NULL);
>> +#endif
>> +
>>   static struct attribute *disk_attrs[] = {
>>   	&dev_attr_range.attr,
>>   	&dev_attr_ext_range.attr,
>> @@ -1438,6 +1461,9 @@ static struct attribute *disk_attrs[] = {
>>   #endif
>>   #ifdef CONFIG_FAIL_IO_TIMEOUT
>>   	&dev_attr_fail_timeout.attr,
>> +#endif
>> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
>> +	&dev_attr_io_latency.attr,
>>   #endif
>>   	NULL
>>   };
>> diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
>> index 24125778ef3e..fe3def8c69d7 100644
>> --- a/include/linux/part_stat.h
>> +++ b/include/linux/part_stat.h
>> @@ -9,6 +9,13 @@ struct disk_stats {
>>   	unsigned long sectors[NR_STAT_GROUPS];
>>   	unsigned long ios[NR_STAT_GROUPS];
>>   	unsigned long merges[NR_STAT_GROUPS];
>> +#ifdef CONFIG_BLK_ADDITIONAL_DISKSTAT
>> +/*
>> + * We measure latency (ms) for 1, 2, ..., 1024 and >=1024.
>> + */
>> +#define ADD_STAT_NUM	12
>> +	unsigned long latency_table[ADD_STAT_NUM][NR_STAT_GROUPS];
>> +#endif
>>   	unsigned long io_ticks;
>>   	local_t in_flight[2];
>>   };
>> -- 
>> 2.17.1
>>
> Hi Guoqing,
>
> I believe it isn't hard to write an ebpf based script (bcc or bpftrace)
> to collect this kind of performance data, so it looks unnecessary to do
> it in the kernel.

Hi Ming,

Sorry, I don't know bcc or bpftrace well, but I assume they need to
read the latency value from somewhere inside the kernel. Could you point
out how I can get the latency value? Thanks in advance!

Thanks,
Guoqing



* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-08 14:02     ` Guoqing Jiang
@ 2020-07-08 14:06       ` Guoqing Jiang
  2020-07-09 18:48         ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08 14:06 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Florian-Ewald Mueller

On 7/8/20 4:02 PM, Guoqing Jiang wrote:
>> Hi Guoqing,
>>
>> I believe it isn't hard to write an ebpf based script (bcc or bpftrace)
>> to collect this kind of performance data, so it looks unnecessary to do
>> it in the kernel.
>
> Hi Ming,
>
> Sorry, I don't know bcc or bpftrace well, but I assume they need to
> read the latency value from somewhere inside the kernel. Could you point
> out how I can get the latency value? Thanks in advance!

Hmm, I suppose biolatency is suitable for tracking latency; I will look into it.

Thanks,
Guoqing


* Re: [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct
  2020-07-08 13:53     ` Guoqing Jiang
@ 2020-07-08 17:46       ` Guoqing Jiang
  0 siblings, 0 replies; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-08 17:46 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Philipp Reisner, Lars Ellenberg, drbd-dev

On 7/8/20 3:53 PM, Guoqing Jiang wrote:
>>
>> The cost of ktime_get_ns() can be observed to be not cheap on high IOPS
>> devices,
>
> Could you share some links about it? Thanks.
>
>> so I am not sure the conversion is good. Also, could you share what
>> benefit we get from this change?
>
> Without the conversion, we have to track io latency in jiffies in the
> 4th patch. Then with HZ=100, where one jiffy is 10 ms, some rows (such
> as 1 ms, 2 ms and 4 ms) in that table don't make sense.

Hmm, I can still output those rows based on HZ_TO_MSEC_NUM, which means
this patch can be dropped, since ktime_get_ns is more expensive.

Thanks,
Guoqing


* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-08 14:06       ` Guoqing Jiang
@ 2020-07-09 18:48         ` Guoqing Jiang
  2020-07-10  0:53           ` Ming Lei
  0 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-09 18:48 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Florian-Ewald Mueller

Hi Ming,

On 7/8/20 4:06 PM, Guoqing Jiang wrote:
> On 7/8/20 4:02 PM, Guoqing Jiang wrote:
>>> Hi Guoqing,
>>>
>>> I believe it isn't hard to write an ebpf based script (bcc or bpftrace)
>>> to collect this kind of performance data, so it looks unnecessary to do
>>> it in the kernel.
>>
>> Hi Ming,
>>
>> Sorry, I don't know bcc or bpftrace well, but I assume they need to
>> read the latency value from somewhere inside the kernel. Could you point
>> out how I can get the latency value? Thanks in advance!
>
> Hmm, I suppose biolatency is suitable for tracking latency; I will look
> into it.

I think biolatency can't trace data while it is not running, and there
seems to be no place inside the kernel that records such information for
ebpf to read; correct me if my understanding is wrong.

And as a cloud provider, we would like to get the data when necessary
instead of collecting it by keeping a script running, because that is more
expensive than just reading a node IMHO.

Thanks,
Guoqing


* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-09 18:48         ` Guoqing Jiang
@ 2020-07-10  0:53           ` Ming Lei
  2020-07-10  8:55             ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2020-07-10  0:53 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: axboe, linux-block, Florian-Ewald Mueller

Hi Guoqing,

On Thu, Jul 09, 2020 at 08:48:08PM +0200, Guoqing Jiang wrote:
> Hi Ming,
> 
> On 7/8/20 4:06 PM, Guoqing Jiang wrote:
> > On 7/8/20 4:02 PM, Guoqing Jiang wrote:
> > > > Hi Guoqing,
> > > > 
> > > > I believe it isn't hard to write an ebpf based script (bcc or
> > > > bpftrace) to collect this kind of performance data, so it looks
> > > > unnecessary to do it in the kernel.
> > > 
> > > Hi Ming,
> > > 
> > > Sorry, I don't know bcc or bpftrace well, but I assume they need to
> > > read the latency value from somewhere inside the kernel. Could you
> > > point out how I can get the latency value? Thanks in advance!
> > 
> > Hmm, I suppose biolatency is suitable for tracking latency; I will
> > look into it.
> 
> I think biolatency can't trace data while it is not running,

Yeah, the ebpf prog is only injected when the trace is started.

> and there seems to be no place inside the kernel that records such
> information for ebpf to read; correct me if my understanding is wrong.

Just record the info by starting the bcc script when you need it; is there
anything wrong with that usage? Always doing such stuff in the kernel isn't
fair to users who don't care about or need this info.

> 
> And as cloud provider,we would like to know data when necessary instead
> of collect data by keep script running because it is expensive than just
> read
> node IMHO.

It shouldn't be expensive. It might be a bit slow to inject the ebpf prog
because the code has to be verified, but once it is inside the kernel it
should be efficient enough. The kernel side prog only updates & stores the
latency summary data in a bpf map, and the stored summary data can be read
out anytime by userspace.
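
To make it concrete, the kernel side of such a script could look roughly
like the following bcc-style restricted C. This is only a sketch in the
spirit of biolatency, not its exact code, and the probe names and map
layout are illustrative:

#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>

BPF_HASH(start, struct request *);	/* request -> start timestamp */
BPF_HISTOGRAM(dist);			/* log2 latency buckets, kept in a map */

int trace_req_start(struct pt_regs *ctx, struct request *req)
{
	u64 ts = bpf_ktime_get_ns();

	start.update(&req, &ts);
	return 0;
}

int trace_req_done(struct pt_regs *ctx, struct request *req)
{
	u64 *tsp = start.lookup(&req);

	if (tsp) {
		/* ns -> ms, then bucket; userspace can read "dist" anytime */
		dist.increment(bpf_log2l((bpf_ktime_get_ns() - *tsp) / 1000000));
		start.delete(&req);
	}
	return 0;
}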

Could you explain a bit why it is expensive, for example with biolatency?


Thanks, 
Ming



* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-10  0:53           ` Ming Lei
@ 2020-07-10  8:55             ` Guoqing Jiang
  2020-07-10 10:00               ` Ming Lei
  2020-07-10 14:04               ` Jens Axboe
  0 siblings, 2 replies; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-10  8:55 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Florian-Ewald Mueller

Hi Ming,

On 7/10/20 2:53 AM, Ming Lei wrote:
> Hi Guoqing,
>
> On Thu, Jul 09, 2020 at 08:48:08PM +0200, Guoqing Jiang wrote:
>> Hi Ming,
>>
>> On 7/8/20 4:06 PM, Guoqing Jiang wrote:
>>> On 7/8/20 4:02 PM, Guoqing Jiang wrote:
>>>>> Hi Guoqing,
>>>>>
>>>>> I believe it isn't hard to write an ebpf based script (bcc or
>>>>> bpftrace) to collect this kind of performance data, so it looks
>>>>> unnecessary to do it in the kernel.
>>>> Hi Ming,
>>>>
>>>> Sorry, I don't know bcc or bpftrace well, but I assume they need to
>>>> read the latency value from somewhere inside the kernel. Could you
>>>> point out how I can get the latency value? Thanks in advance!
>>> Hmm, I suppose biolatency is suitable for tracking latency; I will
>>> look into it.
>> I think biolatency can't trace data while it is not running,
> Yeah, the ebpf prog is only injected when the trace is started.
>
>> and there seems to be no place inside the kernel that records such
>> information for ebpf to read; correct me if my understanding is wrong.
> Just record the info by starting the bcc script when you need it; is there
> anything wrong with that usage? Always doing such stuff in the kernel isn't
> fair to users who don't care about or need this info.

That is why we add a Kconfig option and set it to N by default. And I
suppose that with modern CPUs, the cost of several more instructions would
not be that expensive even when the option is enabled, just my $0.02.

>> And as a cloud provider, we would like to get the data when necessary
>> instead of collecting it by keeping a script running, because that is
>> more expensive than just reading a node IMHO.
> It shouldn't be expensive. It might be a bit slow to inject the ebpf prog
> because the code has to be verified, but once it is inside the kernel it
> should be efficient enough. The kernel side prog only updates & stores the
> latency summary data in a bpf map, and the stored summary data can be read
> out anytime by userspace.
>
> Could you explain a bit why it is expensive, for example with biolatency?

I was comparing reading a sysfs node plus a few extra instructions in the
kernel with launching a dedicated monitoring process, which needs to occupy
more resources (memory) and causes context switches. And biolatency calls
bpf_ktime_get_ns to calculate the latency of each IO, which I assume
ultimately triggers ktime_get_ns, and that is not cheap as you said.

Of course, we will keep the change in our own tree if it doesn't make sense
to others.

Thanks,
Guoqing


* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-10  8:55             ` Guoqing Jiang
@ 2020-07-10 10:00               ` Ming Lei
  2020-07-10 10:29                 ` Guoqing Jiang
  2020-07-10 14:04               ` Jens Axboe
  1 sibling, 1 reply; 22+ messages in thread
From: Ming Lei @ 2020-07-10 10:00 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: axboe, linux-block, Florian-Ewald Mueller

On Fri, Jul 10, 2020 at 10:55:24AM +0200, Guoqing Jiang wrote:
> Hi Ming,
> 
> On 7/10/20 2:53 AM, Ming Lei wrote:
> > Hi Guoqing,
> > 
> > On Thu, Jul 09, 2020 at 08:48:08PM +0200, Guoqing Jiang wrote:
> > > Hi Ming,
> > > 
> > > On 7/8/20 4:06 PM, Guoqing Jiang wrote:
> > > > On 7/8/20 4:02 PM, Guoqing Jiang wrote:
> > > > > > Hi Guoqing,
> > > > > > 
> > > > > > I believe it isn't hard to write an ebpf based script (bcc or
> > > > > > bpftrace) to collect this kind of performance data, so it looks
> > > > > > unnecessary to do it in the kernel.
> > > > > Hi Ming,
> > > > > 
> > > > > Sorry, I don't know bcc or bpftrace well, but I assume they need to
> > > > > read the latency value from somewhere inside the kernel. Could you
> > > > > point out how I can get the latency value? Thanks in advance!
> > > > Hmm, I suppose biolatency is suitable for tracking latency; I will
> > > > look into it.
> > > I think biolatency can't trace data while it is not running,
> > Yeah, the ebpf prog is only injected when the trace is started.
> > 
> > > and there seems to be no place inside the kernel that records such
> > > information for ebpf to read; correct me if my understanding is wrong.
> > Just record the info by starting the bcc script when you need it; is there
> > anything wrong with that usage? Always doing such stuff in the kernel isn't
> > fair to users who don't care about or need this info.
> 
> That is why we add a Kconfig option and set it to N by default. And I
> suppose that with modern CPUs, the cost of several more instructions would
> not be that expensive even when the option is enabled, just my $0.02.
> 
> > > And as a cloud provider, we would like to get the data when necessary
> > > instead of collecting it by keeping a script running, because that is
> > > more expensive than just reading a node IMHO.
> > It shouldn't be expensive. It might be a bit slow to inject the ebpf prog
> > because the code has to be verified, but once it is inside the kernel it
> > should be efficient enough. The kernel side prog only updates & stores the
> > latency summary data in a bpf map, and the stored summary data can be read
> > out anytime by userspace.
> > 
> > Could you explain a bit why it is expensive, for example with biolatency?
> 
> I was comparing reading a sysfs node plus a few extra instructions in the
> kernel with launching a dedicated monitoring process, which needs to occupy
> more resources (memory) and causes context switches. And biolatency calls
> bpf_ktime_get_ns to calculate the latency of each IO, which I assume
> ultimately triggers ktime_get_ns, and that is not cheap as you said.

You can replace one read of the timestamp with rq->start_time_ns too, just
like this patch does. You can write your own bcc/bpftrace script, which is
quite easy to get started with. Once you learn its power, maybe you will
love it.
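
For instance, reusing the "dist" histogram from the sketch in my previous
mail, the done-side probe would then not need its own start probe or hash
at all (again just an illustration, not tested code):

int trace_req_done(struct pt_regs *ctx, struct request *req)
{
	/* reuse the timestamp blk-mq already stored in the request */
	dist.increment(bpf_log2l((bpf_ktime_get_ns() -
				  req->start_time_ns) / 1000000));
	return 0;
}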


Thanks,
Ming



* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-10 10:00               ` Ming Lei
@ 2020-07-10 10:29                 ` Guoqing Jiang
  2020-07-11  1:32                   ` Ming Lei
  0 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-10 10:29 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Florian-Ewald Mueller

On 7/10/20 12:00 PM, Ming Lei wrote:
> On Fri, Jul 10, 2020 at 10:55:24AM +0200, Guoqing Jiang wrote:
>> Hi Ming,
>>
>> On 7/10/20 2:53 AM, Ming Lei wrote:
>>> Hi Guoqing,
>>>
>>> On Thu, Jul 09, 2020 at 08:48:08PM +0200, Guoqing Jiang wrote:
>>>> Hi Ming,
>>>>
>>>> On 7/8/20 4:06 PM, Guoqing Jiang wrote:
>>>>> On 7/8/20 4:02 PM, Guoqing Jiang wrote:
>>>>>>> Hi Guoqing,
>>>>>>>
>>>>>>> I believe it isn't hard to write an ebpf based script (bcc or
>>>>>>> bpftrace) to collect this kind of performance data, so it looks
>>>>>>> unnecessary to do it in the kernel.
>>>>>> Hi Ming,
>>>>>>
>>>>>> Sorry, I don't know bcc or bpftrace well, but I assume they need to
>>>>>> read the latency value from somewhere inside the kernel. Could you
>>>>>> point out how I can get the latency value? Thanks in advance!
>>>>> Hmm, I suppose biolatency is suitable for tracking latency; I will
>>>>> look into it.
>>>> I think biolatency can't trace data while it is not running,
>>> Yeah, the ebpf prog is only injected when the trace is started.
>>>
>>>> and there seems to be no place inside the kernel that records such
>>>> information for ebpf to read; correct me if my understanding is wrong.
>>> Just record the info by starting the bcc script when you need it; is there
>>> anything wrong with that usage? Always doing such stuff in the kernel isn't
>>> fair to users who don't care about or need this info.
>> That is why we add a Kconfig option and set it to N by default. And I
>> suppose
>> with modern cpu, the cost with several more instructions would not be that
>> expensive even the option is enabled, just my $0.02.
>>
>>>> And as cloud provider,we would like to know data when necessary instead
>>>> of collect data by keep script running because it is expensive than just
>>>> read
>>>> node IMHO.
>>> It shouldn't be expensive. It might be a bit slow to inject the ebpf prog because
>>> the code has to be verified, however once it is put inside kernel, it should have
>>> been efficient enough. The kernel side prog only updates & stores the latency
>>> summery data into bpf map, and the stored summery data can be read out anytime
>>> by userspace.
>>>
>>> Could you explain a bit why it is expensive? such as biolatency
>> I thought I am compare read a sys node + extra instructions in kernel with
>> launch a specific process for monitoring which need to occupy more
>> resources (memory) and context switch. And for biolatency, it calls the
>> bpf_ktime_get_ns to calculate latency for each IO which I assume the
>> ktime_get_ns will be triggered finally, and it is not cheap as you said.
> You can replace one read of the timestamp with rq->start_time_ns too, just
> like this patch does. You can write your own bcc/bpftrace script, which is
> quite easy to get started with. Once you learn its power, maybe you will
> love it.

Yes, I definitely need to learn more about it :-). But even with that
change, I still believe reading a sysfs node is cheaper than running a
script.

And it seems biolatency can't trace bio based drivers, per the snippet
below, while collecting the data in the kernel lets us trace all block
drivers.

# load BPF program
b = BPF(text=bpf_text)
if args.queued:
    b.attach_kprobe(event="blk_account_io_start", fn_name="trace_req_start")
else:
    b.attach_kprobe(event="blk_start_request", fn_name="trace_req_start")
    b.attach_kprobe(event="blk_mq_start_request", fn_name="trace_req_start")
b.attach_kprobe(event="blk_account_io_completion",
    fn_name="trace_req_completion")

Would it be possible to extend it to trace both requests and bios? Otherwise
we have to run another script to trace md raid.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-10  8:55             ` Guoqing Jiang
  2020-07-10 10:00               ` Ming Lei
@ 2020-07-10 14:04               ` Jens Axboe
  1 sibling, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2020-07-10 14:04 UTC (permalink / raw)
  To: Guoqing Jiang, Ming Lei; +Cc: linux-block, Florian-Ewald Mueller

On 7/10/20 2:55 AM, Guoqing Jiang wrote:
> On 7/10/20 2:53 AM, Ming Lei wrote:
>> Just record the info by starting the bcc script in case you need that; is
>> there anything wrong with this usage? Always doing such stuff in the kernel
>> isn't fair for users who don't care about or need this info.
> 
> That is why we add a Kconfig option and set it to N by default. And I
> suppose that with a modern CPU, the cost of several more instructions
> would not be that expensive even when the option is enabled, just my
> $0.02.

Never justify it with a Kconfig option; that doesn't help anything at
all. Distros then enable it, and all users are stuck with this overhead.
The ktime_get() is definitely extra overhead.

FWIW, I agree with Ming here in that this can easily be done from
userspace. And if that's the case, then I don't see why everybody should
carry this extra burden.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-10 10:29                 ` Guoqing Jiang
@ 2020-07-11  1:32                   ` Ming Lei
  2020-07-12 20:39                     ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Ming Lei @ 2020-07-11  1:32 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: axboe, linux-block, Florian-Ewald Mueller

On Fri, Jul 10, 2020 at 12:29:28PM +0200, Guoqing Jiang wrote:
> On 7/10/20 12:00 PM, Ming Lei wrote:
> > You can replace one read of the timestamp with rq->start_time_ns too, just
> > like this patch does. You can write your own bcc/bpftrace script, which is
> > quite easy to get started with. Once you learn its power, maybe you will
> > love it.
> 
> Yes, I definitely need to learn more about it :-). But even with that
> change, I still believe reading a sysfs node is cheaper than running a
> script.
> 
> And it seems biolatency can't trace bio based drivers, per the snippet
> below, while collecting the data in the kernel lets us trace all block
> drivers.
> 
> # load BPF program
> b = BPF(text=bpf_text)
> if args.queued:
>     b.attach_kprobe(event="blk_account_io_start", fn_name="trace_req_start")
> else:
>     b.attach_kprobe(event="blk_start_request", fn_name="trace_req_start")
>     b.attach_kprobe(event="blk_mq_start_request", fn_name="trace_req_start")
> b.attach_kprobe(event="blk_account_io_completion",
>     fn_name="trace_req_completion")
> 
> Would it be possible to extend it to trace both requests and bios? Otherwise
> we have to run another script to trace md raid.

It is pretty easy to extend it to support bios: just add kprobes on
submit_bio() and bio_endio().
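
For illustration, a minimal bcc sketch along those lines might look like the
following (hypothetical and untested, not from the patchset; it simply moves
the probe points to submit_bio() and bio_endio() as suggested):

#!/usr/bin/env python
# Histogram of bio latency: stamp each bio at submit_bio() and compute the
# delta at bio_endio(), keyed by the bio pointer.
from time import sleep
from bcc import BPF

b = BPF(text="""
#include <uapi/linux/ptrace.h>
#include <linux/blk_types.h>

BPF_HASH(start, struct bio *);
BPF_HISTOGRAM(dist);

int trace_submit(struct pt_regs *ctx, struct bio *bio)
{
    u64 ts = bpf_ktime_get_ns();
    start.update(&bio, &ts);
    return 0;
}

int trace_end(struct pt_regs *ctx, struct bio *bio)
{
    u64 *tsp = start.lookup(&bio);

    if (tsp == 0)
        return 0;       /* submission not seen, e.g. traced mid-flight */
    dist.increment(bpf_log2l((bpf_ktime_get_ns() - *tsp) / 1000));
    start.delete(&bio);
    return 0;
}
""")
b.attach_kprobe(event="submit_bio", fn_name="trace_submit")
b.attach_kprobe(event="bio_endio", fn_name="trace_end")

print("Tracing bio latency... Hit Ctrl-C to end.")
try:
    sleep(99999999)
except KeyboardInterrupt:
    pass
b["dist"].print_log2_hist("usecs")

One design caveat with this sketch: split or chained bios would each be
accounted separately, so a production script would likely need extra
filtering; the sketch above ignores that.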

thanks,
Ming


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-11  1:32                   ` Ming Lei
@ 2020-07-12 20:39                     ` Guoqing Jiang
  2020-07-12 20:44                       ` Jens Axboe
  0 siblings, 1 reply; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-12 20:39 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, linux-block, Florian-Ewald Mueller

On 7/11/20 3:32 AM, Ming Lei wrote:
> On Fri, Jul 10, 2020 at 12:29:28PM +0200, Guoqing Jiang wrote:
>> Would it be possible to extend it to trace both requests and bios?
>> Otherwise we have to run another script to trace md raid.
> It is pretty easy to extend it to support bios: just add kprobes on
> submit_bio() and bio_endio().
>

The thing is that we don't like the cost of the ebpf based solution. And
FWIW, two years ago we compared an ebpf solution against recording the
data in the kernel:

A. in-kernel monitor: 1~5% performance drop
B. ebpf monitor: 10~15% performance drop

Note, we even copied each bio in approach A, so its numbers could be even
better now that we don't clone the bio.

And I think the major concern is the additional Kconfig option; since Jens
doesn't like it, I guess there is no need to make the change in the
upstream kernel.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-12 20:39                     ` Guoqing Jiang
@ 2020-07-12 20:44                       ` Jens Axboe
  2020-07-12 21:04                         ` Guoqing Jiang
  0 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2020-07-12 20:44 UTC (permalink / raw)
  To: Guoqing Jiang, Ming Lei; +Cc: linux-block, Florian-Ewald Mueller

On 7/12/20 2:39 PM, Guoqing Jiang wrote:
> The thing is that we don't like the cost of the ebpf based solution. And
> FWIW, two years ago we compared an ebpf solution against recording the
> data in the kernel:
> 
> A. in-kernel monitor: 1~5% performance drop
> B. ebpf monitor: 10~15% performance drop
> 
> Note, we even copied each bio in approach A, so its numbers could be even
> better now that we don't clone the bio.
> 
> And I think the major concern is the additional Kconfig option; since Jens
> doesn't like it, I guess there is no need to make the change in the
> upstream kernel.

No, my main concern is with trying to justify it by having a Kconfig option
to turn it off. Fact is, distros will likely turn it on, and then
everybody gets that overhead. There's a temptation to hide features like
that behind a Kconfig option with this exact justification, and it just
doesn't work like that in practice.

I might be amenable to the change if:

1) it doesn't add any (real) overhead to the normal fast path
2) probably needs to be opt-in. not via kconfig, but as a sysfs
   attribute. Like we have 'iostats' today.
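
For illustration only (the attribute names below are hypothetical, not part
of this patchset), an 'iostats'-style opt-in could look like:

$ cat /sys/block/sda/queue/iostats                 # existing accounting toggle
1
$ echo 1 > /sys/block/sda/queue/io_lat_stats       # hypothetical opt-in knob
$ cat /sys/block/sda/io_latency                    # collected only while enabled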

Something I've mentioned in the past has been a blktrace flight recorder
mode, but that's largely been superseded by having bpf available. But
the point is that something like blktrace generally doesn't add ANY
overhead at all if blktrace isn't being run. Your solution ends up
collecting stats all the time, regardless of whether or not anyone is
actually looking at the data. That's a bad idea, and I'd be much happier
with a solution that only adds overhead when actually used, not all the
time.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 4/5] block: add a statistic table for io latency
  2020-07-12 20:44                       ` Jens Axboe
@ 2020-07-12 21:04                         ` Guoqing Jiang
  0 siblings, 0 replies; 22+ messages in thread
From: Guoqing Jiang @ 2020-07-12 21:04 UTC (permalink / raw)
  To: Jens Axboe, Ming Lei; +Cc: linux-block, Florian-Ewald Mueller

On 7/12/20 10:44 PM, Jens Axboe wrote:
> No, my main concern is with trying to justify it by having a Kconfig
> option to turn it off. Fact is, distros will likely turn it on, and then
> everybody gets that overhead. There's a temptation to hide features like
> that behind a Kconfig option with this exact justification, and it just
> doesn't work like that in practice.

Good to know that, :-).

> I might be amenable to the change if:
>
> 1) it doesn't add any (real) overhead to the normal fast path

It is possible to remove ktime_get_ns from the current change, but I admit
we can't avoid a small overhead (several instructions per IO, I think),
and that is not fair to people who don't care about those data.

> 2) probably needs to be opt-in. not via kconfig, but as a sysfs
>     attribute. Like we have 'iostats' today.

Thanks for the suggestion, will investigate a better way.

> Something I've mentioned in the past has been a blktrace flight recorder
> mode, but that's largely been superseded by having bpf available. But
> the point is that something like blktrace generally doesn't add ANY
> overhead at all if blktrace isn't being run. Your solution ends up
> collecting stats all the time, regardless of whether or not anyone is
> actually looking at the data. That's a bad idea, and I'd be much happier
> with a solution that only adds overhead when actually used, not all the
> time.

I appreciate your input, which helps me understand your concern better.

Thanks,
Guoqing


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-07-12 21:04 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-08  7:58 [RFC PATCH 0/4] block: add two statistic tables Guoqing Jiang
2020-07-08  7:58 ` [PATCH RFC 1/5] block: return ns precision from disk_start_io_acct Guoqing Jiang
2020-07-08 13:27   ` Ming Lei
2020-07-08 13:53     ` Guoqing Jiang
2020-07-08 17:46       ` Guoqing Jiang
2020-07-08  7:58 ` [PATCH RFC 2/5] drbd: remove unused argument from drbd_request_prepare and __drbd_make_request Guoqing Jiang
2020-07-08  7:58 ` [PATCH RFC 3/5] drbd: rename start_jif to start_ns Guoqing Jiang
2020-07-08  7:58 ` [PATCH RFC 4/5] block: add a statistic table for io latency Guoqing Jiang
2020-07-08 13:29   ` Ming Lei
2020-07-08 14:02     ` Guoqing Jiang
2020-07-08 14:06       ` Guoqing Jiang
2020-07-09 18:48         ` Guoqing Jiang
2020-07-10  0:53           ` Ming Lei
2020-07-10  8:55             ` Guoqing Jiang
2020-07-10 10:00               ` Ming Lei
2020-07-10 10:29                 ` Guoqing Jiang
2020-07-11  1:32                   ` Ming Lei
2020-07-12 20:39                     ` Guoqing Jiang
2020-07-12 20:44                       ` Jens Axboe
2020-07-12 21:04                         ` Guoqing Jiang
2020-07-10 14:04               ` Jens Axboe
2020-07-08  7:58 ` [PATCH RFC 5/5] block: add a statistic table for io sector Guoqing Jiang
