All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/7] md: io stats accounting
@ 2021-05-21  0:55 Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 1/7] md: revert " Guoqing Jiang
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

V2 changes:

1. add accounting_bio to md_personality.
2. cleanup in case bioset_integrity_create fails.
3. use bio_end_io_acct.
4. remove patch for enable io accounting for multipath.
5. add one patch to rename print_msg.
6. add one patch to deprecate linear, multipath and faulty.

Artur Paszkiewicz (1):
  md: the latest try for improve io stats accounting

Guoqing Jiang (6):
  md: revert io stats accounting
  md: add accounting_bio for raid0 and raid5
  md/raid1: rename print_msg with r1bio_existed
  md/raid1: enable io accounting
  md/raid10: enable io accounting
  md: mark some personalities as deprecated

 drivers/md/Kconfig        |  6 ++--
 drivers/md/md-faulty.c    |  2 +-
 drivers/md/md-linear.c    |  2 +-
 drivers/md/md-multipath.c |  2 +-
 drivers/md/md.c           | 66 ++++++++++++++++++++++-----------------
 drivers/md/md.h           |  4 ++-
 drivers/md/raid0.c        | 14 +++++++++
 drivers/md/raid1.c        | 15 ++++++---
 drivers/md/raid1.h        |  1 +
 drivers/md/raid10.c       |  6 ++++
 drivers/md/raid10.h       |  1 +
 drivers/md/raid5.c        | 17 ++++++++++
 12 files changed, 97 insertions(+), 39 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH V2 1/7] md: revert io stats accounting
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
@ 2021-05-21  0:55 ` Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5 Guoqing Jiang
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

The commit 41d2d848e5c0 ("md: improve io stats accounting") could cause
double fault problem per the report [1], and also it is not correct to
change ->bi_end_io if md don't own it, so let's revert it.

And io stats accounting will be replemented in later commits.

[1]. https://lore.kernel.org/linux-raid/3bf04253-3fad-434a-63a7-20214e38cf26@gmail.com/T/#t

Fixes: 41d2d848e5c0 ("md: improve io stats accounting")
Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
 drivers/md/md.c | 45 ---------------------------------------------
 drivers/md/md.h |  1 -
 2 files changed, 46 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 49f897fbb89b..7ba00e4c862d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -441,30 +441,6 @@ void md_handle_request(struct mddev *mddev, struct bio *bio)
 }
 EXPORT_SYMBOL(md_handle_request);
 
-struct md_io {
-	struct mddev *mddev;
-	bio_end_io_t *orig_bi_end_io;
-	void *orig_bi_private;
-	struct block_device *orig_bi_bdev;
-	unsigned long start_time;
-};
-
-static void md_end_io(struct bio *bio)
-{
-	struct md_io *md_io = bio->bi_private;
-	struct mddev *mddev = md_io->mddev;
-
-	bio_end_io_acct_remapped(bio, md_io->start_time, md_io->orig_bi_bdev);
-
-	bio->bi_end_io = md_io->orig_bi_end_io;
-	bio->bi_private = md_io->orig_bi_private;
-
-	mempool_free(md_io, &mddev->md_io_pool);
-
-	if (bio->bi_end_io)
-		bio->bi_end_io(bio);
-}
-
 static blk_qc_t md_submit_bio(struct bio *bio)
 {
 	const int rw = bio_data_dir(bio);
@@ -489,21 +465,6 @@ static blk_qc_t md_submit_bio(struct bio *bio)
 		return BLK_QC_T_NONE;
 	}
 
-	if (bio->bi_end_io != md_end_io) {
-		struct md_io *md_io;
-
-		md_io = mempool_alloc(&mddev->md_io_pool, GFP_NOIO);
-		md_io->mddev = mddev;
-		md_io->orig_bi_end_io = bio->bi_end_io;
-		md_io->orig_bi_private = bio->bi_private;
-		md_io->orig_bi_bdev = bio->bi_bdev;
-
-		bio->bi_end_io = md_end_io;
-		bio->bi_private = md_io;
-
-		md_io->start_time = bio_start_io_acct(bio);
-	}
-
 	/* bio could be mergeable after passing to underlayer */
 	bio->bi_opf &= ~REQ_NOMERGE;
 
@@ -5608,7 +5569,6 @@ static void md_free(struct kobject *ko)
 
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);
-	mempool_exit(&mddev->md_io_pool);
 	kfree(mddev);
 }
 
@@ -5705,11 +5665,6 @@ static int md_alloc(dev_t dev, char *name)
 		 */
 		mddev->hold_active = UNTIL_STOP;
 
-	error = mempool_init_kmalloc_pool(&mddev->md_io_pool, BIO_POOL_SIZE,
-					  sizeof(struct md_io));
-	if (error)
-		goto abort;
-
 	error = -ENOMEM;
 	mddev->queue = blk_alloc_queue(NUMA_NO_NODE);
 	if (!mddev->queue)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index fb7eab58cfd5..4da240ffe2c5 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -487,7 +487,6 @@ struct mddev {
 	struct bio_set			sync_set; /* for sync operations like
 						   * metadata and bitmap writes
 						   */
-	mempool_t			md_io_pool;
 
 	/* Generic flush handling.
 	 * The last to finish preflush schedules a worker to submit
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 1/7] md: revert " Guoqing Jiang
@ 2021-05-21  0:55 ` Guoqing Jiang
  2021-05-24  5:48   ` Song Liu
  2021-05-24  8:45   ` Christoph Hellwig
  2021-05-21  0:55 ` [PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

Let's introduce accounting_bio which checks if md needs clone the bio
for accounting.

And add relevant function to raid0 and raid5 given both don't have
their own clone infrastrure, also checks if it is split bio.

Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
 drivers/md/md.h    |  2 ++
 drivers/md/raid0.c | 14 ++++++++++++++
 drivers/md/raid5.c | 17 +++++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 4da240ffe2c5..5125ccf9df06 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -605,6 +605,8 @@ struct md_personality
 	void *(*takeover) (struct mddev *mddev);
 	/* Changes the consistency policy of an active array. */
 	int (*change_consistency_policy)(struct mddev *mddev, const char *buf);
+	/* check if need to clone bio for accounting in md layer */
+	bool (*accounting_bio)(struct mddev *mddev, struct bio *bio);
 };
 
 struct md_sysfs_entry {
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index e5d7411cba9b..d309b639b5d9 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -748,6 +748,19 @@ static void raid0_quiesce(struct mddev *mddev, int quiesce)
 {
 }
 
+/*
+ * Don't account the bio if it was split from mddev->bio_set.
+ */
+static bool raid0_accounting_bio(struct mddev *mddev, struct bio *bio)
+{
+	bool ret = true;
+
+	if (bio->bi_pool == &mddev->bio_set)
+		ret = false;
+
+	return ret;
+}
+
 static struct md_personality raid0_personality=
 {
 	.name		= "raid0",
@@ -760,6 +773,7 @@ static struct md_personality raid0_personality=
 	.size		= raid0_size,
 	.takeover	= raid0_takeover,
 	.quiesce	= raid0_quiesce,
+	.accounting_bio = raid0_accounting_bio,
 };
 
 static int __init raid0_init (void)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 841e1c1aa5e6..bcc1ceb69c73 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -8596,6 +8596,20 @@ static void *raid6_takeover(struct mddev *mddev)
 	return setup_conf(mddev);
 }
 
+/*
+ * Don't account the bio if it was split from r5conf->bio_split.
+ */
+static bool raid5_accounting_bio(struct mddev *mddev, struct bio *bio)
+{
+	bool ret = true;
+	struct r5conf *conf = mddev->private;
+
+	if (bio->bi_pool == &conf->bio_split)
+		ret = false;
+
+	return ret;
+}
+
 static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 {
 	struct r5conf *conf;
@@ -8688,6 +8702,7 @@ static struct md_personality raid6_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid6_takeover,
 	.change_consistency_policy = raid5_change_consistency_policy,
+	.accounting_bio	= raid5_accounting_bio,
 };
 static struct md_personality raid5_personality =
 {
@@ -8712,6 +8727,7 @@ static struct md_personality raid5_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid5_takeover,
 	.change_consistency_policy = raid5_change_consistency_policy,
+	.accounting_bio	= raid5_accounting_bio,
 };
 
 static struct md_personality raid4_personality =
@@ -8737,6 +8753,7 @@ static struct md_personality raid4_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid4_takeover,
 	.change_consistency_policy = raid5_change_consistency_policy,
+	.accounting_bio	= raid5_accounting_bio,
 };
 
 static int __init raid5_init(void)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH V2 3/7] md: the latest try for improve io stats accounting
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 1/7] md: revert " Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5 Guoqing Jiang
@ 2021-05-21  0:55 ` Guoqing Jiang
  2021-05-21  7:32   ` Artur Paszkiewicz
  2021-05-21  0:55 ` [PATCH V2 4/7] md/raid1: rename print_msg with r1bio_existed Guoqing Jiang
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>

Use generic io accounting functions to manage io stats. There was an
attempt to do this earlier in commit 18c0b223cf990172 ("md: use generic
io stats accounting functions to simplify io stat accounting"), but it
did not include a call to generic_end_io_acct() and caused issues with
tracking in-flight IOs, so it was later removed in commit 74672d069b29
("md: fix md io stats accounting broken").

This patch attempts to fix this by using both generic_start_io_acct()
and generic_end_io_acct(). To make it possible, in md_make_request() a
bio is cloned with additional data - struct md_io, which includes the io
start_time. A new bioset is introduced for this purpose. We call
generic_start_io_acct() and pass the clone instead of the original to
md_handle_request(). When it completes, we call generic_end_io_acct()
and complete the original bio.

This adds correct statistics about in-flight IOs and IO processing time,
interpreted e.g. in iostat as await, svctm, aqu-sz and %util.

It also fixes a situation where too many IOs where reported if a bio was
re-submitted to the mddev, because io accounting is now performed only
on newly arriving bios.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
[Guoqing: rebase and make generic accounting applies to personalities
	  which don't have clone infrastructure]
Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
 drivers/md/md.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/md/md.h |  1 +
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7ba00e4c862d..13392fe9379c 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -441,6 +441,25 @@ void md_handle_request(struct mddev *mddev, struct bio *bio)
 }
 EXPORT_SYMBOL(md_handle_request);
 
+struct md_io {
+	struct mddev *mddev;
+	struct bio *orig_bio;
+	unsigned long start_time;
+	struct bio bio_clone;
+};
+
+static void md_end_io(struct bio *bio)
+{
+	struct md_io *md_io = bio->bi_private;
+	struct bio *orig_bio = md_io->orig_bio;
+
+	orig_bio->bi_status = bio->bi_status;
+
+	bio_end_io_acct(orig_bio, md_io->start_time);
+	bio_put(bio);
+	bio_endio(orig_bio);
+}
+
 static blk_qc_t md_submit_bio(struct bio *bio)
 {
 	const int rw = bio_data_dir(bio);
@@ -465,6 +484,30 @@ static blk_qc_t md_submit_bio(struct bio *bio)
 		return BLK_QC_T_NONE;
 	}
 
+	/*
+	 * clone bio under conditions:
+	 * 1. QUEUE_FLAG_IO_STAT flag is set.
+	 * 2. bio just enters md and it is not split from personality.
+	 */
+	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue) &&
+	    (bio->bi_pool != &mddev->md_io_bs) &&
+	    (mddev->pers->accounting_bio &&
+	     mddev->pers->accounting_bio(mddev, bio))) {
+		struct md_io *md_io;
+		struct bio *clone;
+
+		clone = bio_clone_fast(bio, GFP_NOIO, &mddev->md_io_bs);
+
+		md_io = container_of(clone, struct md_io, bio_clone);
+		md_io->mddev = mddev;
+		md_io->orig_bio = bio;
+		md_io->start_time = bio_start_io_acct(bio);
+
+		clone->bi_end_io = md_end_io;
+		clone->bi_private = md_io;
+		bio = clone;
+	}
+
 	/* bio could be mergeable after passing to underlayer */
 	bio->bi_opf &= ~REQ_NOMERGE;
 
@@ -2340,9 +2383,12 @@ int md_integrity_register(struct mddev *mddev)
 			       bdev_get_integrity(reference->bdev));
 
 	pr_debug("md: data integrity enabled on %s\n", mdname(mddev));
-	if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE)) {
+	if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE) ||
+	    bioset_integrity_create(&mddev->md_io_bs, BIO_POOL_SIZE)) {
 		pr_err("md: failed to create integrity pool for %s\n",
 		       mdname(mddev));
+		bioset_exit(&mddev->bio_set);
+		bioset_exit(&mddev->md_io_bs);
 		return -EINVAL;
 	}
 	return 0;
@@ -5569,6 +5615,7 @@ static void md_free(struct kobject *ko)
 
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);
+	bioset_exit(&mddev->md_io_bs);
 	kfree(mddev);
 }
 
@@ -5864,6 +5911,12 @@ int md_run(struct mddev *mddev)
 		if (err)
 			return err;
 	}
+	if (!bioset_initialized(&mddev->md_io_bs)) {
+		err = bioset_init(&mddev->md_io_bs, BIO_POOL_SIZE,
+				  offsetof(struct md_io, bio_clone), 0);
+		if (err)
+			return err;
+	}
 
 	spin_lock(&pers_lock);
 	pers = find_pers(mddev->level, mddev->clevel);
@@ -6041,6 +6094,7 @@ int md_run(struct mddev *mddev)
 abort:
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);
+	bioset_exit(&mddev->md_io_bs);
 	return err;
 }
 EXPORT_SYMBOL_GPL(md_run);
@@ -6264,6 +6318,7 @@ void md_stop(struct mddev *mddev)
 	__md_stop(mddev);
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);
+	bioset_exit(&mddev->md_io_bs);
 }
 
 EXPORT_SYMBOL_GPL(md_stop);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 5125ccf9df06..d2f476c427a9 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -487,6 +487,7 @@ struct mddev {
 	struct bio_set			sync_set; /* for sync operations like
 						   * metadata and bitmap writes
 						   */
+	struct bio_set			md_io_bs; /* for io accounting */
 
 	/* Generic flush handling.
 	 * The last to finish preflush schedules a worker to submit
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH V2 4/7] md/raid1: rename print_msg with r1bio_existed
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
                   ` (2 preceding siblings ...)
  2021-05-21  0:55 ` [PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
@ 2021-05-21  0:55 ` Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 5/7] md/raid1: enable io accounting Guoqing Jiang
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

The caller of raid1_read_request could pass NULL or a valid pointer for
"struct r1bio *r1_bio", so it actually means whether r1_bio is existed
or not.

Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
 drivers/md/raid1.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ced076ba560e..696da6b8b7ed 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1210,7 +1210,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
 	const unsigned long do_sync = (bio->bi_opf & REQ_SYNC);
 	int max_sectors;
 	int rdisk;
-	bool print_msg = !!r1_bio;
+	bool r1bio_existed = !!r1_bio;
 	char b[BDEVNAME_SIZE];
 
 	/*
@@ -1220,7 +1220,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
 	 */
 	gfp_t gfp = r1_bio ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
 
-	if (print_msg) {
+	if (r1bio_existed) {
 		/* Need to get the block device name carefully */
 		struct md_rdev *rdev;
 		rcu_read_lock();
@@ -1252,7 +1252,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
 
 	if (rdisk < 0) {
 		/* couldn't find anywhere to read from */
-		if (print_msg) {
+		if (r1bio_existed) {
 			pr_crit_ratelimited("md/raid1:%s: %s: unrecoverable I/O read error for block %llu\n",
 					    mdname(mddev),
 					    b,
@@ -1263,7 +1263,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
 	}
 	mirror = conf->mirrors + rdisk;
 
-	if (print_msg)
+	if (r1bio_existed)
 		pr_info_ratelimited("md/raid1:%s: redirecting sector %llu to other mirror: %s\n",
 				    mdname(mddev),
 				    (unsigned long long)r1_bio->sector,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH V2 5/7] md/raid1: enable io accounting
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
                   ` (3 preceding siblings ...)
  2021-05-21  0:55 ` [PATCH V2 4/7] md/raid1: rename print_msg with r1bio_existed Guoqing Jiang
@ 2021-05-21  0:55 ` Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 6/7] md/raid10: " Guoqing Jiang
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

For raid1, we record the start time between split bio and clone bio,
and finish the accounting in the final endio.

Also introduce start_time in r1bio accordingly.

Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
 drivers/md/raid1.c | 7 +++++++
 drivers/md/raid1.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 696da6b8b7ed..51f2547c2007 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -300,6 +300,8 @@ static void call_bio_endio(struct r1bio *r1_bio)
 	if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
 		bio->bi_status = BLK_STS_IOERR;
 
+	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
+		bio_end_io_acct(bio, r1_bio->start_time);
 	bio_endio(bio);
 }
 
@@ -1292,6 +1294,9 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
 
 	r1_bio->read_disk = rdisk;
 
+	if (!r1bio_existed && blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
+		r1_bio->start_time = bio_start_io_acct(bio);
+
 	read_bio = bio_clone_fast(bio, gfp, &mddev->bio_set);
 
 	r1_bio->bios[rdisk] = read_bio;
@@ -1461,6 +1466,8 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 		r1_bio->sectors = max_sectors;
 	}
 
+	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
+		r1_bio->start_time = bio_start_io_acct(bio);
 	atomic_set(&r1_bio->remaining, 1);
 	atomic_set(&r1_bio->behind_remaining, 0);
 
diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index b7eb09e8c025..ccf10e59b116 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -158,6 +158,7 @@ struct r1bio {
 	sector_t		sector;
 	int			sectors;
 	unsigned long		state;
+	unsigned long		start_time;
 	struct mddev		*mddev;
 	/*
 	 * original bio going to /dev/mdx
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH V2 6/7] md/raid10: enable io accounting
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
                   ` (4 preceding siblings ...)
  2021-05-21  0:55 ` [PATCH V2 5/7] md/raid1: enable io accounting Guoqing Jiang
@ 2021-05-21  0:55 ` Guoqing Jiang
  2021-05-21  0:55 ` [PATCH V2 7/7] md: mark some personalities as deprecated Guoqing Jiang
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

For raid10, we record the start time between split bio and clone bio,
and finish the accounting in the final endio.

Also introduce start_time in r10bio accordingly.

Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
 drivers/md/raid10.c | 6 ++++++
 drivers/md/raid10.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 13f5e6b2a73d..16977e8e075d 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -297,6 +297,8 @@ static void raid_end_bio_io(struct r10bio *r10_bio)
 	if (!test_bit(R10BIO_Uptodate, &r10_bio->state))
 		bio->bi_status = BLK_STS_IOERR;
 
+	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
+		bio_end_io_acct(bio, r10_bio->start_time);
 	bio_endio(bio);
 	/*
 	 * Wake up any possible resync thread that waits for the device
@@ -1184,6 +1186,8 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
 	}
 	slot = r10_bio->read_slot;
 
+	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
+		r10_bio->start_time = bio_start_io_acct(bio);
 	read_bio = bio_clone_fast(bio, gfp, &mddev->bio_set);
 
 	r10_bio->devs[slot].bio = read_bio;
@@ -1483,6 +1487,8 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
 		r10_bio->master_bio = bio;
 	}
 
+	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
+		r10_bio->start_time = bio_start_io_acct(bio);
 	atomic_set(&r10_bio->remaining, 1);
 	md_bitmap_startwrite(mddev->bitmap, r10_bio->sector, r10_bio->sectors, 0);
 
diff --git a/drivers/md/raid10.h b/drivers/md/raid10.h
index 1461fd55311b..c34bb196790e 100644
--- a/drivers/md/raid10.h
+++ b/drivers/md/raid10.h
@@ -124,6 +124,7 @@ struct r10bio {
 	sector_t		sector;	/* virtual sector number */
 	int			sectors;
 	unsigned long		state;
+	unsigned long		start_time;
 	struct mddev		*mddev;
 	/*
 	 * original bio going to /dev/mdx
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH V2 7/7] md: mark some personalities as deprecated
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
                   ` (5 preceding siblings ...)
  2021-05-21  0:55 ` [PATCH V2 6/7] md/raid10: " Guoqing Jiang
@ 2021-05-21  0:55 ` Guoqing Jiang
  2021-05-21  8:00 ` [UPDATE PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
  2021-05-24  6:04 ` [PATCH V2 0/7] md: " Song Liu
  8 siblings, 0 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  0:55 UTC (permalink / raw)
  To: song; +Cc: linux-raid, artur.paszkiewicz

Mark the three personalities (linear, fault and multipath) as deprecated
because:

1. people usually use dm multipath or nvme multipath.
2. linear is already deprecated in MODULE_ALIAS.
3. no active development and user for fault from my understanding.

Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
 drivers/md/Kconfig        | 6 +++---
 drivers/md/md-faulty.c    | 2 +-
 drivers/md/md-linear.c    | 2 +-
 drivers/md/md-multipath.c | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index f2014385d48b..0602e82a9516 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -47,7 +47,7 @@ config MD_AUTODETECT
 	  If unsure, say Y.
 
 config MD_LINEAR
-	tristate "Linear (append) mode"
+	tristate "Linear (append) mode (deprecated)"
 	depends on BLK_DEV_MD
 	help
 	  If you say Y here, then your multiple devices driver will be able to
@@ -158,7 +158,7 @@ config MD_RAID456
 	  If unsure, say Y.
 
 config MD_MULTIPATH
-	tristate "Multipath I/O support"
+	tristate "Multipath I/O support (deprecated)"
 	depends on BLK_DEV_MD
 	help
 	  MD_MULTIPATH provides a simple multi-path personality for use
@@ -169,7 +169,7 @@ config MD_MULTIPATH
 	  If unsure, say N.
 
 config MD_FAULTY
-	tristate "Faulty test module for MD"
+	tristate "Faulty test module for MD (deprecated)"
 	depends on BLK_DEV_MD
 	help
 	  The "faulty" module allows for a block device that occasionally returns
diff --git a/drivers/md/md-faulty.c b/drivers/md/md-faulty.c
index fda4cb3f936f..c0dc6f2ef4a3 100644
--- a/drivers/md/md-faulty.c
+++ b/drivers/md/md-faulty.c
@@ -357,7 +357,7 @@ static void raid_exit(void)
 module_init(raid_init);
 module_exit(raid_exit);
 MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("Fault injection personality for MD");
+MODULE_DESCRIPTION("Fault injection personality for MD (deprecated)");
 MODULE_ALIAS("md-personality-10"); /* faulty */
 MODULE_ALIAS("md-faulty");
 MODULE_ALIAS("md-level--5");
diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c
index 63ed8329a98d..1ff51647a682 100644
--- a/drivers/md/md-linear.c
+++ b/drivers/md/md-linear.c
@@ -312,7 +312,7 @@ static void linear_exit (void)
 module_init(linear_init);
 module_exit(linear_exit);
 MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("Linear device concatenation personality for MD");
+MODULE_DESCRIPTION("Linear device concatenation personality for MD (deprecated)");
 MODULE_ALIAS("md-personality-1"); /* LINEAR - deprecated*/
 MODULE_ALIAS("md-linear");
 MODULE_ALIAS("md-level--1");
diff --git a/drivers/md/md-multipath.c b/drivers/md/md-multipath.c
index 776bbe542db5..e7d6486f090f 100644
--- a/drivers/md/md-multipath.c
+++ b/drivers/md/md-multipath.c
@@ -471,7 +471,7 @@ static void __exit multipath_exit (void)
 module_init(multipath_init);
 module_exit(multipath_exit);
 MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("simple multi-path personality for MD");
+MODULE_DESCRIPTION("simple multi-path personality for MD (deprecated)");
 MODULE_ALIAS("md-personality-7"); /* MULTIPATH */
 MODULE_ALIAS("md-multipath");
 MODULE_ALIAS("md-level--4");
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH V2 3/7] md: the latest try for improve io stats accounting
  2021-05-21  0:55 ` [PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
@ 2021-05-21  7:32   ` Artur Paszkiewicz
  2021-05-21  7:43     ` Guoqing Jiang
  0 siblings, 1 reply; 16+ messages in thread
From: Artur Paszkiewicz @ 2021-05-21  7:32 UTC (permalink / raw)
  To: Guoqing Jiang, song; +Cc: linux-raid

> @@ -2340,9 +2383,12 @@ int md_integrity_register(struct mddev *mddev)
>                              bdev_get_integrity(reference->bdev));
>  
>       pr_debug("md: data integrity enabled on %s\n", mdname(mddev));
> -     if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE)) {
> +     if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE) ||
> +         bioset_integrity_create(&mddev->md_io_bs, BIO_POOL_SIZE)) {
>               pr_err("md: failed to create integrity pool for %s\n",
>                      mdname(mddev));
> +             bioset_exit(&mddev->bio_set);
> +             bioset_exit(&mddev->md_io_bs);
>               return -EINVAL;

Are you sure bioset_exit() here is correct? This is always called from
pers->run() and the cleanup in case of error is handled in md_run().

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH V2 3/7] md: the latest try for improve io stats accounting
  2021-05-21  7:32   ` Artur Paszkiewicz
@ 2021-05-21  7:43     ` Guoqing Jiang
  0 siblings, 0 replies; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  7:43 UTC (permalink / raw)
  To: Artur Paszkiewicz, song; +Cc: linux-raid



On 5/21/21 3:32 PM, Artur Paszkiewicz wrote:
>> @@ -2340,9 +2383,12 @@ int md_integrity_register(struct mddev *mddev)
>>                               bdev_get_integrity(reference->bdev));
>>   
>>        pr_debug("md: data integrity enabled on %s\n", mdname(mddev));
>> -     if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE)) {
>> +     if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE) ||
>> +         bioset_integrity_create(&mddev->md_io_bs, BIO_POOL_SIZE)) {
>>                pr_err("md: failed to create integrity pool for %s\n",
>>                       mdname(mddev));
>> +             bioset_exit(&mddev->bio_set);
>> +             bioset_exit(&mddev->md_io_bs);
>>                return -EINVAL;
> Are you sure bioset_exit() here is correct? This is always called from
> pers->run() and the cleanup in case of error is handled in md_run().

You are right! Maybe it deserves additional comment given it is
not obvious. Will drop it.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [UPDATE PATCH V2 3/7] md: the latest try for improve io stats accounting
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
                   ` (6 preceding siblings ...)
  2021-05-21  0:55 ` [PATCH V2 7/7] md: mark some personalities as deprecated Guoqing Jiang
@ 2021-05-21  8:00 ` Guoqing Jiang
  2021-05-24  5:49   ` Song Liu
  2021-05-24  6:04 ` [PATCH V2 0/7] md: " Song Liu
  8 siblings, 1 reply; 16+ messages in thread
From: Guoqing Jiang @ 2021-05-21  8:00 UTC (permalink / raw)
  To: song; +Cc: linux-raid

From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>

Use generic io accounting functions to manage io stats. There was an
attempt to do this earlier in commit 18c0b223cf990172 ("md: use generic
io stats accounting functions to simplify io stat accounting"), but it
did not include a call to generic_end_io_acct() and caused issues with
tracking in-flight IOs, so it was later removed in commit 74672d069b29
("md: fix md io stats accounting broken").

This patch attempts to fix this by using both generic_start_io_acct()
and generic_end_io_acct(). To make it possible, in md_make_request() a
bio is cloned with additional data - struct md_io, which includes the io
start_time. A new bioset is introduced for this purpose. We call
generic_start_io_acct() and pass the clone instead of the original to
md_handle_request(). When it completes, we call generic_end_io_acct()
and complete the original bio.

This adds correct statistics about in-flight IOs and IO processing time,
interpreted e.g. in iostat as await, svctm, aqu-sz and %util.

It also fixes a situation where too many IOs where reported if a bio was
re-submitted to the mddev, because io accounting is now performed only
on newly arriving bios.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
[Guoqing: rebase and make generic accounting applies to personalities
	  which don't have clone infrastructure]
Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
---
Delete not necessary bioset_exit in md_integrity_register, thanks for
Artur's review.

 drivers/md/md.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/md/md.h |  1 +
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7ba00e4c862d..d8823db843db 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -441,6 +441,25 @@ void md_handle_request(struct mddev *mddev, struct bio *bio)
 }
 EXPORT_SYMBOL(md_handle_request);
 
+struct md_io {
+	struct mddev *mddev;
+	struct bio *orig_bio;
+	unsigned long start_time;
+	struct bio bio_clone;
+};
+
+static void md_end_io(struct bio *bio)
+{
+	struct md_io *md_io = bio->bi_private;
+	struct bio *orig_bio = md_io->orig_bio;
+
+	orig_bio->bi_status = bio->bi_status;
+
+	bio_end_io_acct(orig_bio, md_io->start_time);
+	bio_put(bio);
+	bio_endio(orig_bio);
+}
+
 static blk_qc_t md_submit_bio(struct bio *bio)
 {
 	const int rw = bio_data_dir(bio);
@@ -465,6 +484,30 @@ static blk_qc_t md_submit_bio(struct bio *bio)
 		return BLK_QC_T_NONE;
 	}
 
+	/*
+	 * clone bio under conditions:
+	 * 1. QUEUE_FLAG_IO_STAT flag is set.
+	 * 2. bio just enters md and it is not split from personality.
+	 */
+	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue) &&
+	    (bio->bi_pool != &mddev->md_io_bs) &&
+	    (mddev->pers->accounting_bio &&
+	     mddev->pers->accounting_bio(mddev, bio))) {
+		struct md_io *md_io;
+		struct bio *clone;
+
+		clone = bio_clone_fast(bio, GFP_NOIO, &mddev->md_io_bs);
+
+		md_io = container_of(clone, struct md_io, bio_clone);
+		md_io->mddev = mddev;
+		md_io->orig_bio = bio;
+		md_io->start_time = bio_start_io_acct(bio);
+
+		clone->bi_end_io = md_end_io;
+		clone->bi_private = md_io;
+		bio = clone;
+	}
+
 	/* bio could be mergeable after passing to underlayer */
 	bio->bi_opf &= ~REQ_NOMERGE;
 
@@ -2340,7 +2383,8 @@ int md_integrity_register(struct mddev *mddev)
 			       bdev_get_integrity(reference->bdev));
 
 	pr_debug("md: data integrity enabled on %s\n", mdname(mddev));
-	if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE)) {
+	if (bioset_integrity_create(&mddev->bio_set, BIO_POOL_SIZE) ||
+	    bioset_integrity_create(&mddev->md_io_bs, BIO_POOL_SIZE)) {
 		pr_err("md: failed to create integrity pool for %s\n",
 		       mdname(mddev));
 		return -EINVAL;
@@ -5569,6 +5613,7 @@ static void md_free(struct kobject *ko)
 
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);
+	bioset_exit(&mddev->md_io_bs);
 	kfree(mddev);
 }
 
@@ -5864,6 +5909,12 @@ int md_run(struct mddev *mddev)
 		if (err)
 			return err;
 	}
+	if (!bioset_initialized(&mddev->md_io_bs)) {
+		err = bioset_init(&mddev->md_io_bs, BIO_POOL_SIZE,
+				  offsetof(struct md_io, bio_clone), 0);
+		if (err)
+			return err;
+	}
 
 	spin_lock(&pers_lock);
 	pers = find_pers(mddev->level, mddev->clevel);
@@ -6041,6 +6092,7 @@ int md_run(struct mddev *mddev)
 abort:
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);
+	bioset_exit(&mddev->md_io_bs);
 	return err;
 }
 EXPORT_SYMBOL_GPL(md_run);
@@ -6264,6 +6316,7 @@ void md_stop(struct mddev *mddev)
 	__md_stop(mddev);
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);
+	bioset_exit(&mddev->md_io_bs);
 }
 
 EXPORT_SYMBOL_GPL(md_stop);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 5125ccf9df06..d2f476c427a9 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -487,6 +487,7 @@ struct mddev {
 	struct bio_set			sync_set; /* for sync operations like
 						   * metadata and bitmap writes
 						   */
+	struct bio_set			md_io_bs; /* for io accounting */
 
 	/* Generic flush handling.
 	 * The last to finish preflush schedules a worker to submit
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5
  2021-05-21  0:55 ` [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5 Guoqing Jiang
@ 2021-05-24  5:48   ` Song Liu
  2021-05-24  8:45   ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Song Liu @ 2021-05-24  5:48 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: linux-raid, Artur Paszkiewicz

On Thu, May 20, 2021 at 5:56 PM Guoqing Jiang <jgq516@gmail.com> wrote:
>
> Let's introduce accounting_bio which checks if md needs clone the bio
> for accounting.
>
> And add relevant function to raid0 and raid5 given both don't have
> their own clone infrastrure, also checks if it is split bio.
>
> Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
> ---
>  drivers/md/md.h    |  2 ++
>  drivers/md/raid0.c | 14 ++++++++++++++
>  drivers/md/raid5.c | 17 +++++++++++++++++
>  3 files changed, 33 insertions(+)
>
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 4da240ffe2c5..5125ccf9df06 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -605,6 +605,8 @@ struct md_personality
>         void *(*takeover) (struct mddev *mddev);
>         /* Changes the consistency policy of an active array. */
>         int (*change_consistency_policy)(struct mddev *mddev, const char *buf);
> +       /* check if need to clone bio for accounting in md layer */
> +       bool (*accounting_bio)(struct mddev *mddev, struct bio *bio);
>  };
>
>  struct md_sysfs_entry {
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index e5d7411cba9b..d309b639b5d9 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -748,6 +748,19 @@ static void raid0_quiesce(struct mddev *mddev, int quiesce)
>  {
>  }
>
> +/*
> + * Don't account the bio if it was split from mddev->bio_set.
> + */
> +static bool raid0_accounting_bio(struct mddev *mddev, struct bio *bio)
> +{
> +       bool ret = true;
> +
> +       if (bio->bi_pool == &mddev->bio_set)
> +               ret = false;

We can simply do "return bio->bi_pool != &mddev->bio_set;". And same for
raid5_accouting_bio.

> +
> +       return ret;
> +}
> +
>  static struct md_personality raid0_personality=
>  {
>         .name           = "raid0",
> @@ -760,6 +773,7 @@ static struct md_personality raid0_personality=
>         .size           = raid0_size,
>         .takeover       = raid0_takeover,
>         .quiesce        = raid0_quiesce,
> +       .accounting_bio = raid0_accounting_bio,
>  };
>
>  static int __init raid0_init (void)
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 841e1c1aa5e6..bcc1ceb69c73 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -8596,6 +8596,20 @@ static void *raid6_takeover(struct mddev *mddev)
>         return setup_conf(mddev);
>  }
>
> +/*
> + * Don't account the bio if it was split from r5conf->bio_split.
> + */
> +static bool raid5_accounting_bio(struct mddev *mddev, struct bio *bio)
> +{
> +       bool ret = true;
> +       struct r5conf *conf = mddev->private;
> +
> +       if (bio->bi_pool == &conf->bio_split)
> +               ret = false;
> +
> +       return ret;
> +}
> +
>  static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
>  {
>         struct r5conf *conf;
> @@ -8688,6 +8702,7 @@ static struct md_personality raid6_personality =
>         .quiesce        = raid5_quiesce,
>         .takeover       = raid6_takeover,
>         .change_consistency_policy = raid5_change_consistency_policy,
> +       .accounting_bio = raid5_accounting_bio,
>  };
>  static struct md_personality raid5_personality =
>  {
> @@ -8712,6 +8727,7 @@ static struct md_personality raid5_personality =
>         .quiesce        = raid5_quiesce,
>         .takeover       = raid5_takeover,
>         .change_consistency_policy = raid5_change_consistency_policy,
> +       .accounting_bio = raid5_accounting_bio,
>  };
>
>  static struct md_personality raid4_personality =
> @@ -8737,6 +8753,7 @@ static struct md_personality raid4_personality =
>         .quiesce        = raid5_quiesce,
>         .takeover       = raid4_takeover,
>         .change_consistency_policy = raid5_change_consistency_policy,
> +       .accounting_bio = raid5_accounting_bio,
>  };
>
>  static int __init raid5_init(void)
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [UPDATE PATCH V2 3/7] md: the latest try for improve io stats accounting
  2021-05-21  8:00 ` [UPDATE PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
@ 2021-05-24  5:49   ` Song Liu
  0 siblings, 0 replies; 16+ messages in thread
From: Song Liu @ 2021-05-24  5:49 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: linux-raid

On Fri, May 21, 2021 at 1:01 AM Guoqing Jiang <jgq516@gmail.com> wrote:
>
> From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
>
> Use generic io accounting functions to manage io stats. There was an
> attempt to do this earlier in commit 18c0b223cf990172 ("md: use generic
> io stats accounting functions to simplify io stat accounting"), but it
> did not include a call to generic_end_io_acct() and caused issues with
> tracking in-flight IOs, so it was later removed in commit 74672d069b29
> ("md: fix md io stats accounting broken").
>
> This patch attempts to fix this by using both generic_start_io_acct()
> and generic_end_io_acct(). To make it possible, in md_make_request() a
> bio is cloned with additional data - struct md_io, which includes the io
> start_time. A new bioset is introduced for this purpose. We call
> generic_start_io_acct() and pass the clone instead of the original to
> md_handle_request(). When it completes, we call generic_end_io_acct()
> and complete the original bio.
>
> This adds correct statistics about in-flight IOs and IO processing time,
> interpreted e.g. in iostat as await, svctm, aqu-sz and %util.
>
> It also fixes a situation where too many IOs where reported if a bio was
> re-submitted to the mddev, because io accounting is now performed only
> on newly arriving bios.
>
> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
> [Guoqing: rebase and make generic accounting applies to personalities
>           which don't have clone infrastructure]
> Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
> ---
> Delete not necessary bioset_exit in md_integrity_register, thanks for
> Artur's review.
>
>  drivers/md/md.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  drivers/md/md.h |  1 +
>  2 files changed, 55 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 7ba00e4c862d..d8823db843db 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -441,6 +441,25 @@ void md_handle_request(struct mddev *mddev, struct bio *bio)
>  }
>  EXPORT_SYMBOL(md_handle_request);
>
> +struct md_io {
> +       struct mddev *mddev;
> +       struct bio *orig_bio;
> +       unsigned long start_time;
> +       struct bio bio_clone;
> +};
> +
> +static void md_end_io(struct bio *bio)
> +{
> +       struct md_io *md_io = bio->bi_private;
> +       struct bio *orig_bio = md_io->orig_bio;
> +
> +       orig_bio->bi_status = bio->bi_status;
> +
> +       bio_end_io_acct(orig_bio, md_io->start_time);
> +       bio_put(bio);
> +       bio_endio(orig_bio);
> +}
> +
>  static blk_qc_t md_submit_bio(struct bio *bio)
>  {
>         const int rw = bio_data_dir(bio);
> @@ -465,6 +484,30 @@ static blk_qc_t md_submit_bio(struct bio *bio)
>                 return BLK_QC_T_NONE;
>         }
>
> +       /*
> +        * clone bio under conditions:
> +        * 1. QUEUE_FLAG_IO_STAT flag is set.
> +        * 2. bio just enters md and it is not split from personality.
> +        */

We had iostat on by default. So, let set QUEUE_FLAG_IO_STAT in
md_run().

Thanks,
Song

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH V2 0/7] md: io stats accounting
  2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
                   ` (7 preceding siblings ...)
  2021-05-21  8:00 ` [UPDATE PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
@ 2021-05-24  6:04 ` Song Liu
  2021-05-27 14:09   ` Paweł Wiejacha
  8 siblings, 1 reply; 16+ messages in thread
From: Song Liu @ 2021-05-24  6:04 UTC (permalink / raw)
  To: Guoqing Jiang, Christoph Hellwig, Paweł Wiejacha
  Cc: linux-raid, Artur Paszkiewicz

On Thu, May 20, 2021 at 5:56 PM Guoqing Jiang <jgq516@gmail.com> wrote:
>
> V2 changes:
>
> 1. add accounting_bio to md_personality.
> 2. cleanup in case bioset_integrity_create fails.
> 3. use bio_end_io_acct.
> 4. remove patch for enable io accounting for multipath.
> 5. add one patch to rename print_msg.
> 6. add one patch to deprecate linear, multipath and faulty.
>
> Artur Paszkiewicz (1):
>   md: the latest try for improve io stats accounting
>
> Guoqing Jiang (6):
>   md: revert io stats accounting
>   md: add accounting_bio for raid0 and raid5
>   md/raid1: rename print_msg with r1bio_existed
>   md/raid1: enable io accounting
>   md/raid10: enable io accounting
>   md: mark some personalities as deprecated

Thanks Guoqing! This version looks great to me. No need to send v3 for
those two minor comments.

Artur and Christoph, could you please share your comments on this version
and/or reply with your Reviewed-by tag?

Pawel, could you please run your tests with this set? Note that, the test should
be run after setting
   echo 1 > /sys/block/mdXXX/queue/iostats

Thanks,
Song

>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5
  2021-05-21  0:55 ` [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5 Guoqing Jiang
  2021-05-24  5:48   ` Song Liu
@ 2021-05-24  8:45   ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2021-05-24  8:45 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: song, linux-raid, artur.paszkiewicz

On Fri, May 21, 2021 at 08:55:16AM +0800, Guoqing Jiang wrote:
> Let's introduce accounting_bio which checks if md needs clone the bio
> for accounting.
> 
> And add relevant function to raid0 and raid5 given both don't have
> their own clone infrastrure, also checks if it is split bio.

Please don't add another indirect call in the I/O submission fast path.
With Spectre mitigations these are really slow, and also are hard to
follow.

I think moving the call to allocate the accounting bio clone entirely
into the personalities is probably the cleanest approach without any
of these downsides.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH V2 0/7] md: io stats accounting
  2021-05-24  6:04 ` [PATCH V2 0/7] md: " Song Liu
@ 2021-05-27 14:09   ` Paweł Wiejacha
  0 siblings, 0 replies; 16+ messages in thread
From: Paweł Wiejacha @ 2021-05-27 14:09 UTC (permalink / raw)
  To: Song Liu; +Cc: Guoqing Jiang, Christoph Hellwig, linux-raid, Artur Paszkiewicz

On Mon, 24 May 2021 at 08:04, Song Liu <song@kernel.org> wrote:
>
> On Thu, May 20, 2021 at 5:56 PM Guoqing Jiang <jgq516@gmail.com> wrote:
> >
> > V2 changes:
> >
> > 1. add accounting_bio to md_personality.
> > 2. cleanup in case bioset_integrity_create fails.
> > 3. use bio_end_io_acct.
> > 4. remove patch for enable io accounting for multipath.
> > 5. add one patch to rename print_msg.
> > 6. add one patch to deprecate linear, multipath and faulty.
> >
> > Artur Paszkiewicz (1):
> >   md: the latest try for improve io stats accounting
> >
> > Guoqing Jiang (6):
> >   md: revert io stats accounting
> >   md: add accounting_bio for raid0 and raid5
> >   md/raid1: rename print_msg with r1bio_existed
> >   md/raid1: enable io accounting
> >   md/raid10: enable io accounting
> >   md: mark some personalities as deprecated
>
> Thanks Guoqing! This version looks great to me. No need to send v3 for
> those two minor comments.
>
> Artur and Christoph, could you please share your comments on this version
> and/or reply with your Reviewed-by tag?
>
> Pawel, could you please run your tests with this set? Note that, the test should
> be run after setting
>    echo 1 > /sys/block/mdXXX/queue/iostats

I've been testing this patchset for a while (with md iostats enabled)
and no double fault occurred.

Thanks!
Paweł Wiejacha

> Thanks,
> Song
>
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-05-27 14:10 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-21  0:55 [PATCH V2 0/7] md: io stats accounting Guoqing Jiang
2021-05-21  0:55 ` [PATCH V2 1/7] md: revert " Guoqing Jiang
2021-05-21  0:55 ` [PATCH V2 2/7] md: add accounting_bio for raid0 and raid5 Guoqing Jiang
2021-05-24  5:48   ` Song Liu
2021-05-24  8:45   ` Christoph Hellwig
2021-05-21  0:55 ` [PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
2021-05-21  7:32   ` Artur Paszkiewicz
2021-05-21  7:43     ` Guoqing Jiang
2021-05-21  0:55 ` [PATCH V2 4/7] md/raid1: rename print_msg with r1bio_existed Guoqing Jiang
2021-05-21  0:55 ` [PATCH V2 5/7] md/raid1: enable io accounting Guoqing Jiang
2021-05-21  0:55 ` [PATCH V2 6/7] md/raid10: " Guoqing Jiang
2021-05-21  0:55 ` [PATCH V2 7/7] md: mark some personalities as deprecated Guoqing Jiang
2021-05-21  8:00 ` [UPDATE PATCH V2 3/7] md: the latest try for improve io stats accounting Guoqing Jiang
2021-05-24  5:49   ` Song Liu
2021-05-24  6:04 ` [PATCH V2 0/7] md: " Song Liu
2021-05-27 14:09   ` Paweł Wiejacha

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.