linux-btrfs.vger.kernel.org archive mirror
From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs@vger.kernel.org
Cc: dsterba@suse.com, josef@toxicpanda.com
Subject: [PATCH 1/4] btrfs: add read_policy latency
Date: Wed, 28 Oct 2020 21:26:00 +0800
Message-ID: <ae5e526c1549d4e6f602c09d8235aa406c5a1404.1603884539.git.anand.jain@oracle.com>
In-Reply-To: <cover.1603884539.git.anand.jain@oracle.com>

The latency read policy routes each read IO to the mirror device with the
lowest historical average read wait time, computed as the cumulative read
wait time divided by the number of completed read IOs on that device. This
patch obtains both counters from the kernel block layer's per-partition
statistics. The inflight command count is not factored in (see the v1
changelog below).

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v1: Drop part_stat_read_all; use part_stat_read instead.
    Drop inflight.

 fs/btrfs/sysfs.c   |  3 ++-
 fs/btrfs/volumes.c | 39 ++++++++++++++++++++++++++++++++++++++-
 fs/btrfs/volumes.h |  1 +
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 4dbf90ff088a..88cbf7b2edf0 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -906,7 +906,8 @@ static bool btrfs_strmatch(const char *given, const char *golden)
 	return false;
 }
 
-static const char * const btrfs_read_policy_name[] = { "pid" };
+/* Must follow the order as in enum btrfs_read_policy */
+static const char * const btrfs_read_policy_name[] = { "pid", "latency" };
 
 static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 				      struct kobj_attribute *a, char *buf)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6bf487626f23..48587009b656 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -14,6 +14,7 @@
 #include <linux/semaphore.h>
 #include <linux/uuid.h>
 #include <linux/list_sort.h>
+#include <linux/part_stat.h>
 #include "misc.h"
 #include "ctree.h"
 #include "extent_map.h"
@@ -5468,6 +5469,39 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
 	return ret;
 }
 
+static int btrfs_find_best_stripe(struct btrfs_fs_info *fs_info,
+				  struct map_lookup *map, int first,
+				  int num_stripe)
+{
+	u64 est_wait = 0;
+	int best_stripe = 0;
+	int index;
+
+	for (index = first; index < first + num_stripe; index++) {
+		u64 read_wait;
+		u64 avg_wait = 0;
+		unsigned long read_ios;
+		struct btrfs_device *device = map->stripes[index].dev;
+
+		read_wait = part_stat_read(device->bdev->bd_part, nsecs[READ]);
+		read_ios = part_stat_read(device->bdev->bd_part, ios[READ]);
+
+		if (read_wait && read_ios && read_wait >= read_ios)
+			avg_wait = div_u64(read_wait, read_ios);
+		else
+			btrfs_info_rl(device->fs_devices->fs_info,
+			"devid: %llu avg_wait ZERO read_wait %llu read_ios %lu",
+				      device->devid, read_wait, read_ios);
+
+		if (est_wait == 0 || est_wait > avg_wait) {
+			est_wait = avg_wait;
+			best_stripe = index;
+		}
+	}
+
+	return best_stripe;
+}
+
 static int find_live_mirror(struct btrfs_fs_info *fs_info,
 			    struct map_lookup *map, int first,
 			    int dev_replace_is_ongoing)
@@ -5498,6 +5532,10 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 	case BTRFS_READ_POLICY_PID:
 		preferred_mirror = first + current->pid % num_stripes;
 		break;
+	case BTRFS_READ_POLICY_LATENCY:
+		preferred_mirror = btrfs_find_best_stripe(fs_info, map, first,
+							  num_stripes);
+		break;
 	}
 
 	if (dev_replace_is_ongoing &&
@@ -6114,7 +6152,6 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
 
 	} else if (map->type & BTRFS_BLOCK_GROUP_RAID10) {
 		u32 factor = map->num_stripes / map->sub_stripes;
-
 		stripe_nr = div_u64_rem(stripe_nr, factor, &stripe_index);
 		stripe_index *= map->sub_stripes;
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 97f075516696..24db586a9837 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -217,6 +217,7 @@ enum btrfs_chunk_allocation_policy {
  */
 enum btrfs_read_policy {
 	BTRFS_READ_POLICY_PID,
+	BTRFS_READ_POLICY_LATENCY,
 	BTRFS_NR_READ_POLICY,
 };
 
-- 
2.25.1
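
The selection rule described in the commit message can be sketched in
userspace as follows. This is a minimal illustration, not the kernel code:
struct dev_read_stats, avg_read_wait() and pick_lowest_latency() are
hypothetical names that stand in for the part_stat_read() counters and for
btrfs_find_best_stripe(). The sketch also assumes that a device with no
completed reads counts as having zero average wait, so it is tried first;
the patch does not state that policy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device block layer read counters. */
struct dev_read_stats {
	uint64_t read_wait_ns;	/* cumulative read wait, like nsecs[READ] */
	uint64_t read_ios;	/* completed reads, like ios[READ] */
};

/* Average wait per completed read; 0 when there is no history yet. */
static uint64_t avg_read_wait(const struct dev_read_stats *s)
{
	if (s->read_ios == 0)
		return 0;
	return s->read_wait_ns / s->read_ios;
}

/*
 * Return the index in [first, first + num) whose device has the lowest
 * average read wait. Ties go to the lowest index. Assumption of this
 * sketch: a device without history (average 0) wins, so it gets sampled.
 */
static int pick_lowest_latency(const struct dev_read_stats *stats,
			       int first, int num)
{
	uint64_t best_wait = 0;
	int best = first;
	int i;

	assert(num > 0);

	for (i = first; i < first + num; i++) {
		uint64_t avg = avg_read_wait(&stats[i]);

		/* Always take the first candidate, then only strict improvements. */
		if (i == first || avg < best_wait) {
			best_wait = avg;
			best = i;
		}
	}

	return best;
}
```

For example, with one device at 3000 ns over 3 reads (average 1000 ns) and
another at 500 ns over 1 read (average 500 ns), pick_lowest_latency() on
that pair returns the index of the second device.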
