linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device)
@ 2020-02-17 11:12 Anand Jain
  2020-02-17 11:12 ` [PATCH v5 1/5] btrfs: add btrfs_strmatch helper Anand Jain
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Anand Jain @ 2020-02-17 11:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

v5:
Worked on review comments as received in its previous version.
Please refer to individual patches for the specific changes.
Introduces the new read_policy 'device'.

v4:
Rename readmirror attribute to read_policy. Drop separate kobj for
readmirror instead create read_policy attribute in fsid kobj.
merge v2:2/3 and v2:3/3 into v4:2/2. Patch titles have changed.
 
v3:
v2:
Mainly fixes the fs_devices::readmirror declaration type from atomic_t
to u8. (Thanks Josef).

v1:
As of now we use only %pid method to read stripped mirrored data. So
application's process id determines the stripe id to be read. This type
of routing typically helps in a system with many small independent
applications tying to read random data. On the other hand the %pid
based read IO distribution policy is inefficient if there is a single
application trying to read large data as because the overall disk
bandwidth would remains under utilized.

One type of readmirror policy isn't good enough and other choices are
routing the IO based on device's waitqueue or manual when we have a
read-preferred device or a readmirror policy based on the target storage
caching. So this patch-set introduces a framework where we could add more
readmirror policies.

This policy is a filesystem wide policy as of now, and though the
readmirror policy at the subvolume level is a novel approach as it
provides maximum flexibility in the data center, but as of now its not
practical to implement such a granularity as you can't really ensure
reflinked extents will be read from the stripe of its desire and so
there will be more limitations and it can be assessed separately.

The approach in this patch-set is sys interface with in-memory policy.
And does not add any new readmirror type in this set, which can be add
once we are ok with the framework. Also the default policy remains %pid.

Previous works:
----------------------------------------------------------------------
There were few RFCs [1] before, mainly to figure out storage
(or in memory only) for the readmirror policy and the interface needed.

[1]
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg86368.html

https://lore.kernel.org/linux-btrfs/20190826090438.7044-1-anand.jain@oracle.com/

https://lore.kernel.org/linux-btrfs/5fcf9c23-89b5-b167-1f80-a0f4ac107d0b@oracle.com/

https://patchwork.kernel.org/cover/10859213/

Mount -o:
In the first trial it was attempted to use the mount -o option to carry
the readmirror policy, this is good for debugging which can make sure
even the mount thread metadata tree blocks are read from the disk desired.
It was very effective in testing radi1/raid10 write-holes.

Extended attribute:
As extended attribute is associated with the inode, to implement this
there is bit of extended attribute abuse or else makes it mandatory to
mount the rootid 5. Its messy unless readmirror policy is applied at the
subvol level which is not possible as of now. 

An item type:
The proposed patch was to create an item to hold the readmirror policy,
it makes sense when compared to the abusive extended attribute approach
but introduces a new item and so no backward compatibility.
-----------------------------------------------------------------------


Anand Jain (5):
  btrfs: add btrfs_strmatch helper
  btrfs: create read policy framework
  btrfs: create read policy sysfs attribute, pid
  btrfs: introduce new device-state read_preferred
  btrfs: introduce new read_policy device

 fs/btrfs/sysfs.c   | 145 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.c |  45 ++++++++++++++++-
 fs/btrfs/volumes.h |  16 ++++++
 3 files changed, 205 insertions(+), 1 deletion(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v5 1/5] btrfs: add btrfs_strmatch helper
  2020-02-17 11:12 [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) Anand Jain
@ 2020-02-17 11:12 ` Anand Jain
  2020-02-17 11:12 ` [PATCH v5 2/5] btrfs: create read policy framework Anand Jain
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2020-02-17 11:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

Add a generic helper to match the golden-string in the given-string,
and ignore the leading and trailing whitespaces if any.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Suggested-by: David Sterba <dsterba@suse.com>
---
v5: born

 fs/btrfs/sysfs.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 93cf76118a04..7bb68cef98ab 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -809,6 +809,29 @@ static ssize_t btrfs_checksum_show(struct kobject *kobj,
 
 BTRFS_ATTR(, checksum, btrfs_checksum_show);
 
+/*
+ * Match the %golden in the %given. Ignore the leading and trailing whitespaces
+ * if any.
+ */
+static int btrfs_strmatch(const char *given, const char *golden)
+{
+	size_t len = strlen(golden);
+	char *stripped;
+
+	/* strip leading whitespace */
+	stripped = skip_spaces(given);
+
+	if (strncmp(stripped, golden, len) == 0) {
+		/* strip trailing whitespace */
+		if (strlen(skip_spaces(stripped + len)))
+			return -EINVAL;
+
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
 static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, label),
 	BTRFS_ATTR_PTR(, nodesize),
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 2/5] btrfs: create read policy framework
  2020-02-17 11:12 [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) Anand Jain
  2020-02-17 11:12 ` [PATCH v5 1/5] btrfs: add btrfs_strmatch helper Anand Jain
@ 2020-02-17 11:12 ` Anand Jain
  2020-02-19  7:41   ` kbuild test robot
  2020-02-17 11:12 ` [PATCH v5 3/5] btrfs: create read policy sysfs attribute, pid Anand Jain
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Anand Jain @ 2020-02-17 11:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

As of now we use %pid method to read stripped mirrored data, which means
process id determines the stripe id to be read. This type of routing
typically helps in a system with many small independent processes tying
to read random data. On the other hand the %pid based read IO policy is
inefficient because if there is a single process trying to read large
data the overall disk bandwidth remains under-utilized.

So this patch introduces read policy framework so that we could add more
read policies, such as IO routing based on device's wait-queue or manual
when we have a read-preferred device or a policy based on the target
storage caching.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v5: Title renamed from:- btrfs: add read_policy framework
    Change log updated.
    Unnecessary comment dropped, added more where necessary.
    Optimize code in the switch remove duplicate code.
    Define BTRFS_READ_POLICY_DEFAULT dropped.
    Rename enum btrfs_read_policy_type to enum btrfs_read_policy.
    Rename BTRFS_READ_BY_PID to BTRFS_READ_POLICY_PID.
v4: -
v3: Declare fs_devices::readmirror as enum btrfs_readmirror_policy_type
v2: Declare fs_devices::readmirror as u8 instead of atomic_t
    A small change in comment and change log wordings.

 fs/btrfs/volumes.c | 13 ++++++++++++-
 fs/btrfs/volumes.h | 14 ++++++++++++++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 387f80656476..b6efb87bb0ae 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1209,6 +1209,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->opened = 1;
 	fs_devices->latest_bdev = latest_dev->bdev;
 	fs_devices->total_rw_bytes = 0;
+	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
 out:
 	return ret;
 }
@@ -5358,7 +5359,17 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 	else
 		num_stripes = map->num_stripes;
 
-	preferred_mirror = first + current->pid % num_stripes;
+	switch (fs_info->fs_devices->read_policy) {
+	default:
+		/*
+		 * Shouldn't happen, just warn and use pid instead of failing.
+		 */
+		btrfs_warn_rl(fs_info,
+			      "unknown read_policy type %u, fallback to pid",
+			      fs_info->fs_devices->read_policy);
+	case BTRFS_READ_POLICY_PID:
+		preferred_mirror = first + current->pid % num_stripes;
+	}
 
 	if (dev_replace_is_ongoing &&
 	    fs_info->dev_replace.cont_reading_from_srcdev_mode ==
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index f01552a0785e..ed2bba741b6e 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -209,6 +209,15 @@ struct btrfs_device {
 BTRFS_DEVICE_GETSET_FUNCS(disk_total_bytes);
 BTRFS_DEVICE_GETSET_FUNCS(bytes_used);
 
+/*
+ * Read policies for the mirrored block groups, read picks the stripe based
+ * on these policies.
+ */
+enum btrfs_read_policy {
+	BTRFS_READ_POLICY_PID,
+	BTRFS_NR_READ_POLICY,
+};
+
 struct btrfs_fs_devices {
 	u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
@@ -260,6 +269,11 @@ struct btrfs_fs_devices {
 	struct kobject *devices_kobj;
 	struct kobject *devinfo_kobj;
 	struct completion kobj_unregister;
+
+	/*
+	 * policy used to read the mirrored stripes
+	 */
+	enum btrfs_read_policy read_policy;
 };
 
 #define BTRFS_BIO_INLINE_CSUM_SIZE	64
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 3/5] btrfs: create read policy sysfs attribute, pid
  2020-02-17 11:12 [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) Anand Jain
  2020-02-17 11:12 ` [PATCH v5 1/5] btrfs: add btrfs_strmatch helper Anand Jain
  2020-02-17 11:12 ` [PATCH v5 2/5] btrfs: create read policy framework Anand Jain
@ 2020-02-17 11:12 ` Anand Jain
  2020-02-17 11:12 ` [PATCH v5 4/5] btrfs: introduce new device-state read_preferred Anand Jain
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2020-02-17 11:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

Add

 /sys/fs/btrfs/UUID/read_policy

attribute so that the read policy for the raid1 and raid10 chunks can be
tuned.

When this attribute is read, it shall show all available policies, with
active policy being with in [ ]. The read_policy attribute can be written
using one of the items listed in there.

For example:
  $cat /sys/fs/btrfs/UUID/read_policy
  [pid]
  $echo pid > /sys/fs/btrfs/UUID/read_policy

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v5:
  Title rename: old: btrfs: sysfs, add read_policy attribute
  Uses the btrfs_strmatch() helper (BTRFS_READ_POLICY_NAME_MAX dropped).
  Use the table for the policy names.
  Rename len to ret.
  Use a simple logic to prefix space in btrfs_read_policy_show()
 
v4:-
v3: rename [by_pid] to [pid]
v2: v2: check input len before strip and kstrdup

 fs/btrfs/sysfs.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 7bb68cef98ab..c9a8850b186a 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -832,6 +832,54 @@ static int btrfs_strmatch(const char *given, const char *golden)
 	return -EINVAL;
 }
 
+static const char* const btrfs_read_policy_name[] = { "pid" };
+
+static ssize_t btrfs_read_policy_show(struct kobject *kobj,
+				      struct kobj_attribute *a, char *buf)
+{
+	int i;
+	ssize_t ret = 0;
+	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+		if (fs_devices->read_policy == i)
+			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s[%s]",
+					(ret == 0 ? "" : " "),
+					btrfs_read_policy_name[i]);
+		else
+			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
+					(ret == 0 ? "" : " "),
+					btrfs_read_policy_name[i]);
+	}
+
+	ret += snprintf(buf + ret, PAGE_SIZE - ret, "\n");
+
+	return ret;
+}
+
+static ssize_t btrfs_read_policy_store(struct kobject *kobj,
+				       struct kobj_attribute *a,
+				       const char *buf, size_t len)
+{
+	int i;
+	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+		if (btrfs_strmatch(buf, btrfs_read_policy_name[i]) == 0) {
+			if (i != fs_devices->read_policy) {
+				fs_devices->read_policy = i;
+				btrfs_info(fs_devices->fs_info,
+					   "read policy set to '%s'",
+					   btrfs_read_policy_name[i]);
+			}
+			return len;
+		}
+	}
+
+	return -EINVAL;
+}
+BTRFS_ATTR_RW(, read_policy, btrfs_read_policy_show, btrfs_read_policy_store);
+
 static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, label),
 	BTRFS_ATTR_PTR(, nodesize),
@@ -840,6 +888,7 @@ static int btrfs_strmatch(const char *given, const char *golden)
 	BTRFS_ATTR_PTR(, quota_override),
 	BTRFS_ATTR_PTR(, metadata_uuid),
 	BTRFS_ATTR_PTR(, checksum),
+	BTRFS_ATTR_PTR(, read_policy),
 	NULL,
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 4/5] btrfs: introduce new device-state read_preferred
  2020-02-17 11:12 [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) Anand Jain
                   ` (2 preceding siblings ...)
  2020-02-17 11:12 ` [PATCH v5 3/5] btrfs: create read policy sysfs attribute, pid Anand Jain
@ 2020-02-17 11:12 ` Anand Jain
  2020-02-17 11:12 ` [PATCH v5 5/5] btrfs: introduce new read_policy device Anand Jain
  2020-02-18 16:35 ` [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) David Sterba
  5 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2020-02-17 11:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

Preparatory patch to add new read_policy type 'device'.
Provides a sysfs interface to set the device state as read_preferred.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v5: born

 fs/btrfs/sysfs.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.h |  1 +
 2 files changed, 50 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index c9a8850b186a..638ce3f4abe3 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1317,11 +1317,60 @@ static ssize_t btrfs_devinfo_writeable_show(struct kobject *kobj,
 }
 BTRFS_ATTR(devid, writeable, btrfs_devinfo_writeable_show);
 
+static ssize_t btrfs_devinfo_read_pref_show(struct kobject *kobj,
+					    struct kobj_attribute *a, char *buf)
+{
+	int val;
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	val = !!test_bit(BTRFS_DEV_STATE_READ_PREFERRED, &device->dev_state);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+
+static ssize_t btrfs_devinfo_read_pref_store(struct kobject *kobj,
+					     struct kobj_attribute *a,
+					     const char *buf, size_t len)
+{
+	int ret;
+	unsigned long val;
+	struct btrfs_device *device;
+
+	ret = kstrtoul(skip_spaces(buf), 0, &val);
+	if (ret)
+		return ret;
+
+	if (val != 0 && val != 1)
+		return -EINVAL;
+
+	/*
+	 * lock is not required, the btrfs_device struct can't be freed while
+	 * its kobject btrfs_device::devid_kobj is still open.
+	 */
+	device = container_of(kobj, struct btrfs_device, devid_kobj);
+
+	if (val) {
+		set_bit(BTRFS_DEV_STATE_READ_PREFERRED, &device->dev_state);
+		btrfs_info(device->fs_devices->fs_info,
+			   "set read preferred on devid %llu", device->devid);
+	} else {
+		clear_bit(BTRFS_DEV_STATE_READ_PREFERRED, &device->dev_state);
+		btrfs_info(device->fs_devices->fs_info,
+			   "reset read preferred on devid %llu", device->devid);
+	}
+
+	return len;
+}
+BTRFS_ATTR_RW(devid, read_preferred, btrfs_devinfo_read_pref_show,
+	      btrfs_devinfo_read_pref_store);
+
 static struct attribute *devid_attrs[] = {
 	BTRFS_ATTR_PTR(devid, in_fs_metadata),
 	BTRFS_ATTR_PTR(devid, missing),
 	BTRFS_ATTR_PTR(devid, replace_target),
 	BTRFS_ATTR_PTR(devid, writeable),
+	BTRFS_ATTR_PTR(devid, read_preferred),
 	NULL
 };
 ATTRIBUTE_GROUPS(devid);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index ed2bba741b6e..07962a0ce898 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -52,6 +52,7 @@ struct btrfs_io_geometry {
 #define BTRFS_DEV_STATE_MISSING		(2)
 #define BTRFS_DEV_STATE_REPLACE_TGT	(3)
 #define BTRFS_DEV_STATE_FLUSH_SENT	(4)
+#define BTRFS_DEV_STATE_READ_PREFERRED	(5)
 
 struct btrfs_device {
 	struct list_head dev_list; /* device_list_mutex */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 5/5] btrfs: introduce new read_policy device
  2020-02-17 11:12 [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) Anand Jain
                   ` (3 preceding siblings ...)
  2020-02-17 11:12 ` [PATCH v5 4/5] btrfs: introduce new device-state read_preferred Anand Jain
@ 2020-02-17 11:12 ` Anand Jain
  2020-02-18 16:35 ` [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) David Sterba
  5 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2020-02-17 11:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

A new read policy 'device' is introduced with this patch, which when set
can pick the read_preferred device for reading. This tunable is for
advance users who knows what they are doing so that they have the
flexibility of making sure that reads are read from the device they
prefer.

For example:

$ cat /sys/fs/btrfs/UUID/read_policy
[pid] device

$echo 1 > /sys/fs/btrfs/UUID/devinfo/1/read_preferred

$echo device > /sys/fs/btrfs/UUID/read_policy

$cat /sys/fs/btrfs/UUID/read_policy
pid [device]

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v5: born

 fs/btrfs/sysfs.c   | 26 +++++++++++++++++++++++++-
 fs/btrfs/volumes.c | 36 ++++++++++++++++++++++++++++++++++--
 fs/btrfs/volumes.h |  1 +
 3 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 638ce3f4abe3..e142a31c62e7 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -832,7 +832,8 @@ static int btrfs_strmatch(const char *given, const char *golden)
 	return -EINVAL;
 }
 
-static const char* const btrfs_read_policy_name[] = { "pid" };
+/* Must follow the order as in enum btrfs_read_policy */
+static const char* const btrfs_read_policy_name[] = { "pid", "device" };
 
 static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 				      struct kobj_attribute *a, char *buf)
@@ -857,6 +858,25 @@ static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 	return ret;
 }
 
+static bool btrfs_has_read_pref_device(struct btrfs_fs_devices *fs_devices)
+{
+	struct btrfs_device *device;
+
+	/*
+	 * rcu is good enough. If read preferred device is deleted by another
+	 * thread, the find_live_mirror will reset the policy to default.
+	 */
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
+		if (test_bit(BTRFS_DEV_STATE_READ_PREFERRED,
+			     &device->dev_state))
+			return true;
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
 static ssize_t btrfs_read_policy_store(struct kobject *kobj,
 				       struct kobj_attribute *a,
 				       const char *buf, size_t len)
@@ -866,6 +886,10 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
 
 	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
 		if (btrfs_strmatch(buf, btrfs_read_policy_name[i]) == 0) {
+			if (i == BTRFS_READ_POLICY_DEVICE &&
+			    ! btrfs_has_read_pref_device(fs_devices))
+					return -EINVAL;
+
 			if (i != fs_devices->read_policy) {
 				fs_devices->read_policy = i;
 				btrfs_info(fs_devices->fs_info,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b6efb87bb0ae..53020699c6cc 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5341,6 +5341,25 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
 	return ret;
 }
 
+static int btrfs_find_read_preferred(struct map_lookup *map, int num_stripe)
+{
+	int i;
+
+	/*
+	 * If there are more than one read preferred devices, then just pick the
+	 * first found read preferred device as of now. Once we have the Qdepth
+	 * based device selection, we could pick the least busy device among the
+	 * read preferred devices.
+	 */
+	for (i = 0; i < num_stripe; i++) {
+		if (test_bit(BTRFS_DEV_STATE_READ_PREFERRED,
+			     &map->stripes[i].dev->dev_state))
+			return i;
+        }
+
+	return -EINVAL;
+}
+
 static int find_live_mirror(struct btrfs_fs_info *fs_info,
 			    struct map_lookup *map, int first,
 			    int dev_replace_is_ongoing)
@@ -5360,15 +5379,28 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 		num_stripes = map->num_stripes;
 
 	switch (fs_info->fs_devices->read_policy) {
+	case BTRFS_READ_POLICY_DEVICE:
+		preferred_mirror = btrfs_find_read_preferred(map, num_stripes);
+		if (preferred_mirror != -EINVAL) {
+			preferred_mirror = first + preferred_mirror;
+			break;
+		}
+		/* Reset the read_policy to the default */
+		fs_info->fs_devices->read_policy = BTRFS_READ_POLICY_PID;
+		btrfs_warn_rl(fs_info,
+		      "No read preferred device found, fallback to default");
+		/* Fall through if there isn't any read preferred device */
+	case BTRFS_READ_POLICY_PID:
+		preferred_mirror = first + current->pid % num_stripes;
+		break;
 	default:
 		/*
 		 * Shouldn't happen, just warn and use pid instead of failing.
 		 */
+		preferred_mirror = first + current->pid % num_stripes;
 		btrfs_warn_rl(fs_info,
 			      "unknown read_policy type %u, fallback to pid",
 			      fs_info->fs_devices->read_policy);
-	case BTRFS_READ_POLICY_PID:
-		preferred_mirror = first + current->pid % num_stripes;
 	}
 
 	if (dev_replace_is_ongoing &&
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 07962a0ce898..9c3c6ba7aad5 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -216,6 +216,7 @@ struct btrfs_device {
  */
 enum btrfs_read_policy {
 	BTRFS_READ_POLICY_PID,
+	BTRFS_READ_POLICY_DEVICE,
 	BTRFS_NR_READ_POLICY,
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device)
  2020-02-17 11:12 [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) Anand Jain
                   ` (4 preceding siblings ...)
  2020-02-17 11:12 ` [PATCH v5 5/5] btrfs: introduce new read_policy device Anand Jain
@ 2020-02-18 16:35 ` David Sterba
  2020-02-19 11:42   ` Anand Jain
  5 siblings, 1 reply; 11+ messages in thread
From: David Sterba @ 2020-02-18 16:35 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, josef, dsterba

On Mon, Feb 17, 2020 at 07:12:40PM +0800, Anand Jain wrote:
> v5:
> Worked on review comments as received in its previous version.
> Please refer to individual patches for the specific changes.
> Introduces the new read_policy 'device'.

The preparatory patches look overall good, there are some minor issues.

The last patch implementing the new policy needs explanation how it
works so users can decide if this is the right read policy for the use
case.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 2/5] btrfs: create read policy framework
  2020-02-17 11:12 ` [PATCH v5 2/5] btrfs: create read policy framework Anand Jain
@ 2020-02-19  7:41   ` kbuild test robot
  2020-02-19 11:40     ` Anand Jain
  0 siblings, 1 reply; 11+ messages in thread
From: kbuild test robot @ 2020-02-19  7:41 UTC (permalink / raw)
  To: Anand Jain; +Cc: kbuild-all, linux-btrfs, josef, dsterba

[-- Attachment #1: Type: text/plain, Size: 3982 bytes --]

Hi Anand,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.6-rc2]
[also build test WARNING on next-20200219]
[cannot apply to btrfs/next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Anand-Jain/readmirror-feature-sysfs-and-in-memory-only-approach-with-new-read_policy-device/20200219-113525
base:    11a48a5a18c63fd7621bb050228cebf13566e4d8
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from fs/btrfs/volumes.c:18:0:
   fs/btrfs/volumes.c: In function 'find_live_mirror':
>> fs/btrfs/ctree.h:3143:4: warning: this statement may fall through [-Wimplicit-fallthrough=]
    do {        \
       ^
>> fs/btrfs/ctree.h:3089:2: note: in expansion of macro 'btrfs_printk_ratelimited'
     btrfs_printk_ratelimited(fs_info, KERN_WARNING fmt, ##args)
     ^~~~~~~~~~~~~~~~~~~~~~~~
>> fs/btrfs/volumes.c:5367:3: note: in expansion of macro 'btrfs_warn_rl'
      btrfs_warn_rl(fs_info,
      ^~~~~~~~~~~~~
   fs/btrfs/volumes.c:5370:2: note: here
     case BTRFS_READ_POLICY_PID:
     ^~~~

vim +/btrfs_warn_rl +5367 fs/btrfs/volumes.c

  5343	
  5344	static int find_live_mirror(struct btrfs_fs_info *fs_info,
  5345				    struct map_lookup *map, int first,
  5346				    int dev_replace_is_ongoing)
  5347	{
  5348		int i;
  5349		int num_stripes;
  5350		int preferred_mirror;
  5351		int tolerance;
  5352		struct btrfs_device *srcdev;
  5353	
  5354		ASSERT((map->type &
  5355			 (BTRFS_BLOCK_GROUP_RAID1_MASK | BTRFS_BLOCK_GROUP_RAID10)));
  5356	
  5357		if (map->type & BTRFS_BLOCK_GROUP_RAID10)
  5358			num_stripes = map->sub_stripes;
  5359		else
  5360			num_stripes = map->num_stripes;
  5361	
  5362		switch (fs_info->fs_devices->read_policy) {
  5363		default:
  5364			/*
  5365			 * Shouldn't happen, just warn and use pid instead of failing.
  5366			 */
> 5367			btrfs_warn_rl(fs_info,
  5368				      "unknown read_policy type %u, fallback to pid",
  5369				      fs_info->fs_devices->read_policy);
  5370		case BTRFS_READ_POLICY_PID:
  5371			preferred_mirror = first + current->pid % num_stripes;
  5372		}
  5373	
  5374		if (dev_replace_is_ongoing &&
  5375		    fs_info->dev_replace.cont_reading_from_srcdev_mode ==
  5376		     BTRFS_DEV_REPLACE_ITEM_CONT_READING_FROM_SRCDEV_MODE_AVOID)
  5377			srcdev = fs_info->dev_replace.srcdev;
  5378		else
  5379			srcdev = NULL;
  5380	
  5381		/*
  5382		 * try to avoid the drive that is the source drive for a
  5383		 * dev-replace procedure, only choose it if no other non-missing
  5384		 * mirror is available
  5385		 */
  5386		for (tolerance = 0; tolerance < 2; tolerance++) {
  5387			if (map->stripes[preferred_mirror].dev->bdev &&
  5388			    (tolerance || map->stripes[preferred_mirror].dev != srcdev))
  5389				return preferred_mirror;
  5390			for (i = first; i < first + num_stripes; i++) {
  5391				if (map->stripes[i].dev->bdev &&
  5392				    (tolerance || map->stripes[i].dev != srcdev))
  5393					return i;
  5394			}
  5395		}
  5396	
  5397		/* we couldn't find one that doesn't fail.  Just return something
  5398		 * and the io error handling code will clean up eventually
  5399		 */
  5400		return preferred_mirror;
  5401	}
  5402	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25773 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 2/5] btrfs: create read policy framework
  2020-02-19  7:41   ` kbuild test robot
@ 2020-02-19 11:40     ` Anand Jain
  2020-02-19 12:35       ` David Sterba
  0 siblings, 1 reply; 11+ messages in thread
From: Anand Jain @ 2020-02-19 11:40 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, linux-btrfs, josef, dsterba


>     In file included from fs/btrfs/volumes.c:18:0:
>     fs/btrfs/volumes.c: In function 'find_live_mirror':

>>> fs/btrfs/ctree.h:3143:4: warning: this statement may fall through [-Wimplicit-fallthrough=]

  This is not a bug. May be we can make the robot happy.

Thanks, Anand

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device)
  2020-02-18 16:35 ` [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) David Sterba
@ 2020-02-19 11:42   ` Anand Jain
  0 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2020-02-19 11:42 UTC (permalink / raw)
  To: dsterba, linux-btrfs, josef, dsterba



> The last patch implementing the new policy needs explanation how it
> works so users can decide if this is the right read policy for the use
> case.

V6 adds more details.

HTH.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 2/5] btrfs: create read policy framework
  2020-02-19 11:40     ` Anand Jain
@ 2020-02-19 12:35       ` David Sterba
  0 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2020-02-19 12:35 UTC (permalink / raw)
  To: Anand Jain; +Cc: kbuild test robot, kbuild-all, linux-btrfs, josef, dsterba

On Wed, Feb 19, 2020 at 07:40:47PM +0800, Anand Jain wrote:
> 
> >     In file included from fs/btrfs/volumes.c:18:0:
> >     fs/btrfs/volumes.c: In function 'find_live_mirror':
> 
> >>> fs/btrfs/ctree.h:3143:4: warning: this statement may fall through [-Wimplicit-fallthrough=]
> 
>   This is not a bug. May be we can make the robot happy.

Since 5.5 all fall-trhough switch labels should be annotated properly.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-02-19 12:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-17 11:12 [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) Anand Jain
2020-02-17 11:12 ` [PATCH v5 1/5] btrfs: add btrfs_strmatch helper Anand Jain
2020-02-17 11:12 ` [PATCH v5 2/5] btrfs: create read policy framework Anand Jain
2020-02-19  7:41   ` kbuild test robot
2020-02-19 11:40     ` Anand Jain
2020-02-19 12:35       ` David Sterba
2020-02-17 11:12 ` [PATCH v5 3/5] btrfs: create read policy sysfs attribute, pid Anand Jain
2020-02-17 11:12 ` [PATCH v5 4/5] btrfs: introduce new device-state read_preferred Anand Jain
2020-02-17 11:12 ` [PATCH v5 5/5] btrfs: introduce new read_policy device Anand Jain
2020-02-18 16:35 ` [PATCH v5 0/5] readmirror feature (sysfs and in-memory only approach; with new read_policy device) David Sterba
2020-02-19 11:42   ` Anand Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).