linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach)
@ 2020-10-22  7:43 Anand Jain
  2020-10-22  7:43 ` [PATCH v9 1/3] btrfs: add btrfs_strmatch helper Anand Jain
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-22  7:43 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

v9: C coding style fixes in 1/3 and 3/3

v8:
Separate the sysfs framework and the %pid read_policy into a separate
patchset here, so that the new read policies can be in its own patch set.

A latency based read_policy is being prepared to send it in a separate
patchset as it depends on a few changes in the block layer as well.

__ Original email: __

v7:
Fix switch's fall through warning. Changle logs updates where necessary.

v6:
Patch 4/5 - If there is no change in device's read prefer then don't log
Patch 4/5 - Add pid to the logs
Patch 5/5 - If there isn't read preferred device in the chunk don't reset
read policy to default, instead just use stripe 0. As this is in
the read path it avoids going through the device list to find
read preferred device. So inline to this drop to check if there
is read preferred device before setting read policy to device.

v5:
Worked on review comments as received in its previous version.
Please refer to individual patches for the specific changes.
Introduces the new read_policy 'device'.

v4:
Rename readmirror attribute to read_policy. Drop separate kobj for
readmirror instead create read_policy attribute in fsid kobj.
merge v2:2/3 and v2:3/3 into v4:2/2. Patch titles have changed.
 
v3:
v2:
Mainly fixes the fs_devices::readmirror declaration type from atomic_t
to u8. (Thanks Josef).

v1:
As of now we use only %pid method to read stripped mirrored data. So
application's process id determines the stripe id to be read. This type
of routing typically helps in a system with many small independent
applications tying to read random data. On the other hand the %pid
based read IO distribution policy is inefficient if there is a single
application trying to read large data as because the overall disk
bandwidth would remains under utilized.

One type of readmirror policy isn't good enough and other choices are
routing the IO based on device's waitqueue or manual when we have a
read-preferred device or a readmirror policy based on the target storage
caching. So this patch-set introduces a framework where we could add more
readmirror policies.

This policy is a filesystem wide policy as of now, and though the
readmirror policy at the subvolume level is a novel approach as it
provides maximum flexibility in the data center, but as of now its not
practical to implement such a granularity as you can't really ensure
reflinked extents will be read from the stripe of its desire and so
there will be more limitations and it can be assessed separately.

The approach in this patch-set is sys interface with in-memory policy.
And does not add any new readmirror type in this set, which can be add
once we are ok with the framework. Also the default policy remains %pid.

Previous works:
----------------------------------------------------------------------
There were few RFCs [1] before, mainly to figure out storage
(or in memory only) for the readmirror policy and the interface needed.

[1]
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg86368.html

https://lore.kernel.org/linux-btrfs/20190826090438.7044-1-anand.jain@oracle.com/

https://lore.kernel.org/linux-btrfs/5fcf9c23-89b5-b167-1f80-a0f4ac107d0b@oracle.com/

https://patchwork.kernel.org/cover/10859213/

Mount -o:
In the first trial it was attempted to use the mount -o option to carry
the readmirror policy, this is good for debugging which can make sure
even the mount thread metadata tree blocks are read from the disk desired.
It was very effective in testing radi1/raid10 write-holes.

Extended attribute:
As extended attribute is associated with the inode, to implement this
there is bit of extended attribute abuse or else makes it mandatory to
mount the rootid 5. Its messy unless readmirror policy is applied at the
subvol level which is not possible as of now. 

An item type:
The proposed patch was to create an item to hold the readmirror policy,
it makes sense when compared to the abusive extended attribute approach
but introduces a new item and so no backward compatibility.
-----------------------------------------------------------------------

Anand Jain (3):
  btrfs: add btrfs_strmatch helper
  btrfs: create read policy framework
  btrfs: create read policy sysfs attribute, pid

 fs/btrfs/sysfs.c   | 69 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.c | 15 +++++++++-
 fs/btrfs/volumes.h | 14 ++++++++++
 3 files changed, 97 insertions(+), 1 deletion(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 1/3] btrfs: add btrfs_strmatch helper
  2020-10-22  7:43 [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
@ 2020-10-22  7:43 ` Anand Jain
  2020-10-23 11:12   ` Christoph Hellwig
  2020-10-26 17:52   ` David Sterba
  2020-10-22  7:43 ` [PATCH v9 2/3] btrfs: create read policy framework Anand Jain
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-22  7:43 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

Add a generic helper to match the golden-string in the given-string,
and ignore the leading and trailing whitespaces if any.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Suggested-by: David Sterba <dsterba@suse.com>
---
v9: use Josef suggested C coding style, using single if statement.
v5: born

 fs/btrfs/sysfs.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 8424f5d0e5ed..5ea262d289c6 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -863,6 +863,26 @@ static ssize_t btrfs_generation_show(struct kobject *kobj,
 }
 BTRFS_ATTR(, generation, btrfs_generation_show);
 
+/*
+ * Match the %golden in the %given. Ignore the leading and trailing whitespaces
+ * if any.
+ */
+static int btrfs_strmatch(const char *given, const char *golden)
+{
+	size_t len = strlen(golden);
+	char *stripped;
+
+	/* strip leading whitespace */
+	stripped = skip_spaces(given);
+
+	/* strip trailing whitespace */
+	if ((strncmp(stripped, golden, len) == 0) &&
+	    (strlen(skip_spaces(stripped + len)) == 0))
+		return 0;
+
+	return -EINVAL;
+}
+
 static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, label),
 	BTRFS_ATTR_PTR(, nodesize),
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v9 2/3] btrfs: create read policy framework
  2020-10-22  7:43 [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
  2020-10-22  7:43 ` [PATCH v9 1/3] btrfs: add btrfs_strmatch helper Anand Jain
@ 2020-10-22  7:43 ` Anand Jain
  2020-10-22  7:43 ` [PATCH v9 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-22  7:43 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

As of now, we use the %pid method to read striped mirrored data, which means
process id determines the stripe id to read. This type of routing
typically helps in a system with many small independent processes tying
to read random data. On the other hand, the %pid based read IO policy is
inefficient because if there is a single process trying to read a large
file, the overall disk bandwidth remains under-utilized.

So this patch introduces a read policy framework so that we could add more
read policies, such as IO routing based on the device's wait-queue or manual
when we have a read-preferred device or a policy based on the target
storage caching.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v9: -
v8: use fallthrough;
v7: Fix missing /* fall through */ in the switch
    Removed Reviewed-by: Josef Bacik <josef@toxicpanda.com>
v6:-
v5: Title renamed from:- btrfs: add read_policy framework
    Change log updated.
    Unnecessary comment dropped, added more where necessary.
    Optimize code in the switch remove duplicate code.
    Define BTRFS_READ_POLICY_DEFAULT dropped.
    Rename enum btrfs_read_policy_type to enum btrfs_read_policy.
    Rename BTRFS_READ_BY_PID to BTRFS_READ_POLICY_PID.
    (As its mainly renames. Reviewed-by retained).
v4: -
v3: Declare fs_devices::readmirror as enum btrfs_readmirror_policy_type
v2: Declare fs_devices::readmirror as u8 instead of atomic_t 
    A small change in comment and change log wordings.

 fs/btrfs/volumes.c | 15 ++++++++++++++-
 fs/btrfs/volumes.h | 14 ++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1991bc5a6f59..cb343ac47f29 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1223,6 +1223,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->latest_bdev = latest_dev->bdev;
 	fs_devices->total_rw_bytes = 0;
 	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
+	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
 
 	return 0;
 }
@@ -5485,7 +5486,19 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 	else
 		num_stripes = map->num_stripes;
 
-	preferred_mirror = first + current->pid % num_stripes;
+	switch (fs_info->fs_devices->read_policy) {
+	default:
+		/*
+		 * Shouldn't happen, just warn and use pid instead of failing.
+		 */
+		btrfs_warn_rl(fs_info,
+			      "unknown read_policy type %u, fallback to pid",
+			      fs_info->fs_devices->read_policy);
+		fallthrough;
+	case BTRFS_READ_POLICY_PID:
+		preferred_mirror = first + current->pid % num_stripes;
+		break;
+	}
 
 	if (dev_replace_is_ongoing &&
 	    fs_info->dev_replace.cont_reading_from_srcdev_mode ==
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index f2177263748e..ebeb8118e578 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -212,6 +212,15 @@ enum btrfs_chunk_allocation_policy {
 	BTRFS_CHUNK_ALLOC_REGULAR,
 };
 
+/*
+ * Read policies for the mirrored block groups, read picks the stripe based
+ * on these policies.
+ */
+enum btrfs_read_policy {
+	BTRFS_READ_POLICY_PID,
+	BTRFS_NR_READ_POLICY,
+};
+
 struct btrfs_fs_devices {
 	u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
@@ -265,6 +274,11 @@ struct btrfs_fs_devices {
 	struct completion kobj_unregister;
 
 	enum btrfs_chunk_allocation_policy chunk_alloc_policy;
+
+	/*
+	 * policy used to read the mirrored stripes
+	 */
+	enum btrfs_read_policy read_policy;
 };
 
 #define BTRFS_BIO_INLINE_CSUM_SIZE	64
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v9 3/3] btrfs: create read policy sysfs attribute, pid
  2020-10-22  7:43 [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
  2020-10-22  7:43 ` [PATCH v9 1/3] btrfs: add btrfs_strmatch helper Anand Jain
  2020-10-22  7:43 ` [PATCH v9 2/3] btrfs: create read policy framework Anand Jain
@ 2020-10-22  7:43 ` Anand Jain
  2020-10-26 17:57   ` David Sterba
  2020-10-23 17:04 ` [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) David Sterba
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Anand Jain @ 2020-10-22  7:43 UTC (permalink / raw)
  To: linux-btrfs; +Cc: josef, dsterba

Add

 /sys/fs/btrfs/UUID/read_policy

attribute so that the read policy for the raid1, raid1c34 and raid10 can
be tuned.

When this attribute is read, it shall show all available policies, with
active policy being with in [ ]. The read_policy attribute can be written
using one of the items listed in there.

For example:
  $cat /sys/fs/btrfs/UUID/read_policy
  [pid]
  $echo pid > /sys/fs/btrfs/UUID/read_policy

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v9: fix C coding style, static const char*
v5: 
  Title rename: old: btrfs: sysfs, add read_policy attribute
  Uses the btrfs_strmatch() helper (BTRFS_READ_POLICY_NAME_MAX dropped).
  Use the table for the policy names.
  Rename len to ret.
  Use a simple logic to prefix space in btrfs_read_policy_show()
  Reviewed-by: Josef Bacik <josef@toxicpanda.com> dropped.

v4:-
v3: rename [by_pid] to [pid]
v2: v2: check input len before strip and kstrdup

 fs/btrfs/sysfs.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 5ea262d289c6..e23ae3643527 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -883,6 +883,54 @@ static int btrfs_strmatch(const char *given, const char *golden)
 	return -EINVAL;
 }
 
+static const char * const btrfs_read_policy_name[] = { "pid" };
+
+static ssize_t btrfs_read_policy_show(struct kobject *kobj,
+				      struct kobj_attribute *a, char *buf)
+{
+	int i;
+	ssize_t ret = 0;
+	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+		if (fs_devices->read_policy == i)
+			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s[%s]",
+					(ret == 0 ? "" : " "),
+					btrfs_read_policy_name[i]);
+		else
+			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
+					(ret == 0 ? "" : " "),
+					btrfs_read_policy_name[i]);
+	}
+
+	ret += snprintf(buf + ret, PAGE_SIZE - ret, "\n");
+
+	return ret;
+}
+
+static ssize_t btrfs_read_policy_store(struct kobject *kobj,
+				       struct kobj_attribute *a,
+				       const char *buf, size_t len)
+{
+	int i;
+	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+		if (btrfs_strmatch(buf, btrfs_read_policy_name[i]) == 0) {
+			if (i != fs_devices->read_policy) {
+				fs_devices->read_policy = i;
+				btrfs_info(fs_devices->fs_info,
+					   "read policy set to '%s'",
+					   btrfs_read_policy_name[i]);
+			}
+			return len;
+		}
+	}
+
+	return -EINVAL;
+}
+BTRFS_ATTR_RW(, read_policy, btrfs_read_policy_show, btrfs_read_policy_store);
+
 static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, label),
 	BTRFS_ATTR_PTR(, nodesize),
@@ -893,6 +941,7 @@ static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, checksum),
 	BTRFS_ATTR_PTR(, exclusive_operation),
 	BTRFS_ATTR_PTR(, generation),
+	BTRFS_ATTR_PTR(, read_policy),
 	NULL,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 1/3] btrfs: add btrfs_strmatch helper
  2020-10-22  7:43 ` [PATCH v9 1/3] btrfs: add btrfs_strmatch helper Anand Jain
@ 2020-10-23 11:12   ` Christoph Hellwig
  2020-10-27 13:25     ` Anand Jain
  2020-10-26 17:52   ` David Sterba
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2020-10-23 11:12 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, josef, dsterba

> +	if ((strncmp(stripped, golden, len) == 0) &&
> +	    (strlen(skip_spaces(stripped + len)) == 0))

No need for the inner braces.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach)
  2020-10-22  7:43 [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
                   ` (2 preceding siblings ...)
  2020-10-22  7:43 ` [PATCH v9 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
@ 2020-10-23 17:04 ` David Sterba
  2020-10-27  1:52   ` Anand Jain
  2020-10-26 14:01 ` Josef Bacik
  2020-10-28  4:25 ` [PATCH v10 " Anand Jain
  5 siblings, 1 reply; 20+ messages in thread
From: David Sterba @ 2020-10-23 17:04 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, josef, dsterba

On Thu, Oct 22, 2020 at 03:43:34PM +0800, Anand Jain wrote:
> v9: C coding style fixes in 1/3 and 3/3

So the point of adding the sysfs knobs is to allow testing various
mirror selection strategies, what exactly was discussed in the past. Do
you have patches for that as well? It does not need to be final and
polished but at least give us something to test.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach)
  2020-10-22  7:43 [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
                   ` (3 preceding siblings ...)
  2020-10-23 17:04 ` [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) David Sterba
@ 2020-10-26 14:01 ` Josef Bacik
  2020-10-28  4:25 ` [PATCH v10 " Anand Jain
  5 siblings, 0 replies; 20+ messages in thread
From: Josef Bacik @ 2020-10-26 14:01 UTC (permalink / raw)
  To: Anand Jain, linux-btrfs; +Cc: dsterba

On 10/22/20 3:43 AM, Anand Jain wrote:
> v9: C coding style fixes in 1/3 and 3/3
> 

You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

to this series, lets get this in so we can focus on the actual changes to the 
policy.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 1/3] btrfs: add btrfs_strmatch helper
  2020-10-22  7:43 ` [PATCH v9 1/3] btrfs: add btrfs_strmatch helper Anand Jain
  2020-10-23 11:12   ` Christoph Hellwig
@ 2020-10-26 17:52   ` David Sterba
  2020-10-27 13:19     ` Anand Jain
  1 sibling, 1 reply; 20+ messages in thread
From: David Sterba @ 2020-10-26 17:52 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, josef, dsterba

On Thu, Oct 22, 2020 at 03:43:35PM +0800, Anand Jain wrote:
> Add a generic helper to match the golden-string in the given-string,
> and ignore the leading and trailing whitespaces if any.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> Suggested-by: David Sterba <dsterba@suse.com>
> ---
> v9: use Josef suggested C coding style, using single if statement.
> v5: born
> 
>  fs/btrfs/sysfs.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index 8424f5d0e5ed..5ea262d289c6 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -863,6 +863,26 @@ static ssize_t btrfs_generation_show(struct kobject *kobj,
>  }
>  BTRFS_ATTR(, generation, btrfs_generation_show);
>  
> +/*
> + * Match the %golden in the %given. Ignore the leading and trailing whitespaces
> + * if any.
> + */
> +static int btrfs_strmatch(const char *given, const char *golden)
> +{
> +	size_t len = strlen(golden);
> +	char *stripped;
> +
> +	/* strip leading whitespace */

This is confusing as it's not stripping the space but merely skipping
it.  The arguments are not changed so you also don't need the separate
variable and just update 'given'.

> +	stripped = skip_spaces(given);
> +
> +	/* strip trailing whitespace */
> +	if ((strncmp(stripped, golden, len) == 0) &&
> +	    (strlen(skip_spaces(stripped + len)) == 0))
> +		return 0;

This a bit hard to read but ok, essentially we can do the string
comparison in a loop or use the library functions.

> +
> +	return -EINVAL;

This does not make sense as it's an error code while the function is a
predicate, without error states.

> +}
> +
>  static const struct attribute *btrfs_attrs[] = {
>  	BTRFS_ATTR_PTR(, label),
>  	BTRFS_ATTR_PTR(, nodesize),
> -- 
> 2.25.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 3/3] btrfs: create read policy sysfs attribute, pid
  2020-10-22  7:43 ` [PATCH v9 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
@ 2020-10-26 17:57   ` David Sterba
  2020-10-28 12:37     ` Anand Jain
  0 siblings, 1 reply; 20+ messages in thread
From: David Sterba @ 2020-10-26 17:57 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, josef, dsterba

On Thu, Oct 22, 2020 at 03:43:37PM +0800, Anand Jain wrote:
> Add
> 
>  /sys/fs/btrfs/UUID/read_policy
> 
> attribute so that the read policy for the raid1, raid1c34 and raid10 can
> be tuned.
> 
> When this attribute is read, it shall show all available policies, with
> active policy being with in [ ]. The read_policy attribute can be written
> using one of the items listed in there.
> 
> For example:
>   $cat /sys/fs/btrfs/UUID/read_policy
>   [pid]
>   $echo pid > /sys/fs/btrfs/UUID/read_policy
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> v9: fix C coding style, static const char*
> v5: 
>   Title rename: old: btrfs: sysfs, add read_policy attribute
>   Uses the btrfs_strmatch() helper (BTRFS_READ_POLICY_NAME_MAX dropped).
>   Use the table for the policy names.
>   Rename len to ret.
>   Use a simple logic to prefix space in btrfs_read_policy_show()
>   Reviewed-by: Josef Bacik <josef@toxicpanda.com> dropped.
> 
> v4:-
> v3: rename [by_pid] to [pid]
> v2: v2: check input len before strip and kstrdup
> 
>  fs/btrfs/sysfs.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index 5ea262d289c6..e23ae3643527 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -883,6 +883,54 @@ static int btrfs_strmatch(const char *given, const char *golden)
>  	return -EINVAL;
>  }
>  
> +static const char * const btrfs_read_policy_name[] = { "pid" };
> +
> +static ssize_t btrfs_read_policy_show(struct kobject *kobj,
> +				      struct kobj_attribute *a, char *buf)
> +{
> +	int i;
> +	ssize_t ret = 0;
> +	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
> +
> +	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
> +		if (fs_devices->read_policy == i)
> +			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s[%s]",

All snprintf have been upgraded to scnprintf as it does have sane
behaviour when the buffer is not large enough.

> +					(ret == 0 ? "" : " "),
> +					btrfs_read_policy_name[i]);
> +		else
> +			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
> +					(ret == 0 ? "" : " "),
> +					btrfs_read_policy_name[i]);
> +	}
> +
> +	ret += snprintf(buf + ret, PAGE_SIZE - ret, "\n");
> +
> +	return ret;
> +}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach)
  2020-10-23 17:04 ` [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) David Sterba
@ 2020-10-27  1:52   ` Anand Jain
  2020-10-27 18:02     ` David Sterba
  0 siblings, 1 reply; 20+ messages in thread
From: Anand Jain @ 2020-10-27  1:52 UTC (permalink / raw)
  To: dsterba, linux-btrfs, josef, dsterba



On 24/10/20 1:04 am, David Sterba wrote:
> On Thu, Oct 22, 2020 at 03:43:34PM +0800, Anand Jain wrote:
>> v9: C coding style fixes in 1/3 and 3/3
> 
> So the point of adding the sysfs knobs is to allow testing various
> mirror selection strategies, what exactly was discussed in the past. Do
> you have patches for that as well? It does not need to be final and
> polished but at least give us something to test.
> 

Sure. I just sent out the patchset [1]. It provides read_policy: 
latency, device, and round-robin.

[1]
https://patchwork.kernel.org/project/linux-btrfs/list/?series=371023

Thanks, Anand

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 1/3] btrfs: add btrfs_strmatch helper
  2020-10-26 17:52   ` David Sterba
@ 2020-10-27 13:19     ` Anand Jain
  2020-10-27 18:00       ` David Sterba
  0 siblings, 1 reply; 20+ messages in thread
From: Anand Jain @ 2020-10-27 13:19 UTC (permalink / raw)
  To: dsterba, linux-btrfs, josef, dsterba


>> +static int btrfs_strmatch(const char *given, const char *golden)
>> +{
>> +	size_t len = strlen(golden);
>> +	char *stripped;
>> +
>> +	/* strip leading whitespace */
> 
> This is confusing as it's not stripping the space but merely skipping
> it.  The arguments are not changed so you also don't need the separate
> variable and just update 'given'.

  ok let me update it.

> 
>> +	stripped = skip_spaces(given);
>> +
>> +	/* strip trailing whitespace */
>> +	if ((strncmp(stripped, golden, len) == 0) &&
>> +	    (strlen(skip_spaces(stripped + len)) == 0))
>> +		return 0;
> 
> This a bit hard to read but ok, essentially we can do the string
> comparison in a loop or use the library functions.
>

>> +
>> +	return -EINVAL;
> 
> This does not make sense as it's an error code while the function is a
> predicate, without error states.
> 

  A non zero value return will be some arbitrary number, is that OK?
  Or the return arg can be bool instead of int.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 1/3] btrfs: add btrfs_strmatch helper
  2020-10-23 11:12   ` Christoph Hellwig
@ 2020-10-27 13:25     ` Anand Jain
  0 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-27 13:25 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-btrfs, josef, dsterba



On 23/10/20 7:12 pm, Christoph Hellwig wrote:
>> +	if ((strncmp(stripped, golden, len) == 0) &&
>> +	    (strlen(skip_spaces(stripped + len)) == 0))
> 
> No need for the inner braces.
> 

fixed in next version. Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 1/3] btrfs: add btrfs_strmatch helper
  2020-10-27 13:19     ` Anand Jain
@ 2020-10-27 18:00       ` David Sterba
  0 siblings, 0 replies; 20+ messages in thread
From: David Sterba @ 2020-10-27 18:00 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs, josef, dsterba

On Tue, Oct 27, 2020 at 09:19:33PM +0800, Anand Jain wrote:
> >> +	stripped = skip_spaces(given);
> >> +
> >> +	/* strip trailing whitespace */
> >> +	if ((strncmp(stripped, golden, len) == 0) &&
> >> +	    (strlen(skip_spaces(stripped + len)) == 0))
> >> +		return 0;
> > 
> > This a bit hard to read but ok, essentially we can do the string
> > comparison in a loop or use the library functions.
> >
> 
> >> +
> >> +	return -EINVAL;
> > 
> > This does not make sense as it's an error code while the function is a
> > predicate, without error states.
> > 
> 
>   A non zero value return will be some arbitrary number, is that OK?
>   Or the return arg can be bool instead of int.

Bool is fine in this case.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach)
  2020-10-27  1:52   ` Anand Jain
@ 2020-10-27 18:02     ` David Sterba
  2020-10-28 12:06       ` Anand Jain
  0 siblings, 1 reply; 20+ messages in thread
From: David Sterba @ 2020-10-27 18:02 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs, josef, dsterba

On Tue, Oct 27, 2020 at 09:52:11AM +0800, Anand Jain wrote:
> 
> 
> On 24/10/20 1:04 am, David Sterba wrote:
> > On Thu, Oct 22, 2020 at 03:43:34PM +0800, Anand Jain wrote:
> >> v9: C coding style fixes in 1/3 and 3/3
> > 
> > So the point of adding the sysfs knobs is to allow testing various
> > mirror selection strategies, what exactly was discussed in the past. Do
> > you have patches for that as well? It does not need to be final and
> > polished but at least give us something to test.
> > 
> 
> Sure. I just sent out the patchset [1]. It provides read_policy: 
> latency, device, and round-robin.
> 
> [1]
> https://patchwork.kernel.org/project/linux-btrfs/list/?series=371023

Exporting more information from the devices would help us to decide but
is there anything we can do just with what we have? Or eventually add
our counters like for the in-flight requests or total bytes. The
intention is not to duplicate what block layer does as we need to
experiment with the stats and heuristics it's just for that purpose and
we don't have to rely on other subsystem patches.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v10 0/3] readmirror feature (read_policy sysfs and in-memory only approach)
  2020-10-22  7:43 [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
                   ` (4 preceding siblings ...)
  2020-10-26 14:01 ` Josef Bacik
@ 2020-10-28  4:25 ` Anand Jain
  2020-10-28  4:25   ` [PATCH v10 1/3] btrfs: add btrfs_strmatch helper Anand Jain
                     ` (2 more replies)
  5 siblings, 3 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-28  4:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba

Based on mainline v5.10-rc1
On misc-next there will be a conflict due to addition of metadata_uuid
to the sysfs, which is easy to resolve.

v10: patch1/3 (fix btrfs_strmatch) and patch 3/3 (use scnprintf instead of snprintf)
     and add rb.

v9: C coding style fixes in 1/3 and 3/3

v8:
Separate the sysfs framework and the %pid read_policy into a separate
patchset here, so that the new read policies can be in its own patch set.

A latency based read_policy is being prepared to send it in a separate
patchset as it depends on a few changes in the block layer as well.

__ Original email: __

v7:
Fix switch's fall through warning. Changle logs updates where necessary.

v6:
Patch 4/5 - If there is no change in device's read prefer then don't log
Patch 4/5 - Add pid to the logs
Patch 5/5 - If there isn't read preferred device in the chunk don't reset
read policy to default, instead just use stripe 0. As this is in
the read path it avoids going through the device list to find
read preferred device. So inline to this drop to check if there
is read preferred device before setting read policy to device.

v5:
Worked on review comments as received in its previous version.
Please refer to individual patches for the specific changes.
Introduces the new read_policy 'device'.

v4:
Rename readmirror attribute to read_policy. Drop separate kobj for
readmirror instead create read_policy attribute in fsid kobj.
merge v2:2/3 and v2:3/3 into v4:2/2. Patch titles have changed.
 
v3:
v2:
Mainly fixes the fs_devices::readmirror declaration type from atomic_t
to u8. (Thanks Josef).

v1:
As of now we use only %pid method to read stripped mirrored data. So
application's process id determines the stripe id to be read. This type
of routing typically helps in a system with many small independent
applications tying to read random data. On the other hand the %pid
based read IO distribution policy is inefficient if there is a single
application trying to read large data as because the overall disk
bandwidth would remains under utilized.

One type of readmirror policy isn't good enough and other choices are
routing the IO based on device's waitqueue or manual when we have a
read-preferred device or a readmirror policy based on the target storage
caching. So this patch-set introduces a framework where we could add more
readmirror policies.

This policy is a filesystem wide policy as of now, and though the
readmirror policy at the subvolume level is a novel approach as it
provides maximum flexibility in the data center, but as of now its not
practical to implement such a granularity as you can't really ensure
reflinked extents will be read from the stripe of its desire and so
there will be more limitations and it can be assessed separately.

The approach in this patch-set is sys interface with in-memory policy.
And does not add any new readmirror type in this set, which can be add
once we are ok with the framework. Also the default policy remains %pid.

Previous works:
----------------------------------------------------------------------
There were few RFCs [1] before, mainly to figure out storage
(or in memory only) for the readmirror policy and the interface needed.

[1]
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg86368.html

https://lore.kernel.org/linux-btrfs/20190826090438.7044-1-anand.jain@oracle.com/

https://lore.kernel.org/linux-btrfs/5fcf9c23-89b5-b167-1f80-a0f4ac107d0b@oracle.com/

https://patchwork.kernel.org/cover/10859213/

Mount -o:
In the first trial it was attempted to use the mount -o option to carry
the readmirror policy, this is good for debugging which can make sure
even the mount thread metadata tree blocks are read from the disk desired.
It was very effective in testing radi1/raid10 write-holes.

Extended attribute:
As extended attribute is associated with the inode, to implement this
there is bit of extended attribute abuse or else makes it mandatory to
mount the rootid 5. Its messy unless readmirror policy is applied at the
subvol level which is not possible as of now. 

An item type:
The proposed patch was to create an item to hold the readmirror policy,
it makes sense when compared to the abusive extended attribute approach
but introduces a new item and so no backward compatibility.
-----------------------------------------------------------------------

Anand Jain (3):
  btrfs: add btrfs_strmatch helper
  btrfs: create read policy framework
  btrfs: create read policy sysfs attribute, pid

 fs/btrfs/sysfs.c   | 67 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.c | 15 ++++++++++-
 fs/btrfs/volumes.h | 14 ++++++++++
 3 files changed, 95 insertions(+), 1 deletion(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v10 1/3] btrfs: add btrfs_strmatch helper
  2020-10-28  4:25 ` [PATCH v10 " Anand Jain
@ 2020-10-28  4:25   ` Anand Jain
  2020-10-28  4:25   ` [PATCH v10 2/3] btrfs: create read policy framework Anand Jain
  2020-10-28  4:25   ` [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
  2 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-28  4:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, David Sterba, Josef Bacik

Add a generic helper to match the golden-string in the given-string,
and ignore the leading and trailing whitespaces if any.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Suggested-by: David Sterba <dsterba@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v10: return bool instead of int.
     drop unnecessary local variable and ( ) with in if.
v9: use Josef suggested C coding style, using single if statement.
v5: born
 fs/btrfs/sysfs.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 279d9262b676..2e114f5181e8 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -854,6 +854,24 @@ static ssize_t btrfs_exclusive_operation_show(struct kobject *kobj,
 }
 BTRFS_ATTR(, exclusive_operation, btrfs_exclusive_operation_show);
 
+/*
+ * Match the %golden in the %given. Ignore the leading and trailing whitespaces
+ * if any.
+ */
+static bool btrfs_strmatch(const char *given, const char *golden)
+{
+	size_t len = strlen(golden);
+
+	/* skip leading whitespace */
+	given = skip_spaces(given);
+
+	if (strncmp(given, golden, len) == 0 &&
+	    !strlen(skip_spaces(given + len)))
+		return true;
+
+	return false;
+}
+
 static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, label),
 	BTRFS_ATTR_PTR(, nodesize),
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 2/3] btrfs: create read policy framework
  2020-10-28  4:25 ` [PATCH v10 " Anand Jain
  2020-10-28  4:25   ` [PATCH v10 1/3] btrfs: add btrfs_strmatch helper Anand Jain
@ 2020-10-28  4:25   ` Anand Jain
  2020-10-28  4:25   ` [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
  2 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-28  4:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, Josef Bacik

As of now, we use the %pid method to read striped mirrored data, which means
process id determines the stripe id to read. This type of routing
typically helps in a system with many small independent processes tying
to read random data. On the other hand, the %pid based read IO policy is
inefficient because if there is a single process trying to read a large
file, the overall disk bandwidth remains under-utilized.

So this patch introduces a read policy framework so that we could add more
read policies, such as IO routing based on the device's wait-queue or manual
when we have a read-preferred device or a policy based on the target
storage caching.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v10: -
v9: -
v8: use fallthrough;
v7: Fix missing /* fall through */ in the switch
    Removed Reviewed-by: Josef Bacik <josef@toxicpanda.com>
v6:-
v5: Title renamed from:- btrfs: add read_policy framework
    Change log updated.
    Unnecessary comment dropped, added more where necessary.
    Optimize code in the switch remove duplicate code.
    Define BTRFS_READ_POLICY_DEFAULT dropped.
    Rename enum btrfs_read_policy_type to enum btrfs_read_policy.
    Rename BTRFS_READ_BY_PID to BTRFS_READ_POLICY_PID.
    (As its mainly renames. Reviewed-by retained).
v4: -
v3: Declare fs_devices::readmirror as enum btrfs_readmirror_policy_type
v2: Declare fs_devices::readmirror as u8 instead of atomic_t 
    A small change in comment and change log wordings.

 fs/btrfs/volumes.c | 15 ++++++++++++++-
 fs/btrfs/volumes.h | 14 ++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 58b9c419a2b6..da31b11ceb61 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1223,6 +1223,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->latest_bdev = latest_dev->bdev;
 	fs_devices->total_rw_bytes = 0;
 	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
+	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
 
 	return 0;
 }
@@ -5482,7 +5483,19 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 	else
 		num_stripes = map->num_stripes;
 
-	preferred_mirror = first + current->pid % num_stripes;
+	switch (fs_info->fs_devices->read_policy) {
+	default:
+		/*
+		 * Shouldn't happen, just warn and use pid instead of failing.
+		 */
+		btrfs_warn_rl(fs_info,
+			      "unknown read_policy type %u, fallback to pid",
+			      fs_info->fs_devices->read_policy);
+		fallthrough;
+	case BTRFS_READ_POLICY_PID:
+		preferred_mirror = first + current->pid % num_stripes;
+		break;
+	}
 
 	if (dev_replace_is_ongoing &&
 	    fs_info->dev_replace.cont_reading_from_srcdev_mode ==
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index bf27ac07d315..e3c36951742d 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -211,6 +211,15 @@ enum btrfs_chunk_allocation_policy {
 	BTRFS_CHUNK_ALLOC_REGULAR,
 };
 
+/*
+ * Read policies for the mirrored block groups, read picks the stripe based
+ * on these policies.
+ */
+enum btrfs_read_policy {
+	BTRFS_READ_POLICY_PID,
+	BTRFS_NR_READ_POLICY,
+};
+
 struct btrfs_fs_devices {
 	u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
@@ -264,6 +273,11 @@ struct btrfs_fs_devices {
 	struct completion kobj_unregister;
 
 	enum btrfs_chunk_allocation_policy chunk_alloc_policy;
+
+	/*
+	 * policy used to read the mirrored stripes
+	 */
+	enum btrfs_read_policy read_policy;
 };
 
 #define BTRFS_BIO_INLINE_CSUM_SIZE	64
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid
  2020-10-28  4:25 ` [PATCH v10 " Anand Jain
  2020-10-28  4:25   ` [PATCH v10 1/3] btrfs: add btrfs_strmatch helper Anand Jain
  2020-10-28  4:25   ` [PATCH v10 2/3] btrfs: create read policy framework Anand Jain
@ 2020-10-28  4:25   ` Anand Jain
  2 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-28  4:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, Josef Bacik

Add

 /sys/fs/btrfs/UUID/read_policy

attribute so that the read policy for the raid1, raid1c34 and raid10 can
be tuned.

When this attribute is read, it shall show all available policies, with
active policy being with in [ ]. The read_policy attribute can be written
using one of the items listed in there.

For example:
  $cat /sys/fs/btrfs/UUID/read_policy
  [pid]
  $echo pid > /sys/fs/btrfs/UUID/read_policy

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v10: use scnprintf instead of snprintf
v9: fix C coding style, static const char*
v5: 
  Title rename: old: btrfs: sysfs, add read_policy attribute
  Uses the btrfs_strmatch() helper (BTRFS_READ_POLICY_NAME_MAX dropped).
  Use the table for the policy names.
  Rename len to ret.
  Use a simple logic to prefix space in btrfs_read_policy_show()
  Reviewed-by: Josef Bacik <josef@toxicpanda.com> dropped.

v4:-
v3: rename [by_pid] to [pid]
v2: v2: check input len before strip and kstrdup

 fs/btrfs/sysfs.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 2e114f5181e8..ef90b9cc1d5d 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -872,6 +872,54 @@ static bool btrfs_strmatch(const char *given, const char *golden)
 	return false;
 }
 
+static const char * const btrfs_read_policy_name[] = { "pid" };
+
+static ssize_t btrfs_read_policy_show(struct kobject *kobj,
+				      struct kobj_attribute *a, char *buf)
+{
+	int i;
+	ssize_t ret = 0;
+	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+		if (fs_devices->read_policy == i)
+			ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%s[%s]",
+					 (ret == 0 ? "" : " "),
+					 btrfs_read_policy_name[i]);
+		else
+			ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
+					 (ret == 0 ? "" : " "),
+					 btrfs_read_policy_name[i]);
+	}
+
+	ret += scnprintf(buf + ret, PAGE_SIZE - ret, "\n");
+
+	return ret;
+}
+
+static ssize_t btrfs_read_policy_store(struct kobject *kobj,
+				       struct kobj_attribute *a,
+				       const char *buf, size_t len)
+{
+	int i;
+	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+		if (btrfs_strmatch(buf, btrfs_read_policy_name[i])) {
+			if (i != fs_devices->read_policy) {
+				fs_devices->read_policy = i;
+				btrfs_info(fs_devices->fs_info,
+					   "read policy set to '%s'",
+					   btrfs_read_policy_name[i]);
+			}
+			return len;
+		}
+	}
+
+	return -EINVAL;
+}
+BTRFS_ATTR_RW(, read_policy, btrfs_read_policy_show, btrfs_read_policy_store);
+
 static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, label),
 	BTRFS_ATTR_PTR(, nodesize),
@@ -881,6 +929,7 @@ static const struct attribute *btrfs_attrs[] = {
 	BTRFS_ATTR_PTR(, metadata_uuid),
 	BTRFS_ATTR_PTR(, checksum),
 	BTRFS_ATTR_PTR(, exclusive_operation),
+	BTRFS_ATTR_PTR(, read_policy),
 	NULL,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach)
  2020-10-27 18:02     ` David Sterba
@ 2020-10-28 12:06       ` Anand Jain
  0 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-28 12:06 UTC (permalink / raw)
  To: dsterba, linux-btrfs, josef, dsterba

On 28/10/20 2:02 am, David Sterba wrote:
> On Tue, Oct 27, 2020 at 09:52:11AM +0800, Anand Jain wrote:
>>
>>
>> On 24/10/20 1:04 am, David Sterba wrote:
>>> On Thu, Oct 22, 2020 at 03:43:34PM +0800, Anand Jain wrote:
>>>> v9: C coding style fixes in 1/3 and 3/3
>>>
>>> So the point of adding the sysfs knobs is to allow testing various
>>> mirror selection strategies, what exactly was discussed in the past. Do
>>> you have patches for that as well? It does not need to be final and
>>> polished but at least give us something to test.
>>>
>>
>> Sure. I just sent out the patchset [1]. It provides read_policy:
>> latency, device, and round-robin.
>>
>> [1]
>> https://patchwork.kernel.org/project/linux-btrfs/list/?series=371023
> 
> Exporting more information from the devices would help us to decide but
> is there anything we can do just with what we have? Or eventually add
> our counters like for the in-flight requests or total bytes. The
> intention is not to duplicate what block layer does as we need to
> experiment with the stats and heuristics it's just for that purpose and
> we don't have to rely on other subsystem patches.
> 

I could rewrote the latency patch without export of any new block layer 
functions. Patchset v1 will soon be sent to the ML. I hope that
shall address your concern.

Thanks, Anand

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9 3/3] btrfs: create read policy sysfs attribute, pid
  2020-10-26 17:57   ` David Sterba
@ 2020-10-28 12:37     ` Anand Jain
  0 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2020-10-28 12:37 UTC (permalink / raw)
  To: dsterba, linux-btrfs, josef, dsterba

On 27/10/20 1:57 am, David Sterba wrote:
> On Thu, Oct 22, 2020 at 03:43:37PM +0800, Anand Jain wrote:
>> Add
>>
>>   /sys/fs/btrfs/UUID/read_policy
>>
>> attribute so that the read policy for the raid1, raid1c34 and raid10 can
>> be tuned.
>>
>> When this attribute is read, it shall show all available policies, with
>> active policy being with in [ ]. The read_policy attribute can be written
>> using one of the items listed in there.
>>
>> For example:
>>    $cat /sys/fs/btrfs/UUID/read_policy
>>    [pid]
>>    $echo pid > /sys/fs/btrfs/UUID/read_policy
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> v9: fix C coding style, static const char*
>> v5:
>>    Title rename: old: btrfs: sysfs, add read_policy attribute
>>    Uses the btrfs_strmatch() helper (BTRFS_READ_POLICY_NAME_MAX dropped).
>>    Use the table for the policy names.
>>    Rename len to ret.
>>    Use a simple logic to prefix space in btrfs_read_policy_show()
>>    Reviewed-by: Josef Bacik <josef@toxicpanda.com> dropped.
>>
>> v4:-
>> v3: rename [by_pid] to [pid]
>> v2: v2: check input len before strip and kstrdup
>>
>>   fs/btrfs/sysfs.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 49 insertions(+)
>>
>> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
>> index 5ea262d289c6..e23ae3643527 100644
>> --- a/fs/btrfs/sysfs.c
>> +++ b/fs/btrfs/sysfs.c
>> @@ -883,6 +883,54 @@ static int btrfs_strmatch(const char *given, const char *golden)
>>   	return -EINVAL;
>>   }
>>   
>> +static const char * const btrfs_read_policy_name[] = { "pid" };
>> +
>> +static ssize_t btrfs_read_policy_show(struct kobject *kobj,
>> +				      struct kobj_attribute *a, char *buf)
>> +{
>> +	int i;
>> +	ssize_t ret = 0;
>> +	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
>> +
>> +	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
>> +		if (fs_devices->read_policy == i)
>> +			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s[%s]",
> 
> All snprintf have been upgraded to scnprintf as it does have sane
> behaviour when the buffer is not large enough.

This is taken care in v10.

Thanks, Anand

> 
>> +					(ret == 0 ? "" : " "),
>> +					btrfs_read_policy_name[i]);
>> +		else
>> +			ret += snprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
>> +					(ret == 0 ? "" : " "),
>> +					btrfs_read_policy_name[i]);
>> +	}
>> +
>> +	ret += snprintf(buf + ret, PAGE_SIZE - ret, "\n");
>> +
>> +	return ret;
>> +}


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-10-29  2:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-22  7:43 [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
2020-10-22  7:43 ` [PATCH v9 1/3] btrfs: add btrfs_strmatch helper Anand Jain
2020-10-23 11:12   ` Christoph Hellwig
2020-10-27 13:25     ` Anand Jain
2020-10-26 17:52   ` David Sterba
2020-10-27 13:19     ` Anand Jain
2020-10-27 18:00       ` David Sterba
2020-10-22  7:43 ` [PATCH v9 2/3] btrfs: create read policy framework Anand Jain
2020-10-22  7:43 ` [PATCH v9 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
2020-10-26 17:57   ` David Sterba
2020-10-28 12:37     ` Anand Jain
2020-10-23 17:04 ` [PATCH v9 0/3] readmirror feature (read_policy sysfs and in-memory only approach) David Sterba
2020-10-27  1:52   ` Anand Jain
2020-10-27 18:02     ` David Sterba
2020-10-28 12:06       ` Anand Jain
2020-10-26 14:01 ` Josef Bacik
2020-10-28  4:25 ` [PATCH v10 " Anand Jain
2020-10-28  4:25   ` [PATCH v10 1/3] btrfs: add btrfs_strmatch helper Anand Jain
2020-10-28  4:25   ` [PATCH v10 2/3] btrfs: create read policy framework Anand Jain
2020-10-28  4:25   ` [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).