All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: chunk_width_limit mount option
@ 2016-06-03  1:50 Andrew Armenia
  2016-06-06  9:17 ` David Sterba
  2016-06-13  2:06 ` Anand Jain
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Armenia @ 2016-06-03  1:50 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Andrew Armenia

This patch adds mount option 'chunk_width_limit=X', which when set forces
the chunk allocator to use only up to X devices when allocating a chunk.
This may help reduce the seek penalties seen in filesystems with large
numbers of devices.

Signed-off-by: Andrew Armenia <andrew@asquaredlabs.com>
---
 fs/btrfs/ctree.h   |  3 +++
 fs/btrfs/super.c   | 22 +++++++++++++++++++++-
 fs/btrfs/volumes.c | 26 ++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 101c3cf..27b6f8f 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -754,6 +754,9 @@ struct btrfs_fs_info {
 	unsigned long pending_changes;
 	unsigned long compress_type:4;
 	int commit_interval;
+
+	int chunk_width_limit;
+
 	/*
 	 * It is a suggestive number, the read side is safe even it gets a
 	 * wrong number because we will write out the data into a regular
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 4e59a91..3da5220 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -300,7 +300,7 @@ enum {
 	Opt_commit_interval, Opt_barrier, Opt_nodefrag, Opt_nodiscard,
 	Opt_noenospc_debug, Opt_noflushoncommit, Opt_acl, Opt_datacow,
 	Opt_datasum, Opt_treelog, Opt_noinode_cache, Opt_usebackuproot,
-	Opt_nologreplay, Opt_norecovery,
+	Opt_nologreplay, Opt_norecovery, Opt_width_limit,
 #ifdef CONFIG_BTRFS_DEBUG
 	Opt_fragment_data, Opt_fragment_metadata, Opt_fragment_all,
 #endif
@@ -360,6 +360,7 @@ static const match_table_t tokens = {
 	{Opt_rescan_uuid_tree, "rescan_uuid_tree"},
 	{Opt_fatal_errors, "fatal_errors=%s"},
 	{Opt_commit_interval, "commit=%d"},
+	{Opt_width_limit, "chunk_width_limit=%d"},
 #ifdef CONFIG_BTRFS_DEBUG
 	{Opt_fragment_data, "fragment=data"},
 	{Opt_fragment_metadata, "fragment=metadata"},
@@ -782,6 +783,22 @@ int btrfs_parse_options(struct btrfs_root *root, char *options,
 				info->commit_interval = BTRFS_DEFAULT_COMMIT_INTERVAL;
 			}
 			break;
+		case Opt_width_limit:
+			intarg = 0;
+			ret = match_int(&args[0], &intarg);
+			if (ret < 0) {
+				btrfs_err(root->fs_info, "invalid chunk width limit");
+				ret = -EINVAL;
+				goto out;
+			}
+
+			if (intarg > 0) {
+				info->chunk_width_limit = intarg;
+			} else {
+				btrfs_info(root->fs_info, "chunk width is unlimited");
+				info->chunk_width_limit = 0;
+			}
+			break;
 #ifdef CONFIG_BTRFS_DEBUG
 		case Opt_fragment_all:
 			btrfs_info(root->fs_info, "fragmenting all space");
@@ -1207,6 +1224,9 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 	if (info->thread_pool_size !=  min_t(unsigned long,
 					     num_online_cpus() + 2, 8))
 		seq_printf(seq, ",thread_pool=%d", info->thread_pool_size);
+	if (info->chunk_width_limit != 0)
+		seq_printf(seq, ",chunk_width_limit=%d",
+			info->chunk_width_limit);
 	if (btrfs_test_opt(root, COMPRESS)) {
 		if (info->compress_type == BTRFS_COMPRESS_ZLIB)
 			compress_type = "zlib";
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bdc6256..6d0d35d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4558,6 +4558,32 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	devs_increment = btrfs_raid_array[index].devs_increment;
 	ncopies = btrfs_raid_array[index].ncopies;
 
+	/*
+	 * if we have a statically-configured chunk width, and the type doesn't
+	 * specify one, go ahead and use the statically-configured max instead.
+	 *
+	 * If the static value is greater than the BTRFS_MAX_DEVS for the
+	 * chunk tree, we ignore it.
+	 *
+	 * Also, we ignore the static value for system chunks.
+	 */
+	if (
+		devs_max == 0 && info->chunk_width_limit != 0
+		&& !(type & BTRFS_BLOCK_GROUP_SYSTEM)
+		&& info->chunk_width_limit <= BTRFS_MAX_DEVS(info->chunk_root)
+	) {
+		if (info->chunk_width_limit >= devs_min) {
+			devs_max = info->chunk_width_limit;
+		} else {
+			/* warn that the static devs_max is unusable */
+			btrfs_warn(info,
+				"can't satisfy max chunk width of %d; "
+				"minimum %d devices needed",
+				info->chunk_width_limit, devs_max
+			);
+		}
+	}
+
 	if (type & BTRFS_BLOCK_GROUP_DATA) {
 		max_stripe_size = SZ_1G;
 		max_chunk_size = 10 * max_stripe_size;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] btrfs: chunk_width_limit mount option
  2016-06-03  1:50 [PATCH] btrfs: chunk_width_limit mount option Andrew Armenia
@ 2016-06-06  9:17 ` David Sterba
  2016-06-06 13:43   ` Andrew Armenia
  2016-06-13  2:06 ` Anand Jain
  1 sibling, 1 reply; 8+ messages in thread
From: David Sterba @ 2016-06-06  9:17 UTC (permalink / raw)
  To: Andrew Armenia; +Cc: linux-btrfs

On Thu, Jun 02, 2016 at 09:50:15PM -0400, Andrew Armenia wrote:
> This patch adds mount option 'chunk_width_limit=X', which when set forces
> the chunk allocator to use only up to X devices when allocating a chunk.
> This may help reduce the seek penalties seen in filesystems with large
> numbers of devices.

There is a use for such limit but passing it as a mount is not the right
way to do it. Changing the value requires remount, different raid levels
might need different values, it's not persistent etc.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] btrfs: chunk_width_limit mount option
  2016-06-06  9:17 ` David Sterba
@ 2016-06-06 13:43   ` Andrew Armenia
  2016-06-06 15:00     ` Hugo Mills
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Armenia @ 2016-06-06 13:43 UTC (permalink / raw)
  To: dsterba, linux-btrfs

On Mon, Jun 6, 2016 at 5:17 AM, David Sterba <dsterba@suse.cz> wrote:
> On Thu, Jun 02, 2016 at 09:50:15PM -0400, Andrew Armenia wrote:
>> This patch adds mount option 'chunk_width_limit=X', which when set forces
>> the chunk allocator to use only up to X devices when allocating a chunk.
>> This may help reduce the seek penalties seen in filesystems with large
>> numbers of devices.
>
> There is a use for such limit but passing it as a mount is not the right
> way to do it. Changing the value requires remount, different raid levels
> might need different values, it's not persistent etc.

Thanks for the feedback. I agree that it's less than an ideal way to
do it. I guess I was a bit quick to release proof-of-concept code.
I'll try to do some more work on the configuration side of things and
resubmit.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] btrfs: chunk_width_limit mount option
  2016-06-06 13:43   ` Andrew Armenia
@ 2016-06-06 15:00     ` Hugo Mills
  2016-06-09 19:19       ` Andrew Armenia
  0 siblings, 1 reply; 8+ messages in thread
From: Hugo Mills @ 2016-06-06 15:00 UTC (permalink / raw)
  To: Andrew Armenia; +Cc: dsterba, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1573 bytes --]

On Mon, Jun 06, 2016 at 09:43:19AM -0400, Andrew Armenia wrote:
> On Mon, Jun 6, 2016 at 5:17 AM, David Sterba <dsterba@suse.cz> wrote:
> > On Thu, Jun 02, 2016 at 09:50:15PM -0400, Andrew Armenia wrote:
> >> This patch adds mount option 'chunk_width_limit=X', which when set forces
> >> the chunk allocator to use only up to X devices when allocating a chunk.
> >> This may help reduce the seek penalties seen in filesystems with large
> >> numbers of devices.
> >
> > There is a use for such limit but passing it as a mount is not the right
> > way to do it. Changing the value requires remount, different raid levels
> > might need different values, it's not persistent etc.
> 
> Thanks for the feedback. I agree that it's less than an ideal way to
> do it. I guess I was a bit quick to release proof-of-concept code.
> I'll try to do some more work on the configuration side of things and
> resubmit.

   Last time something like this came up in discussion, the consensus
was to put the configuration in xattrs in the btrfs namespace. There
should be one with a name to indicate "global" configuration, applied
to the top of subvol 5. If the feature allows configuration on a
per-subvol or per-object basis, then there should also be a name for
the relevant xattr (also in the btrfs namespace) that can be created
on each object as required.

   Hugo.

-- 
Hugo Mills             | Klytus, I'm bored. What plaything can you offer me
hugo@... carfax.org.uk | today?
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                      Ming the Merciless, Flash Gordon

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] btrfs: chunk_width_limit mount option
  2016-06-06 15:00     ` Hugo Mills
@ 2016-06-09 19:19       ` Andrew Armenia
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Armenia @ 2016-06-09 19:19 UTC (permalink / raw)
  To: Hugo Mills, dsterba, linux-btrfs

On Mon, Jun 6, 2016 at 11:00 AM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Mon, Jun 06, 2016 at 09:43:19AM -0400, Andrew Armenia wrote:
>> On Mon, Jun 6, 2016 at 5:17 AM, David Sterba <dsterba@suse.cz> wrote:
>> > On Thu, Jun 02, 2016 at 09:50:15PM -0400, Andrew Armenia wrote:
>> >> This patch adds mount option 'chunk_width_limit=X', which when set forces
>> >> the chunk allocator to use only up to X devices when allocating a chunk.
>> >> This may help reduce the seek penalties seen in filesystems with large
>> >> numbers of devices.
>> >
>> > There is a use for such limit but passing it as a mount is not the right
>> > way to do it. Changing the value requires remount, different raid levels
>> > might need different values, it's not persistent etc.
>>
>> Thanks for the feedback. I agree that it's less than an ideal way to
>> do it. I guess I was a bit quick to release proof-of-concept code.
>> I'll try to do some more work on the configuration side of things and
>> resubmit.
>
>    Last time something like this came up in discussion, the consensus
> was to put the configuration in xattrs in the btrfs namespace. There
> should be one with a name to indicate "global" configuration, applied
> to the top of subvol 5. If the feature allows configuration on a
> per-subvol or per-object basis, then there should also be a name for
> the relevant xattr (also in the btrfs namespace) that can be created
> on each object as required.
>
>    Hugo.
>
> --
> Hugo Mills             | Klytus, I'm bored. What plaything can you offer me
> hugo@... carfax.org.uk | today?
> http://carfax.org.uk/  |
> PGP: E2AB1DE4          |                      Ming the Merciless, Flash Gordon

OK, since I'm new here I'll throw my thoughts out before writing and
submitting another patch. I wouldn't want to have to read the xattrs
from disk every time a chunk needs to be allocated. That means the
allocator configuration will need to be cached in the fs_info
structure. My current line of thinking is to create a
btrfs_allocator_config structure containing all of the allocator
parameters - for now a user-set devs_max for each RAID level. I'd add
a couple ioctls to allow the userspace tools to get and set these
parameters. The "set" ioctl would also persist this configuration to
one or more xattrs on the root of subvol 5 per Hugo's suggestion.

I'm still a bit hung up on the proper way to read the config back at
mount time. Patching the mount routines doesn't seem like a good way
to do it. Perhaps just doing it at the first invocation of
__btrfs_alloc_chunk is best? Any other thoughts?

-Andrew

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] btrfs: chunk_width_limit mount option
  2016-06-03  1:50 [PATCH] btrfs: chunk_width_limit mount option Andrew Armenia
  2016-06-06  9:17 ` David Sterba
@ 2016-06-13  2:06 ` Anand Jain
  2016-06-14 19:44   ` Andrew Armenia
  1 sibling, 1 reply; 8+ messages in thread
From: Anand Jain @ 2016-06-13  2:06 UTC (permalink / raw)
  To: Andrew Armenia, linux-btrfs



On 06/03/2016 09:50 AM, Andrew Armenia wrote:
> This patch adds mount option 'chunk_width_limit=X', which when set forces
> the chunk allocator to use only up to X devices when allocating a chunk.
> This may help reduce the seek penalties seen in filesystems with large
> numbers of devices.

Have you reviewed implementations like device allocation grouping?
Some info is in the btrfs project ideas..

https://btrfs.wiki.kernel.org/index.php/Project_ideas
  Chunk allocation groups
  Limits on number of stripes (stripe width)
  Linear chunk allocation mode

(Device allocation grouping is important for enterprise storage solutions).

Thanks, Anand

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] btrfs: chunk_width_limit mount option
  2016-06-13  2:06 ` Anand Jain
@ 2016-06-14 19:44   ` Andrew Armenia
  2016-06-14 19:55     ` Hugo Mills
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Armenia @ 2016-06-14 19:44 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Sun, Jun 12, 2016 at 10:06 PM, Anand Jain <anand.jain@oracle.com> wrote:
>
>
> On 06/03/2016 09:50 AM, Andrew Armenia wrote:
>>
>> This patch adds mount option 'chunk_width_limit=X', which when set forces
>> the chunk allocator to use only up to X devices when allocating a chunk.
>> This may help reduce the seek penalties seen in filesystems with large
>> numbers of devices.
>
>
> Have you reviewed implementations like device allocation grouping?
> Some info is in the btrfs project ideas..
>
> https://btrfs.wiki.kernel.org/index.php/Project_ideas
>  Chunk allocation groups
>  Limits on number of stripes (stripe width)
>  Linear chunk allocation mode
>
> (Device allocation grouping is important for enterprise storage solutions).
>
> Thanks, Anand

I have looked at those ideas; allocation groups are what I'm ideally
after but I decided to start out small. I just spotted the dev_group
field in the device tree that appears to be currently unused, so
perhaps I will look at developing a group-aware allocator instead of
just limiting the chunk width.

-Andrew

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] btrfs: chunk_width_limit mount option
  2016-06-14 19:44   ` Andrew Armenia
@ 2016-06-14 19:55     ` Hugo Mills
  0 siblings, 0 replies; 8+ messages in thread
From: Hugo Mills @ 2016-06-14 19:55 UTC (permalink / raw)
  To: Andrew Armenia; +Cc: Anand Jain, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1590 bytes --]

On Tue, Jun 14, 2016 at 03:44:47PM -0400, Andrew Armenia wrote:
> On Sun, Jun 12, 2016 at 10:06 PM, Anand Jain <anand.jain@oracle.com> wrote:
> >
> >
> > On 06/03/2016 09:50 AM, Andrew Armenia wrote:
> >>
> >> This patch adds mount option 'chunk_width_limit=X', which when set forces
> >> the chunk allocator to use only up to X devices when allocating a chunk.
> >> This may help reduce the seek penalties seen in filesystems with large
> >> numbers of devices.
> >
> >
> > Have you reviewed implementations like device allocation grouping?
> > Some info is in the btrfs project ideas..
> >
> > https://btrfs.wiki.kernel.org/index.php/Project_ideas
> >  Chunk allocation groups
> >  Limits on number of stripes (stripe width)
> >  Linear chunk allocation mode
> >
> > (Device allocation grouping is important for enterprise storage solutions).
> >
> > Thanks, Anand
> 
> I have looked at those ideas; allocation groups are what I'm ideally
> after but I decided to start out small. I just spotted the dev_group
> field in the device tree that appears to be currently unused, so
> perhaps I will look at developing a group-aware allocator instead of
> just limiting the chunk width.

   I made some design notes on a generalised approach for this a while
ago:

http://www.spinics.net/lists/linux-btrfs/msg33782.html
http://www.spinics.net/lists/linux-btrfs/msg33916.html

   Hugo.

-- 
Hugo Mills             | What do we want?
hugo@... carfax.org.uk | Time Travel!
http://carfax.org.uk/  | When do we want it?
PGP: E2AB1DE4          | Irrelevant!                       Terminator: Genisys

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-06-14 19:55 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-03  1:50 [PATCH] btrfs: chunk_width_limit mount option Andrew Armenia
2016-06-06  9:17 ` David Sterba
2016-06-06 13:43   ` Andrew Armenia
2016-06-06 15:00     ` Hugo Mills
2016-06-09 19:19       ` Andrew Armenia
2016-06-13  2:06 ` Anand Jain
2016-06-14 19:44   ` Andrew Armenia
2016-06-14 19:55     ` Hugo Mills

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.