All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size
@ 2021-10-29 18:39 Stefan Roesch
  2021-10-29 18:39 ` [PATCH v4 1/4] btrfs: store chunk size in space-info struct Stefan Roesch
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Stefan Roesch @ 2021-10-29 18:39 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: shr

Motivation:
The btrfs allocator is currently not ideal for all workloads. It tends
to suffer from overallocating data block groups and underallocating
metadata block groups. This results in filesystems becoming read-only
even though there is plenty of "free" space.

This is naturally confusing and distressing to users.

Patches:
1) Store the stripe and chunk size in the btrfs_space_info structure
2) Add a sysfs entry to expose the above information
3) Add a sysfs entry to force a space allocation
4) Increase the default size of the metadata chunk allocation to 5GB
   for volumes greater than 50GB.

Testing:
  A new test is being added to the xfstest suite. For reference the
  corresponding patch has the title:
    [PATCH] btrfs: Test chunk allocation with different sizes

  In addition also manual testing has been performed.
    - Run xfstests with the changes and the new test. It does not
      show new diffs.
    - Test with storage devices 10G, 20G, 30G, 50G, 60G
      - Default allocation
      - Increase of chunk size
      - If the stripe size is > the free space, it allocates
        free space - 1MB. The 1MB is left as free space.
      - If the device has a storage size > 50G, it uses a 5GB
        chunk size for new allocations.

Stefan Roesch (4):
  btrfs: store chunk size in space-info struct.
  btrfs: expose chunk size in sysfs.
  btrfs: add force_chunk_alloc sysfs entry to force allocation
  btrfs: increase metadata alloc size to 5GB for volumes > 50GB

 fs/btrfs/space-info.c |  51 +++++++++++++++
 fs/btrfs/space-info.h |   3 +
 fs/btrfs/sysfs.c      | 145 ++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.c    |  28 +++-----
 4 files changed, 208 insertions(+), 19 deletions(-)

Signed-off-by: Stefan Roesch <shr@fb.com>
---
V4: - Patch email contained duplicate entries.

V3: - Rename sysfs entry from stripe_size to chunk_size
    - Remove max_chunk_size field in data structure btrfs_space_info
    - Rename max_stripe_size field to default_chunk_size in data
      structure btrfs_space_info
    - Remove max_chunk_size logic
    - Use stripe_size = chunk_size

V2:
   - Split the patch in 4 patches
   - Added checks for zone volumes in sysfs.c
   - Replaced the BUG() with ASSERT()
   - Changed if with fallthrough
   - Removed comments in space-info.h
-- 
2.30.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 1/4] btrfs: store chunk size in space-info struct.
  2021-10-29 18:39 [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Stefan Roesch
@ 2021-10-29 18:39 ` Stefan Roesch
  2021-11-05  8:52   ` Nikolay Borisov
  2021-10-29 18:39 ` [PATCH v4 2/4] btrfs: expose chunk size in sysfs Stefan Roesch
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Stefan Roesch @ 2021-10-29 18:39 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: shr

The chunk size is stored in the btrfs_space_info structure.
It is initialized at the start and is then used.

Two api's are added to get the current value and one to update
the value.

These api's will be used to be able to expose the chunk_size
as a sysfs setting.

Signed-off-by: Stefan Roesch <shr@fb.com>
---
 fs/btrfs/space-info.c | 51 +++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/space-info.h |  3 +++
 fs/btrfs/volumes.c    | 28 ++++++++----------------
 3 files changed, 63 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 48d77f360a24..7370c152ce8a 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -181,6 +181,56 @@ void btrfs_clear_space_info_full(struct btrfs_fs_info *info)
 		found->full = 0;
 }
 
+/*
+ * Compute chunk size depending on block type for regular volumes.
+ */
+static u64 compute_chunk_size_regular(struct btrfs_fs_info *info, u64 flags)
+{
+	ASSERT(flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
+
+	if (flags & BTRFS_BLOCK_GROUP_DATA)
+		return SZ_1G;
+	else if (flags & BTRFS_BLOCK_GROUP_SYSTEM)
+		return SZ_32M;
+
+	/* Handle BTRFS_BLOCK_GROUP_METADATA */
+	if (info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
+		return SZ_1G;
+
+	return SZ_256M;
+}
+
+/*
+ * Compute chunk size for zoned volumes.
+ */
+static u64 compute_chunk_size_zoned(struct btrfs_fs_info *info)
+{
+	return info->zone_size;
+}
+
+/*
+ * Compute chunk size depending on volume type.
+ */
+static u64 compute_chunk_size(struct btrfs_fs_info *info, u64 flags)
+{
+	if (btrfs_is_zoned(info))
+		return compute_chunk_size_zoned(info);
+
+	return compute_chunk_size_regular(info, flags);
+}
+
+/*
+ * Update default chunk size.
+ *
+ */
+void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
+					u64 flags, u64 chunk_size)
+{
+	spin_lock(&space_info->lock);
+	space_info->default_chunk_size = chunk_size;
+	spin_unlock(&space_info->lock);
+}
+
 static int create_space_info(struct btrfs_fs_info *info, u64 flags)
 {
 
@@ -202,6 +252,7 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
 	INIT_LIST_HEAD(&space_info->tickets);
 	INIT_LIST_HEAD(&space_info->priority_tickets);
 	space_info->clamp = 1;
+	space_info->default_chunk_size = compute_chunk_size(info, flags);
 
 	ret = btrfs_sysfs_add_space_info_type(info, space_info);
 	if (ret)
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index cb5056472e79..aa8fc0f72ea6 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -23,6 +23,7 @@ struct btrfs_space_info {
 	u64 max_extent_size;	/* This will hold the maximum extent size of
 				   the space info if we had an ENOSPC in the
 				   allocator. */
+	u64 default_chunk_size; /* default chunk size in bytes */
 
 	int clamp;		/* Used to scale our threshold for preemptive
 				   flushing. The value is >> clamp, so turns
@@ -115,6 +116,8 @@ void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
 			     u64 total_bytes, u64 bytes_used,
 			     u64 bytes_readonly, u64 bytes_zone_unusable,
 			     struct btrfs_space_info **space_info);
+void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
+			     u64 flags, u64 chunk_size);
 struct btrfs_space_info *btrfs_find_space_info(struct btrfs_fs_info *info,
 					       u64 flags);
 u64 __pure btrfs_space_info_used(struct btrfs_space_info *s_info,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 546bf1146b2d..563e5b30060d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5063,26 +5063,16 @@ static void init_alloc_chunk_ctl_policy_regular(
 				struct btrfs_fs_devices *fs_devices,
 				struct alloc_chunk_ctl *ctl)
 {
-	u64 type = ctl->type;
+	struct btrfs_space_info *space_info;
 
-	if (type & BTRFS_BLOCK_GROUP_DATA) {
-		ctl->max_stripe_size = SZ_1G;
-		ctl->max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
-	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
-		/* For larger filesystems, use larger metadata chunks */
-		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
-			ctl->max_stripe_size = SZ_1G;
-		else
-			ctl->max_stripe_size = SZ_256M;
-		ctl->max_chunk_size = ctl->max_stripe_size;
-	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
-		ctl->max_stripe_size = SZ_32M;
-		ctl->max_chunk_size = 2 * ctl->max_stripe_size;
-		ctl->devs_max = min_t(int, ctl->devs_max,
-				      BTRFS_MAX_DEVS_SYS_CHUNK);
-	} else {
-		BUG();
-	}
+	space_info = btrfs_find_space_info(fs_devices->fs_info, ctl->type);
+	ASSERT(space_info);
+
+	ctl->max_chunk_size = space_info->default_chunk_size;
+	ctl->max_stripe_size = space_info->default_chunk_size;
+
+	if (ctl->type & BTRFS_BLOCK_GROUP_SYSTEM)
+		ctl->devs_max = min_t(int, ctl->devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
 
 	/* We don't want a chunk larger than 10% of writable space */
 	ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 2/4] btrfs: expose chunk size in sysfs.
  2021-10-29 18:39 [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Stefan Roesch
  2021-10-29 18:39 ` [PATCH v4 1/4] btrfs: store chunk size in space-info struct Stefan Roesch
@ 2021-10-29 18:39 ` Stefan Roesch
  2021-11-05  9:27   ` Nikolay Borisov
  2021-10-29 18:39 ` [PATCH v4 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation Stefan Roesch
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Stefan Roesch @ 2021-10-29 18:39 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: shr

This adds a new sysfs entry in /sys/fs/btrfs/<uuid>/allocation/<block
type>/chunk_size. This allows to query the chunk size and also set the
chunk size.

Signed-off-by: Stefan Roesch <shr@fb.com>
---
 fs/btrfs/sysfs.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index f9eff3b0f77c..3b0bcbc2ed2e 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -21,6 +21,7 @@
 #include "space-info.h"
 #include "block-group.h"
 #include "qgroup.h"
+#include "misc.h"
 
 /*
  * Structure name                       Path
@@ -92,6 +93,7 @@ static struct btrfs_feature_attr btrfs_attr_features_##_name = {	     \
 
 static inline struct btrfs_fs_info *to_fs_info(struct kobject *kobj);
 static inline struct btrfs_fs_devices *to_fs_devs(struct kobject *kobj);
+static inline struct kobject *get_btrfs_kobj(struct kobject *kobj);
 
 static struct btrfs_feature_attr *to_btrfs_feature_attr(struct kobj_attribute *a)
 {
@@ -708,6 +710,67 @@ static ssize_t btrfs_space_info_show_##field(struct kobject *kobj,	\
 }									\
 BTRFS_ATTR(space_info, field, btrfs_space_info_show_##field)
 
+/*
+ * Return space info chunk size.
+ */
+static ssize_t btrfs_chunk_size_show(struct kobject *kobj,
+				     struct kobj_attribute *a, char *buf)
+{
+	struct btrfs_space_info *sinfo = to_space_info(kobj);
+
+	return btrfs_show_u64(&sinfo->default_chunk_size, &sinfo->lock, buf);
+}
+
+/*
+ * Store new user supplied chunk size in space info.
+ *
+ * Note: If the new chunk size value is larger than 10% of free space it is
+ *       reduced to match that limit.
+ */
+static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
+				      struct kobj_attribute *a,
+				      const char *buf, size_t len)
+{
+	struct btrfs_space_info *space_info = to_space_info(kobj);
+	struct btrfs_fs_info *fs_info = to_fs_info(get_btrfs_kobj(kobj));
+	u64 val;
+	int ret;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!fs_info) {
+		pr_err("couldn't get fs_info\n");
+		return -EPERM;
+	}
+
+	if (sb_rdonly(fs_info->sb))
+		return -EROFS;
+
+	if (!fs_info->fs_devices)
+		return -EINVAL;
+
+	if (btrfs_is_zoned(fs_info))
+		return -EINVAL;
+
+	if (!space_info) {
+		btrfs_err(fs_info, "couldn't get space_info\n");
+		return -EPERM;
+	}
+
+	ret = kstrtoull(buf, 10, &val);
+	if (ret)
+		return ret;
+
+	/*
+	 * Limit stripe size to 10% of available space.
+	 */
+	val = min(div_factor(fs_info->fs_devices->total_rw_bytes, 1), val);
+	btrfs_update_space_info_chunk_size(space_info, space_info->flags, val);
+
+	return val;
+}
+
 SPACE_INFO_ATTR(flags);
 SPACE_INFO_ATTR(total_bytes);
 SPACE_INFO_ATTR(bytes_used);
@@ -718,6 +781,8 @@ SPACE_INFO_ATTR(bytes_readonly);
 SPACE_INFO_ATTR(bytes_zone_unusable);
 SPACE_INFO_ATTR(disk_used);
 SPACE_INFO_ATTR(disk_total);
+BTRFS_ATTR_RW(space_info, chunk_size, btrfs_chunk_size_show,
+	      btrfs_chunk_size_store);
 
 /*
  * Allocation information about block group types.
@@ -735,6 +800,7 @@ static struct attribute *space_info_attrs[] = {
 	BTRFS_ATTR_PTR(space_info, bytes_zone_unusable),
 	BTRFS_ATTR_PTR(space_info, disk_used),
 	BTRFS_ATTR_PTR(space_info, disk_total),
+	BTRFS_ATTR_PTR(space_info, chunk_size),
 	NULL,
 };
 ATTRIBUTE_GROUPS(space_info);
@@ -1099,6 +1165,20 @@ static inline struct btrfs_fs_info *to_fs_info(struct kobject *kobj)
 	return to_fs_devs(kobj)->fs_info;
 }
 
+/*
+ * Get btrfs sysfs kobject.
+ */
+static inline struct kobject *get_btrfs_kobj(struct kobject *kobj)
+{
+	while (kobj) {
+		if (kobj->ktype == &btrfs_ktype)
+			return kobj;
+		kobj = kobj->parent;
+	}
+
+	return NULL;
+}
+
 #define NUM_FEATURE_BITS 64
 #define BTRFS_FEATURE_NAME_MAX 13
 static char btrfs_unknown_feature_names[FEAT_MAX][NUM_FEATURE_BITS][BTRFS_FEATURE_NAME_MAX];
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation
  2021-10-29 18:39 [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Stefan Roesch
  2021-10-29 18:39 ` [PATCH v4 1/4] btrfs: store chunk size in space-info struct Stefan Roesch
  2021-10-29 18:39 ` [PATCH v4 2/4] btrfs: expose chunk size in sysfs Stefan Roesch
@ 2021-10-29 18:39 ` Stefan Roesch
  2021-11-05 10:04   ` Nikolay Borisov
  2021-10-29 18:39 ` [PATCH v4 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB Stefan Roesch
  2021-11-02 16:15 ` [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Josef Bacik
  4 siblings, 1 reply; 15+ messages in thread
From: Stefan Roesch @ 2021-10-29 18:39 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: shr

This adds the force_chunk_alloc sysfs entry to be able to force a
storage allocation. This is a debugging and test feature and is
enabled with the CONFIG_BTRFS_DEBUG configuration option.

It is stored at
/sys/fs/btrfs/<uuid>/allocation/<block_type>/force_chunk_alloc.
---
 fs/btrfs/sysfs.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 3b0bcbc2ed2e..983c53b76aa6 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -771,6 +771,64 @@ static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
 	return val;
 }
 
+#ifdef CONFIG_BTRFS_DEBUG
+/*
+ * Return if space info force allocation chunk flag is set.
+ */
+static ssize_t btrfs_force_chunk_alloc_show(struct kobject *kobj,
+					    struct kobj_attribute *a,
+					    char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "0\n");
+}
+
+/*
+ * Request chunk allocation with current chunk size.
+ */
+static ssize_t btrfs_force_chunk_alloc_store(struct kobject *kobj,
+					     struct kobj_attribute *a,
+					     const char *buf, size_t len)
+{
+	struct btrfs_space_info *space_info = to_space_info(kobj);
+	struct btrfs_fs_info *fs_info = to_fs_info(get_btrfs_kobj(kobj));
+	struct btrfs_trans_handle *trans;
+	unsigned long val;
+	int ret;
+
+	if (!fs_info) {
+		pr_err("couldn't get fs_info\n");
+		return -EPERM;
+	}
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (sb_rdonly(fs_info->sb))
+		return -EROFS;
+
+	ret = kstrtoul(buf, 10, &val);
+	if (ret)
+		return ret;
+
+	if (val == 0)
+		return -EINVAL;
+
+	/*
+	 * Allocate new chunk.
+	 */
+	trans = btrfs_start_transaction(fs_info->extent_root, 0);
+	if (!trans)
+		return PTR_ERR(trans);
+	ret = btrfs_force_chunk_alloc(trans, space_info->flags);
+	btrfs_end_transaction(trans);
+
+	if (ret == 1)
+		return len;
+
+	return -ENOSPC;
+}
+#endif
+
 SPACE_INFO_ATTR(flags);
 SPACE_INFO_ATTR(total_bytes);
 SPACE_INFO_ATTR(bytes_used);
@@ -783,6 +841,10 @@ SPACE_INFO_ATTR(disk_used);
 SPACE_INFO_ATTR(disk_total);
 BTRFS_ATTR_RW(space_info, chunk_size, btrfs_chunk_size_show,
 	      btrfs_chunk_size_store);
+#ifdef CONFIG_BTRFS_DEBUG
+BTRFS_ATTR_RW(space_info, force_chunk_alloc, btrfs_force_chunk_alloc_show,
+	      btrfs_force_chunk_alloc_store);
+#endif
 
 /*
  * Allocation information about block group types.
@@ -801,6 +863,9 @@ static struct attribute *space_info_attrs[] = {
 	BTRFS_ATTR_PTR(space_info, disk_used),
 	BTRFS_ATTR_PTR(space_info, disk_total),
 	BTRFS_ATTR_PTR(space_info, chunk_size),
+#ifdef CONFIG_BTRFS_DEBUG
+	BTRFS_ATTR_PTR(space_info, force_chunk_alloc),
+#endif
 	NULL,
 };
 ATTRIBUTE_GROUPS(space_info);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB
  2021-10-29 18:39 [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Stefan Roesch
                   ` (2 preceding siblings ...)
  2021-10-29 18:39 ` [PATCH v4 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation Stefan Roesch
@ 2021-10-29 18:39 ` Stefan Roesch
  2021-11-05 10:11   ` Nikolay Borisov
  2021-11-02 16:15 ` [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Josef Bacik
  4 siblings, 1 reply; 15+ messages in thread
From: Stefan Roesch @ 2021-10-29 18:39 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: shr

This increases the metadata default allocation size from 1GB to 5GB for
volumes with a size greater than 50GB.

Signed-off-by: Stefan Roesch <shr@fb.com>
---
 fs/btrfs/space-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 7370c152ce8a..44507262c515 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -195,7 +195,7 @@ static u64 compute_chunk_size_regular(struct btrfs_fs_info *info, u64 flags)
 
 	/* Handle BTRFS_BLOCK_GROUP_METADATA */
 	if (info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
-		return SZ_1G;
+		return 5ULL * SZ_1G;
 
 	return SZ_256M;
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size
  2021-10-29 18:39 [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Stefan Roesch
                   ` (3 preceding siblings ...)
  2021-10-29 18:39 ` [PATCH v4 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB Stefan Roesch
@ 2021-11-02 16:15 ` Josef Bacik
  4 siblings, 0 replies; 15+ messages in thread
From: Josef Bacik @ 2021-11-02 16:15 UTC (permalink / raw)
  To: Stefan Roesch; +Cc: linux-btrfs, kernel-team

On Fri, Oct 29, 2021 at 11:39:46AM -0700, Stefan Roesch wrote:
> Motivation:
> The btrfs allocator is currently not ideal for all workloads. It tends
> to suffer from overallocating data block groups and underallocating
> metadata block groups. This results in filesystems becoming read-only
> even though there is plenty of "free" space.
> 
> This is naturally confusing and distressing to users.
> 
> Patches:
> 1) Store the stripe and chunk size in the btrfs_space_info structure
> 2) Add a sysfs entry to expose the above information
> 3) Add a sysfs entry to force a space allocation
> 4) Increase the default size of the metadata chunk allocation to 5GB
>    for volumes greater than 50GB.
> 
> Testing:
>   A new test is being added to the xfstest suite. For reference the
>   corresponding patch has the title:
>     [PATCH] btrfs: Test chunk allocation with different sizes
> 
>   In addition also manual testing has been performed.
>     - Run xfstests with the changes and the new test. It does not
>       show new diffs.
>     - Test with storage devices 10G, 20G, 30G, 50G, 60G
>       - Default allocation
>       - Increase of chunk size
>       - If the stripe size is > the free space, it allocates
>         free space - 1MB. The 1MB is left as free space.
>       - If the device has a storage size > 50G, it uses a 5GB
>         chunk size for new allocations.
> 
> Stefan Roesch (4):
>   btrfs: store chunk size in space-info struct.
>   btrfs: expose chunk size in sysfs.
>   btrfs: add force_chunk_alloc sysfs entry to force allocation
>   btrfs: increase metadata alloc size to 5GB for volumes > 50GB
> 

You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

to the series, thanks,

Josef

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/4] btrfs: store chunk size in space-info struct.
  2021-10-29 18:39 ` [PATCH v4 1/4] btrfs: store chunk size in space-info struct Stefan Roesch
@ 2021-11-05  8:52   ` Nikolay Borisov
  2021-11-09 19:44     ` Stefan Roesch
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2021-11-05  8:52 UTC (permalink / raw)
  To: Stefan Roesch, linux-btrfs, kernel-team



On 29.10.21 г. 21:39, Stefan Roesch wrote:
> The chunk size is stored in the btrfs_space_info structure.
> It is initialized at the start and is then used.
> 
> Two api's are added to get the current value and one to update
> the value.

There is just a single API added to update the size, there is no api to
get the value, one has to directly read default_chunk_size. Additionally
if it's going to be changed then does the "default_" prefix really mean
anything?

> 
> These api's will be used to be able to expose the chunk_size
> as a sysfs setting.
> 
> Signed-off-by: Stefan Roesch <shr@fb.com>
> ---
>  fs/btrfs/space-info.c | 51 +++++++++++++++++++++++++++++++++++++++++++
>  fs/btrfs/space-info.h |  3 +++
>  fs/btrfs/volumes.c    | 28 ++++++++----------------
>  3 files changed, 63 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 48d77f360a24..7370c152ce8a 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -181,6 +181,56 @@ void btrfs_clear_space_info_full(struct btrfs_fs_info *info)
>  		found->full = 0;
>  }
>  
> +/*
> + * Compute chunk size depending on block type for regular volumes.
> + */
> +static u64 compute_chunk_size_regular(struct btrfs_fs_info *info, u64 flags)
> +{
> +	ASSERT(flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
> +
> +	if (flags & BTRFS_BLOCK_GROUP_DATA)
> +		return SZ_1G;
> +	else if (flags & BTRFS_BLOCK_GROUP_SYSTEM)
> +		return SZ_32M;
> +
> +	/* Handle BTRFS_BLOCK_GROUP_METADATA */
> +	if (info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
> +		return SZ_1G;
> +
> +	return SZ_256M;
> +}
> +
> +/*
> + * Compute chunk size for zoned volumes.
> + */
> +static u64 compute_chunk_size_zoned(struct btrfs_fs_info *info)
> +{
> +	return info->zone_size;
> +}

nit: This is trivial and so can simply be open-coded in
compute_chunk_size, yes it's static and will likely be compiled out but
it adds a bit of cognitive load when reading the code. In any case I'd
leave this to David to decide whether to leave the function or not.

> +
> +/*
> + * Compute chunk size depending on volume type.
> + */

<snip>


>  static int create_space_info(struct btrfs_fs_info *info, u64 flags)
>  {
>  
> @@ -202,6 +252,7 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
>  	INIT_LIST_HEAD(&space_info->tickets);
>  	INIT_LIST_HEAD(&space_info->priority_tickets);
>  	space_info->clamp = 1;
> +	space_info->default_chunk_size = compute_chunk_size(info, flags);
>  
>  	ret = btrfs_sysfs_add_space_info_type(info, space_info);
>  	if (ret)

<snip>

> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 546bf1146b2d..563e5b30060d 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -5063,26 +5063,16 @@ static void init_alloc_chunk_ctl_policy_regular(
>  				struct btrfs_fs_devices *fs_devices,
>  				struct alloc_chunk_ctl *ctl)
>  {
> -	u64 type = ctl->type;
> +	struct btrfs_space_info *space_info;
>  
> -	if (type & BTRFS_BLOCK_GROUP_DATA) {
> -		ctl->max_stripe_size = SZ_1G;
> -		ctl->max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
> -	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
> -		/* For larger filesystems, use larger metadata chunks */
> -		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
> -			ctl->max_stripe_size = SZ_1G;
> -		else
> -			ctl->max_stripe_size = SZ_256M;
> -		ctl->max_chunk_size = ctl->max_stripe_size;
> -	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
> -		ctl->max_stripe_size = SZ_32M;
> -		ctl->max_chunk_size = 2 * ctl->max_stripe_size;
> -		ctl->devs_max = min_t(int, ctl->devs_max,
> -				      BTRFS_MAX_DEVS_SYS_CHUNK);
> -	} else {
> -		BUG();
> -	}
> +	space_info = btrfs_find_space_info(fs_devices->fs_info, ctl->type);
> +	ASSERT(space_info);
> +
> +	ctl->max_chunk_size = space_info->default_chunk_size;
> +	ctl->max_stripe_size = space_info->default_chunk_size;

Those are racy accesses, no ? Chunk allocation happens under
chunk_mutex, not the space_info lock ? Perhaps it could be turned into
an atomic?

> +
> +	if (ctl->type & BTRFS_BLOCK_GROUP_SYSTEM)
> +		ctl->devs_max = min_t(int, ctl->devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
>  
>  	/* We don't want a chunk larger than 10% of writable space */
>  	ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/4] btrfs: expose chunk size in sysfs.
  2021-10-29 18:39 ` [PATCH v4 2/4] btrfs: expose chunk size in sysfs Stefan Roesch
@ 2021-11-05  9:27   ` Nikolay Borisov
  2021-11-09  1:57     ` Stefan Roesch
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2021-11-05  9:27 UTC (permalink / raw)
  To: Stefan Roesch, linux-btrfs, kernel-team



On 29.10.21 г. 21:39, Stefan Roesch wrote:
> This adds a new sysfs entry in /sys/fs/btrfs/<uuid>/allocation/<block
> type>/chunk_size. This allows to query the chunk size and also set the
> chunk size.
> 
> Signed-off-by: Stefan Roesch <shr@fb.com>
> ---
>  fs/btrfs/sysfs.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index f9eff3b0f77c..3b0bcbc2ed2e 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -21,6 +21,7 @@
>  #include "space-info.h"
>  #include "block-group.h"
>  #include "qgroup.h"
> +#include "misc.h"
>  
>  /*
>   * Structure name                       Path
> @@ -92,6 +93,7 @@ static struct btrfs_feature_attr btrfs_attr_features_##_name = {	     \
>  
>  static inline struct btrfs_fs_info *to_fs_info(struct kobject *kobj);
>  static inline struct btrfs_fs_devices *to_fs_devs(struct kobject *kobj);
> +static inline struct kobject *get_btrfs_kobj(struct kobject *kobj);
>  
>  static struct btrfs_feature_attr *to_btrfs_feature_attr(struct kobj_attribute *a)
>  {
> @@ -708,6 +710,67 @@ static ssize_t btrfs_space_info_show_##field(struct kobject *kobj,	\
>  }									\
>  BTRFS_ATTR(space_info, field, btrfs_space_info_show_##field)
>  
> +/*
> + * Return space info chunk size.
> + */
> +static ssize_t btrfs_chunk_size_show(struct kobject *kobj,
> +				     struct kobj_attribute *a, char *buf)
> +{
> +	struct btrfs_space_info *sinfo = to_space_info(kobj);
> +
> +	return btrfs_show_u64(&sinfo->default_chunk_size, &sinfo->lock, buf);
> +}
> +
> +/*
> + * Store new user supplied chunk size in space info.
> + *
> + * Note: If the new chunk size value is larger than 10% of free space it is
> + *       reduced to match that limit.
> + */
> +static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
> +				      struct kobj_attribute *a,
> +				      const char *buf, size_t len)
> +{
> +	struct btrfs_space_info *space_info = to_space_info(kobj);
> +	struct btrfs_fs_info *fs_info = to_fs_info(get_btrfs_kobj(kobj));
> +	u64 val;
> +	int ret;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	if (!fs_info) {
> +		pr_err("couldn't get fs_info\n");
> +		return -EPERM;
> +	}
> +
> +	if (sb_rdonly(fs_info->sb))
> +		return -EROFS;
> +
> +	if (!fs_info->fs_devices)
> +		return -EINVAL;
> +
> +	if (btrfs_is_zoned(fs_info))
> +		return -EINVAL;
> +
> +	if (!space_info) {
> +		btrfs_err(fs_info, "couldn't get space_info\n");
> +		return -EPERM;
> +	}
> +
> +	ret = kstrtoull(buf, 10, &val);
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * Limit stripe size to 10% of available space.
> +	 */
> +	val = min(div_factor(fs_info->fs_devices->total_rw_bytes, 1), val);
> +	btrfs_update_space_info_chunk_size(space_info, space_info->flags, val);

I wonder if we need to enforce some sort of alignment i.e 128/256m
otherwise we give the user to give all kinds of funky byte sizes?

<snip>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation
  2021-10-29 18:39 ` [PATCH v4 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation Stefan Roesch
@ 2021-11-05 10:04   ` Nikolay Borisov
  2021-11-09  1:09     ` Stefan Roesch
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2021-11-05 10:04 UTC (permalink / raw)
  To: Stefan Roesch, linux-btrfs, kernel-team



On 29.10.21 г. 21:39, Stefan Roesch wrote:
> This adds the force_chunk_alloc sysfs entry to be able to force a
> storage allocation. This is a debugging and test feature and is
> enabled with the CONFIG_BTRFS_DEBUG configuration option.
> 
> It is stored at
> /sys/fs/btrfs/<uuid>/allocation/<block_type>/force_chunk_alloc.
> ---
>  fs/btrfs/sysfs.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 65 insertions(+)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index 3b0bcbc2ed2e..983c53b76aa6 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -771,6 +771,64 @@ static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
>  	return val;
>  }
>  
> +#ifdef CONFIG_BTRFS_DEBUG
> +/*
> + * Return if space info force allocation chunk flag is set.

nit: Well this is a dummy function which simply returns 0. I guess what
makes sense is to simply make the show op a NULL, i wonder if sysfs can
handle this gracefully though.

<snip>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB
  2021-10-29 18:39 ` [PATCH v4 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB Stefan Roesch
@ 2021-11-05 10:11   ` Nikolay Borisov
  2021-11-09 21:19     ` Stefan Roesch
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2021-11-05 10:11 UTC (permalink / raw)
  To: Stefan Roesch, linux-btrfs, kernel-team



On 29.10.21 г. 21:39, Stefan Roesch wrote:
> This increases the metadata default allocation size from 1GB to 5GB for
> volumes with a size greater than 50GB.

Imo such a change would warrant more data about why it's ok. Clearly
there is some internal facebook workload which benefits from this,
however this doesn't automatically mean it will be beneficial for the
wider user base. Also your cover letter doesn't contain any more
specifics about the particular workload. Isn't 5g a bit too steep,
perhaps 2 is larger and yet more conservative?


> 
> Signed-off-by: Stefan Roesch <shr@fb.com>
> ---
>  fs/btrfs/space-info.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 7370c152ce8a..44507262c515 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -195,7 +195,7 @@ static u64 compute_chunk_size_regular(struct btrfs_fs_info *info, u64 flags)
>  
>  	/* Handle BTRFS_BLOCK_GROUP_METADATA */
>  	if (info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
> -		return SZ_1G;
> +		return 5ULL * SZ_1G;
>  
>  	return SZ_256M;
>  }
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation
  2021-11-05 10:04   ` Nikolay Borisov
@ 2021-11-09  1:09     ` Stefan Roesch
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Roesch @ 2021-11-09  1:09 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team



On 11/5/21 3:04 AM, Nikolay Borisov wrote:
> 
> 
> On 29.10.21 г. 21:39, Stefan Roesch wrote:
>> This adds the force_chunk_alloc sysfs entry to be able to force a
>> storage allocation. This is a debugging and test feature and is
>> enabled with the CONFIG_BTRFS_DEBUG configuration option.
>>
>> It is stored at
>> /sys/fs/btrfs/<uuid>/allocation/<block_type>/force_chunk_alloc.
>> ---
>>  fs/btrfs/sysfs.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 65 insertions(+)
>>
>> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
>> index 3b0bcbc2ed2e..983c53b76aa6 100644
>> --- a/fs/btrfs/sysfs.c
>> +++ b/fs/btrfs/sysfs.c
>> @@ -771,6 +771,64 @@ static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
>>  	return val;
>>  }
>>  
>> +#ifdef CONFIG_BTRFS_DEBUG
>> +/*
>> + * Return if space info force allocation chunk flag is set.
> 
> nit: Well this is a dummy function which simply returns 0. I guess what
> makes sense is to simply make the show op a NULL, i wonder if sysfs can
> handle this gracefully though.
> 
> <snip>
>

This is possible however I need to also change the file mask, so the user can
only write to the file. I'll make the change.
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/4] btrfs: expose chunk size in sysfs.
  2021-11-05  9:27   ` Nikolay Borisov
@ 2021-11-09  1:57     ` Stefan Roesch
  2021-11-09  6:36       ` Nikolay Borisov
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Roesch @ 2021-11-09  1:57 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team



On 11/5/21 2:27 AM, Nikolay Borisov wrote:
> 
> 
> On 29.10.21 г. 21:39, Stefan Roesch wrote:
>> This adds a new sysfs entry in /sys/fs/btrfs/<uuid>/allocation/<block
>> type>/chunk_size. This allows to query the chunk size and also set the
>> chunk size.
>>
>> Signed-off-by: Stefan Roesch <shr@fb.com>
>> ---
>>  fs/btrfs/sysfs.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 80 insertions(+)
>>
>> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
>> index f9eff3b0f77c..3b0bcbc2ed2e 100644
>> --- a/fs/btrfs/sysfs.c
>> +++ b/fs/btrfs/sysfs.c
>> @@ -21,6 +21,7 @@
>>  #include "space-info.h"
>>  #include "block-group.h"
>>  #include "qgroup.h"
>> +#include "misc.h"
>>  
>>  /*
>>   * Structure name                       Path
>> @@ -92,6 +93,7 @@ static struct btrfs_feature_attr btrfs_attr_features_##_name = {	     \
>>  
>>  static inline struct btrfs_fs_info *to_fs_info(struct kobject *kobj);
>>  static inline struct btrfs_fs_devices *to_fs_devs(struct kobject *kobj);
>> +static inline struct kobject *get_btrfs_kobj(struct kobject *kobj);
>>  
>>  static struct btrfs_feature_attr *to_btrfs_feature_attr(struct kobj_attribute *a)
>>  {
>> @@ -708,6 +710,67 @@ static ssize_t btrfs_space_info_show_##field(struct kobject *kobj,	\
>>  }									\
>>  BTRFS_ATTR(space_info, field, btrfs_space_info_show_##field)
>>  
>> +/*
>> + * Return space info chunk size.
>> + */
>> +static ssize_t btrfs_chunk_size_show(struct kobject *kobj,
>> +				     struct kobj_attribute *a, char *buf)
>> +{
>> +	struct btrfs_space_info *sinfo = to_space_info(kobj);
>> +
>> +	return btrfs_show_u64(&sinfo->default_chunk_size, &sinfo->lock, buf);
>> +}
>> +
>> +/*
>> + * Store new user supplied chunk size in space info.
>> + *
>> + * Note: If the new chunk size value is larger than 10% of free space it is
>> + *       reduced to match that limit.
>> + */
>> +static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
>> +				      struct kobj_attribute *a,
>> +				      const char *buf, size_t len)
>> +{
>> +	struct btrfs_space_info *space_info = to_space_info(kobj);
>> +	struct btrfs_fs_info *fs_info = to_fs_info(get_btrfs_kobj(kobj));
>> +	u64 val;
>> +	int ret;
>> +
>> +	if (!capable(CAP_SYS_ADMIN))
>> +		return -EPERM;
>> +
>> +	if (!fs_info) {
>> +		pr_err("couldn't get fs_info\n");
>> +		return -EPERM;
>> +	}
>> +
>> +	if (sb_rdonly(fs_info->sb))
>> +		return -EROFS;
>> +
>> +	if (!fs_info->fs_devices)
>> +		return -EINVAL;
>> +
>> +	if (btrfs_is_zoned(fs_info))
>> +		return -EINVAL;
>> +
>> +	if (!space_info) {
>> +		btrfs_err(fs_info, "couldn't get space_info\n");
>> +		return -EPERM;
>> +	}
>> +
>> +	ret = kstrtoull(buf, 10, &val);
>> +	if (ret)
>> +		return ret;
>> +
>> +	/*
>> +	 * Limit stripe size to 10% of available space.
>> +	 */
>> +	val = min(div_factor(fs_info->fs_devices->total_rw_bytes, 1), val);
>> +	btrfs_update_space_info_chunk_size(space_info, space_info->flags, val);
> 
> I wonder if we need to enforce some sort of alignment i.e 128/256m
> otherwise we give the user to give all kinds of funky byte sizes?
> 
> <snip>
> 

That makes sense. As the default size is 256M. I propose the following:
- Sizes smaller than 256M are not allowed.
- Sizes are aligned on 256M boundary otherwise.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/4] btrfs: expose chunk size in sysfs.
  2021-11-09  1:57     ` Stefan Roesch
@ 2021-11-09  6:36       ` Nikolay Borisov
  0 siblings, 0 replies; 15+ messages in thread
From: Nikolay Borisov @ 2021-11-09  6:36 UTC (permalink / raw)
  To: Stefan Roesch, linux-btrfs, kernel-team



On 9.11.21 г. 3:57, Stefan Roesch wrote:
> 
> 

<snip>

>> I wonder if we need to enforce some sort of alignment i.e 128/256m
>> otherwise we give the user to give all kinds of funky byte sizes?
>>
>> <snip>
>>
> 
> That makes sense. As the default size is 256M. I propose the following:
> - Sizes smaller than 256M are not allowed.
> - Sizes are aligned on 256M boundary otherwise.

Sounds good.




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/4] btrfs: store chunk size in space-info struct.
  2021-11-05  8:52   ` Nikolay Borisov
@ 2021-11-09 19:44     ` Stefan Roesch
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Roesch @ 2021-11-09 19:44 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team



On 11/5/21 1:52 AM, Nikolay Borisov wrote:
> 
> 
> On 29.10.21 г. 21:39, Stefan Roesch wrote:
>> The chunk size is stored in the btrfs_space_info structure.
>> It is initialized at the start and is then used.
>>
>> Two api's are added to get the current value and one to update
>> the value.
> 
> There is just a single API added to update the size, there is no api to
> get the value, one has to directly read default_chunk_size. Additionally
> if it's going to be changed then does the "default_" prefix really mean
> anything?
>

I changed the name to chunk_size.
 
>>
>> These api's will be used to be able to expose the chunk_size
>> as a sysfs setting.
>>
>> Signed-off-by: Stefan Roesch <shr@fb.com>
>> ---
>>  fs/btrfs/space-info.c | 51 +++++++++++++++++++++++++++++++++++++++++++
>>  fs/btrfs/space-info.h |  3 +++
>>  fs/btrfs/volumes.c    | 28 ++++++++----------------
>>  3 files changed, 63 insertions(+), 19 deletions(-)
>>
>> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
>> index 48d77f360a24..7370c152ce8a 100644
>> --- a/fs/btrfs/space-info.c
>> +++ b/fs/btrfs/space-info.c
>> @@ -181,6 +181,56 @@ void btrfs_clear_space_info_full(struct btrfs_fs_info *info)
>>  		found->full = 0;
>>  }
>>  
>> +/*
>> + * Compute chunk size depending on block type for regular volumes.
>> + */
>> +static u64 compute_chunk_size_regular(struct btrfs_fs_info *info, u64 flags)
>> +{
>> +	ASSERT(flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
>> +
>> +	if (flags & BTRFS_BLOCK_GROUP_DATA)
>> +		return SZ_1G;
>> +	else if (flags & BTRFS_BLOCK_GROUP_SYSTEM)
>> +		return SZ_32M;
>> +
>> +	/* Handle BTRFS_BLOCK_GROUP_METADATA */
>> +	if (info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
>> +		return SZ_1G;
>> +
>> +	return SZ_256M;
>> +}
>> +
>> +/*
>> + * Compute chunk size for zoned volumes.
>> + */
>> +static u64 compute_chunk_size_zoned(struct btrfs_fs_info *info)
>> +{
>> +	return info->zone_size;
>> +}
> 
> nit: This is trivial and so can simply be open-coded in
> compute_chunk_size, yes it's static and will likely be compiled out but
> it adds a bit of cognitive load when reading the code. In any case I'd
> leave this to David to decide whether to leave the function or not.
> 

I removed the function.

>> +
>> +/*
>> + * Compute chunk size depending on volume type.
>> + */
> 
> <snip>
> 
> 
>>  static int create_space_info(struct btrfs_fs_info *info, u64 flags)
>>  {
>>  
>> @@ -202,6 +252,7 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
>>  	INIT_LIST_HEAD(&space_info->tickets);
>>  	INIT_LIST_HEAD(&space_info->priority_tickets);
>>  	space_info->clamp = 1;
>> +	space_info->default_chunk_size = compute_chunk_size(info, flags);
>>  
>>  	ret = btrfs_sysfs_add_space_info_type(info, space_info);
>>  	if (ret)
> 
> <snip>
> 
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 546bf1146b2d..563e5b30060d 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -5063,26 +5063,16 @@ static void init_alloc_chunk_ctl_policy_regular(
>>  				struct btrfs_fs_devices *fs_devices,
>>  				struct alloc_chunk_ctl *ctl)
>>  {
>> -	u64 type = ctl->type;
>> +	struct btrfs_space_info *space_info;
>>  
>> -	if (type & BTRFS_BLOCK_GROUP_DATA) {
>> -		ctl->max_stripe_size = SZ_1G;
>> -		ctl->max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
>> -	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
>> -		/* For larger filesystems, use larger metadata chunks */
>> -		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
>> -			ctl->max_stripe_size = SZ_1G;
>> -		else
>> -			ctl->max_stripe_size = SZ_256M;
>> -		ctl->max_chunk_size = ctl->max_stripe_size;
>> -	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
>> -		ctl->max_stripe_size = SZ_32M;
>> -		ctl->max_chunk_size = 2 * ctl->max_stripe_size;
>> -		ctl->devs_max = min_t(int, ctl->devs_max,
>> -				      BTRFS_MAX_DEVS_SYS_CHUNK);
>> -	} else {
>> -		BUG();
>> -	}
>> +	space_info = btrfs_find_space_info(fs_devices->fs_info, ctl->type);
>> +	ASSERT(space_info);
>> +
>> +	ctl->max_chunk_size = space_info->default_chunk_size;
>> +	ctl->max_stripe_size = space_info->default_chunk_size;
> 
> Those are racy accesses, no ? Chunk allocation happens under
> chunk_mutex, not the space_info lock ? Perhaps it could be turned into
> an atomic?
> 

Good catch. I replaced it with an atomic.

>> +
>> +	if (ctl->type & BTRFS_BLOCK_GROUP_SYSTEM)
>> +		ctl->devs_max = min_t(int, ctl->devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
>>  
>>  	/* We don't want a chunk larger than 10% of writable space */
>>  	ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB
  2021-11-05 10:11   ` Nikolay Borisov
@ 2021-11-09 21:19     ` Stefan Roesch
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Roesch @ 2021-11-09 21:19 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team



On 11/5/21 3:11 AM, Nikolay Borisov wrote:
> 
> 
> On 29.10.21 г. 21:39, Stefan Roesch wrote:
>> This increases the metadata default allocation size from 1GB to 5GB for
>> volumes with a size greater than 50GB.
> 
> Imo such a change would warrant more data about why it's ok. Clearly
> there is some internal facebook workload which benefits from this,
> however this doesn't automatically mean it will be beneficial for the
> wider user base. Also your cover letter doesn't contain any more
> specifics about the particular workload. Isn't 5g a bit too steep,
> perhaps 2 is larger and yet more conservative?
>

I understand that this change is getting some questions and a more conservative
value might need to be chosen. Let's wait until all the reviews are in and
then settle on a compromise. 
 
> 
>>
>> Signed-off-by: Stefan Roesch <shr@fb.com>
>> ---
>>  fs/btrfs/space-info.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
>> index 7370c152ce8a..44507262c515 100644
>> --- a/fs/btrfs/space-info.c
>> +++ b/fs/btrfs/space-info.c
>> @@ -195,7 +195,7 @@ static u64 compute_chunk_size_regular(struct btrfs_fs_info *info, u64 flags)
>>  
>>  	/* Handle BTRFS_BLOCK_GROUP_METADATA */
>>  	if (info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
>> -		return SZ_1G;
>> +		return 5ULL * SZ_1G;
>>  
>>  	return SZ_256M;
>>  }
>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-11-09 21:20 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-29 18:39 [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Stefan Roesch
2021-10-29 18:39 ` [PATCH v4 1/4] btrfs: store chunk size in space-info struct Stefan Roesch
2021-11-05  8:52   ` Nikolay Borisov
2021-11-09 19:44     ` Stefan Roesch
2021-10-29 18:39 ` [PATCH v4 2/4] btrfs: expose chunk size in sysfs Stefan Roesch
2021-11-05  9:27   ` Nikolay Borisov
2021-11-09  1:57     ` Stefan Roesch
2021-11-09  6:36       ` Nikolay Borisov
2021-10-29 18:39 ` [PATCH v4 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation Stefan Roesch
2021-11-05 10:04   ` Nikolay Borisov
2021-11-09  1:09     ` Stefan Roesch
2021-10-29 18:39 ` [PATCH v4 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB Stefan Roesch
2021-11-05 10:11   ` Nikolay Borisov
2021-11-09 21:19     ` Stefan Roesch
2021-11-02 16:15 ` [PATCH v4 0/4] btrfs: sysfs: set / query btrfs chunk size Josef Bacik

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.